Research Article

A machine learning approach for voice pathology detection using mode decomposition-based acoustic cepstral features

Volume: 4 Number: 4 December 30, 2024

Abstract

In this paper, an adaptive approach based on mode decomposition analysis is proposed to achieve high diagnostic performance in automated voice pathology detection systems. The aim of the study is to develop a reliable and effective system using adaptive cepstral-domain features derived from the empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) methods. The descriptive feature sets are obtained by applying mel-frequency cepstral coefficients (MFCCs) and their derivatives, linear predictive coefficients (LPCs), and linear predictive cepstral coefficients (LPCCs) to each decomposition level. Class-balanced data are generated from the VOice ICar fEDerico II (VOICED) database samples using the synthetic minority oversampling technique (SMOTE). The ReliefF algorithm is used to select the most effective and distinctive features. The selected features are combined with a support vector machine (SVM) classifier to identify pathological voices. In pathology detection, the EMD-based cepstral features with an SVM-cubic classifier achieve the highest performance, with 99.85% accuracy, 99.85% F1-score, and a Matthews correlation coefficient (MCC) of 0.997. In pathology-type classification, the EEMD-based cepstral features with an SVM-quadratic classifier provide the highest performance, with 96.49% accuracy, 96.46% F1-score, and an MCC of 0.949. The comprehensive results of this study indicate that mode decomposition-based approaches are more successful and effective than traditional methods for the detection and classification of pathological voices.
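For readers less familiar with the three reported metrics, the following is a minimal sketch of how accuracy, F1-score, and MCC are computed from a binary confusion matrix (pathological vs. healthy). The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper; the multi-class pathology-type task would need the generalized forms of F1 and MCC, which are not shown here.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, f1, mcc) for a binary confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    # F1 is the harmonic mean of precision and recall,
    # which simplifies to 2*TP / (2*TP + FP + FN).
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    # MCC uses all four cells of the confusion matrix and ranges
    # from -1 to +1; it is conventionally set to 0 when the
    # denominator vanishes.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    return accuracy, f1, mcc


# Hypothetical counts for illustration only (not results from the study).
acc, f1, mcc = binary_metrics(tp=95, fp=5, fn=5, tn=95)
print(f"accuracy={acc:.3f} F1={f1:.3f} MCC={mcc:.3f}")
# prints: accuracy=0.950 F1=0.950 MCC=0.900
```

Because MCC accounts for true negatives as well as true positives, it can differ noticeably from accuracy and F1 when classes are imbalanced, which is one reason it is often reported alongside them.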

Keywords

Voice pathology, SMOTE algorithm, mode decomposition, cepstral-domain coefficients, ReliefF algorithm, support vector machine

APA
Arslan, Ö. (2024). A machine learning approach for voice pathology detection using mode decomposition-based acoustic cepstral features. Mathematical Modelling and Numerical Simulation With Applications, 4(4), 469-494. https://doi.org/10.53391/mmnsa.1473574