Research Article

Unified voice analysis: speaker recognition, age group and gender estimation using spectral features and machine learning classifiers

Number: 057 June 30, 2024

Abstract

Predicting a speaker's personal traits from voice data has attracted attention in fields such as forensics, automatic voice response systems, and biomedical applications. In this study, gender and age group were predicted from voice data recorded from 24 volunteers. Mel-frequency cepstral coefficients (MFCC) were extracted from the audio as hybrid time/frequency-domain features, and fundamental frequencies and formants were extracted as frequency-domain features. These features were fused into a single feature pool, and age group and gender estimation was carried out with four different machine learning algorithms. With the Support Vector Machines algorithm, the participants' age groups were classified with 93% accuracy and their genders with 99% accuracy. The speaker recognition task was also completed with 93% accuracy using Support Vector Machines.
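As a rough illustration of the feature-extraction and fusion idea described in the abstract, the sketch below estimates a fundamental frequency from a synthetic tone and concatenates it with other feature groups into one vector. It is a minimal sketch under stated assumptions, not the authors' method: the abstract does not say how the fundamental frequency was computed, so autocorrelation peak picking is an assumed technique, and the names `estimate_f0` and `fuse_features`, the 60-400 Hz search range, the 13 MFCC slots, and the 3 formant slots are all hypothetical. The MFCC and formant values are zero placeholders, not real measurements.

```python
import math


def estimate_f0(signal, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak.

    Assumed technique: the source does not state which F0 estimator was used.
    """
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the signal with itself at this lag.
        score = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def fuse_features(*feature_groups):
    """Concatenate per-recording feature groups into one flat feature vector."""
    return [value for group in feature_groups for value in group]


# Synthetic 200 Hz tone sampled at 8 kHz (test signal, not study data).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(800)]
f0 = estimate_f0(tone, sample_rate)

# Placeholder groups standing in for real MFCC and formant measurements.
mfcc_placeholder = [0.0] * 13
formant_placeholder = [0.0] * 3
feature_vector = fuse_features(mfcc_placeholder, [f0], formant_placeholder)
```

In a full pipeline, one such fused vector per recording would be passed to a classifier such as a Support Vector Machine; that step is omitted here to keep the sketch dependency-free.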

Details

Primary Language

English

Subjects

Speech Recognition

Journal Section

Research Article

Publication Date

June 30, 2024

Submission Date

January 19, 2024

Acceptance Date

February 12, 2024

Published in Issue

Year 2024 Number: 057

APA
Akgün, K., & Sadık, Ş. A. (2024). Unified voice analysis: speaker recognition, age group and gender estimation using spectral features and machine learning classifiers. Journal of Scientific Reports-A, 057, 12-26. https://doi.org/10.59313/jsr-a.1422792