PERFORMING ACCURATE SPEAKER RECOGNITION BY USE OF SVM AND CEPSTRAL FEATURES
Abstract
Speaker recognition from voice recordings is an active research area for which many applications have been proposed. In this study, speaker recognition is performed on cepstral features extracted from raw voice recordings. Several of the most prominent cepstral feature extraction methods, namely LPC, LPCC, MFCC, PLP and RASTA-PLP, are used, and their contribution to the performance of the applied method is investigated. The extracted features are then classified with a support vector machine (SVM) to complete the speaker recognition task. The results show that cepstral feature extraction methods such as LPCC and MFCC, combined with SVM classification, achieve approximately 97% accuracy.
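As an illustration of the feature extraction stage, the following is a minimal sketch of how LPC and LPCC vectors could be computed for a single frame. It is not the implementation used in the study. The function names, the frame length, the prediction order and the synthetic test signal are all assumptions chosen for the example. The sketch uses the autocorrelation method with the Levinson-Durbin recursion for LPC, followed by the standard LPC-to-cepstrum recursion, and it omits the gain term c_0.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err), where a[k-1] is the predictor coefficient a_k in
    x[n] ~ sum_k a_k * x[n-k], and err is the final prediction error.
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:               # degenerate frame (e.g. silence)
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def lpc_to_lpcc(a, n_ceps):
    """Convert predictor coefficients a_1..a_p to cepstral coefficients c_1..c_n."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)         # c[0] (gain term) is omitted
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


if __name__ == "__main__":
    # Hypothetical input: a noisy sinusoid standing in for one speech frame.
    rng = random.Random(0)
    length, order = 240, 10
    frame = [math.sin(0.3 * n) + 0.1 * rng.gauss(0.0, 1.0) for n in range(length)]
    # Hamming window to reduce edge effects before the autocorrelation.
    frame = [x * (0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1)))
             for n, x in enumerate(frame)]
    lpc, _ = levinson_durbin(autocorrelation(frame, order), order)
    print(lpc_to_lpcc(lpc, 12))
```

In a complete pipeline, the per-frame vectors would typically be aggregated per utterance and passed to an SVM implementation. scikit-learn's `SVC` is one possible choice, but the source does not name a library.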
Details
Primary Language
English
Journal Section
Research Article
Publication Date
January 1, 2019
Submission Date
May 16, 2018
Acceptance Date
July 18, 2018
Published in Issue
Year 2018 Volume: 3 Number: 2