Research Article

Automatic Speaker Gender Identification for the German Language

Year 2016, Volume: 4 Issue: 2, 79 - 83, 30.09.2016

Abstract

Authentication systems require biometric data to be transmitted, designed and classified in a secure manner. Moreover, in voice biometrics, successful results can be obtained by determining the gender of the speaker. The aim of this study was to design a system that automatically recognizes the gender of a speaker, taking German sound forms and properties into account. Approximately 2658 German voice samples of words and clauses of differing lengths were collected from 50 males and 50 females. Some of these voice samples contain more than one word but are treated as a single word. Features of the voice samples were extracted using Mel Frequency Cepstral Coefficients (MFCC). The resulting feature vectors were used for training with the Hidden Markov Model, Dynamic Time Warping and Artificial Neural Network methods. In the test phase, the gender of a given voice sample was identified by comparing it against the trained voice samples. The results and performance of the classification algorithms employed in the study are also presented comparatively.
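As an illustration of one of the methods named in the abstract, the following is a minimal sketch of Dynamic Time Warping used as a nearest-template classifier. It is not the author's implementation: the function names (`frame_distance`, `dtw_distance`, `classify_gender`) are hypothetical, the frame distance is assumed to be Euclidean, and the short two-dimensional sequences are made-up stand-ins for real MFCC feature vectors.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Classic O(n*m) dynamic time warping cost between two frame sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # skip a frame of seq_b
                                 cost[i][j - 1],      # skip a frame of seq_a
                                 cost[i - 1][j - 1])  # match both frames
    return cost[n][m]

def classify_gender(sample, templates):
    """Return the label of the labelled template with the lowest DTW cost."""
    return min(templates, key=lambda t: dtw_distance(sample, t[1]))[0]

# Toy stand-ins for MFCC sequences (invented numbers, not data from the study).
templates = [
    ("male", [[1.0, 0.0], [1.1, 0.1], [0.9, 0.0]]),
    ("female", [[3.0, 2.0], [3.1, 2.1], [2.9, 2.0], [3.0, 1.9]]),
]
query = [[2.8, 2.0], [3.0, 2.2], [3.1, 2.0]]
predicted = classify_gender(query, templates)
```

Because DTW aligns sequences of different lengths, the query does not need to have the same number of frames as the templates, which suits voice samples of differing lengths such as those described above.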


Details

Primary Language English
Subjects Engineering
Journal Section Research Article
Authors

Cigdem Bakır

Publication Date September 30, 2016
Published in Issue Year 2016 Volume: 4 Issue: 2

Cite

APA Bakır, C. (2016). Automatic Speaker Gender Identification for the German Language. Balkan Journal of Electrical and Computer Engineering, 4(2), 79-83.

All articles published by BAJECE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit and adapt the work, provided the original work and source are appropriately cited.