Research Article

Speech Emotion Classification and Recognition with different methods for Turkish Language

Year 2018, Volume: 6 Issue: 2, 122 - 128, 30.04.2018
https://doi.org/10.17694/bajece.419557

Abstract

Emotion recognition from the speech signal has been a research topic in several application areas for many years, and many systems have been developed to determine emotions from the speech signal. To solve the speaker emotion recognition problem, a hybrid model is proposed to classify five speech emotions: anger, sadness, fear, happiness, and neutral. The aim of this study was to realize an automatic voice and speech emotion recognition system using a hybrid model that takes Turkish sound forms and properties into consideration. Approximately 3000 Turkish voice samples of words and clauses with differing lengths were collected from 25 males and 25 females, forming an authentic and unique Turkish database. Features of these voice samples were obtained using Mel Frequency Cepstral Coefficients (MFCC) and Mel Frequency Discrete Wavelet Coefficients (MFDWC), and the spectral features of the samples were classified with a Support Vector Machine (SVM). The resulting feature vectors were trained with the Gaussian Mixture Model (GMM), Artificial Neural Network (ANN), Dynamic Time Warping (DTW), Hidden Markov Model (HMM), and a hybrid model combining GMM with SVM. In the first stage of the hybrid model, the SVM was applied to subsets of the obtained spectral feature vectors; in the second stage, training and test sets were formed from these spectral features. In the test phase, the emotion class of a given voice sample was identified by taking the trained voice samples into consideration. The results and performance of the classification algorithms employed in the study are also presented in a comparative manner.
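The two-stage hybrid described above (GMM-derived representations of spectral features classified by an SVM) is closely related to the GMM-supervector approach of [15]. Below is a minimal sketch of such a pipeline, assuming librosa for MFCC extraction and scikit-learn for the GMM and SVM; the file paths, sample rate, mixture count, and adaptation scheme are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal GMM-supervector + SVM sketch (illustrative, not the paper's code).
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

def mfcc_features(wav_path, n_mfcc=13):
    """Extract an MFCC matrix (frames x coefficients) from one utterance."""
    signal, sr = librosa.load(wav_path, sr=16000)  # 16 kHz is an assumption
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.T  # one row per analysis frame

def gmm_supervector(frames, ubm):
    """Re-fit a background GMM on one utterance and stack its means.

    Stacking the adapted means yields a fixed-length vector regardless of
    utterance duration, which is what lets a standard SVM operate on
    variable-length speech.
    """
    gmm = GaussianMixture(n_components=ubm.n_components,
                          covariance_type="diag",
                          means_init=ubm.means_,
                          max_iter=5)  # light adaptation toward the utterance
    gmm.fit(frames)
    return gmm.means_.ravel()

# --- training (paths and labels are placeholders) --------------------------
train_files = [("samples/anger_001.wav", "anger"),
               ("samples/neutral_001.wav", "neutral")]  # ... ~3000 samples

# Background model fit on all training frames pooled together.
all_frames = np.vstack([mfcc_features(f) for f, _ in train_files])
ubm = GaussianMixture(n_components=16, covariance_type="diag").fit(all_frames)

X = np.array([gmm_supervector(mfcc_features(f), ubm) for f, _ in train_files])
y = [label for _, label in train_files]
svm = SVC(kernel="linear").fit(X, y)

# --- test phase: classify the emotion of an unseen utterance ---------------
probe = gmm_supervector(mfcc_features("samples/unknown.wav"), ubm)
print(svm.predict([probe])[0])
```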
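Among the compared baselines, DTW is a template-matching method: a test utterance is assigned the emotion of the nearest training template under the DTW distance. A minimal sketch of that distance (the standard dynamic-programming formulation, not the authors' code) over two MFCC sequences:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two feature sequences.

    a, b: arrays of shape (frames, coefficients), e.g. the MFCC matrices
    returned by mfcc_features() above. Uses Euclidean frame distances.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # extend the cheapest of the three allowed warping moves
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]
```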

References

[1] Mohammad Shami and Werner Verhelst, "An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech", Speech Communication, 2007, 49(3), p.201-212.
[2] Lijiang Chen, Xia Mao, Yuli Xue and Lee Lung Cheng, "Speech emotion recognition: Features and classification models", Digital Signal Processing, 2012, 22(6), p.1154-1160.
[3] Ling He, Margaret Lech, Namunu C. Maddage and Nicholas B. Allen, "Study of empirical mode decomposition and spectral analysis for stress and emotion classification in natural speech", Biomedical Signal Processing and Control, 2011, 6(2), p.139-146.
[4] Tim Polzehl, Shiva Sundaram, Hamed Ketabdar, Michael Wagner and Florian Metze, "Emotion Classification in Children's Speech Using Fusion of Acoustic and Linguistic Features", Interspeech 2009: 10th Annual Conference of the International Speech Communication Association, 2009.
[5] Tin Lay Nwe, Foo Say Wei and Liyanage C. De Silva, "Speech Based Emotion Classification", TENCON 2001: Proceedings of the IEEE Region 10 International Conference on Electrical and Electronic Technology, 2001.
[6] Jasmine Bhaskar, Sruthi K and Prema Nedungadi, "Hybrid Approach for Emotion Classification of Audio Conversation Based on Text and Speech Mining", Procedia Computer Science, 2015, 46, p.635-643.
[7] Jinkyu Lee and Ivan Tashev, "High-level Feature Representation using Recurrent Neural Network for Speech Emotion Recognition", Interspeech 2015, 2015.
[8] Il-Seok Oh and Ching Y. Suen, "A class-modular feedforward neural network for handwriting recognition", Pattern Recognition, 2002, 35(1), p.229-244.
[9] Dimitrios Ververidis and Constantine Kotropoulos, "Emotional speech recognition: Resources, features, and methods", Speech Communication, 2006, 48(9), p.1162-1181.
[10] D. A. Reynolds and R. C. Rose, "Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models", IEEE Transactions on Speech and Audio Processing, 1995, 3, p.72-83.
[11] Il-Seok Oh and Ching Y. Suen, "A class-modular feedforward neural network for handwriting recognition", Pattern Recognition, 2002, 35(1), p.229-244.
[12] Lihong Li, Dongqing Chen, Sarang Lakare et al., "Image segmentation approach to extract colon lumen through colonic material tagging and hidden Markov random field model for virtual colonoscopy", Medical Imaging, 2002.
[13] Edmondo Trentin and Marco Gori, "A survey of hybrid ANN/HMM models for automatic speech recognition", Neurocomputing, 2001, 37, p.91-126.
[14] Lindasalwa Muda and Mumtaj Begam, "Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques", Journal of Computing, 2010, 2(3), p.138-143, ISSN 2151-9617.
[15] Hao Hu, Ming-Xing Xu and Wei Wu, "GMM Supervector Based SVM with Spectral Features for Speech Emotion Recognition", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007.
[16] Cigdem Bakir, "Automatic Speaker Gender Identification for the German Language", Balkan Journal of Electrical & Computer Engineering, 2015, 4(2), p.79-83.
[17] Cigdem Bakir, "Automatic Voice and Speech Recognition System for the German Language", 1st International Conference on Engineering Technology and Applied Sciences, 2016, p.131-134.
[18] Lindasalwa Muda, Mumtaj Begam and I. Elamvazuthi, "Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques", Journal of Computing, 2010, 2(3), p.138-143, ISSN 2151-9617.
[19] M. Farhid and M. A. Tinati, "Robust Voice Conversion Systems Using MFDWC", 2008 International Symposium on Telecommunications, 2008, p.778-781.

Details

Primary Language English
Subjects Engineering
Journal Section Research Article
Authors

Cigdem Bakır

Mecit Yuzkat

Publication Date April 30, 2018
Published in Issue Year 2018 Volume: 6 Issue: 2

Cite

APA Bakır, C., & Yuzkat, M. (2018). Speech Emotion Classification and Recognition with different methods for Turkish Language. Balkan Journal of Electrical and Computer Engineering, 6(2), 122-128. https://doi.org/10.17694/bajece.419557

All articles published by BAJECE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided the original work and source are appropriately cited.