EMOTION RECOGNITION STUDY BETWEEN DIFFERENT DATA SETS
Year 2014, Volume: 16, Issue: 48, pp. 21-29, 01.09.2014
Cevahir Parlak, Banu Diri
Abstract
Speech is the most important communication tool between people. Through speech, people can convey not only their thoughts but also their feelings to each other. From speech we can also estimate the thoughts, emotions, gender, and age of the person we are talking to. In this study, a new emotion data set called EmoSTAR is presented, and cross tests with the Berlin emotion data set are performed. In a cross test, one data set is used as the training set while the other is used as the test set. The study also examines the performance of feature selectors. For feature extraction, the number of MFCCs in the openSMILE Emobase and Emo_large configurations is increased from 12 to 24, and the configurations are further extended with Harmonic-to-Noise Ratio (HNR) features. Feature selection and classification are performed with the Weka tool. EmoSTAR is currently under development to include more emotion types and samples.
References
- Batliner A., Steidl S., Schuller B., Seppi D., Vogt T., Wagner J., Devillers L., Vidrascu L., Aharonson V., Kessous L. and Amir N. (2009): "Whodunnit - Searching for the Most Important Feature Types Signalling Emotion-Related User States in Speech: Appendix", preprint submitted to Elsevier, 24 January 2010, Computer Speech and Language, doi:10.1016/j.csl.2009.12.003.
- Bhargava M., Polzehl T. (2012): "Improving Automatic Emotion Recognition From Speech Using Rhythm and Temporal Feature", Proceedings of ICECIT-2012, Elsevier, p. 139.
- Black A. W., Bunnell H. T., Dou Y., Kumar P., Metze F., Perry D., Polzehl T., Prahallad K., Steidl S. and Vaughn C. (2011): "New Parameterization for Emotional Speech Synthesis", CLSP Proc., Johns Hopkins Summer Workshop, Baltimore.
- Chavhan Y., Dhore M. L., Yesaware P. (2010): "Speech Emotion Recognition Using Support Vector Machine", International Journal of Computer Applications (0975-8887), Vol. 1, No. 20.
- Eyben F., Wöllmer M., Schuller B. (2009): "openSMILE - The Munich Versatile and Fast Open-Source Audio Feature Extractor", in Proc. ACM Multimedia (MM), Florence, Italy, ISBN 978-1-60558-933-6, pp. 1459-1462, doi:10.1145/1873951.1874246.
- Eyben F., Batliner A., Schuller B., Seppi D., Steidl S. (2010): "Cross-Corpus Classification of Realistic Emotions - Some Pilot Experiments", in Proc. 7th Intern. Conf. on Language Resources and Evaluation (LREC 2010), Valletta, ELRA, 19-21.05.2010.
- Hall M., Frank E., Holmes G., Pfahringer B., Reutemann P., Witten I. H. (2009): "The Weka Data Mining Software: An Update", SIGKDD Explor. Newsl., Vol. 11, pp. 10-18.
- He L. (2010): "Stress and Emotion Recognition in Natural Speech in the Work and Family Environments", PhD thesis, RMIT University.
- Iida A., Campbell N., Higuchi F., Yasumura M. (2003): "A Corpus-Based Speech Synthesis System with Emotion", Speech Communication, Vol. 40, pp. 161-187.
- Mairesse F., Polifroni J., Di Fabbrizio G. (2012): "Can Prosody Inform Sentiment Analysis? Experiments on Short Spoken Reviews", Nokia Research, ICASSP.
- Oflazoğlu C., Yildirim S. (2013): "Recognizing Emotion from Turkish Speech Using Acoustic Features", EURASIP Journal on Audio, Speech, and Music Processing.
- Pan Y., Shen P., Shen L. (2012): "Speech Emotion Recognition Using Support Vector Machine", International Journal of Smart Home, Vol. 6, No. 2.
- Ramakrishnan S. (2012): "Recognition of Emotion from Speech: A Review", International Journal of Speech Technology, Vol. 15, No. 2, pp. 99-117.
- Schuller B., Batliner A., Seppi D., Steidl S., Vogt T., Wagner J., Devillers L., Vidrascu L., Amir N., Kessous L., Aharonson V. (2007): "The Relevance of Feature Type for the Automatic Recognition of Emotional User States: Low Level Descriptors and Functionals", Interspeech, Antwerp.
- Schuller B., Vlasenko B., Eyben F., Rigoll G., Wendemuth A. (2009): "Acoustic Emotion Recognition: A Benchmark Comparison of Performances", IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU 2009 Proc., Merano.
- Schuller B., Vlasenko B., Eyben F., Wöllmer M., Stuhlsatz A., Wendemuth A., Rigoll G. (2010): "Cross-Corpus Acoustic Emotion Recognition: Variances and Strategies", IEEE Transactions on Affective Computing, Vol. 1, No. 2, pp. 119-131.
- Schuller B., Zhang Z., Weninger F., Rigoll G. (2011): "Selecting Training Data for Cross-Corpus Speech Emotion Recognition: Prototypicality vs. Generalization", Speech Processing Conference, AVIOS Proc., Tel Aviv.
- Shahzadi A., Ahmadyfard A., Yaghmaie K., Harimi A. (2013): "Recognition of Emotion in Speech Using Spectral Patterns", Malaysian Journal of Computer Science, Vol. 26, No. 2, pp. 143-158.
- Wu S., Falk T. H., Chan W. (2010): "Automatic Speech Emotion Recognition Using Modulation Spectral Features", Speech Communication, doi:10.1016/j.specom.2010.08.013.
- Wu D., Parsons T. D., Narayanan S. S. (2011): "Acoustic Feature Analysis in Speech Emotion Primitives Estimation", Interspeech 2011, 26-30, Makuhari, Chiba, Japan.
- Zhang Z., Weninger F., Wöllmer M., Schuller B. (2011): "Unsupervised Learning in Cross-Corpus Acoustic Emotion Recognition", IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU Proc., Waikoloa, Hawaii.