Research Article

The Impact of Pre-processing and Feature Selection Methods for Speech Emotion Recognition

Year 2019, Volume: 10, Issue: 1, pp. 99-112, 15.03.2019
https://doi.org/10.24012/dumf.498727

Abstract

Speech emotion recognition uses features obtained by applying digital signal processing to the digitized speech signal. The features extracted from speech can be treated as a single set, or grouped by dimension or structure. In this study, the effects of feature selection and pre-processing methods on emotion recognition were investigated. For this purpose, the EMO-DB data set and three different classifiers were used. According to the results obtained, the highest accuracy, 90.3%, was achieved with a multi-layer perceptron and high-pass filtering. Spectral features provided higher accuracy than prosodic features. In addition, females reflect their emotions in their voices more than males, and individuals aged 20-29 more than individuals aged 30-35. Among the pre-processing methods examined in the study, high-pass filtering increased classifier accuracy, whereas low-pass filtering, band-pass filtering, and noise reduction decreased it.
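A minimal sketch of the kind of pipeline the abstract describes is given below: high-pass filtering as a pre-processing step, mean MFCCs as a stand-in for the spectral feature group, and a multi-layer perceptron classifier. It is illustrative only, not the study's implementation; the cutoff frequency, MFCC count, hidden-layer size, and the `high_pass`/`spectral_features` helpers are assumptions, since the abstract does not give the exact feature set or filter parameters.

```python
# Illustrative sketch only: high-pass pre-processing + spectral (MFCC)
# features + multi-layer perceptron, loosely mirroring the pipeline in the
# abstract. Cutoff, feature count, and network size are assumed values.
import numpy as np
from scipy.signal import butter, filtfilt
import librosa
from sklearn.neural_network import MLPClassifier

def high_pass(signal, fs, cutoff_hz=100.0, order=4):
    """Zero-phase Butterworth high-pass filter applied to a speech signal."""
    b, a = butter(order, cutoff_hz, btype="highpass", fs=fs)
    return filtfilt(b, a, signal)

def spectral_features(signal, fs, n_mfcc=13):
    """Summarize a variable-length utterance as its mean MFCC vector."""
    mfcc = librosa.feature.mfcc(y=signal.astype(np.float32), sr=fs, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Hypothetical usage with EMO-DB-style utterances and emotion labels:
# X = np.array([spectral_features(high_pass(s, fs), fs) for s in utterances])
# clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X, labels)
```

Zero-phase filtering (`filtfilt`) is chosen here so the pre-processing step does not shift the signal in time before feature extraction; a causal filter would also work for this purpose.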



Details

Primary Language: Turkish
Section: Articles
Authors

Turgut Özseven (ORCID: 0000-0002-6325-461X)

Publication Date: 15 March 2019
Submission Date: 18 December 2018
Published Issue: Year 2019, Volume: 10, Issue: 1

How to Cite

IEEE T. Özseven, “Konuşma Tabanlı Duygu Tanımada Ön İşleme ve Öznitelik Seçim Yöntemlerinin Etkisi”, DÜMF MD, c. 10, sy. 1, ss. 99–112, 2019, doi: 10.24012/dumf.498727.
All articles published by DUJE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided the original work and source are appropriately credited.