Research Article
Year 2022, Volume: 3, Issue: 2, 51 - 57, 28.12.2022
https://doi.org/10.55195/jscai.1214312


CREMA-D: Improving Accuracy with BPSO-Based Feature Selection for Emotion Recognition Using Speech


Abstract

People communicate mostly through speech and facial expressions, and their feelings and thoughts are reflected in both. This phenomenon is an important tool that lets people empathize with one another when communicating. Today, human emotions can be recognized automatically with the help of artificial intelligence systems. Automatic emotion recognition can increase productivity in many areas of human-computer interaction, including virtual reality, psychology, and behavior modeling. In this study, we propose a method for improving the accuracy of emotion recognition from speech data. In this method, new features are extracted with convolutional neural networks from the MFCC coefficient matrices of the speech recordings in the CREMA-D dataset. Binary particle swarm optimization (BPSO) is then applied to the extracted features, and accuracy is increased by selecting the features that matter most for speech emotion classification; the 64 features used for each recording were thereby reduced to 33. In the test results, accuracies of 62.86% with CNN, 63.93% with SVM, and 66.01% with CNN+BPSO+SVM were obtained.
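The selection step described above rests on binary particle swarm optimization, where each particle is a 0/1 mask over the feature vector. The following pure-Python sketch is illustrative only, not the paper's implementation: the names (`bpso_select`, `toy_fitness`) and all parameter values are assumptions, and a toy separable fitness stands in for the SVM validation accuracy the paper would use.

```python
import math
import random

def bpso_select(n_features, fitness, n_particles=20, n_iters=50,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over features."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = pbest_fit.index(max(pbest_fit))
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Sigmoid transfer: velocity -> probability that the bit is 1.
                s = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < s else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Toy fitness: reward selecting the first 8 ("informative") features and
# penalize the rest -- a stand-in for classifier validation accuracy.
def toy_fitness(mask):
    return sum(mask[:8]) - sum(mask[8:])

mask, fit = bpso_select(16, toy_fitness)
```

In the paper's setting the mask would gate the 64 CNN-derived features and the fitness would be the SVM's validation accuracy on the masked features, which is how the 64 features were reduced to 33.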

References

  • M. Bojanić, V. Delić, and A. Karpov, “Call Redistribution for a Call Center Based on Speech Emotion Recognition,” Applied Sciences, vol. 10, no. 13, p. 4653, Jul. 2020, doi: 10.3390/APP10134653.
  • A. S. S. Kyi and K. Z. Lin, “Detecting Voice Features for Criminal Case,” 2019 International Conference on Advanced Information Technologies, ICAIT 2019, pp. 212–216, Nov. 2019, doi: 10.1109/AITC.2019.8921212.
  • M. Zielonka, A. Piastowski, A. Czyżewski, P. Nadachowski, M. Operlejn, and K. Kaczor, “Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets,” Electronics (Switzerland), vol. 11, no. 22, Nov. 2022.
  • R. Shankar, A. H. Kenfack, A. Somayazulu, and A. Venkataraman, “A Comparative Study of Data Augmentation Techniques for Deep Learning Based Emotion Recognition,” Nov. 2022, doi: 10.48550/arxiv.2211.05047.
  • K. Donuk and D. Hanbay, “Konuşma Duygu Tanıma için Akustik Özelliklere Dayalı LSTM Tabanlı Bir Yaklaşım,” Computer Science, vol. 7, no. 2, pp. 54–67, 2022, doi: 10.53070/bbd.1113379.
  • Y. B. Singh and S. Goel, “A systematic literature review of speech emotion recognition approaches,” Neurocomputing, vol. 492, pp. 245–263, Jul. 2022, doi: 10.1016/J.NEUCOM.2022.04.028.
  • H. Cao, D. G. Cooper, M. K. Keutmann, R. C. Gur, A. Nenkova, and R. Verma, “CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset,” IEEE Trans Affect Comput, vol. 5, no. 4, p. 377, Oct. 2014, doi: 10.1109/TAFFC.2014.2336244.
  • Ö. F. Öztürk and E. Pashaei, “Konuşmalardaki duygunun evrişimsel LSTM modeli ile tespiti,” Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi, vol. 12, no. 4, pp. 581–589, Sep. 2021, doi: 10.24012/DUMF.1001914.
  • “librosa — librosa 0.9.2 documentation.” https://librosa.org/doc/latest/index.html (accessed Dec. 01, 2022).
  • S. S. Stevens, J. Volkmann, and E. B. Newman, “A Scale for the Measurement of the Psychological Magnitude Pitch,” J Acoust Soc Am, vol. 8, no. 3, pp. 185–190, Jan. 1937, doi: 10.1121/1.1915893.
  • Q. Chen and G. Huang, “A novel dual attention-based BLSTM with hybrid features in speech emotion recognition,” Eng Appl Artif Intell, vol. 102, p. 104277, Jun. 2021, doi: 10.1016/J.ENGAPPAI.2021.104277.
  • J. Kennedy and R. Eberhart, “Particle swarm optimization,” Proceedings of ICNN’95 - International Conference on Neural Networks, vol. 4, pp. 1942–1948, 1995, doi: 10.1109/ICNN.1995.488968.
  • K. Donuk, N. Özbey, M. Inan, C. Yeroǧlu, and D. Hanbay, “Investigation of PIDA Controller Parameters via PSO Algorithm,” 2018 International Conference on Artificial Intelligence and Data Processing, IDAP 2018, Jan. 2019, doi: 10.1109/IDAP.2018.8620871.
  • F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, vol. 12, no. 85, pp. 2825–2830, 2011, Accessed: Dec. 02, 2022. [Online]. Available: http://jmlr.org/papers/v12/pedregosa11a.html
  • K. Donuk et al., “Deep Feature Selection for Facial Emotion Recognition Based on BPSO and SVM,” Politeknik Dergisi, pp. 1–1, Dec. 2022, doi: 10.2339/POLITEKNIK.992720.
There are 15 references in total.

Details

Primary Language: English
Subjects: Artificial Intelligence
Section: Research Articles
Authors

Kenan Donuk 0000-0002-7421-5587

Publication Date: December 28, 2022
Submission Date: December 4, 2022
Published Issue: Year 2022, Volume: 3, Issue: 2

Cite

APA Donuk, K. (2022). CREMA-D: Improving Accuracy with BPSO-Based Feature Selection for Emotion Recognition Using Speech. Journal of Soft Computing and Artificial Intelligence, 3(2), 51-57. https://doi.org/10.55195/jscai.1214312
AMA Donuk K. CREMA-D: Improving Accuracy with BPSO-Based Feature Selection for Emotion Recognition Using Speech. JSCAI. December 2022;3(2):51-57. doi:10.55195/jscai.1214312
Chicago Donuk, Kenan. “CREMA-D: Improving Accuracy With BPSO-Based Feature Selection for Emotion Recognition Using Speech”. Journal of Soft Computing and Artificial Intelligence 3, no. 2 (December 2022): 51-57. https://doi.org/10.55195/jscai.1214312.
EndNote Donuk K (December 01, 2022) CREMA-D: Improving Accuracy with BPSO-Based Feature Selection for Emotion Recognition Using Speech. Journal of Soft Computing and Artificial Intelligence 3 2 51–57.
IEEE K. Donuk, “CREMA-D: Improving Accuracy with BPSO-Based Feature Selection for Emotion Recognition Using Speech”, JSCAI, vol. 3, no. 2, pp. 51–57, 2022, doi: 10.55195/jscai.1214312.
ISNAD Donuk, Kenan. “CREMA-D: Improving Accuracy With BPSO-Based Feature Selection for Emotion Recognition Using Speech”. Journal of Soft Computing and Artificial Intelligence 3/2 (December 2022), 51-57. https://doi.org/10.55195/jscai.1214312.
JAMA Donuk K. CREMA-D: Improving Accuracy with BPSO-Based Feature Selection for Emotion Recognition Using Speech. JSCAI. 2022;3:51–57.
MLA Donuk, Kenan. “CREMA-D: Improving Accuracy With BPSO-Based Feature Selection for Emotion Recognition Using Speech”. Journal of Soft Computing and Artificial Intelligence, vol. 3, no. 2, 2022, pp. 51-57, doi:10.55195/jscai.1214312.
Vancouver Donuk K. CREMA-D: Improving Accuracy with BPSO-Based Feature Selection for Emotion Recognition Using Speech. JSCAI. 2022;3(2):51-7.