Research Article

Arnavutça Konuşma Verilerini Kullanan Derin Öğrenme Tabanlı Duygu Durum Analizi ve Sınıflandırma
(Deep Learning Based Emotional State Analysis and Classification Using Albanian Speech Data)

Year 2024, Volume: 7, Issue: 2, 30-40, 26.12.2024

Abstract

Nowadays, interactive voice response systems can be built using deep learning-based software that analyzes a speaker's emotional state from speech or audio data. In our study, an Albanian speech dataset was created. Spectral and emotional analysis of the audio data was performed using various deep learning models. The dataset contains Albanian speech data covering four emotion classes (angry, happy, sad, surprised). A convolutional neural network (CNN) model was used for classification. The developed classification system was also tested on other datasets to verify its accuracy. According to the experimental results, the Albanian emotion classification performance, measured as the area under the receiver operating characteristic curve (AUC-ROC), was 0.76 for the angry class, 1.00 for the happy class, 1.00 for the sad class, and 0.93 for the surprised class. Scientific findings and a discussion of the study are also presented.
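
The abstract describes a pipeline of spectral feature extraction followed by CNN classification into four emotion classes, evaluated with per-class AUC-ROC. Below is a minimal, hypothetical sketch of such a pipeline in Python using librosa, TensorFlow/Keras, and scikit-learn (tools cited in the article's references); the class names, sampling rate, spectrogram size, network layout, and data-loading helper are illustrative assumptions, not the authors' exact configuration.

```python
# Hypothetical sketch: log-mel spectrogram front end, small CNN with a
# four-way softmax, and per-class one-vs-rest ROC AUC. Class names and
# audio/feature settings are illustrative assumptions.
import numpy as np
import librosa
import tensorflow as tf
from sklearn.metrics import roc_auc_score

CLASSES = ["angry", "happy", "sad", "surprised"]   # assumed label order
SR, N_MELS, N_FRAMES = 16000, 128, 128             # assumed audio/feature settings


def log_mel(path: str) -> np.ndarray:
    """Load one clip and return a fixed-size log-mel spectrogram (n_mels, frames, 1)."""
    y, _ = librosa.load(path, sr=SR)
    spec = librosa.feature.melspectrogram(y=y, sr=SR, n_mels=N_MELS)
    spec = librosa.power_to_db(spec, ref=np.max)
    # Pad or truncate along the time axis so every clip has the same shape.
    if spec.shape[1] < N_FRAMES:
        spec = np.pad(spec, ((0, 0), (0, N_FRAMES - spec.shape[1])))
    return spec[:, :N_FRAMES, np.newaxis]


def build_cnn() -> tf.keras.Model:
    """A small CNN that treats the spectrogram as a single-channel image."""
    inputs = tf.keras.Input(shape=(N_MELS, N_FRAMES, 1))
    x = tf.keras.layers.Conv2D(16, 3, activation="relu")(inputs)
    x = tf.keras.layers.MaxPooling2D()(x)
    x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
    x = tf.keras.layers.MaxPooling2D()(x)
    x = tf.keras.layers.Conv2D(64, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    outputs = tf.keras.layers.Dense(len(CLASSES), activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def per_class_auc(y_true: np.ndarray, y_prob: np.ndarray) -> dict:
    """One-vs-rest ROC AUC for each emotion class, as reported in the abstract."""
    return {name: roc_auc_score((y_true == k).astype(int), y_prob[:, k])
            for k, name in enumerate(CLASSES)}


# Usage sketch (the loader below is hypothetical; X holds stacked spectrograms,
# y holds integer labels indexing into CLASSES):
# X_train, y_train, X_test, y_test = load_albanian_speech_dataset(...)
# model = build_cnn()
# model.fit(X_train, y_train, validation_split=0.1, epochs=30, batch_size=16)
# print(per_class_auc(y_test, model.predict(X_test)))
```

Treating the log-mel spectrogram as a single-channel image lets an off-the-shelf 2D CNN capture both spectral and temporal cues, and reporting one-vs-rest AUC per class matches the way the results above are presented.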

Ethics Statement

The participants who contributed to the creation of our dataset took part in the experiments on a voluntary basis. The data in this study have not previously been published anywhere, nor have they been submitted to any venue for publication.

Acknowledgments

We thank the valued participants who voluntarily contributed to the creation of our dataset.

Details

Primary Language: Turkish
Subjects: Deep Learning, Machine Learning (Other)
Section: Articles
Authors

Bahadir Karasulu 0000-0001-8524-874X

Elif Avcı 0009-0007-3553-2858

Tesnim Strazimiri 0009-0005-0252-7345

Betül Cengiz 0009-0000-9309-7208

Publication Date: 26 December 2024
Submission Date: 4 October 2024
Acceptance Date: 8 November 2024
Published in Issue: Year 2024, Volume: 7, Issue: 2

How to Cite

APA: Karasulu, B., Avcı, E., Strazimiri, T., & Cengiz, B. (2024). Arnavutça Konuşma Verilerini Kullanan Derin Öğrenme Tabanlı Duygu Durum Analizi ve Sınıflandırma. Veri Bilimi, 7(2), 30-40.

Indexes Covering the Journal


  • Academic Resource Index (journalseeker.researchbib.com)
  • Google Scholar
  • ASOS Index
  • Rooting Index (www.rootindexing.com)
  • The JournalTOCs Index (www.journaltocs.ac.uk)
  • General Impact Factor (GIF) Index (generalif.com)
  • Directory of Research Journals Indexing (olddrji.lbp.world/indexedJournals.aspx)
  • I2OR Index (http://www.i2or.com/8.html)