Research Article

Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset

Volume: 12, Issue: 1, 1 March 2024

Abstract

Emotion recognition from multimodal data is widely adopted because it can improve the quality of human interaction and benefits a range of applications. We use the Multimodal EmotionLines Dataset (MELD) and propose a novel method for multimodal emotion recognition that combines a bi-lateral gradient graph neural network (Bi-LG-GCN) with feature extraction and pre-processing. The dataset provides fine-grained emotion labels for the textual, audio, and visual modalities. This work aims to identify affective states concealed in the textual and audio data for emotion recognition and sentiment analysis. Pre-processing improves the quality and consistency of the data and thereby the usefulness of the dataset; it includes noise removal, normalization, and linguistic processing to handle linguistic variation and background noise in the discourse. Kernel Principal Component Analysis (K-PCA) is employed for feature extraction, deriving informative attributes from each modality, and the labels are encoded as array values. We propose a Bi-LG-GCN-based architecture tailored to multimodal emotion recognition that fuses data from the different modalities. The pre-processed, feature-extracted representation of each modality is fed to a generator network, which produces realistic synthetic samples that capture multimodal relationships. These synthetic samples are passed to a discriminator network trained to distinguish genuine from synthetic data. In this way, the model learns discriminative features for emotion recognition and can make accurate predictions regarding subsequent emotional states.
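The normalization, K-PCA feature extraction, and label encoding steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the RBF kernel, the gamma value, the number of components, the random toy features, and the helper names standardize, rbf_kernel_pca, and encode_labels are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def standardize(X):
    """Z-score each feature column (one possible normalization step)."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (X - X.mean(axis=0)) / std


def rbf_kernel_pca(X, n_components=2, gamma=0.1):
    """Project the rows of X onto the leading kernel principal components.

    Assumption: an RBF kernel; the abstract does not name the kernel used.
    """
    # Pairwise squared Euclidean distances, then the RBF kernel matrix.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))

    # Center the kernel matrix in feature space.
    n = K.shape[0]
    ones = np.full((n, n), 1.0 / n)
    K_centered = K - ones @ K - K @ ones + ones @ K @ ones

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[top], eigvecs[:, top]

    # Training-set projections are eigenvectors scaled by sqrt(eigenvalue).
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))


def encode_labels(labels):
    """Map string labels to integer codes in sorted class order."""
    classes = sorted(set(labels))
    index = {name: code for code, name in enumerate(classes)}
    return np.array([index[name] for name in labels]), classes


# Toy stand-in for one modality: 6 utterances with 10 features each.
rng = np.random.default_rng(0)
audio_features = rng.normal(size=(6, 10))
labels = ["joy", "neutral", "anger", "joy", "sadness", "neutral"]

Z = rbf_kernel_pca(standardize(audio_features), n_components=2, gamma=0.1)
y, classes = encode_labels(labels)
```

Here Z holds one reduced feature vector per utterance and y holds the matching integer labels. In a multimodal setting the same reduction would presumably be applied to each modality separately before fusion.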
Evaluated on the MELD dataset, our method achieved an accuracy of 80%, an F1-score of 81%, a precision of 81%, and a recall of 81%. The pre-processing and feature extraction steps improve the quality and discriminability of the input representation. Our Bi-LG-GCN-based approach, which incorporates multimodal data synthesis, outperforms contemporary techniques, demonstrating its practical utility.
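For readers who want to see how the four reported metrics relate to one another, the sketch below computes them from predicted and true emotion labels. It assumes macro averaging over classes, which the abstract does not state (weighted averaging is another common choice for MELD), and the helper name macro_metrics and the toy labels are hypothetical.

```python
def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro averaging; the abstract does not say which
    averaging scheme produced the reported scores.
    """
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1_scores = [], [], []

    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n = len(classes)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Toy labels for illustration only; these are not results from the paper.
y_true = ["joy", "joy", "neutral", "anger", "neutral", "anger"]
y_pred = ["joy", "neutral", "neutral", "anger", "neutral", "joy"]
scores = macro_metrics(y_true, y_pred)
```

Because precision, recall, and F1 are averaged over classes while accuracy is computed over all samples, the F1-score can differ from accuracy, as it does in the reported 80% versus 81%.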

Details

Primary Language

English

Subjects

Computer Software; Software Testing, Verification and Validation

Section

Research Article

Publication Date

1 March 2024

Submission Date

6 October 2023

Acceptance Date

16 October 2023

Published in Issue

Year 2024, Volume: 12, Issue: 1

Cite

APA
Alsaadawı, H. F. T., & Daş, R. (2024). Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset. Balkan Journal of Electrical and Computer Engineering, 12(1), 36-46. https://doi.org/10.17694/bajece.1372107
AMA
1. Alsaadawı HFT, Daş R. Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset. Balkan Journal of Electrical and Computer Engineering. 2024;12(1):36-46. doi:10.17694/bajece.1372107
Chicago
Alsaadawı, Hussein Farooq Tayeb, and Resul Daş. 2024. “Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset”. Balkan Journal of Electrical and Computer Engineering 12 (1): 36-46. https://doi.org/10.17694/bajece.1372107.
EndNote
Alsaadawı HFT, Daş R (1 March 2024) Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset. Balkan Journal of Electrical and Computer Engineering 12 1 36–46.
IEEE
[1] H. F. T. Alsaadawı and R. Daş, “Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset”, Balkan Journal of Electrical and Computer Engineering, vol. 12, no. 1, pp. 36–46, Mar. 2024, doi: 10.17694/bajece.1372107.
ISNAD
Alsaadawı, Hussein Farooq Tayeb - Daş, Resul. “Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset”. Balkan Journal of Electrical and Computer Engineering 12/1 (1 March 2024): 36-46. https://doi.org/10.17694/bajece.1372107.
JAMA
1. Alsaadawı HFT, Daş R. Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset. Balkan Journal of Electrical and Computer Engineering. 2024;12:36–46.
MLA
Alsaadawı, Hussein Farooq Tayeb, and Resul Daş. “Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset”. Balkan Journal of Electrical and Computer Engineering, vol. 12, no. 1, March 2024, pp. 36-46, doi:10.17694/bajece.1372107.
Vancouver
1. Hussein Farooq Tayeb Alsaadawı, Resul Daş. Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset. Balkan Journal of Electrical and Computer Engineering. 1 March 2024;12(1):36-46. doi:10.17694/bajece.1372107

All articles published by BAJECE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit and adapt the work provided the original work and source are appropriately cited.