<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN"
        "https://jats.nlm.nih.gov/publishing/1.4/JATS-journalpublishing1-4.dtd">
<article article-type="research-article" dtd-version="1.4">
            <front>

                <journal-meta>
                                    <journal-id></journal-id>
            <journal-title-group>
                                                                                    <journal-title>Balkan Journal of Electrical and Computer Engineering</journal-title>
            </journal-title-group>
                            <issn pub-type="ppub">2147-284X</issn>
                                        <issn pub-type="epub">2147-284X</issn>
                                                                                            <publisher>
                    <publisher-name>MUSA YILMAZ</publisher-name>
                </publisher>
                    </journal-meta>
                <article-meta>
                                        <article-id pub-id-type="doi">10.17694/bajece.1372107</article-id>
                                                                <article-categories>
                                            <subj-group  xml:lang="en">
                                                            <subject>Computer Software</subject>
                                                            <subject>Software Testing, Verification and Validation</subject>
                                                    </subj-group>
                                            <subj-group  xml:lang="tr">
                                                            <subject>Bilgisayar Yazılımı</subject>
                                                            <subject>Yazılım Testi, Doğrulama ve Validasyon</subject>
                                                    </subj-group>
                                    </article-categories>
                                                                                                                                                        <title-group>
                                                                                                                                                            <article-title>Multimodal Emotion Recognition Using Bi-LG-GCN for MELD Dataset</article-title>
                                                                                                    </title-group>
            
                                                    <contrib-group content-type="authors">
                                                                        <contrib contrib-type="author">
                                                                    <contrib-id contrib-id-type="orcid">
                                        https://orcid.org/0009-0005-2559-8816</contrib-id>
                                                                <name>
                                    <surname>Alsaadawi</surname>
                                    <given-names>Hussein Farooq Tayeb</given-names>
                                </name>
                                                                    <aff>Firat University</aff>
                                                            </contrib>
                                                    <contrib contrib-type="author">
                                                                    <contrib-id contrib-id-type="orcid">
                                        https://orcid.org/0000-0002-6113-4649</contrib-id>
                                                                <name>
                                    <surname>Daş</surname>
                                    <given-names>Resul</given-names>
                                </name>
                                                                    <aff>Firat University, Technology Faculty, Department of Software Engineering</aff>
                                                            </contrib>
                                                                                </contrib-group>
                        
                                        <pub-date pub-type="pub" iso-8601-date="2024-03-01">
                    <day>01</day>
                    <month>03</month>
                    <year>2024</year>
                </pub-date>
                                        <volume>12</volume>
                                        <issue>1</issue>
                                        <fpage>36</fpage>
                                        <lpage>46</lpage>
                        
                        <history>
                                    <date date-type="received" iso-8601-date="20231006">
                        <day>10</day>
                        <month>06</month>
                        <year>2023</year>
                    </date>
                                                    <date date-type="accepted" iso-8601-date="20231016">
                        <day>10</day>
                        <month>16</month>
                        <year>2023</year>
                    </date>
                            </history>
                                        <permissions>
                    <copyright-statement>Copyright © 2013, Balkan Journal of Electrical and Computer Engineering</copyright-statement>
                    <copyright-year>2013</copyright-year>
                    <copyright-holder>Balkan Journal of Electrical and Computer Engineering</copyright-holder>
                </permissions>
            
                                                                                                                        <abstract><p>Emotion recognition from multimodal data is a widely adopted approach because of its potential to improve the quality of human interactions and to support a broad range of applications. We present a novel method for multimodal emotion recognition on the Multimodal Emotion Lines Dataset (MELD) that combines a bi-lateral gradient graph convolutional network (Bi-LG-GCN) with dedicated pre-processing and feature extraction. The dataset provides fine-grained emotion labels for the textual, audio, and visual modalities. This work aims to identify affective states concealed in the textual and audio data for emotion recognition and sentiment analysis. Pre-processing improves the quality and consistency of the data and thereby the dataset’s usefulness; it includes noise removal, normalization, and linguistic processing to handle linguistic variation and background noise in the discourse. Kernel Principal Component Analysis (K-PCA) is employed for feature extraction, deriving informative attributes from each modality, and labels are encoded for the resulting feature arrays. We propose a Bi-LG-GCN-based architecture explicitly tailored to multimodal emotion recognition that effectively fuses the modalities: the feature-extracted, pre-processed representation of each modality is fed to a generator network, which produces realistic synthetic data samples capturing multimodal relationships, and these synthetic samples serve as inputs to a discriminator network trained to distinguish genuine from synthetic data. With this approach, the model learns discriminative features for emotion recognition and makes accurate predictions of subsequent emotional states. Evaluated on the MELD dataset, our method achieves an accuracy of 80%, an F1-score of 81%, a precision of 81%, and a recall of 81%. The pre-processing and feature extraction steps improve the quality and discriminability of the input representations, and the Bi-LG-GCN-based approach with multimodal data synthesis outperforms contemporary techniques, demonstrating its practical utility.</p></abstract>
                                                            
            
                                                                                        <kwd-group>
                                                    <kwd>Bimodal emotion recognition</kwd>
                                                    <kwd>text and speech recognition</kwd>
                                                    <kwd>Multimodal Emotion Lines Dataset (MELD)</kwd>
                                                    <kwd>bi-lateral gradient graph convolutional network (Bi-LG-GCN)</kwd>
                                                    <kwd>Affective computing identification</kwd>
                                            </kwd-group>
                            
                                                                                                                                                    </article-meta>
    </front>
    <back>
                            <ref-list>
                                    <ref id="ref1">
                        <label>1</label>
                        <mixed-citation publication-type="journal">[1] P. Savci and B. Das, “Comparison of pre-trained language models in terms of carbon emissions, time and accuracy in multi-label text classification using AutoML,” Heliyon, vol. 9, no. 5, p. e15670, 2023-05-01. [Online]. Available: https://www.sciencedirect.com/science/ article/pii/S2405844023028773</mixed-citation>
                    </ref>
                                    <ref id="ref2">
                        <label>2</label>
                        <mixed-citation publication-type="journal">[2] M. Aydogan, “A hybrid deep neural network-based automated diagnosis system using x-ray images and clinical findings,” International Journalof Imaging Systems and Technology, vol. 33, no. 4, pp. 1368–1382, 2023, eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/ima.22856. [On-line]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/ima.</mixed-citation>
                    </ref>
                                    <ref id="ref3">
                        <label>3</label>
                        <mixed-citation publication-type="journal">[3] D. Dupr´e, E. G. Krumhuber, D. K¨uster, and G. J. McKeown, “A
performance comparison of eight commercially available automatic
classifiers for facial affect recognition,” PLOS ONE, vol. 15,
no. 4, p. e0231968, 2020, publisher: Public Library of Science.
[Online]. Available: https://journals.plos.org/plosone/article?id=10.1371/
journal.pone.0231968</mixed-citation>
                    </ref>
                                    <ref id="ref4">
                        <label>4</label>
                        <mixed-citation publication-type="journal">[4] E. Cameron and M. Green, Making Sense of Change Management:
A Complete Guide to the Models, Tools and Techniques of
Organizational Change. Kogan Page Publishers, 2019. [Online].
Available: https://www.example.com/your-book-url</mixed-citation>
                    </ref>
                                    <ref id="ref5">
                        <label>5</label>
                        <mixed-citation publication-type="journal">[5] W. Zehra, A. R. Javed, Z. Jalil, H. U. Khan, and T. R. Gadekallu,
“Cross corpus multi-lingual speech emotion recognition using ensemble
learning,” Complex &amp; Intelligent Systems, vol. 7, no. 4, pp. 1845–1854,
2021. [Online]. Available: https://doi.org/10.1007/s40747-020-00250-4</mixed-citation>
                    </ref>
                                    <ref id="ref6">
                        <label>6</label>
                        <mixed-citation publication-type="journal">[6] A survey of emotion recognition methods with emphasis on e-learning
environments | journal of network and computer applications. [Online].
Available: https://dl.acm.org/doi/10.1016/j.jnca.2019.102423</mixed-citation>
                    </ref>
                                    <ref id="ref7">
                        <label>7</label>
                        <mixed-citation publication-type="journal">[7] S. K. Yadav, K. Tiwari, H. M. Pandey, and S. A. Akbar, “A review of
multimodal human activity recognition with special emphasis on classification,
applications, challenges and future directions,” Knowledge-
Based Systems, vol. 223, p. 106970, 2021. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0950705121002331</mixed-citation>
                    </ref>
                                    <ref id="ref8">
                        <label>8</label>
                        <mixed-citation publication-type="journal">[8] R. Das and M. Soylu, “A key review on graph data science: The power of
graphs in scientific studies,” Chemometrics and Intelligent Laboratory
Systems, vol. 240, p. 104896, 2023-09-15. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0169743923001466</mixed-citation>
                    </ref>
                                    <ref id="ref9">
                        <label>9</label>
                        <mixed-citation publication-type="journal">[9] P. Savci and B. Das, “Prediction of the customers’ interests using
sentiment analysis in e-commerce data for comparison of arabic,
english, and turkish languages,” Journal of King Saud University -
Computer and Information Sciences, vol. 35, no. 3, pp. 227–237,
2023-03-01. [Online]. Available: https://www.sciencedirect.com/science/
article/pii/S131915782300054X</mixed-citation>
                    </ref>
                                    <ref id="ref10">
                        <label>10</label>
                        <mixed-citation publication-type="journal">[10] I. Pulatov, R. Oteniyazov, F. Makhmudov, and Y.-I. Cho, “Enhancing
speech emotion recognition using dual feature extraction encoders,”
Sensors, vol. 23, no. 14, p. 6640, 2023-01, number: 14 Publisher:
Multidisciplinary Digital Publishing Institute. [Online]. Available:
https://www.mdpi.com/1424-8220/23/14/6640</mixed-citation>
                    </ref>
                                    <ref id="ref11">
                        <label>11</label>
                        <mixed-citation publication-type="journal">[11] M. Egger, M. Ley, and S. Hanke, “Emotion recognition from
physiological signal analysis: A review,” Electronic Notes in Theoretical
Computer Science, vol. 343, pp. 35–55, 2019. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S157106611930009X</mixed-citation>
                    </ref>
                                    <ref id="ref12">
                        <label>12</label>
                        <mixed-citation publication-type="journal">[12] E. S. Salama, R. A. El-Khoribi, M. E. Shoman, and M. A. W. Shalaby,
“A 3d-convolutional neural network framework with ensemble learning
techniques for multi-modal emotion recognition,” Egyptian Informatics
Journal, vol. 22, no. 2, pp. 167–176, 2021. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S1110866520301389</mixed-citation>
                    </ref>
                                    <ref id="ref13">
                        <label>13</label>
                        <mixed-citation publication-type="journal">[13] C.-H. Wu and W.-B. Liang, “Emotion recognition of affective speech
based on multiple classifiers using acoustic-prosodic information and
semantic labels,” T. Affective Computing, vol. 2, pp. 10–21, 2011.
[Online]. Available: https://ieeexplore.ieee.org/document/5674019</mixed-citation>
                    </ref>
                                    <ref id="ref14">
                        <label>14</label>
                        <mixed-citation publication-type="journal">[14] M. Soylu, A. Soylu, and R. Das, “A new approach to recognizing
the use of attitude markers by authors of academic journal
articles,” Expert Systems with Applications, vol. 230, p. 120538,
2023-11. [Online]. Available: https://linkinghub.elsevier.com/retrieve/
pii/S0957417423010400</mixed-citation>
                    </ref>
                                    <ref id="ref15">
                        <label>15</label>
                        <mixed-citation publication-type="journal">[15] Speech emotion recognition with acoustic and lexical features. [Online].
Available: https://ieeexplore.ieee.org/abstract/document/7178872/</mixed-citation>
                    </ref>
                                    <ref id="ref16">
                        <label>16</label>
                        <mixed-citation publication-type="journal">[16] K. D. N. and A. Patil, “Multimodal emotion recognition using crossmodal
attention and 1d convolutional neural networks,” in Interspeech
2020. ISCA, 2020, pp. 4243–4247. [Online]. Available: https:
//www.isca-speech.org/archive/interspeech 2020/n20 interspeech.html</mixed-citation>
                    </ref>
                                    <ref id="ref17">
                        <label>17</label>
                        <mixed-citation publication-type="journal">[17] Y. Cimtay, E. Ekmekcioglu, and S. Caglar-Ozhan, “Cross-subject
multimodal emotion recognition based on hybrid fusion,” IEEE Access,
vol. 8, pp. 168 865–168 878, 2020, conference Name: IEEE Access.
[Online]. Available: https://ieeexplore.ieee.org/document/9195813</mixed-citation>
                    </ref>
                                    <ref id="ref18">
                        <label>18</label>
                        <mixed-citation publication-type="journal">[18] T. Dalgleish and M. Power, Handbook of Cognition and
Emotion. John Wiley &amp; Sons, 2000-11-21, google-Books-ID:
vsLvrhohXhAC. [Online]. Available: https://www.google.com.tr/books/
edition/Handbook of Cognition and Emotion/vsLvrhohXhAC?hl=en&amp;
gbpv=1&amp;dq=isbn:9780470842218&amp;printsec=frontcover&amp;pli=1</mixed-citation>
                    </ref>
                                    <ref id="ref19">
                        <label>19</label>
                        <mixed-citation publication-type="journal">[19] C. Guanghui and Z. Xiaoping, “Multi-modal emotion recognition by
fusing correlation features of speech-visual,” IEEE Signal Processing
Letters, vol. 28, pp. 533–537, 2021, conference Name: IEEE
Signal Processing Letters. [Online]. Available: https://ieeexplore.ieee.
org/document/9340264</mixed-citation>
                    </ref>
                                    <ref id="ref20">
                        <label>20</label>
                        <mixed-citation publication-type="journal">[20] S. K. Bharti, S. Varadhaganapathy, R. K. Gupta, P. K. Shukla, M. Bouye,
S. K. Hingaa, and A. Mahmoud, “Text-based emotion recognition usingdeep learning approach,” Computational Intelligence and Neuroscience,
vol. 2022, p. e2645381, 2022, publisher: Hindawi. [Online]. Available:
https://www.hindawi.com/journals/cin/2022/2645381/</mixed-citation>
                    </ref>
                                    <ref id="ref21">
                        <label>21</label>
                        <mixed-citation publication-type="journal">[21] Z. Lian, J. Tao, B. Liu, J. Huang, Z. Yang, and R. Li, “Context-dependent
domain adversarial neural network for multimodal emotion recognition.”
in Interspeech, 2020, pp. 394–398. [Online]. Available: https://www.
iscaspeech.org/archive/interspeech 2020/lian20b interspeech.html</mixed-citation>
                    </ref>
                                    <ref id="ref22">
                        <label>22</label>
                        <mixed-citation publication-type="journal">[22] D. Priyasad, T. Fernando, S. Denman, C. Fookes, and S. Sridharan,
“Attention driven fusion for multi-modal emotion recognition.” [Online].
Available: http://arxiv.org/abs/2009.10991</mixed-citation>
                    </ref>
                                    <ref id="ref23">
                        <label>23</label>
                        <mixed-citation publication-type="journal">[23] T. Mittal, U. Bhattacharya, R. Chandra, A. Bera, and D. Manocha,
“M3er: Multiplicative multimodal emotion recognition using facial,
textual, and speech cues,” Proceedings of the AAAI Conference
on Artificial Intelligence, vol. 34, pp. 1359–1367, 2020. [Online].
Available: https://doi.org/10.48550/arXiv.1911.05659</mixed-citation>
                    </ref>
                                    <ref id="ref24">
                        <label>24</label>
                        <mixed-citation publication-type="journal">[24] W. Liu, J.-L. Qiu, W.-L. Zheng, and B.-L. Lu, “Multimodal emotion
recognition using deep canonical correlation analysis.” [Online].
Available: http://arxiv.org/abs/1908.05349</mixed-citation>
                    </ref>
                                    <ref id="ref25">
                        <label>25</label>
                        <mixed-citation publication-type="journal">[25] T. Mittal, P. Guhan, U. Bhattacharya, R. Chandra, A. Bera,
and D. Manocha, “Emoticon: Context-aware multimodal emotion
recognition using frege’s principle,” in Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR), June
2020. [Online]. Available: https://ieeexplore.ieee.org/document/9156904</mixed-citation>
                    </ref>
                                    <ref id="ref26">
                        <label>26</label>
                        <mixed-citation publication-type="journal">[26] M. R. Makiuchi, K. Uto, and K. Shinoda, “Multimodal emotion
recognition with high-level speech and text features.” [Online].
Available: http://arxiv.org/abs/2111.10202</mixed-citation>
                    </ref>
                                    <ref id="ref27">
                        <label>27</label>
                        <mixed-citation publication-type="journal">[27] Y.-T. Lan, W. Liu, and B.-L. Lu, “Multimodal emotion recognition
using deep generalized canonical correlation analysis with an attention
mechanism,” in 2020 International Joint Conference on Neural
Networks (IJCNN). IEEE, 2020-07, pp. 1–6. [Online]. Available:
https://ieeexplore.ieee.org/document/9207625/</mixed-citation>
                    </ref>
                                    <ref id="ref28">
                        <label>28</label>
                        <mixed-citation publication-type="journal">[28] H. Zhang, “Expression-EEG based collaborative multimodal emotion
recognition using deep AutoEncoder,” IEEE Access, vol. 8, pp.
164 130–164 143, 2020, conference Name: IEEE Access. [Online].
Available: https://ieeexplore.ieee.org/document/9187342</mixed-citation>
                    </ref>
                                    <ref id="ref29">
                        <label>29</label>
                        <mixed-citation publication-type="journal">[29] S. R. Zaman, D. Sadekeen, M. A. Alfaz, and R. Shahriyar, “One source
to detect them all: Gender, age, and emotion detection from voice,”
in 2021 IEEE 45th Annual Computers, Software, and Applications
Conference (COMPSAC), 2021, pp. 338–343, ISSN: 0730-3157.
[Online]. Available: https://ieeexplore.ieee.org/document/9529731</mixed-citation>
                    </ref>
                                    <ref id="ref30">
                        <label>30</label>
                        <mixed-citation publication-type="journal">[30] X. Wu, W.-L. Zheng, and B.-L. Lu, “Investigating EEG-based functional
connectivity patterns for multimodal emotion recognition.” [Online].
Available: http://arxiv.org/abs/2004.01973</mixed-citation>
                    </ref>
                                    <ref id="ref31">
                        <label>31</label>
                        <mixed-citation publication-type="journal">[31] M. S. Akhtar, D. Chauhan, D. Ghosal, S. Poria, A. Ekbal,
and P. Bhattacharyya, “Multi-task learning for multi-modal emotion
recognition and sentiment analysis,” in Proceedings of the 2019
Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, Volume 1
(Long and Short Papers). Association for Computational Linguistics,
2019, pp. 370–379. [Online]. Available: https://aclanthology.org/
N19-1034</mixed-citation>
                    </ref>
                                    <ref id="ref32">
                        <label>32</label>
                        <mixed-citation publication-type="journal">[32] S. Nemati, R. Rohani, M. E. Basiri, M. Abdar, N. Y. Yen, and
V. Makarenkov, “A hybrid latent space data fusion method for
multimodal emotion recognition,” IEEE Access, vol. 7, pp. 172 948–
172 964, 2019, conference Name: IEEE Access. [Online]. Available:
https://ieeexplore.ieee.org/document/8911364</mixed-citation>
                    </ref>
                                    <ref id="ref33">
                        <label>33</label>
                        <mixed-citation publication-type="journal">[33] Z. Fang, A. He, Q. Yu, B. Gao, W. Ding, T. Zhang, and L. Ma, “FAF:
A novel multimodal emotion recognition approach integrating face,
body and text.” [Online]. Available: http://arxiv.org/abs/2211.15425</mixed-citation>
                    </ref>
                                    <ref id="ref34">
                        <label>34</label>
                        <mixed-citation publication-type="journal">[34] L. Sun, Z. Lian, J. Tao, B. Liu, and M. Niu, “Multi-modal
continuous dimensional emotion recognition using recurrent neural
network and self-attention mechanism,” in Proceedings of the
1st International on Multimodal Sentiment Analysis in Real-life
Media Challenge and Workshop, ser. MuSe’20. Association for
Computing Machinery, 2020-10-15, pp. 27–34. [Online]. Available:
https://doi.org/10.1145/3423327.3423672</mixed-citation>
                    </ref>
                                    <ref id="ref35">
                        <label>35</label>
                        <mixed-citation publication-type="journal">[35] L. Cai, Y. Hu, J. Dong, and S. Zhou, “Audio-textual emotion
recognition based on improved neural networks,” Mathematical
Problems in Engineering, vol. 2019, pp. 1–9, 2019. [Online]. Available:
https://www.hindawi.com/journals/mpe/2019/2593036/</mixed-citation>
                    </ref>
                                    <ref id="ref36">
                        <label>36</label>
                        <mixed-citation publication-type="journal">[36] M. Aydo˘gan and A. Karci, “Improving the accuracy using pretrained
word embeddings on deep neural networks for turkish text
classification,” Physica A: Statistical Mechanics and its Applications,
vol. 541, p. 123288, 2020-03. [Online]. Available: https://linkinghub.
elsevier.com/retrieve/pii/S0378437119318436</mixed-citation>
                    </ref>
                                    <ref id="ref37">
                        <label>37</label>
                        <mixed-citation publication-type="journal">[37] Q.-T. Truong and H. Lauw, “VistaNet: Visual aspect attention
network for multimodal sentiment analysis,” Proceedings of the AAAI
Conference on Artificial Intelligence, vol. 33, pp. 305–312, 2019-07-17.
[Online]. Available: https://doi.org/10.1609/aaai.v33i01.3301305</mixed-citation>
                    </ref>
                                    <ref id="ref38">
                        <label>38</label>
                        <mixed-citation publication-type="journal">[38] N. Ahmed, Z. A. Aghbari, and S. Girija, “A systematic survey on
multimodal emotion recognition using learning algorithms,” Intelligent
Systems with Applications, vol. 17, p. 200171, 2023. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S2667305322001089</mixed-citation>
                    </ref>
                                    <ref id="ref39">
                        <label>39</label>
                        <mixed-citation publication-type="journal">[39] A. Gandhi, K. Adhvaryu, S. Poria, E. Cambria, and A. Hussain,
“Multimodal sentiment analysis: A systematic review of history,
datasets, multimodal fusion methods, applications, challenges and
future directions,” Information Fusion, vol. 91, pp. 424–444, 2023-03-
01. [Online]. Available: https://www.sciencedirect.com/science/article/
pii/S1566253522001634</mixed-citation>
                    </ref>
                                    <ref id="ref40">
                        <label>40</label>
                        <mixed-citation publication-type="journal">[40] A. Solgi, A. Pourhaghi, R. Bahmani, and H. Zarei, “Improving SVR
and ANFIS performance using wavelet transform and PCA algorithm
for modeling and predicting biochemical oxygen demand (BOD),”
Ecohydrology &amp; Hydrobiology, vol. 17, no. 2, pp. 164–175, 2017-04-
01. [Online]. Available: https://www.sciencedirect.com/science/article/
pii/S1642359316300672</mixed-citation>
                    </ref>
                                    <ref id="ref41">
                        <label>41</label>
                        <mixed-citation publication-type="journal">[41] J. Li, X. Wang, G. Lv, and Z. Zeng, “GraphMFT: A graph
network based multimodal fusion technique for emotion recognition in
conversation.” [Online]. Available: http://arxiv.org/abs/2208.00339</mixed-citation>
                    </ref>
                            </ref-list>
                    </back>
    </article>
