Research Article

EmotionUnet: A Novel Deep Learning Model Based on U-Net for Speech Emotion Recognition

Year 2025, Volume: 16, Issue: 3, 232-250, 31.08.2025
https://doi.org/10.5824/ajite.2025.03.003.x

Abstract

Speech is the most fundamental and effective form of human communication. Through speech, people share feelings, thoughts, and information, strengthen their relationships, and reinforce their social bonds. Understanding the emotional state of the other person during a conversation is important for empathizing and communicating more effectively and meaningfully. Today, speech emotion recognition methods are frequently used to identify the emotional tones expressed in remote conversations conducted over devices such as telephones. Speech emotion recognition is applied in many fields, including customer service, healthcare, education, entertainment, and intelligent systems. While signal processing, statistical analysis, and biometric techniques have traditionally been used for speech emotion recognition, deep learning methods have recently become widespread. In this study, a novel U-Net-based deep learning model built on convolutional neural networks is proposed for speech emotion recognition. Bayesian optimization is used to tune the hyper-parameters of the proposed model. The model is evaluated on four datasets covering the Turkish, English, Arabic, and Bangla languages, achieving accuracies ranging from 56.55% to 99.71% across the different datasets.
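
To make the pipeline described above concrete, the sketch below illustrates the two components the abstract names: a U-Net-style convolutional encoder-decoder repurposed as a classifier over spectrogram-like inputs, and Bayesian hyper-parameter optimization with scikit-optimize's gp_minimize (Keras and scikit-optimize both appear in the references). This is a minimal, illustrative sketch only: the input shape, layer widths, number of emotion classes, search space, and dummy training data are all assumptions for illustration, not the published EmotionUnet configuration.

```python
# Illustrative sketch only: a U-Net-style CNN classifier plus Bayesian
# hyper-parameter search. All shapes, sizes, and data are assumptions;
# the paper's exact EmotionUnet architecture may differ.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from skopt import gp_minimize
from skopt.space import Real
from skopt.utils import use_named_args

def build_emotion_unet(input_shape=(64, 64, 1), num_classes=7, dropout=0.2):
    """U-Net-style encoder-decoder with a classification head."""
    inputs = layers.Input(shape=input_shape)
    # Encoder (contracting path): convolutions followed by downsampling.
    c1 = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = layers.Conv2D(64, 3, activation="relu", padding="same")(p1)
    p2 = layers.MaxPooling2D(2)(c2)
    # Bottleneck.
    b = layers.Conv2D(128, 3, activation="relu", padding="same")(p2)
    # Decoder (expanding path) with U-Net skip connections.
    u2 = layers.Concatenate()([layers.UpSampling2D(2)(b), c2])
    c3 = layers.Conv2D(64, 3, activation="relu", padding="same")(u2)
    u1 = layers.Concatenate()([layers.UpSampling2D(2)(c3), c1])
    c4 = layers.Conv2D(32, 3, activation="relu", padding="same")(u1)
    # Classification head in place of U-Net's usual segmentation output.
    x = layers.GlobalAveragePooling2D()(c4)
    x = layers.Dropout(dropout)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)

# Dummy spectrogram-shaped data standing in for real speech features.
X = np.random.rand(32, 64, 64, 1).astype("float32")
y = np.random.randint(0, 7, size=32)

# Hypothetical search space for the Bayesian optimizer.
space = [
    Real(1e-4, 1e-2, prior="log-uniform", name="learning_rate"),
    Real(0.0, 0.5, name="dropout"),
]

@use_named_args(space)
def objective(learning_rate, dropout):
    model = build_emotion_unet(dropout=dropout)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    hist = model.fit(X, y, epochs=1, batch_size=8, verbose=0)
    # gp_minimize minimizes, so return the negative accuracy.
    return -hist.history["accuracy"][-1]

result = gp_minimize(objective, space, n_calls=10, random_state=0)
print("Best hyper-parameters found:", result.x)
```

In a real experiment, the random arrays would be replaced by spectrogram features extracted from the four corpora (Turkish, English, Arabic, and Bangla), and the objective would return held-out validation accuracy rather than training accuracy.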

References

  • Ahmad, J., Muhammad, K., Kwon, S., Baik, S. W., & Rho, S. (2016). Dempster-Shafer Fusion Based Gender Recognition for Speech Analysis Applications. 2016 International Conference on Platform Technology and Service (PlatCon), 1–4. https://doi.org/10.1109/PlatCon.2016.7456788
  • Allen, J. B., & Rabiner, L. R. (1977). A unified approach to short-time Fourier analysis and synthesis. Proceedings of the IEEE, 65(11), 1558–1564. https://doi.org/10.1109/PROC.1977.10770
  • Alsabhan, W. (2023). Human–Computer Interaction with a Real-Time Speech Emotion Recognition with Ensembling Techniques 1D Convolution Neural Network and Attention. Sensors, 23(3), Article 3. https://doi.org/10.3390/s23031386
  • Altamimi, M., & Alayba, A. M. (2023). ANAD: Arabic news article dataset. Data in Brief, 50, 109460. https://doi.org/10.1016/j.dib.2023.109460
  • Anvarjon, T., Mustaqeem, & Kwon, S. (2020). Deep-Net: A Lightweight CNN-Based Speech Emotion Recognition System Using Deep Frequency Features. Sensors, 20(18), Article 18. https://doi.org/10.3390/s20185212
  • Aziz, S., Arif, N. H., Ahbab, S., Ahmed, S., Ahmed, T., & Kabir, Md. H. (2023). Improved Speech Emotion Recognition in Bengali Language using Deep Learning. 2023 26th International Conference on Computer and Information Technology (ICCIT), 1–6. https://doi.org/10.1109/ICCIT60459.2023.10441053
  • Canpolat, S. F., Ormanoğlu, Z., & Zeyrek, D. (2020). Turkish Emotion Voice Database (TurEV-DB). In D. Beermann, L. Besacier, S. Sakti, & C. Soria (Eds.), Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL) (pp. 368–375). European Language Resources Association. https://aclanthology.org/2020.sltu-1.52
  • Das, R. K., Islam, N., Ahmed, Md. R., Islam, S., Shatabda, S., & Islam, A. K. M. M. (2022). BanglaSER: A speech emotion recognition dataset for the Bangla language. Data in Brief, 42, 108091. https://doi.org/10.1016/j.dib.2022.108091
  • Ghai, M., Lal, S., Duggal, S., & Manik, S. (2017). Emotion recognition on speech signals using machine learning. 2017 International Conference on Big Data Analytics and Computational Intelligence (ICBDAC), 34–39. https://doi.org/10.1109/ICBDACI.2017.8070805
  • Harár, P., Burget, R., & Dutta, M. K. (2017). Speech emotion recognition with deep learning. 2017 4th International Conference on Signal Processing and Integrated Networks (SPIN), 137–140. https://doi.org/10.1109/SPIN.2017.8049931
  • Ismaiel, W., Alhalangy, A., Mohamed, A. O. Y., & Musa, A. I. A. (2024). Deep Learning, Ensemble and Supervised Machine Learning for Arabic Speech Emotion Recognition. Engineering, Technology & Applied Science Research, 14(2), Article 2. https://doi.org/10.48084/etasr.7134
  • Issa, D., Fatih Demirci, M., & Yazici, A. (2020). Speech emotion recognition with deep convolutional neural networks. Biomedical Signal Processing and Control, 59, 101894. https://doi.org/10.1016/j.bspc.2020.101894
  • Jha, T., Kavya, R., Christopher, J., & Arunachalam, V. (2022). Machine learning techniques for speech emotion recognition using paralinguistic acoustic features. International Journal of Speech Technology, 25(3), 707–725. https://doi.org/10.1007/s10772-022-09985-6
  • Keras: Deep Learning for humans. (2024, July 20). https://keras.io/
  • Kerkeni, L., Serrestou, Y., Mbarki, M., Raoof, K., Ali Mahjoub, M., & Cleder, C. (2020). Automatic Speech Emotion Recognition Using Machine Learning. In A. Cano (Ed.), Social Media and Machine Learning. IntechOpen. https://doi.org/10.5772/intechopen.84856
  • Khan, M., Gueaieb, W., El Saddik, A., & Kwon, S. (2024). MSER: Multimodal speech emotion recognition using cross-attention with deep fusion. Expert Systems with Applications, 245, 122946. https://doi.org/10.1016/j.eswa.2023.122946
  • Kotowski, K., Smolarczyk, T., Roterman-Konieczna, I., & Stapor, K. (2021). ProteinUnet—An efficient alternative to SPIDER3-single for sequence-based prediction of protein secondary structures. Journal of Computational Chemistry, 42(1), 50–59. https://doi.org/10.1002/jcc.26432
  • Krishna, K. V., Sainath, N., & Posonia, A. M. (2022). Speech Emotion Recognition using Machine Learning. 2022 6th International Conference on Computing Methodologies and Communication (ICCMC), 1014–1018. https://doi.org/10.1109/ICCMC53470.2022.9753976
  • Lin, Y.-L., & Wei, G. (2005). Speech emotion recognition based on HMM and SVM. 2005 International Conference on Machine Learning and Cybernetics, 8, 4898–4901. https://doi.org/10.1109/ICMLC.2005.1527805
  • Liu, Z.-T., Han, M.-T., Wu, B.-H., & Rehman, A. (2023). Speech emotion recognition based on convolutional neural network with attention-based bidirectional long short-term memory network and multi-task learning. Applied Acoustics, 202, 109178. https://doi.org/10.1016/j.apacoust.2022.109178
  • Liu, Z.-T., Wu, M., Cao, W.-H., Mao, J.-W., Xu, J.-P., & Tan, G.-Z. (2018). Speech emotion recognition based on feature selection and extreme learning machine decision tree. Neurocomputing, 273, 271–280. https://doi.org/10.1016/j.neucom.2017.07.050
  • Madanian, S., Chen, T., Adeleye, O., Templeton, J. M., Poellabauer, C., Parry, D., & Schneider, S. L. (2023). Speech emotion recognition using machine learning—A systematic review. Intelligent Systems with Applications, 20, 200266. https://doi.org/10.1016/j.iswa.2023.200266
  • Mary Little Flower, T., Jaya, T., & Christopher Ezhil Singh, S. (2024). Data augmentation using a 1D-CNN model with MFCC/MFMC features for speech emotion recognition. Automatika, 65(4), 1325–1338. https://doi.org/10.1080/00051144.2024.2371249
  • Mishra, D., & Rawat, A. (2015). Emotion Recognition through Speech Using Neural Network. International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE), 5.
  • Mishra, S. P., Warule, P., & Deb, S. (2024). Speech emotion recognition using MFCC-based entropy feature. Signal, Image and Video Processing, 18(1), 153–161. https://doi.org/10.1007/s11760-023-02716-7
  • Mustaqeem, & Kwon, S. (2021). MLT-DNet: Speech emotion recognition using 1D dilated CNN based on multi-learning trick approach. Expert Systems with Applications, 167, 114177. https://doi.org/10.1016/j.eswa.2020.114177
  • Mustaqeem, Sajjad, M., & Kwon, S. (2020). Clustering-Based Speech Emotion Recognition by Incorporating Learned Features and Deep BiLSTM. IEEE Access, 8, 79861–79875. https://doi.org/10.1109/ACCESS.2020.2990405
  • Nediyanchath, A., Paramasivam, P., & Yenigalla, P. (2020). Multi-Head Attention for Speech Emotion Recognition with Auxiliary Learning of Gender Recognition. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 7179–7183. https://doi.org/10.1109/ICASSP40776.2020.9054073
  • Pichora-Fuller, M. K., & Dupuis, K. (2020). Toronto emotional speech set (TESS) [Dataset]. Borealis. https://doi.org/10.5683/SP2/E8H2MF
  • Sankara Pandiammal, K., Karishma, S., Harine Sakthe, K., Manimaran, V., Kalaiselvi, S., & Anitha, V. (2024). Emotion Recognition from Speech – an LSTM approach with the Tess Dataset. 2024 5th International Conference on Innovative Trends in Information Technology (ICITIIT), 1–6. https://doi.org/10.1109/ICITIIT61487.2024.10580351
  • Satt, A., Rozenberg, S., & Hoory, R. (2017). Efficient Emotion Recognition from Speech Using Deep Learning on Spectrograms. Interspeech 2017, 1089–1093. https://doi.org/10.21437/Interspeech.2017-200
  • scikit-optimize: Sequential model-based optimization toolbox (Version 0.10.2) [Computer software]. (2025). https://scikit-optimize.readthedocs.io/en/latest/contents.html
  • Singh, P., Sahidullah, M., & Saha, G. (2023). Modulation spectral features for speech emotion recognition using deep neural networks. Speech Communication, 146, 53–69. https://doi.org/10.1016/j.specom.2022.11.005
  • Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian Optimization of Machine Learning Algorithms. Advances in Neural Information Processing Systems, 25. https://proceedings.neurips.cc/paper/2012/hash/05311655a15b75fab86956663e1819cd-Abstract.html
  • Ververidis, D., & Kotropoulos, C. (2006). Emotional speech recognition: Resources, features, and methods. Speech Communication, 48(9), 1162–1181. https://doi.org/10.1016/j.specom.2006.04.003
  • Wagner, J., Kim, J., & Andre, E. (2005). From Physiological Signals to Emotions: Implementing and Comparing Selected Methods for Feature Extraction and Classification. 2005 IEEE International Conference on Multimedia and Expo, 940–943. https://doi.org/10.1109/ICME.2005.1521579
  • Zhou, X., Guo, J., & Bie, R. (2016). Deep Learning Based Affective Model for Speech Emotion Recognition. 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), 841–846. https://doi.org/10.1109/UIC-ATC-ScalCom-CBDCom-IoP-SmartWorld.2016.0133
  • Zhu, L., Chen, L., Zhao, D., Zhou, J., & Zhang, W. (2017). Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN. Sensors, 17(7), Article 7. https://doi.org/10.3390/s17071694

Details

Primary Language: Turkish
Subjects: Speech Recognition
Journal Section: Research Articles
Authors: Yasin Görmez (ORCID: 0000-0001-8276-2030)
Publication Date: August 31, 2025
Submission Date: February 3, 2025
Acceptance Date: June 2, 2025
Published in Issue: Year 2025, Volume 16, Issue 3

Cite

APA: Görmez, Y. (2025). EmotionUnet: Konuşma Duygu Tanıma için U-Net Tabanlı Özgün Derin Öğrenme Modeli [EmotionUnet: A novel deep learning model based on U-Net for speech emotion recognition]. AJIT-E: Academic Journal of Information Technology, 16(3), 232-250. https://doi.org/10.5824/ajite.2025.03.003.x