Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences

Şeyda Portillo - Palma; Sergi Alvarez - Vidal

doi:10.29228/transLogos.67

Abstract

Turkish, a language that does not explicitly mark gender in pronouns, poses a unique challenge for machine translation systems, particularly in cases of gender-neutral or ambiguous context. This study investigates the performance of neural machine translation (NMT) and large language models (LLMs) in resolving gender ambiguity when translating Turkish subject-dropped sentences into English. The analysis examines four prominent models—Google Translate, DeepL, ChatGPT, and Gemini—evaluating their pronoun selection and the extent of gender bias, especially in emotionally charged or contextually nuanced sentences. A primarily quantitative evaluation reveals a persistent gender bias across all models, with LLMs demonstrating relatively better performance than NMTs when clearer contextual information is present. However, all models exhibit limitations in managing the complexities of cross-linguistic gender representation. This research highlights the pressing need for gender-neutral solutions and advancements in context-sensitive translation. Furthermore, we introduce a moderately sized annotated Turkish corpus, designed to facilitate future studies on gender ambiguity in machine translation (MT). This dataset provides a valuable resource for enhancing the accuracy of gendered pronoun resolution and fostering more inclusive, bias-reduced translation systems. Overall, the study contributes to the growing discourse on reducing bias in language models while addressing the challenges of nuanced linguistic diversity in translation.
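As a rough illustration of the pronoun-selection evaluation mentioned in the abstract, the sketch below tallies which gendered pronoun group appears in each English output and reports the share of each group. It is a minimal sketch under stated assumptions, not the authors' procedure: the pronoun groups, the function names `classify_translation` and `pronoun_distribution`, and the sample sentences are all hypothetical and are not taken from the study's corpus or results.

```python
import re
from collections import Counter

# Hypothetical pronoun groups; the study's actual annotation scheme may differ.
PRONOUN_GROUPS = {
    "masculine": {"he", "him", "his", "himself"},
    "feminine": {"she", "her", "hers", "herself"},
    "neutral": {"they", "them", "their", "theirs", "themself", "themselves"},
}


def classify_translation(sentence: str) -> str:
    """Label an English sentence by the pronoun group(s) it contains."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    found = {
        group
        for group, words in PRONOUN_GROUPS.items()
        if any(token in words for token in tokens)
    }
    if not found:
        return "none"   # no third-person pronoun from the groups above
    if len(found) > 1:
        return "mixed"  # e.g. "he or she"
    return next(iter(found))


def pronoun_distribution(translations: list[str]) -> dict[str, float]:
    """Return the share of each label across a list of English outputs."""
    counts = Counter(classify_translation(s) for s in translations)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Invented example outputs (assumption: NOT data from the study).
sample_outputs = [
    "He is very angry.",
    "She is crying.",
    "They are tired.",
    "He is a doctor.",
]
distribution = pronoun_distribution(sample_outputs)
```

Running the same tally on the outputs of each system for the same Turkish source sentences would give per-model distributions that can be compared side by side. A skew toward one label on sentences with no gender cue in the source is one simple signal of the bias the abstract describes.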

Details

Primary Language

English

Subjects

Translation and Interpretation Studies

Journal Section

Research Article

Authors

Şeyda Portillo - Palma *
0000-0001-6986-231X
Spain

Sergi Alvarez - Vidal
0000-0002-6355-4559
Spain

Publication Date

December 31, 2024

Submission Date

October 19, 2024

Acceptance Date

December 12, 2024

Published in Issue

Year 2024 Volume: 7 Number: 2

DOI

https://doi.org/10.29228/transLogos.67

IZ

https://izlik.org/JA63NW83EW

Cite

APA

Portillo - Palma, Ş., & Alvarez - Vidal, S. (2024). Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences. transLogos Translation Studies Journal, 7(2), 1-28. https://doi.org/10.29228/transLogos.67

AMA

1. Portillo - Palma Ş, Alvarez - Vidal S. Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences. transLogos Translation Studies Journal. 2024;7(2):1-28. doi:10.29228/transLogos.67

Chicago

Portillo - Palma, Şeyda, and Sergi Alvarez - Vidal. 2024. “Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences.” transLogos Translation Studies Journal 7 (2): 1-28. https://doi.org/10.29228/transLogos.67.

EndNote

Portillo - Palma Ş, Alvarez - Vidal S (December 1, 2024) Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences. transLogos Translation Studies Journal 7 2 1–28.

IEEE

[1] Ş. Portillo - Palma and S. Alvarez - Vidal, “Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences,” transLogos Translation Studies Journal, vol. 7, no. 2, pp. 1–28, Dec. 2024, doi: 10.29228/transLogos.67.

ISNAD

Portillo - Palma, Şeyda - Alvarez - Vidal, Sergi. “Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences”. transLogos Translation Studies Journal 7/2 (December 1, 2024): 1-28. https://doi.org/10.29228/transLogos.67.

JAMA

1. Portillo - Palma Ş, Alvarez - Vidal S. Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences. transLogos Translation Studies Journal. 2024;7:1–28.

MLA

Portillo - Palma, Şeyda, and Sergi Alvarez - Vidal. “Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences.” transLogos Translation Studies Journal, vol. 7, no. 2, Dec. 2024, pp. 1-28, doi:10.29228/transLogos.67.

Vancouver

1. Şeyda Portillo - Palma, Sergi Alvarez - Vidal. Gender Bias and Contextual Sensitivity in Machine Translation: A Focus on Turkish Subject-Dropped Sentences. transLogos Translation Studies Journal. 2024 Dec. 1;7(2):1-28. doi:10.29228/transLogos.67