Turkish does not mark gender in its pronouns, which poses a challenge for machine translation systems, particularly when the context is gender-neutral or ambiguous. This study investigates how neural machine translation (NMT) systems and large language models (LLMs) resolve gender ambiguity when translating Turkish subject-dropped sentences into English. The analysis examines four prominent systems (Google Translate, DeepL, ChatGPT, and Gemini), evaluating their pronoun selection and the extent of gender bias, especially in emotionally charged or contextually nuanced sentences. A primarily quantitative evaluation reveals a persistent gender bias across all four systems, with the LLMs performing relatively better than the NMT systems when clearer contextual information is present. However, all of them show limitations in handling the complexities of cross-linguistic gender representation. This research highlights the pressing need for gender-neutral solutions and advances in context-sensitive translation. Furthermore, we introduce a moderately sized annotated Turkish corpus designed to facilitate future studies on gender ambiguity in machine translation (MT). This dataset provides a resource for improving the accuracy of gendered pronoun resolution and fostering more inclusive, bias-reduced translation systems. Overall, the study contributes to the growing discourse on reducing bias in language models while addressing the challenges of linguistic diversity in translation.
Primary Language | English |
---|---|
Subjects | Translation and Interpretation Studies |
Journal Section | Research Articles |
Authors | |
Publication Date | December 31, 2024 |
Submission Date | October 19, 2024 |
Acceptance Date | December 12, 2024 |
Published in Issue | Year 2024 Volume: 7 Issue: 2 |