Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish

Eyüpcan Şensoy; Mehmet Çıtırık

doi:10.54005/geneltip.1627508

Yapay Zeka Sohbet Robotlarının Diller Arası Değerlendirmesi: ChatGPT-3.5, Copilot ve Gemini'nin Nöro-oftalmolojik Değerlendirmede İngilizce ve Türkçe Performansı

Öz

Amaç: ChatGPT-3.5, Copilot ve Gemini yapay zeka sohbet botlarının nöro-oftalmolojik değerlendirmede İngilizce ve Türkçe aynı sorulardaki performanslarını değerlendirmek. Gereç ve Yöntem: Nöro-oftalmoloji ile ilişkili 40 soru çalışmaya dahil edildi. Tüm İngilizce soruların sertifikasyonlu çevirmen (native speaker) tarafından Türkçeye çevirileri gerçekleştirildikten sonra soruların her iki versiyonu ChatGPT-3.5, Copilot ve Gemini sohbet botlarına soruldu. Verilen cevaplar cevap anahtarı ile karşılaştırılarak doğru ve yanlış olarak gruplandırıldı. Birbirlerine üstünlükleri istatistiksel olarak karşılaştırıldı. Bulgular: Sorulan İngilizce sorulara ChatGPT-3.5 %47,5, Copilot %57,5 ve Gemini %32,5 oranında doğru cevap verdi. Sorulan Türkçe sorulara ChatGPT-3.5 %57,5, Copilot %52,5 ve Gemini %32,5 oranında doğru cevap verdi. Sohbet botları arasında, İngilizce ve Türkçe aynı soruları cevaplamada farklı başarı düzeyi olduğu halde, istatistiksel olarak anlamlı başarı farkı tespit edilmedi (p>0,05). Sonuç: İstatistiksel olarak anlamlı bir fark izlenmemesine rağmen sohbet botları aynı sorulara farklı cevaplar verebilmektedir. Sohbet botlarının bilgi düzeylerinin geliştirilmesinin yanında dil becerilerinin de geliştirilmeye ihtiyacı vardır.

Anahtar Kelimeler

ChatGPT-3.5, Copilot, Gemini, İngilizce, Nöro-oftalmoloji, Türkçe, Yapay zeka uygulamaları

Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish

Abstract

Background/Aims: To evaluate the performance of the artificial intelligence chatbots ChatGPT-3.5, Copilot, and Gemini on the same neuro-ophthalmology questions posed in English and in Turkish. Methods: Forty questions related to neuro-ophthalmology were included in the study. All English questions were translated into Turkish by a certified translator (a native speaker), and both versions were then posed to ChatGPT-3.5, Copilot, and Gemini. The answers were compared with the answer key and classified as correct or incorrect, and the chatbots' success rates were compared statistically. Results: On the English questions, ChatGPT-3.5, Copilot, and Gemini answered 47.5%, 57.5%, and 32.5% correctly, respectively. On the Turkish questions, the corresponding rates were 57.5%, 52.5%, and 32.5%. Although the success rates differed between chatbots and between the English and Turkish versions of the same questions, none of the differences was statistically significant (p>0.05). Conclusions: Although no statistically significant difference was observed, chatbots can give different answers to the same questions. In addition to their knowledge level, the language skills of chatbots also need to be improved.
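The percentages reported in the abstract can be turned back into counts out of the 40 questions and compared across chatbots. The following is a minimal sketch, not the authors' analysis: the abstract does not name the statistical test used, so the Pearson chi-square test of independence here is an assumption, and the names `REPORTED`, `correct_count`, and `chi_square_three_groups` are hypothetical.

```python
import math

N_QUESTIONS = 40  # number of questions stated in the abstract

# Percent correct as reported in the abstract, per language and chatbot.
REPORTED = {
    "English": {"ChatGPT-3.5": 47.5, "Copilot": 57.5, "Gemini": 32.5},
    "Turkish": {"ChatGPT-3.5": 57.5, "Copilot": 52.5, "Gemini": 32.5},
}


def correct_count(percent, n=N_QUESTIONS):
    """Convert a reported percentage back into a whole number of correct answers."""
    count = round(percent * n / 100)
    if abs(count * 100 / n - percent) > 1e-9:
        raise ValueError("percentage does not correspond to a whole count")
    return count


def chi_square_three_groups(correct, n=N_QUESTIONS):
    """Pearson chi-square on a 3 x 2 table (chatbot x correct/incorrect).

    ASSUMPTION: the choice of test is ours; the abstract only reports p>0.05.
    Returns (statistic, p_value).
    """
    if len(correct) != 3:
        raise ValueError("the closed-form p-value below is valid only for 3 groups")
    # Every chatbot answered n questions, so expected counts are equal per group.
    expected_correct = sum(correct) / len(correct)
    expected_incorrect = n - expected_correct
    statistic = sum(
        (c - expected_correct) ** 2 / expected_correct
        + ((n - c) - expected_incorrect) ** 2 / expected_incorrect
        for c in correct
    )
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


for language, rates in REPORTED.items():
    counts = [correct_count(p) for p in rates.values()]
    statistic, p_value = chi_square_three_groups(counts)
    print(f"{language}: correct={counts}, chi2={statistic:.2f}, p={p_value:.3f}")
```

Under these assumptions, the between-chatbot comparison gives p above 0.05 in both languages, which is consistent with the reported result. The English-versus-Turkish comparison within a single chatbot is a paired comparison on the same questions and cannot be reproduced from the abstract, because per-question agreement is not reported.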

Keywords

ChatGPT-3.5, Copilot, Gemini, Neuro-ophthalmology, Turkish, Artificial intelligence applications

Ethical Statement

Because the data in this study do not come from animal or human sources, ethics committee approval was not required.

Details

Primary Language

English

Subjects

Clinical Sciences (Other)

Section

Research Article

Authors

Eyüpcan Şensoy*
0000-0002-4401-8435
Türkiye

Mehmet Çıtırık
0000-0002-0558-5576
Türkiye

Early View Date

August 29, 2025

Publication Date

August 29, 2025

Submission Date

January 27, 2025

Acceptance Date

July 22, 2025

Published in Issue

Year: 2025, Volume: 35, Issue: 4

DOI

https://doi.org/10.54005/geneltip.1627508

IZ

https://izlik.org/JA57ZY34CU

APA

Şensoy, E., & Çıtırık, M. (2025). Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish. Genel Tıp Dergisi, 35(4), 597-604. https://doi.org/10.54005/geneltip.1627508

AMA

1. Şensoy E, Çıtırık M. Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish. Genel Tıp Derg. 2025;35(4):597-604. doi:10.54005/geneltip.1627508

Chicago

Şensoy, Eyüpcan, and Mehmet Çıtırık. 2025. “Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish”. Genel Tıp Dergisi 35 (4): 597-604. https://doi.org/10.54005/geneltip.1627508.

EndNote

Şensoy E, Çıtırık M (01 August 2025) Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish. Genel Tıp Dergisi 35 4 597–604.

IEEE

[1] E. Şensoy and M. Çıtırık, “Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish”, Genel Tıp Derg, vol. 35, no. 4, pp. 597–604, Aug. 2025, doi: 10.54005/geneltip.1627508.

ISNAD

Şensoy, Eyüpcan - Çıtırık, Mehmet. “Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish”. Genel Tıp Dergisi 35/4 (01 August 2025): 597-604. https://doi.org/10.54005/geneltip.1627508.

JAMA

1. Şensoy E, Çıtırık M. Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish. Genel Tıp Derg. 2025;35:597–604.

MLA

Şensoy, Eyüpcan, and Mehmet Çıtırık. “Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish”. Genel Tıp Dergisi, vol. 35, no. 4, August 2025, pp. 597-604, doi:10.54005/geneltip.1627508.

Vancouver

1. Eyüpcan Şensoy, Mehmet Çıtırık. Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish. Genel Tıp Derg. 01 August 2025;35(4):597-604. doi:10.54005/geneltip.1627508

Genel Tıp Dergisi is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC).
