Research Article

Impact of Language Variation English and Turkish on Artificial Intelligence Chatbot Performance in Oculofacial Plastic and Orbital Surgery: A Study of ChatGPT-3.5, Copilot, and Gemini

Year 2024, Volume: 46 Issue: 5, 781 - 786, 12.09.2024
https://doi.org/10.20515/otd.1520495

Abstract

This study investigated whether asking the same oculofacial plastic and orbital surgery questions in different languages affects the performance of three freely accessible artificial intelligence chatbots: ChatGPT-3.5, Copilot, and Gemini. English and Turkish versions of 30 questions on oculofacial plastic and orbital surgery were posed to each chatbot. The chatbots' answers were compared with the answer key at the back of the book and classified as correct or incorrect, and the chatbots' performance was then compared statistically. ChatGPT-3.5 answered 43.3% of the English questions and 23.3% of the Turkish questions correctly (p=0.07). Copilot answered 73.3% of the English questions and 63.3% of the Turkish questions correctly (p=0.375). Gemini answered 46.7% of the English questions and 33.3% of the Turkish questions correctly (p=0.344). Copilot outperformed the other programs on the Turkish questions (p<0.05). Beyond improving the chatbots' knowledge level, their performance in different languages also needs to be examined and improved. Correcting these shortcomings will pave the way for more widespread and reliable use of these programs.
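The reported percentages can be converted back to question counts out of 30, which makes the English-Turkish gap for each chatbot easier to read. The sketch below does that arithmetic. It also includes a small helper for a two-sided exact McNemar test on paired correct/incorrect outcomes. The abstract does not name the statistical test or report per-question agreement, so the choice of McNemar's test is an assumption, and the discordant counts in the last call are hypothetical placeholders, not study data.

```python
from math import comb

N_QUESTIONS = 30  # number of questions stated in the abstract

# Percent correct reported in the abstract: (English, Turkish)
REPORTED = {
    "ChatGPT-3.5": (43.3, 23.3),
    "Copilot": (73.3, 63.3),
    "Gemini": (46.7, 33.3),
}


def to_count(percent, n=N_QUESTIONS):
    """Convert a reported percentage back to a whole number of questions."""
    return round(percent * n / 100)


def exact_mcnemar_p(b, c):
    """Two-sided exact McNemar p-value.

    b: questions correct in one language only, c: correct in the other only.
    ASSUMPTION: the abstract does not say this test was used.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) probability of a split at least as lopsided as observed
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


counts = {bot: (to_count(en), to_count(tr)) for bot, (en, tr) in REPORTED.items()}
for bot, (en, tr) in counts.items():
    print(f"{bot}: English {en}/{N_QUESTIONS}, Turkish {tr}/{N_QUESTIONS}, gap {en - tr}")

# HYPOTHETICAL discordant counts, for illustration only (not reported in the abstract)
print(exact_mcnemar_p(1, 4))
```

With only 30 paired questions, even a gap of several questions can yield a p-value above 0.05, which is worth keeping in mind when reading the non-significant English-Turkish comparisons.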

Details

Primary Language Turkish
Subjects Ophthalmology
Section ORIGINAL ARTICLES
Authors

Eyüpcan Şensoy 0000-0002-4401-8435

Mehmet Çıtırık 0000-0002-0558-5576

Publication Date September 12, 2024
Submission Date July 22, 2024
Acceptance Date September 3, 2024
Published in Issue Year 2024 Volume: 46 Issue: 5

Cite

Vancouver Şensoy E, Çıtırık M. Impact of Language Variation English and Turkish on Artificial Intelligence Chatbot Performance in Oculofacial Plastic and Orbital Surgery: A Study of ChatGPT-3.5, Copilot, and Gemini. Osmangazi Tıp Dergisi. 2024;46(5):781-6.