A comparative evaluation of AI chatbots in veterinary anatomy: performance of ChatGPT, Gemini and DeepSeek models

Ezgi Deniz Mavili; Barış Batur; Aytaç Akçay; Çağdaş Oto

doi:10.33188/vetheder.1805359

Research Article

Year 2026, Volume: 97 Issue: 1, 47 - 51, 15.01.2026

Abstract

This study aimed to evaluate the reliability and accuracy of four AI chatbots (ChatGPT-3.5, ChatGPT-4.0, Gemini 2.5 Flash, and DeepSeek-V3) in the field of veterinary anatomy. A total of 85 multiple-choice questions covering the major anatomical systems were presented individually to each model under identical conditions. Responses were scored for accuracy, and success rates were calculated as percentages. Differences among models were analyzed using the Pearson chi-square test, with statistical significance set at p<0.05. Gemini 2.5 Flash achieved the highest accuracy rate (85.88%), followed by ChatGPT-4.0 (85.53%), DeepSeek-V3 (84.71%), and ChatGPT-3.5 (82.35%). These differences were not statistically significant (χ²=0.629, p=0.890). Qualitative analysis revealed differences in explanatory depth: ChatGPT-4.0 and Gemini 2.5 Flash provided corrective feedback for incorrect options, while DeepSeek-V3 and ChatGPT-3.5 focused mainly on correct answers. Gemini 2.5 Flash additionally incorporated visual aids, though some were based on human rather than veterinary anatomy. Overall, while all evaluated AI chatbots demonstrated a substantial capacity for accurate anatomical reasoning, their explanatory styles and supporting materials varied.
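The comparison described in the abstract (per-model accuracy as a percentage, then a Pearson chi-square test across models) can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names `chi2_sf` and `pearson_chi_square` are hypothetical, the abstract reports percentages rather than raw counts, and the counts in the usage lines are placeholders rather than the study's data. The sketch uses only the Python standard library and assumes a k x 2 table (correct / incorrect per model) in which neither column is empty.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += half ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def pearson_chi_square(correct, totals):
    """Pearson chi-square test of homogeneity on a k x 2 (correct / incorrect) table.

    Assumes at least one correct and one incorrect answer overall,
    otherwise an expected count is zero and the statistic is undefined.
    """
    n = sum(totals)
    p_correct = sum(correct) / n  # pooled proportion of correct answers
    stat = 0.0
    for c, t in zip(correct, totals):
        cells = ((c, t * p_correct), (t - c, t * (1.0 - p_correct)))
        for observed, expected in cells:
            stat += (observed - expected) ** 2 / expected
    df = len(totals) - 1  # (rows - 1) * (columns - 1) with two columns
    return stat, df, chi2_sf(stat, df)


# Placeholder counts for four models (NOT the study's data).
correct = [80, 78, 75, 70]
totals = [100, 100, 100, 100]

for c, t in zip(correct, totals):
    print(f"accuracy: {100.0 * c / t:.2f}%")

stat, df, p = pearson_chi_square(correct, totals)
print(f"chi2 = {stat:.3f}, df = {df}, p = {p:.3f}")

# With four models, df = 3; the reported statistic of 0.629 then gives
# an upper-tail probability of roughly 0.89, in line with the abstract.
print(f"p for chi2 = 0.629, df = 3: {chi2_sf(0.629, 3):.3f}")
```

With four models the test has three degrees of freedom, so a statistic as small as the reported one lies far from the 0.05 significance threshold. Where SciPy is available, `scipy.stats.chi2_contingency` performs the same test on a table of observed counts.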

Keywords

Artificial intelligence, Large language models

Details

Primary Language	English
Subjects	Veterinary Anatomy and Physiology
Section	Research Article
Authors	Ezgi Deniz Mavili (0009-0002-3565-8825), Barış Batur (0000-0001-9669-9917), Aytaç Akçay (0000-0001-6263-5181), Çağdaş Oto (0000-0002-2727-3768)
Submission Date	16 October 2025
Acceptance Date	30 December 2025
Publication Date	15 January 2026
Published in Issue	Year 2026 Volume: 97 Issue: 1

Cite

Vancouver	1. Mavili ED, Batur B, Akçay A, Oto Ç. A comparative evaluation of AI chatbots in veterinary anatomy: performance of ChatGPT, Gemini and DeepSeek models. Vet Hekim Der Derg [Internet]. 01 January 2026;97(1):47-51. Available from: https://izlik.org/JA29KU56CH

Veteriner Hekimler Derneği Dergisi is an open-access journal whose publishing model is based on the Budapest Open Access Initiative (BOAI) declaration. All published content is available online free of charge and is licensed under the Creative Commons CC BY-NC 4.0 license. Authors retain the copyright of their works published in Veteriner Hekimler Derneği Dergisi.

Veteriner Hekimler Derneği / Turkish Veterinary Medical Society