A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases

Başak Erdemli Gürsel; Gökhan Öngen; Dilek Sağlam

doi:10.16899/jcm.1626433

Abstract

Aim: To evaluate the diagnostic performance of four large language models (LLMs), ChatGPT 3.5, ChatGPT 4, Gemini 1.0, and Gemini Advance, in ultrasound (US) cases and to compare them with one another.

Materials and Methods: In this retrospective study, data from 20 real cases with US examinations and confirmed diagnoses, obtained between 2020 and 2024, were evaluated. The clinical information, relevant laboratory data, and US findings of each case were presented simultaneously to the four artificial intelligence (AI) models (ChatGPT 3.5, ChatGPT 4, Gemini 1.0, and Gemini Advance). Two radiologists with expertise in US evaluated the answers, and the correct response rates of the four models were compared.

Results: The correct response rates of ChatGPT 3.5, ChatGPT 4, Gemini 1.0, and Gemini Advance were 92% (23/25), 92% (23/25), 76% (19/25), and 84% (21/25), respectively, with no statistically significant differences between them.

Conclusion: This is the first study to compare the diagnostic performance of four AI models on real US cases. The results suggest that, whichever model is used, AI has the potential to assist radiologists significantly in diagnosis. Because the models are easy and fast to use, they may also speed up the daily workflow considerably. However, they cannot yet completely replace a radiologist.
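The reported rates can be checked with a few lines of arithmetic. The Python sketch below recomputes the percentages from the fractions given in the Results and runs a Pearson chi-square test of homogeneity as one plausible way to compare four proportions. The abstract does not state which statistical test the authors used, so the choice of test, the function names, and the 0.05 significance level are assumptions made for illustration only.

```python
# Correct-response counts exactly as reported in the Results (out of 25 each).
CORRECT = {
    "ChatGPT 3.5": 23,
    "ChatGPT 4": 23,
    "Gemini 1.0": 19,
    "Gemini Advance": 21,
}
N_PER_MODEL = 25

# Critical value of the chi-square distribution for df = 3 at alpha = 0.05.
# The significance level is an assumption; the abstract does not give one.
CRITICAL_DF3_ALPHA05 = 7.815


def response_rates(correct, n):
    """Return the correct-response rate of each model as a percentage."""
    return {model: 100.0 * c / n for model, c in correct.items()}


def pearson_chi_square(correct, n):
    """Pearson chi-square statistic for a (models x correct/incorrect) table.

    Hypothetical helper: treats the models as independent groups, which is a
    simplification because every model answered the same cases.
    """
    total = n * len(correct)
    expected_correct = n * sum(correct.values()) / total
    expected_wrong = n - expected_correct
    stat = 0.0
    for c in correct.values():
        stat += (c - expected_correct) ** 2 / expected_correct
        stat += ((n - c) - expected_wrong) ** 2 / expected_wrong
    return stat


if __name__ == "__main__":
    for model, rate in response_rates(CORRECT, N_PER_MODEL).items():
        print(f"{model}: {rate:.0f}%")
    chi2 = pearson_chi_square(CORRECT, N_PER_MODEL)
    print(f"chi-square = {chi2:.3f}, significant: {chi2 > CRITICAL_DF3_ALPHA05}")
```

With these counts the statistic is about 3.65, below the critical value of 7.815, which agrees with the reported absence of a significant difference. The expected number of incorrect answers per model is only 3.5, so the chi-square approximation is rough here, and a paired test such as Cochran's Q would require per-case results that the abstract does not give.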

Details

Primary Language

English

Subjects

Radiology and Organ Imaging

Journal Section

Research Article

Authors

Başak Erdemli Gürsel*
0000-0002-0047-1780
Türkiye

Gökhan Öngen
0000-0002-7348-0813
Türkiye

Dilek Sağlam
0000-0002-5778-6847
Türkiye

Publication Date

September 30, 2025

Submission Date

February 1, 2025

Acceptance Date

September 22, 2025

Published in Issue

Year 2025 Volume: 15 Number: 5

DOI

https://doi.org/10.16899/jcm.1626433

IZ

https://izlik.org/JA46UM99HK

Cite

APA

Erdemli Gürsel, B., Öngen, G., & Sağlam, D. (2025). A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases. Journal of Contemporary Medicine, 15(5), 245-249. https://doi.org/10.16899/jcm.1626433

AMA

1. Erdemli Gürsel B, Öngen G, Sağlam D. A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases. J Contemp Med. 2025;15(5):245-249. doi:10.16899/jcm.1626433

Chicago

Erdemli Gürsel, Başak, Gökhan Öngen, and Dilek Sağlam. 2025. “A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases”. Journal of Contemporary Medicine 15 (5): 245-49. https://doi.org/10.16899/jcm.1626433.

EndNote

Erdemli Gürsel B, Öngen G, Sağlam D (September 1, 2025) A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases. Journal of Contemporary Medicine 15 5 245–249.

IEEE

[1] B. Erdemli Gürsel, G. Öngen, and D. Sağlam, “A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases”, J Contemp Med, vol. 15, no. 5, pp. 245–249, Sept. 2025, doi: 10.16899/jcm.1626433.

ISNAD

Erdemli Gürsel, Başak - Öngen, Gökhan - Sağlam, Dilek. “A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases”. Journal of Contemporary Medicine 15/5 (September 1, 2025): 245-249. https://doi.org/10.16899/jcm.1626433.

JAMA

1. Erdemli Gürsel B, Öngen G, Sağlam D. A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases. J Contemp Med. 2025;15:245–249.

MLA

Erdemli Gürsel, Başak, et al. “A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases”. Journal of Contemporary Medicine, vol. 15, no. 5, Sept. 2025, pp. 245-9, doi:10.16899/jcm.1626433.

Vancouver

1. Başak Erdemli Gürsel, Gökhan Öngen, Dilek Sağlam. A Comparative Analysis of The Diagnostic Efficacy of Diverse Artificial Intelligence (AI) Algorithms in Ultrasound-Based Cases. J Contemp Med. 2025 Sep. 1;15(5):245-9. doi:10.16899/jcm.1626433