Research Article

CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?

Year 2025, Volume: 27 Issue: 2, 211 - 216, 25.08.2025
https://doi.org/10.24938/kutfd.1708478

Abstract

Objectives: Artificial intelligence (AI) encompasses systems designed to perform tasks that require human cognitive abilities, such as reasoning, decision-making, and problem-solving. OpenAI's Generative Pre-trained Transformer (GPT) model family, including ChatGPT, is widely recognized for its ability to generate human-like text and to support interactive conversation. In healthcare, ChatGPT has potential applications in diagnostic assistance and medical education, yet its adoption raises concerns. Our study aims to evaluate ChatGPT's diagnostic performance in identifying autoinflammatory diseases compared with that of clinicians, exploring its potential as an accessible tool for physicians and patients.
Material and Methods: We evaluated the diagnostic performance of a publicly accessible AI model against two clinicians in identifying familial Mediterranean fever (FMF) and periodic fever, aphthous stomatitis, pharyngitis, and adenitis syndrome (PFAPA). Clinical data from 50 patients were presented anonymously, in a structured format, to both the AI model and the clinicians. Their diagnoses were compared against the confirmed clinical diagnoses.
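The article does not publish its prompting procedure, but the workflow described here (anonymized, structured vignettes presented to a chat model, with answers scored against confirmed diagnoses) maps onto a simple evaluation loop. Below is a minimal Python sketch of such a loop, assuming the current OpenAI chat completions API; the model name, prompt wording, and vignette fields are illustrative assumptions, not the authors' protocol.

```python
# Minimal sketch of the blinded evaluation loop described in Methods.
# The study does not publish its prompt or model version, so the model
# name, prompt wording, and vignette fields below are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def query_model(case: dict) -> str:
    """Present one anonymized, structured vignette; return the model's diagnosis."""
    vignette = (
        f"Age: {case['age']}; Sex: {case['sex']}; "
        f"Fever pattern: {case['fever_pattern']}; "
        f"Clinical findings: {case['findings']}; Labs: {case['labs']}"
    )
    response = client.chat.completions.create(
        model="gpt-4",  # assumption; the paper says only "ChatGPT"
        messages=[
            {"role": "system",
             "content": "You are given a pediatric case with recurrent fever. "
                        "Answer with the single most likely diagnosis."},
            {"role": "user", "content": vignette},
        ],
    )
    return response.choices[0].message.content.strip()

def accuracy(cases: list[dict]) -> float:
    """Fraction of cases in which the model names the confirmed diagnosis (FMF or PFAPA)."""
    hits = sum(case["confirmed"].lower() in query_model(case).lower() for case in cases)
    return hits / len(cases)
```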
Results: A total of 50 patients were included in the study. The AI model suggested a rheumatologic diagnosis in 94% of cases but reached the correct diagnosis in only 50%. In comparison, the two clinicians diagnosed 76% and 70% of cases correctly, respectively.
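For concreteness, the reported rates can be converted back into approximate case counts, assuming each percentage is taken over the full cohort of 50 patients (the wording leaves open whether the AI's 50% accuracy is computed over all 50 cases or only over the 47 in which it offered a rheumatologic diagnosis). A short Python sketch:

```python
# Converting the reported rates back into approximate case counts,
# assuming every percentage is taken over the full cohort of 50 patients.
N = 50
ai_suggested  = round(0.94 * N)  # 47 cases in which the AI offered a rheumatologic diagnosis
ai_correct    = round(0.50 * N)  # 25 correct diagnoses
clin1_correct = round(0.76 * N)  # 38 correct diagnoses (clinician 1)
clin2_correct = round(0.70 * N)  # 35 correct diagnoses (clinician 2)

for label, k in [("AI model", ai_correct),
                 ("Clinician 1", clin1_correct),
                 ("Clinician 2", clin2_correct)]:
    print(f"{label}: {k}/{N} correct ({k / N:.0%})")
```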
Conclusion: The development of AI has attracted significant attention in healthcare, as in other fields. However, AI-generated output may be incorrect, underscoring the importance of expert supervision. AI should complement rather than replace physicians, enhancing their capabilities. Future research should evaluate AI performance across different fields and its impact on decision-making, so that standardized guidelines can ensure its reliable use.

References

  • Garg S, Chauhan A. Chat GPT-4: Potentials, barriers, and future directions for newer medical researchers. Am J Emerg Med. 2024;367(6):406-408.
  • Ariyaratne S, Jenko N, Davies AM, Iyengar KP, Botchu R. Could ChatGPT pass the UK radiology fellowship examinations? Acad Radiol. 2024;31(5):2178-2182.
  • Scheschenja M, Viniol S, Bastian MB, Wessendorf J, König AM, Mahnken AH. Feasibility of GPT-3 and GPT-4 for in-depth patient education prior to interventional radiological procedures: a comparative analysis. Cardiovasc Intervent Radiol. 2024;47(2):245-250.
  • Tran CG, Chang J, Sherman SK, De Andrade JP. Performance of ChatGPT on American board of surgery in-training examination preparation questions. J Surg Res. 2024;299:329-335.
  • Palenzuela DL, Mullen JT, Phitayakorn R. AI Versus MD: Evaluating the surgical decision-making accuracy of ChatGPT-4. Surgery. 2024;176(2):241-245.
  • Wei Q, Yao Z, Cui Y, Wei B, Jin Z, Xu X. Evaluation of ChatGPT-generated medical responses: a systematic review and meta-analysis. J Biomed Inform. 2024:104620.
  • Zaboli A, Brigo F, Sibilio S, Mian M, Turcato G. Human intelligence versus Chat-GPT: who performs better in correctly classifying patients in triage? Am J Emerg Med. 2024;79:44-47.
  • Venerito V, Bilgin E, Iannone F, Kiraz S. AI am a rheumatologist: a practical primer to large language models for rheumatologists. Rheumatology. 2023;62(10):3256-3260.
  • Günay S, Öztürk A, Yiğit Y. The accuracy of Gemini, GPT-4, and GPT-4o in ECG analysis: A comparison with cardiologists and emergency medicine specialists. Am J Emerg Med. 2024;84:68-73.
  • Yeo YH, Samaan JS, Ng WH, et al. Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma. Clin Mol Hepatol. 2023;29(3):721.
  • Howard A, Hope W, Gerada A. ChatGPT and antimicrobial advice: the end of the consulting infection doctor? Lancet Infect Dis. 2023;23(4):405-406.
  • Wei Q, Wang Y, Yao Z, et al. Evaluation of ChatGPT's performance in providing treatment recommendations for pediatric diseases. Pediatr Discov. 2023;1(3):e42.
  • Nakhleh A, Spitzer S, Shehadeh N. ChatGPT's response to the diabetes knowledge questionnaire: implications for diabetes education. Diabetes Technol Ther. 2023;25(8):571-573.
  • Cadamuro J, Cabitza F, Debeljak Z, et al. Potentials and pitfalls of ChatGPT and natural-language artificial intelligence models for the understanding of laboratory medicine test results. An assessment by the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) Working Group on Artificial Intelligence (WG-AI). Clin Chem Lab Med. 2023;61(7):1158-1166.
  • Pawar VV, Farooqui S. Ethical consideration for implementing AI in healthcare: A chat GPT perspective. Oral Oncol. 2024;149:106682-106682.
  • La Bella S, Attanasi M, Porreca A, et al. Reliability of a generative artificial intelligence tool for pediatric familial Mediterranean fever: insights from a multicentre expert survey. Pediatr Rheumatol Online J. 2024;22(1):78.
  • Gattorno M, Hofer M, Federici S, et al. Classification criteria for autoinflammatory recurrent fevers. Ann Rheum Dis. 2019;78(8):1025-1032.
  • Adrovic A, Sahin S, Barut K, Kasapcopur O. Familial Mediterranean fever and periodic fever, aphthous stomatitis, pharyngitis, and adenitis (PFAPA) syndrome: shared features and main differences. Rheumatol Int. 2019;39(1):29-36.
  • Xu D, Zhao J, Liu R, et al. ChatGPT4’s proficiency in addressing patients’ questions on systemic lupus erythematosus: a blinded comparative study with specialists. Rheumatology. 2024;63(9):2450-2456.
  • Madrid-García A, Rosales-Rosado Z, Freites-Nuñez D, et al. Harnessing ChatGPT and GPT-4 for evaluating the rheumatology questions of the Spanish access exam to specialized medical training. Sci Rep. 2023;13(1):22129.
  • Krusche M, Callhoff J, Knitza J, Ruffer N. Diagnostic accuracy of a large language model in rheumatology: comparison of physician and ChatGPT-4. Rheumatol Int. 2024;44(2):303-306.
  • Haase I, Xiong T, Rissmann A, Knitza J, Greenfield J, Krusche M. ChatSLE: consulting ChatGPT-4 for 100 frequently asked lupus questions. Lancet Rheumatol. 2024;6(4):e196-e199.
  • Ayers JW, Poliak A, Dredze M, et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern Med. 2023;183(6):589-596.
  • Jo A. The promise and peril of generative AI. Nature. 2023;614(1):214-216.


Details

Primary Language English
Subjects Health Services and Systems (Other)
Journal Section Original Research (Özgün Araştırma)
Authors

Çisem Yıldız 0000-0002-6901-9944

Batuhan Küçükali 0000-0002-4268-8603

Nuran Belder 0000-0001-6058-5037

Merve Kutlar 0000-0003-0218-3130

Nihal Karaçayır 0000-0001-5038-7539

Pelin Esmeray Şenol 0000-0002-9493-9355

Deniz Gezgin Yıldırım 0000-0002-4823-2076

Sevcan Bakkaloğlu 0000-0001-6530-9672

Publication Date August 25, 2025
Submission Date May 29, 2025
Acceptance Date July 13, 2025
Published in Issue Year 2025 Volume: 27 Issue: 2

Cite

APA Yıldız, Ç., Küçükali, B., Belder, N., … Kutlar, M. (2025). CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? The Journal of Kırıkkale University Faculty of Medicine, 27(2), 211-216. https://doi.org/10.24938/kutfd.1708478
AMA Yıldız Ç, Küçükali B, Belder N, et al. CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? Kırıkkale Uni Med J. August 2025;27(2):211-216. doi:10.24938/kutfd.1708478
Chicago Yıldız, Çisem, Batuhan Küçükali, Nuran Belder, Merve Kutlar, Nihal Karaçayır, Pelin Esmeray Şenol, Deniz Gezgin Yıldırım, and Sevcan Bakkaloğlu. “CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?”. The Journal of Kırıkkale University Faculty of Medicine 27, no. 2 (August 2025): 211-16. https://doi.org/10.24938/kutfd.1708478.
EndNote Yıldız Ç, Küçükali B, Belder N, Kutlar M, Karaçayır N, Esmeray Şenol P, Gezgin Yıldırım D, Bakkaloğlu S (August 1, 2025) CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? The Journal of Kırıkkale University Faculty of Medicine 27 2 211–216.
IEEE Ç. Yıldız, B. Küçükali, N. Belder, M. Kutlar, N. Karaçayır, P. Esmeray Şenol, D. Gezgin Yıldırım, and S. Bakkaloğlu, “CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?”, Kırıkkale Uni Med J, vol. 27, no. 2, pp. 211–216, 2025, doi: 10.24938/kutfd.1708478.
ISNAD Yıldız, Çisem et al. “CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?”. The Journal of Kırıkkale University Faculty of Medicine 27/2 (August 2025), 211-216. https://doi.org/10.24938/kutfd.1708478.
JAMA Yıldız Ç, Küçükali B, Belder N, Kutlar M, Karaçayır N, Esmeray Şenol P, Gezgin Yıldırım D, Bakkaloğlu S. CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? Kırıkkale Uni Med J. 2025;27:211–216.
MLA Yıldız, Çisem et al. “CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?”. The Journal of Kırıkkale University Faculty of Medicine, vol. 27, no. 2, 2025, pp. 211-6, doi:10.24938/kutfd.1708478.
Vancouver Yıldız Ç, Küçükali B, Belder N, Kutlar M, Karaçayır N, Esmeray Şenol P, et al. CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? Kırıkkale Uni Med J. 2025;27(2):211-6.

This Journal is a Publication of Kırıkkale University Faculty of Medicine.