Research Article

CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?

Year 2025, Volume: 27 Issue: 2, 211 - 216, 25.08.2025
https://doi.org/10.24938/kutfd.1708478

Abstract

Objectives: Artificial intelligence (AI) encompasses systems designed to perform tasks that require human cognitive abilities, such as reasoning, decision-making, and problem-solving. OpenAI's Generative Pre-trained Transformer (GPT) model family, including ChatGPT, is widely recognized for its ability to generate human-like text and to support interactive conversation. In healthcare, ChatGPT has potential applications in diagnostic assistance and medical education, yet its adoption raises concerns. Our study aims to evaluate ChatGPT's diagnostic performance in identifying autoinflammatory diseases compared with that of clinicians, exploring its potential as an accessible tool for physicians and patients.
Material and Methods: We evaluated the diagnostic performance of a publicly accessible AI model against two clinicians in identifying familial Mediterranean fever (FMF) and periodic fever, aphthous stomatitis, pharyngitis, and adenitis syndrome (PFAPA). Clinical data from 50 patients were presented anonymously, in a structured format, to both the AI model and the clinicians. Their diagnoses were compared against the confirmed clinical diagnoses.
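The article does not publish its prompting procedure, but the workflow described here (anonymized, structured vignettes presented to a chat model, with answers scored against confirmed diagnoses) maps onto a simple evaluation loop. Below is a minimal Python sketch of such a loop, assuming the current OpenAI chat completions API; the model name, prompt wording, and vignette fields are illustrative assumptions, not the authors' protocol.

```python
# Minimal sketch of the blinded evaluation loop described in Methods.
# The study does not publish its prompt or model version, so the model
# name, prompt wording, and vignette fields below are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def query_model(case: dict) -> str:
    """Present one anonymized, structured vignette; return the model's diagnosis."""
    vignette = (
        f"Age: {case['age']}; Sex: {case['sex']}; "
        f"Fever pattern: {case['fever_pattern']}; "
        f"Clinical findings: {case['findings']}; Labs: {case['labs']}"
    )
    response = client.chat.completions.create(
        model="gpt-4",  # assumption; the paper says only "ChatGPT"
        messages=[
            {"role": "system",
             "content": "You are given a pediatric case with recurrent fever. "
                        "Answer with the single most likely diagnosis."},
            {"role": "user", "content": vignette},
        ],
    )
    return response.choices[0].message.content.strip()

def accuracy(cases: list[dict]) -> float:
    """Fraction of cases in which the model names the confirmed diagnosis (FMF or PFAPA)."""
    hits = sum(case["confirmed"].lower() in query_model(case).lower() for case in cases)
    return hits / len(cases)
```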
Results: A total of 50 patients were included in the study. The AI model suggested a rheumatologic diagnosis in 94% of cases but reached the correct diagnosis in only 50%. In comparison, the two clinicians diagnosed 76% and 70% of cases correctly, respectively.
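For concreteness, the reported rates can be converted back into approximate case counts, assuming each percentage is taken over the full cohort of 50 patients (the wording leaves open whether the AI's 50% accuracy is computed over all 50 cases or only over the 47 in which it offered a rheumatologic diagnosis). A short Python sketch:

```python
# Converting the reported rates back into approximate case counts,
# assuming every percentage is taken over the full cohort of 50 patients.
N = 50
ai_suggested  = round(0.94 * N)  # 47 cases in which the AI offered a rheumatologic diagnosis
ai_correct    = round(0.50 * N)  # 25 correct diagnoses
clin1_correct = round(0.76 * N)  # 38 correct diagnoses (clinician 1)
clin2_correct = round(0.70 * N)  # 35 correct diagnoses (clinician 2)

for label, k in [("AI model", ai_correct),
                 ("Clinician 1", clin1_correct),
                 ("Clinician 2", clin2_correct)]:
    print(f"{label}: {k}/{N} correct ({k / N:.0%})")
```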
Conclusion: The development of AI has attracted significant attention in healthcare, as in other fields. However, AI-generated output may be incorrect, underscoring the importance of expert supervision. AI should complement rather than replace physicians, enhancing their capabilities. Future research should evaluate AI performance across different fields and its impact on decision-making, so that standardized guidelines can ensure its reliable use.

References

  • Garg S, Chauhan A. Chat GPT-4: Potentials, barriers, and future directions for newer medical researchers. Am J Emerg Med. 2024;367(6):406-408.
  • Ariyaratne S, Jenko N, Davies AM, Iyengar KP, Botchu R. Could ChatGPT pass the UK radiology fellowship examinations? Acad Radiol. 2024;31(5):2178-2182.
  • Scheschenja M, Viniol S, Bastian MB, Wessendorf J, König AM, Mahnken AH. Feasibility of GPT-3 and GPT-4 for in-depth patient education prior to interventional radiological procedures: a comparative analysis. Cardiovasc Intervent Radiol. 2024;47(2):245-250.
  • Tran CG, Chang J, Sherman SK, De Andrade JP. Performance of ChatGPT on American board of surgery in-training examination preparation questions. J Surg Res. 2024;299:329-335.
  • Palenzuela DL, Mullen JT, Phitayakorn R. AI Versus MD: Evaluating the surgical decision-making accuracy of ChatGPT-4. Surgery. 2024;176(2):241-245.
  • Wei Q, Yao Z, Cui Y, Wei B, Jin Z, Xu X. Evaluation of ChatGPT-generated medical responses: a systematic review and meta-analysis. J Biomed Inform. 2024:104620.
  • Zaboli A, Brigo F, Sibilio S, Mian M, Turcato G. Human intelligence versus Chat-GPT: who performs better in correctly classifying patients in triage? Am J Emerg Med. 2024;79:44-47.
  • Venerito V, Bilgin E, Iannone F, Kiraz S. AI am a rheumatologist: a practical primer to large language models for rheumatologists. Rheumatology. 2023;62(10):3256-3260.
  • Günay S, Öztürk A, Yiğit Y. The accuracy of Gemini, GPT-4, and GPT-4o in ECG analysis: A comparison with cardiologists and emergency medicine specialists. Am J Emerg Med. 2024;84:68-73.
  • Yeo YH, Samaan JS, Ng WH, et al. Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma. Clin Mol Hepatol. 2023;29(3):721.
  • Howard A, Hope W, Gerada A. ChatGPT and antimicrobial advice: the end of the consulting infection doctor? Lancet Infect Dis. 2023;23(4):405-406.
  • Wei Q, Wang Y, Yao Z, et al. Evaluation of ChatGPT's performance in providing treatment recommendations for pediatric diseases. Pediatr Discov. 2023;1(3):e42.
  • Nakhleh A, Spitzer S, Shehadeh N. ChatGPT's response to the diabetes knowledge questionnaire: implications for diabetes education. Diabetes Technol Ther. 2023;25(8):571-573.
  • Cadamuro J, Cabitza F, Debeljak Z, et al. Potentials and pitfalls of ChatGPT and natural-language artificial intelligence models for the understanding of laboratory medicine test results. An assessment by the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) Working Group on Artificial Intelligence (WG-AI). Clin Chem Lab Med. 2023;61(7):1158-1166.
  • Pawar VV, Farooqui S. Ethical consideration for implementing AI in healthcare: A chat GPT perspective. Oral Oncol. 2024;149:106682-106682.
  • La Bella S, Attanasi M, Porreca A, et al. Reliability of a generative artificial intelligence tool for pediatric familial Mediterranean fever: insights from a multicentre expert survey. Pediatr Rheumatol Online J. 2024;22(1):78.
  • Gattorno M, Hofer M, Federici S, et al. Classification criteria for autoinflammatory recurrent fevers. Ann Rheum Dis. 2019;78(8):1025-1032.
  • Adrovic A, Sahin S, Barut K, Kasapcopur O. Familial Mediterranean fever and periodic fever, aphthous stomatitis, pharyngitis, and adenitis (PFAPA) syndrome: shared features and main differences. Rheumatol Int. 2019;39(1):29-36.
  • Xu D, Zhao J, Liu R, et al. ChatGPT4’s proficiency in addressing patients’ questions on systemic lupus erythematosus: a blinded comparative study with specialists. Rheumatology. 2024;63(9):2450-2456.
  • Madrid-García A, Rosales-Rosado Z, Freites-Nuñez D, et al. Harnessing ChatGPT and GPT-4 for evaluating the rheumatology questions of the Spanish access exam to specialized medical training. Sci Rep. 2023;13(1):22129.
  • Krusche M, Callhoff J, Knitza J, Ruffer N. Diagnostic accuracy of a large language model in rheumatology: comparison of physician and ChatGPT-4. Rheumatol Int. 2024;44(2):303-306.
  • Haase I, Xiong T, Rissmann A, Knitza J, Greenfield J, Krusche M. ChatSLE: consulting ChatGPT-4 for 100 frequently asked lupus questions. Lancet Rheumatol. 2024;6(4):e196-e199.
  • Ayers JW, Poliak A, Dredze M, et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern Med. 2023;183(6):589-596.
  • Jo A. The promise and peril of generative AI. Nature. 2023;614(1):214-216.


Details

Primary Language English
Subjects Health Services and Systems (Other)
Journal Section Original Research (Özgün Araştırma)
Authors

Çisem Yıldız 0000-0002-6901-9944

Batuhan Küçükali 0000-0002-4268-8603

Nuran Belder 0000-0001-6058-5037

Merve Kutlar 0000-0003-0218-3130

Nihal Karaçayır 0000-0001-5038-7539

Pelin Esmeray Şenol 0000-0002-9493-9355

Deniz Gezgin Yıldırım 0000-0002-4823-2076

Sevcan Bakkaloğlu 0000-0001-6530-9672

Publication Date August 25, 2025
Submission Date May 29, 2025
Acceptance Date July 13, 2025
Published in Issue Year 2025 Volume: 27 Issue: 2

Cite

APA Yıldız, Ç., Küçükali, B., Belder, N., … Kutlar, M. (2025). CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? The Journal of Kırıkkale University Faculty of Medicine, 27(2), 211-216. https://doi.org/10.24938/kutfd.1708478
AMA Yıldız Ç, Küçükali B, Belder N, et al. CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? Kırıkkale Uni Med J. August 2025;27(2):211-216. doi:10.24938/kutfd.1708478
Chicago Yıldız, Çisem, Batuhan Küçükali, Nuran Belder, Merve Kutlar, Nihal Karaçayır, Pelin Esmeray Şenol, Deniz Gezgin Yıldırım, and Sevcan Bakkaloğlu. “CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?”. The Journal of Kırıkkale University Faculty of Medicine 27, no. 2 (August 2025): 211-16. https://doi.org/10.24938/kutfd.1708478.
EndNote Yıldız Ç, Küçükali B, Belder N, Kutlar M, Karaçayır N, Esmeray Şenol P, Gezgin Yıldırım D, Bakkaloğlu S (August 1, 2025) CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? The Journal of Kırıkkale University Faculty of Medicine 27 2 211–216.
IEEE Ç. Yıldız, B. Küçükali, N. Belder, M. Kutlar, N. Karaçayır, P. Esmeray Şenol, D. Gezgin Yıldırım, and S. Bakkaloğlu, “CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?”, Kırıkkale Uni Med J, vol. 27, no. 2, pp. 211–216, 2025, doi: 10.24938/kutfd.1708478.
ISNAD Yıldız, Çisem et al. “CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?”. The Journal of Kırıkkale University Faculty of Medicine 27/2 (August 2025), 211-216. https://doi.org/10.24938/kutfd.1708478.
JAMA Yıldız Ç, Küçükali B, Belder N, Kutlar M, Karaçayır N, Esmeray Şenol P, Gezgin Yıldırım D, Bakkaloğlu S. CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? Kırıkkale Uni Med J. 2025;27:211–216.
MLA Yıldız, Çisem et al. “CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN?”. The Journal of Kırıkkale University Faculty of Medicine, vol. 27, no. 2, 2025, pp. 211-6, doi:10.24938/kutfd.1708478.
Vancouver Yıldız Ç, Küçükali B, Belder N, Kutlar M, Karaçayır N, Esmeray Şenol P, et al. CHAT GPT VS. RHEUMATOLOGISTS: DO WE STILL NEED THE CLINICIAN? Kırıkkale Uni Med J. 2025;27(2):211-6.

This Journal is a Publication of Kırıkkale University Faculty of Medicine.