Research Article

Assessing ChatGPT’s accuracy and reliability in medical education: a cross-sectional study

Volume: 23 Number: 1 April 25, 2025

An Erratum to this article was published on August 9, 2025. https://dergipark.org.tr/en/pub/tjph/article/1759338

Abstract

Objective: ChatGPT, an artificial intelligence (AI) system developed by OpenAI, gives human-like answers to questions across many domains and has the potential to transform medical education. However, its reliability in providing accurate clinical information is highly uncertain. This study aimed to evaluate the accuracy and reliability of ChatGPT in answering multiple-choice questions (MCQs) and protocol-based questions in the field of medicine.

Methods: This mixed-methods cross-sectional study was conducted at MVJ Medical College and Research Hospital, Hoskote, India, in April 2024. ChatGPT was tested with MCQs (n=228) and protocol-based questions (n=10) drawn from standard medical literature and covering all 19 MBBS subjects. Subject experts checked the responses for accuracy. Statistical analysis used the chi-square test and was performed in IBM SPSS Version 20.0 for Windows.

Results: ChatGPT showed good accuracy on easy, simple MCQs, but its performance declined as questions became more complex; overall, it answered 57.02% of MCQs correctly. Responses to protocol-based questions received average scores of 6.35/10 for textbook-accurate knowledge and 5.75/10 for real-life application.

Conclusion: ChatGPT shows potential as a tool for medical education, especially for recalling basic facts, but it should not be relied upon as the sole source of information. Instead, it should be used in conjunction with traditional methods to ensure a comprehensive understanding of medical concepts.
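
The overall accuracy figure can be tied back to a whole number of questions. The following is a minimal sketch, assuming the reported 57.02% was computed over all 228 MCQs and rounded to two decimal places; the abstract does not state the raw count, and the function name is hypothetical.

```python
# Values taken from the abstract.
TOTAL_MCQS = 228
REPORTED_ACCURACY_PCT = 57.02

def consistent_correct_counts(total, reported_pct, decimals=2):
    """Return every whole-number count of correct answers whose
    percentage of `total`, rounded to `decimals` places, equals
    `reported_pct`. Assumes plain rounding was used (an assumption,
    not something the abstract confirms)."""
    return [
        k for k in range(total + 1)
        if round(100 * k / total, decimals) == reported_pct
    ]

counts = consistent_correct_counts(TOTAL_MCQS, REPORTED_ACCURACY_PCT)
print(counts)
```

Under these assumptions, the only count that matches is 130 correct answers out of 228.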

Keywords

Details

Primary Language

English

Subjects

Health Services and Systems (Other)

Journal Section

Research Article

Early Pub Date

April 20, 2025

Publication Date

April 25, 2025

Submission Date

September 2, 2024

Acceptance Date

March 29, 2025

Published in Issue

Year 2025 Volume: 23 Number: 1

APA
Vishal, A. R., Harshitha, A. S., Sindhu, A. V., R, A., Mb, P., & Madhukumar, S. (2025). Assessing ChatGPT’s accuracy and reliability in medical education: a cross-sectional study. Turkish Journal of Public Health, 23(1), 11-17. https://doi.org/10.20518/tjph.1498611
AMA
1. Vishal AR, Harshitha AS, Sindhu AV, R A, Mb P, Madhukumar S. Assessing ChatGPT’s accuracy and reliability in medical education: a cross-sectional study. TJPH. 2025;23(1):11-17. doi:10.20518/tjph.1498611
Chicago
Vishal, A Ra, A S Harshitha, A V Sindhu, Abhivanth R, Pavithra Mb, and Suwarna Madhukumar. 2025. “Assessing ChatGPT’s Accuracy and Reliability in Medical Education: A Cross-Sectional Study”. Turkish Journal of Public Health 23 (1): 11-17. https://doi.org/10.20518/tjph.1498611.
EndNote
Vishal AR, Harshitha AS, Sindhu AV, R A, Mb P, Madhukumar S (April 1, 2025) Assessing ChatGPT’s accuracy and reliability in medical education: a cross-sectional study. Turkish Journal of Public Health 23 1 11–17.
IEEE
[1] A. R. Vishal, A. S. Harshitha, A. V. Sindhu, A. R, P. Mb, and S. Madhukumar, “Assessing ChatGPT’s accuracy and reliability in medical education: a cross-sectional study”, TJPH, vol. 23, no. 1, pp. 11–17, Apr. 2025, doi: 10.20518/tjph.1498611.
ISNAD
Vishal, A Ra - Harshitha, A S - Sindhu, A V - R, Abhivanth - Mb, Pavithra - Madhukumar, Suwarna. “Assessing ChatGPT’s Accuracy and Reliability in Medical Education: A Cross-Sectional Study”. Turkish Journal of Public Health 23/1 (April 1, 2025): 11-17. https://doi.org/10.20518/tjph.1498611.
JAMA
1. Vishal AR, Harshitha AS, Sindhu AV, R A, Mb P, Madhukumar S. Assessing ChatGPT’s accuracy and reliability in medical education: a cross-sectional study. TJPH. 2025;23:11–17.
MLA
Vishal, A Ra, et al. “Assessing ChatGPT’s Accuracy and Reliability in Medical Education: A Cross-Sectional Study”. Turkish Journal of Public Health, vol. 23, no. 1, Apr. 2025, pp. 11-17, doi:10.20518/tjph.1498611.
Vancouver
1. A Ra Vishal, A S Harshitha, A V Sindhu, Abhivanth R, Pavithra Mb, Suwarna Madhukumar. Assessing ChatGPT’s accuracy and reliability in medical education: a cross-sectional study. TJPH. 2025 Apr. 1;23(1):11-7. doi:10.20518/tjph.1498611

TURKISH JOURNAL OF PUBLIC HEALTH - TURK J PUBLIC HEALTH. online-ISSN: 1304-1096 

Copyright holder Turkish Journal of Public Health. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.