The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging

Ahmet Öztürk; Gurbet Yanarateş; Serkan Günay; Erdal Komut; Seval Komut; Yavuz Yiğit

doi:10.31832/smj.1855415

Abstract

Objective: Large language models (LLMs) such as GPT-4o have recently introduced real-time multimodal capabilities, including medical image interpretation. No prior studies have systematically assessed GPT-4o’s diagnostic accuracy for brain CT scans in trauma patients. We aimed to evaluate GPT-4o’s performance in identifying intracranial pathologies on brain CT images in this setting.
Methods: This retrospective cross-sectional study included adult patients presenting with head trauma between January and June 2024. For each patient, four representative CT slices were selected by a board-certified radiologist, whose interpretations served as the reference standard. The selected images were analysed by GPT-4o. Model outputs were compared with the radiologist’s assessments, and diagnostic accuracy metrics were calculated.
Results: A total of 54 patients were included. Observed pathologies comprised epidural hematoma (22.2%), subdural hematoma (44.4%), subarachnoid hemorrhage (57.4%), parenchymal hemorrhage/contusion (29.6%), pneumocephalus (13.0%), and intraventricular hemorrhage (3.7%). GPT-4o correctly identified all present pathologies in 7.4% of cases and some but not all of them in 24.1%; in the remaining 68.5% of cases, no pathology was correctly detected. Sensitivity was low across all categories: epidural hematoma 8.3% (AUC 0.506), subdural hematoma 25.0% (AUC 0.508), subarachnoid hemorrhage 3.2% (AUC 0.451), parenchymal hemorrhage/contusion 50.0% (AUC 0.592), and pneumocephalus 14.3% (AUC 0.571). GPT-4o also generated false positives across multiple pathology categories.
Conclusions: GPT-4o demonstrated insufficient diagnostic accuracy for the independent interpretation of brain CT scans in trauma patients, with low sensitivity and frequent misclassification. These findings underscore the necessity for task-specific training and rigorous validation in larger, multicentre studies prior to clinical implementation.
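
The percentages in the Results can be read back as whole-patient counts. The sketch below is not the authors' analysis code; it only back-calculates, from the reported prevalence and sensitivity figures and the cohort size of 54, how many patients each percentage would correspond to. The function names and the rounding approach are assumptions made for illustration, and the abstract itself does not report raw counts.

```python
# Illustrative sketch: back-calculate patient counts from the abstract's percentages.
# Assumption: each percentage is a whole number of patients divided by the
# relevant denominator, rounded to one decimal place.

N_PATIENTS = 54  # cohort size reported in the abstract

# pathology -> (reported prevalence %, reported sensitivity %)
REPORTED = {
    "epidural hematoma": (22.2, 8.3),
    "subdural hematoma": (44.4, 25.0),
    "subarachnoid hemorrhage": (57.4, 3.2),
    "parenchymal hemorrhage/contusion": (29.6, 50.0),
    "pneumocephalus": (13.0, 14.3),
}


def implied_counts(prevalence_pct, sensitivity_pct, n=N_PATIENTS):
    """Return (patients with the pathology, patients correctly detected)."""
    positives = round(prevalence_pct / 100 * n)
    true_positives = round(sensitivity_pct / 100 * positives)
    return positives, true_positives


def sensitivity(true_positives, false_negatives):
    """Sensitivity = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


for name, (prevalence, sens) in REPORTED.items():
    positives, tp = implied_counts(prevalence, sens)
    recomputed = 100 * sensitivity(tp, positives - tp)
    print(f"{name}: {tp}/{positives} detected, sensitivity {recomputed:.1f}%")
```

Under this rounding assumption, the reported sensitivities correspond to 1 of 12 epidural hematomas, 6 of 24 subdural hematomas, 1 of 31 subarachnoid hemorrhages, 8 of 16 parenchymal hemorrhages/contusions, and 1 of 7 cases of pneumocephalus, and the recomputed percentages match the reported ones to one decimal place.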

Ethical Statement

This study was approved by the Research Ethics Committee of the Faculty of Medicine at Hitit University on November 19, 2024 (Decision No: 2024-113). The requirement for informed consent was waived by the committee.

Details

Primary Language

English

Subjects

Emergency Medicine

Journal Section

Research Article

Authors

Ahmet Öztürk*
0000-0002-9678-2195
Türkiye

Gurbet Yanarateş
0000-0003-0780-8814
Türkiye

Serkan Günay
0000-0002-8343-0916
Türkiye

Erdal Komut
0000-0003-2656-0420
Türkiye

Seval Komut
0000-0002-9558-4832
Türkiye

Yavuz Yiğit
0000-0002-7226-983X
Qatar

Early Pub Date

June 26, 2026

Publication Date

-

Submission Date

January 3, 2026

Acceptance Date

March 22, 2026

Published in Issue

Year: 2026, Number: Advanced Online Publication

DOI

https://doi.org/10.31832/smj.1855415

IZ

https://izlik.org/JA58ZW77BA

Cite

APA

Öztürk, A., Yanarateş, G., Günay, S., Komut, E., Komut, S., & Yiğit, Y. (2026). The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging. Sakarya Medical Journal, Advanced Online Publication. https://doi.org/10.31832/smj.1855415

AMA

1. Öztürk A, Yanarateş G, Günay S, Komut E, Komut S, Yiğit Y. The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging. Sakarya Medical Journal. 2026;(Advanced Online Publication). doi:10.31832/smj.1855415

Chicago

Öztürk, Ahmet, Gurbet Yanarateş, Serkan Günay, Erdal Komut, Seval Komut, and Yavuz Yiğit. 2026. “The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging”. Sakarya Medical Journal, no. Advanced Online Publication. https://doi.org/10.31832/smj.1855415.

EndNote

Öztürk A, Yanarateş G, Günay S, Komut E, Komut S, Yiğit Y (June 1, 2026) The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging. Sakarya Medical Journal Advanced Online Publication

IEEE

[1] A. Öztürk, G. Yanarateş, S. Günay, E. Komut, S. Komut, and Y. Yiğit, “The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging”, Sakarya Medical Journal, no. Advanced Online Publication, June 2026, doi: 10.31832/smj.1855415.

ISNAD

Öztürk, Ahmet - Yanarateş, Gurbet - Günay, Serkan - Komut, Erdal - Komut, Seval - Yiğit, Yavuz. “The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging”. Sakarya Medical Journal. Advanced Online Publication (June 1, 2026). https://doi.org/10.31832/smj.1855415.

JAMA

1. Öztürk A, Yanarateş G, Günay S, Komut E, Komut S, Yiğit Y. The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging. Sakarya Medical Journal. 2026. doi:10.31832/smj.1855415.

MLA

Öztürk, Ahmet, et al. “The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging”. Sakarya Medical Journal, no. Advanced Online Publication, June 2026, doi:10.31832/smj.1855415.

Vancouver

1. Ahmet Öztürk, Gurbet Yanarateş, Serkan Günay, Erdal Komut, Seval Komut, Yavuz Yiğit. The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging. Sakarya Medical Journal. 2026 Jun. 1;(Advanced Online Publication). doi:10.31832/smj.1855415