The Effectiveness of ChatGPT-4o in the Evaluation of Traumatic Brain CT Imaging
Abstract
Objective: Large language models (LLMs) such as GPT-4o have recently introduced real-time multimodal capabilities, including medical image interpretation. No prior studies have systematically assessed GPT-4o’s diagnostic accuracy for brain CT scans in trauma patients. We aimed to evaluate GPT-4o’s performance in identifying intracranial pathologies on brain CT images in this setting.
Methods: This retrospective cross-sectional study included adult patients presenting with head trauma between January and June 2024. For each patient, four representative CT slices were selected by a board-certified radiologist, whose interpretations served as the reference standard. The selected images were analysed by GPT-4o. Model outputs were compared with the radiologist’s assessments, and diagnostic accuracy metrics were calculated.
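The comparison described in the Methods can be illustrated with a minimal Python sketch. It assumes that, for each pathology, the radiologist's reading and the model output are reduced to present/absent labels per patient. The names `Confusion` and `confusion_from_labels`, and the example labels, are hypothetical and are not taken from the study. For a single binary prediction with no score threshold to vary, the area under the ROC curve reduces to the mean of sensitivity and specificity, which is what `binary_auc` computes.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Confusion:
    """2x2 confusion counts for one pathology (hypothetical helper)."""
    tp: int  # reference positive, model positive
    fn: int  # reference positive, model negative
    fp: int  # reference negative, model positive
    tn: int  # reference negative, model negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def binary_auc(self) -> float:
        # AUC of a single-threshold binary classifier:
        # the mean of sensitivity and specificity.
        return (self.sensitivity() + self.specificity()) / 2


def confusion_from_labels(reference: Sequence[int], predicted: Sequence[int]) -> Confusion:
    """Tally counts from paired 0/1 labels (1 = pathology present)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 1)
    fn = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 0)
    fp = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 1)
    tn = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 0)
    return Confusion(tp, fn, fp, tn)


# Illustrative labels only, not study data.
example = confusion_from_labels(reference=[1, 1, 0, 0, 0], predicted=[1, 0, 0, 1, 0])
print(example.sensitivity(), example.specificity(), example.binary_auc())
```

Under this reading, an AUC near 0.5 means the model's present/absent calls separate positive from negative cases about as well as chance.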
Results: A total of 54 patients were included. Observed pathologies comprised epidural hematoma (22.2%), subdural hematoma (44.4%), subarachnoid hemorrhage (57.4%), parenchymal hemorrhage/contusion (29.6%), pneumocephalus (13.0%), and intraventricular hemorrhage (3.7%). GPT-4o correctly identified all present pathologies in 7.4% of cases and at least one, but not all, in 24.1%. In the remaining 68.5% of cases, no pathology was correctly detected. Sensitivity was low across all categories: epidural hematoma 8.3% (AUC 0.506), subdural hematoma 25.0% (AUC 0.508), subarachnoid hemorrhage 3.2% (AUC 0.451), parenchymal hemorrhage/contusion 50.0% (AUC 0.592), and pneumocephalus 14.3% (AUC 0.571). GPT-4o also generated false positives across multiple pathology categories.
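The percentages above can be converted back into approximate patient counts. The sketch below does this under the assumption that each percentage was rounded to one decimal place from a whole-number count out of 54 patients (or, for sensitivity, out of the implied number of positive cases). The resulting counts are inferred, not reported in the abstract, and the variable names are hypothetical.

```python
N_PATIENTS = 54  # reported in the Results

# Reported prevalence of each pathology (% of all patients).
prevalence_pct = {
    "epidural hematoma": 22.2,
    "subdural hematoma": 44.4,
    "subarachnoid hemorrhage": 57.4,
    "parenchymal hemorrhage/contusion": 29.6,
    "pneumocephalus": 13.0,
    "intraventricular hemorrhage": 3.7,
}

# Reported sensitivity (% of positive cases detected); none is given
# for intraventricular hemorrhage in the abstract.
sensitivity_pct = {
    "epidural hematoma": 8.3,
    "subdural hematoma": 25.0,
    "subarachnoid hemorrhage": 3.2,
    "parenchymal hemorrhage/contusion": 50.0,
    "pneumocephalus": 14.3,
}

# Reported case-level outcomes (% of all patients).
case_level_pct = {"all correct": 7.4, "some correct": 24.1, "none correct": 68.5}


def implied_count(pct: float, total: int) -> int:
    """Nearest whole-number count consistent with a rounded percentage."""
    return round(pct * total / 100)


# Inferred number of patients with each pathology.
positives = {name: implied_count(p, N_PATIENTS) for name, p in prevalence_pct.items()}

# Inferred number of those positives that the model detected.
detected = {name: implied_count(s, positives[name]) for name, s in sensitivity_pct.items()}

# Inferred number of patients in each case-level outcome group.
case_counts = {name: implied_count(p, N_PATIENTS) for name, p in case_level_pct.items()}

for name in sensitivity_pct:
    print(f"{name}: {detected[name]} of {positives[name]} detected")
print(case_counts, "total:", sum(case_counts.values()))
```

If the rounding assumption holds, the three case-level groups add up to all 54 patients, which suggests they are mutually exclusive.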
Conclusions: GPT-4o demonstrated insufficient diagnostic accuracy for the independent interpretation of brain CT scans in trauma patients, with low sensitivity and frequent misclassification. These findings underscore the necessity for task-specific training and rigorous validation in larger, multicentre studies prior to clinical implementation.
Details
Primary Language
English
Subjects
Emergency Medicine
Journal Section
Research Article
Authors
Ahmet Öztürk
*
0000-0002-9678-2195
Türkiye
Gurbet Yanarateş
0000-0003-0780-8814
Türkiye
Serkan Günay
0000-0002-8343-0916
Türkiye
Erdal Komut
0000-0003-2656-0420
Türkiye
Seval Komut
0000-0002-9558-4832
Türkiye
Early Pub Date
June 26, 2026
Publication Date
-
Submission Date
January 3, 2026
Acceptance Date
March 22, 2026
Published in Issue
Year 2026 Number: Advanced Online Publication