Research Article

Diagnostic performance of artificial intelligence-based large language models in acute ischemic stroke detection

Volume: 6 Number: 1 April 27, 2026

Abstract

Purpose: This study evaluated and compared the performance of two large language models (ChatGPT 5.2 and Claude 4.5 Opus) in analyzing diffusion-weighted brain magnetic resonance imaging (MRI) for the detection of acute ischemic stroke.

Materials and Methods: This single-center retrospective study included 58 patients with acute middle cerebral artery territory infarction and 62 control patients with normal diffusion MRI findings. For each patient, one diffusion-weighted imaging (DWI) slice and the corresponding apparent diffusion coefficient (ADC) slice were selected. Images were presented to both models using a standardized prompt. The models were asked to identify the MRI sequence type, determine whether diffusion restriction was present, and classify the affected vascular territory. The performance of the two models was then compared.

Results: For sequence identification, ChatGPT achieved 97.5% to 100% accuracy, while Claude achieved 93.3% to 99.2% accuracy. The most common error in both models was misclassification of DWI images as FLAIR. For detection of diffusion restriction, ChatGPT achieved 95.8% overall accuracy (98.3% sensitivity, 93.5% specificity), while Claude achieved 87.5% overall accuracy (94.8% sensitivity, 80.6% specificity). The difference between the two models was statistically significant (p=0.041). For vascular territory classification, ChatGPT achieved 93.1% accuracy, while Claude achieved 62.1% accuracy.

Conclusion: Both models demonstrated high performance in MRI sequence identification. ChatGPT outperformed Claude in detection of diffusion restriction and vascular territory classification. These findings suggest that large language models have potential as supportive tools in acute stroke imaging; however, they are not yet sufficiently reliable for independent clinical use.
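The sensitivity, specificity, and accuracy values reported for diffusion restriction detection follow from a standard 2x2 confusion table. The sketch below is illustrative only: the abstract does not report raw counts, so the true/false positive and negative counts are back-calculated here from the reported percentages and the group sizes (58 patients, 62 controls). They should be read as an assumption, not as the authors' data. The class name `ConfusionCounts` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary diagnostic test (hypothetical helper)."""
    tp: int  # stroke cases flagged as diffusion restriction
    fn: int  # stroke cases missed
    tn: int  # controls correctly called normal
    fp: int  # controls wrongly flagged

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# ASSUMPTION: counts inferred from the reported percentages with
# 58 stroke patients and 62 controls; not stated in the abstract.
chatgpt = ConfusionCounts(tp=57, fn=1, tn=58, fp=4)
claude = ConfusionCounts(tp=55, fn=3, tn=50, fp=12)

for name, counts in (("ChatGPT", chatgpt), ("Claude", claude)):
    print(
        f"{name}: sensitivity={counts.sensitivity:.1%}, "
        f"specificity={counts.specificity:.1%}, "
        f"accuracy={counts.accuracy:.1%}"
    )
```

With these inferred counts, the rounded percentages match those in the Results. The reported p-value cannot be reproduced this way, because a paired comparison between the two models needs the per-case agreement pattern, which the abstract does not give.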

Keywords

Supporting Institution

None.

Ethical Statement

This study was conducted in accordance with the Declaration of Helsinki. Ethical approval was obtained from the İzmir Katip Çelebi University Health Research Ethics Committee (decision number: 0669, date: 06.11.2025). Due to the retrospective design of the study, the requirement for informed consent was waived.

Acknowledgments

None.

Details

Primary Language

English

Subjects

Radiology and Organ Imaging

Journal Section

Research Article

Publication Date

April 27, 2026

Submission Date

February 4, 2026

Acceptance Date

March 12, 2026

Published in Issue

Year 2026 Volume: 6 Number: 1

APA
Şalbaş, A., Yoğurtçu, M., Ertem, Ö., & Gelal, F. (2026). Diagnostic performance of artificial intelligence-based large language models in acute ischemic stroke detection. Sağlık Bilimlerinde Yapay Zeka Dergisi, 6(1), 13-21. https://doi.org/10.52309/jaihs.1880689