Research Article

Diagnostic Performance of Multimodal Large Language Models in Assigning Bone-RADS Categories on CT

Volume: 52, 10 February 2026

Abstract

This study aimed to assess the performance of multimodal large language models (MLLMs) in assigning Bone-RADS categories to bone lesions identified on CT images. A musculoskeletal (MSK) radiologist selected one representative slice for each of 50 bone lesions seen on CT studies and assigned reference Bone-RADS categories using clinical records. Three raters categorized each case: an abdominal radiologist, OpenAI ChatGPT 5, and Google Gemini 2.5 Pro. Accuracy was defined as the proportion of correctly labeled Bone-RADS 1 and 4 cases and was compared between raters using the McNemar test. Agreement with the reference was assessed using weighted Cohen's κ with 95% CIs; pairwise κ differences were tested via bootstrap. The reference categories were Bone-RADS 1 (n=23), 2 (n=4), 3 (n=0), and 4 (n=23). Accuracy was 84.8% (39/46) for the radiologist, 78.3% (36/46) for Gemini, and 65.2% (30/46) for ChatGPT. The radiologist outperformed ChatGPT (p=0.012); the differences between the radiologist and Gemini (p=0.604) and between Gemini and ChatGPT (p=0.360) were not significant. The radiologist achieved the highest agreement with the reference standard (κ = 0.715, 95% CI: 0.543 to 0.887), followed by Gemini (κ = 0.542, 95% CI: 0.313 to 0.770) and ChatGPT (κ = 0.292, 95% CI: 0.104 to 0.479). Bootstrap comparisons showed that the radiologist's κ was higher than ChatGPT's (95% CI for the difference: 0.140 to 0.675), whereas the intervals for the radiologist vs Gemini (−0.113 to 0.434) and Gemini vs ChatGPT (−0.041 to 0.522) included zero and the differences were therefore not significant. In conclusion, general-purpose MLLMs cannot yet replace trained radiologists for Bone-RADS classification, though they may still aid routine clinical practice.
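The statistics named in the abstract can be illustrated with a short Python sketch. This is not the authors' analysis code: the abstract does not state the κ weighting scheme, the McNemar variant, or the bootstrap type, so linear weights, the exact (binomial) McNemar test, and a percentile bootstrap are all assumptions here, and the function names are hypothetical. Only the accuracy counts (39/46, 36/46, 30/46) come from the abstract; per-case ratings are not published, so any labels passed to these functions would be toy data.

```python
import random
from math import comb

def weighted_kappa(ref, rater, categories=(1, 2, 3, 4)):
    """Linearly weighted Cohen's kappa for two lists of ordinal labels.
    Linear weights are an assumption; the abstract does not name the scheme."""
    k, n = len(categories), len(ref)
    idx = {c: i for i, c in enumerate(categories)}
    obs = [[0.0] * k for _ in range(k)]          # observed joint proportions
    for a, b in zip(ref, rater):
        obs[idx[a]][idx[b]] += 1 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    w = lambda i, j: abs(i - j) / (k - 1)        # disagreement weight
    d_obs = sum(w(i, j) * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(w(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1 - d_obs / d_exp                     # ZeroDivisionError if d_exp == 0

def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value from paired correct/incorrect flags."""
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    n = b + c                                    # discordant pairs only
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def bootstrap_kappa_diff(ref, rater_a, rater_b, n_boot=2000, seed=0):
    """Percentile 95% CI for kappa(rater_a) - kappa(rater_b), resampling cases."""
    rng = random.Random(seed)
    n, diffs = len(ref), []
    for _ in range(n_boot):
        s = [rng.randrange(n) for _ in range(n)]
        r = [ref[i] for i in s]
        try:
            ka = weighted_kappa(r, [rater_a[i] for i in s])
            kb = weighted_kappa(r, [rater_b[i] for i in s])
        except ZeroDivisionError:                # degenerate resample, skip it
            continue
        diffs.append(ka - kb)
    diffs.sort()
    return diffs[int(0.025 * len(diffs))], diffs[int(0.975 * len(diffs)) - 1]

# Accuracy percentages recomputed from the counts reported in the abstract.
reported = {"radiologist": 39, "Gemini": 36, "ChatGPT": 30}
accuracy_pct = {name: round(100 * hits / 46, 1) for name, hits in reported.items()}
```

A bootstrap interval for the κ difference that excludes zero corresponds to the abstract's "significant" comparison (radiologist vs ChatGPT); intervals that include zero correspond to the non-significant ones.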

Details

Primary Language

English

Subjects

Radiology and Organ Imaging

Section

Research Article

Publication Date

10 February 2026

Submission Date

16 October 2025

Acceptance Date

18 December 2025

Published in Issue

Year 2026, Volume: 52

Cite

APA
Kaya, H. E., & Ataş, A. E. (2026). Diagnostic Performance of Multimodal Large Language Models in Assigning Bone-RADS Categories on CT. Journal of Uludağ University Medical Faculty, 52, 1804768. https://doi.org/10.32708/uutfd.1804768

ISSN: 1300-414X, e-ISSN: 2645-9027

Journal of Uludag University Medical Faculty is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
