Research Article

A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS

Year 2024, Volume: 87 Issue: 4, 321 - 326, 25.10.2024
https://doi.org/10.26650/IUITFD.1494572

Abstract

Objective: This study evaluated the effectiveness of various large language models (LLMs) in simplifying Turkish reports of computed tomography (CT), a common imaging modality.
Material and Method: The study used fictional CT findings and followed the Standards for Reporting of Diagnostic Accuracy Studies (STARD) and the Declaration of Helsinki. Fifty fictional Turkish CT findings were created. Four LLMs (ChatGPT 4, ChatGPT-3.5, Gemini 1.5 Pro, and Claude 3 Opus) simplified the reports using the prompt: "Please explain them in a way that someone without a medical background can understand in Turkish". Readability was assessed with the Ateşman Readability Index, and accuracy was rated on a Likert scale.
Results: Claude 3 Opus scored highest in readability (58.9), followed by ChatGPT-3.5 (54.5), Gemini 1.5 Pro (53.7), and ChatGPT 4 (45.1). The Likert scores of Claude 3 Opus (mean: 4.7) and ChatGPT 4 (mean: 4.5) did not differ significantly (p>0.05). ChatGPT 4 had the highest word count (96.98) compared with Claude 3 Opus (90.6), Gemini 1.5 Pro (74.4), and ChatGPT-3.5 (38.7) (p<0.001).
Conclusion: This study shows that LLMs can simplify Turkish CT reports to a level that individuals without medical knowledge can understand, with high readability and accuracy. ChatGPT 4 and Claude 3 Opus produce the most accurate simplifications. The simpler sentences of Claude 3 Opus may make it the preferred option for Turkish CT reports.

Ethics Statement

Since no real patient information or data were used in this study, ethics committee approval is not required.


A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS


Abstract

Objective: This study evaluated the effectiveness of various large language models (LLMs) in simplifying Turkish reports of computed tomography (CT), a common imaging modality.
Material and Method: Using fictional CT findings, we followed the Standards for Reporting of Diagnostic Accuracy Studies (STARD) and the Declaration of Helsinki. Fifty fictional Turkish CT findings were generated. Four LLMs (ChatGPT 4, ChatGPT-3.5, Gemini 1.5 Pro, and Claude 3 Opus) simplified the reports using the prompt: "Please explain them in a way that someone without a medical background can understand in Turkish." Readability was assessed with the Ateşman Readability Index, and accuracy was rated on a Likert scale.
Results: Claude 3 Opus scored the highest in readability (58.9), followed by ChatGPT-3.5 (54.5), Gemini 1.5 Pro (53.7), and ChatGPT 4 (45.1). Likert scores for Claude 3 Opus (mean: 4.7) and ChatGPT 4 (mean: 4.5) showed no significant difference (p>0.05). ChatGPT 4 had the highest word count (96.98) compared to Claude 3 Opus (90.6), Gemini 1.5 Pro (74.4), and ChatGPT-3.5 (38.7) (p<0.001).
Conclusion: This study shows that LLMs can simplify Turkish CT reports to a level that individuals without medical knowledge can understand, with high readability and accuracy. ChatGPT 4 and Claude 3 Opus produced the most comprehensible simplifications. Claude 3 Opus' simpler sentences may make it the optimal choice for simplifying Turkish CT reports.
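
The readability scores reported above come from the Ateşman Readability Index, which is usually stated as 198.825 - 40.175 x (mean syllables per word) - 2.610 x (mean words per sentence). The sketch below shows one way such a score could be computed; it is not the authors' tooling. It assumes that syllables can be approximated by counting vowels (each Turkish syllable contains exactly one vowel) and that sentences end at `.`, `!`, or `?`. The function names `atesman_score` and `atesman_band` are hypothetical, and the band labels follow the commonly cited interpretation of the index.

```python
import re

# Assumption: one vowel per syllable, so vowel count approximates syllable count.
# Circumflexed vowels appear in some loanwords.
VOWELS = set("aeıioöuüAEIİOÖUÜâîûÂÎÛ")

# A word is a run of letters, optionally joined by apostrophes (e.g. BT'de).
WORD_RE = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def atesman_score(text: str) -> float:
    """Return the Ateşman readability score (higher means easier to read)."""
    # Naive sentence split; abbreviations with periods would be over-split.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = WORD_RE.findall(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(1 for word in words for ch in word if ch in VOWELS)
    x1 = syllables / len(words)       # mean word length in syllables
    x2 = len(words) / len(sentences)  # mean sentence length in words
    return 198.825 - 40.175 * x1 - 2.610 * x2


def atesman_band(score: float) -> str:
    """Map a score to the commonly cited difficulty bands (assumed cut-offs)."""
    if score >= 90:
        return "very easy"
    if score >= 70:
        return "easy"
    if score >= 50:
        return "medium difficulty"
    if score >= 30:
        return "difficult"
    return "very difficult"
```

Under these assumed bands, the reported mean of 58.9 for Claude 3 Opus would fall in the medium-difficulty range and the 45.1 for ChatGPT 4 in the difficult range. The word counts in the Results could be obtained from the same tokenization via `len(WORD_RE.findall(text))`, although the abstract does not say how words were counted.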

Ethics Statement

Since real patient information and data were not used in this study, ethics committee approval was not required.


Details

Primary Language: English
Subjects: Health Services and Systems (Other)
Section: Research
Authors

Eren Çamur 0000-0002-8774-5800

Turay Cesur 0000-0002-2726-8045

Yasin Celal Güneş 0000-0001-7631-854X

Publication Date: October 25, 2024
Submission Date: June 3, 2024
Acceptance Date: September 2, 2024
Published in Issue: Year 2024 Volume: 87 Issue: 4

Cite

APA Çamur, E., Cesur, T., & Güneş, Y. C. (2024). A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS. Journal of Istanbul Faculty of Medicine, 87(4), 321-326. https://doi.org/10.26650/IUITFD.1494572
AMA Çamur E, Cesur T, Güneş YC. A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS. İst Tıp Fak Derg. October 2024;87(4):321-326. doi:10.26650/IUITFD.1494572
Chicago Çamur, Eren, Turay Cesur, and Yasin Celal Güneş. “A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS”. Journal of Istanbul Faculty of Medicine 87, no. 4 (October 2024): 321-26. https://doi.org/10.26650/IUITFD.1494572.
EndNote Çamur E, Cesur T, Güneş YC (01 October 2024) A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS. Journal of Istanbul Faculty of Medicine 87 4 321–326.
IEEE E. Çamur, T. Cesur, and Y. C. Güneş, “A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS”, İst Tıp Fak Derg, vol. 87, no. 4, pp. 321–326, 2024, doi: 10.26650/IUITFD.1494572.
ISNAD Çamur, Eren et al. “A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS”. Journal of Istanbul Faculty of Medicine 87/4 (October 2024), 321-326. https://doi.org/10.26650/IUITFD.1494572.
JAMA Çamur E, Cesur T, Güneş YC. A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS. İst Tıp Fak Derg. 2024;87:321–326.
MLA Çamur, Eren et al. “A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS”. Journal of Istanbul Faculty of Medicine, vol. 87, no. 4, 2024, pp. 321-6, doi:10.26650/IUITFD.1494572.
Vancouver Çamur E, Cesur T, Güneş YC. A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS. İst Tıp Fak Derg. 2024;87(4):321-6.
