Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study
Abstract
Objective: Family members of patients admitted to the intensive care unit (ICU) frequently experience uncertainty, emotional distress, and significant informational needs. Communication gaps remain common in ICU practice due to time constraints and clinical complexity. Large language models (LLMs) may offer scalable support for patient–family information delivery; however, their performance in responding to real-world ICU family questions has not been systematically evaluated.
Methods: This evaluator-blinded, cross-sectional study compared the accuracy of responses generated by five widely used LLMs (Claude Sonnet 4.0, ChatGPT 5.0, Gemini 2.5, Grok-4, and Sonar) to questions commonly asked by ICU family members. A standardized set of 25 questions was generated by prompting each model to list frequently asked ICU family questions. All questions were subsequently posed to all five models in blinded, independent sessions. Two intensive care medicine specialists independently rated response accuracy using a 6-point Likert scale. Inter-rater reliability was assessed using Cohen’s kappa. Differences between models were analyzed using the Friedman test with post-hoc Wilcoxon signed-rank tests.
Results: A total of 125 responses were evaluated. Inter-rater agreement was moderate (Cohen’s κ = 0.56; overall agreement 73.6%). Accuracy scores differed significantly among models (p < 0.001). Claude Sonnet 4.0 achieved the highest mean accuracy score (5.66 ± 0.61), followed by ChatGPT 5.0, Gemini 2.5, and Sonar, with no statistically significant differences among these four models. Grok-4 was significantly less accurate than each of the other four models (all p < 0.001).
Conclusions: Most contemporary LLMs demonstrated high accuracy in answering questions commonly posed by ICU family members, although performance varied across platforms. Selected LLMs may serve as supportive tools to reinforce clinician–family communication; however, careful model selection, clinical oversight, and ethical safeguards are required before implementation in high-stakes intensive care settings.
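The inter-rater statistics named in the Methods (overall agreement and Cohen’s kappa) can be illustrated with a minimal Python sketch. It assumes the unweighted form of kappa, which the abstract does not specify. The function names and the ten example ratings are hypothetical and are not study data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items on which both raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)  # observed agreement
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    # Agreement expected by chance, from each rater's marginal frequencies.
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical 6-point Likert ratings from two raters (illustration only).
example_rater_a = [6, 6, 5, 6, 4, 5, 6, 3, 6, 5]
example_rater_b = [6, 5, 5, 6, 4, 5, 6, 2, 6, 6]

example_agreement = percent_agreement(example_rater_a, example_rater_b)
example_kappa = cohens_kappa(example_rater_a, example_rater_b)
print(f"overall agreement = {example_agreement:.1%}, kappa = {example_kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the matches expected by chance when both raters favor the same high scores. This is consistent with the gap between the reported 73.6% agreement and κ = 0.56. For an ordinal scale, a weighted kappa would be a reasonable alternative, though nothing in the abstract indicates that one was used.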
Keywords
References
- Lautrette A, Darmon M, Megarbane B, Joly LM, Chevret S, Adrie C, Barnoud D, et al. A communication strategy and brochure for relatives of patients dying in the ICU. N Engl J Med. 2007;356(5):469-78. doi:10.1056/NEJMoa063446.
- Curtis JR, Treece PD, Nielsen EL, Gold J, Ciechanowski PS, Shannon SE, et al. Randomized trial of communication facilitators to reduce family distress and intensity of end-of-life care. Am J Respir Crit Care Med. 2016;193(2):154-62. doi:10.1164/rccm.201505-0900OC.
- Aribas YK, Tefon Aribas AB. Comparative analysis of large language models in providing patient information about keratoconus and contact lenses. Int Ophthalmol. 2025;45(1):340. doi:10.1007/s10792-025-03711-2.
- Lambert R, Choo ZY, Gradwohl K, Schroedl L, Ruiz De Luzuriaga A. Assessing the application of large language models in generating dermatologic patient education materials according to reading level: qualitative study. JMIR Dermatol. 2024;7:e55898. doi:10.2196/55898.
- Chen D, Parsa R, Swanson K, Nunez JJ, Critch A, Bitterman DS, et al. Large language models in oncology: a review. BMJ Oncol. 2025;4(1):e000759. doi:10.1136/bmjonc-2025-000759.
- Cheungpasitporn W, Thongprayoon C, Ronco C, Kashani KB. Generative AI in critical care nephrology: applications and future prospects. Blood Purif. 2024;53(11–12):871-83. doi:10.1159/000541168.
- Biesheuvel LA, Workum JD, Reuland M, van Genderen ME, Thoral P, Dongelmans D, et al. Large language models in critical care. J Intensive Med. 2025;5(2):113-8. doi:10.1016/j.jointm.2024.12.001.
- Madden MG, McNicholas BA, Laffey JG. Assessing the usefulness of a large language model to query and summarize unstructured medical notes in intensive care. Intensive Care Med. 2023;49(8):1018-20. doi:10.1007/s00134-023-07128-2.
Details
Primary Language
English
Subjects
Internal Diseases, Intensive Care
Journal Section
Research Article
Authors
Mehmet Yıldırım
*
0000-0002-0526-5943
Türkiye
Arda Ayten
0009-0007-0639-4484
Türkiye
Esat Kivanc Kaya
0000-0002-3449-0701
Türkiye
Early Pub Date
June 15, 2026
Publication Date
-
Submission Date
December 15, 2025
Acceptance Date
February 2, 2026
Published in Issue
Year 2026 Number: Advanced Online Publication