Research Article

Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study

Number: Advanced Online Publication
Early Pub Date: June 15, 2026

Abstract

Objective: Family members of patients admitted to the intensive care unit (ICU) frequently experience uncertainty, emotional distress, and significant informational needs. Communication gaps remain common in ICU practice due to time constraints and clinical complexity. Large language models (LLMs) may offer scalable support for patient–family information delivery; however, their performance in responding to real-world ICU family questions has not been systematically evaluated.
Methods: This evaluator-blinded, cross-sectional study compared the accuracy of responses generated by five widely used LLMs (Claude Sonnet 4.0, ChatGPT 5.0, Gemini 2.5, Grok-4, and Sonar) to questions commonly asked by ICU family members. A standardized set of 25 questions was generated by prompting each model to list frequently asked ICU family questions. All questions were subsequently posed to all five models in blinded, independent sessions. Two intensive care medicine specialists independently rated response accuracy using a 6-point Likert scale. Inter-rater reliability was assessed using Cohen’s kappa. Differences between models were analyzed using the Friedman test with post-hoc Wilcoxon signed-rank tests.
Results: A total of 125 responses were evaluated. Inter-rater agreement was moderate (Cohen’s κ = 0.56; overall agreement 73.6%). Accuracy scores differed significantly among models (p < 0.001). Claude Sonnet 4.0 achieved the highest mean accuracy score (5.66 ± 0.61), followed by ChatGPT 5.0, Gemini 2.5, and Sonar, with no statistically significant differences among these four models. Grok-4 demonstrated significantly lower accuracy compared with all other models (all p < 0.001).
Conclusions: Most contemporary LLMs demonstrated high accuracy in answering questions commonly posed by ICU family members, although performance varied across platforms. Selected LLMs may serve as supportive tools to reinforce clinician–family communication; however, careful model selection, clinical oversight, and ethical safeguards are required before implementation in high-stakes intensive care settings.
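
The inter-rater agreement statistics reported in the Results (overall agreement and Cohen's κ) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name `cohen_kappa` and the two rating lists are hypothetical, and the sketch assumes unweighted kappa, which the abstract does not specify.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Return (observed agreement, unweighted Cohen's kappa) for two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of responses given the same score by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical scores on a 6-point scale (illustrative only, not study data).
rater_1 = [6, 6, 5, 6, 4, 6, 5, 3, 6, 5]
rater_2 = [6, 5, 5, 6, 4, 6, 6, 3, 6, 5]

agreement, kappa = cohen_kappa(rater_1, rater_2)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts matches expected by chance. This explains how the study can report 73.6% overall agreement alongside a moderate κ of 0.56.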


Details

Primary Language

English

Subjects

Internal Diseases, Intensive Care

Journal Section

Research Article

Early Pub Date

June 15, 2026

Publication Date

-

Submission Date

December 15, 2025

Acceptance Date

February 2, 2026

Published in Issue

Year 2026 Number: Advanced Online Publication

APA
Yıldırım, M., Demiray, T. D., Ayten, A., & Kaya, E. K. (2026). Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study. Sakarya Medical Journal, Advanced Online Publication. https://doi.org/10.31832/smj.1842543
AMA
1. Yıldırım M, Demiray TD, Ayten A, Kaya EK. Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study. Sakarya Medical Journal. 2026;(Advanced Online Publication). doi:10.31832/smj.1842543
Chicago
Yıldırım, Mehmet, Tulay Dilara Demiray, Arda Ayten, and Esat Kivanc Kaya. 2026. “Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study”. Sakarya Medical Journal, no. Advanced Online Publication. https://doi.org/10.31832/smj.1842543.
EndNote
Yıldırım M, Demiray TD, Ayten A, Kaya EK (June 1, 2026) Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study. Sakarya Medical Journal Advanced Online Publication
IEEE
[1] M. Yıldırım, T. D. Demiray, A. Ayten, and E. K. Kaya, “Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study”, Sakarya Medical Journal, no. Advanced Online Publication, June 2026, doi: 10.31832/smj.1842543.
ISNAD
Yıldırım, Mehmet - Demiray, Tulay Dilara - Ayten, Arda - Kaya, Esat Kivanc. “Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study”. Sakarya Medical Journal. Advanced Online Publication (June 1, 2026). https://doi.org/10.31832/smj.1842543.
JAMA
1. Yıldırım M, Demiray TD, Ayten A, Kaya EK. Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study. Sakarya Medical Journal. 2026. doi:10.31832/smj.1842543.
MLA
Yıldırım, Mehmet, et al. “Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study”. Sakarya Medical Journal, no. Advanced Online Publication, June 2026, doi:10.31832/smj.1842543.
Vancouver
1. Mehmet Yıldırım, Tulay Dilara Demiray, Arda Ayten, Esat Kivanc Kaya. Can Large Language Models Support Family Communication in the Intensive Care Unit? A Comparative Accuracy Study. Sakarya Medical Journal. 2026 Jun. 1;(Advanced Online Publication). doi:10.31832/smj.1842543

The published articles in SMJ are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.