Artificial Intelligence in Patient Communication: Performance of GPT-3.5 and GPT-4 in Coronary Bypass Surgery
Year 2025, Volume: 17, Issue: 1, 100-105, 27.03.2025
Muhammet Fethi Sağlam, Emrah Uğuz, Kemal Erdoğan, Hüseyin Ünsal Erçelik, Murat Yücel, Cevat Ahmet Sert, Fatih Yamac, Erol Sener
Abstract
Objective: This study aims to evaluate the ability of GPT-3.5 and GPT-4 to provide accurate, comprehensible, and clinically relevant responses to common patient questions about coronary bypass surgery.
Method: A cross-sectional study was conducted at Ankara Yıldırım Beyazıt University Bilkent City Hospital with 80 cardiovascular surgery specialists. Participants rated the responses of GPT-3.5 and GPT-4 to 10 common patient questions about coronary bypass surgery on four criteria: accuracy, understandability, clinical appropriateness, and overall evaluation. Statistical analysis comprised independent t-tests, Cronbach's alpha reliability analysis, and Cohen's d effect size calculation.
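The three statistics named in the Method paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the ratings are made up, the t statistic assumes equal variances, and the layout used for Cronbach's alpha (rows as raters, columns as items) is an assumption, since the abstract does not say how the ratings were arranged.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def independent_t(a, b):
    """Student's t statistic for two independent samples (equal variances assumed)."""
    na, nb = len(a), len(b)
    return (mean(a) - mean(b)) / (pooled_sd(a, b) * sqrt(1 / na + 1 / nb))


def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one rater's scores across the k items."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]  # variance of each item
    total_var = variance([sum(row) for row in rows])   # variance of row totals
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up ratings for illustration only; these are NOT the study's data.
gpt4_scores = [3, 4, 3, 2, 3, 4]
gpt35_scores = [2, 1, 2, 2, 1, 2]
item_scores = [  # rows = raters, columns = items (assumed layout)
    [1, 2, 1],
    [2, 3, 2],
    [3, 4, 3],
    [4, 5, 4],
]

print("Cohen's d:", round(cohens_d(gpt4_scores, gpt35_scores), 2))
print("t statistic:", round(independent_t(gpt4_scores, gpt35_scores), 2))
print("Cronbach's alpha:", round(cronbach_alpha(item_scores), 2))
```

The sketch stops at the t statistic. Obtaining a p-value additionally requires the t distribution, which a statistics package such as SciPy (`scipy.stats.ttest_ind`) provides.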
Results: GPT-4 significantly outperformed GPT-3.5 on all four criteria. Mean scores were higher for GPT-4 in accuracy (3.02 vs. 1.77), understandability (2.99 vs. 1.81), clinical appropriateness (2.96 vs. 1.78), and overall evaluation (2.98 vs. 1.77) (p<0.05 for all). Cronbach's alpha values indicated good internal consistency (≥0.69 for all metrics), and Cohen's d effect sizes indicated large differences (1.54 to 1.65).
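As a rough consistency check on these figures, suppose Cohen's d was computed as the mean difference divided by a pooled standard deviation $s_p$. That is an assumption, because the abstract does not state which variant was used. The abstract also does not pair each effect size with a criterion, so only bounds on $s_p$ can be recovered, not one value per criterion:

```latex
d = \frac{\bar{x}_{\text{GPT-4}} - \bar{x}_{\text{GPT-3.5}}}{s_p}
\quad\Longrightarrow\quad
s_p = \frac{\bar{x}_{\text{GPT-4}} - \bar{x}_{\text{GPT-3.5}}}{d}

% Mean differences taken from the reported scores:
% accuracy 1.25, understandability 1.18, clinical appropriateness 1.18, overall 1.21
\frac{1.18}{1.65} \approx 0.72 \;\le\; s_p \;\le\; \frac{1.25}{1.54} \approx 0.81
```

Under that assumption, the reported means and effect sizes are mutually consistent with a pooled standard deviation of roughly 0.7 to 0.8 rating points.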
Conclusions: GPT-4 shows greater potential than GPT-3.5 for answering patient questions about coronary bypass surgery. Despite GPT-4's strengths, its occasional inaccuracies and incomplete responses highlight the need for further refinement. Future research should integrate patient feedback and evaluate the real-world clinical impact of these models to optimize their application in healthcare.