Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy

Mehmet Ünal; Hakan Koç

doi:10.47482/acmr.1736570

Santral Seröz Koryoretinopatide Sık Sorulan Sorular için ChatGPT 3.5, Gemini 2.5 ve Microsoft Copilot Tarafından Oluşturulan Yanıtların Okunabilirliği ve Uygunluğu

Abstract

Aim: This study aims to compare the readability levels and informational quality of content on Central Serous Chorioretinopathy generated by three different artificial intelligence (AI)-based text generation tools. Materials and Methods: Forty questions on Central Serous Chorioretinopathy (CSCR) were selected from articles indexed in PubMed over the past 10 years. These questions were posed to three AI-based text generation tools (ChatGPT 3.5, Gemini 2.5, and Copilot), and the responses were analyzed. The resulting texts were compared in terms of readability [Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Simple Measure of Gobbledygook (SMOG), Gunning Fog Index (GFOG), and Automated Readability Index (ARI)], sentence length, and content quality. The content quality of the responses was scored by two independent researchers. Results: Although an FRES of ≥ 60 is required for optimal readability, none of the models examined reached this threshold. A statistically significant difference in FRES scores was found among the models (p = 0.01). Beyond FRES, no model reached an acceptable readability level on the other four scales either; values for all models were well above the thresholds, indicating that the content was generally of low readability. Gemini received significantly higher quality scores than the other models, indicating more satisfactory content quality (p < .001). Copilot, on the other hand, produced simpler texts with shorter and fewer sentences. Conclusion: This study showed that AI models give academic responses that are difficult to understand for those unfamiliar with medical terminology, and that they can produce outputs that differ in readability and quality when generating health-related content. These findings indicate that AI-based content requires careful evaluation before it is used in healthcare.

Keywords

Central Serous Chorioretinopathy, Quality, Readability, Artificial Intelligence

Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy

Abstract

Background: This study aims to compare the readability levels and informational quality of responses generated by three different artificial intelligence (AI)-based text generation models in relation to Central Serous Chorioretinopathy (CSCR). Materials and Methods: A total of 40 questions pertaining to CSCR were formulated based on articles indexed in PubMed over the past ten years. These questions were submitted to three AI-based text generation tools: ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot. The resulting responses were analyzed for readability using five standard indices: Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Simple Measure of Gobbledygook (SMOG), Gunning Fog Index (GFOG), and Automated Readability Index (ARI). Sentence lengths and structural complexity were also assessed. Content quality was independently evaluated and scored by two researchers using a standardized rubric. Results: None of the models achieved the optimal FRES threshold (≥ 60) required for adequate readability. A statistically significant difference in FRES scores was observed among the models (p = 0.01). Similarly, none of the models met the acceptable readability standards across the other four indices, with all scores exceeding recommended limits, indicating generally poor readability. Among the tools, Gemini yielded significantly higher quality scores compared to the others (p < .001), suggesting superior informational content. Conversely, Microsoft Copilot produced more concise outputs characterized by shorter and fewer sentences. Conclusions: The findings suggest that AI-generated responses regarding CSCR are often overly technical and may not be easily comprehensible to individuals without a medical background. Moreover, the study highlights variability among different AI models in terms of both readability and content quality. These results underscore the importance of critically evaluating AI-generated medical content prior to its dissemination for public or clinical use.
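
The five readability indices named in the abstract are conventionally computed from simple text counts (words, sentences, syllables, characters). The sketch below uses the commonly published forms of these formulas. It is not taken from the article, which does not state here which software or counting rules were used. The function names and the sample counts are hypothetical and chosen only for illustration.

```python
import math

# Illustrative sketch only: standard published formulas, hypothetical names.
# All functions take pre-computed counts; how syllables and "complex" words
# are counted varies between tools and is not specified in the abstract.

def fres(words, sentences, syllables):
    """Flesch Reading Ease Score: higher means easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def fkgl(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def smog(polysyllables, sentences):
    """SMOG grade from the count of words with three or more syllables."""
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291

def gunning_fog(words, sentences, complex_words):
    """Gunning Fog Index from the count of complex (3+ syllable) words."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))

def ari(characters, words, sentences):
    """Automated Readability Index from character and word counts."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

if __name__ == "__main__":
    # Hypothetical counts for one chatbot response (not data from the study).
    counts = dict(words=120, sentences=5, syllables=220)
    score = fres(**counts)
    # The abstract treats FRES >= 60 as the threshold for adequate readability.
    print(f"FRES = {score:.2f}, meets threshold: {score >= 60}")
    print(f"FKGL = {fkgl(**counts):.2f}")
    print(f"SMOG = {smog(30, 5):.2f}")
    print(f"GFOG = {gunning_fog(120, 5, 30):.2f}")
    print(f"ARI  = {ari(650, 120, 5):.2f}")
```

FRES runs in the opposite direction to the other four indices: a higher FRES means easier text, whereas FKGL, SMOG, GFOG, and ARI estimate a grade level, so higher values mean harder text. This is consistent with the abstract reporting FRES values below the threshold alongside values on the other indices that exceed recommended limits.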

Keywords

Central Serous Chorioretinopathy, Readability, Content Quality, Artificial Intelligence

Supporting Institution

The authors received no financial support for the research, authorship, and/or publication of this article.

Ethical Statement

This study did not involve human participants, patient data, or animal subjects. The study was based on the analysis of responses generated by artificial intelligence chatbots to predefined questions about Central Serous Chorioretinopathy. Therefore, ethics committee approval and informed consent to participate were not required.

Thanks

Not applicable.

Details

Primary Language

English

Subjects

Surgery (Other)

Journal Section

Research Article

Authors

Mehmet Ünal
0009-0005-0655-7373
Türkiye

Hakan Koç*
0000-0003-1241-1686
Türkiye

Publication Date

June 2, 2026

Submission Date

July 7, 2025

Acceptance Date

December 25, 2025

Published in Issue

Year 2026 Volume: 7 Number: 2

DOI

https://doi.org/10.47482/acmr.1736570

IZ

https://izlik.org/JA45UU22JC

APA

Ünal, M., & Koç, H. (2026). Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy. Archives of Current Medical Research, 7(2), 314-320. https://doi.org/10.47482/acmr.1736570

AMA

1.Ünal M, Koç H. Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy. Arch Curr Med Res. 2026;7(2):314-320. doi:10.47482/acmr.1736570

Chicago

Ünal, Mehmet, and Hakan Koç. 2026. “Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy”. Archives of Current Medical Research 7 (2): 314-20. https://doi.org/10.47482/acmr.1736570.

EndNote

Ünal M, Koç H (June 1, 2026) Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy. Archives of Current Medical Research 7 2 314–320.

IEEE

[1]M. Ünal and H. Koç, “Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy”, Arch Curr Med Res, vol. 7, no. 2, pp. 314–320, June 2026, doi: 10.47482/acmr.1736570.

ISNAD

Ünal, Mehmet - Koç, Hakan. “Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy”. Archives of Current Medical Research 7/2 (June 1, 2026): 314-320. https://doi.org/10.47482/acmr.1736570.

JAMA

1.Ünal M, Koç H. Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy. Arch Curr Med Res. 2026;7:314–320.

MLA

Ünal, Mehmet, and Hakan Koç. “Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy”. Archives of Current Medical Research, vol. 7, no. 2, June 2026, pp. 314-20, doi:10.47482/acmr.1736570.

Vancouver

1.Mehmet Ünal, Hakan Koç. Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy. Arch Curr Med Res. 2026 Jun. 1;7(2):314-20. doi:10.47482/acmr.1736570

Archives of Current Medical Research (ACMR) provides instant open access to all content, bearing in mind the fact that presenting research free to the public supports a greater global exchange of knowledge.

http://www.acmronline.org/
