Research Article

Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy

Volume: 7 Number: 2 June 2, 2026
Mehmet Ünal, Hakan Koç*

Abstract

Background: This study aims to compare the readability and informational quality of responses about Central Serous Chorioretinopathy (CSCR) generated by three artificial intelligence (AI)-based text generation models.

Materials and Methods: A total of 40 questions pertaining to CSCR were formulated based on articles indexed in PubMed over the past ten years. These questions were submitted to three AI-based text generation tools: ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot. The resulting responses were analyzed for readability using five standard indices: Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Simple Measure of Gobbledygook (SMOG), Gunning Fog Index (GFOG), and Automated Readability Index (ARI). Sentence lengths and structural complexity were also assessed. Content quality was independently evaluated and scored by two researchers using a standardized rubric.

Results: None of the models achieved the optimal FRES threshold (≥ 60) required for adequate readability. A statistically significant difference in FRES scores was observed among the models (p = 0.01). Similarly, none of the models met the acceptable readability standards on the other four indices; all scores exceeded the recommended limits, indicating generally poor readability. Among the tools, Gemini yielded significantly higher quality scores than the others (p < .001), suggesting superior informational content. Conversely, Microsoft Copilot produced more concise outputs, characterized by shorter and fewer sentences.

Conclusions: The findings suggest that AI-generated responses regarding CSCR are often overly technical and may not be easily comprehensible to individuals without a medical background. Moreover, the study highlights variability among AI models in both readability and content quality. These results underscore the importance of critically evaluating AI-generated medical content before it is disseminated for public or clinical use.
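For readers unfamiliar with the five indices named in the abstract, the sketch below shows how each one is conventionally computed from simple text counts. It uses the commonly published formulas; the abstract does not state which software or counting rules the authors used, so the function names, the choice to pass in pre-counted values, and the example numbers are all assumptions made for illustration, not data from the study.

```python
"""Standard readability formulas, computed from pre-counted text statistics.

Illustrative only: the function names are hypothetical, and the counts
(words, sentences, syllables, ...) must be supplied by the caller.
"""
from math import sqrt


def fres(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease Score: higher means easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def fkgl(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def smog(polysyllables: int, sentences: int) -> float:
    """SMOG grade; polysyllables = words with three or more syllables."""
    return 1.0430 * sqrt(polysyllables * (30 / sentences)) + 3.1291


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog Index; complex_words = words with three or more syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def ari(characters: int, words: int, sentences: int) -> float:
    """Automated Readability Index, based on characters rather than syllables."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


def meets_fres_threshold(score: float, threshold: float = 60.0) -> bool:
    """Check a score against the FRES >= 60 threshold cited in the abstract."""
    return score >= threshold


if __name__ == "__main__":
    # Made-up counts for a hypothetical chatbot answer (not study data).
    score = fres(words=100, sentences=5, syllables=150)
    print(round(score, 3), meets_fres_threshold(score))
    print(round(fkgl(words=100, sentences=5, syllables=150), 2))
```

Taking counts as inputs keeps the formulas separate from syllable and sentence detection, which is where readability tools tend to differ and which the abstract does not specify.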

Keywords

Central Serous Chorioretinopathy, Readability, Content Quality, Artificial Intelligence

Supporting Institution

The authors received no financial support for the research, authorship, and/or publication of this article.

Ethical Statement

This study did not involve human participants, patient data, or animal subjects. The study was based on the analysis of responses generated by artificial intelligence chatbots to predefined questions about Central Serous Chorioretinopathy. Therefore, ethics committee approval and informed consent to participate were not required.

Thanks

Not applicable.

APA
Ünal, M., & Koç, H. (2026). Readability and Appropriateness of Responses Generated by ChatGPT 3.5, Gemini 2.5, and Microsoft Copilot to Frequently Asked Questions About Central Serous Chorioretinopathy. Archives of Current Medical Research, 7(2), 314-320. https://doi.org/10.47482/acmr.1736570