Research Article

How much can large language models of Artificial Intelligence inform patients about urodynamics? A comparative analysis

Year 2026, Volume: 8 Issue: 2, 218 - 223, 10.03.2026
https://izlik.org/JA95LP68CP

Abstract

Aims: To evaluate and compare the readability and informational quality of current large language models (LLMs) in providing patient information about urodynamics (UD) testing.
Methods: This cross-sectional study, conducted on October 1, 2025, analyzed five widely used LLMs: ChatGPT-5, Gemini 2.5 Pro, Grok 4, Deepseek v3.1, and Microsoft Copilot. Of the top 25 UD-related keywords searched on Google Trends (2004-2025), six were excluded and the remainder were entered into each chatbot using identical prompts. Outputs were independently assessed with the Quality Analysis of Medical Artificial Intelligence (QAMAI) and DISCERN instruments for text quality and reliability, while the Flesch-Kincaid Reading Ease (FKRE) and Grade Level (FKGL) indices measured readability. Additionally, each LLM was asked to generate a visual depiction of a UD setting to assess the educational potential of AI-based multimodal content.
Results: The evaluated LLMs showed significant differences in readability and informational quality (p=0.001). Gemini achieved the highest FKRE score (49.0±8.4) and the lowest FKGL (9.4±1.3), indicating superior readability. Deepseek achieved the highest QAMAI (27.7±1.5) and DISCERN (71.5±6.4) scores, indicating superior quality and reliability. Copilot demonstrated lower readability and consistency scores compared with the other evaluated models. AI-generated visualizations of UD settings (using Gemini, GPT-5, Grok, Copilot, and DALL-E) effectively depicted the main components of the procedures.
Conclusion: LLMs show significant variability in the quality, accuracy, and readability of UD-related patient information. Deepseek delivered the most accurate and structured content, whereas Gemini provided the most understandable language. Continuous validation, guideline-based fine-tuning, and expert supervision are essential before AI chatbots can be reliably adopted in patient education and urology practice.
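The FKRE and FKGL readability indices used in this study follow the standard Flesch-Kincaid formulas. The sketch below is a minimal illustration of how such scores are computed; the vowel-group syllable counter is a rough heuristic for demonstration only, not the tool used by the authors:

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count vowel groups; drop a trailing silent "e".
    # Real readability tools use dictionaries or more elaborate rules.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_kincaid(text: str) -> tuple[float, float]:
    """Return (FKRE, FKGL) for a plain-text passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # average words per sentence
    spw = syllables / len(words)        # average syllables per word
    fkre = 206.835 - 1.015 * wps - 84.6 * spw   # Reading Ease (higher = easier)
    fkgl = 0.39 * wps + 11.8 * spw - 15.59      # Grade Level (US school grade)
    return round(fkre, 1), round(fkgl, 1)
```

Under this scoring, Gemini's mean FKRE of 49.0 corresponds to "difficult" college-level prose, which is why even the best-performing model still exceeds the commonly recommended sixth-grade reading level for patient materials.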

Ethical Statement

Ethical approval was not required because this study did not involve human subjects; only publicly available online information was used.

Supporting Institution

None



Details

Primary Language English
Subjects Urology
Journal Section Research Article
Authors

Çağrı Doğan 0000-0001-9681-2473

Mehmet Fatih Şahin 0000-0002-0926-3005

Submission Date December 4, 2025
Acceptance Date February 2, 2026
Publication Date March 10, 2026
IZ https://izlik.org/JA95LP68CP
Published in Issue Year 2026 Volume: 8 Issue: 2

Cite

AMA Doğan Ç, Şahin MF. How much can large language models of Artificial Intelligence inform patients about urodynamics? A comparative analysis. Anatolian Curr Med J. 2026;8(2):218-223. https://izlik.org/JA95LP68CP

TR DİZİN ULAKBİM and International Indexes (1b)

Interuniversity Board (UAK) Equivalency: Article published in a Ulakbim TR Index journal [10 POINTS], and article published in another (excluding 1a, b, c) internationally indexed journal (1d) [5 POINTS]

Note: Our journal is not WOS-indexed and therefore has no Q classification.

The Council of Higher Education [Yüksek Öğretim Kurumu (YÖK)] criteria decisions about predatory/questionable journals, the author's clarification text, and the journal charge policy can be downloaded here: https://dergipark.org.tr/tr/journal/3449/file/4924/show

Journal Indexes and Platforms: 

TR Dizin ULAKBİM, Google Scholar, Crossref, WorldCat (OCLC), DRJI, EuroPub, OpenAIRE, Turkiye Citation Index, Turk Medline, ROAD, ICI World of Journals, Index Copernicus, ASOS Index, General Impact Factor, Scilit.




Journal articles are evaluated through double-blind peer review.

All articles published in this journal are licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND) license.