TY - JOUR
T1 - Comparing the readability of human- and AI-written informed consent forms for provisional dental restorations
TT - Geçici dental restorasyonlara ait insan ve yapay zekâ tarafından yazılmış aydınlatılmış onam formlarının okunabilirliğinin karşılaştırılması
AU - Türker Kader, İzim
AU - Arıcan, Burçin
PY - 2025
DA - July
Y2 - 2025
DO - 10.32322/jhsm.1700777
JF - Journal of Health Sciences and Medicine
JO - J Health Sci Med
PB - MediHealth Academy Yayıncılık
WT - DergiPark
SN - 2636-8579
SP - 697
EP - 702
VL - 8
IS - 4
LA - en
AB - Aims: This study aimed to evaluate the readability of informed consent forms for provisional crowns and bridges by comparing a human-written version with AI-generated texts produced by two large language models (LLMs): GPT-4o (OpenAI) and Claude 3.7 Sonnet (Anthropic). Methods: A three-page informed consent form authored by a prosthodontic specialist was used as the human-written reference. Using identical structured prompts, comparable consent forms were generated by GPT-4o and Claude 3.7 Sonnet. Specifically, the models were instructed to first explain the clinical purpose of provisional dental restorations and then generate a three-page patient-oriented informed consent form, avoiding unnecessary technical jargon and adopting the tone of a prosthodontic specialist. The prompts guided the models to address each section sequentially: title of the form, patient identification, introductory statement, treatment and procedures, expected benefits, expected outcomes without treatment, treatment alternatives, possible risks and complications, estimated duration of the procedure, and signature section. Readability was assessed using the Flesch-Kincaid Grade Level (FKGL) metric, along with descriptive comparisons of word count, sentence count, and passive voice percentage. Results: The human-written form consisted of 1158 words, achieved an FKGL score of 10.8, and contained 34.5% passive voice. The GPT-4o form contained 956 words, with an FKGL of 12.6 and 20.4% passive voice. The Claude 3.7 Sonnet form had 1338 words, an FKGL of 14.7, and 35% passive voice. These results revealed marked differences in document length, sentence count, and passive voice usage, with the AI-generated texts displaying more complex sentence structures and higher reading grade levels. Conclusion: Although all forms exceeded the recommended readability level for patient-facing documents, the AI-generated versions, particularly the Claude 3.7 Sonnet form, were more difficult to read due to greater length and more complex sentence structure. These results underscore the importance of human oversight in editing and simplifying AI-generated materials, ensuring they meet the readability standards essential for patient comprehension.
KW - Artificial intelligence
KW - dental restoration
KW - health literacy
KW - informed consent
KW - natural language processing
KW - readability
N2 - Amaç: Bu çalışmanın amacı, geçici kron ve köprüler için hazırlanmış aydınlatılmış onam formlarının okunabilirliğini değerlendirmek ve bir prostodonti uzmanı tarafından yazılmış insan kaynaklı bir form ile iki büyük dil modeli (LLM) olan GPT-4o (OpenAI) ve Claude 3.7 Sonnet (Anthropic) tarafından üretilmiş yapay zekâ metinlerini karşılaştırmaktır. Gereç ve Yöntemler: İnsan kaynaklı referans belge olarak, bir prostodonti uzmanı tarafından hazırlanmış üç sayfalık aydınlatılmış onam formu kullanılmıştır. Aynı içerikte yapılandırılmış komutlarla, GPT-4o ve Claude 3.7 Sonnet modelleri aracılığıyla benzer formlar oluşturulmuştur. Modellerden önce geçici dental restorasyonların klinik amacını açıklamaları, ardından teknik terimlerden kaçınarak prostodonti uzmanı tonunda hasta odaklı üç sayfalık bir aydınlatılmış onam formu üretmeleri istenmiştir. Yönergeler, modelleri sırasıyla şu bölümleri içerecek şekilde yönlendirmiştir: form başlığı, hasta bilgileri, giriş açıklaması, tedavi ve prosedürler, beklenen yararlar, tedavi uygulanmazsa karşılaşılabilecek olası sonuçlar, tedavi alternatifleri, muhtemel riskler/komplikasyonlar, işlem süresi ve imza bölümü. Okunabilirlik, Flesch-Kincaid Okunabilirlik Düzeyi (FKGL) ölçütü ile değerlendirilmiş ve ayrıca kelime sayısı, cümle sayısı ve edilgen çatı kullanım oranı gibi ölçütlerle tanımlayıcı karşılaştırmalar yapılmıştır. Bulgular: İnsan tarafından yazılmış form 1158 kelime, 10,8 seviyesinde FKGL skoru ve %34,5 edilgen cümle oranına sahipti. GPT-4o tarafından üretilen form 956 kelime, 12,6 seviyesinde FKGL skoru ve %20,4 edilgenlik oranı gösterdi. Claude 3.7 Sonnet formu ise 1338 kelime, 14,7 seviyesinde FKGL skoru ve %35 edilgen cümle oranına sahipti. Bu sonuçlar, insan kaynaklı ve yapay zekâ tarafından oluşturulan formlar arasında okunabilirlik ve dil yapısı bakımından ölçülebilir farklar olduğunu göstermektedir. Sonuç: Standart komutlar kullanılmış olmasına rağmen, yapay zekâ tarafından oluşturulan onam formları insan yazımına göre daha yüksek okuma zorluğu göstermiştir. Bu bulgular, yapay zekâ destekli klinik materyallerin hastaların anlayabileceği düzeyde olmasını sağlamak için bir insanın nihai durumu gözden geçirmesi ve düzenlemesinin gerekliliğini ortaya koymaktadır.
CR - Minssen T, Vayena E, Cohen IG. The challenges for regulating medical use of ChatGPT and other large language models. JAMA. 2023;330(4):315-316. doi:10.1001/jama.2023.9651
CR - Ouyang L, Wu J, Jiang X, et al. Training language models to follow instructions with human feedback. arXiv. 2022;35:27730-27744. doi:10.48550/arXiv.2203.02155
CR - Carchiolo V, Malgeri M. Trends, challenges, and applications of large language models in healthcare: a bibliometric and scoping review. Future Internet. 2025;17(2):76. doi:10.3390/fi17020076
CR - Sobieska A, Starke G. Beyond words: extending a pragmatic view of language to large language models for informed consent. Am J Bioeth. 2025;25(4):82-85. doi:10.1080/15265161.2025.2470669
CR - Doğan I, Günel P, Berk İ, Berk Bİ. Evaluation of the readability, understandability, and accuracy of artificial intelligence chatbots in terms of biostatistics literacy. Eur J Ther. 2024;30(6):900-909. doi:10.58600/eurjther2569
CR - Ermis S, Alkin Z, Aydın O, et al. Assessing the responses of large language models (ChatGPT-4, Claude 3, Gemini, and Microsoft Copilot) to frequently asked questions in retinopathy of prematurity: a study on readability and appropriateness. J Pediatr Ophthalmol Strabismus. 2025;62(2):84-95. doi:10.3928/01913913-20240911-05
CR - Alam M, Rana S, Ikbal M, et al. Evaluating the accuracy, reliability, consistency, and readability of different large language models in restorative dentistry. J Esthet Restor Dent. 2025;37(2):301-309. doi:10.1111/jerd.13447
CR - Naji M, Masmoudi M, Baazaoui Zghal H. A novel approach for medical e-consent: leveraging language models for informed consent management. In: Nguyen NT, et al., eds. Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2024. Commun Comput Inf Sci. 2024:122-132. doi:10.1007/978-981-97-5937-8_9
CR - Raimann FJ, Neef V, Hennighausen MC, Zacharowski K, Flinspach AN. Evaluation of AI ChatBots for the creation of patient-informed consent sheets. Mach Learn Knowl Extr. 2024;6(2):1145-1153. doi:10.3390/make6020053
CR - Shi Q, Luzuriaga K, Allison JJ, et al. Transforming informed consent generation using large language models: mixed methods study. JMIR Med Inform. 2025;13(1):e68139. doi:10.2196/68139
CR - Allen JW, Schaefer O, Porsdam Mann S, Earp BD, Wilkinson D. Augmenting research consent: should large language models (LLMs) be used for informed consent to clinical research? Res Ethics. 2024:17470161241298726. doi:10.1177/17470161241298726
CR - Decker H, Trang K, Ramirez J, et al. Large language model-based chatbot vs surgeon-generated informed consent documentation for common procedures. JAMA Netw Open. 2023;6(10):e2336997. doi:10.1001/jamanetworkopen.2023.36997
CR - Tailor PD, Dalvin LA, Chen JJ, et al. A comparative study of responses to retina questions from either experts, expert-edited large language models (LLMs) or LLMs alone. Ophthalmol Sci. 2024;4(4):100485. doi:10.1016/j.xops.2024.100485
CR - Chen X, Meurers D. Word frequency and readability: predicting the text-level readability with a lexical-level attribute. J Res Read. 2017;41(3):486-510. doi:10.1111/1467-9817.12121
CR - Tamariz L, Palacio A, Robert M, Marcus EN. Improving the informed consent process for research subjects with low literacy: a systematic review. J Gen Intern Med. 2013;28(1):121-126. doi:10.1007/s11606-012-2133-2
CR - Hochhauser M. Consent forms: not easy to read. Appl Clin Trials. 2007;16:74.
CR - Denzen EM, Arora M, Rybicki L, et al. Easy-to-read informed consent forms for hematopoietic cell transplantation clinical trials. Biol Blood Marrow Transplant. 2012;18(2):183-189. doi:10.1016/j.bbmt.2011.07.022
CR - Larson E, Foe G, Lally R. Reading level and length of written research consent forms. Clin Transl Sci. 2015;8(4):355-356. doi:10.1111/cts.12253
CR - Donovan-Kicken E, Mackert M, Guinn T, Tollison A, Breckinridge B. Health literacy, self-efficacy, and patients’ assessment of medical disclosure and consent documentation. Health Commun. 2012;27(6):581-590. doi:10.1080/10410236.2011.618434
CR - Paasche-Orlow MK, Taylor HA, Brancati FL. Readability of consent form templates: a second look. IRB. 2013;35(4):12-19.
CR - Kutner M, Greenberg E, Jin Y, Paulsen C. The Health Literacy of America’s Adults: Results From the 2003 National Assessment of Adult Literacy. NCES 2006-483. Washington, DC: National Center for Education Statistics; 2006.
CR - Nishimura A, Carey J, Erwin PJ, Tilburt JC, Murad MH, McCormick JB. Improving understanding in the research informed consent process: a systematic review of 54 interventions tested in randomized control trials. BMC Med Ethics. 2013;14:28. doi:10.1186/1472-6939-14-28
CR - National Institutes of Health. Plain language at NIH. US Department of Health and Human Services. Published 2016. Accessed May 31, 2017. http://www.nih.gov/institutes-nih/nih-office-director/office-communicationspublic-liaison/clear-communication/plain-language
CR - Patras M, Naka O, Doukoudakis S, Pissiotis A. Management of provisional restorations’ deficiencies: a literature review. J Esthet Restor Dent. 2012;24(1):26-38. doi:10.1111/j.1708-8240.2011.00467.x
CR - Deshmukh S, Jaiswal K. Knowledge, attitude and practice of dentists regarding provisional restorations: a cross-sectional study. J Indian Prosthodont Soc. 2020;20(Suppl 1):S22. doi:10.4103/0972-4052.306373
CR - Restrepo E, Ko N, Warner ET. An evaluation of readability and understandability of online education materials for breast cancer survivors. J Cancer Surviv. 2024;18(2):457-465. doi:10.1007/s11764-022-01240-w
CR - Ozdemir ZM, Yapici E. Evaluating the accuracy, reliability, consistency, and readability of different large language models in restorative dentistry. J Esthet Restor Dent. 2025;37(7):1740-1752. doi:10.1111/jerd.13447
CR - Mirza FN, Tang OY, Connolly ID, et al. Using ChatGPT to facilitate truly informed medical consent. NEJM AI. 2024;1(2):AIcs2300145. doi:10.1056/AIcs2300145
CR - Clapp JT, Kruser JM, Schwarze ML, Hadler RA. Language in bioethics: beyond the representational view. Am J Bioeth. 2025;25(4):41-53. doi:10.1080/15265161.2024.2337394
CR - DuBay WH. The Principles of Readability. Costa Mesa, CA: Impact Information; 2004.
UR - https://doi.org/10.32322/jhsm.1700777
L1 - https://dergipark.org.tr/tr/download/article-file/4876799
ER -