Evaluation of Large Language Model-Based Chatbots' Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment
Abstract
Aim: To evaluate the accuracy and guideline compliance of ChatGPT and Claude on the use of cone beam computed tomography (CBCT) in endodontics, using the European Society of Endodontology (ESE) position statement on the use of CBCT in endodontics as the reference standard.
Materials and methods: A structured question set of 32 true/false statements and 8 open-ended questions was developed from the ESE position statement. The questions were reviewed by an endodontic specialist and covered four main categories: CBCT indications and justification, radiation dose and imaging parameters, technical application and image interpretation, and training requirements and clinical responsibility. ChatGPT version 5.2 and Claude 4.5 Sonnet were each queried in separate chat sessions with no chat history. A single investigator scored all responses against the ESE criteria. Data collection was completed on January 14, 2026.
Results: Both ChatGPT 5.2 and Claude 4.5 Sonnet achieved an overall accuracy of 97.5% (39/40 correct responses). For the true/false statements, each chatbot answered 31 of 32 correctly (96.88%), and both made the same error on question 23. For the open-ended questions, both achieved 100% accuracy (8/8 correct). McNemar's test could not be computed because there were no discordant pairs, which indicates perfect agreement between the two chatbots. Chi-square analysis showed no statistically significant difference between them (χ² = 0.000, p = 1.000).
Conclusion: Both ChatGPT 5.2 and Claude 4.5 Sonnet answered with high accuracy (97.5%) and in line with the ESE position statement on CBCT use in endodontics, suggesting potential utility for guideline-based information retrieval in dental education and clinical practice.
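The figures reported in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis code: it rebuilds per-item scores from the abstract (40 items, both chatbots wrong only on question 23), and it assumes items 1 to 32 are the true/false statements, items 33 to 40 are the open-ended questions, and that the chi-square test was a Pearson test without continuity correction on the 2×2 table of chatbot by correct/incorrect. All function and variable names are hypothetical.

```python
from math import erfc, sqrt

N_TRUE_FALSE = 32   # items 1..32 (assumed ordering)
N_OPEN_ENDED = 8    # items 33..40 (assumed ordering)
N_ITEMS = N_TRUE_FALSE + N_OPEN_ENDED
MISSED_ITEM = 23    # the abstract reports the same error for both chatbots


def item_scores(missed):
    """Return one boolean per item: True if the answer was scored correct."""
    return [item not in missed for item in range(1, N_ITEMS + 1)]


def accuracy(results):
    """Percentage of correct responses."""
    return 100.0 * sum(results) / len(results)


def discordant_counts(a, b):
    """Count items where exactly one of the two chatbots was correct."""
    only_a = sum(x and not y for x, y in zip(a, b))
    only_b = sum(y and not x for x, y in zip(a, b))
    return only_a, only_b


def mcnemar_statistic(only_a, only_b):
    """McNemar chi-square; undefined (None) when there are no discordant pairs."""
    total = only_a + only_b
    if total == 0:
        return None
    return (only_a - only_b) ** 2 / total


def pearson_chi_square(table):
    """Pearson chi-square without continuity correction for a 2x2 table (df = 1)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


chatgpt = item_scores({MISSED_ITEM})
claude = item_scores({MISSED_ITEM})

overall = accuracy(chatgpt)
true_false = accuracy(chatgpt[:N_TRUE_FALSE])
open_ended = accuracy(chatgpt[N_TRUE_FALSE:])
discordant = discordant_counts(chatgpt, claude)
mcnemar = mcnemar_statistic(*discordant)

# Rows: chatbot; columns: correct, incorrect.
table = [
    [sum(chatgpt), N_ITEMS - sum(chatgpt)],
    [sum(claude), N_ITEMS - sum(claude)],
]
chi2, p_value = pearson_chi_square(table)

print(overall, true_false, open_ended)
print(discordant, mcnemar)
print(chi2, p_value)
```

Because the two rows of the 2×2 table are identical, every observed count equals its expected count, so the chi-square statistic is zero and the p-value is one. The McNemar statistic divides by the number of discordant pairs, which is why it cannot be computed when that number is zero.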
References
- Abella, F., Patel, S., Duran-Sindreu, F., Mercade, M., Bueno, R., & Roig, M. (2014). An evaluation of periapical lesions using digital periapical radiography and cone-beam computed tomography. International Endodontic Journal, 47, 387–396.
- Patel, S., Brown, J., Semper, M., Abella, F., & Mannocci, F. (2019). European Society of Endodontology position statement: Use of cone beam computed tomography in endodontics. International Endodontic Journal, 52, 1675–1678.
- World Health Organization. (2021). Ethics and governance of artificial intelligence for health (pp. 1–165). World Health Organization.
- Yu, K. H., Beam, A. L., & Kohane, I. S. (2018). Artificial intelligence in healthcare. Nature Biomedical Engineering, 2, 719–731.
- Abdulrab, S., Abada, H., Mashyakhy, M., Mostafa, N., Alhadainy, H., & Halboub, E. (2025). Performance of four artificial intelligence chatbots in answering endodontic questions. Journal of Endodontics, 51(5), 602–608.
- Danesh, A., Pazouki, H., Danesh, F., Danesh, A., & Vardar-Sengul, S. (2024). Artificial intelligence in dental education: ChatGPT’s performance on a dental examination. Journal of Periodontology, 95, 682–687.
- Bulut, A. C., Bahadır, H. S., & Ateş, G. (2025). Artificial intelligence in dental education: Can AI-based chatbots compete with clinicians? BMC Medical Education, 25, 1319.
- Asgari, E., Montaña-Brown, N., Dubois, M., et al. (2025). A framework to assess clinical safety and hallucination rates of LLMs for medical text summarisation. npj Digital Medicine, 8, 274.
Details
Primary Language: English
Subjects: Endodontics
Journal Section: Research Article
Publication Date: April 30, 2026
Submission Date: February 2, 2026
Acceptance Date: March 18, 2026
Published in Issue: Year 2026, Volume 4, Number 1
APA
Işık, V., & Şişmanoğlu, S. (2026). Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment. Eurasian Dental Research, 4(1), 21-26. https://doi.org/10.62243/edr.1880544