Evaluation of Large Language Model-Based Chatbots' Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment

Vasfiye Işık; Soner Şişmanoğlu

doi:10.62243/edr.1880544

Abstract

Aim: To evaluate the accuracy and guideline compliance of ChatGPT and Claude regarding the use of cone beam computed tomography (CBCT) in endodontics, based on the European Society of Endodontology (ESE) position statement on the use of CBCT in endodontics.

Material and method: A structured question set comprising 32 true/false statements and 8 open-ended questions was developed based on the ESE position statement. Questions were reviewed by an endodontic specialist and covered four main categories: CBCT indications and justification, radiation dose and imaging parameters, technical application and image interpretation, and training requirements and clinical responsibility. ChatGPT version 5.2 and Claude 4.5 Sonnet were evaluated in separate chat sessions without chat history. All responses were evaluated by a single investigator based on the ESE criteria. Data collection was completed on January 14, 2026.

Results: Both ChatGPT 5.2 and Claude 4.5 Sonnet achieved an overall accuracy of 97.5% (39/40 correct responses). For true/false statements, both chatbots correctly answered 31 out of 32 questions (96.88%), with identical errors on question 23. For open-ended questions, both achieved 100% accuracy (8/8 correct). McNemar's test could not be computed due to the absence of discordant pairs, indicating perfect agreement. Chi-square analysis showed no statistically significant difference between the two chatbots (χ² = 0.000, p = 1.000).

Conclusion: Both ChatGPT 5.2 and Claude 4.5 Sonnet demonstrated excellent accuracy and guideline compliance regarding CBCT use in endodontics based on the ESE position statement, suggesting their potential utility for guideline-based information retrieval in dental education and clinical practice.
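The figures reported in the abstract can be reproduced from the stated counts alone. The following is a minimal sketch, not the authors' analysis script: the function names are hypothetical, the chi-square is assumed to be an uncorrected Pearson test on a 2×2 table (the abstract does not say whether a continuity correction was applied), and the paired table assumes that both chatbots missed the same single item, as the abstract states for question 23.

```python
import math


def accuracy(correct, total):
    """Percentage of correct responses."""
    return 100.0 * correct / total


def chi_square_2x2(table):
    """Pearson chi-square (assumed: no continuity correction) for a 2x2 table.

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def mcnemar_statistic(b, c):
    """McNemar chi-square from the two discordant-pair counts b and c.

    Returns None when there are no discordant pairs, because the
    statistic (b - c)^2 / (b + c) is then undefined.
    """
    if b + c == 0:
        return None
    return (b - c) ** 2 / (b + c)


if __name__ == "__main__":
    print(accuracy(39, 40))  # overall: 97.5
    print(accuracy(31, 32))  # true/false: 96.875, reported as 96.88%
    print(accuracy(8, 8))    # open-ended: 100.0

    # Rows: chatbot; columns: (correct, incorrect).
    print(chi_square_2x2([[39, 1], [39, 1]]))

    # Paired view: 39 items both correct, 1 item both incorrect,
    # so both discordant counts are 0.
    print(mcnemar_statistic(0, 0))
```

Because the two rows of the contingency table are identical, every expected count equals the observed count, which is why the statistic is exactly zero and the p-value is 1. The same identity of responses leaves McNemar's test with a zero denominator.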

References

Abella, F., Patel, S., Duran-Sindreu, F., Mercade, M., Bueno, R., & Roig, M. (2014). An evaluation of periapical lesions using digital periapical radiography and cone-beam computed tomography. International Endodontic Journal, 47, 387–396.
Patel, S., Brown, J., Semper, M., Abella, F., & Mannocci, F. (2019). European Society of Endodontology position statement: Use of cone beam computed tomography in endodontics. International Endodontic Journal, 52, 1675–1678.
World Health Organization. (2021). Ethics and governance of artificial intelligence for health (pp. 1–165). World Health Organization.
Yu, K. H., Beam, A. L., & Kohane, I. S. (2018). Artificial intelligence in healthcare. Nature Biomedical Engineering, 2, 719–731.
Abdulrab, S., Abada, H., Mashyakhy, M., Mostafa, N., Alhadainy, H., & Halboub, E. (2025). Performance of four artificial intelligence chatbots in answering endodontic questions. Journal of Endodontics, 51(5), 602–608.
Danesh, A., Pazouki, H., Danesh, F., Danesh, A., & Vardar-Sengul, S. (2024). Artificial intelligence in dental education: ChatGPT’s performance on a dental examination. Journal of Periodontology, 95, 682–687.
Bulut, A. C., Bahadır, H. S., & Ateş, G. (2025). Artificial intelligence in dental education: Can AI-based chatbots compete with clinicians? BMC Medical Education, 25, 1319.
Asgari, E., Montaña-Brown, N., Dubois, M., et al. (2025). A framework to assess clinical safety and hallucination rates of LLMs for medical text summarisation. npj Digital Medicine, 8, 274.
Gu, B., Desai, R. J., Lin, K. J., et al. (2024). Probabilistic medical predictions of large language models. npj Digital Medicine, 7, 367.
Pauwels, R., Araki, K., Siewerdsen, J. H., & Thongvigitmanee, S. S. (2015). Technical aspects of dental CBCT: State of the art. Dentomaxillofacial Radiology, 44, 20140224.
Walker, H. L., Ghani, S., Kuemmerli, C., Nebiker, C. A., Müller, B. P., Raptis, D. A., & Staubli, S. M. (2023). Reliability of medical information provided by ChatGPT: Assessment against clinical guidelines and patient information quality instrument. Journal of Medical Internet Research, 25, e47479.
Johnson, A. J., Singh, T. K., Gupta, A., Sankar, H., Gill, I., Shalini, M., & Mohan, N. (2025). Evaluation of validity and reliability of AI chatbots as public sources of information on dental trauma. Dental Traumatology, 41(2), 187–193.
Díaz-Flores García, V., Freire, Y., Tortosa, M., Tejedor, B., Estevez, R., & Suárez, A. (2024). Google Gemini’s performance in endodontics: A study on answer precision and reliability. Applied Sciences, 14(15), 6390.
Ekmekci, E., & Durmazpinar, P. M. (2025). Evaluation of different artificial intelligence applications in responding to regenerative endodontic procedures. BMC Oral Health, 25(1), 53.
Rabiee, H., McDonald, N. J., Jacobs, R., Aminlari, A., & Inglehart, M. R. (2018). Endodontics program directors’, residents’, and endodontists’ considerations about CBCT-related graduate education. Journal of Dental Education, 82(9), 989–999.
Fayad, M. I., & Villa-Machado, P. (2025). CBCT in endodontics: Revolutionizing endodontic diagnosis and treatment. Dental Clinics of North America, 69(4), 497–514.
Chan, F., Brown, L. F., & Parashos, P. (2023). CBCT in contemporary endodontics. Australian Dental Journal, 68(Suppl 1), S39–S55.
Rajpurkar, P., Chen, E., Banerjee, O., & Topol, E. J. (2022). AI in health and medicine. Nature Medicine, 28(1), 31–38.
Moor, M., Banerjee, O., Abad, Z. S. H., et al. (2023). Foundation models for generalist medical artificial intelligence. Nature, 616(7956), 259–265.
Portilla, N. D., Garcia-Font, M., Nagendrababu, V., Abbott, P. V., Sanchez, J. A. G., & Abella, F. (2025). Accuracy and consistency of Gemini responses regarding the management of traumatized permanent teeth. Dental Traumatology, 41(2), 171–177.
Jalali, P., Mohammad-Rahimi, H., Wang, F. M., et al. (2025). Performance of seven artificial intelligence chatbots on board-style endodontic questions. Journal of Endodontics, 51(10), 1413–1419.
Uribe, S. E., Maldupa, I., Kavadella, A., et al. (2024). Artificial intelligence chatbots and large language models in dental education: Worldwide survey of educators. European Journal of Dental Education, 28(4), 865–876.
Turan Gökduman, C., Arılı Öztürk, E., Aktaş, Ş., & Çanakçi, B. C. (2025). Comparison of chatbots’ accuracy in endodontics questions in dentistry specialization exam in Türkiye: ChatGPT-4o, Gemini Advanced, Copilot, and Claude. BMC Oral Health, 26(1), 28.
Lafourcade, C., Kérourédan, O., Ballester, B., & Richert, R. (2025). Accuracy, consistency, and contextual understanding of large language models in restorative dentistry and endodontics. Journal of Dentistry, 157, 105764.

Details

Primary Language

English

Subjects

Endodontics

Journal Section

Research Article

Authors

Vasfiye Işık*
0000-0003-1622-2698
Türkiye

Soner Şişmanoğlu
0000-0002-1272-5581
Türkiye

Publication Date

April 30, 2026

Submission Date

February 2, 2026

Acceptance Date

March 18, 2026

Published in Issue

Year 2026 Volume: 4 Number: 1

DOI

https://doi.org/10.62243/edr.1880544

IZ

https://izlik.org/JA33LC66CW

Cite

APA

Işık, V., & Şişmanoğlu, S. (2026). Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment. Eurasian Dental Research, 4(1), 21-26. https://doi.org/10.62243/edr.1880544

AMA

1. Işık V, Şişmanoğlu S. Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment. EDR. 2026;4(1):21-26. doi:10.62243/edr.1880544

Chicago

Işık, Vasfiye, and Soner Şişmanoğlu. 2026. “Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment”. Eurasian Dental Research 4 (1): 21-26. https://doi.org/10.62243/edr.1880544.

EndNote

Işık V, Şişmanoğlu S (April 1, 2026) Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment. Eurasian Dental Research 4 1 21–26.

IEEE

[1] V. Işık and S. Şişmanoğlu, “Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment”, EDR, vol. 4, no. 1, pp. 21–26, Apr. 2026, doi: 10.62243/edr.1880544.

ISNAD

Işık, Vasfiye - Şişmanoğlu, Soner. “Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment”. Eurasian Dental Research 4/1 (April 1, 2026): 21-26. https://doi.org/10.62243/edr.1880544.

JAMA

1. Işık V, Şişmanoğlu S. Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment. EDR. 2026;4:21–26.

MLA

Işık, Vasfiye, and Soner Şişmanoğlu. “Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment”. Eurasian Dental Research, vol. 4, no. 1, Apr. 2026, pp. 21-26, doi:10.62243/edr.1880544.

Vancouver

1. Vasfiye Işık, Soner Şişmanoğlu. Evaluation of Large Language Model-Based Chatbots’ Accuracy in Responding to CBCT-Related Questions in Endodontics: A Guideline-Based Assessment. EDR. 2026 Apr. 1;4(1):21-6. doi:10.62243/edr.1880544