tr-ent

The Turkish Journal of Ear Nose and Throat

2602-4837

Istanbul University

10.26650/Tr-ENT.2025.1750893

Otorhinolaryngology

Ear, Nose and Throat

Evaluating a General-Purpose AI Model for Diagnosing Vocal Fold Lesions Using Static Laryngeal Images

Ebru Karakaya Gojayev (https://orcid.org/0000-0003-4749-3511), Sincan Education and Research Hospital

Zahide Çiler Büyükatalay (https://orcid.org/0000-0002-0992-0079), Ankara University

Published: 01/16/2026

Volume 35, Issue 4, pages 171-177. Received: 07/25/2025. Accepted: 10/21/2025.

1990


Objective: To evaluate the diagnostic performance of a general-purpose artificial intelligence (AI) model in classifying vocal fold lesions using static laryngeal images.

Materials and Methods: This retrospective study included 175 cases representing 14 vocal fold pathologies. Two static endoscopic frames per case, one captured during inspiration and one during phonation, were analysed using a GPT-4-based AI model via structured diagnostic prompts. The model had no prior training on laryngeal images. Diagnostic accuracy, sensitivity, specificity, precision, and F1-score were calculated. Chi-square testing was used to compare the observed accuracy with chance.

Results: The overall diagnostic accuracy was 29.14%. The model achieved 100% accuracy for vocal fold haemorrhage and chronic fungal laryngitis, but failed to identify vocal fold paralysis and leukoplakia. Sensitivity ranged from 0% to 100%, while specificity was more stable (66%–75%). The macro-average and weighted-average F1-scores were 33.38% and 29.14%, respectively. The model performed significantly better than chance (p<0.001), with substantial variation across diagnoses.

Conclusion: Although performance was inconsistent across pathologies, the model demonstrated high diagnostic accuracy for selected lesions. These findings support the potential of AI-assisted tools in laryngeal diagnostics, while underscoring the need for domain-specific training and validation.
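The metrics named in the abstract can be illustrated with a short sketch. It shows one common way to derive per-class sensitivity, specificity, precision, and F1-score from a multi-class confusion matrix using a one-vs-rest reading, how macro and weighted F1 averages differ, and how a chi-square goodness-of-fit test against chance might be set up. Several things here are assumptions rather than details from the study: the function names are hypothetical, the 3x3 confusion matrix is toy data, the count of 51 correct cases is inferred from 29.14% of 175, and the chance level of 1/14 (uniform guessing among 14 pathologies) is only one plausible null hypothesis, since the abstract does not state which was used.

```python
import math


def per_class_metrics(confusion):
    """One-vs-rest metrics; confusion[i][j] = cases of true class i predicted as j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                          # missed cases of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp     # other classes called k
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        results.append({
            "support": tp + fn,
            "sensitivity": sens,
            "specificity": spec,
            "precision": prec,
            "f1": f1,
        })
    return results


def macro_and_weighted_f1(results):
    """Macro F1 averages classes equally; weighted F1 weights by class support."""
    total = sum(r["support"] for r in results)
    macro = sum(r["f1"] for r in results) / len(results)
    weighted = sum(r["f1"] * r["support"] for r in results) / total
    return macro, weighted


def chi_square_vs_chance(correct, total, chance):
    """Goodness-of-fit test (1 degree of freedom) of correct/incorrect counts vs. chance."""
    exp_correct = total * chance
    exp_wrong = total - exp_correct
    wrong = total - correct
    chi2 = ((correct - exp_correct) ** 2 / exp_correct
            + (wrong - exp_wrong) ** 2 / exp_wrong)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Toy 3-class confusion matrix (NOT data from the study).
    toy = [[5, 1, 0],
           [2, 3, 1],
           [0, 2, 4]]
    metrics = per_class_metrics(toy)
    print(macro_and_weighted_f1(metrics))

    # Assumed inputs: 51 of 175 correct (inferred from 29.14%), chance = 1/14.
    print(chi_square_vs_chance(51, 175, 1 / 14))
```

Under these assumed inputs the chi-square statistic is about 127.7 with one degree of freedom, which is consistent with the reported p<0.001. A different null hypothesis, for example chance weighted by class prevalence, would give a different statistic.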

Keywords: Artificial intelligence, laryngeal diseases, laryngoscopy, vocal cords/pathology, diagnostic imaging
