Objective: To evaluate the diagnostic performance of a general-purpose artificial intelligence (AI) model in classifying vocal fold lesions using static laryngeal images.
Materials and Methods: This retrospective study included 175 cases representing 14 vocal fold pathologies. Two static endoscopic frames per case—captured during inspiration and phonation—were analysed using a GPT-4-based AI model via structured diagnostic prompts. The model had no prior training on laryngeal images. The diagnostic accuracy, sensitivity, specificity, precision, and F1-score were calculated. Chi-square testing was used to compare the observed accuracy to chance.
Results: The overall diagnostic accuracy was 29.14%. The model showed perfect accuracy (100%) for vocal fold haemorrhage and chronic fungal laryngitis, but failed to identify any case of vocal fold paralysis or leukoplakia. The sensitivity ranged from 0% to 100%, while the specificity was more stable (66%–75%). The macro-average and weighted-average F1-scores were 33.38% and 29.14%, respectively. The model performed significantly better than chance (p<0.001), with substantial variation across diagnoses.
Conclusion: Although the performance was inconsistent across pathologies, the model demonstrated high diagnostic accuracy in selected lesions. These findings support the potential of AI-assisted tools in laryngeal diagnostics, while underscoring the need for domain-specific training and validation.
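The metrics named in the Materials and Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and the toy labels are hypothetical, the per-class metrics are assumed to be computed one-vs-rest, and the chance level is assumed to be 1/14 (uniform guessing over 14 diagnoses), which the abstract does not state. The count of 51 correct cases is inferred from the reported 29.14% of 175.

```python
import math

def per_class_metrics(y_true, y_pred):
    """One-vs-rest sensitivity, specificity, precision and F1 for each label."""
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    out = {}
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        tn = n - tp - fp - fn
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        out[c] = {"sensitivity": sens, "specificity": spec,
                  "precision": prec, "f1": f1, "support": tp + fn}
    return out

def averaged_f1(metrics):
    """Macro (unweighted) and support-weighted averages of per-class F1."""
    total = sum(m["support"] for m in metrics.values())
    macro = sum(m["f1"] for m in metrics.values()) / len(metrics)
    weighted = sum(m["f1"] * m["support"] for m in metrics.values()) / total
    return macro, weighted

def chi_square_vs_chance(n_correct, n_total, chance):
    """Chi-square goodness-of-fit (1 df) of correct/incorrect counts vs. chance."""
    exp_correct = n_total * chance
    exp_wrong = n_total - exp_correct
    n_wrong = n_total - n_correct
    chi2 = ((n_correct - exp_correct) ** 2 / exp_correct
            + (n_wrong - exp_wrong) ** 2 / exp_wrong)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Toy labels (hypothetical, not study data) to exercise the metric functions.
y_true = ["polyp", "polyp", "nodule", "cyst"]
y_pred = ["polyp", "nodule", "nodule", "polyp"]
metrics = per_class_metrics(y_true, y_pred)
macro_f1, weighted_f1 = averaged_f1(metrics)

# 51 of 175 correct corresponds to the reported 29.14%; chance = 1/14 is assumed.
chi2, p_value = chi_square_vs_chance(51, 175, 1 / 14)
print(f"macro F1={macro_f1:.3f}, weighted F1={weighted_f1:.3f}")
print(f"chi2={chi2:.2f}, p<0.001: {p_value < 0.001}")
```

Under these assumptions the test statistic is far above the 1-df critical value, in line with the reported p<0.001. A different definition of chance (for example, one based on class prevalence) would change the expected counts.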
Keywords: Artificial intelligence; laryngeal diseases; laryngoscopy; vocal cords/pathology; diagnostic imaging
| Primary Language | English |
|---|---|
| Subjects | Otorhinolaryngology |
| Journal Section | Research Article |
| Authors | |
| Submission Date | July 25, 2025 |
| Acceptance Date | October 21, 2025 |
| Publication Date | January 16, 2026 |
| DOI | https://doi.org/10.26650/Tr-ENT.2025.1750893 |
| IZ | https://izlik.org/JA25BL78KL |
| Published in Issue | Year 2025 Volume: 35 Issue: 4 |