Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study

Ata Baytaroğlu; Şerife Nur Çiftci

doi:10.33713/egetbd.1829121

Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Validity Study

Abstract

OBJECTIVE: To examine how closely artificial intelligence (AI)-based strabismus measurements derived from photographs of the nine diagnostic gaze positions agree with the actual diagnosis and deviation amounts recorded at clinical examination. MATERIALS and METHODS: Twenty cases followed with a diagnosis of horizontal strabismus were included. For each patient, the study used photographs of the nine gaze positions taken with the 9gaze application (See Vision LLC, Virginia, USA) during fixation on a near target, together with the horizontal and vertical deviation values recorded at clinical examination. Horizontal and vertical deviation amounts, incomitance status, pattern presence, and strabismus type were reviewed from the clinical records. The same photographs were uploaded to ChatGPT-5.0-Plus, and the diagnosis, incomitance, pattern, and deviation amounts produced by the AI algorithm were recorded. RESULTS: The mean age of the 20 cases was 21.0±20.9 (1–65) years; 10 (50%) were female and 10 (50%) were male. According to the actual diagnosis, 11 (55%) had esotropia and 9 (45%) had exotropia. The AI classified the clinical diagnosis correctly in 19/20 (95%) cases; agreement was excellent, with Cohen's kappa = 0.90. For esotropia, sensitivity was 90.9%, specificity was 100%, and overall accuracy was 95%. Clinical and AI analyses agreed in 75% of cases for comitance (kappa = 0.38). The AI algorithm was inadequate at detecting pattern deviations (80% agreement, kappa = -0.05). Strong correlations were observed for horizontal and vertical deviation (r = 0.87, p<0.001 and r = 0.77, p<0.001, respectively). No significant association was found between age or sex and absolute error magnitude (p>0.05 for all). CONCLUSION: AI-based analysis of photographs of the nine diagnostic gaze positions shows a high level of agreement with clinical measurements in estimating strabismus type and deviation amount. However, agreement is markedly lower for subtler diagnostic features such as incomitance and A/V pattern. The results indicate that appropriately trained AI systems can be used as a diagnostic support tool in strabismus practice but cannot replace clinical examination.

Ethical Statement

All procedures involving human participants in this study were carried out in accordance with the ethical standards of the Uşak University (TR) Non-Interventional Studies Ethics Committee (approval number: 827-827-16, date: 11.09.2025).

Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study

Abstract

OBJECTIVE: To investigate the degree of agreement between artificial intelligence (AI)-based strabismus measurements obtained from photographs of the nine diagnostic gaze positions and the actual diagnosis and amount of deviation recorded during clinical examination. MATERIALS and METHODS: The study included twenty cases diagnosed with horizontal strabismus. For each patient, nine gaze position photographs taken with the 9gaze application (See Vision LLC, Virginia, USA) under fixation on a near target were used, together with the horizontal and vertical deviation values recorded during clinical examination. Data on the amounts of horizontal and vertical deviation, incomitance status, pattern presence, and type of strabismus were reviewed from clinical records. The same photographs were uploaded to ChatGPT-5.0-Plus, and the diagnosis, incomitance, pattern, and deviation amounts generated by the AI algorithm were documented. RESULTS: The mean age of the 20 cases included in the study was 21.0±20.9 (1–65) years; 10 (50%) were female and 10 (50%) were male. According to the actual diagnosis, 11 (55%) had esotropia and 9 (45%) had exotropia. The AI classified the clinical diagnosis correctly in 19/20 (95%) cases, showing excellent agreement (Cohen's kappa = 0.90). Sensitivity for esotropia was 90.9%, specificity was 100%, and overall accuracy was 95%. Clinical and AI analyses showed 75% agreement for incomitance (kappa = 0.38). The AI algorithm was found to be inadequate in detecting pattern deviations (80% agreement, kappa = -0.05). Strong correlations were observed for horizontal and vertical deviation (r = 0.87, p<0.001 and r = 0.77, p<0.001, respectively). No significant relationship was found between age or sex and the absolute error magnitude (p>0.05 for all). CONCLUSION: AI-based analysis of nine diagnostic gaze position photographs shows a high level of agreement with clinical measurements in estimating strabismus type and deviation magnitude. However, agreement is much lower for more subtle diagnostic features such as incomitance and A/V pattern. The findings suggest that properly trained AI systems can serve as a useful diagnostic support tool in strabismus practice but cannot replace clinical examination, especially in cases of incomitant and patterned strabismus.
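The classification figures reported above can be reproduced from a 2×2 table. The sketch below is illustrative only: the abstract does not state the cell counts, so the values used (10 esotropia cases correctly labelled, 1 esotropia case labelled exotropia, 0 exotropia cases labelled esotropia, 9 exotropia cases correctly labelled) are inferred from the reported 11/9 split, 19/20 accuracy, 90.9% sensitivity, and 100% specificity. The function name `binary_agreement` is hypothetical and is not part of the study's methods.

```python
def binary_agreement(tp, fn, fp, tn):
    """Sensitivity, specificity, accuracy and Cohen's kappa for a 2x2 table.

    The clinical examination is the reference standard; "positive" means esotropia.
    """
    n = tp + fn + fp + tn
    observed = (tp + tn) / n                  # proportion of cases where AI and clinic agree
    p_ref_pos = (tp + fn) / n                 # clinical esotropia rate
    p_ai_pos = (tp + fp) / n                  # AI esotropia rate
    # Agreement expected by chance, computed from the marginal rates.
    expected = p_ref_pos * p_ai_pos + (1 - p_ref_pos) * (1 - p_ai_pos)
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": observed,
        "kappa": kappa,
    }


# Cell counts inferred from the abstract (assumption, see the text above).
stats = binary_agreement(tp=10, fn=1, fp=0, tn=9)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With these inferred counts, chance agreement is 0.50, so an observed agreement of 0.95 gives kappa = (0.95 - 0.50) / (1 - 0.50) = 0.90, matching the reported value. The same formula also shows why the pattern result can combine 80% raw agreement with a kappa near zero: when one category dominates, chance agreement is already high.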

Ethical Statement

All procedures involving human participants in this study adhered to the ethical standards of the Uşak University Non-Interventional Studies Ethics Committee (approval number: 827-827-16, date: 11.09.2025).

Details

Primary Language

English

Subjects

Surgery (Other)

Journal Section

Research Article

Authors

Ata Baytaroğlu*
0000-0003-1993-0403
Türkiye

Şerife Nur Çiftci
0000-0002-2362-7514
Türkiye

Publication Date

December 31, 2025

Submission Date

November 24, 2025

Acceptance Date

December 9, 2025

Published in Issue

Year 2025 Volume: 8 Number: 3

DOI

https://doi.org/10.33713/egetbd.1829121

IZ

https://izlik.org/JA49RS64EP

Cite

APA

Baytaroğlu, A., & Çiftci, Ş. N. (2025). Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study. Ege Tıp Bilimleri Dergisi, 8(3), 155-160. https://doi.org/10.33713/egetbd.1829121

AMA

1. Baytaroğlu A, Çiftci ŞN. Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study. Ege Tıp Bilimleri Dergisi. 2025;8(3):155-160. doi:10.33713/egetbd.1829121

Chicago

Baytaroğlu, Ata, and Şerife Nur Çiftci. 2025. “Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study”. Ege Tıp Bilimleri Dergisi 8 (3): 155-60. https://doi.org/10.33713/egetbd.1829121.

EndNote

Baytaroğlu A, Çiftci ŞN (December 1, 2025) Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study. Ege Tıp Bilimleri Dergisi 8 3 155–160.

IEEE

[1] A. Baytaroğlu and Ş. N. Çiftci, “Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study”, Ege Tıp Bilimleri Dergisi, vol. 8, no. 3, pp. 155–160, Dec. 2025, doi: 10.33713/egetbd.1829121.

ISNAD

Baytaroğlu, Ata - Çiftci, Şerife Nur. “Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study”. Ege Tıp Bilimleri Dergisi 8/3 (December 1, 2025): 155-160. https://doi.org/10.33713/egetbd.1829121.

JAMA

1. Baytaroğlu A, Çiftci ŞN. Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study. Ege Tıp Bilimleri Dergisi. 2025;8:155–160.

MLA

Baytaroğlu, Ata, and Şerife Nur Çiftci. “Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study”. Ege Tıp Bilimleri Dergisi, vol. 8, no. 3, Dec. 2025, pp. 155-60, doi:10.33713/egetbd.1829121.

Vancouver

1. Ata Baytaroğlu, Şerife Nur Çiftci. Pilot Validation of ChatGPT-Based Strabismus Assessment Using 9Gaze Photographs: A Single-Center Feasibility Study. Ege Tıp Bilimleri Dergisi. 2025 Dec. 1;8(3):155-60. doi:10.33713/egetbd.1829121