Research Article

Transformer-Based Deep Learning for Autism Detection from Facial Images

Year: 2025, Volume: 15, Issue: 3, Pages: 755-764, Published: 01.09.2025
https://doi.org/10.21597/jist.1640353

Abstract

This study presents a comparative analysis of four Transformer-based deep learning architectures for detecting Autism Spectrum Disorder (ASD) from facial images: Vision Transformer (ViT), Swin Transformer (Swin-T), Data-efficient Image Transformer (DeiT), and Co-Scale Conv-Attentional Image Transformer (CoaT). In recent years, Transformer architectures have increasingly replaced traditional convolutional neural network-based approaches in ASD detection research. In the experiments, the Swin-T model achieved the highest classification performance, with 87.76% accuracy and an AUC of 0.96. The CoaT model followed closely with 86.01% accuracy and an AUC of 0.94, while DeiT (84.27% accuracy) and ViT (82.52% accuracy) performed somewhat worse. Confusion matrix and ROC curve analyses indicate that the Swin-T model substantially reduced both false positive and false negative rates. These findings highlight the effectiveness of the Swin-T and CoaT models in visual data processing and suggest that, when supported by larger datasets, these architectures could make valuable contributions to early ASD diagnosis in both clinical and research settings.
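
The metrics named in the abstract (accuracy, false positive and false negative rates from a confusion matrix, and the area under the ROC curve) can be illustrated with a minimal sketch. The labels and scores below are toy values invented for illustration, not data or results from the study, and the helper names `confusion_counts` and `roc_auc` are hypothetical. The AUC is computed here through its rank interpretation (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one), which may differ from the procedure the authors used.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive (ASD) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values, not from the paper): 4 positive and 4 negative samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 decision threshold is an assumption

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
false_positive_rate = fp / (fp + tn)
false_negative_rate = fn / (fn + tp)
auc = roc_auc(y_true, scores)

print(f"accuracy={accuracy:.4f} FPR={false_positive_rate:.4f} "
      f"FNR={false_negative_rate:.4f} AUC={auc:.4f}")
```

Accuracy and the two error rates depend on the chosen decision threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why a model comparison typically reports both.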

Turkish title: Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme

Ethical Statement

This study used publicly available datasets, so no ethics committee approval is required. The datasets were obtained from https://www.kaggle.com/cihan063/autism-image-data.


Details

Primary Language: Turkish
Subjects: Computer Software
Journal Section: Bilgisayar Mühendisliği / Computer Engineering
Authors:

Faruk Cengiz (ORCID: 0009-0009-6825-6532)

Fesih Keskin (ORCID: 0000-0002-3798-2912)

Early Pub Date: August 31, 2025
Publication Date: September 1, 2025
Submission Date: February 17, 2025
Acceptance Date: March 30, 2025
Published in Issue: Year 2025, Volume 15, Issue 3

Cite

APA Cengiz, F., & Keskin, F. (2025). Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme. Journal of the Institute of Science and Technology, 15(3), 755-764. https://doi.org/10.21597/jist.1640353
AMA Cengiz F, Keskin F. Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme. J. Inst. Sci. and Tech. September 2025;15(3):755-764. doi:10.21597/jist.1640353
Chicago Cengiz, Faruk, and Fesih Keskin. “Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme”. Journal of the Institute of Science and Technology 15, no. 3 (September 2025): 755-64. https://doi.org/10.21597/jist.1640353.
EndNote Cengiz F, Keskin F (September 1, 2025) Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme. Journal of the Institute of Science and Technology 15 3 755–764.
IEEE F. Cengiz and F. Keskin, “Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme”, J. Inst. Sci. and Tech., vol. 15, no. 3, pp. 755–764, 2025, doi: 10.21597/jist.1640353.
ISNAD Cengiz, Faruk - Keskin, Fesih. “Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme”. Journal of the Institute of Science and Technology 15/3 (September 2025), 755-764. https://doi.org/10.21597/jist.1640353.
JAMA Cengiz F, Keskin F. Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme. J. Inst. Sci. and Tech. 2025;15:755–764.
MLA Cengiz, Faruk and Fesih Keskin. “Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme”. Journal of the Institute of Science and Technology, vol. 15, no. 3, 2025, pp. 755-64, doi:10.21597/jist.1640353.
Vancouver Cengiz F, Keskin F. Yüz Görüntülerinden Otizm Tespiti İçin Transformer Tabanlı Derin Öğrenme. J. Inst. Sci. and Tech. 2025;15(3):755-64.