Feature Selection In Autism Dataset Using Apriori Algorithm
Year 2025, Volume: 14 Issue: 1, 40 - 53
Hidayet Takcı, Gizem Saçal
Abstract
Autism Spectrum Disorder is a neurodevelopmental disorder characterized by linguistic, cognitive, social, and communication delays. Accurate diagnosis of autism is important for effective intervention, and various clinical and non-clinical methods exist for diagnosing it. The use of machine learning techniques in this field has increased significantly in recent years. This study focuses on selecting the most relevant features from the Autism Spectrum Disorder Screening Data for Children, hosted in the UC Irvine Machine Learning Repository. The dataset contains 21 features, and the aim of the study is to determine which of them are most important. The Apriori algorithm, an association rule mining technique, was used to analyze the dataset. The algorithm produced rules with a minimum support of 33% and a minimum confidence of 90%, and feature selection was performed based on these rules. The proposed feature selection method was evaluated by comparing it with traditional methods. The experimental results confirmed the validity of the proposed approach, and the approach was demonstrated on autism diagnosis. The proposed method can also be applied to other datasets.
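The rule-based selection described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The support of an itemset is the fraction of records that contain it, and the confidence of a rule X -> Y is support(X and Y together) divided by support(X). The feature names A1, A2, A3, the label Class, the eight toy records, and the selection criterion (keep every feature that appears in the antecedent of a rule whose consequent is the class label) are all hypothetical choices made for this example; only the 33% and 90% thresholds come from the abstract.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every distinct item as a single-element itemset.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent, k = {}, 1
    while current:
        # Count each candidate and keep those that reach the support threshold.
        level = {}
        for cand in current:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (downward closure).
        k += 1
        current = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                current.add(union)
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for subset in combinations(itemset, r):
                antecedent = frozenset(subset)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, support, confidence))
    return rules

def select_features(rules, label="Class"):
    """Assumed criterion: features in antecedents of rules that predict the label."""
    selected = set()
    for antecedent, consequent, _, _ in rules:
        if len(consequent) == 1 and next(iter(consequent)).startswith(label + "="):
            selected |= {item.split("=")[0] for item in antecedent}
    return selected

# Hypothetical toy records encoded as "feature=value" items (not the UCI data).
records = [
    {"A1=1", "A2=1", "A3=0", "Class=YES"},
    {"A1=1", "A2=1", "A3=1", "Class=YES"},
    {"A1=1", "A2=1", "A3=0", "Class=YES"},
    {"A1=1", "A2=0", "A3=1", "Class=YES"},
    {"A1=0", "A2=0", "A3=0", "Class=NO"},
    {"A1=0", "A2=0", "A3=1", "Class=NO"},
    {"A1=0", "A2=1", "A3=0", "Class=NO"},
    {"A1=0", "A2=0", "A3=1", "Class=NO"},
]

frequent = apriori(records, min_support=0.33)
rules = association_rules(frequent, min_confidence=0.90)
selected = select_features(rules)
print(sorted(selected))  # ['A1', 'A2']
```

In this toy data A3 never takes part in a rule that reaches both thresholds with the class label as consequent, so it is dropped, while A1 and A2 are kept. On a real dataset the thresholds and the selection criterion would need to be tuned and validated against a classifier, as the abstract's comparison with traditional methods suggests.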
Turkish title: Apriori Algoritması Yardımıyla Otizm Veri Setinde Özellik Seçimi
Abstract (translated from Turkish)
Autism Spectrum Disorder is a neurodevelopmental disorder that can be characterized by linguistic, cognitive, social, and communicative delays. Accurate diagnosis of autism is important for effective intervention, and various clinical and non-clinical methods exist for diagnosing it. The use of machine learning techniques in this field has increased markedly in recent years. This study focuses on selecting the most relevant features from the Autism Spectrum Disorder Screening Data for Children in the UC Irvine Machine Learning Repository. The dataset contains 21 features, and the aim of the study is to determine which of them are more important. The Apriori algorithm, an association rule mining technique, was used to analyze the dataset. The algorithm produced rules with a minimum support of 33% and a minimum confidence of 90%. Feature selection was performed based on these rules. The proposed feature selection method was evaluated by comparing it with traditional methods. The experimental results confirmed the validity of the proposed approach, and the approach was demonstrated on autism diagnosis. The proposed method is also usable for other datasets.