A Comparative Analysis of Machine Learning Techniques to Explore Factors Affecting Mathematics Success in Developing Countries: Turkey, Mexico, Thailand, And Bulgaria Case Studies

Tuba Arpa; Mahmut Çavur

doi:10.59940/jismar.1514958

Dissertation

Using Machine Learning Techniques to Investigate the Factors Affecting Mathematics Achievement in Developing Countries: The Case of Turkey, Mexico, Thailand, and Bulgaria

Year 2024, Volume: 6 Issue: 2, 24 - 36, 30.12.2024

Abstract

Using PISA 2018 data, this study compares the effectiveness of various machine learning models in identifying the factors that affect the achievement of students in Turkey, Bulgaria, Mexico, and Thailand. For regression, linear regression, support vector machine, decision tree, and random forest models were used; for classification, logistic regression, support vector machine, decision tree, and random forest models were used. In addition, the key determinants of mathematics achievement were identified with XGBoost, and missing data were filled in with K-Means clustering. According to the results, the main contributing factors in all countries are students' economic and sociocultural status, study materials at home, sense of responsibility, and the interest shown by their families. In terms of model performance, the random forest model was more successful than the other models in both regression and classification: random forest regression achieved the highest R-squared values (71%-84%), while linear regression gave the lowest (22%-43%). The classification algorithms were also analyzed for binary and ternary classification, and binary classification was observed to be more successful than ternary classification. The accuracy scores of the random forest algorithm ranged from 73% to 83% across countries. The findings offer decision-makers valuable insights for selecting the most suitable algorithms for predicting the factors that affect students' mathematics achievement, and help them improve educational outcomes.
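The two evaluation measures reported in the abstract, R-squared for regression and accuracy for classification, can be sketched in a few lines. The snippet below is only an illustration: the scores are made up (they are not PISA data), the helper names are hypothetical, and the class cutoffs are assumptions, since the abstract does not state how the binary and ternary classes were defined.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true: Sequence[int], labels_pred: Sequence[int]) -> float:
    """Share of predictions that match the true class label."""
    hits = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return hits / len(labels_true)


def to_class(score: float, cutoffs: Sequence[float]) -> int:
    """Map a score to a class index: the number of cutoffs it reaches."""
    return sum(1 for c in cutoffs if score >= c)


# Made-up true and predicted scores, for illustration only.
scores_true = [380.0, 420.0, 455.0, 510.0, 560.0, 600.0]
scores_pred = [400.0, 410.0, 470.0, 490.0, 565.0, 580.0]

r2 = r_squared(scores_true, scores_pred)

# Hypothetical cutoffs: one threshold gives two classes, two give three.
binary_cutoffs = [480.0]
ternary_cutoffs = [415.0, 500.0]

acc_binary = accuracy(
    [to_class(s, binary_cutoffs) for s in scores_true],
    [to_class(s, binary_cutoffs) for s in scores_pred],
)
acc_ternary = accuracy(
    [to_class(s, ternary_cutoffs) for s in scores_true],
    [to_class(s, ternary_cutoffs) for s in scores_pred],
)

print(f"R^2 = {r2:.3f}, binary acc = {acc_binary:.2f}, ternary acc = {acc_ternary:.2f}")
```

On this toy data the same predictions score higher on the two-class task than on the three-class task, simply because every extra class boundary is another place where a near miss counts as an error. Whether that is the reason for the gap observed in the study is not something the abstract states.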

Keywords

PISA, Machine Learning, Student Achievement, Mathematics Achievement, Comparing Algorithms

Ethical Statement

I declare that scientific and ethical rules were followed in writing the article titled "A Comparative Analysis of Machine Learning Techniques to Explore Factors Affecting Mathematics Success in Developing Countries: Turkey, Mexico, Thailand, And Bulgaria Case Studies", which I submit as a summary of my Master's thesis; that the sources quoted are credited in the text, footnotes, and references in accordance with scientific rules; and that the data used have not been falsified in any way.

Supporting Institution

Kadir Has Üniversitesi

Thanks

I would like to thank my thesis advisor and esteemed professor, Mahmut Çavur, for all the support and knowledge he has provided.

Abstract

This study uses PISA 2018 data to explore the factors that influence mathematics achievement in Turkey, Bulgaria, Mexico, and Thailand, and compares the performance of several machine learning models on this task. Both regression and classification models were used: linear regression, support vector machine, decision tree, and random forest for regression; logistic regression, support vector machine, decision tree, and random forest for classification. In addition, XGBoost was used to identify the key predictors of mathematics achievement, and K-Means clustering was used to fill in missing data. According to the results, the key contributing factors across all countries were students' economic, social, and cultural status, study materials at home, sense of ownership, and family welfare. Random forest outperformed the other models in both regression and classification: random forest regression achieved the highest R-squared values (71%-84%), while linear regression had the lowest (22%-43%). The classification algorithms were also evaluated on binary and ternary classification; binary classification proved more successful than ternary, with random forest accuracy scores ranging from 73% to 83% across countries. The findings offer valuable insights for selecting suitable algorithms for predicting mathematics achievement and can help decision-makers improve educational outcomes.
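The abstract does not describe how K-Means was used to fill in missing data. One common scheme, shown here purely as an assumption and not as the authors' procedure, is to cluster the complete rows, assign each incomplete row to the nearest centroid using only its observed features, and copy the centroid's values into the gaps. All names and the toy data below are hypothetical.

```python
from typing import List, Optional

Row = List[Optional[float]]  # None marks a missing value


def sq_dist(row: Row, centroid: List[float]) -> float:
    """Squared Euclidean distance over the features observed in `row`."""
    return sum((v - c) ** 2 for v, c in zip(row, centroid) if v is not None)


def nearest(row: Row, centroids: List[List[float]]) -> int:
    """Index of the centroid closest to `row` on its observed features."""
    return min(range(len(centroids)), key=lambda j: sq_dist(row, centroids[j]))


def kmeans(rows: List[Row], k: int, iters: int = 20) -> List[List[float]]:
    """Plain Lloyd's algorithm on complete rows; returns k centroids."""
    # Deterministic start: k rows spread evenly through the data.
    step = len(rows) // k
    centroids = [list(rows[i * step]) for i in range(k)]
    for _ in range(iters):
        groups: List[List[Row]] = [[] for _ in range(k)]
        for r in rows:
            groups[nearest(r, centroids)].append(r)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def impute(data: List[Row], k: int) -> List[List[float]]:
    """Fill each None with the matching coordinate of the nearest centroid."""
    complete = [r for r in data if None not in r]
    centroids = kmeans(complete, k)
    filled = []
    for r in data:
        c = centroids[nearest(r, centroids)]
        filled.append([c[i] if v is None else v for i, v in enumerate(r)])
    return filled


# Toy data with two obvious groups and two incomplete rows.
data: List[Row] = [
    [1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
    [8.0, 9.0], [8.2, 9.2], [7.8, 8.8],
    [1.1, None], [None, 9.1],
]

filled = impute(data, k=2)
for row in filled:
    print(row)
```

In this sketch the two incomplete rows are completed from the low-valued and the high-valued cluster respectively. A real pipeline would also need to scale the features and choose the number of clusters, neither of which is specified in the abstract.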

Keywords

PISA, Machine Learning, Students' Achievement, Mathematics Achievement, Comparing Algorithms

There are 36 citations in total.

Details

Primary Language	English
Subjects	Management Information Systems, Data Engineering and Data Science, Data Management and Data Science (Other), Reinforcement Learning
Journal Section	Vol 6 - Issue 2 - 30 December 2024 [en]
Authors	Tuba Arpa 0000-0001-5359-3258 Mahmut Çavur 0000-0002-1256-2700
Publication Date	December 30, 2024
Submission Date	July 13, 2024
Acceptance Date	December 21, 2024
Published in Issue	Year 2024 Volume: 6 Issue: 2

Cite

APA	Arpa, T., & Çavur, M. (2024). A Comparative Analysis of Machine Learning Techniques to Explore Factors Affecting Mathematics Success in Developing Countries: Turkey, Mexico, Thailand, And Bulgaria Case Studies. Journal of Information Systems and Management Research, 6(2), 24-36. https://doi.org/10.59940/jismar.1514958
