Research Article

Comparison of Statistical and Machine Learning Approaches for Predicting Mathematical Literacy: Evidence from PISA 2022 Türkiye

Volume: 16 Number: 4 December 31, 2025

Abstract

This study compares statistical and machine learning methods for predicting mathematical literacy among students in Türkiye who participated in PISA 2022. Using data on 6,427 students and 13 standardized predictors capturing cognitive, affective, and contextual dimensions, we evaluated multiple linear regression, least absolute shrinkage and selection operator, random forests, extreme gradient boosting, artificial neural networks, and a stacking ensemble within a 10-fold cross-validation design. Ensemble approaches outperformed linear methods: the stacking model achieved the highest accuracy (out-of-fold R² = .319; RMSE = .777), followed closely by extreme gradient boosting (R² = .313) and random forests (R² = .304). Linear models yielded weaker results (multiple linear regression R² ≈ .270; least absolute shrinkage and selection operator R² = .273). Mean absolute error values were similar across models (≈ .633–.658) and varied little between folds when rounded to three decimals. Residual analyses indicated that ensemble models produced more stable error structures, whereas linear methods showed stronger heteroskedasticity. Across all approaches, socioeconomic status consistently emerged as the strongest predictor, followed by mathematics self-efficacy and disciplinary climate, underscoring the dual roles of student beliefs and classroom environment. These findings highlight the advantages of ensemble methods for predictive performance and variable-importance estimation, and they point to the persistent influence of socioeconomic inequalities on educational outcomes.
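The out-of-fold metrics reported in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the model is a one-predictor least-squares fit rather than any of the six models compared, and the helper names (`fit_simple_ols`, `out_of_fold_predictions`, `regression_metrics`) are hypothetical. It also assumes that R², RMSE, and MAE are computed once on the pooled out-of-fold predictions, which the abstract does not specify.

```python
import math
import random


def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of a one-predictor least-squares fit."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def out_of_fold_predictions(xs, ys, k=10, seed=0):
    """Predict each case from a model fitted on the other k-1 folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    preds = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        intercept, slope = fit_simple_ols(
            [xs[i] for i in train], [ys[i] for i in train]
        )
        for i in held_out:
            preds[i] = intercept + slope * xs[i]
    return preds


def regression_metrics(ys, preds):
    """Pooled R^2, RMSE, and MAE over all out-of-fold predictions."""
    n = len(ys)
    mean_y = sum(ys) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(y - p) for y, p in zip(ys, preds)) / n,
    }


# Synthetic stand-in for one standardized predictor and outcome
# (assumption: not PISA data; slope and noise level chosen arbitrarily).
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [0.5 * xi + rng.gauss(0.0, 0.85) for xi in x]

oof = out_of_fold_predictions(x, y, k=10)
metrics = regression_metrics(y, oof)
print({name: round(value, 3) for name, value in metrics.items()})
```

Because every prediction comes from a model that never saw the corresponding case, the pooled metrics estimate performance on new students rather than fit to the training data. MAE can never exceed RMSE on the same predictions, which is consistent with the ranges reported above.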

Supporting Institution

None

Project Number

Not applicable

Ethical Statement

This study uses publicly available PISA 2022 data published by OECD; therefore, no additional ethical approval was required.

Acknowledgments

The authors would like to thank the OECD for providing access to the PISA 2022 dataset.

Details

Primary Language

English

Subjects

Statistical Analysis Methods, Modelling

Journal Section

Research Article

Publication Date

December 31, 2025

Submission Date

September 12, 2025

Acceptance Date

October 22, 2025

Published in Issue

Year 2025 Volume: 16 Number: 4

APA
Yılmaz, T., & Atalay Kabasakal, K. (2025). Comparison of Statistical and Machine Learning Approaches for Predicting Mathematical Literacy: Evidence from PISA 2022 Türkiye. Journal of Measurement and Evaluation in Education and Psychology, 16(4), 241–263. https://doi.org/10.21031/epod.1782727