Research Article
Year 2024, Volume: 7 Issue: 2, 43 - 55, 26.09.2024
https://doi.org/10.38016/jista.1383998

Assessment of effective factors on student performance based on machine learning methods

Abstract

Machine learning methods have gained increasing attention in the field of education owing to advances in technological tools and rapidly growing data. This attention generally centers on identifying the best-performing method, but it is equally important to determine whether the methods under consideration differ to a statistically significant degree and to identify variable importance correctly. In this study, we benchmarked twenty-three machine learning algorithms on real educational data via cross-validation, using accuracy, AUC and F1-score as criteria. In addition, the methods were compared statistically using the DeLong and McNemar tests. The findings indicated that LightGBM performed best, and the most important factors determining student achievement are reported according to this method. The systematic process followed in the study is expected to offer valuable insights for data-driven studies in general as well as for the field of education.
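As an illustration of the kind of paired comparison the abstract refers to, the following is a minimal sketch of McNemar's test applied to two classifiers evaluated on the same test items. It is not the study's code: the function name `mcnemar_test`, the toy labels, and the choice of the continuity-corrected chi-square form (with the numerator clamped at zero) are assumptions made for illustration only.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar test for two classifiers on the same items.

    Returns (statistic, p_value). Hypothetical helper, not from the article.
    """
    # b: items classifier A gets right and classifier B gets wrong.
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    # c: items classifier A gets wrong and classifier B gets right.
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)

    # With no discordant pairs there is no evidence of a difference.
    if b + c == 0:
        return 0.0, 1.0

    # Chi-square statistic with one degree of freedom (continuity-corrected).
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    # Toy data (assumed): A is right on 8 items where B is wrong, and vice versa on 2.
    y_true = [1] * 10
    pred_a = [1] * 8 + [0] * 2
    pred_b = [0] * 8 + [1] * 2
    stat, p = mcnemar_test(y_true, pred_a, pred_b)
    print(f"statistic={stat:.3f}, p-value={p:.4f}")
```

Only the discordant pairs enter the statistic, which is why the test suits classifiers scored on identical test items. For small discordant counts an exact binomial version is usually preferred over this chi-square approximation.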

References

  • Adejo, O. W., & Connolly, T. (2018). Predicting student academic performance using multi-model heterogeneous ensemble approach. Journal of Applied Research in Higher Education, 10(1), 61–75. https://doi.org/10.1108/JARHE-09-2017-0113
  • Alalawi, K., Athauda, R., & Chiong, R. (2023). Contextualizing the current state of research on the use of machine learning for student performance prediction: A systematic literature review. Engineering Reports, 5(12), e12699. https://doi.org/10.1002/eng2.1269
  • Albreiki, B., Zaki, N., & Alashwal, H. (2021). A Systematic Literature Review of Student’ Performance Prediction Using Machine Learning Techniques. Education Sciences, 11(9), Article 9. https://doi.org/10.3390/educsci11090552
  • Asselman, A., Khaldi, M., & Aammou, S. (2023). Enhancing the prediction of student performance based on the machine learning XGBoost algorithm. Interactive Learning Environments, 31(6), 3360–3379. https://doi.org/10.1080/10494820.2021.1928235
  • Breiman, L. (1996). Bagging predictors. Machine Learning, 24, 123-140.
  • Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
  • Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees (CART). Belmont, CA, USA: Wadsworth International Group.
  • Chen, Y., & Zhai, L. (2023). A comparative study on student performance prediction using machine learning. Education and Information Technologies, 28(9), 12039–12057. https://doi.org/10.1007/s10639-023-11672-1
  • Chipman, H. A., George, E. I., & McCulloch, R. E. (2010). BART: Bayesian additive regression trees.
  • Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21-27.
  • Cox, D. R. (1958). The regression analysis of binary sequences. Journal of the Royal Statistical Society: Series B (Methodological), 20(2), 215–232.
  • DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 837-845.
  • Deo, R. C., Yaseen, Z. M., Al-Ansari, N., Nguyen-Huy, T., Langlands, T. A. M., & Galligan, L. (2020). Modern Artificial Intelligence Model Development for Undergraduate Student Performance Prediction: An Investigation on Engineering Mathematics Courses. IEEE Access, 8, 136697–136724. https://doi.org/10.1109/ACCESS.2020.3010938
  • Domingos, P., & Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29, 103-130.
  • Elbadrawy, A., Polyzou, A., Ren, Z., Sweeney, M., Karypis, G., & Rangwala, H. (2016). Predicting Student Performance Using Personalized Analytics. Computer, 49(4), 61–69. https://doi.org/10.1109/MC.2016.119
  • Silva Filho, R. L. C., Brito, K., & Adeodato, P. J. L. (2023). A data mining framework for reporting trends in the predictive contribution of factors related to educational achievement. Expert Systems with Applications, 221, 119729.
  • Freund, Y., & Schapire, R. E. (1996, July). Experiments with a new boosting algorithm. In ICML (Vol. 96, pp. 148-156).
  • Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19(1), 1-67.
  • Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles.
  • Gamulin, J., Gamulin, O., & Kermek, D. (2016). Using Fourier coefficients in time series analysis for student performance prediction in blended learning environments. Expert Systems, 33(2), 189–200. https://doi.org/10.1111/exsy.12142
  • Guan, C., Mou, J., & Jiang, Z. (2020). Artificial intelligence innovation in education: A twenty-year data-driven historical analysis. International Journal of Innovation Studies, 4(4), 134–147. https://doi.org/10.1016/j.ijis.2020.09.001
  • Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359-366.
  • Hussain, M., Zhu, W., Zhang, W., Abidi, S. M. R., & Ali, S. (2019). Using machine learning to predict student difficulties from learning session data. Artificial Intelligence Review, 52(1), 381–407. https://doi.org/10.1007/s10462-018-9620-8
  • Karaboğa, H. A., & Demir, I. (2023). Examining the factors affecting students' science success with Bayesian networks. International Journal of Assessment Tools in Education, 10(3), 413-433.
  • Liu, J., Loh, L., Ng, E., Chen, Y., Wood, K. L., & Lim, K. H. (2020). Self-Evolving Adaptive Learning for Personalized Education. Conference Companion Publication of the 2020 on Computer Supported Cooperative Work and Social Computing, 317–321. https://doi.org/10.1145/3406865.3418326
  • McNemar, Q. (1947). Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika, 12(2), 153-157.
  • Pallathadka, H., Wenda, A., Ramirez-Asís, E., Asís-López, M., Flores-Albornoz, J., & Phasinam, K. (2023). Classification and prediction of student performance data using various machine learning algorithms. Materials Today: Proceedings, 80, 3782–3785. https://doi.org/10.1016/j.matpr.2021.07.382
  • Quinlan, J. R. (1992). Learning with continuous classes. In 5th Australian joint conference on artificial intelligence (Vol. 92, pp. 343-348).
  • Quinlan, J. R. (1993). Combining instance-based and model-based learning. In Proceedings of the tenth international conference on machine learning (pp. 236-243).
  • Sağlam, A. Ç., & Aydoğmuş, M. (2016). Gelişmiş ve Gelişmekte Olan Ülkelerin Eğitim Sistemlerinin Denetim Yapıları Karşılaştırıldığında Türkiye Eğitim Sisteminin Denetimi Ne Durumdadır? Uşak Üniversitesi Sosyal Bilimler Dergisi, 9(1), 17–38. https://dergipark.org.tr/en/pub/usaksosbil/issue/21662/232993
  • Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5, 197-227.
  • Schölkopf, B., & Smola, A. J. (2002). Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press.
  • Sekeroglu, B., Abiyev, R., Ilhan, A., Arslan, M., & Idoko, J. B. (2021). Systematic Literature Review on Machine Learning and Student Performance Prediction: Critical Gaps and Possible Remedies. Applied Sciences, 11(22), Article 22. https://doi.org/10.3390/app112210907
  • Students Performance. (2023). Retrieved 25 September 2023, from https://www.kaggle.com/datasets/joebeachcapital/students-performance
  • Suleiman, R., & Anane, R. (2022). Institutional Data Analysis and Machine Learning Prediction of Student Performance. 2022 IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD), 1480–1485. https://doi.org/10.1109/CSCWD54268.2022.9776102
  • Tilahun, L. A., & Sekeroglu, B. (2020). An intelligent and personalized course advising model for higher educational institutes. SN Applied Sciences, 2(10), 1635. https://doi.org/10.1007/s42452-020-03440-4
  • Tran, T.-O., Dang, H.-T., Dinh, V.-T., Truong, T.-M.-N., Vuong, T.-P.-T., & Phan, X.-H. (2017). Performance Prediction for Students: A Multi-Strategy Approach. Cybernetics and Information Technologies, 17(2), 164–182. https://doi.org/10.1515/cait-2017-0024
  • Vapnik, V., Golowich, S., & Smola, A. (1996). Support vector method for function approximation, regression estimation and signal processing. Advances in Neural Information Processing Systems, 9.
  • Wold, H. (1982). Soft modelling: the basic design and some extensions. Systems under indirect observation, Part II, 36-37.
  • Wold, S., Ruhe, A., Wold, H., & Dunn, W. J., III. (1984). The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses. SIAM Journal on Scientific and Statistical Computing, 5(3), 735-743.
  • Wu, Z., He, T., Mao, C., & Huang, C. (2020). Exam paper generation based on performance prediction of student group. Information Sciences, 532, 72–90. https://doi.org/10.1016/j.ins.2020.04.043
  • Yousafzai, B. K., Hayat, M., & Afzal, S. (2020). Application of machine learning and data mining in predicting the performance of intermediate and secondary education level student. Education and Information Technologies, 25(6), 4677–4697. https://doi.org/10.1007/s10639-020-10189-1
There are 42 citations in total.

Details

Primary Language English
Subjects Machine Vision, Machine Learning (Other), Data Mining and Knowledge Discovery
Journal Section Research Articles
Authors

Hasan Yıldırım 0000-0003-4582-9018

Publication Date September 26, 2024
Submission Date October 31, 2023
Acceptance Date July 2, 2024
Published in Issue Year 2024 Volume: 7 Issue: 2

Cite

APA Yıldırım, H. (2024). Assessment of effective factors on student performance based on machine learning methods. Journal of Intelligent Systems: Theory and Applications, 7(2), 43-55. https://doi.org/10.38016/jista.1383998
AMA Yıldırım H. Assessment of effective factors on student performance based on machine learning methods. JISTA. September 2024;7(2):43-55. doi:10.38016/jista.1383998
Chicago Yıldırım, Hasan. “Assessment of Effective Factors on Student Performance Based on Machine Learning Methods”. Journal of Intelligent Systems: Theory and Applications 7, no. 2 (September 2024): 43-55. https://doi.org/10.38016/jista.1383998.
EndNote Yıldırım H (September 1, 2024) Assessment of effective factors on student performance based on machine learning methods. Journal of Intelligent Systems: Theory and Applications 7 2 43–55.
IEEE H. Yıldırım, “Assessment of effective factors on student performance based on machine learning methods”, JISTA, vol. 7, no. 2, pp. 43–55, 2024, doi: 10.38016/jista.1383998.
ISNAD Yıldırım, Hasan. “Assessment of Effective Factors on Student Performance Based on Machine Learning Methods”. Journal of Intelligent Systems: Theory and Applications 7/2 (September 2024), 43-55. https://doi.org/10.38016/jista.1383998.
JAMA Yıldırım H. Assessment of effective factors on student performance based on machine learning methods. JISTA. 2024;7:43–55.
MLA Yıldırım, Hasan. “Assessment of Effective Factors on Student Performance Based on Machine Learning Methods”. Journal of Intelligent Systems: Theory and Applications, vol. 7, no. 2, 2024, pp. 43-55, doi:10.38016/jista.1383998.
Vancouver Yıldırım H. Assessment of effective factors on student performance based on machine learning methods. JISTA. 2024;7(2):43-55.

Journal of Intelligent Systems: Theory and Applications