Machine learning methods have gained increasing attention in the field of education, driven by advancing technological tools and rapidly growing data. This attention generally focuses on identifying the best-performing method, but it is equally critical to determine whether the methods under consideration differ statistically and to identify variable importance metrics correctly. In this study, we benchmarked the performance of twenty-three machine learning algorithms on real educational data using cross-validation and criteria such as accuracy, AUC, and F1-score. In addition, the methods were compared statistically using the DeLong and McNemar tests. The findings showed that LightGBM performed best, and the most important factors determining student achievement were identified according to this method. The systematic process followed in the study is expected to yield valuable insights for data-driven studies in general as well as for the field of education.
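To make the pairwise comparison step concrete, the following is a minimal sketch of a McNemar test for two classifiers evaluated on the same test instances. It is not the study's implementation, which the abstract does not describe: the function name `mcnemar_exact`, the toy labels, and the choice of the exact binomial variant (rather than the chi-squared approximation) are all assumptions made for illustration.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions (hypothetical helper).

    Returns (b, c, p_value), where b counts instances only model A gets right
    and c counts instances only model B gets right.
    """
    triples = list(zip(y_true, pred_a, pred_b))
    b = sum(1 for t, a, p in triples if a == t and p != t)
    c = sum(1 for t, a, p in triples if a != t and p == t)
    n = b + c
    if n == 0:
        # The models never disagree on correctness, so there is no evidence of a difference.
        return b, c, 1.0
    k = min(b, c)
    # Under H0 the discordant pairs follow Binomial(n, 0.5); double the smaller tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data (assumed, not from the study): model A is always right,
# model B is wrong on the first five instances.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
pred_a = [1, 0, 1, 0, 1, 0, 1, 0]
pred_b = [0, 1, 0, 1, 0, 0, 1, 0]
b, c, p_value = mcnemar_exact(y_true, pred_a, pred_b)
```

Only the discordant pairs enter the test, which is why two models with similar accuracy can still differ significantly, or fail to, depending on which instances each one misclassifies.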
Student performance, Machine learning, Feature selection, Artificial intelligence, Statistical analysis
Primary Language | English |
---|---|
Subjects | Machine Vision, Machine Learning (Other), Data Mining and Knowledge Discovery |
Journal Section | Research Articles |
Authors | |
Publication Date | September 26, 2024 |
Submission Date | October 31, 2023 |
Acceptance Date | July 2, 2024 |
Published in Issue | Year 2024 Volume: 7 Issue: 2 |
Journal of Intelligent Systems: Theory and Applications