This study aims to compare the classification performance of two machine learning methods, Gradient Boosting (GB) and Extreme Gradient Boosting (XGBoost). The Trends in International Mathematics and Science Study 2019 (TIMSS 2019) science dataset was used in the study. The dataset consists of data collected from a total of 2,565 students: 1,309 girls (51%) and 1,256 boys (49%). A Python-based program was used for data analysis. Area Under the Curve (AUC), accuracy, precision, recall, F1 score, Matthews correlation coefficient (MCC), and training time were used as performance indicators. The study revealed that hyperparameter tuning had a positive impact on the performance of both methods. The analysis results show that the GB method outperformed the XGBoost method on all performance measures except training time. According to the GB method, 'student confidence in science' was the most influential factor in science achievement, while the XGBoost method highlighted 'home educational resources' as the most significant predictor.
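The evaluation pipeline the abstract describes can be sketched as follows. This is a minimal illustration on synthetic data (not the TIMSS 2019 set), using scikit-learn's `GradientBoostingClassifier` for the GB side; `xgboost.XGBClassifier` would be a drop-in replacement for the XGBoost side of the comparison. All dataset parameters here are placeholders, not values from the study.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (accuracy_score, f1_score, matthews_corrcoef,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the TIMSS 2019 science data (2,565 students).
X, y = make_classification(n_samples=2565, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = GradientBoostingClassifier(random_state=0)

# Training time is one of the performance indicators in the study.
start = time.perf_counter()
model.fit(X_tr, y_tr)
train_time = time.perf_counter() - start

pred = model.predict(X_te)
proba = model.predict_proba(X_te)[:, 1]  # class-1 probability for AUC

# The six classification metrics named in the abstract.
scores = {
    "AUC": roc_auc_score(y_te, proba),
    "accuracy": accuracy_score(y_te, pred),
    "precision": precision_score(y_te, pred),
    "recall": recall_score(y_te, pred),
    "F1": f1_score(y_te, pred),
    "MCC": matthews_corrcoef(y_te, pred),
}

# Feature influence, as used to rank predictors of science achievement.
importances = model.feature_importances_
```

Hyperparameter tuning, which the study reports as beneficial for both methods, could be layered on with `sklearn.model_selection.GridSearchCV` over, e.g., `n_estimators`, `learning_rate`, and `max_depth`.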
The current study does not require ethics committee approval, as it was prepared using an open-access dataset.
No support was received from any individuals, institutions, or organizations in the conduct of this study.
| Primary Language | English |
| --- | --- |
| Subjects | Statistical Data Science, Applied Statistics |
| Journal Section | Research Article |
| Authors | |
| Early Pub Date | June 27, 2025 |
| Publication Date | June 30, 2025 |
| Submission Date | February 10, 2025 |
| Acceptance Date | June 25, 2025 |
| Published in Issue | Year 2025, Volume: 14, Issue: 2 |