This paper presents a comparative demonstration of variable selection based on the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and deviance for identifying the best-fitting two-level model, using Early Grade Mathematics Assessment (EGMA) data collected in Zambia. A model that included all available predictor variables as fixed effects, with random intercepts, was fitted in R using the lme4 package, followed by an input model for comparison using a custom R function introduced by Nimon (2018). The analysis generated 108 models, of which 99 were valid and 9 were invalid. The final model was selected as the best fitting on the principle of parsimony. In this model, the fixed effects of students' group-mean-centered reading ability, home reading status, gender, school average reading score, and number of pupils, together with the random effect of group-mean-centered reading ability, predicted early-grade mathematics ability. Compared with the null model, the retained model showed substantial improvements in model fit indices, with a pseudo-R² of 0.309. Overall, this study provides an empirical demonstration of selecting predictors from among numerous variables for multilevel models, a crucial practical issue that is common in educational research given the increasing availability of large datasets.
| Primary Language | English |
|---|---|
| Subjects | Statistical Analysis Methods, Modelling |
| Journal Section | Research Article |
| Authors | |
| Submission Date | October 30, 2025 |
| Acceptance Date | March 30, 2026 |
| Publication Date | April 1, 2026 |
| IZ | https://izlik.org/JA39HY39MA |
| Published in Issue | Year 2026 Volume: 17 Issue: 1 |