An Empirical Demonstration of Selecting Predictors for Multilevel Models

Brian Mumba; Burak Aydın

Research Article

Year 2026, Volume: 17, Issue: 1, 42–62, 01.04.2026

https://izlik.org/JA39HY39MA

Abstract

This paper presents a comparative demonstration of variable selection based on the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and deviance for identifying the best-fitting two-level model, using Early Grade Mathematics Assessment (EGMA) data collected in Zambia. A model that included all available predictors as fixed effects, with random intercepts, was fitted in R using the lme4 package; an input model for comparison was then run using a custom R function introduced by Nimon (2018). The analysis generated 108 models, of which 99 were valid and 9 were invalid. The final model was chosen as the best fitting on the principle of parsimony. It showed that early-grade mathematics ability was predicted by the fixed effects of students' group-mean-centered reading ability, home reading status, gender, school reading average score, and number of pupils, together with the random effect of group-mean-centered reading ability. Compared with the null model, the retained model showed substantial improvements in model fit indices, with a pseudo-R² value of 0.309. Overall, this study provides an empirical demonstration of selecting predictors from among numerous variables for multilevel models, a crucial practical issue that is common in educational research because of the increasing availability of large datasets.
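As a rough illustration of how the three criteria named in the abstract relate to one another, the following minimal Python sketch computes deviance, AIC, and BIC from a maximum-likelihood log-likelihood and ranks candidate models. It is not the authors' R workflow or Nimon's (2018) function. The function names, model labels, log-likelihoods, parameter counts, and sample size are all hypothetical. Using the number of level-1 observations as the BIC sample size is an assumption (Lorah & Womack, 2019, in the reference list address this choice), and the pseudo-R² shown is one common definition, the proportional reduction in a variance component, which may differ from the one used in the article.

```python
import math


def information_criteria(log_lik, n_params, n_obs):
    """Deviance, AIC, and BIC for one model fitted by maximum likelihood."""
    deviance = -2.0 * log_lik
    return {
        "deviance": deviance,
        "aic": deviance + 2.0 * n_params,
        # Assumption: n_obs is the number of level-1 units (e.g., students).
        "bic": deviance + n_params * math.log(n_obs),
    }


def rank_models(candidates, n_obs, criterion="bic"):
    """Sort candidate models by a criterion (smaller is better).

    `candidates` maps a model label to (log_likelihood, n_params).
    Ties are broken in favor of the model with fewer parameters.
    """
    rows = []
    for name, (log_lik, n_params) in candidates.items():
        row = information_criteria(log_lik, n_params, n_obs)
        row["model"] = name
        row["n_params"] = n_params
        rows.append(row)
    return sorted(rows, key=lambda r: (r[criterion], r["n_params"]))


def pseudo_r2(null_variance, model_variance):
    """Proportional reduction in a variance component relative to the null model."""
    return (null_variance - model_variance) / null_variance


# Hypothetical values for illustration only; not results from the article.
candidates = {
    "null": (-5200.0, 3),
    "student_level": (-5050.0, 6),
    "full": (-5048.5, 12),
}

for criterion in ("deviance", "aic", "bic"):
    best = rank_models(candidates, n_obs=2000, criterion=criterion)[0]
    print(criterion, "prefers", best["model"])
```

With these made-up numbers, deviance alone prefers the largest model, while AIC and BIC both prefer the smaller `student_level` model, because the extra six parameters of `full` buy only a small improvement in fit. That trade-off is the sense in which information criteria support a parsimonious choice.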

Keywords

predictor selection strategies, multilevel modelling, AIC, BIC, EGMA

References

Aho, K., Derryberry, D., & Peterson, T. (2014). Model selection for ecologists: The worldviews of AIC and BIC. Ecology, 95(3), 631–636. https://doi.org/10.1890/13-1452.1
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. https://doi.org/10.1109/TAC.1974.1100705
Aydın, B. (2016). Çok düzeyli modeller: Sürekli değişken ile iki düzeyli model örneği ve R programı ile analizi [Multilevel models: Example with continuous variable and two-level model analyzed in R program]. Ege Eğitim Dergisi, 17(2), 567–596. https://doi.org/10.12984/egeefd.280758
Baskin, A. W. (2023). Campus-level teacher turnover in Texas public elementary schools: An examination of the impact of leadership factors and school demographics using hierarchical linear modeling [Doctoral dissertation, University of Texas]. University of Texas Repository.
Bates, D. (2005). Fitting linear mixed models in R. R News, 5(1), 27–30.
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
Blatchford, P., Bassett, P., & Brown, P. (2011). Examining the effect of class size on classroom engagement and teacher–pupil interaction: Differences in relation to pupil prior attainment and primary vs. secondary schools. Learning and Instruction, 21(6), 715–730. https://doi.org/10.1016/j.learninstruc.2011.04.001
Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2), 261–304. https://doi.org/10.1177/0049124104268644
Buscemi, S., & Plaia, A. (2020). Model selection in linear mixed-effect models. AStA Advances in Statistical Analysis, 104(4), 529–575. https://doi.org/10.1007/s10182-019-00359-z
Diego, V. P., Manusov, E. G., Mao, X., Curran, J. E., Göring, H., Almeida, M., Mahaney, M. C., Peralta, J. M., Blangero, J., & Williams-Blangero, S. (2023). Genotype-by-socioeconomic status interaction influences heart disease risk scores and carotid artery thickness in Mexican Americans: The predominant role of education in comparison to household income and socioeconomic index. Frontiers in Genetics, 14, 1132110. https://doi.org/10.3389/fgene.2023.1132110
Fokkema, M., Smits, N., Zeileis, A., Hothorn, T., & Kelderman, H. (2017). Detecting treatment-subgroup interactions in clustered data with generalized linear mixed-effects model trees. Behavior Research Methods, 50, 2016–2034. https://doi.org/10.3758/s13428-017-0971-x
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.
Gelman, A., Vehtari, A., Simpson, D., Margossian, C. C., Carpenter, B., Yao, Y., ... & Modrák, M. (2020). Bayesian workflow. arXiv preprint arXiv:2011.01808. https://arxiv.org/abs/2011.01808
Goldstein, H. (2011). Multilevel statistical models (4th ed.). John Wiley & Sons.
Gove, A., Habib, S., Piper, B., & Ralaingita, W. (2013). Classroom-up policy change: Early reading and math assessments at work. Research in Comparative and International Education, 8(3), 373–386. https://doi.org/10.2304/rcie.2013.8.3.373
Hamaker, E. L. (2023). The within-between dispute in cross-lagged panel research and how to move forward. Psychological Methods, 28(1), 1–15. https://doi.org/10.1037/met0000473
Hamaker, E. L., Kuiper, R. M., & Grasman, R. P. (2015). A critique of the cross-lagged panel model. Psychological Methods, 20(1), 102–116. https://doi.org/10.1037/a0038889
Harrison, X. A., Donaldson, L., Correa-Cano, M. E., Evans, J., Fisher, D. N., Goodwin, C. E., Robinson, B. S., Hodgson, D. J., & Inger, R. (2018). A brief introduction to mixed effects modelling and multimodel inference in ecology. PeerJ, 6, e4794. https://doi.org/10.7717/peerj.4794
Johnson, J. B., & Omland, K. S. (2004). Model selection in ecology and evolution. Trends in Ecology & Evolution, 19(2), 101–108. https://doi.org/10.1016/j.tree.2003.10.013
Kuha, J. (2004). AIC and BIC: Comparisons of assumptions and performance. Sociological Methods & Research, 33(2), 188–229. https://doi.org/10.1177/0049124103262065
Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13
Lorah, J., & Womack, A. (2019). Value of sample size for computation of the Bayesian information criterion (BIC) in multilevel modeling. Behavior Research Methods, 51(1), 440–450. https://doi.org/10.3758/s13428-018-1188-3
Luke, D. A. (2020). Multilevel modeling (2nd ed.). Sage Publications.
McNeish, D. (2017). Small sample methods for multilevel modeling: A colloquial elucidation of REML and the Kenward-Roger correction. Multivariate Behavioral Research, 52(5), 661–670. https://doi.org/10.1080/00273171.2017.1344538
McNeish, D., & Stapleton, L. M. (2016). The effect of small sample size on two-level model estimates: A review and illustration. Educational Psychology Review, 28(2), 295–314. https://doi.org/10.1007/s10648-014-9287-x
Nimon, K. (2018). apsl2lme: A model-selection diagnostic tool for hierarchical linear models. General Linear Model Journal, 44(1), 1–9. https://doi.org/10.31523/glmj.044002.002
Ortega, A., & Navarrete, G. (2017). Bayesian hypothesis testing: An alternative to null hypothesis significance testing (NHST) in psychology and social sciences. Frontiers in Psychology, 8, 608. https://doi.org/10.3389/fpsyg.2017.00608
Pinheiro, J., Bates, D., DebRoy, S., Sarkar, D., & R Core Team. (2023). nlme: Linear and nonlinear mixed effects models (R package version 3.1-163) [Computer software]. https://CRAN.R-project.org/package=nlme
Platas, L. M., Ketterlin-Geller, L. R., & Sitabkhan, Y. (2016). Using an assessment of early mathematical knowledge and skills to inform policy and practice: Examples from the Early Grade Mathematics Assessment. International Journal of Education in Mathematics, Science and Technology, 4(3), 163–173. https://doi.org/10.18404/ijemst.15920
Preacher, K. J., & Yaremych, H. E. (2023). Model selection in structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (2nd ed., pp. 206–222). Guilford Press.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Sage.
R Core Team. (2023). R: A language and environment for statistical computing [Computer software]. R Foundation for Statistical Computing. https://www.R-project.org/
RTI International. (2014). Early grade mathematics assessment (EGMA) toolkit (Rev. ed.). Research Triangle Institute. https://pdf.usaid.gov/pdf_docs/PA00JP65.pdf
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464. https://doi.org/10.1214/aos/1176344136
Silhavy, R., Silhavy, P., & Prokopova, Z. (2017). Analysis and selection of a regression model for the use case points method using a stepwise approach. Journal of Systems and Software, 125, 1–14. https://doi.org/10.1016/j.jss.2016.11.037
Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2nd ed.). Sage.
Stylianou, C., Pickles, A., & Roberts, S. A. (2013). Using Bonferroni, BIC, and AIC to assess evidence for alternative biological pathways: Covariate selection for the multilevel Embryo-Uterus model. BMC Medical Research Methodology, 13(1), 73. https://doi.org/10.1186/1471-2288-13-73
Sweller, J., Zhang, L., Ashman, G., Cobern, W., & Kirschner, P. A. (2024). Response to De Jong et al.'s (2023) paper "Let us talk evidence—The case for combining inquiry-based and direct instruction." Educational Research Review, 42, 100584. https://doi.org/10.1016/j.edurev.2024.100584
Vallejo, G., Tuero-Herrero, E., Núñez, J. C., & Rosário, P. (2014). Performance evaluation of recent information criteria for selecting multilevel models in behavioral and social sciences. International Journal of Clinical and Health Psychology, 14(1), 48–57. https://doi.org/10.1016/s1697-2600(14)70036-5
Vrieze, S. I. (2012). Model selection and psychological theory: A discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Psychological Methods, 17(2), 228–243. https://doi.org/10.1037/a0027127
West, B. T., Welch, K. B., & Galecki, A. T. (2022). Linear mixed models: A practical guide using statistical software (2nd ed.). Chapman and Hall/CRC.

There are 41 citations in total.

Details

Primary Language	English
Subjects	Statistical Analysis Methods, Modelling
Journal Section	Research Article
Authors	Brian Mumba (ORCID: 0000-0001-9796-6749); Burak Aydın (ORCID: 0000-0003-4462-1784)
Submission Date	October 30, 2025
Acceptance Date	March 30, 2026
Publication Date	April 1, 2026
IZ	https://izlik.org/JA39HY39MA
Published in Issue	Year 2026 Volume: 17 Issue: 1

Cite

APA	Mumba, B., & Aydın, B. (2026). An Empirical Demonstration of Selecting Predictors for Multilevel Models. Journal of Measurement and Evaluation in Education and Psychology, 17(1), 42-62. https://izlik.org/JA39HY39MA
