Estimation and Standardization of Variance Parameters for Planning Cluster-Randomized Trials: A Short Guide for Researchers

Metin Buluş; Sakine Göçer Şahin

doi:10.21031/epod.530642

Research Article

Year 2019, Volume 10, Issue 2, 179-201, 28.06.2019

Cited By: 6

Abstract

A review of the literature from the past decade indicates a shortage of cluster-randomized trials (CRTs) in education and psychology in Turkey. CRTs are the gold standard for producing high-quality evidence for high-stakes decision making when individual randomization is not feasible. The scarcity of CRTs is not only detrimental to collective knowledge on the effectiveness of interventions but also hinders the efficient design of such studies, because prior information is at best incomplete and often unavailable. In this illustration, we demonstrate how to estimate variance parameters from existing data and transform them into standardized forms so that they can be used in planning sufficiently powered CRTs. The illustration uses publicly available software and guides researchers step by step by introducing statistical models, defining parameters, relating them to the notation in statistical models and power formulas, and estimating variance parameters. Finally, we provide example statistical power and minimum required sample size calculations.
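
As a rough illustration of the kind of calculation the abstract describes, the sketch below standardizes two variance components into an intraclass correlation (ICC) and uses it in a power and minimum-sample-size calculation for a two-level CRT without covariates. It is not the authors' code: the article's reference list points to R, lme4, and PowerUpR for the actual workflow, and Python is used here only to keep the example self-contained. The variance values (tau2 = 20, sigma2 = 80), the effect size, the function names, and the normal approximation (in place of the t distribution with J - 2 degrees of freedom commonly used in CRT power formulas) are all assumptions made for illustration.

```python
from statistics import NormalDist


def icc(tau2, sigma2):
    """Standardize variance components: share of total variance between clusters."""
    return tau2 / (tau2 + sigma2)


def se_std_effect(rho, n, J, p=0.5):
    """Standard error of the standardized treatment effect in a two-level CRT
    with J clusters, n individuals per cluster, proportion p of clusters
    treated, and no covariates."""
    return ((rho + (1 - rho) / n) / (p * (1 - p) * J)) ** 0.5


def power_crt2(es, rho, n, J, p=0.5, alpha=0.05):
    """Two-sided power under a normal approximation (assumption: exact
    formulas typically use a t distribution with J - 2 degrees of freedom)."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    lam = es / se_std_effect(rho, n, J, p)
    return 1 - z.cdf(crit - lam) + z.cdf(-crit - lam)


def min_clusters(es, rho, n, target=0.80, p=0.5, alpha=0.05, max_J=10000):
    """Smallest number of clusters J whose approximate power reaches target."""
    for J in range(4, max_J + 1):
        if power_crt2(es, rho, n, J, p, alpha) >= target:
            return J
    raise ValueError("target power not reached within max_J clusters")


if __name__ == "__main__":
    # Hypothetical variance estimates, e.g. from an unconditional two-level model.
    rho = icc(tau2=20.0, sigma2=80.0)
    print("ICC:", rho)
    print("Power with 40 clusters of 20:", power_crt2(es=0.25, rho=rho, n=20, J=40))
    print("Clusters needed for 80% power:", min_clusters(es=0.25, rho=rho, n=20))
```

Because the normal approximation ignores the degrees-of-freedom penalty, it tends to be slightly optimistic when the number of clusters is small; dedicated tools such as those cited in the reference list are the better choice for actual planning.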

Keywords

cluster-randomized trials, variance estimation, statistical power analysis, minimum required sample size

References

Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1-48. DOI: https://doi.org/10.18637/jss.v067.i01
Bland, J. M. (2004). Cluster randomized trials in the medical literature: Two bibliometric surveys. BMC Medical Research Methodology, 4, Article 21. DOI: https://doi.org/10.1186/1471-2288-4-21
Bloom, H. S. (1995). Minimum detectable effects: A simple way to report the statistical power of experimental designs. Evaluation Review, 19(5), 547-556. DOI: https://doi.org/10.1177/0193841X9501900504
Bloom, H. S. (2006). The core analytics of randomized experiments for social research. MDRC Working Papers on Research Methodology. New York, NY: MDRC. Retrieved from https://www.mdrc.org/sites/default/files/full_533.pdf
Bloom, H. S., Bos, J. M., & Lee, S. W. (1999). Using cluster random assignment to measure program impacts: Statistical implications for the evaluation of education programs. Evaluation Review, 23(4), 445-469. DOI: https://doi.org/10.1177/0193841X9902300405
Bulus, M., Dong, N., Kelcey, B., & Spybrook, J. (2019). PowerUpR: Power Analysis Tools for Multilevel Randomized Experiments. R package version 1.0.4. Retrieved from https://CRAN.R-project.org/package=PowerUpR
Cameron, A. C., & Miller, D. L. (2015). A practitioner’s guide to cluster-robust inference. Journal of Human Resources, 50(2), 317-372. DOI: https://doi.org/10.3368/jhr.50.2.317
Dong, N., & Maynard, R. (2013). PowerUp!: A Tool for Calculating Minimum Detectable Effect Sizes and Minimum Required Sample Sizes for Experimental and Quasi-experimental Design Studies. Journal of Research on Educational Effectiveness, 6(1), 24-67. DOI: https://doi.org/10.1080/19345747.2012.673143
Hayes, R. J., & Moulton, L. H. (2017). Cluster Randomized Trials (2nd ed.). New York, NY: Chapman and Hall/CRC Press. DOI: https://doi.org/10.4324/9781315370286
Hedberg, E. C. (2016). Academic and behavioral design parameters for cluster randomized trials in kindergarten: An analysis of the Early Childhood Longitudinal Study 2011 Kindergarten Cohort (ECLS-K 2011). Evaluation Review, 40(4), 279-313. DOI: https://doi.org/10.1177/0193841X16655657
Hedberg, E. C., & Hedges, L. V. (2014). Reference values of within-district intraclass correlations of academic achievement by district characteristics: Results from a meta-analysis of district-specific values. Evaluation Review, 38(6), 546-582. DOI: https://doi.org/10.1177/0193841X14554212
Hedges, L. V., & Hedberg, E. C. (2013). Intraclass correlations and covariate outcome correlations for planning two- and three-level cluster-randomized experiments in education. Evaluation Review, 37(6), 445-489. DOI: https://doi.org/10.1177/0193841X14529126
Hedges, L. V., & Rhoads, C. (2010). Statistical Power Analysis in Education Research (NCSER 2010-3006). Washington, DC: National Center for Special Education Research, Institute of Education Sciences, U.S. Department of Education. Retrieved from https://files.eric.ed.gov/fulltext/ED509387.pdf
Konstantopoulos, S. (2009a). Using power tables to compute statistical power in multilevel experimental designs. Practical Assessment, Research & Evaluation, 14(10).
Konstantopoulos, S. (2009b). Incorporating Cost in Power Analysis for Three-Level Cluster-Randomized Designs. Evaluation Review, 33(4), 335-357. DOI: https://doi.org/10.1177/0193841X09337991
Moerbeek, M., & Safarkhani, M. (2018). The design of cluster randomized trials with random cross-classifications. Journal of Educational and Behavioral Statistics, 43(2), 159-181. DOI: https://doi.org/10.3102/1076998617730303
R Core Team (2019). R: A language and environment for statistical computing [Computer software]. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.R-project.org
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods (2nd ed.). Thousand Oaks, CA: Sage Publications.
Spybrook, J., Shi, R., & Kelcey, B. (2016). Progress in the past decade: An examination of the precision of cluster randomized trials funded by the U.S. Institute of Education Sciences. International Journal of Research & Method in Education, 39(3), 255-267. DOI: https://doi.org/10.1080/1743727X.2016.1150454
Spybrook, J., Westine, C. D., & Taylor, J. A. (2016). Design parameters for impact research in science education: A multistate analysis. AERA Open, 2(1). DOI: https://doi.org/10.1177/2332858415625975
Westine, C. D. (2016). Finding Efficiency in the Design of Large Multisite Evaluations: Estimating Variances for Science Achievement Studies. American Journal of Evaluation, 37(3), 311-325. DOI: https://doi.org/10.1177/1098214015624014
Westine, C. D., Spybrook, J., & Taylor, J. A. (2013). An empirical investigation of variance design parameters for planning cluster-randomized trials of science achievement. Evaluation Review, 37(6), 490-519. DOI: https://doi.org/10.1177/0193841X14531584
Zopluoglu, C. (2012). A cross-national comparison of intra-class correlation coefficient in educational achievement outcomes. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 3(1), 242-278.

Abstract (translated from the Turkish)

A review of the literature covering the past ten years shows that cluster-randomized trials (CRTs; Turkish: seçkisiz-küme deneyleri, SKD), the gold standard capable of producing effective evidence for important decisions when individual random assignment is not possible, are too few in number in education and psychology in Turkey. The scarcity of CRTs not only harms cumulative knowledge about program effectiveness but also hinders the efficient design of such studies, because prior information is unavailable or at best incomplete. This study shows how variance parameters can be estimated from existing data, how they can be transformed into standardized forms, and how they can then be used in planning sufficiently powered CRTs. Publicly available software is used for this purpose. The study aims to guide researchers step by step by introducing the statistical models, defining the parameters, and relating the variance parameters to the notation used in the statistical models and power formulas.

Keywords (translated from the Turkish)

cluster-randomized trials (seçkisiz-küme deneyleri), variance estimation (varyans kestirimi), statistical power analysis (istatistiksel güç analizi), minimum required sample size (gerekli en küçük örneklem büyüklüğü)

There are 23 citations in total.

Details

Primary Language	English
Journal Section	Articles
Authors	Metin Buluş (ORCID: 0000-0003-4348-6322); Sakine Göçer Şahin (ORCID: 0000-0002-6914-354X)
Publication Date	June 28, 2019
Acceptance Date	June 13, 2019
Published in Issue	Year 2019, Volume 10, Issue 2

Cite

APA	Buluş, M., & Göçer Şahin, S. (2019). Estimation and Standardization of Variance Parameters for Planning Cluster-Randomized Trials: A Short Guide for Researchers. Journal of Measurement and Evaluation in Education and Psychology, 10(2), 179-201. https://doi.org/10.21031/epod.530642