The Influence of Using Plausible Values and Survey Weights on Multiple Regression and Hierarchical Linear Model Parameters
Year 2019, 235 - 248, 04.09.2019
Osman Tat, İlhan Koyuncu, Selahattin Gelbal
Abstract
In large-scale assessments such as the Programme for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS), plausible values are often used as estimates of student ability. These studies draw participants by stratified sampling, so the resulting data have a hierarchical structure. In the context of large-scale assessments, plausible values are random draws from the posterior ability distribution. Using a single plausible value, or the mean of the plausible values, as an independent variable in linear models has been reported to lead to estimation errors. Moreover, sampling weights are sometimes omitted when large-scale assessment data are analyzed. This study investigates how three analytic choices affect the parameters of multiple linear regression and hierarchical linear models: 1) using only one plausible value, 2) using all plausible values, and 3) incorporating or omitting sampling weights. The data come from the school and student questionnaires in the PISA 2015 Turkey database. The results show that the use of sampling weights and the number of plausible values used have significant effects on regression coefficients, standard errors, and explained variance in both models. The findings are discussed in detail, and conclusions are drawn for practice and further research.
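The contrast between using one plausible value and using all of them can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic, the names `weighted_ols` and `combine_plausible_values` are hypothetical, and the plausible value is assumed to enter as the outcome (the same loop applies when it enters as a predictor). One weighted regression is fitted per plausible value and the results are pooled with Rubin's combining rules. The within-imputation variance here is a simple model-based one; operational PISA analyses estimate sampling variance with replicate weights instead.

```python
import numpy as np

def weighted_ols(x, y, w):
    """Weighted least squares with an intercept.
    Returns coefficients and simple model-based variances (an assumption;
    not the replicate-weight variance used in operational PISA analyses)."""
    X = np.column_stack([np.ones(len(y)), x])
    XtWX = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (w * y))
    resid = y - X @ beta
    sigma2 = np.sum(w * resid**2) / (np.sum(w) - X.shape[1])
    return beta, np.diag(sigma2 * np.linalg.inv(XtWX))

def combine_plausible_values(estimates, variances):
    """Pool per-plausible-value results with Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.shape[0]
    pooled = estimates.mean(axis=0)              # average point estimate
    within = variances.mean(axis=0)              # average sampling variance
    between = estimates.var(axis=0, ddof=1)      # variance across plausible values
    total = within + (1 + 1 / m) * between
    return pooled, np.sqrt(total)

# Synthetic illustration: 5 plausible values per student (hypothetical data).
rng = np.random.default_rng(0)
n, n_pv = 500, 5
x = rng.normal(size=n)                           # a student-level predictor
theta = 0.5 * x + rng.normal(size=n)             # latent ability
pvs = theta[:, None] + rng.normal(scale=0.3, size=(n, n_pv))
w = rng.uniform(0.5, 2.0, size=n)
w = w * n / w.sum()                              # normalize weights to sum to n

fits = [weighted_ols(x, pvs[:, k], w) for k in range(n_pv)]
pooled, se = combine_plausible_values([f[0] for f in fits], [f[1] for f in fits])

# Using only the first plausible value ignores the between-imputation variance.
beta_single, var_single = fits[0]
se_single = np.sqrt(var_single)
```

Comparing `se` with `se_single` shows how much uncertainty the between-plausible-value term contributes, and rerunning with `w` set to all ones gives the unweighted counterpart.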
Abstract
In large-scale assessments such as the Programme for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS), plausible values are used as estimates of student ability. In these studies, participants are drawn by stratified sampling, which gives the resulting data a hierarchical structure made up of many strata. In the context of large-scale assessment studies, plausible values are defined as values drawn at random from the posterior ability distribution. Using a single plausible value, or the mean of all plausible values, as an independent variable in linear models is known to produce biased results. It is also frequently observed that sampling weights are ignored when data from these large-scale studies are analyzed. The aim of this study is to investigate the effect on parameter estimates in multiple linear regression and hierarchical linear models of 1) using a single plausible value, 2) using all plausible values, and 3) using or not using weights. The study draws on the Turkey data from the PISA 2015 administration. It was found that whether sampling weights are used and how plausible values are used play important roles in the estimation of coefficients, standard errors, and the proportion of explained variance. The findings are discussed in detail, and suggestions are offered for practice and future research.