The Problem of Measurement Equivalence or Invariance in Instruments

Tülin (otbiçer) Acar

doi:10.21449/ijate.690865

Research Article

Year 2021, Volume: 8 Issue: 1, 167 - 180, 15.03.2021

Abstract

This study examines whether a scale assessing attitudes towards foreign language skills is equivalent across young and adult groups and across female and male groups, and offers researchers who will use the scale data-based evidence on this question. Measurement equivalence/invariance was not found between the adult and young groups, nor was it found across gender. The absence of measurement equivalence/invariance is in fact fundamental evidence that the measurement instrument is specific to the group for which it is intended. For this reason, researchers should evaluate cross-validity or multi-group analyses on the basis of the traits measured by the instrument. A lack of measurement equivalence/invariance is not always a negative finding when gathering validity evidence.

Keywords

Measurement equivalence, Measurement invariance, Cross-validation

Details

Primary Language	English
Subjects	Studies on Education
Journal Section	Articles
Authors	Tülin (otbiçer) Acar (ORCID: 0000-0001-7976-5521)
Publication Date	March 15, 2021
Submission Date	February 18, 2020
Published in Issue	Year 2021 Volume: 8 Issue: 1

Cite

APA	(otbiçer) Acar, T. (2021). The Problem of Measurement Equivalence or Invariance in Instruments. International Journal of Assessment Tools in Education, 8(1), 167-180. https://doi.org/10.21449/ijate.690865

Cited By

A Measurement Equivalence Study of the Family Bondedness Scale: Comparison Between Black/African American and White Pet Owners

Anthrozoös

https://doi.org/10.1080/08927936.2022.2121048
