
## Examining the Measurement Invariance of TIMSS 2015 Mathematics Liking Scale through Different Methods

#### Zafer ERTÜRK [1], Esra OYAR [2]

Abstract: Studies that aim to make cross-cultural comparisons should first establish measurement invariance across the groups to be compared, because comparisons made without it may yield artificial results. The purpose of this study is to investigate the measurement invariance of data obtained from the "Mathematics Liking Scale" in TIMSS 2015 using Multiple Group CFA, Multiple Group LCA, and the Mixed Rasch Model, which rest on different theoretical foundations, and to compare the results. To this end, TIMSS 2015 data were used for students in the USA and Canada, who speak the same language, and for students in the USA and Turkey, who speak different languages. The study follows a descriptive research approach. For the USA-Canada comparison, all levels of measurement invariance were established in Multiple Group CFA, whereas in Multiple Group LCA invariance was established only up to partial homogeneity, and in the Mixed Rasch Model it was not established. For the USA-Turkey comparison, metric invariance was established in Multiple Group CFA, whereas Multiple Group LCA stopped at the heterogeneity level, and the Mixed Rasch Model again failed to establish measurement invariance for this sample. These findings suggest that methods with different theoretical foundations yield different measurement invariance results. When choosing a method for a measurement invariance study, it is therefore recommended to examine the required assumptions and to consider the structure of the variables.
Keywords: Measurement invariance, TIMSS, Latent class, Mixed Rasch model, Factor analysis
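
For readers unfamiliar with the Multiple Group CFA levels named in the abstract, the usual nested hierarchy can be sketched as follows. This is the standard textbook formulation, not notation taken from the article. The symbols $x$, $\tau$, $\Lambda$, $\xi$, $\delta$, $\Theta$ and the group index $g$ are assumptions introduced here for illustration.

```latex
% Linear measurement model for group g (g = 1, 2; e.g. USA and Canada).
% x: observed item responses, tau: intercepts, Lambda: factor loadings,
% xi: latent factor, delta: residuals with covariance matrix Theta.
\begin{aligned}
x^{(g)} &= \tau^{(g)} + \Lambda^{(g)} \xi^{(g)} + \delta^{(g)},
  \qquad \operatorname{Cov}\bigl(\delta^{(g)}\bigr) = \Theta^{(g)} \\[4pt]
\text{Configural:} &\quad \text{same pattern of fixed and free loadings in }
  \Lambda^{(1)} \text{ and } \Lambda^{(2)} \\
\text{Metric:} &\quad \Lambda^{(1)} = \Lambda^{(2)} \\
\text{Scalar:} &\quad \Lambda^{(1)} = \Lambda^{(2)},\ \tau^{(1)} = \tau^{(2)} \\
\text{Strict:} &\quad \Lambda^{(1)} = \Lambda^{(2)},\ \tau^{(1)} = \tau^{(2)},\
  \Theta^{(1)} = \Theta^{(2)}
\end{aligned}
```

Each level adds constraints to the one before it. Under this reading, a result that stops at metric invariance, as reported for the USA-Turkey comparison, suggests that the equal-loadings model was acceptable while the stricter levels were not supported.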
Primary Language: English
Subjects: Education, Scientific Disciplines
Section: Articles

• Author: Zafer ERTÜRK (Corresponding Author), ORCID: 0000-0003-3651-7602, Institution: Gazi University, Country: Turkey
• Author: Esra OYAR, ORCID: 0000-0002-4337-7815, Institution: Gazi University, Country: Turkey

Publication Date: March 15, 2021
APA: Ertürk, Z., & Oyar, E. (2021). Examining the Measurement Invariance of TIMSS 2015 Mathematics Liking Scale through Different Methods. International Journal of Assessment Tools in Education, 8(1), 67-89. https://doi.org/10.21449/ijate.705426
