Linking Scales in Item Response Theory with Covariates
Abstract
When test forms are administered to different non-equivalent groups of examinees and are scored by item response theory (IRT), it is necessary to put item parameters estimated separately on two groups on the same scale. In the IRT models which include covariates about the examinees, we have two parameters which model uniform and non-uniform differential item functioning (DIF) and that have to be put on the same scale. The aim of this study is to propose conversion equations, which are used to put the uniform and non-uniform DIF parameters on the same scale. To estimate the coefficients of the conversion equations we will use four methods: mean/mean, mean/sigma, Haebara and Stocking-Lord. We give a simulation study and an empirical example. The results of the simulation study show that the coefficients of the conversion equations are substantially equal for the Haebara and Stocking-Lord methods, while they are different for the other methods. The results of the empirical example is that IRT with covariates produces a more informative test than using IRT without covariates for high abilities’ values and, when the mean-mean and the mean-sigma methods are used, we obtain more informative tests than when using concurrent calibration.
Keywords
References
- Andersson, B. (2018). Asymptotic variance of linking coefficient estimators for polytomous IRT models. Applied Psychological Measurement, 42(3), 192-205.
- Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord and M. R. Novick (Eds.), Statistical theories of mental test scores (chaps. 17-20). Reading, MA: Addison-Wesley.
- Borchers, H. W. (2017). Pracma: Practical numerical math functions. R package.
- Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. New Jersey: Lawrence Erlbaum.
- Finch, H. (2005). The MIMIC model as a method for detecting DIF: Comparison with Mantel-Haenszel, SIBTEST, and the IRT likelihood ratio. Applied Psychological Measurement, 29, 278-295.
- González, J., & Wiberg, M. (2017). Applying test equating methods – using R. Cham, Switzerland: Springer.
- Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22, 144-149.
- Hanson, B. A., & Béguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26, 3-24.
Details
Primary Language
English
Subjects
Studies on Education
Journal Section
Research Article
Publication Date
September 5, 2019
Submission Date
November 9, 2018
Acceptance Date
December 27, 2018
Published in Issue
Year 2018 Volume: 3 Number: 2