Research Article

Investigation of the Performance of Multidimensional Equating Procedures for Common-Item Nonequivalent Groups Design

Year 2017, Volume 8, Issue 4, 421 - 434, 29.12.2017


In this study, the performance of the multidimensional extensions of the Stocking-Lord, mean/mean, and mean/sigma equating procedures under the common-item nonequivalent groups design was investigated. The three procedures were examined under combinations of conditions including sample size, ability distribution, correlation between the two dimensions, and percentage of anchor items in the test. Item parameter recovery was evaluated by calculating RMSE (root mean squared error) and BIAS values. The Stocking-Lord procedure provided smaller RMSE and BIAS values than the other two procedures for both item discrimination and item difficulty parameter estimates across most conditions.
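The two recovery criteria named in the abstract can be illustrated with a minimal sketch. It assumes the usual simulation-study definitions, in which BIAS is the mean signed difference between replicated estimates and the generating parameter value and RMSE is the square root of the mean squared difference; the abstract does not give the exact formulas used in the study. The function names and the numbers below are hypothetical and are not results from the article.

```python
import math

def bias(estimates, true_value):
    """Mean signed error of replicated estimates around the generating value."""
    return sum(e - true_value for e in estimates) / len(estimates)

def rmse(estimates, true_value):
    """Root mean squared error of replicated estimates around the generating value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))

# Hypothetical data: one item's discrimination estimates after linking,
# taken from four simulated replications (not values from the study).
a_true = 1.0
a_hat = [1.1, 0.9, 1.2, 1.0]

item_bias = bias(a_hat, a_true)
item_rmse = rmse(a_hat, a_true)
print(f"BIAS = {item_bias:.3f}, RMSE = {item_rmse:.3f}")
```

Under these definitions a procedure can show near-zero BIAS and still have a large RMSE when its errors are large but cancel in sign, which is presumably why both values are reported.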


  • Ackerman, T. A. (1992). A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement, 29(1), 67-91.
  • Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508-600). Washington, DC: American Council on Education. (Reprinted as W. H. Angoff, Scales, norms, and equivalent scores. Princeton, NJ: Educational Testing Service, 1984).
  • Bolt, D. M. (1999). Evaluating the effects of multidimensionality on IRT true score equating. Applied Measurement in Education, 12(4), 383-407.
  • Camilli, G., Wang, M., & Fesq, J. (1995). The effects of dimensionality on equating the law school admission test. Journal of Educational Measurement, 32(1), 79-96.
  • Davey, T., Oshima, T. C., & Lee, K. (1996). Linking multidimensional item calibrations. Applied Psychological Measurement, 11(3), 221-224.
  • De Champlain, A. F. (1996). The effect of multidimensionality on IRT true-score equating for subgroups of examinees. Journal of Educational Measurement, 33(2), 181-201.
  • Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.
  • Hirsch, T. M. (1989). Multidimensional equating. Journal of Educational Measurement, 26(4), 337-349.
  • Kolen, M. J. & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices (2nd ed.). New York: Springer-Verlag.
  • Levine, R. (1955). Equating the score scales of alternate forms administered to samples of different ability (Research Bulletin, 55-23). Princeton, NJ: Educational Testing Service.
  • Li, Y. H. & Lissitz, R. W. (2000). An evaluation of the accuracy of multidimensional IRT linking. Applied Psychological Measurement, 24(2), 115-138.
  • Livingston, S. A., Dorans, N. J., & Wright, N. K. (1990). What combination of sampling and equating methods works best? Applied Measurement in Education, 3(1), 73-95.
  • Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
  • Skaggs, G. & Lissitz, R. W. (1986). IRT test equating: Relevant issues and a review of recent research. Review of Educational Research, 56(4), 495-529.
  • Yao, L. (2003). BMIRT: Bayesian multivariate item response theory [Computer software and manual]. Monterey, CA: CTB/McGraw Hill.
  • Yao, L. (2003). SimuMIRT [Computer software]. Monterey, CA: DMDC DoD Center.
  • Yao, L. (2004). LinkMIRT: Linking of multivariate item response model [Computer software]. Monterey, CA: DMDC DoD Center.
  • Yao, L. & Boughton, K. (2007). A multidimensional item response modeling approach for improving subscale proficiency estimation and classification. Applied Psychological Measurement, 31, 1-23.
  • Yao, L. & Boughton, K. (2009). Multidimensional linking for tests with mixed item types. Journal of Educational Measurement, 46(2), 177-197.
  • Yao, L. & Schwarz, R. (2006). A multidimensional partial credit model with associated item and test statistics: An application to mixed format tests. Applied Psychological Measurement, 30, 469-492.

Investigation of the Performance of Multidimensional Equating Procedures for the Common-Item Nonequivalent Groups Design



In this study, the performance of the Stocking-Lord, mean/mean, and mean/sigma equating procedures adapted for multidimensional data was examined under the common-item nonequivalent groups design. The performance of these three equating procedures was investigated under combinations of sample size, ability distribution, correlation between dimensions, and percentage of common items in the test. RMSE and bias values were used to evaluate the item parameter estimates. For most conditions, the Stocking-Lord procedure was found to yield smaller RMSE and bias values than the other two procedures for both item discrimination and item difficulty parameter estimates.


There are 20 citations in total.


Journal Section Articles

Burcu Atar

Gonca Yesiltaş

Publication Date December 29, 2017
Acceptance Date November 21, 2017
Published in Issue Year 2017, Volume 8, Issue 4


APA Atar, B., & Yesiltaş, G. (2017). Çok Boyutlu Eşitleme Yöntemlerinin Eşdeğer Olmayan Gruplarda Ortak Madde Deseni için Performanslarının İncelenmesi. Journal of Measurement and Evaluation in Education and Psychology, 8(4), 421-434.