Item Parameter Estimation for Dichotomous Items Based on Item Response Theory: Comparison of BILOG-MG, Mplus and R (ltm)

Şeyma Uyar; Neşe Öztürk Gübeş

doi:10.21031/epod.591415

Research Article

Year 2020, Volume: 11 Issue: 1, 27 - 42, 24.03.2020


Abstract

This study has two aims. The first is to investigate the effect of sample size and test length on the estimation of item parameters and their standard errors under the two-parameter item response theory (IRT) model. The second is to compare the performance of Mplus, BILOG-MG, and R (ltm) in parameter estimation under those conditions. Simulated data were used: examinee responses were generated in the open-source program R, and the parameters were then estimated in BILOG-MG, Mplus, and R (ltm). The accuracy of the item parameter and ability estimates was evaluated under six conditions that differed in the number of items and examinees. Based on the resulting bias and root mean square error (RMSE) values, Mplus produced the least biased estimates of the three programs, whereas BILOG-MG produced parameter estimates and standard errors closest to the true values.
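The design described in the abstract (generate responses under a two-parameter model, then summarize recovery with bias and RMSE) can be illustrated with a minimal sketch. This is not the authors' code: the study generated data in R, whereas the sketch uses Python with NumPy, and the sample size, test length, seed, and true-parameter distributions below are hypothetical choices made only for illustration. The functions `simulate_2pl`, `bias`, and `rmse` are illustrative names, not part of any package.

```python
import numpy as np


def simulate_2pl(theta, a, b, rng):
    """Draw a persons-by-items matrix of 0/1 responses under the 2PL model."""
    # P(X_ij = 1) = 1 / (1 + exp(-a_j * (theta_i - b_j)))
    logits = a[None, :] * (theta[:, None] - b[None, :])
    prob = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(prob.shape) < prob).astype(int)


def bias(estimates, true_values):
    """Mean signed error per item; estimates has shape (replications, items)."""
    return (estimates - true_values).mean(axis=0)


def rmse(estimates, true_values):
    """Root mean square error per item over replications."""
    return np.sqrt(((estimates - true_values) ** 2).mean(axis=0))


# Hypothetical condition: 500 examinees, 10 items (not taken from the study).
rng = np.random.default_rng(2020)
n_persons, n_items = 500, 10
theta = rng.standard_normal(n_persons)     # abilities, assumed N(0, 1)
a_true = rng.uniform(0.8, 2.0, n_items)    # discriminations, assumed range
b_true = rng.standard_normal(n_items)      # difficulties, assumed N(0, 1)
responses = simulate_2pl(theta, a_true, b_true, rng)

# Toy check of the summary statistics with made-up estimates from
# two replications of a two-item test.
toy_estimates = np.array([[1.0, 2.0], [3.0, 2.0]])
toy_true = np.array([2.0, 2.0])
toy_bias = bias(toy_estimates, toy_true)
toy_rmse = rmse(toy_estimates, toy_true)
```

In a full simulation, each replication's `responses` matrix would be passed to an estimation program, and the per-replication estimates would be stacked row by row before calling `bias` and `rmse`. Bias near zero with a large RMSE indicates estimates that are centered on the true value but highly variable, which is why the two measures are reported together.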

Keywords

IRT, parameter estimation, Mplus, BILOG-MG, ltm

Details

Primary Language	English
Journal Section	Articles
Authors	Şeyma Uyar 0000-0002-8315-2637 Neşe Öztürk Gübeş 0000-0003-0179-1986
Publication Date	March 24, 2020
Acceptance Date	January 6, 2020
Published in Issue	Year 2020 Volume: 11 Issue: 1

Cite

APA	Uyar, Ş., & Öztürk Gübeş, N. (2020). Item Parameter Estimation for Dichotomous Items Based on Item Response Theory: Comparison of BILOG-MG, Mplus and R (ltm). Journal of Measurement and Evaluation in Education and Psychology, 11(1), 27-42. https://doi.org/10.21031/epod.591415
