The Effect of Aberrant Responses on Ability Estimation in Computer Adaptive Tests

Sebahat Gören; Hakan Kara; Başak Erdem Kara; Hülya Kelecioğlu

doi:10.21031/epod.1067307

Research Article

Year 2022, Volume: 13 Issue: 3, 256 - 268, 30.09.2022

Cited By: 2

Abstract

In computer adaptive tests (CAT), aberrant responses caused by factors such as lucky guesses and careless errors may cause significant bias in ability estimation. Correct responses resulting from lucky guesses and incorrect responses resulting from carelessness or anxiety do not reflect the examinee’s actual knowledge, so they may lead to an erroneous estimate of the examinee’s actual ability. In this study, the performances of the 3PL and 4PL IRT models regarding ability estimation were compared in CAT simulations containing aberrant responses. Twelve different CAT simulation conditions were run, with 10 replications for each condition. Correlation, RMSE, bias, and mean absolute error (MAE) values were calculated and interpreted for each condition. Results generally indicated that the 4PL IRT model provided more efficient and robust ability estimation than the 3PL IRT model, and that the 4PL model increased the precision and effectiveness of the CAT applications.
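As a rough illustration of the quantities named in the abstract, the sketch below writes out the standard 4PL item response function (the 3PL is the special case with upper asymptote d = 1) and the four recovery statistics. It is not the authors' simulation code: the function names `p_correct` and `recovery_stats` are hypothetical, the logistic form omits any scaling constant, and the numbers in the usage lines are made up.

```python
import math


def p_correct(theta, a, b, c=0.0, d=1.0):
    """Probability of a correct response under the 4PL model.

    a: discrimination, b: difficulty, c: lower asymptote (guessing),
    d: upper asymptote (slipping). With d = 1 this reduces to the 3PL.
    Assumption: plain logistic metric, no 1.7 scaling constant.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


def recovery_stats(true_theta, est_theta):
    """Correlation, RMSE, bias, and MAE between true and estimated abilities."""
    n = len(true_theta)
    errors = [e - t for t, e in zip(true_theta, est_theta)]
    bias = sum(errors) / n
    mae = sum(abs(x) for x in errors) / n
    rmse = math.sqrt(sum(x * x for x in errors) / n)

    # Pearson correlation computed by hand to avoid version-specific helpers.
    mean_t = sum(true_theta) / n
    mean_e = sum(est_theta) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_theta, est_theta))
    var_t = sum((t - mean_t) ** 2 for t in true_theta)
    var_e = sum((e - mean_e) ** 2 for e in est_theta)
    corr = cov / math.sqrt(var_t * var_e)

    return {"correlation": corr, "rmse": rmse, "bias": bias, "mae": mae}


# Illustrative values only (not taken from the study).
p_3pl = p_correct(theta=3.0, a=1.2, b=0.0, c=0.2)          # upper asymptote 1
p_4pl = p_correct(theta=3.0, a=1.2, b=0.0, c=0.2, d=0.98)  # upper asymptote 0.98
stats = recovery_stats([0.0, 1.0, 2.0], [0.5, 1.0, 1.5])
```

Because the 4PL curve tops out at d rather than 1, a careless miss by a high-ability examinee is less improbable under the 4PL than under the 3PL, which is the usual argument for why the ability estimate is pulled down less by such a response.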

Keywords

Computer adaptive tests (CAT), 3PL IRT model, 4PL IRT model, aberrant responses, early mistake

References

Ackerman, T. A. (1989). Unidimensional IRT calibration of compensatory and noncompensatory multidimensional items. Applied Psychological Measurement, 13(2), 113-127. https://doi.org/10.1177/014662168901300201
Babcock, B., & Weiss, D. J. (2012). Termination criteria in computerized adaptive tests: Do variable-length cats provide efficient and effective measurement? Journal of Computerized Adaptive Testing, 1(1), 1-18. https://doi.org/10.7333/1212-0101001
Barton, M. A., & Lord, F. M. (1981). An upper asymptote for the three-parameter logistic item response model. (RR 81-20). Educational Testing Service. https://doi.org/10.1002/j.2333-8504.1981.tb01255.x
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Lawrence Erlbaum Associates.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Kluwer Academic Publishers.
Jia, B., Zhang, X., & Zhu, Z. (2019). A short note on aberrant responses bias in item response theory. Frontiers in Psychology, 10, 43. https://doi.org/10.3389/fpsyg.2019.00043
Liao, W., Ho, R., Yen, Y., & Cheng, H. (2012). The four-parameter logistic item response theory model as a robust method of estimating ability despite aberrant responses. Social Behavior and Personality, 40(10), 1679–1694. https://doi.org/10.2224/sbp.2012.40.10.1679
Loken, E., & Rulison, K. L. (2010). Estimation of a four-parameter item response theory model. The British Journal of Mathematical and Statistical Psychology, 63(3), 509–525. https://doi.org/10.1348/000711009X474502
Magis, D. (2014). On the asymptotic standard error of a class of robust estimators of ability in dichotomous item response models. The British Journal of Mathematical and Statistical Psychology, 67(3), 430–450. https://doi.org/10.1111/bmsp.12027
Miller, I., & Miller, M. (2004). John E. Freund’s mathematical statistics with applications (7th ed.). Prentice Hall.
Reckase, M. D. (2009). Multidimensional item response theory: Statistics for social and behavioral sciences. Springer.
Rulison, K. L., & Loken, E. (2009). I’ve fallen and I can't get up: Can high-ability students recover from early mistakes in CAT? Applied Psychological Measurement, 33(2), 83–101. https://doi.org/10.1177/0146621608324023
Segall, D. O. (2004). Computerized adaptive testing. In K. Kempf-Leonard (Ed.), Encyclopedia of social measurement (pp. 429-438). Academic.
Thompson, N. A. (2009). Item selection in computerized classification testing. Educational and Psychological Measurement, 69(5), 778-793. https://doi.org/10.1177/0013164408324460
Thompson, N. A., & Weiss, D. J. (2011). A framework for the development of computerized adaptive tests. Practical Assessment, Research & Evaluation, 16(1), 1-9. https://doi.org/10.7275/wqzt-9427
Wainer, H. (Ed.). (2000). Computerized adaptive testing: A primer (2nd ed.). Lawrence Erlbaum.
Waller, N. G., & Reise, S. P. (2010). Measuring psychopathology with non-standard item response theory models: Fitting the four-parameter model to the Minnesota Multiphasic Personality Inventory. In S. Embretson (Ed.), New directions in psychological measurement with model-based approaches (pp. 147-173). American Psychological Association.
Weiss, D. J. (2004). Computerized adaptive testing for effective and efficient measurement in counseling and education. Measurement and Evaluation in Counseling and Development, 37(2), 71-84. https://doi.org/10.1080/07481756.2004.11909751

There are 18 citations in total.

Details

Primary Language	English
Journal Section	Articles
Authors	Sebahat Gören (ORCID 0000-0002-6453-3258); Hakan Kara (ORCID 0000-0002-0451-6034); Başak Erdem Kara (ORCID 0000-0003-3066-2892); Hülya Kelecioğlu (ORCID 0000-0002-0741-9934)
Publication Date	September 30, 2022
Acceptance Date	July 10, 2022
Published in Issue	Year 2022 Volume: 13 Issue: 3

Cite

APA	Gören, S., Kara, H., Erdem Kara, B., & Kelecioğlu, H. (2022). The Effect of Aberrant Responses on Ability Estimation in Computer Adaptive Tests. Journal of Measurement and Evaluation in Education and Psychology, 13(3), 256-268. https://doi.org/10.21031/epod.1067307

Journal of Measurement and Evaluation in Education and Psychology

Cited By

Analyzing aberrant response pattern in mathematics achievement test

EUREKA: Social and Humanities

https://doi.org/10.21303/2504-5571.2024.003486

Detection of aberrant testing behaviour in unproctored CAT via a verification test

International Journal of Assessment Tools in Education

https://doi.org/10.21449/ijate.1598330