Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions
Abstract
The validity of individual test scores is a central concern in psychological and educational assessment. One important threat to that validity is aberrant item response behavior: aberrant item scores can inflate or deflate an individual’s total score, so that ability is estimated above or below its true level. Person-fit statistics (PFS) are useful tools for detecting such behavior, and many parametric and nonparametric PFS have been proposed in the literature. The general purpose of this study is to examine the effectiveness of parametric and nonparametric PFS in data sets consisting of polytomous items. It is basic research that evaluates PFS using simulated data sets. As expected, detection rates (power) increased as the nominal Type I error rate (significance level, alpha) increased. In general, detection rates also increased as the number of misfitting item score vectors and the number of items increased. Nonparametric PFS (N-PFS), especially GP, generally detected more aberrant individuals than the parametric PFS (P-PFS) lzp. However, in some test conditions with longer tests, lzp detected more aberrant individuals than N-PFS. Overall, the results indicate that N-PFS outperformed P-PFS in most test conditions.
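To make the parametric statistic concrete, the following is a minimal sketch of the standardized log-likelihood person-fit statistic for polytomous items, the form usually denoted lzp. It is an illustration under stated assumptions, not the study's actual implementation: the function names `lz_poly` and `is_flagged`, the toy category probabilities, and the normal-approximation cutoff are all hypothetical choices made here. In practice the category probabilities would come from a fitted polytomous IRT model evaluated at the examinee's estimated ability.

```python
import math
from statistics import NormalDist


def lz_poly(responses, probs):
    """Standardized log-likelihood person-fit statistic for polytomous items.

    responses: observed category index (0-based) for each item.
    probs: probs[i][k] is the model probability of category k on item i,
           evaluated at the examinee's ability estimate (all must be > 0).
    """
    log_lik = 0.0    # observed log-likelihood of the response vector
    expected = 0.0   # its expectation under the model
    variance = 0.0   # its variance under the model (items independent)
    for x, p in zip(responses, probs):
        logs = [math.log(pk) for pk in p]
        item_mean = sum(pk * lk for pk, lk in zip(p, logs))
        item_second = sum(pk * lk * lk for pk, lk in zip(p, logs))
        log_lik += logs[x]
        expected += item_mean
        variance += item_second - item_mean ** 2
    return (log_lik - expected) / math.sqrt(variance)


def is_flagged(lz, alpha=0.05):
    """One-sided flagging rule, assuming a standard normal reference
    distribution (an assumption of this sketch): large negative values
    indicate misfit."""
    return lz < NormalDist().inv_cdf(alpha)


# Hypothetical category probabilities for four three-category items.
toy_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.6, 0.2],
    [0.6, 0.3, 0.1],
]
fitting = [0, 2, 1, 0]   # most likely category on every item
aberrant = [2, 0, 0, 2]  # least likely category on every item

lz_fit = lz_poly(fitting, toy_probs)
lz_ab = lz_poly(aberrant, toy_probs)
print(lz_fit, is_flagged(lz_fit))
print(lz_ab, is_flagged(lz_ab))
```

Under this rule a larger alpha moves the cutoff closer to zero, so more response vectors are flagged. That is consistent with the reported pattern of detection rates rising with the Type I error rate, although the cutoffs used in the study itself may have been obtained differently.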
Details
Primary Language
English
Journal Section
Research Article
Publication Date
December 13, 2019
Submission Date
February 11, 2019
Acceptance Date
August 24, 2019
Published in Issue
Year 2019 Volume: 10 Number: 4
Cited By
Aberrant individuals’ effects on fit indices both of confirmatory factor analysis and polytomous IRT models
Current Psychology
https://doi.org/10.1007/s12144-021-01563-4
Detecting Aberrant Response Behavior with Nonparametric Method: Mokken and PerFit Packages in RStudio
Measurement: Interdisciplinary Research and Perspectives
https://doi.org/10.1080/15366367.2020.1725734
Cultural sensitivity of early childhood assessments based on learning progressions: a Rasch person fit analysis
Educational Assessment, Evaluation and Accountability
https://doi.org/10.1007/s11092-025-09453-0