Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions

Asiye Şengül Avşar

doi:10.21031/epod.525647

Abstract

The validity of individual test scores is an important issue in psychological and educational assessment. One important factor that affects this validity is aberrant item response behavior. Aberrant item scores may raise or lower individuals' test scores, so that their ability is estimated above or below its true level. Person-fit statistics (PFS) are useful tools for detecting aberrant behavior, and a great number of parametric and nonparametric PFS exist in the literature. The general purpose of this study is to examine the effectiveness of parametric and nonparametric PFS in data sets that consist of polytomous items. This is a fundamental research study that evaluates the effectiveness of PFS using simulated data sets. As expected, detection rates (power) increased as the Type I error rate (the significance level, alpha) increased. In general, detection rates also increased as the number of misfitting item score vectors and the number of items increased. Nonparametric PFS (N-PFS), especially G^P, generally detected more aberrant individuals than the parametric PFS (P-PFS) l_z^p. However, in some test conditions with longer tests, l_z^p detected more aberrant individuals than the N-PFS. Overall, the results indicate that N-PFS outperformed P-PFS in most of the test conditions.
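The relationship between the Type I error rate and the detection rate described in the abstract can be illustrated with a minimal sketch. This is not the study's procedure: the function names, the toy scores, and the convention that lower values indicate worse fit (as with l_z-type statistics) are all assumptions made for illustration. The cutoff is taken from the scores of simulated fitting respondents, so a larger alpha gives a more lenient cutoff and can only flag the same or more aberrant respondents.

```python
def cutoff_at_alpha(null_scores, alpha):
    """Cutoff such that a fraction `alpha` of fitting respondents is flagged.

    Assumption: lower scores mean worse fit. With tied scores the
    realized Type I error rate can exceed `alpha`.
    """
    ordered = sorted(null_scores)
    # Number of fitting respondents allowed to be flagged at this alpha;
    # the small offset guards against floating-point rounding.
    k = int(alpha * len(ordered) + 1e-9)
    if k == 0:
        return float("-inf")  # flag nobody
    return ordered[k - 1]


def flag_rate(scores, cutoff):
    """Proportion of respondents whose score is at or below the cutoff."""
    return sum(s <= cutoff for s in scores) / len(scores)


# Toy person-fit values (hypothetical, not results from the study).
fitting = [-2.5, -1.8, -1.2, -0.9, -0.5, -0.1, 0.2, 0.6, 1.0, 1.4]
aberrant = [-3.0, -2.2, -1.5, -1.0, 0.3]

alphas = (0.1, 0.2, 0.3)
# Detection rate (power) among aberrant respondents at each alpha.
detection_rates = [flag_rate(aberrant, cutoff_at_alpha(fitting, a)) for a in alphas]
# Realized Type I error rate among fitting respondents at each alpha.
type_one_rates = [flag_rate(fitting, cutoff_at_alpha(fitting, a)) for a in alphas]
```

For a statistic where higher values indicate misfit, such as a count of Guttman errors, the same idea applies with the comparison reversed and the cutoff taken from the upper tail.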

Keywords

polytomous items, aberrant item response, person-fit statistics

Details

Primary Language

English

Journal Section

Research Article

Authors

Asiye Şengül Avşar
0000-0001-5522-2514
Türkiye

Publication Date

December 13, 2019

Submission Date

February 11, 2019

Acceptance Date

August 24, 2019

Published in Issue

Year 2019 Volume: 10 Number: 4

DOI

https://doi.org/10.21031/epod.525647

IZ

https://izlik.org/JA22MU69FP

Cite

APA

Şengül Avşar, A. (2019). Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions. Journal of Measurement and Evaluation in Education and Psychology, 10(4), 348-364. https://doi.org/10.21031/epod.525647

Cited By

Aberrant individuals’ effects on fit indices both of confirmatory factor analysis and polytomous IRT models

Detecting Aberrant Response Behavior with Nonparametric Method: Mokken and PerFit Packages in RStudio

Cultural sensitivity of early childhood assessments based on learning progressions: a Rasch person fit analysis