TY - JOUR
T1 - Investigating the Effect of Item Position on Person and Item Parameters: PISA 2015 Turkey Sample
AU - Demirkol, Sinem
AU - Kelecioğlu, Hülya
PY - 2022
DA - March
Y2 - 2021
DO - 10.21031/epod.958576
JF - Journal of Measurement and Evaluation in Education and Psychology
JO - JMEEP
PB - Association for Measurement and Evaluation in Education and Psychology
WT - DergiPark
SN - 1309-6575
SP - 69
EP - 85
VL - 13
IS - 1
LA - en
AB - Different positions of items in booklets affect the probabilities of correct answers. This effect, known in the literature as the item position effect, introduces variance into item and person parameters. The aim of this study is to investigate the item position effect within the framework of explanatory item response theory. The analyses were carried out on the PISA 2015 Turkey sample, and the item position effect was examined in the reading and mathematics domains. In addition, the effect of item position on different item formats (open-response and multiple-choice) was investigated. According to the results, the item position effect decreased the probability of answering an item correctly, and this effect was stronger in reading than in mathematics. Furthermore, in the mathematics domain, open-response items were affected more by item position than multiple-choice items were. In the reading domain, open-response and multiple-choice items were affected similarly. The results show that item position had undesirable effects, and these effects should be taken into account.
KW - Item position
KW - explanatory item response theory
KW - item format
KW - item easiness
KW - mathematics and reading domain
KW - PISA 2015
CR - Albano, A. D. (2013). Multilevel modeling of item position effects. Journal of Educational Measurement, 50(4), 408-426. https://doi.org/10.1111/jedm.12026
CR - Albano, A. D., McConnell, S. R., Lease, E. M., & Cai, L. (2020). Contextual interference effects in early assessment: Evaluating the psychometric benefits of item interleaving. Frontiers in Education, 5. https://doi.org/10.3389/feduc.2020.00133
CR - Asseburg, R., & Frey, A. (2013). Too hard, too easy, or just right? The relationship between effort or boredom and ability-difficulty fit. Psychological Test and Assessment Modeling, 55(1), 92-104. https://psycnet.apa.org/record/2013-18917-006
CR - Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1-48. https://doi.org/10.18637/jss.v067.i01
CR - Brennan, R. L. (1992). The context of context effects. Applied Measurement in Education, 5, 225-264. https://doi.org/10.1207/s15324818ame0503_4
CR - Breslow, N. E., & Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88(421), 9-25. https://doi.org/10.2307/2290687
CR - Bulut, O. (2021). eirm: Explanatory item response modeling for dichotomous and polytomous item responses (R package version 0.3.0) [Computer software]. https://doi.org/10.5281/zenodo.4556285
CR - Bulut, O., Guo, Q., & Gierl, M. (2017). A structural equation modeling approach for examining position effects in large-scale assessments. Large-scale Assessments in Education, 5(8), 1-20. https://doi.org/10.1186/s40536-017-0042-x
CR - Christiansen, A., & Janssen, R. (2020). Item position effects in listening but not in reading in the European Survey of Language Competences. Educational Assessment, Evaluation and Accountability, 33(3), 49-69. https://doi.org/10.1007/s11092-020-09335-7
CR - Cook, L. L., & Petersen, N. S. (1987). Problems related to the use of conventional and item response theory equating methods in less than optimal circumstances. Applied Psychological Measurement, 11(3), 225-244. https://doi.org/10.1177/014662168701100302
CR - De Boeck, P., & Wilson, M. (2004). Explanatory item response models: A generalized linear and nonlinear approach. Springer.
CR - De Boeck, P., Bakker, M., Zwitser, R., Nivard, M., Hofman, A., Tuerlinckx, F., & Partchev, I. (2011). The estimation of item response models with the lmer function from the lme4 package in R. Journal of Statistical Software, 39(12), 1-28. https://doi.org/10.18637/jss.v039.i12
CR - Debeer, D., & Janssen, R. (2013). Modeling item-position effects within an IRT framework. Journal of Educational Measurement, 50(2), 164-185. https://doi.org/10.1111/jedm.12009
CR - Desjardins, C. D., & Bulut, O. (2018). Handbook of educational measurement and psychometrics using R. CRC Press.
CR - Fahrmeir, L., & Tutz, G. (2001). Multivariate statistical modelling based on generalized linear models (2nd ed.). Springer.
CR - Frey, A., & Bernhardt, R. (2012). On the importance of using balanced booklet designs in PISA. Psychological Test and Assessment Modeling, 54(4), 397-417. https://www.psychologie-aktuell.com/fileadmin/download/ptam/4-2012_20121224/05_Frey.pdf
CR - Frey, A., Hartig, J., & Rupp, A. (2009). An NCME instructional module on booklet designs in large-scale assessments of student achievement: Theory and practice. Educational Measurement: Issues and Practice, 28(3), 39-53. https://doi.org/10.1111/j.1745-3992.2009.00154.x
CR - Goff, M., & Ackerman, P. L. (1992). Personality-intelligence relations: Assessment of typical intellectual engagement. Journal of Educational Psychology, 84(4), 537-552. https://doi.org/10.1037/0022-0663.84.4.537
CR - Gonzalez, E., & Rutkowski, L. (2010). Principles of multiple matrix booklet designs and parameter recovery in large scale assessments. IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments, 3, 125-156. https://www.ierinstitute.org/fileadmin/Documents/IERI_Monograph/IERI_Monograph_Volume_03_Chapter_6.pdf
CR - Guertin, W. H. (1954). The effect of instructions and item order on the arithmetic subtest of the Wechsler-Bellevue. Journal of Genetic Psychology, 85(1), 79-83. https://doi.org/10.1080/00221325.1954.10532863
CR - Hahne, J. (2008). Analyzing position effects within reasoning items using the LLTM for structurally incomplete data. Psychology Science Quarterly, 50(3), 379-390. https://www.psychologie-aktuell.com/fileadmin/download/PschologyScience/3-2008/05_Hahne.pdf
CR - Hambleton, R. K., & Traub, R. E. (1974). The effects of item order on test performance and stress. Journal of Experimental Education, 43(1), 40-46. http://www.jstor.org/stable/20150989
CR - Hartig, J., & Buchholz, J. (2012). A multilevel item response model for item position effects and individual persistence. Psychological Test and Assessment Modeling, 54(4), 418-431. https://www.proquest.com/scholarly-journals/multilevel-item-response-model-positioneffects/docview/1355923397
CR - Hecht, M., Weirich, S., Siegle, T., & Frey, A. (2015). Effects of design properties on parameter estimation in large-scale assessments. Educational and Psychological Measurement, 75(6), 1021-1044. https://doi.org/10.1177/0013164415573311
CR - Hohensinn, C., Kubinger, K., Reif, M., Schleicher, E., & Khorramdel, L. (2011). Analysing item position effects due to test booklet design within large-scale assessment. Educational Research and Evaluation, 17(6), 497-509. https://doi.org/10.1080/13803611.2011.632668
CR - Janssen, R., Schepers, J., & Peres, D. (2004). Models with item and item group predictors. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models (pp. 189-212). Springer. https://doi.org/10.1007/978-1-4757-3990-9_6
CR - Kingston, N. M., & Dorans, N. J. (1982). The effect of the position of an item within a test on item responding behavior: An analysis based on item response theory (GRE Board Professional Report GREB No. 79-12bP). Educational Testing Service. https://doi.org/10.1002/j.2333-8504.1982.tb01308.x
CR - Kingston, N. M., & Dorans, N. J. (1984). Item location effects and their implications for IRT equating and adaptive testing. Applied Psychological Measurement, 8(2), 147-154. https://doi.org/10.1177/014662168400800202
CR - Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices. Springer.
CR - Kolen, M. J., & Harris, D. (1990). Comparison of item pre-equating and random groups equating using IRT and equipercentile methods. Journal of Educational Measurement, 27(1), 27-39. https://doi.org/10.1111/j.1745-3984.1990.tb00732.x
CR - Le, L. T. (2007, July). Effects of item positions on their difficulty and discrimination: A study in PISA Science data across test language and countries. Paper presented at the 72nd Annual Meeting of the Psychometric Society, Tokyo. https://research.acer.edu.au/pisa/2/
CR - Leary, L. F., & Dorans, N. J. (1985). Implications for altering the context in which test items appear: A historical perspective on an immediate concern. Review of Educational Research, 55(3), 387-413.
CR - Lord, F. M. (1980). Applications of item response theory to practical testing problems. Erlbaum.
CR - Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Addison-Wesley.
CR - MacNicol, K. (1956). Effects of varying order of item difficulty in an unspeeded verbal test (Unpublished manuscript). Educational Testing Service.
CR - McCoach, D. B., & Black, A. C. (2008). Evaluation of model fit and adequacy. In A. A. O’Connell & D. B. McCoach (Eds.), Multilevel modeling of educational data (pp. 245-272). Information Age Publishing.
CR - McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). Chapman & Hall.
CR - McCulloch, C. E., & Searle, S. R. (2001). Generalized, linear, and mixed models. Wiley.
CR - Meyers, J. L., Miller, G. E., & Way, W. D. (2009). Item position and item difficulty change in an IRT-based common item equating design. Applied Measurement in Education, 22(1), 38-60. https://doi.org/10.1080/08957340802558342
CR - Mollenkopf, W. G. (1950). An experimental study of the effects on item-analysis data of changing item placement and test time limit. Psychometrika, 15(3), 291-315. https://doi.org/10.1007/BF02289044
CR - Nagy, G., Nagengast, B., Frey, A., Becker, M., & Rose, N. (2018). A multilevel study of position effects in PISA achievement tests: Student- and school-level predictors in the German tracked school system. Assessment in Education: Principles, Policy & Practice, 26(4), 422-443. https://doi.org/10.1080/0969594X.2018.1449100
CR - Okumura, T. (2014). Empirical differences in omission tendency and reading ability in PISA: An application of tree-based item response models. Educational and Psychological Measurement, 74(4), 611-626. https://doi.org/10.1177/0013164413516976
CR - Organisation for Economic Co-operation and Development. (2009). PISA 2006 technical report. Organisation for Economic Co-operation and Development. https://www.oecd.org/pisa/data/42025182.pdf
CR - Organisation for Economic Co-operation and Development. (2012). PISA 2009 technical report. Organisation for Economic Co-operation and Development. https://doi.org/10.1787/9789264167872-en
CR - Organisation for Economic Co-operation and Development. (2014). PISA 2012 technical report. Organisation for Economic Co-operation and Development. https://www.oecd.org/pisa/pisaproducts/PISA-2012-technical-report-final.pdf
CR - Organisation for Economic Co-operation and Development. (2017). PISA 2015 technical report. Organisation for Economic Co-operation and Development. https://www.oecd.org/pisa/data/2015-technical-report/
CR - R Core Team. (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
CR - Raven, J. C., Raven, J., & Court, J. H. (1997). Raven’s progressive matrices and vocabulary scales. J. C. Raven Ltd.
CR - Rose, N., von Davier, M., & Xu, X. (2010). Modeling nonignorable missing data with item response theory (IRT) (Report No. RR-10-11). Educational Testing Service.
CR - Rose, N., Nagy, G., Nagengast, B., Frey, A., & Becker, M. (2019). Modeling multiple item context effects with generalized linear mixed models. Frontiers in Psychology, 10. https://doi.org/10.3389/fpsyg.2019.00248
CR - Sax, G., & Carr, A. (1962). An investigation of response sets on altered parallel forms. Educational and Psychological Measurement, 22(2), 371-376. https://doi.org/10.1177/001316446202200210
CR - Schweizer, K., Schreiner, M., & Gold, A. (2009). The confirmatory investigation of APM items with loadings as a function of the position and easiness of items: A two-dimensional model of APM. Psychology Science Quarterly, 51(1), 47-64. https://psycnet.apa.org/record/2009-06359-003
CR - Smouse, A. D., & Munz, D. C. (1968). The effects of anxiety and item difficulty sequence on achievement testing scores. Journal of Psychology, 68(2), 181-184. https://doi.org/10.1080/00223980.1968.10543421
CR - Trendtel, M., & Robitzsch, A. (2018). Modeling item position effects with a Bayesian item response model applied to PISA 2009-2015 data. Psychological Test and Assessment Modeling, 60(2), 241-263. https://www.psychologie-aktuell.com/fileadmin/download/ptam/2-2018_20180627/06_PTAM-2-2018_Trendtel_v2.pdf
CR - Tuerlinckx, F., & De Boeck, P. (2004). Models for residual dependencies. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models (pp. 289-316). Springer.
CR - Wainer, H., & Kiely, G. L. (1987). Item clusters and computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24(3), 185-201. http://www.jstor.org/stable/1434630
CR - Weirich, S., Hecht, M., & Böhme, K. (2014). Modeling item position effects using generalized linear mixed models. Applied Psychological Measurement, 38(7), 535-548. https://doi.org/10.1177/0146621614534955
CR - Weirich, S., Hecht, M., Penk, C., Roppelt, A., & Böhme, K. (2016). Item position effects are moderated by changes in test-taking effort. Applied Psychological Measurement, 41(2), 115-129. https://doi.org/10.1177/0146621616676791
CR - Whitely, S. E., & Dawis, R. V. (1976). The influence of test context on item difficulty. Educational and Psychological Measurement, 36(2), 329-337. https://doi.org/10.1177/001316447603600211
CR - Wise, L. L., Chia, W. J., & Park, R. (1989, March). Item position effects for tests of word knowledge and arithmetic reasoning. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA.
CR - Wise, S. L., & DeMars, C. E. (2005). Low examinee effort in low-stakes assessment: Problems and potential solutions. Educational Assessment, 10(1), 1-17. https://doi.org/10.1207/s15326977ea1001_1
CR - Wise, S. L., & Kong, X. (2005). Response time effort: A new measure of examinee motivation in computer-based tests. Applied Measurement in Education, 18(2), 163-183. https://doi.org/10.1207/s15324818ame1802_2
CR - Wu, Q., Debeer, D., Buchholz, J., Hartig, J., & Janssen, R. (2019). Predictors of individual performance changes related to item positions in PISA assessments. Large-scale Assessments in Education, 7(5), 1-20. https://doi.org/10.1186/s40536-019-0073-6
CR - Yen, W. M. (1980). The extent, causes and importance of context effects on item parameters for two latent trait models. Journal of Educational Measurement, 17(4), 297-311. http://www.jstor.org/stable/1434871
CR - Zwick, R. (1991). Effects of item order and context on estimation of NAEP reading proficiency. Educational Measurement: Issues and Practice, 10(3), 10-16. https://doi.org/10.1111/j.1745-3992.1991.tb00198.x
UR - https://doi.org/10.21031/epod.958576
L1 - https://dergipark.org.tr/en/download/article-file/1846959
ER -