Research Article
The Comparison of PISA 2015-2018 Mathematics Trend Items Based on Item Response Times

Year 2024, Volume: 15 Issue: 3, 183 - 192, 26.10.2024
https://doi.org/10.21031/epod.1398317

Abstract

This study aims to explore the intricate relationship between students' response times, item characteristics, and the effort invested during the Programme for International Student Assessment (PISA) 2015 and 2018 cycles. Through the analysis of data obtained from 69 mathematics trend items administered in a computer-based format across both cycles, this research investigates the dynamics of students' response times and their implications for effort and item characteristics. Findings reveal a significant increase in students' mean response times in the 2018 cycle compared to 2015, indicating potentially heightened effort and solution behavior. Notably, item format exerted a substantial influence on response times, with open-ended items consistently eliciting longer response times than multiple-choice items. Additionally, a correlation between response times and item difficulty emerged, suggesting that more challenging items tend to consume more time, possibly due to the complexity of the cognitive processes involved. Item-based effort, assessed through Response Time Fidelity (RTF) indices, showed that a majority of students exhibited solution behavior on the items in both cycles. Moreover, a decrease in the proportion of students displaying rapid-guessing behavior was observed in the 2018 cycle, potentially reflecting increased engagement with the assessment. While providing insights into the interplay of response times, item characteristics, and effort, this study emphasizes the need for further exploration into the multifaceted nature of effort in educational assessments. Overall, this research contributes valuable perspectives on the nuances surrounding test performance and effort evaluation within PISA mathematics assessments.
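In the response-time literature cited below (e.g., Wise & Kong, 2005), an RTF-style index is the proportion of administered items on which an examinee's response time meets or exceeds an item-level threshold separating rapid guesses from solution behavior. As a rough illustration only, the sketch below computes per-item thresholds with a hypothetical normative-style rule (10% of the item's mean response time, capped at 10 seconds, loosely in the spirit of Wise & Ma, 2012) and an RTF value per examinee; the function names, threshold rule, and toy data are assumptions for illustration, not the procedure used in this study.

```python
import numpy as np

def item_thresholds(response_times, fraction=0.10, cap=10.0):
    """Per-item rapid-guessing thresholds: a fraction of the item's mean
    response time, capped at `cap` seconds (hypothetical normative-style rule)."""
    mean_rt = np.nanmean(response_times, axis=0)
    return np.minimum(fraction * mean_rt, cap)

def rtf_index(response_times, thresholds):
    """Proportion of administered items (NaN = not administered) on which each
    examinee's response time reaches the item threshold (solution behavior)."""
    administered = ~np.isnan(response_times)
    rt = np.where(administered, response_times, 0.0)  # avoid NaN comparisons
    solution = (rt >= thresholds) & administered
    return solution.sum(axis=1) / administered.sum(axis=1)

# Toy data: 4 examinees x 3 items, response times in seconds.
rt = np.array([[55.0,   80.0,  2.0],
               [60.0,    3.0,  1.5],
               [70.0,   90.0, 45.0],
               [np.nan, 85.0, 35.0]])
thr = item_thresholds(rt)
print(np.round(rtf_index(rt, thr), 2))  # -> [0.67 0.33 1.   1.  ]
```

In this toy run, the first examinee answers the third item faster than its threshold and is flagged for one rapid guess, while the fourth examinee's RTF is computed only over the two items actually administered.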

References

  • AERA, APA, & NCME. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
  • Altuner, F. (2019). Examining the relationship between item statistics and item response time [Master’s Thesis, Mersin University]. Retrieved from http://tez2.yok.gov.tr/
  • Barrett, T., Dowle, M., Srinivasan, A., Gorecki, J., Chirico, M., & Hocking, T. (2024). data.table: Extension of 'data.frame'. R package version 1.14.8. https://CRAN.R-project.org/package=data.table
  • Debeer, D., Buchholz, J., Hartig, J., & Janssen, R. (2014). Student, school, and country differences in sustained test-taking effort in the 2009 PISA reading assessment. Journal of Educational and Behavioral Statistics, 39(6), 502-523. doi:10.3102/1076998614558485
  • Debeer, D., Janssen, R., & De Boeck, P. (2017). Modeling skipped and not-reached items using IRTrees. Journal of Educational Measurement, 54(3), 333-363. doi:10.1111/jedm.12147
  • DeMars, C. E. (2000). Test stakes and item format interactions. Applied Measurement in Education, 13(1), 55-77. doi:10.1207/s15324818ame1301_3
  • Eklöf, H., Pavešič, B. J., & Grønmo, L. S. (2014). A cross-national comparison of reported effort and mathematics performance in TIMSS Advanced. Applied Measurement in Education, 27(1), 31-45. doi:10.1080/08957347.2013.853070
  • Kuang, H., & Sahin, F. (2023). Comparison of disengagement levels and the impact of disengagement on item parameters between PISA 2015 and PISA 2018 in the United States. Large-scale Assessments in Education, 11(4). doi:10.1186/s40536-023-00152-0
  • Lee, Y. H., & Haberman, S. J. (2016). Investigating test-taking behaviors using timing and process data. International Journal of Testing, 16(3), 240-267. doi:10.1080/15305058.2015.1085385
  • Lee, Y. H., & Jia, Y. (2014). Using response time to investigate students' test-taking behaviors in a NAEP computer-based study. Large-scale Assessments in Education, 2(8), 1-24. doi:10.1186/s40536-014-0008-1
  • MEB. (2019). PISA 2018 Türkiye Ön Raporu [PISA 2018 Turkey preliminary report]. Ankara: Milli Eğitim Bakanlığı.
  • Meijer, R. R. (1996). Person-fit research: An introduction. Applied Measurement in Education, 9(1), 3-8. doi:10.1207/s15324818ame0901_2
  • Michaelides, M. P., & Ivanova, M. (2022). Response time as an indicator of test-taking effort in PISA: country and item-type differences. Psychological Test and Assessment Modeling, 64(3), 304-338.
  • OECD. (2023). PISA 2018 technical report. Retrieved from https://www.oecd.org/pisa/data/pisa2018technicalreport/
  • Petscher, Y., Mitchell, A. M., & Foorman, B. R. (2015). Improving the reliability of student scores from speeded assessments: An illustration of conditional item response theory using a computer-administered measure of vocabulary. Reading and Writing, 28, 31–56. doi:10.1007/s11145-014-9518-z
  • Pools, E., & Monseur, C. (2021). Student test-taking effort in low-stakes assessments: evidence from the English version of the PISA 2015 science test. Large-scale Assessments in Education, 9(10), 1-31. doi:10.1186/s40536-021-00104-6
  • Rios, J. A., & Guo, H. (2020). Can culture be a salient predictor of test-taking engagement? An analysis of differential noneffortful responding on an international college-level assessment of critical thinking. Applied Measurement in Education, 33(4), 263-279. doi:10.1080/08957347.2020.1789141
  • Schnipke, D. L., & Scrams, D. J. (1997). Modeling item response times with a two-state mixture model: A new method of measuring speededness. Journal of Educational Measurement, 34, 213-232. doi:10.1111/j.1745-3984.1997.tb00516.x
  • Setzer, J. C., Wise, S. L., van den Heuvel, J. R., & Ling, G. (2013). An investigation of examinee test-taking effort on a large-scale assessment. Applied Measurement in Education, 26(1), 34-49. doi:10.1080/08957347.2013.739453
  • van der Linden, W. J., & Guo, F. (2008). Bayesian procedures for identifying aberrant response-time patterns in adaptive testing. Psychometrika, 73(3), 365–384. doi:10.1007/s11336-007-9045-8
  • Wise, S. L. (2006). An investigation of the differential effort received by items on a low-stakes computer-based test. Applied Measurement in Education, 19(2), 95-114. doi:10.1207/s15324818ame1902_2
  • Wise, S. L., & DeMars, C. E. (2006). An application of item response time: the effort-moderated IRT model. Journal of Educational Measurement, 43(1), 19-38. doi:10.1111/j.1745-3984.2006.00002.x
  • Wise, S. L., & Gao, L. (2017). A general approach to measuring test-taking effort on computer-based tests. Applied Measurement in Education, 30(4), 343-354. doi:10.1080/08957347.2017.1353992
  • Wise, S. L., & Kingsbury, G. G. (2016). Modeling student test‐taking motivation in the context of an adaptive achievement test. Journal of Educational Measurement, 53(1), 86-105. doi:10.1111/jedm.12102
  • Wise, S. L., & Kong, X. (2005). Response time effort: a new measure of examinee motivation in computer-based tests. Applied Measurement in Education, 18(2), 163–183. doi:10.1207/s15324818ame1802_2
  • Wise, S. L., & Ma, L. (2012). Setting response time thresholds for a CAT item pool: The normative threshold method. Paper presented at the annual meeting of the National Council on Measurement in Education, Vancouver, Canada.
  • Wise, S. L., Bhola, D. S., & Yang, S.-T. (2006). Taking the time to improve the validity of low‐stakes tests: the effort‐monitoring CBT. Educational Measurement: Issues and Practice, 25(2), 21-30. doi:10.1111/j.1745-3992.2006.00054.x
There are 27 citations in total.

Details

Primary Language English
Subjects Testing, Assessment and Psychometrics (Other)
Journal Section Articles
Authors

Muhsin Polat 0009-0003-2897-3189

Hülya Kelecioğlu 0000-0002-0741-9934

Publication Date October 26, 2024
Submission Date December 1, 2023
Acceptance Date September 24, 2024
Published in Issue Year 2024 Volume: 15 Issue: 3

Cite

APA Polat, M., & Kelecioğlu, H. (2024). The Comparison of PISA 2015-2018 Mathematics Trend Items Based on Item Response Times. Journal of Measurement and Evaluation in Education and Psychology, 15(3), 183-192. https://doi.org/10.21031/epod.1398317