Research Article
The Comparison of PISA 2015-2018 Mathematics Trend Items Based on Item Response Times

Year 2024, Volume: 15 Issue: 3, 183 - 192, 26.10.2024
https://doi.org/10.21031/epod.1398317

Abstract

This study aims to explore the intricate relationship between students' response times, item characteristics, and the effort invested during the Programme for International Student Assessment (PISA) 2015 and 2018 cycles. Through the analysis of data obtained from 69 mathematics trend items administered in a computer-based format across both cycles, this research investigates the dynamics of students' response times and their implications for effort and item characteristics. Findings reveal a significant increase in students' mean response times in the 2018 cycle compared to 2015, indicating potentially heightened effort and solution behavior. Notably, item format exerted a substantial influence on response times, with open-ended items consistently eliciting longer response times than multiple-choice items. Additionally, a correlation between response times and item difficulty emerged, suggesting that more challenging items tend to consume more time, possibly due to the complexity of the cognitive processes involved. Item-based effort, assessed through Response Time Fidelity (RTF) indices, showed that a majority of students exhibited solution behavior on the items in both cycles. Moreover, a decrease in the proportion of students displaying rapid-guessing behavior was observed in the 2018 cycle, potentially reflecting increased engagement with the assessment. While providing insights into the interplay of response times, item characteristics, and effort, this study emphasizes the need for further exploration into the multifaceted nature of effort in educational assessments. Overall, this research contributes valuable perspectives on the nuances surrounding test performance and effort evaluation within PISA mathematics assessments.
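In the response-time literature cited below (e.g., Wise & Kong, 2005), an RTF-style index is the proportion of administered items on which an examinee's response time meets or exceeds an item-level threshold separating rapid guesses from solution behavior. As a rough illustration only, the sketch below computes per-item thresholds with a hypothetical normative-style rule (10% of the item's mean response time, capped at 10 seconds, loosely in the spirit of Wise & Ma, 2012) and an RTF value per examinee; the function names, threshold rule, and toy data are assumptions for illustration, not the procedure used in this study.

```python
import numpy as np

def item_thresholds(response_times, fraction=0.10, cap=10.0):
    """Per-item rapid-guessing thresholds: a fraction of the item's mean
    response time, capped at `cap` seconds (hypothetical normative-style rule)."""
    mean_rt = np.nanmean(response_times, axis=0)
    return np.minimum(fraction * mean_rt, cap)

def rtf_index(response_times, thresholds):
    """Proportion of administered items (NaN = not administered) on which each
    examinee's response time reaches the item threshold (solution behavior)."""
    administered = ~np.isnan(response_times)
    rt = np.where(administered, response_times, 0.0)  # avoid NaN comparisons
    solution = (rt >= thresholds) & administered
    return solution.sum(axis=1) / administered.sum(axis=1)

# Toy data: 4 examinees x 3 items, response times in seconds.
rt = np.array([[55.0,   80.0,  2.0],
               [60.0,    3.0,  1.5],
               [70.0,   90.0, 45.0],
               [np.nan, 85.0, 35.0]])
thr = item_thresholds(rt)
print(np.round(rtf_index(rt, thr), 2))  # -> [0.67 0.33 1.   1.  ]
```

In this toy run, the first examinee answers the third item faster than its threshold and is flagged for one rapid guess, while the fourth examinee's RTF is computed only over the two items actually administered.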

References

  • AERA, APA, & NCME. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
  • Altuner, F. (2019). Examining the relationship between item statistics and item response time [Master’s Thesis, Mersin University]. Retrieved from http://tez2.yok.gov.tr/
  • Barrett, T., Dowle, M., Srinivasan, A., Gorecki, J., Chirico, M., & Hocking, T. (2024). data.table: Extension of 'data.frame'. R package version 1.14.8. https://CRAN.R-project.org/package=data.table
  • Debeer, D., Buchholz, J., Hartig, J., & Janssen, R. (2014). Student, school, and country differences in sustained test-taking effort in the 2009 PISA reading assessment. Journal of Educational and Behavioral Statistics, 39(6), 502-523. doi:10.3102/1076998614558485
  • Debeer, D., Janssen, R., & De Boeck, P. (2017). Modeling skipped and not-reached items using IRTrees. Journal of Educational Measurement, 54(3), 333-363. doi:10.1111/jedm.12147
  • DeMars, C. E. (2000). Test stakes and item format interactions. Applied Measurement in Education, 13(1), 55-77. doi:10.1207/s15324818ame1301_3
  • Eklöf, H., Pavešič, B. J., & Grønmo, L. S. (2014). A cross-national comparison of reported effort and mathematics performance in TIMSS Advanced. Applied Measurement in Education, 27(1), 31-45. doi:10.1080/08957347.2013.853070
  • Kuang, H., & Sahin, F. (2023). Comparison of disengagement levels and the impact of disengagement on item parameters between PISA 2015 and PISA 2018 in the United States. Large-scale Assessments in Education, 11(4). doi:10.1186/s40536-023-00152-0
  • Lee, Y. H., & Haberman, S. J. (2016). Investigating test-taking behaviors using timing and process data. International Journal of Testing, 16(3), 240-267. doi:10.1080/15305058.2015.1085385
  • Lee, Y. H., & Jia, Y. (2014). Using response time to investigate students' test-taking behaviors in a NAEP computer-based study. Large-scale Assessments in Education, 2(8), 1-24. doi:10.1186/s40536-014-0008-1
  • MEB. (2019). PISA 2018 Türkiye Ön Raporu [PISA 2018 Turkey preliminary report]. Ankara: Milli Eğitim Bakanlığı.
  • Meijer, R. R. (1996). Person-fit research: An introduction. Applied Measurement in Education, 9(1), 3-8. doi:10.1207/s15324818ame0901_2
  • Michaelides, M. P., & Ivanova, M. (2022). Response time as an indicator of test-taking effort in PISA: country and item-type differences. Psychological Test and Assessment Modeling, 64(3), 304-338.
  • OECD. (2023). PISA 2018 technical report. Retrieved from https://www.oecd.org/pisa/data/pisa2018technicalreport/
  • Petscher, Y., Mitchell, A. M., & Foorman, B. R. (2015). Improving the reliability of student scores from speeded assessments: An illustration of conditional item response theory using a computer-administered measure of vocabulary. Reading and Writing, 28, 31–56. doi:10.1007/s11145-014-9518-z
  • Pools, E., & Monseur, C. (2021). Student test-taking effort in low-stakes assessments: evidence from the English version of the PISA 2015 science test. Large-scale Assessments in Education, 9(10), 1-31. doi:10.1186/s40536-021-00104-6
  • Rios, J. A., & Guo, H. (2020). Can culture be a salient predictor of test-taking engagement? An analysis of differential noneffortful responding on an international college-level assessment of critical thinking. Applied Measurement in Education, 33(4), 263-279. doi:10.1080/08957347.2020.1789141
  • Schnipke, D. L., & Scrams, D. J. (1997). Modeling item response times with a two-state mixture model: A new method of measuring speededness. Journal of Educational Measurement, 34, 213-232. doi:10.1111/j.1745-3984.1997.tb00516.x
  • Setzer, J. C., Wise, S. L., van den Heuvel, J. R., & Ling, G. (2013). An investigation of examinee test-taking effort on a large-scale assessment. Applied Measurement in Education, 26(1), 34-49. doi:10.1080/08957347.2013.739453
  • van der Linden, W. J., & Guo, F. (2008). Bayesian procedures for identifying aberrant response-time patterns in adaptive testing. Psychometrika, 73(3), 365–384. doi:10.1007/s11336-007-9045-8
  • Wise, S. L. (2006). An investigation of the differential effort received by items on a low-stakes computer-based test. Applied Measurement in Education, 19(2), 95-114. doi:10.1207/s15324818ame1902_2
  • Wise, S. L., & DeMars, C. E. (2006). An application of item response time: the effort-moderated IRT model. Journal of Educational Measurement, 43(1), 19-38. doi:10.1111/j.1745-3984.2006.00002.x
  • Wise, S. L., & Gao, L. (2017). A general approach to measuring test-taking effort on computer-based tests. Applied Measurement in Education, 30(4), 343-354. doi:10.1080/08957347.2017.1353992
  • Wise, S. L., & Kingsbury, G. G. (2016). Modeling student test‐taking motivation in the context of an adaptive achievement test. Journal of Educational Measurement, 53(1), 86-105. doi:10.1111/jedm.12102
  • Wise, S. L., & Kong, X. (2005). Response time effort: a new measure of examinee motivation in computer-based tests. Applied Measurement in Education, 18(2), 163–183. doi:10.1207/s15324818ame1802_2
  • Wise, S. L., & Ma, L. (2012). Setting response time thresholds for a CAT item pool: The normative threshold method. Paper presented at the annual meeting of the National Council on Measurement in Education, Vancouver, Canada.
  • Wise, S. L., Bhola, D. S., & Yang, S.-T. (2006). Taking the time to improve the validity of low‐stakes tests: the effort‐monitoring CBT. Educational Measurement: Issues and Practice, 25(2), 21-30. doi:10.1111/j.1745-3992.2006.00054.x
There are 27 citations in total.

Details

Primary Language English
Subjects Testing, Assessment and Psychometrics (Other)
Journal Section Articles
Authors

Muhsin Polat 0009-0003-2897-3189

Hülya Kelecioğlu 0000-0002-0741-9934

Publication Date October 26, 2024
Submission Date December 1, 2023
Acceptance Date September 24, 2024
Published in Issue Year 2024 Volume: 15 Issue: 3

Cite

APA Polat, M., & Kelecioğlu, H. (2024). The Comparison of PISA 2015-2018 Mathematics Trend Items Based on Item Response Times. Journal of Measurement and Evaluation in Education and Psychology, 15(3), 183-192. https://doi.org/10.21031/epod.1398317