Research Article

Human-Centered AI for Discovering Student Engagement Profiles on Large-Scale Educational Assessments

Year 2024, Volume: 15 Issue: Special Issue, 282 - 301, 30.12.2024
https://doi.org/10.21031/epod.1532846

Abstract

Large-scale assessments play a key role in education: they provide insights for educators and stakeholders about what students know and are able to do, which can inform educational policies and interventions. Besides overall performance scores and subscores, educators need to know how and why students performed at certain proficiency levels in order to improve learning. Process/log data contain nuanced information about how students engaged with and acted on tasks in an assessment, which holds promise for contextualizing performance scores. However, an isolated action event observed in process data may be open to multiple interpretations. To address this challenge, the current study proposes integrating sequential process data with response data to create engagement profiles that better reflect students' test-taking processes. Most importantly, we propose using AI algorithms to assist and amplify human expertise in creating students' engagement profiles, so that information extraction from the multi-source (performance and process) data can be scaled up to enhance the value of large-scale assessments in teaching and learning. We leveraged various machine learning techniques and developed a general human-centered AI framework to help human experts make sense of the multi-source data efficiently and effectively. Using a mathematics item block from the National Assessment of Educational Progress (NAEP) for illustration, data from over 14,000 students yielded ten preliminary profiles, more than half of which were associated with low-performing students. These engagement profiles are expected to generate rich and meaningful feedback for educators and stakeholders.
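
The abstract describes the framework only at a high level. As a concrete illustration, the Python sketch below shows one way such a human-in-the-loop pipeline could be assembled from techniques cited in the references: unsupervised clustering to propose candidate profiles, a t-SNE projection (van der Maaten & Hinton, 2008) for expert inspection, and semi-supervised label propagation in the spirit of Zhu et al. (2003) to scale expert-confirmed labels to the full cohort. The feature matrix, cluster count, sample sizes, and parameter values are illustrative assumptions, not the authors' actual implementation.

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)

# Stand-in for the joined performance + process feature matrix
# (rows = students; columns = e.g., per-item response times, action counts, item scores).
# A real analysis would derive these features from assessment response and log records.
X = rng.normal(size=(2000, 12))
X_std = StandardScaler().fit_transform(X)

# Step 1: unsupervised clustering proposes preliminary engagement profiles.
n_profiles = 10
cluster_ids = KMeans(n_clusters=n_profiles, n_init=10, random_state=0).fit_predict(X_std)

# Step 2: a 2-D t-SNE projection lets experts inspect the cluster structure
# visually before attaching any substantive interpretation.
embedding = TSNE(n_components=2, random_state=0).fit_transform(X_std)

# Step 3: experts review a small sample from each cluster and assign profile
# codes; students they have not reviewed remain unlabeled (-1).
labels = np.full(len(X_std), -1)
for c in range(n_profiles):
    members = np.where(cluster_ids == c)[0]
    reviewed = rng.choice(members, size=min(20, len(members)), replace=False)
    labels[reviewed] = c  # placeholder for the expert-assigned profile code

# Step 4: semi-supervised label spreading extends the expert-confirmed labels
# to the remaining students.
propagator = LabelSpreading(kernel="knn", n_neighbors=10).fit(X_std, labels)
profiles = propagator.transduction_

In a workflow like this, the algorithms scale up pattern discovery while human experts supply the substantive meaning of each profile, which is the division of labor the human-centered AI framing implies.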

References

  • Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., . . . Zheng, X. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. Google. https://www.tensorflow.org/
  • Baker, R. (2021). Artificial intelligence in education: Bringing it all together. In S. Vincent-Lancrin (Ed.), Pushing the frontiers with AI, blockchain, and robots (pp. 43–54). OECD.
  • Ercikan, K., Guo, H., & He, Q. (2020). Use of response process data to inform group comparisons and fairness research. Educational Assessment, 25(3), 179–197. https://doi.org/10.1080/10627197.2020.1804353
  • Ercikan, K., Guo, H., & Por, H.-H. (2023). Uses of process data in advancing the practice and science of technology-rich assessments. In N. Foster & M. Piacentini (Eds.), Innovating assessments to measure and support complex skills. OECD Publishing. https://www.oecd-ilibrary.org/education/innovating-assessments-to-measure-and-support-complex-skills_7b3123f1-en
  • Ercikan, K., & Pellegrino, J. (2017). Validation of score meaning in the next generation of assessments: The use of response processes. Routledge.
  • Géron, A. (2017). Hands-on machine learning with Scikit-Learn and TensorFlow: Concepts, tools, and techniques to build intelligent systems. Sebastopol, CA: O'Reilly Media.
  • Gordon, E. (2020). Toward assessment in the service of learning. Educational Measurement: Issues and Practice, 39(3), 72–78. https://doi.org/10.1111/emip.12370
  • Greiff, S., Niepel, C., Scherer, R., & Martin, R. (2016). Understanding students’ performance in a computer-based assessment of complex problem solving: An analysis of behavioral data from computer-generated log files. Computers in Human Behavior, 61, 36-46.
  • Guo, H., & Ercikan, K. (2021a). Differential rapid responding across language and cultural groups. Educational Research and Evaluation, 26(5-6), 302-327. https://doi.org/10.1080/13803611.2021.1963941
  • Guo, H., & Ercikan, K. (2021b). Using response-time data to compare the testing behaviors of English language learners (ELLs) to other test-takers (non-ELLs) on a mathematics assessment. ETS Research Report Series, 2021(1), 1–15. https://doi.org/10.1002/ets2.12340
  • Guo, H., Johnson, M., Ercikan, K., Saldivia, L., & Worthington, M. (2024). Large-scale assessments for learning: A human-centered AI approach to contextualize test performance. Journal of Learning Analytics, 11(2), 229–245. https://doi.org/10.18608/jla.2024.8007
  • Guo, H., Rios, J., Haberman, S., Liu, O., Wang, J., & Paek, I. (2017). A new procedure for detection of students' rapid guessing responses using response time. Applied Measurement in Education, 29(3), 173–183. https://doi.org/10.1080/08957347.2016.1171766
  • Johnson, M. S., & Liu, X. (2022). Psychometric considerations for the joint modeling of response and process data [Paper presentation]. International Meeting of the Psychometric Society, Bologna, Italy.
  • Levy, R. (2020). Implications of considering response process data for greater and lesser psychometrics. Educational Assessment, 25(3), 218–235. https://doi.org/10.1080/10627197.2020.1804352
  • Van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579–2605.
  • Miao, F., Holmes, W., Huang, R., & Zhang, H. (2021). AI and education: Guidance for policymakers. UNESCO.
  • National Assessment Governing Board (NAGB, 2020). Response process data from the 2017 NAEP grade 8 mathematics assessment. https://www.nationsreportcard.gov/process_data/
  • National Assessment Governing Board (NAGB, 2024a). Mathematics assessment framework for the 2022 and 2024 National Assessment of Educational Progress. Retrieved from https://www.nagb.gov/content/dam/nagb/en/documents/publications/frameworks/mathematics/2022-24-nagb-math-framework-508.pdf
  • National Assessment Governing Board (NAGB, 2024b). How states use and value the Nation’s Report Card. Retrieved from https://www.nagb.gov/about-us/state-and-tuda-case-studies.html
  • National Center for Education Statistics. (NCES, 2022). NAEP questions tool. Retrieved from https://nces.ed.gov/NationsReportCard/nqt/
  • National Research Council (NRC, 2000). How people learn: Brain, mind, experience, and school: Expanded edition. Washington, DC: The National Academies Press. https://doi.org/10.17226/9853
  • National Academies of Sciences, Engineering, and Medicine (NASEM, 2018). How people learn II: Learners, contexts, and cultures. Washington, DC: The National Academies Press.
  • Office of Educational Technology. (2023). Artificial intelligence and the future of teaching and learning: Insights and recommendations (Report). Washington, DC: U.S. Department of Education.
  • Pellegrino, J. W. (2020). Important considerations for assessment to function in the service of education. Educational Measurement: Issues and Practice, 39(3), 81–85. https://doi.org/10.1111/emip.12372
  • Pohl, S., Ulitzsch, E., & von Davier, M. (2021). Reframing rankings in educational assessments. Science, 372(6540), 338–340. https://doi.org/10.1126/science.abd3300
  • Radwan, A. M. (2019). Human active learning. In S. M. Brito (Ed.), Active learning (chap. 2). Rijeka: IntechOpen. https://doi.org/10.5772/intechopen.81371
  • Rios, J., & Guo, H. (2020). Can culture be a salient predictor of test-taking engagement? An analysis of differential noneffortful responding on an international college-level assessment of critical thinking. Applied Measurement in Education, 33(4), 263–279. https://doi.org/10.1080/08957347.2020.1789141
  • Rios, J., Guo, H., Mao, L., & Liu, O. L. (2017). Evaluating the impact of careless responding on aggregated-scores: To filter unmotivated examinees or not? International Journal of Testing, 17(1), 74–104.
  • Rizve, M. N., Duarte, K., Rawat, Y. S., & Shah, M. (2021). In defense of pseudo-labeling: An uncertainty-aware pseudo-label selection framework for semi-supervised learning. In International Conference on Learning Representations. Retrieved from https://iclr.cc/media/iclr-2021/Slides/3255.pdf
  • Wise, S. (2017). Rapid-guessing behavior: Its identification, interpretation, and implications. Educational Measurement: Issues and Practice, 36(4), 52–61. https://doi.org/10.1111/emip.12165
  • Wise, S. (2021). Six insights regarding test-taking disengagement. Educational Research and Evaluation, 26(5-6), 328–338. https://doi.org/10.1080/13803611.2021.1963942
  • Wise, S., & Kong, X. (2005). Response time effort: A new measure of examinee motivation in computer-based tests. Applied Measurement in Education, 18(2), 163–183. https://doi.org/10.1207/s15324818ame1802_2
  • Xie, Q., Dai, Z., Hovy, E. H., Luong, M., & Le, Q. V. (2019). Unsupervised data augmentation for consistency training. CoRR, abs/1904.12848. Retrieved from http://arxiv.org/abs/1904.12848
  • Zhu, X., Lafferty, J., & Ghahramani, Z. (2003). Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions. In ICML 2003 Workshop on the Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining (pp. 58–65).
  • Zoanetti, N., & Griffin, P. (2017). Log-file data as indicators for problem-solving processes. In B. Csapo & J. Funke (Eds.), The nature of problem solving: Using research to inspire 21st century learning (chap. 11). Paris: OECD Publishing. https://doi.org/10.1787/9789264273955-en
There are 35 citations in total.

Details

Primary Language English
Subjects Testing, Assessment and Psychometrics (Other)
Journal Section Articles
Authors

Hongwen Guo 0000-0002-1751-0918

Matthew Johnson 0000-0003-3157-4165

Luis Saldivia 0009-0007-3482-7654

Michelle Worthington 0009-0006-0480-3769

Kadriye Ercikan 0000-0001-8056-9165

Publication Date December 30, 2024
Submission Date August 15, 2024
Acceptance Date November 23, 2024
Published in Issue Year 2024 Volume: 15 Issue: Special Issue

Cite

APA Guo, H., Johnson, M., Saldivia, L., Worthington, M., & Ercikan, K. (2024). Human-Centered AI for Discovering Student Engagement Profiles on Large-Scale Educational Assessments. Journal of Measurement and Evaluation in Education and Psychology, 15(Special Issue), 282-301. https://doi.org/10.21031/epod.1532846