Research Article

Comparison of Different Computerized Adaptive Testing Approaches with Shadow Test Under Different Test Length and Ability Estimation Method Conditions

Year 2023, Volume: 14, Issue: 4, 396–412, 31.12.2023
https://doi.org/10.21031/epod.1202599

Abstract

Adaptive testing approaches have been adopted in many international large-scale assessments (e.g., PISA, TIMSS, PIRLS). The shadow-test approach is an innovative testing framework that both satisfies all test specifications and constraints and aims to provide maximum information at the test taker's true ability level. The aim of this study is to investigate the effectiveness of four adaptive testing designs assembled with shadow tests: computerized adaptive testing (CAT), two- and three-stage on-the-fly multistage testing (2-Stage and 3-Stage O-MST), and linear on-the-fly testing (LOFT), under different test lengths and ability estimation methods. In a Monte Carlo (MC) study conducted in R, 200 item parameters and 2,000 test takers were generated under the 3PL model, and results were averaged over 50 replications. The results show that CAT, 2-Stage O-MST, and 3-Stage O-MST are quite similar in effectiveness, whereas LOFT is less effective than the other three. Measurement precision increased with test length in all four designs. Although the EAP method generally provided better measurement precision than MLE, MLE was found to perform well at the extremes of the ability scale. The findings suggest that large-scale assessments can benefit from adaptive tests assembled with the shadow-test approach.
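
Two computational steps summarized above lend themselves to compact illustration: generating 3PL response data for the Monte Carlo study, and assembling a shadow test by integer programming. The first sketch below is minimal base R for the data-generation and EAP-scoring steps; the generating distributions (lognormal discriminations, standard-normal difficulties and abilities, uniform pseudo-guessing) and the quadrature settings are illustrative assumptions, as the abstract does not report them.

    # Monte Carlo data generation under the 3PL model: 200 items, 2000 test takers.
    # All parameter distributions below are assumed for illustration only.
    set.seed(1)
    n_items   <- 200
    n_persons <- 2000
    a <- rlnorm(n_items, meanlog = 0, sdlog = 0.25)   # discriminations (assumed)
    b <- rnorm(n_items, 0, 1)                         # difficulties (assumed)
    g <- runif(n_items, 0.05, 0.25)                   # pseudo-guessing (assumed)
    theta <- rnorm(n_persons, 0, 1)                   # true abilities

    # 3PL: P(correct) = g + (1 - g) * logistic(a * (theta - b))
    eta  <- sweep(outer(theta, b, "-"), 2, a, "*")    # a_j * (theta_i - b_j)
    P    <- sweep(plogis(eta), 2, g, function(p, gg) gg + (1 - gg) * p)
    resp <- matrix(rbinom(length(P), 1, P), nrow = n_persons)

    # EAP scoring on a quadrature grid with a standard normal prior
    grid   <- seq(-4, 4, length.out = 81)
    logLik <- sapply(grid, function(q) {
      pq <- g + (1 - g) * plogis(a * (q - b))         # item probabilities at q
      as.vector(resp %*% log(pq) + (1 - resp) %*% log(1 - pq))
    })
    logpost <- logLik + matrix(dnorm(grid, log = TRUE), n_persons, length(grid), byrow = TRUE)
    logpost <- logpost - apply(logpost, 1, max)       # stabilize before exponentiating
    post    <- exp(logpost)
    eap     <- as.vector((post %*% grid) / rowSums(post))

    # Bias and RMSE for one replication; the study averages 50 such replications
    c(bias = mean(eap - theta), rmse = sqrt(mean((eap - theta)^2)))

The defining feature of the shadow-test approach is that, before each item is administered, a complete test satisfying every specification is assembled to maximize information at the current ability estimate, and the next item is selected from that assembly. The second sketch casts one assembly step as a binary integer program solved with Rglpk, the solver cited in the references; the 20-item length, the content quota, and the item pool carried over from the sketch above are all hypothetical, not the article's actual specifications.

    library(Rglpk)  # R interface to the GNU Linear Programming Kit

    # Fisher information of each 3PL item at the current ability estimate
    theta_hat <- 0.3
    info_3pl  <- function(theta, a, b, g) {
      p <- g + (1 - g) * plogis(a * (theta - b))
      a^2 * ((p - g) / (1 - g))^2 * (1 - p) / p
    }
    info <- info_3pl(theta_hat, a, b, g)

    content      <- sample(1:2, n_items, replace = TRUE)  # hypothetical content areas
    administered <- c(5, 17, 42)                          # items already given (example)

    # Maximize total information subject to: 20 items in total, exactly 8 from
    # content area 1, and every already-administered item kept in the test.
    mat <- rbind(rep(1, n_items),
                 as.numeric(content == 1),
                 diag(n_items)[administered, , drop = FALSE])
    dir <- rep("==", 2 + length(administered))
    rhs <- c(20, 8, rep(1, length(administered)))
    sol <- Rglpk_solve_LP(obj = info, mat = mat, dir = dir, rhs = rhs,
                          types = rep("B", n_items), max = TRUE)
    shadow_test <- which(sol$solution == 1)

    # Administer the most informative not-yet-used item from the shadow test
    candidates <- setdiff(shadow_test, administered)
    next_item  <- candidates[which.max(info[candidates])]

In an operational implementation this assembly is repeated after every response with an updated ability estimate; the TestDesign package cited in the references (Choi et al., 2022) provides a complete R implementation of this workflow.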

Supporting Institution

-

Project Number

-

References

  • Akhtar, H., Silfiasari, Vekety, B., & Kovacs, K. (2023). The effect of computerized adaptive testing on motivation and anxiety: A systematic review and meta-analysis. Assessment, 30(5), 1379–1390. https://doi.org/10.1177/10731911221100995
  • Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397–479). Addison-Wesley.
  • Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443–459. https://doi.org/10.1007/bf02293801
  • Borgatto, A. F., Azevedo, C., Pinheiro, A., & Andrade, D. (2015). Comparison of ability estimation methods using IRT for tests with different degrees of difficulty. Communications in Statistics-Simulation and Computation, 44(2), 474-488. https://doi.org/10.1080/03610918.2013.781630
  • Bulut, O., & Sünbül, Ö. (2017). Monte Carlo Simulation Studies in Item Response Theory with the R Programming Language. Journal of Measurement and Evaluation in Education and Psychology, 8(3), 266-287. https://doi.org/10.21031/epod.305821
  • Chang, H.-H., & Ying, Z. (1999). A-stratified multistage computerized adaptive testing. Applied Psychological Measurement, 23(3), 211–222. https://doi.org/10.1177/01466219922031338
  • Choi, S. W., & Lim, S. (2022). Adaptive test assembly with a mix of set-based and discrete items. Behaviormetrika, 49(2), 231-254. https://doi.org/10.1007/s41237-021-00148-6
  • Choi, S. W., & van der Linden, W. J. (2018). Ensuring content validity of patient-reported outcomes: a shadow-test approach to their adaptive measurement. Quality of Life Research, 27(7), 1683-1693. https://doi.org/10.1007/s11136-017-1650-1
  • Choi, S. W., Lim, S., & van der Linden, W. J. (2022). TestDesign: an optimal test design approach to constructing fixed and adaptive tests in R. Behaviormetrika, 49(2), 191-229. https://doi.org/10.1007/s41237-021-00145-9
  • Choi, S. W., Moellering, K. T., Li, J., & van der Linden, W. J. (2016). Optimal reassembly of shadow tests in CAT. Applied Psychological Measurement, 40(7), 469-485. https://doi.org/10.1177/0146621616654597
  • Çoban, E. (2020). Bilgisayar temelli bireyselleştirilmiş test yaklaşımlarının Türkiye'deki merkezi dil sınavlarında uygulanabilirliğinin araştırılması (Unpublished doctoral dissertation). Ankara Üniversitesi.
  • Demir, S., & Atar, B. (2021). Investigation of classification accuracy, test length and measurement precision at computerized adaptive classification tests. Journal of Measurement and Evaluation in Education and Psychology, 12(1), 15–27. https://doi.org/10.21031/epod.787865
  • Ebenbeck, N. (2023). Computerized adaptive testing in inclusive education. Universität Regensburg. https://doi.org/10.5283/EPUB.54551
  • Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum.
  • Erdem Kara, B., & Doğan, N. (2022). The effect of ratio of items indicating differential item functioning on computer adaptive and multi-stage tests. International Journal of Assessment Tools in Education, 9(3), 682–696. https://doi.org/10.21449/ijate.1105769
  • Feinberg, R. A., & Rubright, J. D. (2016). Conducting simulation studies in psychometrics. Educational Measurement: Issues and Practice, 35(2), 36-49.
  • Gökçe, S., & Glas, C. A. W. (2018). Can TIMSS mathematics assessments be implemented as a computerized adaptive test? Journal of Measurement and Evaluation in Education and Psychology, 9(4), 422–436. https://doi.org/10.21031/epod.487351
  • Gündeğer, C., & Doğan, N. (2018). Bireyselleştirilmiş Bilgisayarlı Sınıflama Testi Kriterlerinin Test Etkililiği ve Ölçme Kesinliği Açısından Karşılaştırılması. Journal of Measurement and Evaluation in Education and Psychology, 9(2), 161–177. https://doi.org/10.21031/epod.401077
  • Han, K. T. (2016). Maximum likelihood score estimation method with fences for short-length tests and computerized adaptive tests. Applied Psychological Measurement, 40(4), 289–301. https://doi.org/10.1177/0146621616631317
  • Han, K. T., & Guo, F. (2014). Multistage testing by shaping modules on the fly. In D. Yan, A. A. von Davier, & C. Lewis (Eds.), Computerized multistage testing: Theory and applications (pp. 119–133). CRC Press.
  • Harwell, M., Stone, C. A., Hsu, T.-C., & Kirisci, L. (1996). Monte Carlo studies in item response theory. Applied Psychological Measurement, 20(2), 101–125. https://doi.org/10.1177/014662169602000201
  • Hendrickson, A. (2007). An NCME instructional module on multistage testing. Educational Measurement: Issues and Practice, 26(2), 44–52. https://doi.org/10.1111/j.1745-3992.2007.00093.x
  • Huang, Y.-M., Lin, Y.-T., & Cheng, S. C. (2009). An adaptive testing system for supporting versatile educational assessment. Computers & Education, 52(1), 53–67. https://doi.org/10.1016/j.compedu.2008.06.007
  • Kaplan, M., de la Torre, J., & Barrada, J. R. (2015). New item selection methods for cognitive diagnosis computerized adaptive testing. Applied Psychological Measurement, 39(3), 167–188. https://doi.org/10.1177/0146621614554650
  • Khorramdel, L., Pokropek, A., Joo, S. H., Kirsch, I., & Halderman, L. (2020). Examining gender DIF and gender differences in the PISA 2018 reading literacy scale: A partial invariance approach. Psychological Test and Assessment Modeling, 62(2), 179-231.
  • Kim, H., & Plake, B. (1993). Monte Carlo simulation comparison of two-stage testing and computer adaptive testing. Unpublished doctoral dissertation, University of Nebraska, Lincoln.
  • Kirsch, I., & Lennon, M. L. (2017). PIAAC: a new design for a new era. Large-Scale Assessments in Education, 5(1), 1-22. https://doi.org/10.1186/s40536-017-0046-6
  • Macken-Ruiz, C. L. (2008). A comparison of multi-stage and computerized adaptive tests based on the generalized partial credit model. Unpublished doctoral dissertation, University of Texas at Austin.
  • Mooney, C. Z. (1997). Monte Carlo simulation. Sage.
  • Mullis, I. V. S., & Martin, M. O. (Eds.). (2019). PIRLS 2021 assessment frameworks. International Association for the Evaluation of Educational Achievement.
  • National Center for Education Statistics (NCES). (2019). Program for International Student Assessment 2022 (PISA 2022) main study recruitment and field test.
  • Özdemir, B., & Gelbal, S. (2022). Measuring language ability of students with compensatory multidimensional CAT: A post-hoc simulation study. Education and Information Technologies, 27(5), 6273–6294. https://doi.org/10.1007/s10639-021-10853-0
  • Patsula, L. N. (1999). A comparison of computerized-adaptive testing and multi-stage testing. Unpublished doctoral dissertation, University of Massachusetts at Amherst.
  • Raborn, A., & Sari, H. (2021). Mixed adaptive multistage testing: A new approach. Journal of Measurement and Evaluation in Education and Psychology, 12(4), 358–373. https://doi.org/10.21031/epod.871014
  • Şahin, M. G., & Boztunç Öztürk, N. (2019). Analyzing the maximum likelihood score estimation method with fences in ca-MST. International Journal of Assessment Tools in Education, 6(4), 555–567. https://doi.org/10.21449/ijate.634091
  • Samejima, F. (1977). A method of estimating item characteristic functions using the maximum likelihood estimate of ability. Psychometrika, 42(2), 163-191.
  • Schnipke, D. L., & Reese, L. M. (1999). A comparison of testlet-based test designs for computerized adaptive testing (Law School Admissions Council Computerized Testing Report 97-01). Newtown, PA: Law School Admission Council.
  • Sigal, M. J., & Chalmers, R. P. (2016). Play it again: Teaching statistics with Monte Carlo simulation. Journal of Statistics Education, 24(3), 136–156. https://doi.org/10.1080/10691898.2016.1246953
  • Stafford, R. E., Runyon, C. R., Casabianca, J. M., & Dodd, B. G. (2019). Comparing computer adaptive testing stopping rules under the generalized partial-credit model. Behavior Research Methods, 51(3), 1305-1320. https://doi.org/10.3758/s13428-018-1068-x
  • Theussl, S., Hornik, K., Buchta, C., Schwendinger, F., & Schuchardt, H. (2019). Rglpk: R/GNU Linear Programming Kit interface (R package version 0.6-4).
  • van der Linden, W. J., & Diao, Q. (2014). Using a universal shadow-test assembler with multistage testing. In D. Yan, A. A. von Davier, & C. Lewis (Eds.), Computerized multistage testing: Theory and applications (pp. 101–118). CRC Press.
  • van der Linden, W. J. (1998). Bayesian item selection criteria for adaptive testing. Psychometrika, 63(2), 201–216. https://doi.org/10.1007/bf02294775
  • van der Linden, W. J. (2009). Constrained adaptive testing with shadow tests. In W. J. van der Linden & C. A. W. Glas (Eds.), Elements of adaptive testing (pp. 31–55). New York, NY: Springer.
  • van der Linden, W. J., & Glas, C. A. W. (Eds.). (2010). Elements of adaptive testing. New York, NY: Springer.
  • van der Linden, W. J. (2022). Review of the shadow-test approach to adaptive testing. Behaviormetrika, 49(2), 169-190. https://doi.org/10.1007/s41237-021-00150-y
  • van der Linden, W. J., & Chang, H. H. (2003). Implementing content constraints in alpha-stratified adaptive testing using a shadow test approach. Applied Psychological Measurement, 27(2), 107-120. https://doi.org/10.1177/0146621602250531
  • van der Linden, W. J., & Veldkamp, B. P. (2004). Constraining item exposure in computerized adaptive testing with shadow tests. Journal of Educational and Behavioral Statistics, 29(3), 273-291. https://doi.org/10.3102/10769986029003273
  • Veerkamp, W. J. J., & Berger, M. P. F. (1997). Some new item selection criteria for adaptive testing. Journal of Educational and Behavioral Statistics, 22(2), 203–226. https://doi.org/10.3102/10769986022002203
  • Wainer, H. (1990). An adaptive algebra test: A testlet-based, hierarchically-structured test with validity-based scoring (Technical Report No. 90-92).
  • Wang, K. (2017). A fair comparison of the performance of computerized adaptive testing and multistage adaptive testing (Unpublished Doctoral Dissertation). Michigan State University.
  • Wang, T., & Vispoel, W. P. (1998). Properties of ability estimation methods in computerized adaptive testing. Journal of Educational Measurement, 35(2), 109–135. https://doi.org/10.1111/j.1745-3984.1998.tb00530.x
  • Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54(3), 427-450.
  • Weiss, D. J. (2004). Computerized adaptive testing for effective and efficient measurement in counseling and education. Measurement and Evaluation in Counseling and Development, 37(2), 70-84.
  • Xiao, J., & Bulut, O. (2022). Item selection with collaborative filtering in on-the-fly multistage adaptive testing. Applied Psychological Measurement. Advance online publication. https://doi.org/10.1177/01466216221124089
  • Yiğiter, M. S., & Dogan, N. (2023). Computerized multistage testing: Principles, designs and practices with R. Measurement: Interdisciplinary Research and Perspectives, 21(4), 254–277. https://doi.org/10.1080/15366367.2022.2158017
  • Yin, L., & Foy, P. (2021). TIMSS 2023 Assessment Design. TIMSS 2023 Assessment Frameworks, 71.
  • Zheng, Y., & Chang, H.-H. (2015). On-the-fly assembled multistage adaptive testing. Applied Psychological Measurement, 39(2), 104–118. https://doi.org/10.1177/0146621614544519
There are 56 references in total.

Details

Primary Language: English
Subjects: Test Theories
Section: Articles
Authors

Mahmut Sami Yiğiter 0000-0002-2896-0201

Nuri Doğan 0000-0001-6274-2016

Project Number: -
Publication Date: December 31, 2023
Acceptance Date: October 20, 2023
Published in Issue: Year 2023, Volume: 14, Issue: 4

Cite

APA Yiğiter, M. S., & Doğan, N. (2023). Comparison of Different Computerized Adaptive Testing Approaches with Shadow Test Under Different Test Length and Ability Estimation Method Conditions. Journal of Measurement and Evaluation in Education and Psychology, 14(4), 396-412. https://doi.org/10.21031/epod.1202599