Research Article

Year 2025, Volume: 16 Issue: 3, 124 - 138, 30.09.2025
https://doi.org/10.21031/epod.1658558

Stabilizing Maximum Likelihood Estimation with a Damping Factor in the Initial Phase of Computerized Adaptive Testing

Abstract

Maximum likelihood estimation (MLE) is a widely used ability estimation method in computerized adaptive testing (CAT) applications based on item response theory (IRT). However, traditional MLE is highly sensitive to the first few responses, which often leads to large fluctuations and unstable estimates, particularly in short tests or with small item pools. This study investigates the effect of incorporating a damping factor into MLE in the early stages of CAT to mitigate undue fluctuations in the ability estimate. Using Monte Carlo simulations in R based on a three-parameter logistic (3PL) model, we compare the adjusted MLE with standard MLE, maximum a posteriori (MAP), and expected a posteriori (EAP) estimation. The results indicate that damping improves MLE stability, reducing extreme fluctuations in ability estimates and enhancing estimation accuracy, particularly in short tests and small-sample conditions.
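
To make the idea concrete, the following is a minimal sketch of a damped MLE update under the 3PL model. The abstract does not give the exact form of the damping factor, so the rule used here is an assumption rather than the authors' method: during the first `n_damped` items, the provisional estimate moves only a fraction `damping` of the way from the previous estimate toward the new MLE. The function names, the bounds of -4 and 4, and the default values are hypothetical, and the sketch is written in Python although the study reports using R.

```python
import math


def p3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def scoring_step(theta, items, responses):
    """One Fisher-scoring increment: score divided by test information."""
    score = info = 0.0
    for (a, b, c), u in zip(items, responses):
        p = p3pl(theta, a, b, c)
        w = a * (p - c) / (1.0 - c)      # dP/dtheta equals w * (1 - p)
        score += w * (u - p) / p         # derivative of the log-likelihood
        info += w * w * (1.0 - p) / p    # Fisher information of the item
    return score / info


def mle_theta(items, responses, theta0=0.0, lo=-4.0, hi=4.0,
              max_iter=50, tol=1e-8):
    """Plain MLE by Fisher scoring, clamped to [lo, hi] (assumed bounds)."""
    theta = min(max(theta0, lo), hi)
    for _ in range(max_iter):
        new = min(max(theta + scoring_step(theta, items, responses), lo), hi)
        if abs(new - theta) < tol:
            return new
        theta = new
    return theta


def damped_update(theta_prev, items, responses, damping=0.5, n_damped=5):
    """Assumed damping rule: shrink the jump toward the MLE early in the test."""
    theta_mle = mle_theta(items, responses, theta0=theta_prev)
    d = damping if len(responses) <= n_damped else 1.0  # no damping later on
    return theta_prev + d * (theta_mle - theta_prev)


if __name__ == "__main__":
    # Hypothetical item parameters (a, b, c) and a single correct response.
    items = [(1.2, 0.0, 0.2)]
    print("plain MLE :", mle_theta(items, [1]))
    print("damped MLE:", damped_update(0.0, items, [1]))
```

With a single correct response the likelihood has no finite maximum, so the plain estimate in this sketch runs to the upper bound (4.0), whereas the damped update with `damping=0.5` moves only halfway from the starting value of 0.0, to 2.0. Once more than `n_damped` items have been answered, the update reduces to the ordinary MLE.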

Details

Primary Language: English
Subjects: Classical Test Theories, Item Response Theory
Journal Section: Articles

Authors

Alper Tosun (ORCID: 0000-0001-9715-5209)

Eren Can Aybek (ORCID: 0000-0003-3040-2337)

Alper Sinan (ORCID: 0000-0001-6632-5500)

Publication Date: September 30, 2025
Submission Date: March 15, 2025
Acceptance Date: June 9, 2025
Published in Issue: Year 2025, Volume: 16, Issue: 3

Cite

APA Tosun, A., Aybek, E. C., & Sinan, A. (2025). Stabilizing Maximum Likelihood Estimation with a Damping Factor in the Initial Phase of Computerized Adaptive Testing. Journal of Measurement and Evaluation in Education and Psychology, 16(3), 124-138. https://doi.org/10.21031/epod.1658558