TY  - JOUR
T1  - Stabilizing Maximum Likelihood Estimation with a Damping Factor in the Initial Phase of Computerized Adaptive Testing
AU  - Tosun, Alper
AU  - Aybek, Eren Can
AU  - Sinan, Alper
PY  - 2025
DA  - 2025/09//
Y2  - 2025
DO  - 10.21031/epod.1658558
JF  - Journal of Measurement and Evaluation in Education and Psychology
JO  - JMEEP
PB  - Association for Measurement and Evaluation in Education and Psychology
WT  - DergiPark
SN  - 1309-6575
SP  - 124
EP  - 138
VL  - 16
IS  - 3
LA  - en
AB  - Maximum Likelihood Estimation (MLE) is a widely used ability estimation method in Item Response Theory (IRT)-based computerized adaptive testing (CAT) applications. However, traditional MLE is highly sensitive to initial responses, often leading to substantial fluctuations and estimation instability, particularly in short tests or small item pools. This study investigates the effects of incorporating a damping factor into MLE at the early stages of CAT to mitigate undue fluctuations in ability estimates. Using Monte Carlo simulations based on a 3-Parameter Logistic (3PL) model in R, we compare the performance of the adjusted MLE with standard MLE, Maximum a Posteriori (MAP), and Expected a Posteriori (EAP) estimation methods. Results indicate that damping improves MLE stability, reducing extreme ability fluctuations and enhancing estimation accuracy, particularly in short tests and small sample conditions.
KW  - Computerized Adaptive Testing
KW  - Maximum Likelihood Estimation
KW  - Maximum a Posteriori
CR  - Aybek, E. C., & Çıkrıkçı, R. N. (2018, September). Kendini Değerlendirme Envanteri’nin bilgisayar ortamında bireye uyarlanmış test olarak uygulanabilirliği. Turkish Psychological Counseling and Guidance Journal, 8(50), 117–141. Turkish Psychological Counseling and Guidance Association. https://dergipark.org.tr/en/download/article-file/571511
CR  - Bock, R., & Mislevy, R. (1982). Adaptive EAP Estimation of Ability in a Microcomputer Environment. Applied Psychological Measurement, 6, 431–444. https://doi.org/10.1177/014662168200600405
CR  - Chen, S., Hou, L., & Dodd, B. (1998). A Comparison of Maximum Likelihood Estimation and Expected a Posteriori Estimation in CAT Using the Partial Credit Model. Educational and Psychological Measurement, 58, 569–595. https://doi.org/10.1177/0013164498058004002
CR  - Chen, S., Hou, L., Fitzpatrick, S., & Dodd, B. (1997). The Effect of Population Distribution and Method of Theta Estimation on Computerized Adaptive Testing (CAT) Using the Rating Scale Model. Educational and Psychological Measurement, 57, 422–439. https://doi.org/10.1177/0013164497057003004
CR  - Cheng, Y. (2008). Computerized adaptive testing: New developments and applications (Doctoral dissertation, University of Illinois at Urbana-Champaign). University of Illinois at Urbana-Champaign. https://hdl.handle.net/2142/82159
CR  - Di Stefano, F., Pannaux, M., Correges, A., Galtier, S., Robert, V., & Saint‐Hilary, G. (2022). A comparison of estimation methods adjusting for selection bias in adaptive enrichment designs with time‐to‐event endpoints. Statistics in Medicine, 41(10), 1767-1779. https://doi.org/10.1002/sim.9327
CR  - Dutilleul, P. (1999). The mle algorithm for the matrix normal distribution. Journal of Statistical Computation and Simulation, 64, 105-123. https://doi.org/10.1080/00949659908811970
CR  - Fuh, C. D., Ip, E. H., & Chen, S. H. (2020). Computerized adaptive test using raw responses for item selection: theoretical results and applications for the up-and-down method. Statistics and Its Interface, 13(3), 317-333. https://doi.org/10.4310/SII.2020.v13.n3.a3
CR  - Geraldo, I. (2022). An Automated Profile-Likelihood-Based Algorithm for Fast Computation of the Maximum Likelihood Estimate in a Statistical Model for Crash Data. Journal of Applied Mathematics, 2022, 6974166:1-6974166:11. https://doi.org/10.1155/2022/6974166
CR  - Gorin, J., Dodd, B., Fitzpatrick, S., & Shieh, Y. (2005). Computerized Adaptive Testing With the Partial Credit Model: Estimation Procedures, Population Distributions, and Item Pool Characteristics. Applied Psychological Measurement, 29, 433–456. https://doi.org/10.1177/0146621605280072
CR  - Graf, A., Gutjahr, G., & Brannath, W. (2015). Precision of maximum likelihood estimation in adaptive designs. Statistics in Medicine, 35, 922–941. https://doi.org/10.1002/sim.6761
CR  - Gündeğer, C., & Doğan, N. (2018). Bireyselleştirilmiş bilgisayarlı sınıflama testlerinde madde havuzu özelliklerinin test uzunluğu ve sınıflama doğruluğu üzerindeki etkisi. Hacettepe Üniversitesi Eğitim Fakültesi Dergisi, 33(4), 888-896. https://doi.org/10.16986/HUJE.2016024284
CR  - Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory (2nd ed.). SAGE Publications. https://books.google.com.tr/books?id=gW05DQAAQBAJ
CR  - Han, K. (2016). Maximum Likelihood Score Estimation Method With Fences for Short-Length Tests and Computerized Adaptive Tests. Applied Psychological Measurement, 40, 289–301. https://doi.org/10.1177/0146621616631317
CR  - Ho, T. H., & Dodd, B. G. (2012). Item Selection and Ability Estimation Procedures for a Mixed-Format Adaptive Test. Applied Measurement in Education, 25(4), 305–326. https://doi.org/10.1080/08957347.2012.714686
CR  - Kalender, İ. (2009). Başarı ve yetenek kestiriminde yeni bir yaklaşım: Bilgisayar ortamında bireyselleştirilmiş testler. CITO Eğitim: Kuram ve Uygulama, 5, 40–48.
CR  - Kalender, İ. (2011). Effects of different computerized adaptive testing strategies on recovery of ability (Doctoral dissertation, Middle East Technical University). Middle East Technical University Graduate School of Natural and Applied Sciences. https://hdl.handle.net/11511/21135
CR  - Karagianni, M., & Tsaousis, I. (2025). From Development to Validation: Exploring the Efficiency of Numetrive, a Computerized Adaptive Assessment of Numerical Reasoning. Behavioral Sciences, 15(3), 268. https://doi.org/10.3390/bs15030268
CR  - Kern, J. L., & Choe, E. (2021). Using a response time–based expected a posteriori estimator to control for differential speededness in computerized adaptive test. Applied Psychological Measurement, 45(5), 361-385. https://doi.org/10.1177/01466216211014601
CR  - Lilley, M. (2007). The development and application of computer-adaptive testing in a higher education environment (Doctoral dissertation, University of Hertfordshire).
CR  - Lin, C. H., Chen, K. P., & Tsai, C. H. (2008, December). Modeling the Examinee Ability on the Computerized Adaptive Testing Using Adaptive Network-Based Fuzzy Inference System. In 2008 IEEE Asia-Pacific Services Computing Conference (pp. 139-144). IEEE. https://doi.org/10.1109/APSCC.2008.53
CR  - Lord, F. M., & Novick, M. R. (2008). Statistical theories of mental test scores. Information Age Publishing. https://books.google.com.tr/books?id=k_wnDwAAQBAJ
CR  - Magis, D., & Raiche, G. (2012). Random generation of response patterns under computerized adaptive testing with the R package catR. Journal of Statistical Software, 48(8), 1-31. https://doi.org/10.18637/jss.v048.i08
CR  - Magis, D., & Barrada, J. R. (2017). Computerized adaptive testing with R: Recent updates of the package catR. Journal of Statistical Software, 76, 1-19. https://doi.org/10.18637/jss.v076.c01
CR  - Piepho, H. (1993). Use of the Maximum Likelihood Method in the Analysis of Phenotypic Stability. Biometrical Journal, 35, 815-822. https://doi.org/10.1002/BIMJ.4710350709
CR  - R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Vienna, Austria. https://www.R-project.org/
CR  - Sands, W. A., Waters, B. K., & McBride, J. R. (Eds.). (1997). Computerized adaptive testing: From inquiry to operation. American Psychological Association. https://doi.org/10.1037/10244-000
CR  - Stocking, M. L., & Lewis, C. (1995). A new method of controlling item exposure in computerized adaptive testing. ETS Research Report Series, 1995(2), i-29. https://doi.org/10.1002/j.2333-8504.1995.tb01660.x
CR  - Suhardi, I. (2020). Alternative item selection strategies for improving test security in computerized adaptive testing of the algorithm. Research and Evaluation in Education, 6, 32-40. https://doi.org/10.21831/reid.v6i1.30508
CR  - van der Linden, W. J., & Glas, C. A. W. (Eds.). (2000). Computerized adaptive testing: Theory and practice. Springer. https://doi.org/10.1007/0-306-47531-6
CR  - van der Linden, W. J., & Glas, C. A. W. (Eds.). (2010). Elements of adaptive testing. Springer. https://doi.org/10.1007/978-0-387-85461-8
CR  - Wang, S., & Wang, T. (2001). Precision of Warm’s weighted likelihood estimates for a polytomous model in computerized adaptive testing. Applied Psychological Measurement, 25(4), 317-331. https://doi.org/10.1177/0146621012203216
CR  - Wang, T., & Vispoel, W. (1998). Properties of Ability Estimation Methods in Computerized Adaptive Testing. Journal of Educational Measurement, 35, 109-135. https://doi.org/10.1111/J.1745-3984.1998.TB00530.X
CR  - Weiss, D., & Şahin, A. (2024). Computerized adaptive testing: From concept to implementation. The Guilford Press. ISBN: 9781462554515
CR  - Wyse, A., & McBride, J. (2022). Handling Extreme Scores in Vertically Scaled Fixed-Length Computerized Adaptive Tests. Measurement: Interdisciplinary Research and Perspectives, 20, 1–20. https://doi.org/10.1080/15366367.2021.1977583
UR  - https://doi.org/10.21031/epod.1658558
L1  - https://dergipark.org.tr/en/download/article-file/4693642
ER  -