Research Article

The study of the effect of item parameter drift on ability estimation obtained from adaptive testing under different conditions

Year 2022, Volume: 9, Issue: 3, 654-681, 30.09.2022
https://doi.org/10.21449/ijate.1070848

Abstract

Item parameter drift (IPD) is the systematic change in the parameter values of items over time due to various reasons. When it occurs in computer adaptive tests (CAT), it introduces errors into the estimation of item and ability parameters, so identifying the conditions under which it arises is important for estimating these parameters with minimum error. This study examines the impact of IPD on measurement precision and the test information function (TIF) in CAT administrations. This simulation study compares conditions of sample size (1000, 5000), IPD size (0.00, 0.50, 0.75, 1.00 logits), percentage of items containing IPD (0%, 5%, 10%, 20%), item bank size (200, 500, 1000), and three time points. To examine the impact of these conditions on ability estimation, measurement precision and TIF values were calculated, and a factorial analysis of variance (ANOVA) for independent samples was carried out to test whether the estimations differed across these factors. The study found that repeated measurements using an item bank containing IPD items lead to a decrease in measurement precision and in the amount of information the test provides. The factorial ANOVA for independent samples showed that the differences in measurement precision and TIF were mostly statistically significant. Although all IPD conditions negatively affected measurement precision and TIF, sample size and item bank size generally had no increasing or decreasing effect on these outcomes.
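The mechanism summarized above can be illustrated with a small numerical sketch. The Python example below is not the authors' simulation code: it is a minimal, fixed-form (non-adaptive) illustration that borrows two of the abstract's conditions (a 0.50-logit drift in 10% of the items of a 200-item bank) and otherwise relies on assumptions: 2PL item parameters drawn from assumed distributions, drift applied only to item difficulties, a sample of 1000 simulated examinees, and simple grid-based maximum-likelihood scoring. It shows how scoring responses generated under drifted difficulties with the outdated bank parameters degrades the precision of ability estimates, and how the TIF computed from the original and the drifted parameter sets differs.

```python
import numpy as np

rng = np.random.default_rng(1)

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def mle_theta(resp, a, b, grid=np.linspace(-4, 4, 161)):
    """Grid-search maximum-likelihood ability estimate for one examinee."""
    p = p_2pl(grid[:, None], a, b)                    # shape: (grid points, items)
    loglik = (resp * np.log(p) + (1 - resp) * np.log(1 - p)).sum(axis=1)
    return grid[np.argmax(loglik)]

# Assumed item bank and examinee sample (illustrative values, not the article's data).
n_items, n_persons = 200, 1000
a = rng.lognormal(0.0, 0.2, n_items)                  # discrimination parameters
b = rng.normal(0.0, 1.0, n_items)                     # difficulty parameters (original calibration)
theta = rng.normal(0.0, 1.0, n_persons)               # true abilities

# Inject IPD: shift the difficulty of 10% of the items by 0.50 logits.
drift = np.zeros(n_items)
drift[rng.choice(n_items, size=int(0.10 * n_items), replace=False)] = 0.50
b_drifted = b + drift                                  # parameters governing current responses

# Responses are generated with the drifted difficulties ...
resp = (rng.random((n_persons, n_items)) < p_2pl(theta[:, None], a, b_drifted)).astype(int)

# ... but scored with the outdated bank parameters, as happens when drift goes undetected.
theta_outdated = np.array([mle_theta(r, a, b) for r in resp])
theta_current = np.array([mle_theta(r, a, b_drifted) for r in resp])

def rmse(est):
    return np.sqrt(np.mean((est - theta) ** 2))

print(f"RMSE, scoring with current (drifted) parameters: {rmse(theta_current):.3f}")
print(f"RMSE, scoring with outdated bank parameters:     {rmse(theta_outdated):.3f}")

# Test information at theta = 0 under the 2PL: I(theta) = sum of a^2 * p * (1 - p).
p0_old, p0_new = p_2pl(0.0, a, b), p_2pl(0.0, a, b_drifted)
print(f"TIF at theta = 0, original bank:  {np.sum(a**2 * p0_old * (1 - p0_old)):.1f}")
print(f"TIF at theta = 0, drifted items:  {np.sum(a**2 * p0_new * (1 - p0_new)):.1f}")
```

Because the sketch administers the full bank to every examinee rather than selecting items adaptively, it only demonstrates the scoring-error mechanism; the article's CAT design additionally lets drifted items affect item selection and the information accumulated over repeated measurement occasions.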

References

  • Abad, F.J., Olea, J., Aguado, D., Ponsoda, V., & Barrada, J.R. (2010). Deterioro de parámetros de los ítems en tests adaptativos informatizados: estudio con eCAT [Item parameter drift in computerized adaptive testing: Study with eCAT]. Psicothema, 22, 340-347.
  • Aksu Dünya, B. (2017). Item parameter drift in computer adaptive testing due to lack of content knowledge within sub-populations [Doctoral dissertation, University of Illinois].
  • Babcock, B., & Albano, A.D. (2012). Rasch scale stability in the presence of item parameter and trait drift. Applied Psychological Measurement, 36(7), 565-580. https://doi.org/10.1177/0146621612455090
  • Babcock, B., & Weiss, D.J. (2012). Termination criteria in computerized adaptive tests: Do variable-length CATs provide efficient and effective measurement? International Association for Computerized Adaptive Testing, 1, 1-18. http://dx.doi.org/10.7333/jcat.v1i1.16
  • Barrada, J.R., Olea, J., Ponsoda, V., & Abad, F.J. (2010). A method for the comparison of item selection rules in computerized adaptive testing. Applied Psychological Measurement, 34, 438-452. https://doi.org/10.1177/0146621610370152
  • Bergstrom, B.A., Stahl, J., & Netzky, B.A. (2001, April). Factors that influence parameter drift [Conference presentation] American Educational Research Association, Seattle, WA.
  • Blais, J. & Raiche, G. (2002, April). Features of the sampling distribution of the ability estimate in computerized adaptive testing according to two stopping rules, International Objective Measurement Workshop, New Orleans.
  • Bock, R.D., Muraki, E., & Pfeiffenberger, W. (1988). Item pool maintenance in the presence of item parameter drift. Journal of Educational Measurement, 25(4), 275-285. https://doi.org/10.1111/j.1745-3984.1988.tb00308.x
  • Burton, A., Altman, D.G., Royston, P., & Holder, R.L. (2006). The design of simulation studies in medical statistics. Statistics in Medicine, 25, 4279-4292. https://doi.org/10.1002/sim.2673
  • Chan, K.Y., Drasgow, F., & Sawin, L.L. (1999). What is the shelf life of a test? The effect of time on the psychometrics of a cognitive ability test battery. Journal of Applied Psychology, 84(4), 610-619. https://doi.org/10.1037/0021-9010.84.4.610
  • Chang, H., & Ying, Z. (1999). A-stratified multistage computerized adaptive testing. Applied Psychological Measurement, 23, 211-222. https://doi.org/10.1177/01466219922031338
  • Chang, S.W., & Ansley, T.N. (2003). A comparative study of item exposure control methods in computerized adaptive testing. Journal of Educational Measurement, 40, 71-103. https://www.jstor.org/stable/1435055
  • Chen, S.Y., Ankenmann, R.D., & Chang, H.H. (2000). A comparison of item selection rules at the early stages of computerized adaptive testing. Applied Psychological Measurement, 24, 241-255. https://doi.org/10.1177/01466210022031705
  • Chen, Q. (2013). Remove or keep: Linking items showing item parameter drift [Unpublished Doctoral Dissertation]. Michigan State University.
  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Lawrence Erlbaum.
  • Çıkrıkçı-Demirtaşlı, N. (1999). Psikometride yeni ufuklar: Bilgisayar ortamında bireye uyarlanmış test [New horizons in psychometrics: Individualized test in computer environment]. Türk Psikoloji Bülteni, 5(13), 31-36.
  • Deng, H., Ansley, T., & Chang, H. (2010). Stratified and maximum information item selection procedures in computer adaptive testing. Journal of Educational Measurement, 47(2), 202-226. https://doi.org/10.1111/j.1745-3984.2010.00109.x
  • Deng, H., & Melican, G. (2010, April). An investigation of scale drift in computer adaptive test [Conference presentation] Annual Meeting of National Council on Measurement in Education, San Diego, CA.
  • Donoghue, J.R., & Isham, S.P. (1998). A comparison of procedures to detect item parameter drift. Applied Psychological Measurement, 22(1), 33-51. https://doi.org/10.1177/01466216980221002
  • Eggen, T.J.H.M. (1999). Item selection in adaptive testing with the sequential probability ratio test. Applied Psychological Measurement, 23(3), 249-261. https://doi.org/10.1177/01466219922031365
  • Eroğlu, M.G. (2013). Bireyselleştirilmiş bilgisayarlı test uygulamalarında farklı sonlandırma kurallarının ölçme kesinliği ve test uzunluğu açısından karşılaştırılması [Comparison of different test termination rules in terms of measurement precision and test length in computerized adaptive testing] [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Evans, J.J. (2010). Comparability of examinee proficiency scores on Computer Adaptive Tests using real and simulated data [Unpublished Doctoral dissertation]. The State University of New Jersey.
  • Filho, N.H., Machado, W.L., & Damasio, B.F. (2014). Effects of statistical models and items difficulties on making trait-level inferences: A simulation study. Psicologia Reflexão e Crítica, 27(4). https://doi.org/10.1590/1678-7153.201427407
  • Goldstein, H. (1983). Measuring changes in educational attainment over time: Problems and possibilities. Journal of Educational Measurement, 20(4), 369-377. https://doi.org/10.1111/j.1745-3984.1983.tb00214.x
  • Guo, F., & Wang, L. (2003, April). Online calibration and scale stability of a CAT program [Conference presentation] The Annual Meeting of the National Council on Measurement in Education, Chicago, IL.
  • Hagge, S., Woo, A., & Dickison, P. (2011, October). Impact of item drift on candidate ability estimation [Conference presentation] The Annual Conference of the International Association for Computerized Adaptive Testing, Pacific Grove, CA.
  • Han, K.T., & Guo, F. (2011). Potential impact of item parameter drift due to practice and curriculum change on item calibration in computerized adaptive testing (R-11-02). Graduate Management Admission Council Research Report.
  • Hatfield, J.P., & Nhouyvanisvong, A. (2005, April). Parameter drift in a high-stakes computer adaptive licensure examination: An analysis of anchor items [Conference presentation] The Annual Meeting of the American Educational Research Association, Montreal, Canada.
  • Huang, C., & Shyu, C. (2003, April). The impact of item parameter drift on equating [Conference presentation] National Council on Measurement in Education, Chicago, IL.
  • Jiang, G., Tay, L., & Drasgow, F. (2009). Conspiracies and test compromise: An evaluation of the resistance of test systems to small-scale cheating. International Journal of Testing, 9(4), 283-309. https://doi.org/10.1080/15305050903351901
  • Jones, P.E., & Smith, R.W. (2006, April) Item parameter drift in certification exams and its impact on pass-fail decision making [Conference presentation] National Council of Measurement in Education, San Francisco, CA.
  • Kalender, İ. (2011). Effects of different computerized adaptive testing strategies on recovery of ability [Unpublished Doctoral Dissertation] Middle East Technical University.
  • Kaptan, F. (1993). Yetenek kestiriminde adaptive (bireyselleştirilmiş) test uygulaması ile geleneksel kağıt-kalem testi uygulamasının karşılaştırılması [Comparison of adaptive (individualized) test application and traditional paper-pencil test application in ability estimation] [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Keller, A.L. (2000). Ability estimation procedures in computerized adaptive testing (Technical Report). American Institute of Certified Public Accountants-AICPA Research Consortium-Examination Teams.
  • Kezer, F. (2013). Bilgisayar ortamında bireye uyarlanmış test stratejilerinin karşılaştırılması [Comparison of computerized adaptive testing strategies] [Unpublished Doctoral Dissertation]. Ankara University.
  • Kim, S.H., & Cohen, A.S. (1992). Effects of linking methods on detection of DIF. Journal of Educational Measurement, 29, 51–66. https://www.jstor.org/stable/1434776
  • Kingsbury, G.G., & Wise, S.L. (2011). Creating a K-12 adaptive test: Examining the stability of item parameter estimates and measurement scales. Journal of Applied Testing Technology, 12.
  • Kingsbury, G.G., & Zara, A.R. (2009). Procedures for selecting items for computerized adaptive tests. Applied Measurement in Education, 2(4), 359-375. https://doi.org/10.1207/s15324818ame0204_6
  • Köse, İ.A. & Başaran, İ. (2021). 2 parametreli lojistik modelde normal dağılım ihlalinin madde parametre kestirimine etkisinin incelenmesi [Investigation of the effect of different ability distributions on item parameter estimation under the two-parameter logistic model]. Journal of Digital Measurement and Evaluation Research, 1(1), 1-21. https://doi.org/10.29329/dmer.2021.285.1
  • Li, X. (2008). An investigation of the item parameter drift in the examination for the certificate of proficiency in English (ECPE). Spaan Fellow Working Papers in Second or Foreign Language Assessment, 6, 1–28.
  • Linda, T. (1996, April). A comparison of the traditional maximum information method and the global information method in CAT item selection [Conference presentation] National Council on Measurement in Education, New York.
  • van der Linden, W.J., & Glas, C.A.W. (2002). Computerized adaptive testing: Theory and practice. Kluwer Academic Publishers.
  • Lord, F.M. (1980). Applications of item response theory to practical testing problems. Lawrence Erlbaum Associates Publishers.
  • McCoy, K.M. (2009). The impact of item parameter drift on examinee ability measures in a computer adaptive environment [Unpublished Doctoral Dissertation]. University of Illinois.
  • McDonald, P.L. (2002). Computer adaptive test for measuring personality factors using item response theory [Unpublished Doctoral Dissertation]. The University of Western Ontario.
  • Meng, H., Steinkamp, S., & Matthews-Lopez, J. (2010). An investigation of item parameter drift in computer adaptive testing [Conference presentation] The Annual Meeting of the National Council on Measurement in Education, Denver, CO.
  • Meyers, J., Miller, G.E., & Way, W.D. (2009, April). Item position and item difficulty change in an IRT based common item equating design [Conference presentation] The American Educational Research Association, San Francisco, CA.
  • Nydick, S.W. (2015). An R package for simulating IRT-based computerized adaptive tests.
  • Patton, J.M., Cheng, Y., Yuan, K.H., & Diao, Q. (2013). The influence of item calibration error on variable-length computerized adaptive testing. Applied Psychological Measurement, 37(1), 24-40. https://doi.org/10.1177/0146621612461727
  • Ranganathan, K., & Foster, I. (2003). Simulation studies of computation and data scheduling algorithms for data grids. Journal of Grid Computing, 1, 53-62. https://doi.org/10.1023/A:1024035627870
  • Reckase, M.D. (2011). Computerized adaptive assessment (CAA): The way forward. In The road ahead for state assessments, policy analysis for California education and Rennie Center for Education Research & Policy (pp.1-11). Rennie Center for Education Research & Policy.
  • Risk, N.M. (2015). The impact of item parameter drift in computer adaptive testing (CAT) [Unpublished Doctoral Dissertation]. University of Illinois.
  • Rudner, L.M., & Guo, F. (2011). Computer adaptive testing for small scale programs and instructional systems. Graduate Management Council (GMAC), 11(01), 6-10.
  • Rupp, A.A., & Zumbo, B.D. (2003). Bias coefficients for lack of invariance in unidimensional IRT models. Vancouver: University of British Columbia.
  • Rupp, A.A., & Zumbo, B.D. (2004). A note on how to quantify and report whether item parameter invariance holds: When Pearson correlations are not enough. Educational and Psychological Measurement, 64, 588-599. https://doi.org/10.1177/0013164403261051
  • Schulz, W., & Fraillon, J. (2009, September). The analysis of measurement equivalence in international studies using the Rasch model [Conference presentation] The European Conference on Educational Research (ECER), Vienna.
  • Scullard, M.G. (2007). Application of item response theory based computerized adaptive testing to the strong interest inventory [Unpublished Doctoral Dissertation]. University of Minnesota.
  • Segall, D.O. (2004). Computerized adaptive testing. In K. Kempf-Leonard (Ed.), The Encyclopedia of social measurement. Academic Press.
  • Song, T., & Arce-Ferrer, A. (2009, April). Comparing IPD detection approaches in common-item nonequivalent group equating design [Conference presentation] The Annual Meeting of the National Council on Measurement, San Diego, CA.
  • Stahl, J.A., & Muckle, T. (2007, April). Investigating displacement in the Winsteps Rasch calibration application [Conference presentation] The Annual Meeting of the American Educational Research Association, Chicago, IL.
  • Sulak, S. (2013). Bireyselleştirilmiş bilgisayarlı test uygulamalarında kullanılan madde seçme yöntemlerinin karşılaştırılması [Comparision of item selection methods in computerized adaptive testing] [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Svetina, D., Crawford, A.V., Levy, R., Green, S.B., Scott, L., Thompson, M., Gorin, J.S., Fay, D., & Kunze, K.L. (2013). Designing small-scale tests: A simulation study of parameter recovery with the 1-PL. Psychological Test and Assessment Modeling, 55(4), 335-360.
  • Şahin, A. (2012). Madde tepki kuramında test uzunluğu ve örneklem büyüklüğünün model veri uyumu, madde parametreleri ve standart hata değerlerine etkisinin incelenmesi [An investigation on the effects of test length and sample size in item response theory on model-data fit, item parameters and standard error values] [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Veldkamp, B.P., & van der Linden, W.J. (2006). Designing item pools for computerized adaptive testing. In Designing Item Pools (pp. 149-166). University of Twente.
  • Wainer, H. (1993). Some practical considerations when converting a linearly administered test to an adaptive format. Educational Measurement: Issues and Practice, 12(1), 15–20. https://doi.org/10.1111/j.1745-3992.1993.tb00519.x
  • Wainer, H., Dorans, N.J., Eignor, D., Flaugher, R., Green, B.F., Mislevy, R.J., Steinberg, L., & Thissen, D. (2010). Computerized adaptive testing: A primer. Lawrence Erlbaum Associates Publishers.
  • Wang, T. (1997, March). Essential unbiased EAP estimates in computerized adaptive testing [Conference presentation] The American Educational Association, Chicago, IL.
  • Wang, H-P., Kuo, B-C., Tsai, Y-H., & Liao, C-H. (2012). A CEFR-based computerized adaptive testing system for Chinese proficiency. TOJET: The Turkish Journal of Educational Technology, 11(4), 1-12.
  • Weiss, D.J., & Kingsbury, G.G. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21(4), 361-375. http://www.jstor.org/stable/1434587
  • Wells, C.S., Subkoviak, M.J., & Serlin, R.C. (2002). The effect of item parameter drift on examinee ability estimates. Applied Psychological Measurement, 26(1), 77-87. https://doi.org/10.1177/0146621602261005
  • Wise, S.L., & Kingsbury, G.G. (2000). Practical issues in developing and maintaining a computerized adaptive testing program. Psicológica, 21(2000), 135-155.
  • Witt, E.A., Stahl, J.A., Bergstrom, B.A., & Muckle, T. (2003, April). Impact of item drift with nonnormal distributions [Conference presentation] The Annual Meeting of the American Educational Research Association, Chicago, IL.
  • Wollack, J.A., Sung, H.J., & Kang, T. (2005). Longitudinal effects of item parameter drift [Conference presentation] The Annual Meeting of the National Council on Measurement in Education, Montreal, Canada.
  • Yao, L. (2013). Comparing the performance of five multidimensional CAT selection procedures with different stopping rules. Applied Psychological Measurement, 37(1), 3-23. https://doi.org/10.1177/0146621612455687
  • Yi, Q., Wang, T., & Ban, J.C. (2001). Effects of scale transformation and test termination rule on the precision of ability estimation in computerized adaptive testing. Journal of Educational Measurement, 38, 267-292. https://doi.org/10.1111/j.1745-3984.2001.tb01127.x


Details

Primary Language: English
Subjects: Field Education
Section: Articles
Authors

Merve Şahin Kürşad 0000-0002-6591-0705

Ömay Çokluk-Bökeoğlu 0000-0002-3879-9204

Nükhet Çıkrıkçı 0000-0001-8853-4733

Early View Date: August 31, 2022
Publication Date: September 30, 2022
Submission Date: February 9, 2022
Published in Issue: Year 2022, Volume: 9, Issue: 3

How to Cite

APA: Şahin Kürşad, M., Çokluk-Bökeoğlu, Ö., & Çıkrıkçı, N. (2022). The study of the effect of item parameter drift on ability estimation obtained from adaptive testing under different conditions. International Journal of Assessment Tools in Education, 9(3), 654-681. https://doi.org/10.21449/ijate.1070848
