Investigating the Effect of Content Balancing on Multidimensional Computerized Adaptive Testing Methods Based on the Between-Item Dimensionality Model

Burhanettin Özdemir; Selahattin Gelbal

doi:10.21031/epod.03278

Year 2015, Volume: 6 Issue: 2, 0 - 0, 02.01.2016

Abstract

The aim of this study is to compare the performance of multidimensional computerized adaptive testing (CAT) methods based on the between-item dimensionality model and to examine the effect of content balancing on these methods. To this end, a real data set from the English Proficiency Exam (İngilizce Yeterlik Sınavı, İYS) administered by Hacettepe University in the 2009-2013 academic years was used. A three-dimensional real item pool was built from the listening, grammar, and reading comprehension items in each test. The item pool, calibrated with the between-item dimensionality model, contains 555 items in total. To identify the most suitable multidimensional CAT design, two ability estimation methods (Bayesian MAP and Fisher scoring), three item selection methods (A-optimality, D-optimality, and random selection), and three criteria based on an error-variance stopping rule were used. In addition, to examine the effect of content balancing on multidimensional CAT methods, results from conditions with and without content balancing were compared. The multidimensional CAT results for each condition were compared in terms of the reliability coefficients for the dimensions, the standard error of measurement, and RMSD values. The analyses showed that Bayesian MAP and Fisher scoring yielded similar results when A-optimality item selection was used. Fisher scoring, however, appears to be affected by both the item selection method and content balancing. Moreover, when content balancing was applied, the average number of items administered increased in every condition and the reliability coefficients decreased, yet RMSD values and standard errors decreased as well.
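To make the item selection criteria named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a two-parameter logistic item model in which each item loads on exactly one dimension (between-item dimensionality), so each item adds information to a single diagonal entry of the information matrix. The function names, the dictionary layout of an item, and the simple "largest gap to target share" content balancing rule are all hypothetical choices made for illustration. Ability re-estimation (MAP or Fisher scoring) is omitted, and the covariance matrix stands in for the inverse of the accumulated information.

```python
import math


def item_weight(item, theta):
    """Fisher information a^2 * P * (1 - P) of a 2PL item on its own dimension."""
    z = item["a"] * (theta[item["dim"]] - item["b"])
    p = 1.0 / (1.0 + math.exp(-z))
    return item["a"] ** 2 * p * (1.0 - p)


def updated_cov(cov, dim, w):
    """Covariance after adding information w on one dimension.

    Sherman-Morrison rank-one update; assumes cov is symmetric.
    """
    col = [row[dim] for row in cov]
    denom = 1.0 + w * cov[dim][dim]
    n = len(cov)
    return [[cov[r][c] - w * col[r] * col[c] / denom for c in range(n)]
            for r in range(n)]


def eligible_dims(counts, targets):
    """Dimensions whose administered share lags its target share the most
    (an assumed, simplified content balancing rule)."""
    total = sum(counts)
    gaps = [targets[d] - (counts[d] / total if total else 0.0)
            for d in range(len(targets))]
    largest = max(gaps)
    return {d for d, gap in enumerate(gaps) if gap == largest}


def select_item(pool, used, theta, cov, criterion, counts=None, targets=None):
    """Return the index of the best unused item, or None if none qualifies."""
    dims = eligible_dims(counts, targets) if targets else None
    best, best_score = None, None
    for idx, item in enumerate(pool):
        if idx in used or (dims is not None and item["dim"] not in dims):
            continue
        w = item_weight(item, theta)
        if criterion == "D":
            # D-optimality: det(cov) shrinks by 1 / (1 + w * cov[d][d]),
            # so the largest w * cov[d][d] gives the smallest determinant.
            score = -w * cov[item["dim"]][item["dim"]]
        else:
            # A-optimality: minimize the trace of the updated covariance.
            new = updated_cov(cov, item["dim"], w)
            score = sum(new[k][k] for k in range(len(new)))
        if best_score is None or score < best_score:
            best, best_score = idx, score
    return best


def should_stop(cov, max_error_variance):
    """Error-variance stopping rule: every dimension is precise enough."""
    return all(cov[d][d] <= max_error_variance for d in range(len(cov)))


# Illustrative values only: three items, one per dimension, correlated prior.
pool = [{"a": 1.0, "b": 0.0, "dim": 0},
        {"a": 2.0, "b": 0.0, "dim": 1},
        {"a": 2.0, "b": 3.0, "dim": 2}]
prior_cov = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
first = select_item(pool, set(), [0.0, 0.0, 0.0], prior_cov, "A")
```

Because each item touches a single dimension, the rank-one update avoids inverting the information matrix at every step. With content balancing switched on (`counts` and `targets` supplied), the candidate set is restricted before the optimality criterion is applied, which is one plausible reason a balanced test can need more items to reach the same error variance.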

There are 37 citations in total.

Details

Journal Section	Articles
Authors	Burhanettin Özdemir, Selahattin Gelbal
Publication Date	January 2, 2016
Published in Issue	Year 2015 Volume: 6 Issue: 2

Cite

APA	Özdemir, B., & Gelbal, S. (2016). İçerik Ağırlıklandırmasının Maddeler-Arası Boyutluluk Modeline Dayalı Çok Boyutlu Bilgisayar Ortamında Bireyselleştirilmiş Test Yöntemleri Üzerindeki Etkisinin İncelenmesi. Journal of Measurement and Evaluation in Education and Psychology, 6(2). https://doi.org/10.21031/epod.03278
