Kategori Sayısının Psikometrik Özellikler Üzerine Etkisinin Mokken Homojenlik Modeli’ne Göre İncelenmesi

Asiye Şengül Avşar

doi:10.21031/epod.357160

Research Article

Year 2018, Volume: 9 Issue: 1, 49 - 63, 31.03.2018

Cited By: 1

Abstract

The aim of this research is to determine, with a nonparametric item response theory (NIRT) model, the effect of the number of categories on psychometric properties in tests composed of polytomously scored items. For this purpose, items with three different numbers of categories (three, five, and seven) were generated by simulation at two test lengths (10 and 30 items) for samples of two sizes (100 and 500) with different distributional characteristics (normal, right-skewed, and left-skewed). The effect of the number of categories on psychometric properties was investigated with the Mokken Homogeneity Model (MHM), one of the NIRT models. The study was designed as basic research. R Studio 3.4.0 was used to generate and analyze the data, and the MHM analyses were run in R Studio with the Mokken package. The MHM scaling showed no particular pattern in item fit to the MHM as the number of categories changed. In general, the number of categories was not observed to affect the estimated reliability values in either short or long tests. Under the test conditions specified in the study, the tests showed a low level of fit to the MHM.

Keywords

polytomously scored items, number of categories, nonparametric item response theory

Investigation of the Effects of the Number of Categories on Psychometric Properties According to Mokken Homogeneity Model

Abstract

The aim of this research was to examine, using a nonparametric item response theory (NIRT) model, how the number of response categories affects the psychometric properties of tests composed of polytomous items. To this end, data sets were simulated for two sample sizes (100 and 500), three distribution shapes (normal, positively skewed, and negatively skewed), two test lengths (10 and 30 items), and three numbers of categories (three, five, and seven). The effect of the number of categories on the psychometric properties of the polytomous items was analyzed with the Mokken Homogeneity Model (MHM), one of the NIRT models. The study was designed as basic research. Data sets were generated and analyzed in R Studio 3.4.0, and the MHM analyses were carried out with the Mokken package. Scaling with the MHM revealed no specific pattern in item fit to the MHM as the number of categories changed. In general, the number of categories had no effect on the reliability estimates. Under the test conditions studied, the tests showed weak fit to the MHM.
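
The simulation design described in the abstract can be laid out as a grid of conditions. The sketch below is illustrative only: it assumes the four factors were fully crossed, which the abstract does not state explicitly, and it uses Python rather than the R environment used in the study. The variable and function names are hypothetical.

```python
from itertools import product

# Factor levels as listed in the abstract; the names are illustrative.
SAMPLE_SIZES = (100, 500)
DISTRIBUTIONS = ("normal", "positively skewed", "negatively skewed")
TEST_LENGTHS = (10, 30)
CATEGORY_COUNTS = (3, 5, 7)


def design_conditions():
    """Return every combination of the four design factors as a dict.

    Assumption: the factors are fully crossed.
    """
    return [
        {"n": n, "distribution": d, "items": k, "categories": c}
        for n, d, k, c in product(
            SAMPLE_SIZES, DISTRIBUTIONS, TEST_LENGTHS, CATEGORY_COUNTS
        )
    ]


conditions = design_conditions()
print(len(conditions))  # 2 * 3 * 2 * 3 = 36 conditions under full crossing
```

Under this assumption, each of the 36 cells would correspond to one generated data set (or one set of replications) to be scaled with the MHM.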

Keywords

polytomous items, number of categories, nonparametric item response theory, Mokken homogeneity model

There are 43 citations in total.

Details

Primary Language	Turkish
Journal Section	Articles
Authors	Asiye Şengül Avşar 0000-0001-5522-2514
Publication Date	March 31, 2018
Acceptance Date	January 29, 2018
Published in Issue	Year 2018 Volume: 9 Issue: 1

Cite

APA	Şengül Avşar, A. (2018). Kategori Sayısının Psikometrik Özellikler Üzerine Etkisinin Mokken Homojenlik Modeli’ne Göre İncelenmesi. Journal of Measurement and Evaluation in Education and Psychology, 9(1), 49-63. https://doi.org/10.21031/epod.357160

Cited By

An Analysis of Parameter Invariance according to Different Sample Sizes and Dimensions in Parametric and Nonparametric Item Response Theory

Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi

https://doi.org/10.21031/epod.584977
