Research Article

The Number of Response Categories and the Reverse Directional Item Problem in Likert-Type Scales: A Study with the Rasch Model

Year 2017, Volume: 8 Issue: 3, 321 - 343, 30.09.2017
https://doi.org/10.21031/epod.321057

Abstract

This study addressed two problems in Likert-type scales: reverse directional items and the number of response categories. The Fear of Negative Evaluation Scale (FNES) and the Oxford Happiness Questionnaire (OHQ) were used as data collection tools, and the data were analyzed with the Rasch model. For the straightforward directional items, the observed and expected test characteristic curves largely overlapped, each of the three rating scales worked effectively, and the participants successfully distinguished between the response categories. For the reverse directional items, by contrast, there were significant differences between the observed and expected test characteristic curves. Whichever of the three-, five- and seven-point rating scales was used, the participants could not distinguish the response categories of the reverse directional items on the FNES and the OHQ. The reverse directional items were then removed from the data file, and the analysis was repeated. The results showed that item discrimination, the reliability coefficient for the person facet, and the separation ratios and chi-square values calculated for the person and item facets were higher with five-point rating than with three- or seven-point rating.
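To make the notion of "distinguishing response categories" concrete, the following is a minimal sketch of category probabilities under the Andrich rating scale formulation of the Rasch model, together with conventional reverse scoring. It is an illustration only, not the authors' analysis: the function names, the person and item values, and the threshold values are all hypothetical assumptions chosen for the example.

```python
import math


def rsm_category_probs(theta, delta, thresholds):
    """Category probabilities under the Andrich rating scale model.

    theta: person measure in logits (hypothetical value in the example below)
    delta: item difficulty in logits (hypothetical)
    thresholds: step thresholds tau_1..tau_m for a scale with m + 1 categories
    """
    # Log-numerator of category k is the sum over j <= k of (theta - delta - tau_j);
    # the lowest category has log-numerator 0 by convention.
    log_num = [0.0]
    for tau in thresholds:
        log_num.append(log_num[-1] + (theta - delta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_num)
    weights = [math.exp(v - peak) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]


def reverse_score(response, n_categories):
    """Reverse-code a response given on a 1..n_categories scale."""
    return n_categories + 1 - response


if __name__ == "__main__":
    # Illustrative thresholds only; they are NOT estimates from the study.
    example_thresholds = {
        3: [-1.0, 1.0],
        5: [-2.0, -0.7, 0.7, 2.0],
        7: [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5],
    }
    for n_cat, taus in example_thresholds.items():
        probs = rsm_category_probs(theta=0.5, delta=0.0, thresholds=taus)
        print(n_cat, [round(p, 3) for p in probs])
```

When the thresholds are ordered and well spaced, each category is the most probable response over some range of the person measure. Categories that respondents cannot tell apart tend to show up as disordered or tightly bunched thresholds.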



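Likert Tipi Ölçeklerde Olumsuz Madde ve Kategori Sayısı Sorunu: Rasch Modeli ile Bir İnceleme (The Negative Item and Number of Categories Problem in Likert-Type Scales: An Examination with the Rasch Model)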


Abstract



This study aimed to address the problems of negative items and the number of categories in Likert-type scales. The Fear of Negative Evaluation Scale (FNES) and the Oxford Happiness Questionnaire (OHQ) were used as data collection tools. The data collected for the study were analyzed according to the Rasch model. The analysis showed that, for the positive items on the FNES and the OHQ, the observed and expected test characteristic curves largely overlapped, all three numbers of categories worked effectively, and the participants successfully distinguished the differences between the scale categories. For the negative items, on the other hand, there were significant differences between the observed and expected test characteristic curves. Whichever of the three-, five- and seven-point ratings was used, the participants could not distinguish the categories of the negative items on the FNES and the OHQ. Following this finding, the negative items were removed from the data file and the analysis was repeated. The results showed that item discrimination, the reliability coefficient for the person facet, and the separation ratios and chi-square values calculated for the person and item facets were higher with five-point rating than with three- and seven-point rating. These findings indicate that, whichever of the three-, five- or seven-point ratings is used, respondents cannot distinguish the scale categories of negative items, and that negative items do not measure the same latent construct as positive items.
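The separation ratio and reliability coefficient mentioned in the abstract are linked by a standard Rasch measurement identity: with separation G defined as the "true" standard deviation divided by the root mean square error, reliability equals G² / (1 + G²). The sketch below illustrates that relationship under stated assumptions. The function name and the input values are hypothetical, the population variance (dividing by n) is an arbitrary choice for the example, and nothing here reproduces the study's output.

```python
import math


def separation_and_reliability(measures, standard_errors):
    """Return (separation ratio, reliability) for one facet.

    measures: estimated measures in logits (hypothetical data in the example)
    standard_errors: standard error of each measure
    """
    n = len(measures)
    mean = sum(measures) / n
    # Observed variance of the measures (population form, an assumption here).
    observed_var = sum((m - mean) ** 2 for m in measures) / n
    # Mean square measurement error.
    mse = sum(se ** 2 for se in standard_errors) / n
    # "True" variance is observed variance adjusted for error, floored at zero.
    true_var = max(observed_var - mse, 0.0)
    separation = math.sqrt(true_var / mse)
    reliability = true_var / (true_var + mse)
    return separation, reliability


if __name__ == "__main__":
    # Made-up person measures and standard errors, for illustration only.
    g, r = separation_and_reliability(
        [-1.5, -0.5, 0.0, 0.5, 1.5], [0.4, 0.4, 0.4, 0.4, 0.4]
    )
    print(round(g, 3), round(r, 3))
```

Because reliability is a monotone function of separation, a rating format that yields a higher separation ratio for a facet also yields a higher reliability for it, which is consistent with the two indices moving together in the reported results.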


References

  • Adelson, J.L., & McCoach, D.B. (2010). Measuring the mathematical attitudes of elementary students: The effects of a 4-point or 5-point Likert-type scale. Educational and Psychological Measurement, 70(5), 796-807. http://dx.doi.org/10.1177/0013164410366694 Ahlawat, K.S. (1985). On the negative valence items in self-report measures. The Journal of General Psychology, 112(1), 89-99. http://dx.doi.org/10.1080/00221309.1985.9710992
  • Aiken, L.R. (1983). Number of response categories and statistics on a teacher rating scale. Educational and Psychological Measurement, 43(2), 397-401. http://dx.doi.org/10.1177/001316448304300209
  • Bachman, J.G., & O’Malley, P.M. (1984). Yea-saying, nay-saying, and going to extremes: Black-white differences in response styles. The Public Opinion Quarterly, 48(2), 491-509. http://dx.doi.org/10.1086/268845 Baker, F.B. (2001). The basics of item response theory. ERIC Clearinghouse on Assessment and Evaluation, University of Maryland, College Park, MD.
  • Barnette, J.J. (1999, April). Likert Response Alternative Direction: SA to SD or SD to SA: Does It Make a Difference? Paper presented at the Annual Meeting of the American Educational Research Association, Montreal, Quebec, Canada. Retrieved from http://eric.ed.gov/?id=ED429125
  • Benson, J., & Hocevar, D. (1985). The impact of item phrasing on the validity of attitude scales for elementary school children. Journal of Educational Measurement, 22(3), 231–240. http://dx.doi.org/10.1111/j.1745-3984.1985.tb01061.x
  • Bergstrom, B.A., & Lunz, M.E. (1998, April). Rating scale analysis: Gauging the impact of positively and negatively worded items. Paper presented at the Annual Meeting of the American Educational Research Association. San Diego, CA. Retrieved from http://files.eric.ed.gov/fulltext/ED423289.pdf
  • Birkett, N.J. (1986). Selecting the number of response categories for a Likert-type scale. Retrieved from http://www.amstat.org/sections/srms/Proceedings/papers/1986_091.pdf
  • Bolin, B.L., & Dodder, R.A. (1990). The affect balance scale in an American college population. The Journal of Social Psychology, 130(6), 839-40. http://dx.doi.org/10.1080/00224545.1990.9924639
  • Büyüköztürk, Ş. (2005). Anket geliştirme. Türk Eğitim Bilimleri Dergisi, 3(2), 133-151. Retrieved from http://www.tebd.gazi.edu.tr/index.php/tebd/article/view/315/297
  • Cicchetti, D.V., Showalter, D., & Tyrer, P.J. (1985). The effect of number of rating scale categories on levels of inter-rater reliability: A Monte-Carlo investigation. Applied Psychological Measurement, 9(1), 31-36. http://dx.doi.org/10.1177/014662168500900103
  • Chamberlain, V.M., & Cummings, M.N. (1984). Development of an instructor/course evaluation instrument. College Student Journal, 18(3), 246-250.
  • Chang, L. (1994). A psychometric evaluation of 4-point and 6-point Likert-type scales in relation to reliability and validity. Applied Psychological Measurement, 18(3), 205-215. http://dx.doi.org/10.1177/014662169401800302
  • Chiorri, C., Anselmi, P., & Robusto, E. (2009). Reverse items are not opposites of straightforward items. In U. Savardi (Ed.), The perception and cognition of contraries (pp. 295-328). Milano: McGraw-Hill.
  • Comrey, A.L., & Montang, I. (1982). Comparison of factor analytic results with two choice and seven choice personality item formats. Applied Psychological Measurement, 6(3), 285-289. http://dx.doi.org/10.1177/014662168200600304 Conrad, K.J., Wright, B.D., McKnight, P., McFall, M., Fontana A., & Rosenheck, R. (2004). Comparing traditional and Rasch analyses of the Mississippi PTSD scale: Revealing limitations of reverse-scored items. Journal of Applied Measurement, 5(1), 15-30. Retrieved from https://www.academia.edu/2832927/Comparing_traditional_and_Rasch_analyses_of_the_Mississippi_PTSD_scale_Revealing_limitations_of_reverse-scored_items Cronbach, L.J. (1950). Further evidence on response sets and test design. Educational and Psychological Measurement, 10(1), 3-31. http://dx.doi.org/10.1177/001316445001000101
  • Çetin, B., Doğan, T., & Sapmaz, F. (2010). Olumsuz değerlendirilme korkusu ölçeği kısa formu’nun Türkçe uyarlaması: Geçerlik ve güvenirlik çalışması. Eğitim ve Bilim, 35(156), 205-216.
  • Daher, A.M., Ahmad, S.H., Winn, T., & Selamat, M.I. (2015). Impact of rating scale categories on reliability and fit statistics of the Malay spiritual well-being scale using Rasch analysis. Malaysian Journal of Medical Sciences, 22(3), 48-55. Retrieved from http://www.bioline.org.br/pdf?mj15032
  • Dawes, J. (2007). Do data characteristics change according to the number of scale points used? An experiment using 5-point, 7-point and 10-point scales. International Journal of Market Research, 50(1), 61-77. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.417.9488&rep=rep1&type=pdf DeVellis, R.F. (2003). Scale development: Theory and applications. Newbury Park: Sage Publications.
  • Doğan, T., & Akıncı Çötok, N. (2011). Oxford mutluluk ölçeği kısa formunun Türkçe uyarlaması: Geçerlik ve güvenirlik çalışması. Türk Psikolojik Danışma ve Rehberlik Dergisi, 4(36), 165-172. Retrieved from http://dergipark.ulakbim.gov.tr/tpdrd/article/view/1058000176/1058000178
  • Erkuş, A. (2003). Psikometri üzerine yazılar. Ankara: Türk Psikologlar Derneği Yazıları. Erkuş, A. (2012). Psikolojide ölçme ve ölçek geliştirme-I. Ankara: Pegem Akademi Yayıncılık.
  • Fabiola, G.B., Iwin, L., Jennifer, L.M., & Zaira, V.V. (2012). The effect of the number of answer choices on the psychometric properties of stress measurement in an ınstrument applied to children. Evaluar, 12 43-59. Retrieved from https://revistas.unc.edu.ar/index.php/revaluar/article/download/4694/4488
  • Green, S.B., Akey, T.M., Fleming, K.K., Hershberger, S.L., & Marquis, J.G. (1997). Effect of the number of scale points on chi‐square fit indices in confirmatory factor analysis, Structural Equation Modeling: A Multidisciplinary Journal, 4(2), 108-120, http://dx.doi.org/10.1080/10705519709540064
  • Halpin, G., Halpin, G., & Arbet, S. (1994). Effects of number and type of response choices on internal consistency reliability. Perceptual and Motor Skills, 79(2), 928-930. http://dx.doi.org/10.2466/pms.1994.79.2.928
  • Herche, J., & Engelland, B. (1996). Reversed-Polarity İtems and scale unidimensionality. Journal of the Academy of Marketing Science, 24(4), 366-374. http://dx.doi.org/10.1177/0092070396244007 Hofstede, G. (1998). Masculinity and femininity: The taboo dimension of national cultures. Thousand Oaks, CA: Sage.
  • Hooper, M., Arora, A., Martin, M.O., & Mullis, I.V.S, (2013, June). Examining the behavior of “reverse directional” items in the TIMSS 2011 context questionnaire scales. Paper Presented at the 5th IEA International Research Conference. National Institute of Education, Nanyang Technological University, Singapore. Retrieved from http://www.iea.nl/fileadmin/user_upload/IRC/IRC_2013/Papers/IRC-2013_Hooper_etal.pdf
  • Hui, C.H., & Triandis, H.C. (1989). Effects of culture and response format on extreme response style. Journal of Cross-Cultural Psychology, 20(3), 296-309. http://dx.doi.org/10.1177/0022022189203004
  • Ibrahim, A.M. (2001). Differential responding to positive and negative items: The case of a negative item in a questionnaire for course and faculty evaluation. Psychological Reports, 88(2), 497-500. http://dx.doi.org/10.2466/pr0.2001.88.2.497 Jacoby, J., & Matell, M.S. (1971). Three-point likert scales are good enough. Journal of Marketing Research, 8, 495-500. Retrieved from https://www.jstor.org/stable/pdf/3150242.pdf?_=1472027712885 Jenkins, G.D., & Taber, T.D. (1977). A Monte-Carlo study of factors a€ecting three indices of composite scale reliability. Journal of Applied Psychology, 62(4), 392-398. http://dx.doi.org/10.1037/0021-9010.62.4.392 Johnson, T., Kulesa, P., Cho, Y.I., & Shavitt, S. (2005). The relation between culture and response styles evidence from 19 countries. Journal of Cross-Cultural Psychology, 36(2), 264-277. http://dx.doi.org/10.1177/0022022104272905 Kelloway, E.K., Catano, V.M., & Southwell, R.R. (1992). The construct validity of union commitment: Development and dimensionality of a shorter scale. Journal of Occupational and Organizational Psychology 65(3), 197-211. http://dx.doi.org/10.1111/j.2044-8325.1992.tb00498.x Kim, K.H. (1998). An analysis of optimum number of response categories for korean consumers. Journal of Global Academy of Marketing Science, 1(1), 61-86. http://dx.doi.org/10.1080/12297119.1998.9707386
  • King, L.A., King, D., & Klockars, A.J. (1983). Dichotomous and multipoint scales using bipolar adjectives. Applied Psychological Measurement, 7(2), 173-180. http://dx.doi.org/10.1177/014662168300700205 Knoch, U., &, McNamara, T. (2015). Rasch analysis. In L. Plonsky, (Ed.), Advancing quantitative methods in second language research (pp. 275–304). New York, NY: Routledge.
  • Lai, J.C.L. (1994). Differential predictive power of the positively versus the negatively worded items of the life orientation test. Psychological Repors, 75(3), 1507-1515. http://dx.doi.org/10.2466/pr0.1994.75.3f.1507
  • Lee, J.W., Jones, P.S., Mineyama, Y., & Zhang, X.E. (2002). Cultural differences in responses to a Likert scale. Research in Nursing & Health, 2002, 25, 295-306. http://dx.doi.org/10.1002/nur.10041
  • Leung, S. (2011). A comparison of psychometric properties and normality in 4-, 5-, 6-, and 11-point Likert scales. Journal of Social Service Research, 37(4), 412-421. http://dx.doi.org/10.1080/01488376.2011.580697
  • Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 22, 2-55.
  • Linacre, J. M. (2014). A user’s guide to FACETS Rasch-model computer programs. Retrieved from http://www.winsteps.com/a/facets-manual.pdf
  • Lissitz, R.W., & Green, S.B. (1975). Effects of the number of scale points on reliability: A Monte Carlo approach. Journal of Applied Psychology, 60(1), 10-13. http://dx.doi.org/10.1037/h0076268 Locker, D., Jokovic, A., & Allison, P. (2013). Direction of wording and responses to items in oral health-related quality of life questionnaires for children and their parents. Community Dent Oral Epidemiol 35(4), 255-262. http://dx.doi.org/10.1111/j.1600-0528.2007.00320.x
  • Lozano, L.M:, García-Cueto, E., & Muñiz, J. (2008). Effect of the number of response categories on the reliability and validity of rating scales. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 4(2), 73-79. http://dx.doi.org/10.1027/1614-2241.4.2.73
  • Matell, M. S., & Jacoby, J. (1971). Is there an optimal number of alternatives for Likert scale items? Study I: Reliability and validity. Educational and Psychological Measurement, 31(3), 657-674. http://dx.doi.org/10.1177/001316447103100307 Maydeu-Olivares A., Kramp U., García-Forero C., Gallardo-Pujol, D., Coffman, D. (2009). The effect of varying the number of response alternatives in rating scales: Experimental evidence from intra-individual effects. Behavior Research Methods, 41(2), 295-308. http://dx.doi.org/10.3758/BRM.41.2.295 McInerney, V., McInerney, D., & Roche, L. (1994, July). Definitely not just another computer anxiety instrument: The development and validation of CALM: Computer anxiety and learning measure. Paper presented at the Annual Stress and Anxiety Research Conference, Madrid, Spain. Retrieved from http://files.eric.ed.gov/fulltext/ED386161.pdf
  • Oaster, T. R. F. (1989). Number of alternatives per choice point and stability of Likert-type scales. Perceptual and Motor Skills, 68(2), 549-550. http://dx.doi.org/10.2466/pms.1989.68.2.549
  • Østerås, N., Gulbrandsen, P., Garratt, A., Benth, J.S., Dahl, F.A., Natvig, B., & Brage, S. (2008). A randomised comparison of a four- and a five-point scale version of the Norwegian function assessment scale. Health and Quality of Life Outcomes, 6(14), 1-9. http://dx.doi.org/10.1186/1477-7525-6-14
  • Pilotte, W.J., & Gable, R.K. (1990). The impact of positive and negative item stems on the validity of a computer anxiety scale. Educational and Psychological Measurement, 50(3), 603-610. http://dx.doi.org/10.1177/0013164490503016
  • Preston, C.C., & Colman, A.M. (2000). Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences. Acta Psychologica, 104(1), 1-15. http://dx.doi.org/10.1016/S0001-6918(99)00050-5
  • Ramsay, J. O. (1973). The effect of number of categories in rating scales on precision of estimation of scale values. Psychometrika, 38(4), 513-533. http://dx.doi.org/10.1007/BF02291492
  • Ray, J. (1980). How many answer categories should attitude and personality scales use? South African Journal of Psychology, 10, 53-54. Retrieved from http://jonjayray.tripod.com/howmany.html
  • Rodebaugh, T.L., Woods, C.M., Thissen, D.M., Heimberg, R.G., Chambless, D.L., & Rapee, R.M. (2004). More information from fewer questions: The factor structure and item properties of the original and brief Fear of Negative Evaluation Scale. Psychological Assessment, 16, 169-181. http://dx.doi.org/10.1037/1040-3590.16.2.169
  • Roszkowski, M.J., & Soven, M. (2010). Shifting gears: Consequences of including two negatively worded items in the middle of a positively worded questionnaire. Assessment & Evaluation in Higher Education, 35(1), 113-130. http://dx.doi.org/10.1080/02602930802618344
  • Qasem, M., Almoshigah, T., & Gupta, S. (2014). The effect of number of alternatives on validity and reliability in Likert scale. International Journal of Innovative Research & Studies, 3(6), 324-333. http://dx.doi.org/10.13140/2.1.2237.2803
  • Schriesheim, C.A., & Hill, K.D. (1981). Controlling acquiescence response bias by item reversals: The effect on questionnaire validity. Educational and Psychological Measurement, 41(4), 1101-1114. http://dx.doi.org/10.1177/001316448104100420
  • Spector, P.E., van Katwyk, P.T., Brannick, M.T., & Chen, P.Y. (1997). When two factors don’t reflect two constructs: How item characteristics can produce artifactual factors. Journal of Management, 23(5), 659-677. http://dx.doi.org/10.1016/S0149-2063(97)90020-9
  • Stening, B.W., & Everett, J.E. (1984). Response styles in a cross-cultural managerial study. Journal of Social Psychology, 122(2), 151-156. http://dx.doi.org/10.1080/00224545.1984.9713475
  • Sudweeks, R.R., Reeve, S., & Bradshaw, W.S. (2005). A comparison of generalizability theory and many-facet Rasch measurement in an analysis of college sophomore writing. Assessing Writing, 9(3), 239-261. http://dx.doi.org/10.1016/j.asw.2004.11.001
  • Swain, S.D., Weathers, D., & Niedrich, R.W. (2008). Assessing three sources of misresponse to reversed Likert items. Journal of Marketing Research, 45, 116-131. Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=990097
  • Şeker, H., & Gençdoğan, B. (2006). Psikolojide ve eğitimde ölçme aracı geliştirme. Ankara: Nobel Yayın Dağıtım.
  • Tarka, P. (2015). Likert scale and change in range of response categories vs. the factors extraction in EFA model. Folia Oeconomica, 1(311), 27-36. http://dx.doi.org/10.18778/0208-6018.311.04
  • Taşdelen Teker, G., Güler, N., & Kaya Uyanık, G. (2015). Comparing the effectiveness of SPSS and EduG using different designs for generalizability theory. Educational Sciences: Theory & Practice, 15(3), 635-645. http://dx.doi.org/10.12738/estp.2015.3.2278
  • Tavşancıl, E. (2010). Tutumların ölçülmesi ve SPSS ile veri analizi. Ankara: Nobel Yayın Dağıtım.
  • Tekindal, S. (2009). Duyuşsal özelliklerin ölçülmesi için araç oluşturma. Ankara: Pegem Akademi Yayıncılık.
  • Tezbaşaran, A. (1997). Likert tipi ölçek hazırlama kılavuzu. Ankara: Türk Psikologlar Derneği.
  • Turan, İ., Şimşek, Ü., & Aslan, H. (2015). Eğitim araştırmalarında Likert ölçeği ve Likert tipi soruların kullanımı ve analizi. Sakarya Üniversitesi Eğitim Fakültesi Dergisi, (30), 186-203. Retrieved from http://dergipark.ulakbim.gov.tr/sakaefd/article/view/5000143504
  • Van Sonderen, E., Sanderman, R., & Coyne, J.C. (2013). Ineffectiveness of reverse wording of questionnaire items: Let’s learn from cows in the rain. PLoS ONE, 8(7), 1-7. http://dx.doi.org/10.1371/journal.pone.0068967
  • Weems, G.H., Onwuegbuzie, A.J., & Lustig, D. (2003). Profiles of respondents who respond inconsistently to positively- and negatively- worded items on rating scales. Evaluation & Research in Education, 17(1), 45-60. http://dx.doi.org/10.1080/14664200308668290
  • Weng, L.J. (2004). Impact of the number of response categories and anchor labels on coefficient alpha and test-retest reliability. Educational and Psychological Measurement, 64(6), 956-972. http://dx.doi.org/10.1177/0013164404268674
  • Wong, C.S., Peng, K.Z., Shi, J., & Mao, Y. (2011). Differences between odd number and even number response formats: Evidence from mainland Chinese respondents. Asia Pacific Journal of Management, 28(2), 379-399. http://dx.doi.org/10.1007/s10490-009-9143-6
  • Wyatt, R.C., & Meyers, L.S. (1987). Psychometric properties of four 5-point Likert type response scales. Educational and Psychological Measurement, 47(1), 27-35. http://dx.doi.org/10.1177/0013164487471003
  • Zhang, X., Noor, R., & Savalei, V. (2016). Examining the effect of reverse worded items on the factor structure of the need for cognition scale. PLoS ONE, 11(6), 1-15. http://dx.doi.org/10.1371/journal.pone.0157795
There are 54 citations in total.

Details

Journal Section: Articles
Authors

Mustafa İlhan

Neşe Güler

Publication Date: September 30, 2017
Acceptance Date: September 14, 2017
Published in Issue: Year 2017, Volume 8, Issue 3

Cite

APA İlhan, M., & Güler, N. (2017). Likert Tipi Ölçeklerde Olumsuz Madde ve Kategori Sayısı Sorunu: Rasch Modeli ile Bir İnceleme. Journal of Measurement and Evaluation in Education and Psychology, 8(3), 321-343. https://doi.org/10.21031/epod.321057