Research Article

Higher Education End-of-Course Evaluations: Assessing the Psychometric Properties Utilizing Exploratory Factor Analysis and Rasch Modeling Approaches

Year 2016, Volume: 3, Issue: 1, 3-22, 11.07.2016

Abstract

This paper offers a critical assessment of the psychometric properties of a standard higher education end-of-course evaluation. Using both exploratory factor analysis (EFA) and Rasch modeling, the authors conduct (a) an overall assessment of dimensionality using EFA, (b) a secondary assessment of dimensionality using a principal components analysis (PCA) of the residuals when the items are fit to the Rasch model, and (c) an assessment of item-level properties using the item-level statistics produced by the Rasch fit. The results support using the scale as a supplement to high-stakes decision making such as tenure. However, the imprecise targeting of item difficulty to person ability, combined with the low person separation index, renders rank-ordering professors by minuscule differences in overall subscale scores a highly questionable practice.
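
The three analyses outlined above can be sketched in miniature. The code below is not the authors' procedure: their Rasch analyses relied on dedicated software (the Winsteps user guide is cited in the references), whereas this sketch runs on simulated Likert-type data, uses the third-party factor_analyzer package for the EFA step, and substitutes a crude dichotomized logit approximation for a full rating-scale Rasch model (Andrich, 1978), purely to illustrate the logic of steps (a) through (c).

    # Illustrative sketch only: simulated data and simplified estimation, not the
    # authors' analysis. Assumes numpy, scipy, and factor_analyzer are installed.
    import numpy as np
    from factor_analyzer import FactorAnalyzer
    from scipy.special import expit  # logistic function

    rng = np.random.default_rng(42)

    # Hypothetical data: 500 students rating 12 evaluation items on a 1-5 scale,
    # driven by a single latent "teaching quality" dimension.
    latent = rng.normal(size=(500, 1))
    loadings = rng.uniform(0.5, 0.9, size=(1, 12))
    continuous = latent @ loadings + rng.normal(scale=0.6, size=(500, 12))
    X = np.clip(np.round(2 * continuous + 3), 1, 5)

    # (a) EFA dimensionality check: eigenvalues greater than one (Kaiser, 1960)
    # plus a scree inspection (Cattell, 1966) suggest how many factors to retain.
    fa = FactorAnalyzer(rotation=None)
    fa.fit(X)
    eigenvalues, _ = fa.get_eigenvalues()
    print("Eigenvalues:", np.round(eigenvalues[:5], 2))
    print("Factors with eigenvalue > 1:", int((eigenvalues > 1).sum()))

    # (b) PCA of Rasch residuals. A real analysis fits a rating-scale Rasch model;
    # here we dichotomize at the scale midpoint and use rough logit estimates of
    # person ability and item difficulty.
    D = (X >= 4).astype(float)
    p_person = D.mean(axis=1).clip(0.01, 0.99)
    p_item = D.mean(axis=0).clip(0.01, 0.99)
    theta = np.log(p_person / (1 - p_person))   # person ability (logits)
    beta = -np.log(p_item / (1 - p_item))       # item difficulty (logits)

    expected = expit(theta[:, None] - beta[None, :])
    resid = (D - expected) / np.sqrt(expected * (1 - expected))

    # First eigenvalue ("contrast") of the residual correlations; values well
    # above ~2 hint at a secondary dimension (Raîche, 2005; Linacre, 2014a).
    contrasts = np.linalg.eigvalsh(np.corrcoef(resid.T))[::-1]
    print("Largest residual contrast:", round(float(contrasts[0]), 2))

    # (c) Person separation (Linacre, 2014b): how many statistically distinct
    # ability strata the instrument can resolve.
    info = (expected * (1 - expected)).sum(axis=1)  # test information per person
    reliability = max(theta.var() - (1 / info).mean(), 0) / theta.var()
    print("Person separation:", round(float(np.sqrt(reliability / (1 - reliability))), 2))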

References

  • American Association of University Professors, Committee C on College and University Teaching, Research, and Publication. (1974). Statement on teaching evaluation. AAUP Bulletin, 60, 168-170.
  • Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.
  • Bangert, A. W. (2006). The development of an instrument for assessing online teaching effectiveness. Journal of Educational Computing Research, 35, 227-244.
  • Barth, M. M. (2008). Deciphering student evaluations of teaching: A factor analysis approach. Journal of Education for Business, 84, 40-46.
  • Beran, T., Violato, C., & Kline, D. (2007). What’s the “use” of student ratings of instruction for administrators? One university’s experience. Canadian Journal of Higher Education, 17(1), 27-43.
  • Brown, A., & Green, T. (2003). Showing up to class in pajamas (or less): The fantasies and realities of on-line professional development courses for teachers. The Clearing House, 76, 148-151.
  • Calkins, S., & Micari, M. (2010). Less-than-perfect judges: Evaluating student evaluations. Thought & Action: The NEA Higher Education Journal, 26, 7-22.
  • Campbell, J. P., & Bozeman, W.C. (2008). The value of student ratings: Perceptions of students, teachers, and administrators. Community College Journal of Research and Practice, 32, 13-24.
  • Cashin, W. E. (1995). Student ratings of teaching: The research revisited (IDEA Paper No. 32). Retrieved from the IDEA Center website: http://www.theideacenter.org/sites/default/files/Idea_Paper_32.pdf
  • Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 629-637.
  • Cohen, E. H. (2005). Student evaluations of course and teacher: Factor analysis and SSA approaches. Assessment & Evaluation in Higher Education, 30, 123–136.
  • Côté, J. E., & Allahar, A. L. (2007). Ivory tower blues: A university system in crisis. Toronto, Ontario, Canada: University of Toronto Press.
  • DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological Methods, 2, 292-307.
  • Ewing, J. K., & Crockford, B. (2008). Changing the culture of expectations: Building a culture of evidence. The Department Chair, 18(3), 23-25.
  • Flora, D. B., LaBrish, C., & Chalmers, R. P. (2012). Old and new ideas for data screening and assumption testing for exploratory and confirmatory factor analysis. Frontiers in Psychology, 3, 1-21.
  • Franklin, J., & Theall, M. (1989, March). Who reads ratings: Knowledge, attitude, and practice of users of student ratings of instruction. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA.
  • Griffin, A., & Cook, V. (2009). Acting on evaluation: Twelve tips from a national conference on student evaluations. Medical Teacher, 31, 101-104.
  • Guthrie, E. R. (1954). The evaluation of teaching: A progress report. Seattle, WA: University of Washington Press.
  • Hathorn, L., & Hathorn, J. (2010). Evaluation of online course websites: Is teaching online a tug-of-war? Journal of Educational Computing Research, 42, 197-217.
  • Hodges, L.C., & Stanton, K. (2007). Translating comments on student evaluations into the language of learning. Innovative Higher Education, 31, 279-286.
  • Jaeger, R. M. (1977). A word about the issue. Journal of Educational Measurement, 14, 73-74.
  • Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141-151.
  • Kaplan, R. M., & Saccuzzo, D. P. (1997). Psychological testing: Principles, applications, and issues (4th ed.). Pacific Grove, CA: Brooks/Cole.
  • Kelly, H. F., Ponton, M. K., & Rovai, A. P. (2007). A comparison of student evaluations of teaching between online and face-to-face courses. The Internet and Higher Education, 10, 89-101.
  • Kim, K., Liu, S., & Bonk, C. J. (2005). Online MBA students’ perceptions of online learning: Benefits, challenges, and suggestions. The Internet and Higher Education, 8, 335-344.
  • Laube, H., Massoni, K., Sprague, J., & Ferber, A. L. (2007). The impact of gender on the evaluation of teaching: What we know and what we can do. National Women’s Studies Association Journal, 19(3), 87-104.
  • Linacre, J. M. (2002). What do infit and outfit, mean square and standardized mean? Rasch Measurement Transactions, 16, 878.
  • Linacre, J. M. (2014a). Dimensionality: Contrasts and variances. In A user’s guide to Winsteps Ministep Rasch-model computer programs (version 3.81.0). Retrieved from http://www.winsteps.com/winman/principalcomponents.htm
  • Linacre, J. M. (2014b). Reliability and separation of measures. In A user’s guide to Winsteps Ministep Rasch-model computer programs (version 3.81.0). Retrieved from http://www.winsteps.com/winman/reliability.htm
  • Marsh, H. W., & Dunkin, M. J. (1997). Students’ evaluations of university teaching: A multidimensional perspective. In R. P. Perry & J. C. Smart (Eds.), Effective teaching in higher education: Research and practice (pp. 241-320). New York, NY: Agathon Press.
  • Morgan, D. A., Sneed, J., & Swinney, L. (2003). Are student evaluations a valid measure of teaching effectiveness: Perceptions of accounting faculty members and administrators. Management Research News, 26(7), 17-32.
  • Mortelmans, D., & Spooren, P. (2009). A revalidation of the SET37-questionnaire for student evaluations of teaching. Educational Studies, 35, 547–552.
  • Ory, J. C. (2001). Faculty thoughts and concerns about student ratings. New Directions for Teaching and Learning, 87, 3-15. doi: 10.1002/tl.23
  • Osborne, J. W., Costello, A. B., & Kellow, J. T. (2008). Best practices in exploratory factor analysis. In J. W. Osborne (Ed.), Best practices in quantitative methods (pp. 86-102). Thousand Oaks, CA: Sage.
  • Otani, K., Kim, B. J., & Cho, J. (2012). Student evaluation of teaching (SET) in higher education: How to use SET more effectively and efficiently in public affairs education. Journal of Public Affairs Education, 18, 531-544.
  • Raîche, G. (2005). Critical eigenvalue sizes (variances) in standardized residual principal components analysis (PCA). Rasch Measurement Transactions, 19, 1012.
  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen, Denmark: Danish Institute for Educational Research.
  • Sick, J. (2009). Rasch measurement in language education part 3: The family of Rasch models. Shiken: JALT Testing and Evaluation SIG Newsletter, 13(1), 4-10.
  • Spooren, P. (2010). On the credibility of the judge. A cross-classified multilevel analysis on student evaluations of teaching. Studies in Educational Evaluation, 36, 121–131.
  • Spooren, P., Brockx, B., & Mortelmans, D. (2013). On the validity of student evaluation of teaching: The state of the art. Review of Educational Research, 83(4), 598-642.
  • Thorne, G. L. (1980). Student ratings of instructors: From scores to administrative decisions. Journal of Higher Education, 51, 207-214.
  • Toland, M., & De Ayala, R. J. (2005). A multilevel factor analysis of students’ evaluations of teaching. Educational and Psychological Measurement, 65, 272–296.
  • Van Der Ven, A. H. G. S. (1980). Introduction to scaling. New York, NY: Wiley.
  • Wachtel, H. K. (1998). Student evaluation of college teaching effectiveness: A brief review. Assessment & Evaluation in Higher Education, 23, 191-212.
  • Warner, R. M. (2008). Applied statistics: From bivariate through multivariate techniques. Thousand Oaks, CA: Sage.
  • Wattiaux, M. A., Moore, J. A., Rastani, R. R., & Crump, P. M. (2010). Excellence in teaching for promotion and tenure in animal and dairy sciences at doctoral/research universities: A faculty perspective. Journal of Dairy Science, 93, 3365-3376.
  • West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables: Problems and remedies. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 56-75). Thousand Oaks, CA: Sage.
  • Wolfer, T. A., & Johnson, M. M. (2003). Re-evaluating student evaluation of teaching: The teaching evaluation form. Journal of Social Work Education, 39, 111-121.
  • Wright, B. D., & Linacre, J. M. (1994). Reasonable mean-square fit values. Rasch Measurement Transactions, 8, 370.

Details

Primary Language English
Subjects Studies on Education
Other ID JA42YP24SP
Journal Section Articles
Authors

Kelly D. Bradley

Eric M. Snyder

Angela K. Tombari

Publication Date July 11, 2016
Submission Date July 11, 2016
Published in Issue Year 2016 Volume: 3 Issue: 1

Cite

APA Bradley, K. D., Snyder, E. M., & Tombari, A. K. (2016). Higher Education End-of-Course Evaluations: Assessing the Psychometric Properties Utilizing Exploratory Factor Analysis and Rasch Modeling Approaches. International Journal of Assessment Tools in Education, 3(1), 3-22.
