Research Article


The Discrete Option Multiple Choice Items as A Measurement Instrument for Mathematics Achievement

Year 2024, Volume: 8, Issue: 2, 317-348, 31.07.2024

Abstract

This study examines the applicability of discrete option multiple choice [DOMC] items in secondary school mathematics. The test comprised 25 questions: 10 traditional multiple-choice items and 15 DOMC items. Data were collected from 725 secondary school students during the second term of the 2020-2021 academic year. Of these students, 491 (68%) were in 7th grade and 234 (32%) were in 8th grade; 391 (54%) were female and 334 (46%) were male. Analyses based on classical test theory [CTT] revealed large differences between the two item types, especially at high scores, whereas analyses based on item response theory [IRT] showed that estimates of students' ability levels were unaffected by item format, thereby reducing the errors that can arise at extreme values. This suggests that DOMC items would not lead to substantial differences in students' total scores when parameter estimation is carried out with IRT rather than CTT. Additionally, some of the traditional multiple choice [TMC] items used in the study were chosen so that each could be rendered as two or more questions in the DOMC format, allowing the applicability of different question types in the DOMC format to be tested.
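Although not part of the article, a minimal sketch may make the CTT-versus-IRT contrast above concrete. The Python snippet below is purely illustrative: the two-parameter logistic (2PL) model stands in for whatever IRT model the authors used, and all item parameters and response patterns are invented. It shows how two examinees with the same CTT sum score can receive different IRT ability estimates once item properties enter the likelihood.

```python
# Minimal sketch (not from the article): contrasts a CTT sum score with a
# 2PL IRT ability estimate. Item parameters and responses are invented.
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical 2PL item parameters: discrimination a, difficulty b.
a = np.array([1.2, 0.8, 1.5, 1.0, 0.9])
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])

def p_correct(theta):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def neg_log_likelihood(theta, responses):
    """Negative log-likelihood of a 0/1 response vector at theta."""
    p = p_correct(theta)
    return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))

def estimate_theta(responses):
    """Maximum-likelihood ability estimate for one response vector."""
    result = minimize_scalar(neg_log_likelihood, bounds=(-4, 4),
                             args=(responses,), method="bounded")
    return result.x

# Two examinees with the same CTT sum score (3 of 5) but different patterns:
# one solves the easy items, the other the hard ones.
easy_pattern = np.array([1, 1, 1, 0, 0])
hard_pattern = np.array([0, 0, 1, 1, 1])

for name, resp in [("easy items solved", easy_pattern),
                   ("hard items solved", hard_pattern)]:
    print(f"{name}: sum score = {resp.sum()}, "
          f"IRT theta = {estimate_theta(resp):.2f}")
```

Under CTT both patterns earn the same total of 3, whereas the 2PL likelihood weighs which items were solved, so the maximum-likelihood theta values diverge; this is the sense in which IRT-based ability estimates can behave differently from raw totals across item formats.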

References

  • Adediwura, A. A., Ajayi, O. S., & Ibikunle, Y. A. (2021). Students and test variables as predictors of undergraduates’ self-compassion. Journal of Research & Method in Education, 11(3), 42-48.
  • Alnasraween, M. S., Alsmadi, M. S., Al-zboon, H. S., & Alkurshe, T. O. (2022). The level of universities students’ test wiseness in Jordan during distance learning in light of some variables. Education Research International, 1-10.
  • Anastasi, A. (1988). Psychological testing (6th ed.). Macmillan.
  • Bailey, C. D., Briggs, J. W., & Irving, J. H. (2022). Test-wiseness and multiple-choice accounting questions: Implications for instructors. Issues in Accounting Education, 37(2), 1–14.
  • Baker, D. L., & Baker, R. L. (2022). Knowledge and wisdom: High stakes testing and learning outcomes. In Neuroethical policy design (Studies in Brain and Mind) (pp. 101-118). Springer.
  • Baker, F. B. (2001). The basics of item response theory (ED458219). ERIC. https://eric.ed.gov/?id=ED458219
  • Baker, F. B., & Kim, S. H. (Eds.). (2004). Item response theory: Parameter estimation techniques (2nd ed.). CRC.
  • Bolt, D. M., Kim, N., Wollack, J., Pan, Y., Eckerly, C., & Sowles, J. (2020). A psychometric model for discrete-option multiple-choice items. Applied Psychological Measurement, 44(1), 33–48.
  • Bolt, D. M., Lee, S., Wollack, J., Eckerly, C., & Sowles, J. (2018). Application of asymmetric IRT modeling to discrete-option multiple-choice test items. Frontiers in Psychology, 9, 1-7.
  • Bolt, D. M., Wollack, J. A., & Suh, Y. (2012). Application of a multidimensional nested logit model to multiple-choice test items. Psychometrika, 77, 263-287.
  • Bowman, M. L. (1989). Testing individual differences in ancient China. American Psychologist, 44(3), 576–578.
  • Burt, C. (1911). Experimental tests of higher mental processes and their relation to general intelligence. Journal of Experimental Pedagogy, 1, 93–112.
  • Burt, C. (1972). Inheritance of general intelligence. American Psychologist, 27(3), 175–190.
  • Caveon. (2020). Technology and internet-based services subscriber agreement. Retrieved October 10, 2023, from https://caveon.com/caveon-privacy-policy-for-students-of-education-subscribers/
  • Cohen, R. J., & Swerdlik, M. E. (2018). Psychological testing and assessment (9th ed.). McGraw-Hill Education.
  • Cronbach, L. (1990). Essentials of psychological testing (5th ed.). Harper & Row.
  • Davis, N. T. (1996). Transition from objectivism to constructivism in science education. International Journal of Science Education, 15(6), 627-636.
  • Eckerly, C., Smith, R. W., & Sowles, J. (2017). Analysis of the discrete option multiple choice item: Examples from IT certification. Paper presented at the Conference on Test Security, Madison, WI.
  • Eckerly, C., Smith, R., & Sowles, J. (2018). Fairness concerns of discrete option multiple-choice items. Practical Assessment, Research & Evaluation, 23(16), 1–10.
  • Erdoğan, İ. (2003). Çağdaş eğitim sistemleri [Contemporary education systems] (5th ed.). Sistem.
  • Fagley, N. S. (1987). Positional response bias in multiple-choice tests of learning: Its relation to test wiseness and guessing strategy. Journal of Educational Psychology, 79(1), 95–97.
  • Forsblom, L., Pekrun, R., Loderer, K., & Peixoto, F. (2022). Cognitive appraisals, achievement emotions, and students’ math achievement: A longitudinal analysis. Journal of Educational Psychology, 114(2), 346–367.
  • Foster, D. F., & Miller, H. L. (2009). A new format for multiple-choice testing: Discrete option multiple-choice. Results from early studies. Psychology Science Quarterly, 51, 355-369.
  • Fraenkel, J. R., Wallen, N. E., & Hyun, H. H. (2012). How to design and evaluate research in education (8th ed.). McGraw Hill.
  • Funk, R., Hooper, T., Hadlock, E., Whicker, J., Estes, D., & Miller, H. L. (2010). Differential effects of the discrete-option multiple-choice format on test takers’ assessment preparation and scores. Poster presented at the Mary Lou Fulton Undergraduate Research Conference, Provo, UT.
  • Gibb, B. G. (1964). Testwiseness as a secondary cue response (Publication No. 6407643) [Doctoral dissertation, Stanford University]. ProQuest Thesis Center.
  • Goodenough, F. L. (1926). Measurement of intelligence by drawings. World Book Company.
  • Gorney, K., & Wollack, J. A. (2022). Does item format affect test security? Practical Assessment, Research, & Evaluation, 27(15), 1-13.
  • Guo, H., Rios, J. A., Ling, G., Wang, Z., Gu, L., Yang, Z., & Liu, L. O. (2022). Influence of selected-response format variants on test characteristics and test-taking effort: An empirical study. ETS Research Report Series, 2022(1), 1–20.
  • Güler, N. (2011). Eğitimde ölçme ve değerlendirme [Measurement and evaluation in education]. Pegem.
  • Haladyna, T. M., & Downing, S. M. (2004). Construct-irrelevant variance in high-stakes testing. Educational Measurement: Issues and Practice, 23(1), 17–27.
  • Holmes, P. (2002). Multiple evaluation versus multiple choice as testing paradigm [Doctoral dissertation, Twente University]. University of Twente Research Information.
  • Janda, L. H. (1997). Psychological testing: Theory and applications (1st ed.). Pearson.
  • Kingston, N. M., Tiemann, G. C., Miller, H. L., & Foster, D. (2012). An analysis of the discrete-option multiple-choice item type. Psychological Test and Assessment Modeling, 54(1), 3–19.
  • Kumandaş, H., & Kutlu, Ö. (2010). High stakes testing: Does secondary education examination involve any risks? Procedia - Social and Behavioral Sciences, 9, 758-764.
  • Lions, S., Monsalve, C., Dartnell, P., Blanco, M. P., Ortega, G., & Lemarié, J. (2022). Does the response options placement provide clues to the correct answers in multiple-choice tests? A systematic review. Applied Measurement in Education, 35(2), 133–152.
  • Lowell, F. (1919). A preliminary report of some group tests of general intelligence. Journal of Educational Psychology, 10(7), 323–344.
  • Papenberg, M. (2018). On how test wiseness and acceptance reluctance influence the validity of sequential knowledge tests [Unpublished doctoral dissertation]. Heinrich-Heine-University Düsseldorf.
  • Papenberg, M., Diedenhofen, B., & Musch, J. (2019). Experimental validation of sequential multiple-choice tests. Journal of Experimental Education, 89(2), 402–421.
  • Papenberg, M., Willing, S., & Musch, J. (2017). Sequentially presented response options prevent the use of testwiseness cues in multiple-choice testing. Psychological Test and Assessment Modeling, 59(2), 245-266.
  • Popham, W. J. (1999). Modern educational measurement: Practical guidelines for educational leaders (3rd ed.). Pearson.
  • Porteus, S. D. (1915). Mental tests for the feebleminded: A new series. Journal of Psycho-Asthenics, 19, 200-213.
  • Rodriguez, M. C. (2005). Three options are optimal for multiple-choice items: A meta-analysis of 80 years of research. Educational Measurement: Issues and Practice, 24(2), 3–13.
  • Rost, D. H., & Sparfeldt, J. R. (2007). Reading comprehension without reading? On the construct validity of multiple-choice reading comprehension test items. Zeitschrift für Pädagogische Psychologie, 21, 305-314.
  • Rotthoff, T., Fahron, U., Baehring, T., & Scherbaum, W. A. (2008). The quality of CME questions as a component part of continuing medical education--an empirical study. Zeitschrift für Ärztliche Fortbildung und Qualität im Gesundheitswesen, 101, 667-674.
  • Samuel, J., & Hinson, J. (2012). Promoting motivation through technology-based testing. In P. Resta (Ed.), Proceedings of the Society for Information Technology & Teacher Education International Conference. Chesapeake, VA: AACE.
  • Taylor, C., & Gardner, P. L. (1999). An alternative method of answering and scoring multiple-choice tests. Research in Science Education, 29, 353–363.
  • Vidal Rodeiro, C., & Macinska, S. (2022). Equal opportunity or unfair advantage? The impact of test accommodations on performance in high-stakes assessments. Assessment in Education: Principles, Policy & Practice, 29(4), 462–481.
  • Wainer, H., Dorans, N. J., Eignor, D. R., Flaugher, R. L., Green, B. F., Mislevy, R. J., Steinberg, L., & Thissen, D. (2015). Computerized adaptive testing: A primer (2nd ed.). Routledge.
  • Willing, S. (2013). Discrete-option multiple-choice: Evaluating the psychometric properties of a new method of knowledge assessment [Unpublished doctoral dissertation]. Heinrich-Heine-University Düsseldorf.
  • Willing, S., Ostapczuk, M., & Musch, J. (2015). Do sequentially presented answer options prevent the use of test wiseness cues on continuing medical education tests? Advances in Health Sciences Education: Theory and Practice, 20(1), 247–263.
  • Woodworth, R. S. (1910). Race differences in mental traits. Science, 31(788), 171-186.
  • Zhai, X., Haudek, K. C., Wilson, C., & Stuhlsatz, M. (2021). A framework of construct-irrelevant variance for contextualized constructed response assessment. Frontiers in Education, 6, 1-13.
There are 53 citations in total.

Details

Primary Language: English
Subjects: Measurement and Evaluation in Education (Other), Mathematics Education
Journal Section: Research Article
Authors

Atilla Özdemir (ORCID: 0000-0003-4775-4435)

Selahattin Gelbal (ORCID: 0000-0001-5181-7262)

Publication Date: July 31, 2024
Submission Date: January 17, 2024
Acceptance Date: June 24, 2024
Published in Issue: Year 2024, Volume: 8, Issue: 2

Cite

APA Özdemir, A., & Gelbal, S. (2024). The Discrete Option Multiple Choice Items as A Measurement Instrument for Mathematics Achievement. Türk Akademik Yayınlar Dergisi (TAY Journal), 8(2), 317-348.
