Analytical Weighting Scoring for Physics Multiple Correct Items to Improve the Accuracy of Students’ Ability Assessment
Abstract
Purpose: This developmental research study aimed to develop a polytomous scoring model based on weighting for multiple-correct items in physics. Weights were assigned analytically according to question complexity, and penalties were imposed on wrong answers. Research Methods: The study followed Fenrich's development cycle, which consists of analysis, planning, design, development, implementation, evaluation, and revision. The multiple-correct items had 3–4 options each. The items were administered to 140 upper secondary school students and 410 first-year undergraduate students. The students’ physics ability was analyzed using the Quest program, and the results of dichotomous and polytomous scoring were compared. Findings: Compared with dichotomous scoring, the analytical weighting scoring based on complexity and a penalty system generated more score categories (three to seven, versus only two), estimated students’ physics abilities more accurately and in greater detail, produced an ability distribution closer to the normal distribution, and yielded a smaller standard deviation. Thus, the analytical weighting scoring for multiple-correct items produced a more accurate estimate of physics ability than dichotomous scoring. Implications for Research and Practice: It is recommended that large-scale assessments of physics ability using multiple-correct items apply analytical weighting scoring based on content complexity and a penalty system.
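To illustrate the contrast between the two scoring approaches, the following is a minimal sketch only. The abstract does not state the actual scoring formula, so the weights, the penalty value, the floor at zero, and the function names `dichotomous_score` and `weighted_score` are all hypothetical assumptions made for illustration.

```python
def dichotomous_score(selected, correct):
    """Score 1 only when the selected set matches the key exactly, else 0."""
    return 1 if set(selected) == set(correct) else 0


def weighted_score(selected, weights, penalty):
    """Sum the weights of chosen correct options, minus a penalty per wrong choice.

    `weights` maps each correct option to a hypothetical complexity weight;
    any selected option that is absent from `weights` counts as wrong.
    The result is floored at 0 (an assumption, not stated in the abstract).
    """
    chosen = set(selected)
    gained = sum(w for option, w in weights.items() if option in chosen)
    wrong = len(chosen - set(weights))
    return max(0, gained - penalty * wrong)


# Hypothetical item with options A-D, where A and C are correct
# and C is assumed to be the more complex option.
weights = {"A": 1, "C": 2}
responses = [{"A", "C"}, {"A"}, {"C"}, {"A", "B", "C"}, {"B"}]

for response in responses:
    print(
        sorted(response),
        dichotomous_score(response, weights),
        weighted_score(response, weights, penalty=1),
    )
```

Under these assumed weights, the single item separates responses into four score categories (0 to 3), whereas dichotomous scoring gives credit only for the fully correct response.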
Keywords
References
- Adams, R. J., & Khoo, S. T. (1996). Quest (Computer software). The interactive test analysis system. Victoria: ACER.
- Adeyemo, S. A. (2010). Students’ ability level and their competence in problem-solving task in physics. International Journal of Educational Research and Technology, 1(2), 35-47.
- Ali, S. H., Carr, P. A., & Ruit, K. G. (2016). Validity and reliability of scores obtained on multiple-choice questions: Why functioning distractors matter. Journal of the Scholarship of Teaching and Learning, 16(1), 1-14.
- Baghaei, P., & Dourakhshan, A. (2016). Properties of single-response and double-response multiple-choice grammar items. International Journal of Language Testing, 6(1), 33-49.
- Baker, J. G., Rounds, J. B., & Zeron, M. A. (2000). A comparison of graded response and Rasch partial credit models with subjective well-being. Journal of Educational and Behavioral Statistics, 25(3), 253-270.
- Bishara, A. J., & Lanzo, L. A. (2015). All of the above: When multiple correct response options enhance the testing effect. Memory, 23(7), 1013-1028.
- Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Mahwah: Lawrence Erlbaum Associates, Publishers.
- Bush, M. (2015). Reducing the need for guesswork in multiple-choice tests. Assessment & Evaluation in Higher Education, 40(2), 218-231.
Details
Primary Language
English
Subjects
-
Journal Section
Research Article
Authors
Wasis
Kumaidi
Bastari
Mundilarto
Atik Wintarti
Publication Date
July 31, 2018
Submission Date
July 31, 2018
Acceptance Date
-
Published in Issue
Year 2018 Volume: 18 Number: 76