This study applies vertical scaling based on item response theory (IRT) and
compares the vertical scaling results obtained with different proficiency
estimation methods and calibration methods. The
vertical scales thus developed were assessed with reference to the criteria of
grade-to-grade growth, grade-to-grade variability, and the separation of grade
distributions. The data come from a total of 1500 students attending twelve
primary schools in the province of Ankara, with the schools differing in their
levels of socio-economic and cultural development. The
comparison of the findings for the first and second sub-problems shows that
the mean differences obtained with separate calibration were lower than those
obtained with concurrent calibration, and the standard deviations were likewise
lower. The effect sizes under separate calibration were also lower than those
under concurrent calibration. For all three criteria, the results under
concurrent calibration were ranked ML < MAP < EAP (maximum likelihood, maximum
a posteriori, and expected a posteriori estimation, respectively), with ML
yielding the lowest values and EAP the highest. Under separate calibration, by
contrast, the ranking varied depending on the criterion applied.
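
The three evaluation criteria can be illustrated with a minimal sketch. It assumes that grade-to-grade growth is the difference between adjacent grade means, that variability is the within-grade standard deviation, and that separation is an effect size (the mean difference divided by a pooled standard deviation). This is one common operationalization and may differ from the study's exact formulas. The function name `vertical_scale_criteria`, the grade labels, and the proficiency values are hypothetical.

```python
import math
from statistics import mean, stdev


def vertical_scale_criteria(theta_by_grade):
    """Summarize a vertical scale for each pair of adjacent grades.

    theta_by_grade: dict mapping a grade label to a list of proficiency
    estimates that are already on a common (vertical) scale.
    Assumption: separation is computed as mean difference / pooled SD,
    with an unweighted pooled SD; the study may define it differently.
    """
    grades = sorted(theta_by_grade)
    means = {g: mean(theta_by_grade[g]) for g in grades}
    sds = {g: stdev(theta_by_grade[g]) for g in grades}  # sample SD

    rows = []
    for lower, upper in zip(grades, grades[1:]):
        growth = means[upper] - means[lower]
        pooled_sd = math.sqrt((sds[lower] ** 2 + sds[upper] ** 2) / 2)
        rows.append({
            "grades": (lower, upper),
            "growth": growth,             # grade-to-grade growth
            "sd_lower": sds[lower],       # grade-to-grade variability
            "sd_upper": sds[upper],
            "effect_size": growth / pooled_sd,  # separation of distributions
        })
    return rows


# Made-up proficiency estimates for two hypothetical adjacent grades.
example = {1: [-1.0, 0.0, 1.0], 2: [0.0, 1.0, 2.0]}
for row in vertical_scale_criteria(example):
    print(row)
```

Running the same summary on proficiency estimates from each combination of calibration method and estimator (ML, MAP, EAP) would give the kind of side-by-side comparison the abstract describes.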
Journal Section | Articles |
---|---|
Authors | |
Publication Date | April 3, 2017 |
Acceptance Date | March 3, 2017 |
Published in Issue | Year 2017 Volume: 8 Issue: 1 |