Defining Cut Point for Kullback-Leibler Divergence to Detect Answer Copying

Arzu Uçar; Celal Doğan

doi:10.21449/ijate.864078

Abstract

Distance learning became widespread across the world during the COVID-19 pandemic, and this has led to answer copying behavior among examinees. In this study, the cut point of the Kullback-Leibler Divergence (KL) method, one of the copy detection methods, was calculated using the Youden Index, Cost-Benefit, and Min Score p-value approaches. Using the cut point obtained, individuals were classified as copiers or non-copiers, and the detection power of the KL method was examined under conditions with sample sizes of 1000 and 3000, a test length of 40 items, copier rates of 0.05 and 0.15, and copying percentages of 0.1, 0.3, and 0.6. When the cut point was obtained with the Min Score p-value approach, the power of the KL index to detect copiers was high under all conditions. The Youden Index approach gave the second-highest detection power, again under all conditions. With the cut point from the Cost-Benefit approach, the power of the KL method decreased as the sample size and the copiers' rate increased.
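As a rough illustration of two ingredients named in the abstract, the sketch below computes a Kullback-Leibler divergence between two discrete distributions and picks a cut point by maximizing the Youden Index, J = sensitivity + specificity - 1. It is a minimal sketch and not the authors' procedure: the function names `kl_divergence` and `youden_cut_point`, the toy index values, and the copier labels are all hypothetical, and the rule "flag when the index is at or above the cut" is an assumption made for the example.

```python
import math


def kl_divergence(p, q):
    """D(p || q) for two discrete distributions on the same support.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def youden_cut_point(scores, labels):
    """Return (cut, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 = copier, 0 = non-copier (hypothetical ground truth).
    An examinee is flagged when score >= cut (assumed decision rule).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Toy index values for eight examinees; made up for illustration only.
scores = [0.02, 0.05, 0.08, 0.11, 0.30, 0.07, 0.45, 0.60]
labels = [0, 0, 0, 0, 1, 0, 1, 1]

cut, j = youden_cut_point(scores, labels)
print(cut, j)
print(kl_divergence([0.5, 0.5], [0.25, 0.75]))
```

In this toy data the copiers and non-copiers are perfectly separated, so the Youden Index reaches its maximum of 1 at the smallest copier score. With overlapping groups the maximum is below 1, and the chosen cut trades sensitivity against specificity with equal weight.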

Details

Primary Language

English

Subjects

Studies on Education

Journal Section

Research Article

Authors

Arzu Uçar*
ORCID: 0000-0002-0099-1348
Türkiye

Celal Doğan
ORCID: 0000-0003-0683-1334
Türkiye

Publication Date

March 15, 2021

Submission Date

September 6, 2020

Acceptance Date

January 16, 2021

Published in Issue

Year: 2021, Volume: 8, Number: 1

DOI

https://doi.org/10.21449/ijate.864078

IZ

https://izlik.org/JA29UA68YA

Cite

APA

Uçar, A., & Doğan, C. (2021). Defining Cut Point for Kullback-Leibler Divergence to Detect Answer Copying. International Journal of Assessment Tools in Education, 8(1), 156-166. https://doi.org/10.21449/ijate.864078