Research Article

Defining Cut Point for Kullback-Leibler Divergence to Detect Answer Copying

Year 2021, Volume: 8 Issue: 1, 156 - 166, 15.03.2021
https://doi.org/10.21449/ijate.864078

Abstract

Distance learning became widespread across the world during the COVID-19 pandemic, and this has led to answer-copying behavior among examinees. In this study, the cut point of the Kullback-Leibler Divergence (KL) method, one of the copy-detection methods, was calculated using the Youden Index, Cost-Benefit, and Min Score p-value approaches. Using the cut point obtained, individuals were classified as copiers or non-copiers, and the detection power of the KL method was examined under conditions with sample sizes of 1000 and 3000, a test length of 40 items, copier rates of 0.05 and 0.15, and copying percentages of 0.1, 0.3, and 0.6. When the cut point was obtained with the Min Score p-value approach, the power of the KL index to detect copiers was high under all conditions. The second approach under which the detection power of the KL method was high in all conditions was the Youden Index. When the cut point was obtained with the Cost-Benefit approach, the power of the KL method decreased as the sample size and the copier rate increased.
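The two ingredients named in the abstract, a KL divergence value per examinee and a cut point chosen by the Youden Index, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy scores, and the copier labels are all hypothetical, and the scores simply stand in for KL values computed elsewhere. The sketch assumes that larger KL values indicate more suspicious response patterns.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def youden_cut_point(scores, labels):
    """Return (cut, J) maximizing J = sensitivity + specificity - 1.

    An examinee is flagged as a copier when score >= cut.
    labels holds 1 for a known (simulated) copier and 0 otherwise.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):  # every observed score is a candidate
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cut and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # keep the smallest cut among ties
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical toy data: KL values for eight examinees, three of them copiers.
scores = [0.02, 0.05, 0.07, 0.10, 0.40, 0.55, 0.08, 0.60]
labels = [0, 0, 0, 0, 1, 1, 0, 1]
cut, j = youden_cut_point(scores, labels)
print(cut, j)
```

In a simulation study the labels are known because copying is generated by design, which is what makes sensitivity and specificity computable at each candidate cut point.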

There are 44 citations in total.

Details

Primary Language English
Subjects Studies on Education
Journal Section Articles
Authors

Arzu Uçar (ORCID: 0000-0002-0099-1348)

Celal Doğan (ORCID: 0000-0003-0683-1334)

Publication Date March 15, 2021
Submission Date September 6, 2020
Published in Issue Year 2021 Volume: 8 Issue: 1

Cite

APA Uçar, A., & Doğan, C. (2021). Defining Cut Point for Kullback-Leibler Divergence to Detect Answer Copying. International Journal of Assessment Tools in Education, 8(1), 156-166. https://doi.org/10.21449/ijate.864078
