Unproctored Computerized Adaptive Testing (CAT) is gaining traction because of its convenience, flexibility, and scalability, particularly in high-stakes assessments. However, the absence of a proctor can give rise to aberrant testing behavior, which can impair the validity of test scores. This paper explores the use of a verification test to detect aberrant testing behavior in unproctored CAT environments. The study uses multiple measures to detect aberrant response patterns in CAT via a paper-and-pencil (P&P) test, and it compares the sensitivity and specificity of the l_z person-fit statistic (PFS) under no-stage and two-stage methods (in the two-stage method, l_z is applied after the Kullback–Leibler divergence (KLD) measure) across different conditions. Three factors were manipulated: the aberrance percentage, the aberrance scenario, and the aberrant examinee’s ability range. In all scenarios, the specificity of l_z in classifying examinees was higher than its sensitivity in both the no-stage and two-stage analyses. However, the sensitivity of l_z was higher in the two-stage analysis than in the no-stage analysis.
Aberrant testing behavior, l_z person-fit statistic, Divergence measure, Unproctored CAT, Verification test.
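For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how the l_z statistic and the sensitivity and specificity of a flagging rule can be computed. It is not the authors' implementation: the two-parameter logistic (2PL) response model, the one-sided cutoff of -1.645, and all function and variable names are assumptions made for illustration, and the KLD stage of the two-stage method is not shown.

```python
import math


def p_2pl(theta, a, b):
    """P(correct) under a 2PL model (an assumed model, not stated in the source)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def lz_statistic(responses, probs):
    """Standardized log-likelihood person-fit statistic l_z.

    responses: 0/1 item scores; probs: model-implied P(correct) per item.
    Large negative values indicate a response pattern that fits the model poorly.
    """
    l0 = expected = variance = 0.0
    for u, p in zip(responses, probs):
        q = 1.0 - p
        l0 += u * math.log(p) + (1 - u) * math.log(q)  # observed log-likelihood
        expected += p * math.log(p) + q * math.log(q)   # its expectation
        variance += p * q * math.log(p / q) ** 2        # its variance
    return (l0 - expected) / math.sqrt(variance)


def sensitivity_specificity(flagged, truly_aberrant):
    """Return (sensitivity, specificity) of a flagging rule against known labels."""
    tp = sum(f and t for f, t in zip(flagged, truly_aberrant))
    fn = sum((not f) and t for f, t in zip(flagged, truly_aberrant))
    tn = sum((not f) and (not t) for f, t in zip(flagged, truly_aberrant))
    fp = sum(f and (not t) for f, t in zip(flagged, truly_aberrant))
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical five-item test taken by an examinee with ability theta = 0.
    difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
    probs = [p_2pl(0.0, 1.0, b) for b in difficulties]
    cutoff = -1.645  # assumed one-sided 5% cutoff under a standard normal reference

    patterns = {
        "model-consistent": [1, 1, 1, 0, 0],  # easy items right, hard items wrong
        "aberrant": [0, 0, 1, 1, 1],          # easy items wrong, hard items right
    }
    for label, pattern in patterns.items():
        lz = lz_statistic(pattern, probs)
        print(label, round(lz, 3), "flagged" if lz < cutoff else "not flagged")
```

In a simulation where the aberrant examinees are known, the flags produced by a cutoff like the one above can be passed to `sensitivity_specificity` together with the true labels to obtain the two classification rates that the abstract compares.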
| Primary Language | English |
|---|---|
| Subjects | Computer Based Exam Applications, Simulation Study |
| Journal Section | Research Article |
| Authors | |
| Submission Date | December 8, 2024 |
| Acceptance Date | June 18, 2025 |
| Early Pub Date | July 21, 2025 |
| Publication Date | September 4, 2025 |
| Published in Issue | Year 2025 Volume: 12 Issue: 3 |