Developing an Automated Tool for Psychometric Evaluation and Exam Quality Indexing of Multiple-Choice Questions (MCQs): Phase I Study

Mohammad A. Elmorsy; Doaa El Morsi

doi:10.62425/jopres.1804954

Research Article

Year 2026, Volume: 4 Issue: 1, 26-37, 28.02.2026

https://izlik.org/JA42AC52HE

Abstract

Accurate educational assessment is critical for evaluating student learning and advancing instructional practices. Despite the robustness of Classical Test Theory (CTT) in assessing exam quality, its statistical complexity often limits its practical application by educators. This study presents the design and implementation of an innovative bilingual software tool (English/Arabic) that automates psychometric analysis of multiple-choice examinations. Developed using Python and leading data science libraries, the tool streamlines the computation of essential metrics such as item difficulty, discrimination indices, KR-20 reliability, and distractor efficiency. These parameters are synthesized into an intuitive Exam Quality Index (EQI), accompanied by automated narrative interpretations and targeted recommendations. This user-friendly application supports educators in enhancing assessment quality, promoting fairness, and fostering evidence-based educational practices.
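The Classical Test Theory metrics named in the abstract can be illustrated with a short NumPy sketch. This is not the authors' tool: the function names, the 27% upper/lower grouping, the 5% threshold for a functional distractor, and the toy data are assumptions chosen for illustration, and the Exam Quality Index is not attempted because the abstract does not give its formula.

```python
import numpy as np

def item_difficulty(scores):
    """Proportion of examinees answering each item correctly (p-value)."""
    return scores.mean(axis=0)

def discrimination_index(scores, fraction=0.27):
    """Upper-lower index D = p_upper - p_lower for each item.

    The 27% group size is a common convention, assumed here.
    """
    totals = scores.sum(axis=1)
    order = np.argsort(totals, kind="stable")  # ascending by total score
    k = max(1, int(round(fraction * scores.shape[0])))
    lower, upper = scores[order[:k]], scores[order[-k:]]
    return upper.mean(axis=0) - lower.mean(axis=0)

def kr20(scores):
    """KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores).

    Population variance (ddof=0) is used so it matches p*q.
    """
    k = scores.shape[1]
    p = scores.mean(axis=0)
    total_var = scores.sum(axis=1).var(ddof=0)
    return (k / (k - 1)) * (1.0 - (p * (1.0 - p)).sum() / total_var)

def distractor_efficiency(responses, key, options="ABCD", threshold=0.05):
    """Fraction of each item's distractors chosen by >= threshold of examinees."""
    n = responses.shape[0]
    result = []
    for j, correct in enumerate(key):
        distractors = [o for o in options if o != correct]
        functional = sum(
            (responses[:, j] == o).sum() / n >= threshold for o in distractors
        )
        result.append(functional / len(distractors))
    return np.array(result)

# Hypothetical toy data: 6 examinees, 3 items, options A-D.
key = ["A", "B", "C"]
responses = np.array([
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "C", "A"],
    ["B", "D", "C"],
    ["C", "A", "B"],
])
# Score matrix: 1 where the response matches the key, else 0.
scores = (responses == np.array(key)).astype(int)

print("difficulty:", item_difficulty(scores))
print("discrimination:", discrimination_index(scores))
print("KR-20:", kr20(scores))
print("distractor efficiency:", distractor_efficiency(responses, key))
```

With so few examinees the values are only meant to show the arithmetic. A real exam would also need guards for edge cases such as zero total-score variance, which makes KR-20 undefined.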

Keywords

Psychometric evaluation, Classical Test Theory, multiple-choice assessments, KR-20 reliability, Python analytics, Exam Quality Index

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (AERA). (2014). Standards for educational and psychological testing. AERA.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7-74. http://dx.doi.org/10.1080/0969595980050102
Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78(1), 98-104. https://psycnet.apa.org/doi/10.1037/0021-9010.78.1.98
Crocker, L., & Algina, J. (2006). Introduction to classical and modern test theory. Holt, Rinehart and Winston.
DiBattista, D., & Kurzawa, L. (2011). Examination of the quality of multiple-choice items on classroom tests. Canadian Journal for the Scholarship of Teaching and Learning, 2(2), Article 4. https://doi.org/10.5206/cjsotl-rcacea.2011.2.4
Elmorsy, M. (2025). Psychometric exam analyzer (1.1). Zenodo. https://doi.org/10.5281/zenodo.18860050
Nitko, A. J., & Brookhart, S. M. (2011). Educational assessment of students (6th ed.). Pearson.
Rezigalla, A. A. (2020). Item analysis: Concepts and application. IntechOpen.
Rezigalla, A. A. (2022). Item analysis: Concept and application. In M. S. Firstenberg & S. P. Stawicki (Eds.), Medical education for the 21st century (pp. 1-16). IntechOpen. https://doi.org/10.5772/intechopen.100138
Salati, A. S., & AlSulaim, S. L. (2024). Item analysis of multiple-choice questions in an undergraduate surgery course - an assessment of an assessment tool. Sanamed, 19(2), 163-171. https://doi.org/10.5937/sanamed0-50691
Salih, K. E. M. A., Jibo, A., Ishaq, M., Khan, S., Mohammed, O. A., Al-Shahrani, A. M., & Abbas, M. (2020). Psychometric analysis of multiple-choice questions in an innovative curriculum in Kingdom of Saudi Arabia. Journal of Family Medicine and Primary Care, 9(7), 3663-3668. https://doi.org/10.4103/jfmpc.jfmpc_358_20
Sugianto, A. (2021). Item analysis of English summative test: EFL teacher-made test. Indonesian EFL Research and Practices, 1(1), 35-54. https://jurnal.iaima.ac.id/i-efl/article/view/4
Zubairi, N. A., AlAhmadi, T. S., Ibrahim, M. H., Hegazi, M. A., & Gadi, F. U. (2025). Effective use of item analysis to improve the reliability and validity of undergraduate medical examinations: Evaluating the same exam over many years: A different approach. Pakistan Journal of Medical Sciences, 41(3), 810-815. https://doi.org/10.12669/pjms.41.3.10693

There are 13 references in total.

Details

Primary Language	English
Subjects	Classical Test Theories
Section	Research Article
Authors	Mohammad A. Elmorsy (ORCID 0000-0001-7575-6556); Doaa El Morsi (ORCID 0000-0003-4129-8002)
Submission Date	October 16, 2025
Acceptance Date	December 31, 2025
Publication Date	February 28, 2026
DOI	https://doi.org/10.62425/jopres.1804954
IZ	https://izlik.org/JA42AC52HE
Published in Issue	Year 2026 Volume: 4 Issue: 1

Cite

APA	Elmorsy, M. A., & El Morsi, D. (2026). Developing an Automated Tool for Psychometric Evaluation and Exam Quality Indexing of Multiple-Choice Questions (MCQs): Phase I Study. Journal of Psychometric Research, 4(1), 26-37. https://doi.org/10.62425/jopres.1804954