INVESTIGATION OF SCIENTIFIC LITERACY ACCORDING TO DIFFERENT ITEM TYPES: PISA 2015 TURKEY SAMPLE
Year 2019, Volume: 19 Issue: 2, 695 - 709, 01.07.2019
Esin Yılmaz Koğar, Hakan Koğar
Abstract
The aim of the present study was to reveal the psychometric
properties of the items in the cognitive test of PISA 2015 assessing scientific
literacy according to different item types and to examine scientific literacy
in relation to different independent variables. Of the 5895 students in the
PISA 2015 Turkey sample, 175 were included in the study in line with its
aim. Descriptive statistics and various hypothesis tests were used
to obtain the findings. When the scientific literacy and item
difficulty averages for all three item types were examined, students with a
high level of scientific literacy were more successful at responding to
constructed response (CR) items, while students with a low level of scientific
literacy were more successful at answering multiple choice (MC) items. Male
students were more successful than female students in responding to MC and
complex multiple choice (CMC) items, while female students were more successful
than male students in answering CR items. Students with a
high economic, social and cultural status were more successful on all item
types than those with a low economic, social and cultural status.
FARKLI MADDE TÜRLERİNE GÖRE FEN OKURYAZARLIĞININ İNCELENMESİ: PISA 2015 TÜRKİYE ÖRNEĞİ
Abstract
The aim of this study was to reveal the psychometric properties of the
items in the PISA 2015 scientific literacy cognitive test according to
different item types, and to examine the scientific literacy scores obtained
from different item types in relation to different independent variables.
Of the 5895 students in the PISA 2015 Turkey sample, 175 were included in
the study in line with its aim. A total of 35 scientific literacy items from
two science clusters were included. Descriptive statistics and various
hypothesis tests were used to obtain the findings. Multiple choice (MC)
items were found to be easier but to have quite low discrimination, whereas
constructed response (CR) items were harder and more discriminating than the
other item types. Complex multiple choice (CMC) items had an item difficulty
similar to that of MC items but higher discrimination than MC items. When
the scientific literacy and item difficulty averages for the three item
types were examined, students with a high level of scientific literacy were
more successful on CR items, while students with a low level of scientific
literacy were more successful on MC items. Male students were more
successful than female students on MC and CMC items, while female students
were more successful than male students on CR items. Students with a high
economic, social and cultural status were more successful on all item types
than those with a low economic, social and cultural status. The item type
classification used in this study is that of the PISA 2015 administration.
The study could be repeated with different item type classifications. In
addition, a similar study could be conducted on samples from countries with
low, medium and high levels of scientific literacy. In line with this
research, it is thought that the steps taken in Turkey toward using
different question types together, especially in central examinations, may
yield positive results. It is recommended that different item types be used
together in determining science achievement.