Research Article

The Effect of the Weight Variable on Predicting Reading Comprehension Achievement in PISA 2018: A Data Mining Approach

Year 2025, Volume: 22 Issue: 5, 1146 - 1159, 30.09.2025
https://doi.org/10.26466/opusjsr.1730221

Abstract

This study examines how student-level sampling weights affect model performance when predicting achievement scores. Classification and Regression Tree (CART) and Random Forest (RF) methods were applied using 34 independent variables drawn from the PISA 2018 student questionnaire. Because earlier data mining studies in Turkey did not take sampling weights into account, the study offers an original contribution to the field. When sampling weights were used, only one of the ten important variables identified by CART changed, although the importance ranking of the variables also shifted. In the RF models, only five variables remained in common and the rest differed. With both methods, including sampling weights produced a slight but statistically non-significant decrease in predictive performance. These results indicate that sampling weights influence variable selection but do not significantly affect overall model accuracy. Overall, the findings point to the need to use sampling weights in order to obtain valid and reliable results in large-scale educational data mining.
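The abstract does not say how the weights enter the tree-growing step or which software was used. As a minimal sketch, assuming a regression tree that scores candidate splits by weighted variance reduction (a common way for CART implementations to handle case weights), the toy example below shows how weights can change which variable is chosen at the root. The data, the variable names `var_a` and `var_b`, and the weight values are all hypothetical and are not taken from PISA.

```python
def weighted_sse(ys, ws):
    """Weighted sum of squared deviations from the weighted mean."""
    total = sum(ws)
    mean = sum(w * y for y, w in zip(ys, ws)) / total
    return sum(w * (y - mean) ** 2 for y, w in zip(ys, ws))


def split_gain(xs, ys, ws, threshold):
    """Reduction in weighted SSE from splitting at x <= threshold."""
    left = [(y, w) for x, y, w in zip(xs, ys, ws) if x <= threshold]
    right = [(y, w) for x, y, w in zip(xs, ys, ws) if x > threshold]
    if not left or not right:
        return 0.0
    children = weighted_sse(*zip(*left)) + weighted_sse(*zip(*right))
    return weighted_sse(ys, ws) - children


def best_split(features, ys, ws):
    """Return (variable name, threshold, gain) of the best single split."""
    best = None
    for name, xs in features.items():
        # Candidate thresholds: every distinct value except the largest.
        for threshold in sorted(set(xs))[:-1]:
            gain = split_gain(xs, ys, ws, threshold)
            if best is None or gain > best[2]:
                best = (name, threshold, gain)
    return best


# Hypothetical toy data: six students, two binary predictors, one score.
features = {
    "var_a": [0, 0, 0, 1, 1, 1],
    "var_b": [0, 1, 1, 0, 1, 0],
}
scores = [400, 410, 520, 590, 600, 480]

equal_weights = [1] * 6                 # ignores the sampling design
survey_weights = [1, 1, 10, 1, 1, 10]   # hypothetical sampling weights

unweighted = best_split(features, scores, equal_weights)
weighted = best_split(features, scores, survey_weights)

print("root split without weights:", unweighted[0])
print("root split with weights:   ", weighted[0])
```

On this toy data the unweighted criterion splits on `var_a` while the weighted one splits on `var_b`, because the two heavily weighted students pull the weighted means toward the pattern captured by `var_b`. This only illustrates the mechanism; it does not reproduce the study's models or results.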



Details

Primary Language: English
Subjects: Psychological Methodology, Design and Analysis
Section: Research Articles
Authors

Yusuf Kasap 0000-0002-5114-1175

Mustafa Köroğlu 0000-0001-9610-8523

Early View Date: September 28, 2025
Publication Date: September 30, 2025
Submission Date: June 29, 2025
Acceptance Date: September 27, 2025
Published in Issue: Year 2025, Volume: 22 Issue: 5

Cite

APA Kasap, Y., & Köroğlu, M. (2025). The Effect of the Weight Variable on Predicting Reading Comprehension Achievement in PISA 2018: A Data Mining Approach. OPUS Journal of Society Research, 22(5), 1146-1159. https://doi.org/10.26466/opusjsr.1730221