Research Article

Comparison of G and Phi coefficients estimated in generalizability theory with real cases

Year 2021, Volume: 8 Issue: 3, 583 - 595, 05.09.2021
https://doi.org/10.21449/ijate.948677

Abstract

This study compares the G and Phi coefficients estimated by D studies for a measurement tool with the G and Phi coefficients obtained from real cases in which items of differing difficulty levels were added, and it determines the conditions under which the D studies yielded reliability estimates closer to the real values. The study group consisted of 80 seventh-grade students from various public and private secondary schools in the provinces of Ankara, Istanbul, and Adana in Turkey. Four raters, all serving as Turkish teachers in public secondary schools in Ankara, took part in the study. A data collection tool consisting of 12 tasks was prepared to measure the participating seventh-grade students’ written expression skills in Turkish. The G and Phi coefficients estimated in the D study equaled those obtained from the real cases only when six tasks with item difficulty indexes close to the mean difficulty of the test were added in such a way that the mean difficulty of the test did not change. In the other cases, where adding easy or difficult tasks changed the mean difficulty of the test, the reliability coefficients estimated in the D study and those obtained in the real cases were similar but not identical.
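To make the D-study projection concrete, here is a minimal sketch of how G and Phi are usually computed from variance components. It assumes a fully crossed, random students × tasks × raters design, which the abstract does not state. The variance component values, the class `VarianceComponents`, and the function `d_study` are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class VarianceComponents:
    """Variance components for an assumed crossed p x t x r design."""
    p: float      # persons (students): universe-score variance
    t: float      # tasks
    r: float      # raters
    pt: float     # person x task interaction
    pr: float     # person x rater interaction
    tr: float     # task x rater interaction
    ptr_e: float  # three-way interaction confounded with residual error


def d_study(vc: VarianceComponents, n_tasks: int, n_raters: int) -> tuple[float, float]:
    """Return (G, Phi) projected for the given numbers of tasks and raters."""
    # Relative error variance: only interactions involving persons.
    rel_err = (vc.pt / n_tasks
               + vc.pr / n_raters
               + vc.ptr_e / (n_tasks * n_raters))
    # Absolute error variance: adds task and rater main effects and their interaction.
    abs_err = (rel_err
               + vc.t / n_tasks
               + vc.r / n_raters
               + vc.tr / (n_tasks * n_raters))
    g = vc.p / (vc.p + rel_err)
    phi = vc.p / (vc.p + abs_err)
    return g, phi


# Hypothetical variance components, chosen only for illustration.
vc = VarianceComponents(p=0.40, t=0.10, r=0.02, pt=0.30, pr=0.03, tr=0.01, ptr_e=0.25)

# Illustrative task counts with four raters.
results = {n: d_study(vc, n_tasks=n, n_raters=4) for n in (6, 12, 18)}
for n, (g, phi) in results.items():
    print(f"tasks={n:2d}  G={g:.3f}  Phi={phi:.3f}")
```

A projection of this kind treats any added tasks as exchangeable with the existing ones. That is consistent with the abstract's finding that D-study estimates matched the real cases only when the added tasks left the mean difficulty of the test unchanged.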



Details

Primary Language English
Subjects Studies on Education
Journal Section Articles
Authors

Kaan Zulfikar Deniz 0000-0003-0920-538X

Emel Ilıcan 0000-0003-4244-6441

Publication Date September 5, 2021
Submission Date October 3, 2020
Published in Issue Year 2021 Volume: 8 Issue: 3

Cite

APA Deniz, K. Z., & Ilıcan, E. (2021). Comparison of G and Phi coefficients estimated in generalizability theory with real cases. International Journal of Assessment Tools in Education, 8(3), 583-595. https://doi.org/10.21449/ijate.948677
