Detecting Differential Item Functioning (DIF) provides important evidence about the validity of a test and of the scores obtained from it. The difR package is an R package that allows different DIF detection methods to be applied, offering great convenience to researchers and practitioners. The main purpose of this study is to illustrate, using a sample research dataset, the process followed for different DIF detection methods in the difR package: software installation, examination of assumptions, analysis steps, and interpretation of the results. In line with this purpose, the science items of the 2018 Secondary School Entrance Examination administered to 8th-grade students in Turkey were examined for DIF with respect to item order effect. In this respect, this is a survey-model study. The study covers the steps of frequently used DIF detection methods: Mantel-Haenszel, Logistic Regression, and SIBTEST, based on Classical Test Theory, and the Likelihood Ratio method, based on Item Response Theory. According to the findings of the DIF analyses, the science items mostly showed no DIF, or showed negligible DIF, with respect to item order effect.
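The workflow summarized above (installation, data preparation, analysis, interpretation) can be sketched in a few lines of R. The snippet below is a minimal illustration using the `verbal` example dataset that ships with difR, not the examination data analyzed in the study; column positions and the grouping variable follow that dataset's documentation.

```r
# Minimal Mantel-Haenszel DIF sketch with difR, on the package's built-in
# 'verbal' data (24 dichotomous items plus Anger and Gender columns) --
# NOT the OGS 2018 science sub-test analyzed in the article.
# install.packages("difR")   # one-time installation from CRAN
library(difR)

data(verbal)
responses <- verbal[, 1:24]       # 0/1 item responses
group     <- verbal[, "Gender"]   # grouping variable (1 = focal group)

# Mantel-Haenszel test with purification of the matching criterion
mh <- difMH(Data = responses, group = group, focal.name = 1, purify = TRUE)
print(mh)   # chi-square statistics, flagged items, ETS delta effect sizes
```

The printed output classifies each item's effect size on the ETS delta scale (A = negligible, B = moderate, C = large), which is the scale the study uses when reporting "negligible DIF".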
Adedoyin, O. (2010). An investigation of the effects of teachers’ classroom questions on the achievements of students in mathematics: case study of Botswana community junior secondary schools. European Journal of Educational Studies, 2(3).
Akalın, Ş. (2014). Kamu personeli seçme sınavı genel yetenek testinin madde yanlılığı açısından incelenmesi (Unpublished doctoral dissertation). Ankara Üniversitesi, Ankara.
Allaire, J. J. (2011, August). R Studio: Integrated development environment for R. In The R User Conference, useR! 2011 August 16-18 2011 University of Warwick, Coventry, UK (p. 14).
Asil, M. ve Gelbal, S. (2012). PISA öğrenci anketinin kültürler arası eşdeğerliği. Eğitim ve Bilim, 37(166).
American Educational Research Association, American Psychological Association, National Council on Measurement in Education [AERA/APA/NCME]. (1999). Standards for educational and psychological testing. Washington, DC: American Psychological Association.
Avcu, A., Tunç, E. B. and Uluman, M. (2018). How the order of the items in a booklet affects item functioning: Empirical findings from course level data. European Journal of Education Studies, 4(3), 227-239. doi:10.5281/zenodo.1199695
Bakan Kalaycıoğlu, D. and Kelecioğlu, H. (2011). Item bias analysis of the university entrance examination. Education and Science, 36(161), 3.
Balta, E. and Ömür Sümbül, S. (2017). An investigation of ordering test items differently depending on their difficulty level by differential item functioning. Eurasian Journal of Educational Research, 72, 23-42.
Barcikovski, R. S. and Olsen, H. (1975). Test item arrangement and adaptation level. The Journal of Psychology, 90(1), 87-93. doi: 10.1080/00223980.1975.9923929.
Bolt, D. (2000). A SIBTEST approach to testing DIF hypothesis using experimentally designed test items. Journal of Educational Measurement, 37, 307-327.
Bolt, S. E., and Ysseldyke, J. (2008). Accommodating students with disabilities in large-scale testing: A comparison of differential item functioning (DIF) identified across disability types. Journal of Psychoeducational Assessment, 26(2), 121-138. doi:10.1177/0734282907307703
Büyüköztürk, Ş. (2004). Veri analizi el kitabı. Ankara: Pegem A Yayıncılık.
Camilli, G. and Shepard, L. A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.
Cattell, R. B. (1966). The data box: Its ordering of total resources in terms of possible relational systems. In R. B. Cattell (Ed.), Handbook of multivariate experimental psychology (pp. 67–128). Chicago, IL: Rand-McNally
Cheung, G. W. and Rensvold, R. B. (2000). Assessing extreme and acquiescence response sets in cross-cultural research using structural equations modeling. Journal of Cross-cultural Psychology, 31(2), 188-213.
Chiu, P. (2012). The effect of item position on state mathematics assessment. Paper presented at the Annual Meeting of the American Educational Research Association, Kansas. Retrieved from https://aai.ku.edu/sites/cete.ku.edu/files/docs/Presentations/2012_04_Chiu_Math_Item_Ordering_AERA_2012.pdf
Chiu, P. C., and Irwin, P. M. (2011). Chronological item ordering: Does it make a difference on a state history and government assessment? Paper presented at the Annual Meeting of the American Educational Research Association, Kansas. Retrieved from https://cete.ku.edu/sites/cete.drupal.ku.edu/files/docs/Presentations/2011/History_Item_Order_7-12-2011.pdf
Crocker, L. M., and Algina, J. (1986). Introduction to classical and modern test theory. New York: Holt, Rinehart and Winston.
Çepni, Z. (2011). Değişen madde fonksiyonlarının SIBTEST, Mantel-Haenszel, lojistik regresyon ve madde tepki kuramı yöntemleriyle incelenmesi (Unpublished doctoral dissertation). Hacettepe Üniversitesi, Ankara.
DeMars, C. E. (2010). Type I error inflation for detecting DIF in the presence of impact. Educational and Psychological Measurement, 70(6), 961–972. https://doi.org/10.1177/0013164410366691
Doğan, C. D. ve Uluman, M. (2016). İstatistiksel Veri Analizinde R Yazılımı ve Kullanımı. İlköğretim Online, 15(2), 615-634. doi:10.17051/io.2016.24991
Dorans, N. J., and Holland, P. W. (1993). DIF detection and description: Mantel-Haenszel and standardization. In P. W. Holland and H. Wainer (Eds.), Differential item functioning (p. 35–66). Lawrence Erlbaum Associates, Inc.
Embretson, S. E., and Reise, S. P. (2000). Item response theory for psychologists. Multivariate Applications Book Series. Lawrence Erlbaum Associates Publishers.
Finch, W. H. and French, B. F. (2007). Detection of crossing differential item functioning: A comparison of four methods. Educational and Psychological Measurement, 67(4), 565-582.
Fraenkel, J.R., and Wallen, N.E. (2006). How to design and evaluate research in education. McGraw Hill Higher Education. New York, NY.
Gierl, M. J., Jodoin, M. G. and Ackerman, T. A. (2000). Performance of Mantel-Haenszel, Simultaneous Item Bias Test and Logistic Regression when the proportion of DIF items is large. Paper presented at the Annual meeting of the American Educational Research Association, New Orleans, LA.
Gierl, M. J. (2000). Construct Equivalence on Translated Achievement Tests. Canadian Journal of Education, 25(4), 280-296.
Gotzmann, A., Wright, K. and Rodden, L. (2006). A Comparison of Power Rates for Items Favoring the Reference and Focal group for the Mantel-Haenszel and SIBTEST Procedures. Paper presented at the annual meeting of the American Educational Research Association (AERA), San Francisco, California.
Gök, B., Kelecioğlu, H. ve Doğan, N. (2010). Değişen Madde Fonksiyonunu belirlemede Mantel–Haenszel ve Lojistik Regresyon tekniklerinin karşılaştırılması. Eğitim ve Bilim, 35(156).
Greer, T. G. (2004). Detection of differential item functioning (DIF) on the SAT-V: A comparison of four methods: Mantel-Haenszel, logistic regression, simultaneous item bias and likelihood ratio test (Unpublished doctoral dissertation). University of Houston, Houston.
Grover, R. K. and Ercikan, K. (2017). For which boys and which girls are reading assessment items biased against? Detection of differential item functioning in heterogeneous gender populations. Applied Measurement in Education, 30(3), 178–195. https://doi.org/10.1080/08957347.2017.1316276.
Hahne J. (2008). Analyzing position effects within reasoning items using the LLTM for structurally incomplete data. Psychology Science Quarterly, 50, 379-390.
Hambleton, R. K. and Traub, R. E. (1974). The effects of item order on test performance and stress. Journal of Experimental Education, 43(1), 40–46. https://doi.org/10.1080/00220973.1974.10806302
Hambleton, R.K., and Swaminathan, H. (1989). Item Response Theory: Principles and Applications. USA: Kluwer Nijhoff Publishing.
Herrera, A.-N. and Gómez, J. (2008). Influence of equal or unequal comparison group sample sizes on the detection of differential item functioning using the Mantel-Haenszel and Logistic Regression techniques. Quality & Quantity: International Journal of Methodology, 42(6), 739–755. https://doi.org/10.1007/s11135-006-9065-z
Horgan, J. M. (2012). Programming in R. WIREs Computational Statistics, 4, 75-84. doi:10.1002/wics.183
International Test Commission (2005). International Test Commission Guidelines for Test Adaptation. London: Author.
Jodoin, M. G. and Gierl, M. J. (2001). Evaluating Type I error and power rates using an effect size measure with logistic regression procedure for DIF detection. Applied Measurement in Education, 14(4), 329-349. doi:10.1207/S15324818AME1404_2
Johnstone, C. J., Thompson, S. J., Moen, R. E., Bolt, S., and Kato, K. (2005). Analyzing results of large-scale assessments to ensure universal design (Technical Report 41). Retrieved from University of Minnesota, National Center on Educational Outcomes website: http://education.umn.edu/NCEO/OnlinePubs/Technical41.htm
Karakoç Alatlı, B. ve Çokluk Bökeoğlu, Ö. (2018). Investigation of measurement invariance of literacy tests in the Programme for International Student Assessment (PISA-2012). Elementary Education Online, 17(2), 1096-1115. [Online]: http://ilkogretim-online.org.tr doi 10.17051/ilkonline.2018.419357
Kingston, N. M., and Dorans, N. J. (1984). Item location effects and their implications for IRT equating and adaptive testing. Applied Psychological Measurement, 8(2), 147–154. doi:10.1177/014662168400800202
Kleinke, D. J. (1980). Item order, response location, and examinee sex and handedness on performance on multiple-choice tests. Journal of Educational Research, 73(4), 225–229. doi:10.1080/00220671.1980.10885240.
Koyuncu, İ., Aksu, G., ve Kelecioğlu, H. (2018). Mantel-Haenszel, Lojistik Regresyon ve Olabilirlik Oranı Değişen Madde Fonksiyonu İnceleme Yöntemlerinin Farklı Yazılımlar Kullanılarak Karşılaştırılması. İlköğretim Online, 17, 902-925.
Luo, L., Arizmendi, C. and Gates, K. M. (2019). Exploratory factor analysis (EFA) programs in R. Structural Equation Modeling: A Multidisciplinary Journal, 26(5), 819-826. doi:10.1080/10705511.2019.1615835
Loehlin, J. C. (1987). Latent variable models: An introduction to factor, path, and structural analysis. Hillsdale, NJ: Erlbaum.
Lyons-Thomas, J., Sandilands, D. D., and Ercikan, K. (2014). Gender differential item functioning in mathematics in four international jurisdictions. Education and Science, 39(172), 20-32.
Magis, D., Beland, S. and Raiche, G. (2018). difR: Collection of methods to detect dichotomous differential item functioning (DIF). Retrieved from https://cran.r-project.org/web/packages/difR/difR.pdf
Magis, D., Beland, S., Tuerlinckx, F. and De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847-862.
Mertler, C. A. and Vannatta, R. A. (2005). Advanced and multivariate statistical methods: Practical application and interpretation. Los Angeles: Pyrczak.
Milli Eğitim Bakanlığı (2018). Sınavla öğrenci alacak ortaöğretim kurumlarına ilişkin merkezî sınav başvuru ve uygulama kılavuzu. Ankara
Narayanan, P., and Swaminathan, H. (1996). Identification of items that show nonuniform DIF. Applied Psychological Measurement, 20(3), 257–274. https://doi.org/10.1177/014662169602000306
Osadebe, P. U. and Agbure, B. (2018). Assessment of differential item functioning in social studies multiple choice questions in basic education certificate examination. European Journal of Education Studies, 4(9), 236-257. doi:10.5281/zenodo.1301272
Osterlind, S. J. and Everson, H. T. (2009). Differential item functioning. Thousand Oaks, CA: SAGE Publications, Inc. doi:10.4135/9781412993913
Raju, N. S., Laffitte, L. J. and Byrne, B. M. (2002). Measurement equivalence: A comparison of methods based on confirmatory factor analysis and item response theory. Journal of Applied Psychology, 87(3), 527-529.
Rogers, H. J., and Swaminathan, H. (1993). A comparison of the logistic regression and Mantel-Haenszel procedures for detecting differential item functioning. Applied Psychological Measurement, 17(2), 105–116.
Ryan, K. E. and Chiu, S. (2001). An examination of item context effects, DIF, and gender DIF. Applied Measurement in Education, 14(1), 73–90. doi:10.1207/S15324818AME1401_06
Shealy, R. and Stout, W. F. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 159–194.
Stark, S., Chernyshenko, O. S. and Drasgow, F. (2006). Detecting differential item functioning with confirmatory factor analysis and item response theory: Toward a unified strategy. Journal of Applied Psychology, 91(6), 1292-1306.
Stiglic, G., Watson, R., and Cilar, L. (2019). R you ready? Using the R program for statistical analysis and graphics. Research in Nursing and Health, 42(6), 494-499.
Stoneberg, B. D. (2004). A study of gender-based and ethnic-based Differential Item Functioning (DIF) in the Spring 2003 Idaho Standards Achievement tests applying the Simultaneous Bias Test (SIBTEST) and the Mantel-Haenszel chi-square Test. The University of Maryland (UM) Measurement, Statistics, and Evaluation Department and the National Center for Education Statistics (NCES) Assessment Division. Retrieved from http://files.eric.ed.gov/fulltext/ED489949.pdf
Stout, W. and Roussos, L. (1995). SIBTEST user manual. Urbana: University of Illinois.
Swaminathan, H., and Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27(4), 361–370. https://doi.org/10.1111/j.1745-3984.1990.tb00754.x
Tabachnick, B. G. and Fidell, L. S. (2007). Using multivariate statistics. Boston, MA: Pearson.
Tan, Ş. (2009). Soru Sırasının Madde Güçlüğü ve Ayırıcılık Gücüne Etkisi. e-Journal of New World Sciences Academy, 4(2), 486-493. Retrieved from http://www.newwsa.com/download/gecici_makale_dosyalari/NWSA-1091-3-2.pdf
Ünsal Özberk, E. B. ve Koç, N. (2017). WÇZÖ-IV maddelerinin cinsiyet ve sosyo-ekonomik düzey açısından işlev farklılığının belirlenmesinde kullanılan yöntemlerin karşılaştırılması. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 8(1), 112-127.
Van Buuren, S. (2012). Flexible imputation of missing data. Chapman and Hall/CRC.
Vandenberg, R. J. and Lance, C. E. (2000). A review and synthesis of the MI literature: suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3, 4-69.
Walzebug, A. (2014). Is there a language-based social disadvantage in solving mathematical items? Learning, Culture and Social Interaction, 3(2), 159-169.
Yen, W. M. (1980). The extent, causes and importance of context effects on item parameters for two latent trait models. Journal of Educational Measurement, 17(4), 297–311. doi:10.1111/j.1745-3984.1980.tb00833.x
Yıldırım, H. (2015). 2012 yılı seviye belirleme sınavı matematik alt testinin madde yanlılığı açısından incelenmesi (Unpublished master's thesis). Gazi Üniversitesi, Ankara.
Zumbo, B. D., and Thomas, D. R. (1996). A measure of DIF effect size using logistic regression procedures. National Board of Medical Examiners, Philadelphia, PA.
Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and likert-type (Ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.
Using R to Detect Differential Item Functioning: Science sub-test of Secondary School Entrance Examination
Differential Item Functioning (DIF) analyses provide critical information about the validity of a test. R, an open-source software environment that offers implementations of all of these DIF detection methods, plays an important role in DIF research. Therefore, a guiding study that follows scientific methods and procedures for measurement invariance or DIF analyses will be very useful for researchers and practitioners. This research aims to illustrate the procedures followed in different DIF detection methods in R, from the installation of the R software to the interpretation of the analysis results, using a sample test (the science sub-test of the Secondary School Entrance Examination) and its data. Four DIF detection methods that are commonly used in DIF analyses are handled in this study: Mantel-Haenszel, Logistic Regression, SIBTEST, and Likelihood Ratio. According to the analysis results, the items indicate either no DIF or negligible DIF.
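The four methods named above all share the same calling pattern in difR. As a hedged sketch (again on difR's bundled `verbal` dataset rather than the science sub-test, and with the likelihood-ratio test restricted to a few items because it refits mixed models and is slow on long tests):

```r
# Sketch of the Logistic Regression, SIBTEST, and Likelihood Ratio methods
# in difR, on the package's built-in 'verbal' data -- not the study's data.
library(difR)

data(verbal)
responses <- verbal[, 1:24]       # 0/1 item responses
group     <- verbal[, "Gender"]   # grouping variable (1 = focal group)

# Logistic regression: type = "both" tests uniform and nonuniform DIF jointly
lr <- difLogistic(Data = responses, group = group, focal.name = 1,
                  type = "both")

# SIBTEST: type = "udif" targets uniform DIF ("nudif" for crossing DIF)
sib <- difSIBTEST(Data = responses, group = group, focal.name = 1,
                  type = "udif")

# Rasch-based likelihood-ratio test (fits models via lme4); a small item
# subset keeps the illustration fast
lrt <- difLRT(Data = responses[, 1:5], group = group, focal.name = 1)

print(lr); print(sib); print(lrt)
```

Each `print` call lists the per-item test statistics, the significance threshold, and the items flagged as functioning differentially, which is the output the interpretation step of the study works from.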
Alatlı, B., & Şenel, S. (2020). Değişen Madde Fonksiyonunun Belirlenmesinde “difR” R Paketinin Kullanımı: Ortaöğretime Geçiş Sınavı Fen Alt Testi. Ankara University Journal of Faculty of Educational Sciences (JFES), 53(3), 865-902. https://doi.org/10.30964/auebfd.684727