Research Article

Validity of Simulation Studies: A Case Research in the Context of Differential Item Functioning Detection

Year 2025, Volume: 3 Issue: 1, 24 - 40, 17.03.2025
https://doi.org/10.5281/zenodo.15036409

Abstract

The aim of this study is to examine simulation validity, that is, whether a simulation process produces results that are realistically close to expectations, by generating artificial data containing Differential Item Functioning (DIF) and assessing whether those data were generated accurately. The study involved one reference group and two focal groups, and 2250 conditions were simulated by crossing the sample size of the reference group, the sample size ratios of the focal groups, the amount of DIF, and the DIF technique. In the data generation process, random values for the difficulty and discrimination parameters were produced under the Two-Parameter Logistic Model (2PLM), and 20% of the items in the test were designed to contain DIF. To test the validity of the simulation, mean absolute bias and RMSE values for the difficulty and discrimination parameters were calculated both at the item level and across the design factors. The analyses revealed that the mean absolute bias and RMSE values for both parameters were low and close to zero, indicating minimal estimation error and supporting the validity of the results. In addition, the sample size of the reference group and the sample size ratios of the focal groups had a statistically significant effect on the mean absolute bias and RMSE values for both the difficulty and discrimination parameters, and these values decreased as sample size increased. By contrast, the amount of DIF added to the focal groups had no significant effect on the accuracy of the parameter estimates. The findings show that sample size plays a critical role in the accuracy of parameter estimation whereas the amount of DIF does not, and they are consistent with related research in the literature. Based on these results, it is recommended that validity evidence for the simulation be reported not only in DIF studies but also in simulation studies conducted across other subject areas of psychometrics.
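The generation and evaluation steps described above can be made concrete with a short sketch. The block below is a minimal illustration in Python, not the author's code (the study itself appears to have been conducted in R, per the reference list): the test length of 20 items, the parameter distributions, and the difficulty shift of 0.6 used to induce uniform DIF are all illustrative assumptions, and the re-estimation step that an IRT package would perform is only indicated in a comment.

    import numpy as np

    rng = np.random.default_rng(1)

    n_items, n_ref = 20, 1000            # test length and reference-group size (assumed)
    n_dif = int(0.2 * n_items)           # 20% of items carry DIF, as in the study
    dif_items = rng.choice(n_items, size=n_dif, replace=False)

    # True generating parameters under the 2PLM (these distributions are assumptions):
    a_true = rng.lognormal(mean=0.0, sigma=0.3, size=n_items)   # discrimination
    b_true = rng.normal(loc=0.0, scale=1.0, size=n_items)       # difficulty

    def simulate_2pl(theta, a, b, rng):
        """Dichotomous responses: P(X = 1 | theta) = 1 / (1 + exp(-a * (theta - b)))."""
        p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
        return (rng.uniform(size=p.shape) < p).astype(int)

    # The reference group answers under the true parameters; a focal group gets
    # uniform DIF by shifting difficulty on the flagged items (0.6 is illustrative).
    theta_ref = rng.normal(size=n_ref)
    theta_foc = rng.normal(size=n_ref)
    b_focal = b_true.copy()
    b_focal[dif_items] += 0.6
    x_ref = simulate_2pl(theta_ref, a_true, b_true, rng)
    x_foc = simulate_2pl(theta_foc, a_true, b_focal, rng)

    # Validity check in the spirit of the study: re-estimate a and b from the
    # simulated responses (e.g., with an IRT package), then compare to truth.
    def mean_abs_bias(est, true):
        return np.mean(np.abs(est - true))   # one common definition of mean absolute bias

    def rmse(est, true):
        return np.sqrt(np.mean((est - true) ** 2))

Under this recipe, mean absolute bias and RMSE values near zero for the recovered parameters are exactly the kind of evidence the study treats as support for the validity of the simulation.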

Ethical Statement

Ankara University Social Sciences Sub-Ethics Committee, 05-169, 22.04.2019

References

  • Alfons, A., Templ, M., & Filzmoser, P. (2010). An object-oriented framework for statistical simulation: The R package simFrame. Journal of Statistical Software, 37(3), 1-35. https://doi.org/10.18637/jss.v037.i03
  • Atar, B. (2007). Differential item functioning analyses for mixed response data using IRT likelihood-ratio test, logistic regression, and GLLAMM procedures [Doctoral dissertation]. Florida State University, Florida.
  • Berends, P., & Romme, G. (1999). Simulation as a research tool in management studies. European Management Journal, 17(6), 576-583. https://doi.org/10.1016/S0263-2373(99)00048-1
  • Bolt, D. M., Cohen, A. S., & Wollack, J. A. (2002). Item parameter estimation under conditions of test speededness: Application of a mixture Rasch model with ordinal constraints. Journal of Educational Measurement, 39(4), 331-348. https://doi.org/10.1111/j.1745-3984.2002.tb01146.x
  • Bulut, O., & Sünbül, Ö. (2017). Monte Carlo simulation studies in item response theory with the R programming language. Journal of Measurement and Evaluation in Education and Psychology, 8(3), 266-287. https://doi.org/10.21031/epod.305821
  • Choi, Y. J., & Asilkalkan, A. (2019). R packages for item response theory analysis: Descriptions and features. Measurement: Interdisciplinary Research and Perspectives, 17(3), 168-175. https://doi.org/10.1080/15366367.2019.1586404
  • Chung, C. A. (2004). Simulation modeling handbook: A practical approach. CRC Press.
  • Davis, J. P., Eisenhardt, K. M., & Bingham, C. B. (2007). Developing theory through simulation methods. Academy of Management Review, 32(2), 480-499. https://doi.org/10.5465/amr.2007.24351453
  • DeMars, C. E. (2003). Sample size and recovery of nominal response model item parameters. Applied Psychological Measurement, 27(4), 275-288. https://doi.org/10.1177/0146621603027004003
  • DeMars, C. E., & Lau, A. (2011). Differential item functioning detection with latent classes: How accurately can we detect who is responding differentially? Educational and Psychological Measurement, 71(4), 597-616. https://doi.org/10.1177/0013164411404221
  • Feinberg, R. A., & Rubright, J. D. (2016). Conducting simulation studies in psychometrics. Educational Measurement: Issues and Practice, 35(2), 36-49. https://doi.org/10.1111/emip.12111
  • Finch, W. H. (2016). Detection of differential item functioning for more than two groups: A Monte Carlo comparison of methods. Applied Measurement in Education, 29(1), 30-45. https://doi.org/10.1080/08957347.2015.1102916
  • Gao, X. (2019). A comparison of six DIF detection methods [Master's thesis]. University of Connecticut Graduate School.
  • Gray, C. D., & Kinnear, P. R. (2012). IBM SPSS statistics 19 made simple. Psychology Press, Taylor & Francis Group.
  • Hallgren, K. A. (2013). Conducting simulation studies in the R programming environment. Tutorials in Quantitative Methods for Psychology, 9(2), 43-60. https://doi.org/10.20982/tqmp.09.2.p043
  • Happach, R. M., & Tilebein, M. (2015). Simulation as research method: Modeling social interactions in management science. In C. Misselhorn (Ed.), Collective agency and cooperation in natural and artificial systems (pp. 239-259). Springer.
  • Harwell, M. R., Kohli, N., & Peralta, Y. (2017). Experimental design and data analysis in computer simulation studies in the behavioral sciences. Journal of Modern Applied Statistical Methods, 16(2), 3-28. https://doi.org/10.22237/jmasm/1509494520
  • Harwell, M. R., Kohli, N., & Peralta-Torres, Y. (2018). A survey of reporting practices of computer simulation studies in statistical research. The American Statistician, 72(4), 321-327. https://doi.org/10.1080/00031305.2017.1342692
  • Harwell, M. R., Rubinstein, E. N., Hayes, W. S., & Olds, C. C. (1992). Summarizing Monte Carlo results in methodological research: The one- and two-factor fixed effects ANOVA cases. Journal of Educational Statistics, 17(4), 315-339. https://doi.org/10.2307/1165127
  • Harwell, M. R., Stone, C. A., Hsu, T., & Kirisci, L. (1996). Monte Carlo studies in item response theory. Applied Psychological Measurement, 20(2), 101-125. https://doi.org/10.1177/014662169602000201
  • Hulin, C. L., Lissak, R. I., & Drasgow, F. (1982). Recovery of two- and three-parameter logistic item characteristic curves: A Monte Carlo study. Applied Psychological Measurement, 6(3), 249-260. https://doi.org/10.1177/014662168200600301
  • Keppel, G., & Wickens, T. D. (2004). Design and analysis: A researcher’s handbook (4th ed.). Pearson.
  • Kim, J. (2010). Controlling Type I error rate in evaluating differential item functioning for four DIF methods: Use of three procedures for adjustment of multiple item testing [Doctoral dissertation]. Georgia State University, Atlanta.
  • Law, A. M. (2003). How to conduct a successful simulation study. Proceedings of the 2003 Winter Simulation Conference, New Orleans, LA, USA.
  • Li, Y., Brooks, G. P., & Johanson, G. A. (2012). Item discrimination and Type I error in the detection of differential item functioning. Educational and Psychological Measurement, 72(5), 847-861. https://doi.org/10.1177/0013164411432333
  • Li, Z., & Zumbo, B. D. (2009). Impact of differential item functioning on subsequent statistical conclusions based on observed test score data. Psicológica, 30(2), 343-370. https://psycnet.apa.org/record/2009-18227-011
  • Liu, H., Zhang, Y., & Luo, F. (2015). Mediation analysis for ordinal outcome variables. In Millsap, Bolt, van der Ark, & Wang (Eds.), Quantitative psychology research (pp. 429-450). Springer International Publishing.
  • Lopez Rivas, G. E. (2012). Detection and classification of DIF types using parametric and nonparametric methods: A comparison of the IRT-likelihood ratio test, crossing-SIBTEST, and logistic regression procedures [Doctoral dissertation]. University of South Florida, Florida.
  • Magis, D., Raîche, G., Béland, S., & Gérard, P. (2011). A generalized logistic regression procedure to detect differential item functioning among multiple groups. International Journal of Testing, 11(4), 365-386. https://doi.org/10.1080/15305058.2011.602810
  • Mooney, C. Z. (1997). Monte Carlo simulation. Sage.
  • Morris, T. P., White, I. R., & Crowther, M. J. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine, 38(11), 2074-2102. https://doi.org/10.1002/sim.8086
  • Paxton, P., Curran, P. J., Bollen, K. A., Kirby, J., & Chen, F. (2001). Monte carlo experiments: Design and implementation. Structural Equation Modeling, 8(2), 287-312. https://doi.org/10.1207/S15328007SEM0802_7
  • R Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
  • Rockoff, D. (2018). A randomization test for the detection of differential item functioning [Doctoral dissertation]. The University of Arizona, Arizona.
  • Rollins III, J. D. (2018). A comparison of observed score approaches to detecting differential item functioning among multiple groups [Doctoral dissertation]. The University of North Carolina at Greensboro, Greensboro.
  • Rubinstein, R. Y., & Kroese, D. P. (2017). Simulation and the Monte Carlo method. John Wiley & Sons.
  • Rusch, T., Mair, P., & Hatzinger, R. (2013). Psychometrics with R: A review of CRAN packages for item response theory (Discussion Paper). Center for Empirical Research Methods.
  • Sandilands, D. A. (2014). Accuracy of differential item functioning detection methods in structurally missing data due to booklet design [Doctoral dissertation]. The University of British Columbia, Vancouver.
  • Scott, L. (2014). Controlling analytic selection of a valid subtest for DIF analysis when DIF has multiple potential causes among multiple groups [Doctoral dissertation]. Arizona State University, Arizona.
  • Seybert, J., & Stark, S. (2012). Iterative linking with the differential functioning of items and tests (DFIT) method: Comparison of testwide and item parameter replication (IPR) critical values. Applied Psychological Measurement, 36(6), 494-515. https://doi.org/10.1177/0146621612445182
  • Sigal, M. J., & Chalmers, R. P. (2016). Play it again: Teaching statistics with Monte Carlo simulation. Journal of Statistics Education, 24(3), 136-156. https://doi.org/10.1080/10691898.2016.1246953
  • Socha, A., DeMars, C. E., Zilberberg, A., & Phan, H. (2015). Differential item functioning detection with the Mantel-Haenszel procedure: The effects of matching types and other factors. International Journal of Testing, 15(3), 193-215. https://doi.org/10.1080/15305058.2014.984066
  • Spence, I. (1983). Monte Carlo simulation studies. Applied Psychological Measurement, 7(4), 405-425. https://doi.org/10.1177/014662168300700403
  • Svetina, D., & Rutkowski, L. (2014). Detecting differential item functioning using generalized logistic regression in the context of large-scale assessments. Large-scale Assessments in Education, 2(4), 1-17. https://doi.org/10.1186/s40536-014-0004-5
  • Tay, L., Meade, A. W., & Cao, M. (2015). An overview and practical guide to IRT measurement equivalence analysis. Organizational Research Methods, 18(1), 3-46. https://doi.org/10.1177/1094428114553062
  • Tureson, K., & Odland, A. (2018). Monte Carlo simulation studies. In B. B. Frey (Ed.), The SAGE encyclopedia of educational research, measurement, and evaluation (pp. 1085-1089). SAGE Publications.
  • Wang, W. C., & Chen, C. T. (2005). Item parameter recovery, standard error estimates, and fit statistics of the WINSTEPS program for the family of Rasch models. Educational and Psychological Measurement, 65(3), 376-404. https://doi.org/10.1177/0013164404268673
  • Wen, Y. (2014). DIF analyses in multilevel data: Identification and effects on ability estimates [Doctoral dissertation]. The University of Wisconsin, Milwaukee.
  • Wood, W. S. (2011). Differential item functioning procedures for polytomous items when examinee sample sizes are small [Doctoral dissertation]. Graduate College of The University of Iowa, Iowa.
  • Woods, C. M. (2009). Evaluation of MIMIC-model methods for DIF testing with comparison to two-group analysis. Multivariate Behavioral Research, 44(1), 1-27. https://doi.org/10.1080/00273170802620121
  • Yuan, K.-H., Tong, X., & Zhang, Z. (2015). Bias and efficiency for SEM with missing data and auxiliary variables: Two-stage robust method versus two-stage ML. Structural Equation Modeling: A Multidisciplinary Journal, 22(2), 178-192. https://doi.org/10.1080/10705511.2014.935750


Details

Primary Language English
Subjects Testing, Assessment and Psychometrics (Other)
Journal Section Research Articles
Authors

Özkan Saatçioğlu 0000-0001-8131-9619

Early Pub Date March 17, 2025
Publication Date March 17, 2025
Submission Date February 14, 2025
Acceptance Date March 14, 2025
Published in Issue Year 2025 Volume: 3 Issue: 1

Cite

APA Saatçioğlu, Ö. (2025). Validity of Simulation Studies: A Case Research in the Context of Differential Item Functioning Detection. Journal of Psychometric Research, 3(1), 24-40. https://doi.org/10.5281/zenodo.15036409

Journal of Psychometric Research is licensed under a Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0). 
