Research Article

Comparison of Propensity Score Weighting Methods to Remove Selection Bias in Average Treatment Effect Estimates

Year 2023, Volume: 2023 Issue: 21, 989 - 1031, 31.10.2023
https://doi.org/10.46778/goputeb.1312865

Abstract

In this Monte Carlo simulation study, the performance of six different propensity score methods implemented through weighting cases was investigated: inverse probability of treatment weighting, truncated inverse probability of treatment weighting, propensity score stratification, marginal mean weighting through propensity score stratification, optimal full propensity score matching, and marginal mean weighting through optimal full propensity score matching. These methods aim to reduce selection bias in estimates of the average treatment effect (ATE) in observational studies. For the estimation of standard errors of the ATE with weights, three methods were compared: weighted least squares (WLS), Taylor series linearization (TSL), and jackknife (JK). Results indicated that covariance adjustment extensions of the investigated propensity score methods, in combination with TSL and JK standard error estimation methods, remove the selection bias appropriately and provide the most accurate standard errors under the simulated conditions.
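The weighting schemes named in the abstract can be illustrated with a short sketch. The code below is not the authors' implementation: the function names, the 99th-percentile truncation cutoff, and the choice of five quantile strata are assumptions made for illustration. It computes ATE weights for inverse probability of treatment weighting (1/e for treated cases, 1/(1 - e) for untreated cases, where e is the estimated propensity score), caps them for the truncated variant, and computes marginal mean weights through stratification as n_s × Pr(Z = z) / n_{z,s}.

```python
import numpy as np


def iptw_ate_weights(treat, ps):
    """ATE weights: 1/e(x) for treated cases, 1/(1 - e(x)) for untreated cases."""
    treat = np.asarray(treat, dtype=float)
    ps = np.asarray(ps, dtype=float)
    return treat / ps + (1.0 - treat) / (1.0 - ps)


def truncate_weights(weights, upper_pct=99.0):
    """Cap weights at an upper percentile (the cutoff is an illustrative assumption)."""
    weights = np.asarray(weights, dtype=float)
    cap = np.percentile(weights, upper_pct)
    return np.minimum(weights, cap)


def mmws_ate_weights(treat, ps, n_strata=5):
    """Marginal mean weights through stratification on propensity score quantiles.

    A case in stratum s with treatment status z receives
    n_s * Pr(Z = z) / n_{z,s}.
    """
    treat = np.asarray(treat, dtype=int)
    ps = np.asarray(ps, dtype=float)
    # Interior quantile cut points define the strata.
    edges = np.quantile(ps, np.linspace(0, 1, n_strata + 1)[1:-1])
    strata = np.searchsorted(edges, ps, side="right")
    weights = np.zeros(len(ps))
    for s in np.unique(strata):
        in_s = strata == s
        for z in (0, 1):
            cell = in_s & (treat == z)
            if cell.any():
                weights[cell] = in_s.sum() * np.mean(treat == z) / cell.sum()
    return weights


def weighted_mean_difference(y, treat, weights):
    """Weighted difference in mean outcomes (treated minus untreated)."""
    y = np.asarray(y, dtype=float)
    weights = np.asarray(weights, dtype=float)
    t = np.asarray(treat, dtype=int) == 1
    return np.average(y[t], weights=weights[t]) - np.average(y[~t], weights=weights[~t])
```

In practice the propensity scores would first be estimated, for example with a logistic regression of treatment on the covariates, and covariate balance would be checked after weighting before the weighted difference is interpreted as an ATE estimate.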

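Turkish title: Ortalama İşlem Etkisi Kestiriminde Seçim Yanlılığını Gidermek İçin Eğilim Puanı Ağırlıklandırma Metotlarının Karşılaştırılması (Comparison of Propensity Score Weighting Methods for Removing Selection Bias in the Estimation of the Average Treatment Effect)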

Abstract (Turkish version, translated)

In this Monte Carlo simulation study, the performance of six different propensity score methods based on weighting individuals was investigated: inverse probability weighting, truncated inverse probability weighting, propensity score stratification, marginal mean weighting through propensity score stratification, optimal full propensity score matching, and marginal mean weighting through optimal full propensity score matching. These methods aim to reduce the selection bias in the average treatment effect estimated in observational studies. Weighted least squares, Taylor series linearization, and jackknife methods were used to estimate the standard error of the weighted average treatment effect. The results showed that, in all simulated conditions, using the weighting-based propensity score methods with covariance adjustment together with the Taylor series linearization and jackknife standard error estimation methods removed the selection bias appropriately and yielded accurate standard errors.
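Of the standard error methods named in the abstract, the jackknife is the simplest to sketch. The code below is a textbook delete-one jackknife for a weighted difference in means, offered as an illustration under assumptions rather than as the procedure used in the study: the function names are hypothetical, the weights are held fixed instead of being re-estimated in each replicate, and each treatment group is assumed to contain at least two cases.

```python
import numpy as np


def weighted_ate(y, treat, weights):
    """Weighted difference in mean outcomes (treated minus untreated)."""
    t = treat == 1
    return np.average(y[t], weights=weights[t]) - np.average(y[~t], weights=weights[~t])


def jackknife_se(y, treat, weights):
    """Delete-one jackknife standard error of the weighted ATE estimate.

    Assumption: the weights stay fixed across replicates; a fuller analysis
    might re-estimate the propensity scores after each deletion.
    """
    y = np.asarray(y, dtype=float)
    treat = np.asarray(treat, dtype=int)
    weights = np.asarray(weights, dtype=float)
    n = len(y)
    keep = np.ones(n, dtype=bool)
    replicates = np.empty(n)
    for i in range(n):
        keep[i] = False  # drop case i
        replicates[i] = weighted_ate(y[keep], treat[keep], weights[keep])
        keep[i] = True   # restore case i
    centered = replicates - replicates.mean()
    return np.sqrt((n - 1) / n * np.sum(centered ** 2))
```

Because every replicate recomputes the full estimator, this approach reflects the variability introduced by unequal weights, which a naive weighted least squares standard error may not capture.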


Details

Primary Language English
Subjects Simulation Study
Journal Section Articles
Authors

Sungur Gürel 0000-0003-3425-858X

Walter Lana Leite 0000-0001-7655-5668

Publication Date October 31, 2023
Submission Date June 13, 2023
Acceptance Date September 22, 2023
Published in Issue Year 2023 Volume: 2023 Issue: 21

Cite

APA Gürel, S., & Leite, W. L. (2023). Comparison of Propensity Score Weighting Methods to Remove Selection Bias in Average Treatment Effect Estimates. International Journal of Turkish Education Sciences, 2023(21), 989-1031. https://doi.org/10.46778/goputeb.1312865