Research Article

Predicting Global Health Expenditures Using Machine Learning and Regularized Regression Methods

Year 2025, Volume: 12 Issue: 2, 462 - 476, 29.12.2025
https://doi.org/10.47097/piar.1792425

Abstract

Health expenditures are critical to countries' economic sustainability and to the effectiveness of their health policies. Modeling these expenditures accurately is a complex task that requires approaches beyond classical regression. This study aimed to predict per capita health expenditure with machine learning and regularized regression methods, using 2022 World Bank data for 190 countries.
To preserve data completeness, missing values were imputed with the Multiple Imputation by Chained Equations (MICE) method. The dependent variable was per capita health expenditure, and the independent variables were socioeconomic and demographic indicators. Six models (Support Vector Regression (SVR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Elastic Net, Lasso, and Ridge regression) were compared using the RMSE, MAE, and R² metrics. SVR achieved the lowest RMSE and the highest R² (RMSE = 463 ± 13.3, R² = 0.940 ± 0.003). XGBoost yielded the lowest MAE (262 ± 15.5) together with a high coefficient of determination (R² = 0.923 ± 0.007). GDP per capita was the strongest predictor, while the share of elderly population, life expectancy, and the urbanization rate made secondary contributions. With their high predictive power, the SVR and XGBoost models stand out as valuable decision-support tools for policymakers forecasting health expenditures.
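The workflow summarized in the abstract (chained-equation imputation, six regressors, comparison by RMSE, MAE, and R²) can be illustrated with a minimal sketch. It is not the authors' code. It runs on synthetic data, it uses scikit-learn's `IterativeImputer` (a single-imputation, chained-equation imputer inspired by MICE) in place of a full multiple-imputation procedure, and it uses `GradientBoostingRegressor` as a stand-in for XGBoost so the example depends only on scikit-learn. The variable names, hyperparameters, 5-fold cross-validation, and the reading of "±" as a standard deviation across folds are all assumptions.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_countries = 190

# Synthetic stand-ins for the predictors named in the abstract (not real data).
gdp = rng.lognormal(mean=9.0, sigma=1.0, size=n_countries)
elderly = rng.uniform(2, 25, size=n_countries)
life_exp = rng.uniform(55, 85, size=n_countries)
urban = rng.uniform(15, 100, size=n_countries)
X = np.column_stack([gdp, elderly, life_exp, urban])
y = 0.07 * gdp + 15.0 * elderly + rng.normal(0, 100, size=n_countries)

# Blank out roughly 10% of predictor values to mimic missing indicators.
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.10] = np.nan

# Hyperparameters are arbitrary illustrative choices, not tuned values.
models = {
    "SVR": SVR(kernel="rbf", C=1000.0),
    "RF": RandomForestRegressor(n_estimators=100, random_state=0),
    "GB (XGBoost stand-in)": GradientBoostingRegressor(random_state=0),
    "ElasticNet": ElasticNet(alpha=1.0, l1_ratio=0.5),
    "Lasso": Lasso(alpha=1.0),
    "Ridge": Ridge(alpha=1.0),
}
scoring = {
    "rmse": "neg_root_mean_squared_error",
    "mae": "neg_mean_absolute_error",
    "r2": "r2",
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, estimator in models.items():
    # Imputer and scaler sit inside the pipeline, so they are refit on each
    # training fold and never see the held-out fold.
    pipe = make_pipeline(IterativeImputer(random_state=0), StandardScaler(), estimator)
    scores = cross_validate(pipe, X_missing, y, cv=cv, scoring=scoring)
    rmse = -scores["test_rmse"]  # scikit-learn reports negated errors
    mae = -scores["test_mae"]
    r2 = scores["test_r2"]
    results[name] = {
        "rmse_mean": rmse.mean(), "rmse_sd": rmse.std(ddof=1),
        "mae_mean": mae.mean(), "mae_sd": mae.std(ddof=1),
        "r2_mean": r2.mean(), "r2_sd": r2.std(ddof=1),
    }

for name, m in results.items():
    print(f"{name:24s} RMSE = {m['rmse_mean']:.1f} ± {m['rmse_sd']:.1f}  "
          f"MAE = {m['mae_mean']:.1f} ± {m['mae_sd']:.1f}  "
          f"R² = {m['r2_mean']:.3f} ± {m['r2_sd']:.3f}")
```

Because the data are synthetic, the printed numbers bear no relation to the values reported in the abstract. The sketch only shows how a mean ± spread summary per model could be produced.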

Ethical Statement

All procedures performed in the study comply with the ethical standards of comparable institutional and/or national research committees.

Supporting Institution

No support was received from any institution or organization.

References

  • Acemoglu, D., Finkelstein, A., & Notowidigdo, M. J. (2013). Income and health spending: Evidence from oil price shocks. Review of Economics and Statistics, 95(4), 1079–1095.
  • Azur, M. J., Stuart, E. A., Frangakis, C., & Leaf, P. J. (2011). Multiple imputation by chained equations: What is it and how does it work? International Journal of Methods in Psychiatric Research, 20(1), 40–49.
  • Baltagi, B. H., & Moscone, F. (2010). Health care expenditure and income in the OECD reconsidered: Evidence from panel data. Economic Modelling, 27(4), 804–811.
  • Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
  • Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794. https://doi.org/10.1145/2939672.2939785
  • Çınaroğlu, S. (2017). Sağlık harcamasının tahmininde makine öğrenmesi regresyon yöntemlerinin karşılaştırılması. Uludağ Üniversitesi Mühendislik Fakültesi Dergisi, 22(2), 179–197. https://doi.org/10.17482/uumfd.338805
  • Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189–1232. https://doi.org/10.1214/aos/1013203451
  • Gerdtham, U. G., & Jönsson, B. (2000). International comparisons of health expenditure: Theory, data and econometric analysis. In Handbook of health economics (Vol. 1, pp. 11–53).
  • Güleryüz, D. (2021). Predicting health spending in Turkey using the GPR, SVR, and DT models. Acta Infologica, 5(1), 155–166. https://doi.org/10.26650/acin.885940
  • Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
  • Martin, A. B., Hartman, M., Benson, J., Catlin, A., & National Health Expenditure Accounts Team. (2013). National health spending in 2011: Overall growth remains low, but some payers and services show signs of acceleration. Health Affairs, 32(1), 87–99. https://doi.org/10.1377/hlthaff.2012.1206
  • Mihaylova, B., Briggs, A., O’Hagan, A., & Thompson, S. G. (2011). Review of statistical methods for analysing healthcare resources and costs. Health Economics, 20(8), 897–916. https://doi.org/10.1002/hec.1653
  • OECD. (2023a). Health at a Glance 2023: OECD Indicators. OECD Publishing. https://doi.org/10.1787/7a7afb35-en
  • OECD. (2023b). Health spending (indicator). OECD Data. https://doi.org/10.1787/8643de7e-en
  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. John Wiley & Sons. https://doi.org/10.1002/9780470316696
  • Sinha, R., Khandelwal, S., & Deshmukh, P. R. (2016). Determinants of out-of-pocket health expenditure: A systematic review. Journal of Health Management, 18(2), 213–242. https://doi.org/10.1177/0972063416637700
  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
  • Vapnik, V. N. (1995). The nature of statistical learning theory. Springer. https://doi.org/10.1007/978-1-4757-2440-0
  • World Bank. (2023). World Development Indicators: Health expenditure per capita (current US$). The World Bank. https://data.worldbank.org/indicator/SH.XPD.CHEX.PC.CD
  • World Bank, & World Health Organization. (2023). Tracking universal health coverage: 2023 global monitoring report. World Bank / World Health Organization. https://openknowledge.worldbank.org/entities/publication/1ced1b12-896e-49f1-ab6f-f1a95325f39b
  • World Health Organization. (2023a). Global spending on health: Report summary. World Health Organization. https://iris.who.int/bitstream/handle/10665/379750/9789240104495-eng.pdf?isAllowed=y&sequence=1
  • World Health Organization. (2023b). Financial protection. World Health Organization. https://www.who.int/data/gho/indicator-metadata-registry/imr-details/4950
  • Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x

There are 23 citations in total.

Details

Primary Language: English
Subjects: Econometrics (Other)
Journal Section: Research Article
Authors:

Hakan Öztürk (ORCID: 0000-0001-8112-4934)

Elvan Hayat (ORCID: 0000-0001-8200-8046)

Submission Date: September 28, 2025
Acceptance Date: October 16, 2025
Publication Date: December 29, 2025
Published in Issue: Year 2025, Volume: 12, Issue: 2

Cite

APA Öztürk, H., & Hayat, E. (2025). Predicting Global Health Expenditures Using Machine Learning and Regularized Regression Methods. Pamukkale Üniversitesi İşletme Araştırmaları Dergisi, 12(2), 462-476. https://doi.org/10.47097/piar.1792425

PIAR is licensed under a Creative Commons Attribution 4.0 International License.
