Research Article

Estimation of Global Health Expenditures with Machine Learning and Regularized Regression Methods

Year 2025, Volume: 12, Issue: 2, 462-476, 29.12.2025
https://doi.org/10.47097/piar.1792425

Abstract

Health expenditures are critically important for countries' economic sustainability and for the effectiveness of their health policies. Modeling these expenditures accurately is a complex process that requires approaches beyond classical regression methods. This study aimed to predict per capita health expenditures with machine learning and regularized regression methods, using 2022 World Bank data for 190 countries.
To ensure data completeness, missing values were filled in with the Multiple Imputation by Chained Equations (MICE) method. The dependent variable is per capita health expenditure, and the independent variables comprise socioeconomic and demographic indicators. Six models were compared using the RMSE, MAE, and R² criteria: Support Vector Regression (SVR), Random Forests (RF), Extreme Gradient Boosting (XGBoost), Elastic Net, Lasso, and Ridge regression. The best performance was obtained with the SVR model (RMSE = 463 ± 13.3, R² = 0.940 ± 0.003). The XGBoost model reached the lowest MAE (262 ± 15.5) and high accuracy (R² = 0.923 ± 0.007). GDP per capita was the strongest predictor, while the share of elderly population, life expectancy, and urbanization rate made secondary contributions. With their high predictive power, the SVR and XGBoost models stand out as valuable decision-support tools for policymakers in forecasting health expenditures.
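The abstract names three regularized regressions without giving their objective functions. For reference, one common way to write them is shown here. The 1/(2n) scaling, the penalty weight λ, and the mixing weight α are conventions assumed for this sketch, not details taken from the paper.

```latex
\begin{aligned}
\text{Ridge:}\quad
  & \hat{\beta} = \arg\min_{\beta}\ \frac{1}{2n}\lVert y - X\beta \rVert_2^2
    + \lambda \lVert \beta \rVert_2^2 \\
\text{Lasso:}\quad
  & \hat{\beta} = \arg\min_{\beta}\ \frac{1}{2n}\lVert y - X\beta \rVert_2^2
    + \lambda \lVert \beta \rVert_1 \\
\text{Elastic Net:}\quad
  & \hat{\beta} = \arg\min_{\beta}\ \frac{1}{2n}\lVert y - X\beta \rVert_2^2
    + \lambda \left( \alpha \lVert \beta \rVert_1
    + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right),
    \qquad 0 \le \alpha \le 1
\end{aligned}
```

Under this parameterization, α = 1 recovers the Lasso and α = 0 recovers Ridge up to a rescaling of λ. The L1 term can shrink coefficients exactly to zero, whereas the squared L2 term only shrinks them toward zero.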

Ethical Statement

All procedures performed in the study comply with the ethical standards of comparable institutional and/or national research committees.

Supporting Institution

No support was received from any institution or organization.

Predicting Global Health Expenditures Using Machine Learning and Regularized Regression Methods

Abstract

Health expenditures are crucial for countries' economic sustainability and the effectiveness of their health policies. Accurately modeling these expenditures is complex and requires methods beyond classical regression. This study aimed to predict per capita health expenditures using machine learning and regularized regression approaches, based on 2022 World Bank data from 190 countries.
Missing values were imputed using the Multiple Imputation by Chained Equations (MICE) method. The dependent variable was per capita health expenditure, while the independent variables included socioeconomic and demographic indicators. Six models were compared using RMSE, MAE, and R²: Support Vector Regression (SVR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Elastic Net, Lasso, and Ridge. SVR achieved the best overall performance (RMSE = 463 ± 13.3, R² = 0.940 ± 0.003), while XGBoost yielded the lowest MAE (262 ± 15.5) with a high R² (0.923 ± 0.007). GDP per capita was the most important predictor, followed by the proportion of elderly population, life expectancy, and urbanization rate. SVR and XGBoost demonstrated high predictive power, highlighting their potential as decision-support tools for policymakers forecasting health expenditures.
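The three comparison metrics can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the function names and all numbers below are hypothetical and are not data or code from the study. The abstract does not say how the ± values were obtained, so only the point metrics are shown.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed values and two hypothetical sets of predictions.
y_true = [100.0, 200.0, 300.0, 400.0]
pred_a = [100.0, 200.0, 300.0, 440.0]  # exact on three points, one large miss
pred_b = [115.0, 185.0, 315.0, 385.0]  # moderate miss on every point

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, rmse(y_true, pred), mae(y_true, pred), r2(y_true, pred))
```

In this toy case, predictor A has the lower MAE while predictor B has the lower RMSE, because RMSE weights large residuals more heavily. The same mechanism allows one model to rank first on RMSE and another on MAE, as reported for SVR and XGBoost.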

Ethical Statement

All procedures performed in this study comply with the ethical standards of comparable institutional and/or national research committees.

Supporting Institution

No support was received from any institution or organization.

References

  • Acemoglu, D., Finkelstein, A., & Notowidigdo, M. J. (2013). Income and health spending: Evidence from oil price shocks. Review of Economics and Statistics, 95(4), 1079–1095.
  • Azur, M. J., Stuart, E. A., Frangakis, C., & Leaf, P. J. (2011). Multiple imputation by chained equations: What is it and how does it work? International Journal of Methods in Psychiatric Research, 20(1), 40–49.
  • Baltagi, B. H., & Moscone, F. (2010). Health care expenditure and income in the OECD reconsidered: Evidence from panel data. Economic Modelling, 27(4), 804–811.
  • Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
  • Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794. https://doi.org/10.1145/2939672.2939785
  • Çınaroğlu, S. (2017). Sağlık harcamasının tahmininde makine öğrenmesi regresyon yöntemlerinin karşılaştırılması. Uludağ Üniversitesi Mühendislik Fakültesi Dergisi, 22(2), 179–197. https://doi.org/10.17482/uumfd.338805
  • Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189–1232. https://doi.org/10.1214/aos/1013203451
  • Gerdtham, U. G., & Jönsson, B. (2000). International comparisons of health expenditure: Theory, data and econometric analysis. In Handbook of health economics (Vol. 1, pp. 11–53).
  • Güleryüz, D. (2021). Predicting health spending in Turkey using the GPR, SVR, and DT models. Acta Infologica, 5(1), 155–166. https://doi.org/10.26650/acin.885940
  • Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
  • Martin, A. B., Hartman, M., Benson, J., Catlin, A., & National Health Expenditure Accounts Team. (2011). National health spending in 2011: Overall growth remains low, but some payers and services show signs of acceleration. Health Affairs, 32(1), 87–99. https://doi.org/10.1377/hlthaff.2012.1206
  • Mihaylova, B., Briggs, A., O’Hagan, A., & Thompson, S. G. (2011). Review of statistical methods for analysing healthcare resources and costs. Health Economics, 20(8), 897–916. https://doi.org/10.1002/hec.1653
  • OECD. (2023a). Health at a Glance 2023: OECD Indicators. OECD Publishing. https://doi.org/10.1787/7a7afb35-en
  • OECD. (2023b). Health spending (indicator). OECD Data. https://doi.org/10.1787/8643de7e-en
  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. John Wiley & Sons. https://doi.org/10.1002/9780470316696
  • Sinha, R., Khandelwal, S., & Deshmukh, P. R. (2016). Determinants of out-of-pocket health expenditure: A systematic review. Journal of Health Management, 18(2), 213–242. https://doi.org/10.1177/0972063416637700
  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
  • Vapnik, V. N. (1995). The nature of statistical learning theory. Springer. https://doi.org/10.1007/978-1-4757-2440-0
  • World Bank. (2023). World Development Indicators: Health expenditure per capita (current US$). The World Bank. https://data.worldbank.org/indicator/SH.XPD.CHEX.PC.CD
  • World Bank, & World Health Organization. (2023). Tracking universal health coverage: 2023 global monitoring report. World Bank / World Health Organization. https://openknowledge.worldbank.org/entities/publication/1ced1b12-896e-49f1-ab6f-f1a95325f39b
  • World Health Organization. (2023a). Global spending on health: Report summary. World Health Organization. https://iris.who.int/bitstream/handle/10665/379750/9789240104495-eng.pdf?isAllowed=y&sequence=1
  • World Health Organization. (2023b). Financial protection. World Health Organization. https://www.who.int/data/gho/indicator-metadata-registry/imr-details/4950
  • Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
There are 23 references in total.

Details

Primary Language: English
Subjects: Econometrics (Other)
Section: Research Article
Authors

Hakan Öztürk 0000-0001-8112-4934

Elvan Hayat 0000-0001-8200-8046

Submission Date: 28 September 2025
Acceptance Date: 16 October 2025
Publication Date: 29 December 2025
Published in Issue: Year 2025, Volume: 12, Issue: 2

Cite

APA: Öztürk, H., & Hayat, E. (2025). Predicting Global Health Expenditures Using Machine Learning and Regularized Regression Methods. Pamukkale Üniversitesi İşletme Araştırmaları Dergisi, 12(2), 462–476. https://doi.org/10.47097/piar.1792425

The copyrights of articles published in Pamukkale Üniversitesi İşletme Araştırmaları Dergisi are covered by the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC-ND 4.0).