Research Article

Comparison of Machine Learning Regression Methods to Predict Health Expenditures

Volume: 22 Number: 2 September 19, 2017

Abstract

Experimental studies on different datasets have led to the recommendation that machine learning regression methods be used as an alternative to classical regression methods when variables are difficult to model. Health expenditure is one such indicator, and no study in the literature compares machine learning regression methods for modelling it. In this study, a multiple regression model was built to predict health expenditure per capita. The performance of Lasso Regression, Random Forest Regression and Support Vector Regression was compared across different hyperparameter values: the lambda (λ) value for Lasso Regression, the number of trees for Random Forest Regression, and the epsilon (ε) value for Support Vector Regression. Results obtained with k-fold cross validation, with k varied from 5 to 50, indicate that the differences between the methods in terms of R², RMSE and MAE are statistically significant (p < 0.001). Surface and bar plots and statistical test results show that Random Forest Regression (R² > 0.7500, RMSE ≤ 0.6000 and MAE ≤ 0.4000) has better prediction performance across the different hyperparameter values. It is hoped that these results will contribute to future work on determining optimal hyperparameter values for machine learning regression methods when modelling health expenditures.
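The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the study's code: it assumes scikit-learn, uses synthetic data from `make_regression` in place of the health expenditure dataset, and fixes one arbitrary hyperparameter value per method (`alpha` standing in for λ, `n_estimators` for the number of trees, `epsilon` for ε). The function name `compare_models` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, cross_validate
from sklearn.svm import SVR


def compare_models(X, y, k=5, seed=0):
    """Return mean R2, RMSE and MAE per model under k-fold cross validation."""
    # Hyperparameter values are illustrative assumptions, not the study's grid.
    models = {
        "lasso": Lasso(alpha=0.1),  # alpha plays the role of lambda
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
        "svr": SVR(epsilon=0.1),
    }
    folds = KFold(n_splits=k, shuffle=True, random_state=seed)
    scoring = {
        "r2": "r2",
        "mse": "neg_mean_squared_error",
        "mae": "neg_mean_absolute_error",
    }
    results = {}
    for name, model in models.items():
        cv = cross_validate(model, X, y, cv=folds, scoring=scoring)
        results[name] = {
            "R2": float(np.mean(cv["test_r2"])),
            # scikit-learn reports negated errors, so flip the sign back.
            "RMSE": float(np.mean(np.sqrt(-cv["test_mse"]))),
            "MAE": float(np.mean(-cv["test_mae"])),
        }
    return results


# Synthetic stand-in data; the target is standardized so errors are on a unit scale.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
y = (y - y.mean()) / y.std()

results = compare_models(X, y, k=5)
for name, metrics in results.items():
    print(name, {m: round(v, 4) for m, v in metrics.items()})
```

To approximate the design in the abstract, `compare_models` could be called in a loop over `k` from 5 to 50 and over a range of values for each hyperparameter, collecting the per-fold scores for a significance test rather than only their means.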

Details

Primary Language

Turkish

Subjects

Engineering

Journal Section

Research Article

Publication Date

September 19, 2017

Submission Date

March 7, 2016

Acceptance Date

August 19, 2017

Published in Issue

Year 2017 Volume: 22 Number: 2

APA
Çınaroğlu, S. (2017). SAĞLIK HARCAMASININ TAHMİNİNDE MAKİNE ÖĞRENMESİ REGRESYON YÖNTEMLERİNİN KARŞILAŞTIRILMASI. Uludağ Üniversitesi Mühendislik Fakültesi Dergisi, 22(2), 179-200. https://doi.org/10.17482/uumfd.338805
AMA
1. Çınaroğlu S. SAĞLIK HARCAMASININ TAHMİNİNDE MAKİNE ÖĞRENMESİ REGRESYON YÖNTEMLERİNİN KARŞILAŞTIRILMASI. UUJFE. 2017;22(2):179-200. doi:10.17482/uumfd.338805
Chicago
Çınaroğlu, Songül. 2017. “SAĞLIK HARCAMASININ TAHMİNİNDE MAKİNE ÖĞRENMESİ REGRESYON YÖNTEMLERİNİN KARŞILAŞTIRILMASI”. Uludağ Üniversitesi Mühendislik Fakültesi Dergisi 22 (2): 179-200. https://doi.org/10.17482/uumfd.338805.
EndNote
Çınaroğlu S (August 1, 2017) SAĞLIK HARCAMASININ TAHMİNİNDE MAKİNE ÖĞRENMESİ REGRESYON YÖNTEMLERİNİN KARŞILAŞTIRILMASI. Uludağ Üniversitesi Mühendislik Fakültesi Dergisi 22 2 179–200.
IEEE
[1] S. Çınaroğlu, “SAĞLIK HARCAMASININ TAHMİNİNDE MAKİNE ÖĞRENMESİ REGRESYON YÖNTEMLERİNİN KARŞILAŞTIRILMASI”, UUJFE, vol. 22, no. 2, pp. 179–200, Aug. 2017, doi: 10.17482/uumfd.338805.
ISNAD
Çınaroğlu, Songül. “SAĞLIK HARCAMASININ TAHMİNİNDE MAKİNE ÖĞRENMESİ REGRESYON YÖNTEMLERİNİN KARŞILAŞTIRILMASI”. Uludağ Üniversitesi Mühendislik Fakültesi Dergisi 22/2 (August 1, 2017): 179-200. https://doi.org/10.17482/uumfd.338805.
JAMA
1. Çınaroğlu S. SAĞLIK HARCAMASININ TAHMİNİNDE MAKİNE ÖĞRENMESİ REGRESYON YÖNTEMLERİNİN KARŞILAŞTIRILMASI. UUJFE. 2017;22:179–200.
MLA
Çınaroğlu, Songül. “SAĞLIK HARCAMASININ TAHMİNİNDE MAKİNE ÖĞRENMESİ REGRESYON YÖNTEMLERİNİN KARŞILAŞTIRILMASI”. Uludağ Üniversitesi Mühendislik Fakültesi Dergisi, vol. 22, no. 2, Aug. 2017, pp. 179-200, doi:10.17482/uumfd.338805.
Vancouver
1. Songül Çınaroğlu. SAĞLIK HARCAMASININ TAHMİNİNDE MAKİNE ÖĞRENMESİ REGRESYON YÖNTEMLERİNİN KARŞILAŞTIRILMASI. UUJFE. 2017 Aug. 1;22(2):179-200. doi:10.17482/uumfd.338805
