Machine learning techniques can identify non-linear patterns in a dataset and uncover hidden relationships. Random forest is a modern machine learning technique that provides an alternative to traditional classification methods such as logistic regression. This study aims to compare the prediction performance of logistic regression with that of random forest and to identify the predictors of public health outcomes at the provincial level. The data, covering the 81 provinces of Turkey, are taken from the Turkish Statistical Institute for the year 2013. Life expectancy at birth and mortality are chosen as the public health outcomes. Three random forest models are constructed with different numbers of trees: 50, 100, and 150. The prediction results of each method are recorded while varying the “k” parameter of k-fold cross-validation from 3 to 20. The area under the ROC curve (AUC), sensitivity, and specificity are used as performance measures. The results reveal that the differences in performance between the prediction models are statistically significant (p<0.001). Moreover, logistic regression outperformed the random forest models. The decision tree graphs show that the most important predictor variable for mortality is the total number of beds and, for life expectancy at birth, the percentage of higher education graduates. In light of this study, health professionals are strongly encouraged to become more aware of the growing potential of modern prediction methods in health services research.
Keywords: Machine learning, logistic regression, random forest, health outcomes
Primary Language | English |
---|---|
Subjects | Health Institutions Management |
Section | Articles |
Authors | |
Publication Date | March 19, 2020 |
Published in Issue | Year 2020 Volume: 23 Issue: 1 |
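
The evaluation setup described in the abstract (AUC, sensitivity, and specificity, recorded for k-fold cross-validation with k from 3 to 20 over 81 provinces) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the contiguous fold assignment, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the models themselves are left out.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples: int, k: int) -> List[List[int]]:
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds: List[List[int]] = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Toy data (hypothetical, not taken from the study).
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
toy_auc = auc(y_true, scores)

# Fold sizes for 81 provinces as k varies from 3 to 20, as in the abstract.
fold_sizes = {k: [len(f) for f in k_fold_indices(81, k)] for k in range(3, 21)}
```

In practice the rows would be shuffled before splitting, and each fold's held-out predictions would come from a logistic regression or random forest model fitted on the remaining folds; a library such as scikit-learn provides these pieces ready-made.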