Healthcare insurance costs are a significant concern for individuals and providers alike. Predicting these costs accurately can support financial planning and risk assessment. This study explores machine learning ensemble methods for predicting healthcare insurance costs from age, sex, body mass index (BMI), number of children, smoking status, and region. In addition, new features were derived from the mean and standard deviation of BMI and smoking habits, two factors known to affect insurance costs substantially.
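The abstract does not say how the mean and standard deviation features were computed. As a minimal sketch of one plausible reading, the snippet below attaches the per-group mean and sample standard deviation of BMI, grouped by smoking status, to each record. The grouping choice, the function name `add_bmi_group_stats`, the new column names, and the sample values are all assumptions made for illustration and are not taken from the study.

```python
from statistics import mean, stdev

# Made-up illustrative records; only the field names follow the abstract.
records = [
    {"age": 40, "bmi": 30.0, "smoker": "yes"},
    {"age": 25, "bmi": 22.0, "smoker": "no"},
    {"age": 31, "bmi": 24.0, "smoker": "no"},
    {"age": 52, "bmi": 26.0, "smoker": "yes"},
    {"age": 47, "bmi": 26.0, "smoker": "no"},
]


def add_bmi_group_stats(rows, group_key="smoker", value_key="bmi"):
    """Return copies of rows with the group mean and sample std of value_key.

    Assumption: statistics are computed per level of group_key. A group with
    a single member gets a standard deviation of 0.0.
    """
    # Collect the values of value_key for each group.
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[value_key])

    # Compute (mean, sample standard deviation) once per group.
    stats = {
        name: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for name, values in groups.items()
    }

    # Attach the group statistics to every row as two new features.
    engineered_rows = []
    for row in rows:
        group_mean, group_std = stats[row[group_key]]
        engineered_rows.append(
            {**row, "bmi_group_mean": group_mean, "bmi_group_std": group_std}
        )
    return engineered_rows


engineered = add_bmi_group_stats(records)
```

In practice such statistics would be computed on the training split only and then mapped onto the test split, so that no information leaks from the held-out data into the features.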
The study began with a comprehensive statistical analysis of the dataset, followed by feature engineering to enhance its predictive power. The categorical variables (sex, smoking status, and region) were encoded. Two datasets were constructed: one containing all the original features and one containing the engineered features. Ensemble learning methods, including Bagging, Stacking, and the proposed MedCost-AdaBoost model, were used to predict insurance costs on both datasets. The MedCost-AdaBoost model outperformed the other methods, achieving lower Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) values and higher R-squared (R2) scores.
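For readers less familiar with the three evaluation metrics named above, the following sketch computes MAE, RMSE, and R2 from their standard definitions. The target values and the two prediction lists are made up purely to show how a better model scores lower on MAE and RMSE and higher on R2; they are not results from the study.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """R-squared: 1 - SS_res / SS_tot, relative to predicting the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical costs and predictions from two hypothetical models.
y_true = [100.0, 200.0, 300.0, 400.0]
pred_a = [110.0, 190.0, 310.0, 390.0]  # close to the targets
pred_b = [150.0, 150.0, 350.0, 350.0]  # further from the targets

scores = {
    name: {"MAE": mae(y_true, pred), "RMSE": rmse(y_true, pred), "R2": r2(y_true, pred)}
    for name, pred in {"model_a": pred_a, "model_b": pred_b}.items()
}
```

Because RMSE squares the residuals before averaging, it penalizes large individual errors more heavily than MAE, which is one reason the two are often reported together.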
These findings underscore the effectiveness of ensemble learning techniques in predicting healthcare insurance costs, with feature engineering playing a crucial role in improving prediction accuracy. Despite certain limitations, such as the dataset size, this study provides valuable insights for researchers and professionals in the healthcare insurance industry. Future research could explore additional factors and larger datasets to further improve predictive models in this domain.
Keywords: Medical cost, machine learning, ensemble methods, feature engineering, prediction accuracy
Primary Language | English |
---|---|
Subjects | Computer Software, Software Engineering (Other) |
Section | Research Article |
Authors | |
Early View Date | August 23, 2024 |
Publication Date | June 30, 2024 |
Submission Date | October 16, 2023 |
Acceptance Date | January 14, 2024 |
Published in Issue | Year 2024 |
All articles published by EJT are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided the original work and source are appropriately cited.