Healthcare insurance costs are a significant concern for individuals and providers alike. Accurately predicting these costs can assist in financial planning and risk assessment. This study explores machine learning ensemble methods for predicting healthcare insurance costs from several factors: age, sex, body mass index (BMI), number of children, smoking status, and region. Additionally, new features were derived from the mean and standard deviation of BMI and smoking habits, two factors known to affect insurance costs substantially.
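The abstract does not spell out how the mean and standard deviation features were computed. One plausible reading, shown here purely as an illustrative sketch, is to attach the per-group mean and standard deviation of BMI for smokers and non-smokers to each record. The column names (`bmi`, `smoker`), the helper `add_bmi_group_stats`, and the toy rows are assumptions made for this example and are not taken from the study.

```python
import pandas as pd

# Toy rows with hypothetical column names; this is NOT the study's dataset.
df = pd.DataFrame({
    "age": [19, 33, 45, 52, 28, 61],
    "bmi": [27.9, 22.7, 30.1, 35.3, 24.0, 29.0],
    "smoker": ["yes", "no", "no", "yes", "no", "yes"],
})


def add_bmi_group_stats(frame, group_col="smoker", value_col="bmi"):
    """Return a copy with the per-group mean and sample std of value_col.

    Assumed interpretation of the abstract's "mean and standard deviation
    of BMI and smoking habits"; the authors' exact recipe may differ.
    """
    out = frame.copy()
    grouped = out.groupby(group_col)[value_col]
    # transform() broadcasts each group's statistic back to its rows.
    out[f"{value_col}_mean_by_{group_col}"] = grouped.transform("mean")
    out[f"{value_col}_std_by_{group_col}"] = grouped.transform("std")
    return out


engineered = add_bmi_group_stats(df)
print(engineered)
```

If features like these are used for modelling, the group statistics would normally be computed on the training split only and then mapped onto the test split, so that no information from held-out rows leaks into the features.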
The study began with a comprehensive statistical analysis of the dataset, followed by feature engineering to enhance its predictive power. The categorical variables (sex, smoking status, and region) were encoded numerically. Two datasets were constructed: one containing all the original features, and the other containing the engineered features. Ensemble learning methods, including Bagging, Stacking, and the proposed MedCost-AdaBoost model, were used to predict insurance costs on both datasets. The results showed that the MedCost-AdaBoost model outperformed the other methods, achieving lower Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) values along with higher R-squared (R²) scores.
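The workflow described above (encode categorical variables, fit several ensemble regressors, compare MAE, RMSE, and R²) might be sketched with scikit-learn roughly as follows. This is a sketch under stated assumptions: the data are synthetic, the hyperparameters are arbitrary, and a plain `AdaBoostRegressor` stands in for MedCost-AdaBoost, whose configuration is not given in the abstract.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import AdaBoostRegressor, BaggingRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): same column roles as the abstract,
# but the values and the cost formula are invented for illustration.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "bmi": rng.normal(30, 6, n),
    "children": rng.integers(0, 5, n),
    "sex": rng.choice(["female", "male"], n),
    "smoker": rng.choice(["yes", "no"], n, p=[0.2, 0.8]),
    "region": rng.choice(["ne", "nw", "se", "sw"], n),
})
y = (250 * X["age"] + 300 * X["bmi"] + 500 * X["children"]
     + 20000 * (X["smoker"] == "yes") + rng.normal(0, 2000, n)).to_numpy()

categorical = ["sex", "smoker", "region"]


def make_pipeline(model):
    """One-hot encode categorical columns, pass numeric ones through."""
    encode = ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
        remainder="passthrough",
        sparse_threshold=0,  # always produce a dense array
    )
    return Pipeline([("encode", encode), ("model", model)])


models = {
    "Bagging": BaggingRegressor(
        DecisionTreeRegressor(), n_estimators=50, random_state=0),
    "Stacking": StackingRegressor(
        estimators=[("tree", DecisionTreeRegressor(max_depth=5, random_state=0)),
                    ("ridge", Ridge())],
        final_estimator=Ridge(), cv=5),
    # Plain AdaBoost as a stand-in; NOT the authors' MedCost-AdaBoost.
    "AdaBoost": AdaBoostRegressor(
        DecisionTreeRegressor(max_depth=4), n_estimators=100, random_state=0),
}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

results = {}
for name, model in models.items():
    pipe = make_pipeline(model).fit(X_train, y_train)
    pred = pipe.predict(X_test)
    results[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
        "R2": r2_score(y_test, pred),
    }

print(pd.DataFrame(results).T.round(3))
```

Because the data are synthetic, the ranking this prints says nothing about the study's findings; the sketch only shows how such a comparison can be set up on a held-out split.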
These findings underscore the effectiveness of ensemble learning techniques in predicting healthcare insurance costs, with feature engineering playing a crucial role in improving prediction accuracy. Despite certain limitations, such as the dataset size, this study provides valuable insights for researchers and professionals in the healthcare insurance industry. Future research could explore additional factors and larger datasets to further improve predictive models in this domain.
Primary Language | English |
---|---|
Subjects | Computer Software, Software Engineering (Other) |
Journal Section | Research Article |
Authors | |
Early Pub Date | August 23, 2024 |
Publication Date | June 30, 2024 |
Submission Date | October 16, 2023 |
Acceptance Date | January 14, 2024 |
Published in Issue | Year 2024 |
All articles published by EJT are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided the original work and source are appropriately cited.