Research Article

Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques

Volume: 16 Number: 1 March 26, 2025

Abstract

Bankruptcy prediction is an essential task in financial risk management, often hindered by challenges such as class imbalance, feature selection, and overfitting. This study investigates the comparative effectiveness of data balancing techniques, specifically focusing on oversampling with SMOTE (Synthetic Minority Over-sampling Technique) and undersampling with Tomek Links, in addressing class imbalance in bankruptcy datasets. A range of machine learning models, including ensemble and boosting algorithms such as Stacking Classifier and XGBoost, were applied to imbalanced, SMOTE-balanced, and Tomek Links-balanced datasets. Dimensionality reduction was performed using Principal Component Analysis (PCA) to enhance computational efficiency and reduce overfitting risks, while hyperparameter optimization was conducted using the Optuna framework to maximize model performance. The findings demonstrate that SMOTE significantly improved classification accuracy and F1 scores, particularly for ensemble-based models, by generating synthetic samples to balance the dataset. In contrast, Tomek Links often reduced model performance due to the removal of potentially informative data points. Among the models tested, the Stacking Classifier performed best on SMOTE-balanced data, achieving a prediction accuracy of 99%. These results support integrating advanced predictive tools into financial decision-making. The Stacking Classifier’s strong performance on SMOTE-balanced data enhances risk management systems, enabling proactive bankruptcy detection.
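The two balancing strategies compared in the abstract can be illustrated with a minimal pure-Python sketch. This is not the study's implementation: the function names, the toy data, the choice of `k`, and the convention that label 0 is the majority (non-bankrupt) class are all assumptions made for illustration. In practice a library such as imbalanced-learn would normally be used. The sketch shows why SMOTE adds information-preserving synthetic points (interpolation between minority neighbours) while Tomek Links removes real majority samples that sit next to a minority sample.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: euclidean(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest
    neighbours and carry different class labels."""
    def nearest(i):
        return min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: euclidean(X[i], X[j]),
        )

    nn = [nearest(i) for i in range(len(X))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and y[i] != y[j]
    ]


def remove_majority_in_links(X, y, majority_label=0):
    """Tomek Links undersampling: drop the majority member of each link."""
    drop = set()
    for i, j in tomek_links(X, y):
        drop.add(i if y[i] == majority_label else j)
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: label 0 = majority, label 1 = minority.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
     [2.0, 2.0], [2.1, 2.0], [4.0, 4.0], [4.0, 5.0]]
y = [0, 0, 0, 0, 0, 1, 1, 1]

# Oversampling: add two synthetic minority points (5 vs 3 becomes 5 vs 5).
minority = [x for x, label in zip(X, y) if label == 1]
X_smote = X + smote(minority, n_new=2)
y_smote = y + [1, 1]

# Undersampling: remove majority points that form Tomek links.
X_tomek, y_tomek = remove_majority_in_links(X, y)
```

On this toy set, SMOTE keeps every original sample and enlarges the minority class, whereas Tomek Links discards a genuine majority observation near the class boundary. That difference is consistent with the abstract's explanation that undersampling can remove potentially informative data points.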

Details

Primary Language

English

Subjects

Machine Learning (Other)

Journal Section

Research Article

Early Pub Date

March 26, 2025

Publication Date

March 26, 2025

Submission Date

December 6, 2024

Acceptance Date

March 5, 2025

Published in Issue

Year 2025 Volume: 16 Number: 1

APA
Sinap, V. (2025). Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi, 16(1), 97-113. https://doi.org/10.24012/dumf.1597564
AMA
1. Sinap V. Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques. DUJE. 2025;16(1):97-113. doi:10.24012/dumf.1597564
Chicago
Sinap, Vahid. 2025. “Bankruptcy Prediction With Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques”. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi 16 (1): 97-113. https://doi.org/10.24012/dumf.1597564.
EndNote
Sinap V (March 1, 2025) Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi 16 1 97–113.
IEEE
[1] V. Sinap, “Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques”, DUJE, vol. 16, no. 1, pp. 97–113, Mar. 2025, doi: 10.24012/dumf.1597564.
ISNAD
Sinap, Vahid. “Bankruptcy Prediction With Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques”. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi 16/1 (March 1, 2025): 97-113. https://doi.org/10.24012/dumf.1597564.
JAMA
1. Sinap V. Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques. DUJE. 2025;16:97–113.
MLA
Sinap, Vahid. “Bankruptcy Prediction With Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques”. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi, vol. 16, no. 1, Mar. 2025, pp. 97-113, doi:10.24012/dumf.1597564.
Vancouver
1. Vahid Sinap. Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques. DUJE. 2025 Mar. 1;16(1):97-113. doi:10.24012/dumf.1597564