Research Article

Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques

Volume: 16 Issue: 1, 26 March 2025

Abstract

Bankruptcy prediction is an essential task in financial risk management, often hindered by challenges such as class imbalance, feature selection, and overfitting. This study investigates the comparative effectiveness of data balancing techniques, specifically focusing on oversampling with SMOTE (Synthetic Minority Over-sampling Technique) and undersampling with Tomek Links, in addressing class imbalance in bankruptcy datasets. A range of machine learning models, including ensemble and boosting algorithms such as Stacking Classifier and XGBoost, were applied to imbalanced, SMOTE-balanced, and Tomek Links-balanced datasets. Dimensionality reduction was performed using Principal Component Analysis (PCA) to enhance computational efficiency and reduce overfitting risks, while hyperparameter optimization was conducted using the Optuna framework to maximize model performance. The findings demonstrate that SMOTE significantly improved classification accuracy and F1 scores, particularly for ensemble-based models, by generating synthetic samples to balance the dataset. In contrast, Tomek Links often reduced model performance due to the removal of potentially informative data points. Among the models tested, the Stacking Classifier performed best on SMOTE-balanced data, achieving a prediction accuracy of 99%. These results support integrating advanced predictive tools into financial decision-making. The Stacking Classifier’s strong performance on SMOTE-balanced data enhances risk management systems, enabling proactive bankruptcy detection.
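The two balancing mechanisms contrasted in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the study's code: the toy 2-D points, the helper names `smote_like` and `tomek_links`, and the parameter `k` are assumptions made for illustration, and a real experiment would more likely rely on a maintained library implementation. The first helper interpolates between minority-class neighbours to create synthetic samples, as SMOTE does; the second finds mutual nearest-neighbour pairs with opposite labels, whose majority member Tomek Links undersampling removes.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def tomek_links(points, labels):
    """Return index pairs (i, j) that are mutual nearest neighbours
    and carry different class labels."""
    def nearest(i):
        return min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )

    nn = [nearest(i) for i in range(len(points))]
    return [
        (i, j) for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


# Hypothetical toy data: label 0 = non-bankrupt (majority), 1 = bankrupt (minority)
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.4, 0.4), (1.0, 1.0)]
minority = [(1.1, 1.0), (2.0, 2.0), (2.2, 2.1), (2.1, 1.8)]
points = majority + minority
labels = [0] * len(majority) + [1] * len(minority)

# Oversampling: add synthetic minority points until both classes have equal size
synthetic = smote_like(minority, n_new=len(majority) - len(minority))

# Undersampling: drop the majority member of every Tomek link
links = tomek_links(points, labels)  # -> [(5, 6)]
drop = {i if labels[i] == 0 else j for i, j in links}
kept = [p for idx, p in enumerate(points) if idx not in drop]
```

On this toy set the only Tomek link joins the majority point closest to the class boundary with its minority neighbour, so undersampling discards exactly the kind of borderline observation the abstract describes as potentially informative, while oversampling leaves every original point in place.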


Details

Primary Language

English

Subjects

Machine Learning (Other)

Section

Research Article

Early View Date

26 March 2025

Publication Date

26 March 2025

Submission Date

6 December 2024

Acceptance Date

5 March 2025

Published in Issue

Year 2025 Volume: 16 Issue: 1

Cite

IEEE
[1] V. Sinap, “Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques,” DÜMF MD, vol. 16, no. 1, pp. 97–113, Mar. 2025, doi: 10.24012/dumf.1597564.
All articles published by DUJE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided that the original work and source are appropriately cited.