Research Article

Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application

Volume: 5, Issue: 1, 4 July 2024

Abstract

Credit risk arises when customers fail to meet their obligations on bank loans by the end of the specified term. Technological advances have enabled the use of machine learning methods in many sectors. Adapted to banks' creditworthiness assessment processes, these methods aim to make it easier to identify risky customers. To support the most appropriate evaluation in banks' lending processes, this study balanced imbalanced data sets with resampling techniques and investigated the effect of this balancing on machine learning performance. The German, Australian and HMEQ credit data sets were used in the implementation phase. Machine learning classification methods such as Logistic Regression (LR), K-Nearest Neighbors (KNN), Naive Bayes (NB), Support Vector Machines (SVM), Multilayer Perceptron (MLP), Decision Trees (DT), Random Forests (RF), Gradient Boosting Decision Trees (GBDT), Extremely Randomized Trees, and Hard and Soft Voting were used to detect risky customers. Class imbalance was addressed with resampling and hybrid techniques such as Random Oversampling (ROS), Random Undersampling (RUS), Balanced Bagging Classifier (BBC), SMOTE-Tomek Links and SMOTE-ENN. Performance on the three data sets was examined under four different scenarios. The results show that the hybrid approach, which combines oversampling and undersampling to balance the classes, gave the best classification performance across the machine learning techniques considered.
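The two simplest resampling techniques named in the abstract, Random Oversampling (ROS) and Random Undersampling (RUS), can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `random_oversample` and `random_undersample`, the toy data, and the fixed seed are all assumptions made for illustration, and only the Python standard library is used.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """ROS sketch: duplicate randomly chosen rows of each smaller class
    until every class is as large as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))  # sampling with replacement
            out_labels.append(cls)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """RUS sketch: keep a random subset of each class, sized to the
    smallest class, and discard the rest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):  # sampling without replacement
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 8 non-risky (0) and 2 risky (1) customers.
# The feature values are made up and carry no meaning.
X = [[i, i * 10] for i in range(10)]
y = [0] * 8 + [1] * 2

X_ros, y_ros = random_oversample(X, y)
X_rus, y_rus = random_undersample(X, y)
print(Counter(y_ros))
print(Counter(y_rus))
```

After oversampling both classes have 8 rows, and after undersampling both have 2. In practice, resampling of this kind is usually applied to the training split only, so that the evaluation data keep their original class distribution.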

Details

Primary Language

English

Subjects

Operations Research

Section

Research Article

Early Publication Date

27 June 2024

Publication Date

4 July 2024

Submission Date

13 February 2024

Acceptance Date

27 June 2024

Published in Issue

Year 2024, Volume: 5, Issue: 1

Cite

APA
Milli, M. E. F., Deveci Kocakoç, İ., & Aras, S. (2024). Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application. İzmir Yönetim Dergisi, 5(1), 55-70. https://doi.org/10.56203/iyd.1436742
AMA
1. Milli MEF, Deveci Kocakoç İ, Aras S. Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application. İzmir Yönetim Dergisi. 2024;5(1):55-70. doi:10.56203/iyd.1436742
Chicago
Milli, Migraç Enes Furkan, İpek Deveci Kocakoç, and Serkan Aras. 2024. “Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application”. İzmir Yönetim Dergisi 5 (1): 55-70. https://doi.org/10.56203/iyd.1436742.
EndNote
Milli MEF, Deveci Kocakoç İ, Aras S (1 July 2024) Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application. İzmir Yönetim Dergisi 5 1 55–70.
IEEE
[1] M. E. F. Milli, İ. Deveci Kocakoç, and S. Aras, “Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application”, İzmir Yönetim Dergisi, vol. 5, no. 1, pp. 55–70, July 2024, doi: 10.56203/iyd.1436742.
ISNAD
Milli, Migraç Enes Furkan - Deveci Kocakoç, İpek - Aras, Serkan. “Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application”. İzmir Yönetim Dergisi 5/1 (1 July 2024): 55-70. https://doi.org/10.56203/iyd.1436742.
JAMA
1. Milli MEF, Deveci Kocakoç İ, Aras S. Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application. İzmir Yönetim Dergisi. 2024;5:55–70.
MLA
Milli, Migraç Enes Furkan, et al. “Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application”. İzmir Yönetim Dergisi, vol. 5, no. 1, July 2024, pp. 55-70, doi:10.56203/iyd.1436742.
Vancouver
1. Migraç Enes Furkan Milli, İpek Deveci Kocakoç, Serkan Aras. Investigating the Effect of Class Balancing Methods on the Performance of Machine Learning Techniques: Credit Risk Application. İzmir Yönetim Dergisi. 1 July 2024;5(1):55-70. doi:10.56203/iyd.1436742
