Research Article

Evaluation of Oversampling Methods (OVER, SMOTE, and ROSE) in Classifying Soil Liquefaction Dataset based on SVM, RF, and Naïve Bayes

Issue: 34, 31 March 2022

Abstract

Class-imbalanced datasets are prevalent in real-world applications, including engineering, medicine, and finance. Machine learning (ML)-based prediction models have been applied successfully to a wide range of problems. However, their application to soil liquefaction under class imbalance remains limited. This paper presents the prediction results of random forest (RF), support vector machine (SVM), and naïve Bayes (NB) algorithms trained with different sample sizes for soil liquefaction. The effect of three oversampling methods, namely simple oversampling (OVER), random oversampling examples (ROSE), and the synthetic minority oversampling technique (SMOTE), on the prediction performance of these classification algorithms is also investigated. Performance is evaluated using Accuracy, Kappa, Precision, Recall, and F-measure. The results show that applying oversampling to imbalanced data before the modeling phase is effective: all of the oversampling methods improved the overall performance of the classification models. SMOTE performed slightly better than the other oversampling methods considered. Furthermore, the SVM model outperformed the RF and NB models when all algorithms were trained on data balanced with SMOTE.
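To make the oversampling idea and the evaluation metrics in the abstract concrete, the following is a minimal Python sketch using only the standard library. It is not the authors' implementation, and the abstract does not say which software was used. The function names (`random_oversample`, `smote_like`, `binary_metrics`), the toy coordinates, and the labels `"yes"`/`"no"` are all hypothetical. `random_oversample` illustrates simple oversampling by duplication, `smote_like` illustrates the core SMOTE step of interpolating between a minority sample and one of its nearest minority neighbours, and `binary_metrics` computes Accuracy, Kappa, Precision, Recall, and F-measure from a binary confusion matrix. ROSE is not shown.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Simple oversampling: duplicate random rows of each smaller class
    until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority, k=2, seed=0):
    """SMOTE-style sketch: add synthetic minority points on the line segment
    between a minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    counts = Counter(y)
    pts = [x for x, lab in zip(X, y) if lab == minority]
    if len(pts) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    need = max(counts.values()) - counts[minority]
    X_out, y_out = list(X), list(y)
    for _ in range(need):
        i = rng.randrange(len(pts))
        base = pts[i]
        others = pts[:i] + pts[i + 1:]
        # Rank the remaining minority points by squared Euclidean distance.
        neighbours = sorted(
            others, key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p))
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
        y_out.append(minority)
    return X_out, y_out


def binary_metrics(y_true, y_pred, positive):
    """Accuracy, Cohen's kappa, precision, recall, and F-measure for one
    positive class, computed from the 2x2 confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    n = len(pairs)
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical toy data: six majority ("no") and two minority ("yes") rows.
    X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
         (2.0, 2.0), (3.0, 1.0), (5.0, 5.0), (6.0, 7.0)]
    y = ["no"] * 6 + ["yes"] * 2
    print(Counter(random_oversample(X, y)[1]))
    print(Counter(smote_like(X, y, "yes")[1]))
```

In a workflow like the one the abstract describes, oversampling would be applied to the training portion only, so that the test data keeps its original class distribution when the metrics are computed.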

Details

Primary Language

English

Subjects

Engineering

Section

Research Article

Publication Date

31 March 2022

Submission Date

23 February 2022

Acceptance Date

23 February 2022

Published in Issue

Year 2022, Issue: 34

Cite

APA
Demir, S., & Şahin, E. K. (2022). Evaluation of Oversampling Methods (OVER, SMOTE, and ROSE) in Classifying Soil Liquefaction Dataset based on SVM, RF, and Naïve Bayes. Avrupa Bilim ve Teknoloji Dergisi, 34, 142-147. https://doi.org/10.31590/ejosat.1077867
