Research Article

Comparative Analysis of Machine Learning Algorithms Based on Variable Importance Evaluation

Volume: 2, Number: 2, December 21, 2021

Abstract

A central goal of many machine learning studies is to identify the variables that matter most for a specific research problem. Various algorithms have been developed for this purpose, and Random Forest, Cubist, and MARS are among the most common. Although classical statistical methods are useful, to a degree, for quantifying how strongly each variable affects the output, machine learning algorithms may provide clearer and more precise results. In this study, the estimation results of Random Forest, Cubist, and MARS are compared on a real data set using performance criteria such as mean squared error, the coefficient of determination, and mean absolute error. The results show that Random Forest and Cubist perform similarly to each other and better than MARS. Additionally, the ranking of the most important variables varies with the algorithm. The concordance between the algorithms is investigated from a statistical perspective and found to be satisfactory. Consequently, Random Forest, Cubist, and MARS can be considered effective and reasonable algorithms for both estimation performance and variable importance evaluation.
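The comparison criteria named in the abstract can be illustrated with a minimal Python sketch. It computes mean squared error, mean absolute error, and the coefficient of determination for one set of predictions, and then measures how closely several algorithms agree on a variable importance ranking. The abstract does not say which concordance statistic was used, so Kendall's W (without tie correction) is an assumption here. The function names and all numbers are hypothetical and are not taken from the study.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kendalls_w(rankings: Sequence[Sequence[int]]) -> float:
    """Kendall's W for m rankings of the same n items, assuming no ties.

    Each inner sequence holds the ranks (1 = most important) that one
    algorithm assigns to the n variables, in a fixed variable order.
    W is 1 for identical rankings and 0 when all rank totals are equal.
    """
    m = len(rankings)
    n = len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical observed values and predictions from one model.
y_obs = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 7.0, 8.0]
print(mse(y_obs, y_hat), mae(y_obs, y_hat), r_squared(y_obs, y_hat))

# Hypothetical importance ranks of four variables under three algorithms.
ranks = [
    [1, 2, 3, 4],  # e.g. Random Forest
    [1, 3, 2, 4],  # e.g. Cubist
    [2, 1, 3, 4],  # e.g. MARS
]
print(kendalls_w(ranks))
```

A value of W close to 1 would indicate that the algorithms order the variables in nearly the same way, even if the individual ranks differ.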

Keywords

Cubist, Random Forest, Machine Learning, MARS, Variable Importance

APA
Yıldırım, H. (2021). Comparative Analysis of Machine Learning Algorithms Based on Variable Importance Evaluation. Journal of Science, Technology and Engineering Research, 2(2), 46-53. https://doi.org/10.53525/jster.988672