Research Article

Performance Evaluation of Feature Subset Selection Approaches on Rule-Based Learning Algorithms

Volume: 1, Issue: 1, 27 December 2018

Abstract

There are two main approaches to feature subset selection: wrapper-based and filter-based. In the wrapper-based approach, which is a supervised method, the feature subset selection algorithm acts as a wrapper around an induction algorithm. The induction algorithm, which is usually the classifier itself, is treated as a black box by the feature subset selection algorithm. The filter approach is an unsupervised method that attempts to assess the merits of features from the data alone, ignoring the performance of the induction algorithm. This study investigated the effects of both feature subset selection approaches on the classification performance of rule-based learning algorithms, namely C4.5, RIPPER, PART, and BFTree. These algorithms run fast when used in the wrapper-based approach. For various datasets, the wrapper-based feature subset selection method yielded significant accuracy improvements. Other algorithms, such as Multilayer Perceptron (MLP) and Random Forests (RF), were also applied to the same datasets for accuracy comparison. These two algorithms were very time-consuming when used in the wrapper approach.
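The contrast between the two approaches can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the study: the induction algorithm is a 1-nearest-neighbour classifier scored by leave-one-out accuracy, the wrapper uses greedy forward search, the filter ranks features by variance (one possible unsupervised criterion), and the toy data is contrived so that the two approaches disagree. The function names are hypothetical.

```python
from statistics import pvariance


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the columns listed in `features`."""
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        best_dist, best_label = None, None
        for j, (xj, yj) in enumerate(zip(X, y)):
            if i == j:
                continue  # hold out the sample being classified
            dist = sum((xi[f] - xj[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, yj
        correct += best_label == yi
    return correct / len(X)


def wrapper_forward_selection(X, y, evaluate):
    """Wrapper approach: greedy forward search that treats `evaluate`
    (the induction algorithm plus its scoring) as a black box."""
    remaining = set(range(len(X[0])))
    selected, best_score = [], 0.0
    while remaining:
        candidates = [(evaluate(X, y, selected + [f]), f) for f in sorted(remaining)]
        score, feature = max(candidates, key=lambda c: c[0])
        if score <= best_score:
            break  # no candidate improves the score, so stop searching
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


def filter_by_variance(X, k):
    """Filter approach: rank features from the data alone (here by
    variance, an assumed criterion) without running the classifier."""
    n_features = len(X[0])
    scores = [pvariance([row[f] for row in X]) for f in range(n_features)]
    ranked = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    return sorted(ranked[:k])


# Contrived toy data: column 0 separates the classes, column 1 is
# high-variance noise, column 2 is nearly constant.
X = [
    [0.0, 5.0, 1.0],
    [0.1, -4.0, 1.0],
    [0.2, 3.0, 1.1],
    [0.9, -5.0, 1.0],
    [1.0, 4.0, 0.9],
    [1.1, -3.0, 1.0],
]
y = [0, 0, 0, 1, 1, 1]

print("wrapper:", wrapper_forward_selection(X, y, loo_accuracy))
print("filter: ", filter_by_variance(X, 1))
```

Each step of the forward search re-runs the induction algorithm once per remaining feature, so in the worst case the wrapper evaluates on the order of n² subsets for n features. This is one reason the training cost of the wrapped classifier dominates the running time of the wrapper approach.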

Keywords

Rule-based learning, feature extraction, wrapper, filtering

Cite

IEEE
[1] A. Öztürk, “Performance Evaluation of Feature Subset Selection Approaches on Rule-Based Learning Algorithms”, DataSCI, vol. 1, no. 1, pp. 16–20, Dec. 2018. [Online]. Available: https://izlik.org/JA82UP86DR