Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction
Abstract
Selecting strong features is a crucial problem in machine learning. Reducing the number of variables in the original feature space is an unavoidable step toward better classification performance, lower computational complexity, and reduced memory utilization. This work presents a novel framework that uses Symmetrical Uncertainty (SU) and the Correlation Coefficient (CCE) to construct a graph from which the candidate feature set is selected. The nominated features are grouped into a limited number of clusters by evaluating their CCE and considering the feature with the highest SU score. In each cluster, the feature with the highest SU score is selected and the remaining features in that cluster are discarded. The method was evaluated on ten well-known data sets. Experimental results indicate that it outperforms most traditional feature selection methods in accuracy. The framework is assessed using Lazy, Tree Based, Naive Bayes, and Rule Based learners.
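The selection idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the graph construction rule, so the code below assumes discrete (integer-coded) features, uses the Pearson coefficient as the CCE, links two features when the absolute coefficient reaches a hypothetical threshold `cc_threshold`, and forms clusters greedily around the highest-SU feature still unassigned. All function and parameter names are illustrative.

```python
import numpy as np


def entropy(values):
    """Shannon entropy (in bits) of a discrete 1-D array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def symmetrical_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)), a value in [0, 1]."""
    x, y = np.asarray(x), np.asarray(y)
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0
    # Joint entropy H(x, y) from the distinct (x, y) pairs.
    pairs = np.stack([x, y], axis=1)
    _, counts = np.unique(pairs, axis=0, return_counts=True)
    p = counts / counts.sum()
    hxy = float(-(p * np.log2(p)).sum())
    return 2.0 * (hx + hy - hxy) / (hx + hy)


def select_candidate_features(X, y, cc_threshold=0.8):
    """Return (selected feature indices, SU score per feature).

    Assumption: an edge joins features i and j when |Pearson r| >= cc_threshold.
    """
    X = np.asarray(X)
    n_features = X.shape[1]
    su = np.array([symmetrical_uncertainty(X[:, j], y) for j in range(n_features)])

    # Feature-feature correlation graph as a boolean adjacency matrix.
    corr = np.nan_to_num(np.corrcoef(X, rowvar=False))
    adjacency = np.abs(corr) >= cc_threshold

    unassigned = set(range(n_features))
    selected = []
    for j in np.argsort(-su):  # visit features from highest to lowest SU
        j = int(j)
        if j not in unassigned:
            continue  # already absorbed into an earlier cluster
        # Cluster = this feature plus its still-unassigned graph neighbours.
        cluster = {k for k in unassigned if adjacency[j, k]} | {j}
        unassigned -= cluster
        selected.append(j)  # keep only the highest-SU member of the cluster
    return selected, su
```

Because features are visited in descending SU order, the feature that opens each cluster is also its highest-SU member, so keeping it and discarding its neighbours mirrors the "one feature per cluster" rule described above. Continuous features would need to be discretized before computing SU, and the threshold would need tuning per data set.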
Details
Primary Language
English
Journal Section
Research Article
Authors
Sai Prasad Potharaju
K L University
India
Marriboyina Sreedevi
K L University
India
Publication Date
September 1, 2018
Submission Date
September 13, 2017
Acceptance Date
April 16, 2018
Published in Issue
Year 2018 Volume: 31 Number: 3