Research Article

Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction

Year 2018, Volume: 31 Issue: 3, 775 - 787, 01.09.2018

Abstract





Selection of strong features is a crucial problem in machine learning. It is also regarded as an unavoidable step for reducing the number of variables in the original feature space, which improves classification performance, lowers computational complexity, and reduces memory consumption. In this work, a novel framework that selects a candidate feature set by constructing a graph from Symmetrical Uncertainty (SU) and Correlation Coefficient (CCE) scores is presented. The nominated features are grouped into a limited number of clusters by evaluating their CCE with respect to the feature holding the highest SU score. Within each cluster, the feature with the highest SU score is retained while the remaining features in that cluster are discarded. The presented methodology was evaluated on ten (10) well-known data sets. Experimental results confirm that the presented method surpasses most traditional feature selection methods in accuracy. The framework is assessed using Lazy, Tree Based, Naive Bayes, and Rule Based learners.
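For readers who want a concrete picture of the pipeline the abstract describes, the sketch below gives one plausible reading of it in Python: features become graph nodes, an edge links two features whose absolute correlation exceeds a threshold, connected components of that graph serve as clusters, and only the feature with the highest SU score survives from each component. The function names, the correlation threshold, the equal-width discretization used for SU, and the component-based clustering are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np
from scipy.stats import entropy

def symmetrical_uncertainty(x, y, bins=10):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), computed on discretized values."""
    x_d = np.digitize(x, np.histogram_bin_edges(x, bins=bins))
    joint = np.histogram2d(x_d, y, bins=(bins, len(np.unique(y))))[0]
    joint_p = joint / joint.sum()
    px, py = joint_p.sum(axis=1), joint_p.sum(axis=0)
    h_x, h_y = entropy(px, base=2), entropy(py, base=2)
    h_xy = entropy(joint_p.ravel(), base=2)
    mi = h_x + h_y - h_xy                      # mutual information I(X; Y)
    return 0.0 if h_x + h_y == 0 else 2.0 * mi / (h_x + h_y)

def select_features(X, y, corr_threshold=0.7):
    """Return indices of one representative (highest-SU) feature per correlation cluster."""
    n = X.shape[1]
    su = np.array([symmetrical_uncertainty(X[:, j], y) for j in range(n)])
    # Feature-feature |correlation| matrix; constant columns yield NaN, treated as 0.
    corr = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    adj = corr >= corr_threshold               # graph edge between strongly correlated features

    # A cluster is a connected component of the correlation graph (simple traversal).
    unvisited, selected = set(range(n)), []
    while unvisited:
        start = unvisited.pop()
        component, queue = {start}, [start]
        while queue:
            u = queue.pop()
            for v in list(unvisited):
                if adj[u, v]:
                    unvisited.remove(v)
                    component.add(v)
                    queue.append(v)
        # Keep only the highest-SU feature of the cluster; discard the rest.
        selected.append(max(component, key=lambda j: su[j]))
    return sorted(selected)
```

Given a dataset loaded as NumPy arrays, select_features(X, y) returns the surviving column indices, which could then be fed to any of the Lazy, Tree Based, Naive Bayes, or Rule Based learners mentioned above.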

References

  • 1. Liao, S.H., Chu, P.H. and Hsiao, P.Y., 2012. Data mining techniques and applications–A decade review from 2000 to 2011. Expert systems with applications, 39(12), pp.11303-11311.
  • 2. Kamal, N.A.M., Bakar, A.A. and Zainudin, S., 2015, August. Filter-wrapper approach to feature selection of GPCR protein. In Electrical Engineering and Informatics (ICEEI), 2015 International Conference on (pp. 693-698). IEEE.
  • 3. Chatcharaporn, K., Kittidachanupap, N., Kerdprasop, K. and Kerdprasop, N., 2012. Comparison of feature selection and classification algorithms for restaurant dataset classification. In Proceedings of the 11th Conference on Latest Advances in Systems Science & Computational Intelligence (pp. 129-134).
  • 4. Chandrashekar, G. and Sahin, F., 2014. A survey on feature selection methods. Computers & Electrical Engineering, 40(1), pp.16-28.
  • 5. Lin, K.C., Zhang, K.Y., Huang, Y.H., Hung, J.C. and Yen, N., 2016. Feature selection based on an improved cat swarm optimization algorithm for big data classification. The Journal of Supercomputing, 72(8), pp.3210-3221.
  • 6. Panthong, R. and Srivihok, A., 2015. Wrapper Feature Subset Selection for Dimension Reduction Based on Ensemble Learning Algorithm. Procedia Computer Science, 72, pp.162-169.
  • 7. Maldonado, S. and Weber, R., 2011, November. Embedded Feature Selection for Support Vector Machines: State-of-the-Art and Future Challenges. In CIARP (pp. 304-311).
  • 8. Fonti, V. and Belitser, E., 2017. Feature Selection using LASSO. https://beta.vu.nl/nl/Images/werkstuk-fonti_tcm235-836234.pdf
  • 9. Lebedev, A.V., Westman, E., Van Westen, G.J.P., Kramberger, M.G., Lundervold, A., Aarsland, D., Soininen, H., Kłoszewska, I., Mecocci, P., Tsolaki, M. and Vellas, B., 2014. Random Forest ensembles for detection and prediction of Alzheimer's disease with a good between-cohort robustness. NeuroImage: Clinical, 6, pp.115-125.
  • 10. Hsu, H.H., Hsieh, C.W. and Lu, M.D., 2008, November. A Hybrid Feature Selection Mechanism. In Intelligent Systems Design and Applications, 2008. ISDA'08. Eighth International Conference on (Vol. 2, pp. 271-276). IEEE.
  • 11. Potharaju, S.P. and Sreedevi, M., 2017. A Novel Cluster of Feature Selection Method Based on Information Gain. IJCTA, 10(14), pp.9-16.
  • 12. Peng, H., Long, F. and Ding, C., 2005. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on pattern analysis and machine intelligence, 27(8), pp.1226-1238.
  • 13. Ding, C. and Peng, H., 2005. Minimum redundancy feature selection from microarray gene expression data. Journal of bioinformatics and computational biology, 3(02), pp.185-205.
  • 14. Yuan, M., Yang, Z., Huang, G. and Ji, G., 2017. Feature selection by maximizing correlation information for integrated high-dimensional protein data. Pattern Recognition Letters, 92, pp.17-24.
  • 15. Partila, P., Voznak, M. and Tovarek, J., 2015. Pattern recognition methods and features selection for speech emotion recognition system. The Scientific World Journal, 2015.
  • 16. Koprinska, I., Rana, M. and Agelidis, V.G., 2015. Correlation and instance based feature selection for electricity load forecasting. Knowledge-Based Systems, 82, pp.29-40.
  • 17. Song, Q., Ni, J. and Wang, G., 2013. A fast clustering-based feature subset selection algorithm for high-dimensional data. IEEE transactions on knowledge and data engineering, 25(1), pp.1-14.
There are 17 citations in total.

Details

Journal Section Computer Engineering
Authors

Sai Prasad Potharaju

Marriboyina Sreedevi

Publication Date September 1, 2018
Published in Issue Year 2018 Volume: 31 Issue: 3

Cite

APA Potharaju, S. P., & Sreedevi, M. (2018). Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction. Gazi University Journal of Science, 31(3), 775-787.
AMA Potharaju SP, Sreedevi M. Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction. Gazi University Journal of Science. September 2018;31(3):775-787.
Chicago Potharaju, Sai Prasad, and Marriboyina Sreedevi. “Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction”. Gazi University Journal of Science 31, no. 3 (September 2018): 775-87.
EndNote Potharaju SP, Sreedevi M (September 1, 2018) Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction. Gazi University Journal of Science 31 3 775–787.
IEEE S. P. Potharaju and M. Sreedevi, “Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction”, Gazi University Journal of Science, vol. 31, no. 3, pp. 775–787, 2018.
ISNAD Potharaju, Sai Prasad - Sreedevi, Marriboyina. “Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction”. Gazi University Journal of Science 31/3 (September 2018), 775-787.
JAMA Potharaju SP, Sreedevi M. Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction. Gazi University Journal of Science. 2018;31:775–787.
MLA Potharaju, Sai Prasad and Marriboyina Sreedevi. “Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction”. Gazi University Journal of Science, vol. 31, no. 3, 2018, pp. 775-87.
Vancouver Potharaju SP, Sreedevi M. Correlation Coefficient Based Candidate Feature Selection Framework Using Graph Construction. Gazi University Journal of Science. 2018;31(3):775-87.