Research Article

An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques

Year 2018, Volume: 31 Issue: 3, 789 - 799, 01.09.2018

Abstract





Clustering is an unsupervised data mining approach. In this research article, we propose an unsupervised approach that combines filter-based feature selection methods with the K-Means clustering technique to derive a candidate feature subset. First, the score of each feature is recorded using traditional filter-based methods; the resulting score matrix is then normalized with the Min-Max technique to form an unsupervised dataset. The K-Means algorithm is applied to this dataset to group the features into clusters. To identify the strongest subset, a Multi-Layer Perceptron (MLP) is applied to each cluster, and the cluster yielding the minimum Root Mean Square (RMS) error is selected. The framework is compared with traditional methods on six well-known datasets containing between 34 and 90 features, using various classification algorithms. The proposed method shows competitive performance against several of the traditional methods.
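
The sketch below illustrates the pipeline described in the abstract; it is not the authors' implementation. The choice of filter methods (chi-squared, ANOVA F, mutual information), the scikit-learn library, the placeholder dataset, and the number of clusters k are all assumptions standing in for details not given here.

```python
# Minimal sketch of the described pipeline: filter scores -> Min-Max
# normalization -> K-Means over features -> MLP per cluster -> pick the
# cluster with the lowest RMS error. All concrete choices are assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer            # placeholder dataset
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif
from sklearn.preprocessing import MinMaxScaler
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# 1. Score every feature with several filter methods (one column per method).
scores = np.column_stack([
    chi2(MinMaxScaler().fit_transform(X), y)[0],   # chi2 needs non-negative input
    f_classif(X, y)[0],
    mutual_info_classif(X, y, random_state=0),
])

# 2. Min-Max normalize the score matrix: each row now profiles one feature
#    ("unsupervised dataset" of features).
feature_profiles = MinMaxScaler().fit_transform(scores)

# 3. Cluster the features with K-Means (k is a free choice here).
k = 3
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(feature_profiles)

# 4. Train an MLP on each cluster's feature subset and keep the cluster
#    giving the lowest RMS error on a hold-out split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)
best_rmse, best_cluster = np.inf, None
for c in range(k):
    cols = np.where(labels == c)[0]
    mlp = MLPClassifier(max_iter=1000, random_state=0).fit(X_tr[:, cols], y_tr)
    proba = mlp.predict_proba(X_te[:, cols])
    onehot = np.eye(proba.shape[1])[y_te]            # one-hot encode true labels
    rmse = np.sqrt(np.mean((proba - onehot) ** 2))   # RMS error of class probabilities
    if rmse < best_rmse:
        best_rmse, best_cluster = rmse, c

candidate_subset = np.where(labels == best_cluster)[0]
print("candidate feature subset (column indices):", candidate_subset)
```

A hold-out split is used here for brevity; cross-validation would give a more stable estimate of each cluster's RMS error before selecting the candidate subset.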


Details

Journal Section Computer Engineering
Authors

Sai Prasad Potharaju (ORCID: 0000-0002-7511-6855)

Publication Date September 1, 2018
Published in Issue Year 2018 Volume: 31 Issue: 3

Cite

APA Potharaju, S. P. (2018). An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques. Gazi University Journal of Science, 31(3), 789-799.
AMA Potharaju SP. An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques. Gazi University Journal of Science. September 2018;31(3):789-799.
Chicago Potharaju, Sai Prasad. “An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques”. Gazi University Journal of Science 31, no. 3 (September 2018): 789-99.
EndNote Potharaju SP (September 1, 2018) An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques. Gazi University Journal of Science 31 3 789–799.
IEEE S. P. Potharaju, “An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques”, Gazi University Journal of Science, vol. 31, no. 3, pp. 789–799, 2018.
ISNAD Potharaju, Sai Prasad. “An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques”. Gazi University Journal of Science 31/3 (September 2018), 789-799.
JAMA Potharaju SP. An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques. Gazi University Journal of Science. 2018;31(3):789–799.
MLA Potharaju, Sai Prasad. “An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques”. Gazi University Journal of Science, vol. 31, no. 3, 2018, pp. 789-99.
Vancouver Potharaju SP. An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques. Gazi University Journal of Science. 2018;31(3):789-99.