
## An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques

#### Sai Prasad POTHARAJU [1]

Clustering is an unsupervised data mining approach. In this research article, we propose an unsupervised approach that combines filter-based feature selection methods with the K-Means clustering technique to derive a candidate feature subset. First, the score of each feature is recorded using traditional filter-based methods; the dataset is then normalized using the Min-Max technique, and an unsupervised dataset is formed. The K-Means algorithm is applied to this dataset to form clusters of features. To identify the strongest subset, a Multi Layer Perceptron (MLP) is applied to each cluster, and the cluster with the minimum Root Mean Square (RMS) error rate reported by the MLP is selected. The framework is compared with traditional methods on six well-known datasets, each having between 34 and 90 features, using various classification algorithms. The proposed method shows performance competitive with a few of the traditional methods.

Keywords: Feature Selection, Clustering, Multi Layer Perceptron, Min-Max Normalization, Filter Methods
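The pipeline outlined in the abstract (filter scores, Min-Max normalization, K-Means over features, then picking the cluster with the lowest error) can be illustrated with a minimal, standard-library-only sketch. It is not the author's implementation. The score table is made-up toy data, normalizing each filter method's scores separately is an assumption about how the unsupervised dataset is built, and `toy_rmse` is a hypothetical stand-in for training an MLP on each cluster's features and reading off its RMS error.

```python
import random

def min_max(column):
    """Rescale one list of scores to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def normalize_scores(scores):
    """scores[i][j] is the score of feature i under filter method j.
    Assumption: each filter method (column) is normalized on its own."""
    columns = [min_max(list(col)) for col in zip(*scores)]
    return [list(row) for row in zip(*columns)]

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd-style K-Means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

def select_cluster(labels, error_of):
    """Group feature indices by cluster label; return the group with the lowest error."""
    clusters = {}
    for feature, label in enumerate(labels):
        clusters.setdefault(label, []).append(feature)
    return min(clusters.values(), key=error_of)

# Toy data (hypothetical): 6 features scored by 3 filter methods on different scales.
raw_scores = [
    [0.90, 40.0, 0.30],
    [0.85, 38.0, 0.28],
    [0.80, 42.0, 0.25],
    [0.10, 5.0, 0.02],
    [0.15, 3.0, 0.05],
    [0.05, 4.0, 0.01],
]
normalized = normalize_scores(raw_scores)
labels = kmeans(normalized, k=2)

def toy_rmse(subset):
    """Hypothetical stand-in for the MLP's RMS error on a feature subset:
    here, simply 1 minus the mean normalized score of the subset."""
    total = sum(sum(normalized[i]) for i in subset)
    return 1.0 - total / (len(subset) * len(normalized[0]))

selected = select_cluster(labels, toy_rmse)
print("selected feature indices:", selected)
```

On this toy data the three high-scoring features (indices 0, 1 and 2) end up in one cluster, and that cluster is selected. In a real experiment, `toy_rmse` would be replaced by a function that trains an MLP on the columns of the original dataset named in `subset` and returns its RMS error.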
Subjects: Science Computer Engineering

Author: Sai Prasad POTHARAJU (Primary Author), ORCID: 0000-0002-7511-6855, Institution: K L University, Country: India

Publication Date: September 1, 2018
Cite as: POTHARAJU, S. (2018). An Unsupervised Approach For Selection of Candidate Feature Set Using Filter Based Techniques. Gazi University Journal of Science, 31(3), 789-799. eISSN 2147-1762. Retrieved from https://dergipark.org.tr/en/pub/gujs/issue/38948/365961