Research Article

Binary white shark optimization algorithm for feature selection problems

Year 2023, Volume: 13 Issue: 2, 281 - 298, 15.04.2023
https://doi.org/10.17714/gumusfenbil.1175548

Abstract

Feature selection is the process of eliminating redundant, irrelevant, and noisy features from a large-scale dataset while maintaining acceptable classification accuracy in machine learning problems. Feature selection can therefore also be formulated as an optimization problem. The literature contains studies in which metaheuristic optimization algorithms perform well in finding optimal feature subsets. In this study, the white shark optimization algorithm (WSO) has been converted into binary form with S-, V-, and U-shaped transfer functions and used for feature selection. The proposed methods have been applied to eight different datasets from the UCI data repository and examined in terms of classification accuracy, fitness values, and the number of selected features. The k-nearest neighbor algorithm has been used as the classifier. The proposed methods have then been compared with different metaheuristic algorithms using the Friedman rank test. Experimental results show that the proposed methods are successful in feature selection and increase classification accuracy, with the V- and U-shaped versions in particular producing more stable and higher-accuracy results.
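
For readers unfamiliar with binary metaheuristics, the sketch below illustrates, under assumed settings, how a continuous position vector can be binarized with S- and V-shaped transfer functions and how a candidate feature subset can be scored with a k-NN wrapper fitness. The specific transfer functions, the alpha weighting, k = 5, and the cross-validation setup are illustrative assumptions, not necessarily the configuration used in the paper.

```python
# Minimal sketch (not the paper's exact implementation): mapping a continuous
# optimizer position to a 0/1 feature mask with S- and V-shaped transfer
# functions, and scoring the subset with a k-NN wrapper fitness.
# The alpha weight, k, and cross-validation setup are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def s_shaped_binarize(position, rng):
    """S-shaped (sigmoid) transfer: set each bit to 1 with probability sigma(x)."""
    prob = 1.0 / (1.0 + np.exp(-position))
    return (rng.random(position.shape) < prob).astype(int)

def v_shaped_binarize(position, current_bits, rng):
    """V-shaped transfer: flip each current bit with probability |tanh(x)|."""
    prob = np.abs(np.tanh(position))
    flip = rng.random(position.shape) < prob
    return np.where(flip, 1 - current_bits, current_bits)

def fitness(mask, X, y, alpha=0.99, k=5):
    """Weighted sum of k-NN error rate and selected-feature ratio (lower is better)."""
    if mask.sum() == 0:                       # penalize empty feature subsets
        return 1.0
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=k),
                          X[:, mask.astype(bool)], y, cv=5).mean()
    return alpha * (1.0 - acc) + (1.0 - alpha) * mask.sum() / mask.size

# Toy usage: evaluate one candidate position on random data
rng = np.random.default_rng(0)
X, y = rng.random((100, 10)), rng.integers(0, 2, 100)
mask = s_shaped_binarize(rng.normal(size=10), rng)
print(mask, fitness(mask, X, y))
```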

References

  • Abdel-Basset, M., Abdel-Fatah, L., & Sangaiah, A. K. (2018). Metaheuristic algorithms: A comprehensive review. Computational intelligence for multimedia big data on the cloud with engineering applications, 185-231. doi:https://doi.org/10.1016/B978-0-12-813314-9.00010-4
  • Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609. doi:https://doi.org/10.1016/j.cma.2020.113609
  • Arora, S., & Anand, P. (2019). Binary butterfly optimization approaches for feature selection. Expert Systems with Applications, 116, 147-160. doi:https://doi.org/10.1016/j.eswa.2018.08.051
  • Awadallah, M. A., Hammouri, A. I., Al-Betar, M. A., Braik, M. S., & Abd Elaziz, M. (2022). Binary Horse herd optimization algorithm with crossover operators for feature selection. Computers in Biology and Medicine, 141, 105152. doi:https://doi.org/10.1016/j.compbiomed.2021.105152
  • Bäck, T., & Schwefel, H.-P. (1993). An overview of evolutionary algorithms for parameter optimization. Evolutionary computation, 1(1), 1-23. doi:https://doi.org/10.1162/evco.1993.1.1.1
  • Braik, M., Hammouri, A., Atwan, J., Al-Betar, M. A., & Awadallah, M. A. (2022). White Shark Optimizer: A novel bio-inspired meta-heuristic algorithm for global optimization problems. Knowledge-Based Systems, 243, 108457. doi:https://doi.org/10.1016/j.knosys.2022.108457
  • Dash, M., & Liu, H. (1997). Feature selection for classification. Intelligent data analysis, 1(1-4), 131-156. doi:https://doi.org/10.1016/S1088-467X(97)00008-5
  • Dehghani, M., Montazeri, Z., Dehghani, A., Malik, O. P., Morales-Menendez, R., Dhiman, G., Nouri, N., Ehsanifar, A., Guerrero, J. M., & Ramirez-Mendoza, R. A. (2021). Binary spring search algorithm for solving various optimization problems. Applied Sciences, 11(3), 1286. doi:https://doi.org/10.3390/app11031286
  • Dhiman, G., & Kumar, V. (2019). Seagull optimization algorithm: Theory and its applications for large-scale industrial engineering problems. Knowledge-based systems, 165, 169-196. doi:https://doi.org/10.1016/j.knosys.2018.11.024
  • Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern classification. Hoboken, NJ: Wiley.
  • Fan, Q., Chen, Z., & Xia, Z. (2020). A novel quasi-reflected Harris hawks optimization algorithm for global optimization problems. Soft Computing, 24(19), 14825-14843. doi:https://doi.org/10.1007/s00500-020-04834-7
  • Friedman, M. (1940). A comparison of alternative tests of significance for the problem of m rankings. The Annals of Mathematical Statistics, 11(1), 86-92. doi:https://doi.org/10.1214/aoms/1177731944
  • Grabczewski, K., & Jankowski, N. (2005). Feature selection with decision tree criterion. Fifth International Conference on Hybrid Intelligent Systems (HIS'05) (6 pp.).
  • Hichem, H., Elkamel, M., Rafik, M., Mesaaoud, M. T., & Ouahiba, C. (2019). A new binary grasshopper optimization algorithm for feature selection problem. Journal of King Saud University - Computer and Information Sciences. doi:https://doi.org/10.1016/j.jksuci.2019.11.007
  • Houssein, E. H., Saad, M. R., Hashim, F. A., Shaban, H., & Hassaballah, M. (2020). Lévy flight distribution: A new metaheuristic algorithm for solving engineering optimization problems. Engineering Applications of Artificial Intelligence, 94, 103731. doi:https://doi.org/10.1016/j.engappai.2020.103731
  • Hussien, A. G., Hassanien, A. E., Houssein, E. H., Amin, M., & Azar, A. T. (2020). New binary whale optimization algorithm for discrete optimization problems. Engineering Optimization, 52(6), 945-959. doi:https://doi.org/10.1080/0305215X.2019.1624740
  • Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor, MI: University of Michigan Press.
  • Karaboga, D., Gorkemli, B., Ozturk, C., & Karaboga, N. (2014). A comprehensive survey: artificial bee colony (ABC) algorithm and applications. Artificial Intelligence Review, 42(1), 21-57. doi:https://doi.org/10.1007/s10462-012-9328-0
  • Khanesar, M. A., Teshnehlab, M., & Shoorehdeli, M. A. (2007). A novel binary particle swarm optimization. 2007 Mediterranean Conference on Control & Automation (pp. 1-6).
  • Kittler, J. (1978). Feature set search algorithms. In Pattern recognition and signal processing (pp. 41-60).
  • Li, S., Chen, H., Wang, M., Heidari, A. A., & Mirjalili, S. (2020). Slime mould algorithm: A new method for stochastic optimization. Future Generation Computer Systems, 111, 300-323. doi:https://doi.org/10.1016/j.future.2020.03.055
  • Li, Y., Zhu, X., & Liu, J. (2020). An improved moth-flame optimization algorithm for engineering problems. Symmetry, 12(8), 1234. doi:https://doi.org/10.3390/sym12081234
  • Long, W., & Xu, S. (2016). A novel grey wolf optimizer for global optimization problems. 2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC) (pp. 1266-1270).
  • Luo, K., & Zhao, Q. (2019). A binary grey wolf optimizer for the multidimensional knapsack problem. Applied Soft Computing, 83, 105645. doi:https://doi.org/10.1016/j.asoc.2019.105645
  • Mirjalili, S. (2016). SCA: a sine cosine algorithm for solving optimization problems. Knowledge-based systems, 96, 120-133. doi:https://doi.org/10.1016/j.knosys.2015.12.022
  • Nadimi-Shahraki, M. H., Banaie-Dezfouli, M., Zamani, H., Taghian, S., & Mirjalili, S. (2021). B-MFO: a binary moth-flame optimization for feature selection from medical datasets. Computers, 10(11), 136.
  • Pal, M., & Foody, G. M. (2010). Feature selection for classification of hyperspectral data by SVM. IEEE Transactions on Geoscience and Remote Sensing, 48(5), 2297-2307. doi:https://doi.org/10.1109/TGRS.2009.2039484
  • Poli, R., Kennedy, J., & Blackwell, T. (2007). Particle swarm optimization. Swarm intelligence, 1(1), 33-57. doi:https://doi.org/10.1007/s11721-007-0002-0
  • Quinlan, J. R. (1986). Induction of decision trees. Machine learning, 1(1), 81-106.
  • Robnik-Šikonja, M., & Kononenko, I. (2003). Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning, 53(1), 23-69.
  • Saeys, Y., Inza, I., & Larranaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19), 2507-2517. doi:https://doi.org/10.1093/bioinformatics/btm344
  • Siedlecki, W., & Sklansky, J. (1993). On automatic feature selection. In Handbook of pattern recognition and computer vision (pp. 63-87). World Scientific.
  • Taghian, S., & Nadimi-Shahraki, M. H. (2019). Binary sine cosine algorithms for feature selection from medical data. Advanced Computing: An International Journal (ACIJ), 10. doi:https://doi.org/10.5121/acij.2019.10501
  • Thaher, T., Heidari, A. A., Mafarja, M., Dong, J. S., & Mirjalili, S. (2020). Binary Harris Hawks optimizer for high-dimensional, low sample size feature selection. In Evolutionary machine learning techniques (pp. 251-272). Springer.
  • Too, J., & Rahim Abdullah, A. (2020). Binary atom search optimisation approaches for feature selection. Connection Science, 32(4), 406-430. doi:https://doi.org/10.1080/09540091.2020.1741515


Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles
Authors

Funda Kutlu Onay 0000-0002-8531-4054

Publication Date April 15, 2023
Submission Date September 15, 2022
Acceptance Date January 21, 2023
Published in Issue Year 2023 Volume: 13 Issue: 2

Cite

APA Kutlu Onay, F. (2023). Öznitelik seçimi problemleri için ikili beyaz köpekbalığı optimizasyon algoritması. Gümüşhane Üniversitesi Fen Bilimleri Dergisi, 13(2), 281-298. https://doi.org/10.17714/gumusfenbil.1175548