Research Article

BREAST CANCER DIAGNOSIS WITH FEATURE SELECTION USING NATURE-INSPIRED OPTIMIZATION AND MACHINE LEARNING

Year 2022, Volume: 10 Issue: 2, 442 - 452, 30.06.2022
https://doi.org/10.21923/jesd.1023451

Abstract

Breast cancer is the most common cancer in women and the leading cause of cancer death among them. Early diagnosis and treatment increase the rates of recovery and survival. Machine learning, applied in many areas of medicine, serves as an effective decision-support tool for experts in the early diagnosis of cancer. Using every one of the many features collected to diagnose a disease can degrade both the analysis process and its results, whereas selecting the most informative features and predicting from those alone can improve diagnostic performance. In this study, the UCI WDBC dataset, which is widely used in the breast cancer literature, was classified with k-nearest neighbors (KNN), random forest (RF), and support vector machine (SVM) classifiers, both without and with feature selection. Feature selection was performed on the original 30-feature WDBC dataset using four nature-inspired algorithms, Cuckoo Search (CS), Particle Swarm Optimization (PSO), the Whale Optimization Algorithm (WOA), and the Red Deer Algorithm (RDA), each run with populations of 25, 50, and 75 particles. The highest accuracy, 99.12%, was obtained with the RF classifier using 16 features selected by CS with 75 particles. Classification with feature selection was more accurate than classification without it. Compared with recent studies in the literature, the proposed approach achieved higher accuracy.
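As a rough illustration of wrapper-style feature selection with a swarm optimizer, the sketch below runs a binary PSO over 30-bit feature masks and scores each mask by cross-validated KNN accuracy. It is not the study's implementation: the sigmoid transfer function, the inertia and acceleration constants, the 3-fold cross-validation, and the names `fitness` and `binary_pso` are all assumptions chosen for brevity, and the data come from scikit-learn's bundled copy of the WDBC dataset.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fitness(mask, X, y):
    """Mean 3-fold CV accuracy of KNN on the columns where mask is True."""
    if not mask.any():  # an empty feature subset cannot be classified
        return 0.0
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
    return cross_val_score(model, X[:, mask], y, cv=3).mean()


def binary_pso(X, y, n_particles=25, n_iter=5, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO sketch: real-valued velocities, boolean feature masks."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    pos = rng.random((n_particles, n_features)) < 0.5
    vel = rng.uniform(-1.0, 1.0, (n_particles, n_features))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X, y) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        p = pos.astype(float)
        r1 = rng.random(vel.shape)
        r2 = rng.random(vel.shape)
        # Velocity update: pull toward each particle's best and the swarm's best.
        vel = (w * vel
               + c1 * r1 * (pbest.astype(float) - p)
               + c2 * r2 * (gbest.astype(float) - p))
        vel = np.clip(vel, -6.0, 6.0)
        # Sigmoid transfer: each velocity becomes the probability of a 1 bit.
        pos = rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))
        fit = np.array([fitness(q, X, y) for q in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit


X, y = load_breast_cancer(return_X_y=True)  # 569 samples, 30 features
mask, acc = binary_pso(X, y, n_particles=25, n_iter=5)
print(f"{mask.sum()} features selected, CV accuracy {acc:.4f}")
```

The same loop could wrap RF or SVM instead of KNN by swapping the model inside `fitness`. Results depend on the seed and on these illustrative settings, so the printed numbers should not be expected to match the figures reported in the abstract.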

DOĞA İLHAMLI OPTİMİZASYON KULLANARAK ÖZELLİK SEÇİMİ VE MAKİNE ÖĞRENMESİ İLE MEME KANSERİ TEŞHİSİ

Abstract

Breast cancer is the most common type of cancer in women and the disease that causes the most deaths among them. With early diagnosis and treatment, recovery and survival rates rise. Machine learning, through its various applications in the medical field, plays a successful decision-support role for experts in the early diagnosis of cancer types. Using all of the many features collected for the diagnosis of a disease together can adversely affect the analysis process and its success. Selecting the most effective features from the collected data and making predictions with them can increase diagnostic success. In this study, classifications were performed on the UCI WDBC dataset, which is widely used in the breast cancer literature, with the KNN, random forest (RF), and support vector machine (SVM) algorithms, both without feature selection and with feature selection applied. Feature selection was carried out on the original 30-feature WDBC dataset using the Cuckoo Search (CS), Particle Swarm Optimization (PSO), Whale Optimization (WOA), and Red Deer (RDA) algorithms, nature-inspired algorithms that give successful results in feature selection, with swarms of 25, 50, and 75 particles. The highest accuracy, 99.12%, was obtained with the RF classifier using 16 features selected by CS with 75 particles. The accuracies of the classifications performed with feature selection were higher than the results obtained without it. The findings were compared with studies in the literature and were observed to provide higher success.
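The comparison described here, classifying with all 30 features versus a selected subset, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the study's protocol: the 80/20 stratified hold-out split, the default hyperparameters, the helper name `evaluate`, and the hand-picked subset (the ten "mean" measurements, standing in for a mask an optimizer would return) are all hypothetical.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def evaluate(mask, seed=0):
    """Hold-out accuracy of KNN, RF and SVM on the columns where mask is True."""
    X, y = load_breast_cancer(return_X_y=True)  # scikit-learn's copy of WDBC
    X_tr, X_te, y_tr, y_te = train_test_split(
        X[:, mask], y, test_size=0.2, stratify=y, random_state=seed)
    models = {
        # Distance- and margin-based models are scaled; the forest is not.
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "SVM": make_pipeline(StandardScaler(), SVC()),
    }
    return {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}


all_features = np.ones(30, dtype=bool)
# Hypothetical subset for illustration only: the ten "mean" measurements.
subset = np.zeros(30, dtype=bool)
subset[:10] = True

for label, mask in [("all 30 features", all_features), ("10-feature subset", subset)]:
    print(label, evaluate(mask))
```

A single hold-out split gives a noisy estimate on a dataset of this size, so repeating the evaluation over several seeds or using cross-validation would give a fairer comparison between masks.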

References

  • Ahuja, A., Al-Zogbi, L., & Krieger, A. (2021). Application of Noise-Reduction Techniques to Machine Learning Algorithms for Breast Cancer Tumor Identification. Computers in Biology and Medicine, 104576.
  • Al-Azzam, N., & Shatnawi, I. (2021). Comparing supervised and semi-supervised Machine Learning Models on Diagnosing Breast Cancer. Annals of Medicine and Surgery, 62, 53-64. https://doi.org/10.1016/j.amsu.2020.12.043
  • Al-Yaseen, W., Jehad, A., Abed, C. I., & Idrees, A. K. (2021). The use of modified K-Means algorithm to enhance the performance of support vector machine in classifying breast cancer. Int. J. Intell. Eng. Syst, 14(2).
  • Ara, S., Das, A., & Dey, A. (2021). Malignant and Benign Breast Cancer Classification using Machine Learning Algorithms. 2021 International Conference on Artificial Intelligence (ICAI), 97-101.
  • Arora, S., & Anand, P. (2019). Binary butterfly optimization approaches for feature selection. Expert Systems with Applications, 116, 147-160. https://doi.org/10.1016/j.eswa.2018.08.051
  • Assegie, T. A. (2021). An optimized K-Nearest Neighbor based breast cancer detection. Journal of Robotics and Control (JRC), 2(3), 115-118.
  • Bayrak, E. A., Kırcı, P., & Ensari, T. (2019). Comparison of machine learning methods for breast cancer diagnosis. 2019 Scientific meeting on electrical-electronics & biomedical engineering and computer science (EBBT), 1-3.
  • Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
  • Devi, M. V., Sravani, M., Ramya, K., Bindulakshmisai, N., Parameshwari, V., & others. (2021). Breast Cancer Diagnosis Using Adaptive Voting Ensemble Machine Learning Algorithm. UGC Care Group I Listed Journal, 11(1), 495-501.
  • Dhal, K. G., Ray, S., Das, A., & Das, S. (2019). A Survey on Nature-Inspired Optimization Algorithms and Their Application in Image Enhancement Domain. Archives of Computational Methods in Engineering, 26(5), 1607-1638. https://doi.org/10.1007/s11831-018-9289-9
  • Fathollahi-Fard, A. M., Hajiaghaei-Keshteli, M., & Tavakkoli-Moghaddam, R. (2020). Red deer algorithm (RDA): A new nature-inspired meta-heuristic. Soft Computing, 24(19), 14637-14665.
  • Feroz, N., Ahad, M. A., & Doja, F. (2021). Machine Learning Techniques for Improved Breast Cancer Detection and Prognosis—A Comparative Analysis. In Applications of Artificial Intelligence and Machine Learning (pp. 441-455). Springer.
  • Fister Jr, I., Yang, X.-S., Fister, I., Brest, J., & Fister, D. (2013). A brief review of nature-inspired algorithms for optimization. arXiv preprint arXiv:1307.4186.
  • Fix, E., & Hodges Jr, J. L. (1952). Discriminatory analysis-nonparametric discrimination: Small sample performance. California Univ Berkeley.
  • Ghosh, P., Azam, S., Hasib, K. M., Karim, A., Jonkman, M., & Anwar, A. (2021). A Performance Based Study on Deep Learning Algorithms in the Effective Prediction of Breast Cancer. 2021 International Joint Conference on Neural Networks (IJCNN), 1-8.
  • Gupta, N., & Kaushik, B. N. (2021). Prognosis and Prediction of Breast Cancer Using Machine Learning and Ensemble-Based Training Model. The Computer Journal.
  • Ho, T. K. (1995). Random decision forests. Proceedings of 3rd international conference on document analysis and recognition, 1, 278-282.
  • Jabbar, M. A. (2021). Breast Cancer Data Classification Using Ensemble Machine Learning. Engineering and Applied Science Research, 48(1), 65-72.
  • Jerez-Aragonés, J. M., Gómez-Ruiz, J. A., Ramos-Jiménez, G., Muñoz-Pérez, J., & Alba-Conejo, E. (2003). A combined neural network and decision trees model for prognosis of breast cancer relapse. Artificial intelligence in medicine, 27(1), 45-63.
  • Jijitha, S., & Amudha, T. (2021). Breast cancer prognosis using machine learning techniques and genetic algorithm: Experiment on six different datasets. In Evolutionary Computing and Mobile Sustainable Networks (pp. 703-711). Springer.
  • Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of ICNN’95-international conference on neural networks, 4, 1942-1948.
  • Khorshid, S. F., Abdulazeez, A. M., & Sallow, A. B. (2021). A comparative analysis and predicting for breast cancer detection based on data mining models. Asian Journal of Research in Computer Science, 45-59.
  • Kumar, S., & Bhargava, C. (2019). A review on Breast Cancer Analysis using Machine Learning. International Journal of Emerging Technologies and Innovative Research, 6, 326-329.
  • Lahoura, V., Singh, H., Aggarwal, A., Sharma, B., Mohammed, M. A., Damaševičius, R., Kadry, S., & Cengiz, K. (2021). Cloud computing-based framework for breast cancer diagnosis using extreme learning machine. Diagnostics, 11(2), 241.
  • Magboo, V. P. C., & Magboo, M. S. A. (2021). Machine Learning Classifiers on Breast Cancer Recurrences. Procedia Computer Science, 192, 2742-2752. https://doi.org/10.1016/j.procs.2021.09.044
  • Mirjalili, S., & Lewis, A. (2016). The whale optimization algorithm. Advances in engineering software, 95, 51-67.
  • Muhammad Amin, B., & Inna, E. (2021). Breast Cancer Prediction Model Using Machine Learning. Journal of Data Science, 2021(02).
  • Naji, M. A., Filali, S. E., Aarika, K., Benlahmar, E. H., Abdelouhahid, R. A., & Debauche, O. (2021). Machine Learning Algorithms For Breast Cancer Prediction And Diagnosis. Procedia Computer Science, 191, 487-492. https://doi.org/10.1016/j.procs.2021.07.062
  • Papageorgiou, E. I., Subramanian, J., Karmegam, A., & Papandrianos, N. (2015). A risk management model for familial breast cancer: A new application using Fuzzy Cognitive Map method. Computer methods and programs in biomedicine, 122(2), 123-135.
  • Parekh, D. H., Dahiya, V., & others. (2021). Predicting breast cancer using machine learning classifiers and enhancing the output by combining the predictions to generate optimal F1-score. Biomedical and Biotechnology Research Journal (BBRJ), 5(3), 331.
  • Pawar, S., Bagal, P., Shukla, P., & Dawkhar, A. (2021). Detection of Breast Cancer using Machine Learning Classifier. 2021 Asian Conference on Innovation in Technology (ASIANCON), 1-5. https://doi.org/10.1109/ASIANCON51346.2021.9544767
  • Pravesjit, S., Longpradit, P., Kantawong, K., Pengchata, R., & Oul, N. (2021). A Hybrid PSO with Rao Algorithm for Classification of Wisconsin Breast Cancer Dataset. 2021 2nd International Conference on Big Data Analytics and Practices (IBDAP), 68-71. https://doi.org/10.1109/IBDAP52511.2021.9552152
  • Shrivastavat, S. S., Sant, A., & Aharwal, R. P. (2013). An overview on data mining approach on breast cancer data. International Journal of Advanced Computer Research, 3(4), 256.
  • Sinha, N. K., Khulal, M., Gurung, M., & Lal, A. (2020). Developing a web based system for breast cancer prediction using xgboost classifier. International Journal of Engineering Research Technology (IJERT), 9.
  • Sung, H., Ferlay, J., Siegel, R. L., Laversanne, M., Soerjomataram, I., Jemal, A., & Bray, F. (2021). Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: a cancer journal for clinicians, 71(3), 209-249.
  • UCI. (2021). Breast Cancer Wisconsin (Diagnostic) Data Set. Retrieved October 4, 2021, from https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
  • Vapnik, V., Golowich, S. E., Smola, A., & others. (1997). Support vector method for function approximation, regression estimation, and signal processing. Advances in neural information processing systems, 281-287.
  • Vijayakumar, K., Kadam, V. J., & Sharma, S. K. (2021). Breast cancer diagnosis using multiple activation deep neural network. Concurrent Engineering, 29(3), 275-284.
  • Yang, X.-S., & Deb, S. (2009). Cuckoo Search via Lévy flights. 2009 World Congress on Nature Biologically Inspired Computing (NaBIC), 210-214. https://doi.org/10.1109/NABIC.2009.5393690
There are 38 citations in total.

Details

Primary Language: Turkish
Subjects: Computer Software
Journal Section: Research Articles
Authors: Onur Sevli (ORCID: 0000-0002-8933-8395)
Publication Date: June 30, 2022
Submission Date: November 14, 2021
Acceptance Date: January 24, 2022
Published in Issue: Year 2022

Cite

APA Sevli, O. (2022). DOĞA İLHAMLI OPTİMİZASYON KULLANARAK ÖZELLİK SEÇİMİ VE MAKİNE ÖĞRENMESİ İLE MEME KANSERİ TEŞHİSİ. Mühendislik Bilimleri Ve Tasarım Dergisi, 10(2), 442-452. https://doi.org/10.21923/jesd.1023451