Research Article

Comparison of Feature Selection Methods in Breast Cancer Microarray Data

Volume: 5 Issue: 2, 15 May 2023


Abstract

Aim: We aim to predict metastasis in breast cancer patients with conventional tree-based machine learning algorithms and to determine which feature selection method is most effective at reducing the number of features when machine learning is applied to breast cancer microarray data. Material and Methods: Statistical preprocessing steps and three feature selection methods, the least absolute shrinkage and selection operator (LASSO), Boruta, and maximum relevance-minimum redundancy (MRMR), were applied to the breast cancer microarray data before the conventional tree-based machine learning methods: decision tree, extremely randomized trees (Extra-trees), and gradient boosting trees. Results: The microarray data comprised 54,675 features from 202 breast cancer patients (101 with and 101 without metastasis). The data were first reduced to 235 features, after which the feature selection algorithms were applied and the most important features were identified with the tree-based machine learning algorithms. The highest recall and F-measure values were obtained with the XGBoost method, and the highest precision was obtained with the Extra-trees method. The 10 features, out of 54,675, with the highest variable importance were listed. Conclusion: The most accurate results were obtained from the statistically preprocessed data with the XGBoost and Extra-trees machine learning algorithms. Statistical and microarray preprocessing steps may therefore be sufficient for machine learning analysis of microarray data when predicting breast cancer metastasis.
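
The abstract compares classifiers by precision, recall, and F-measure. As a minimal sketch of how these three quantities are computed for a binary metastasis label, the standard-library Python below may help; the function name `precision_recall_f1` and the toy labels are illustrative assumptions and are not taken from the study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels (1 = metastasis, 0 = no metastasis); not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Precision penalizes false alarms and recall penalizes missed metastases, so two classifiers can each lead on a different metric, as reported above for Extra-trees and XGBoost.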

Keywords


Details

Primary Language

English

Subjects

Internal Medicine

Section

Research Article

Early View Date

15 May 2023

Publication Date

15 May 2023

Submission Date

11 November 2022

Acceptance Date

4 January 2023

Published in Issue

Year 2023 Volume: 5 Issue: 2

Cite

AMA
1. Agraz M. Comparison of Feature Selection Methods in Breast Cancer Microarray Data. Med Records. 2023;5(2):284-9. doi:10.37990/medr.1202671
