Research Article

High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach

Volume: 9, Issue: 4, 29 December 2021
  • Yılmaz Atay *
  • Muhterem Oğuzhan Yıldırım
  • Cuma Umur Doğan

Abstract

Detecting biologically meaningful patterns in gene microarray datasets is currently used effectively in many areas, such as disease diagnosis and the differentiation of cancer types. However, because microarray technology measures gene expression profiles collectively, the number of features in these datasets can be very high. The small number of samples, the high number of features, and noise in the data significantly complicate the preparation of these datasets. For machine learning models to classify successfully, the number of features, which determines the dimensionality of the dataset, should be reduced. In the proposed method, gene microarray data is taken as input, and the Information Gain, Fisher Correlation Scoring, ReliefF, and Chi-Square methods are applied separately for feature selection. After this stage, a sub-dataset containing the selected genes is obtained, and a gene pool for the Genetic Algorithm is created from this sub-dataset. A Bayes classifier is then trained on the sub-dataset built from the genes of the most successful chromosome, which completes the classification of the cancer data. The proposed model was applied to datasets that are frequently used in the literature, and high success rates were obtained in classification. As a result, the suitable feature selection methods and the hybrid method based on the Genetic Algorithm generally provided the most appropriate results on all test data.
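
The pipeline described in the abstract (filter-based ranking, a Genetic Algorithm over the resulting gene pool, then a Bayes classifier on the best chromosome) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: only one of the four filters is shown (a standard Fisher score, which may differ from the paper's Fisher Correlation Scoring), the classifier is a Gaussian naive Bayes, fitness is leave-one-out accuracy minus a small per-gene penalty, the GA uses truncation selection with one-point crossover and a single bit flip, and the data is synthetic rather than a real microarray dataset.

```python
import math
import random

def fisher_score(column, labels):
    """Between-class scatter over within-class scatter for one gene."""
    mean = sum(column) / len(column)
    between = within = 0.0
    for cls in set(labels):
        vals = [v for v, lab in zip(column, labels) if lab == cls]
        mu = sum(vals) / len(vals)
        between += len(vals) * (mu - mean) ** 2
        within += sum((v - mu) ** 2 for v in vals)
    return between / (within + 1e-12)

def nb_fit(X, y, genes):
    """Gaussian naive Bayes: per class, a log prior and (mean, variance) per gene."""
    model = {}
    for cls in set(y):
        rows = [x for x, lab in zip(X, y) if lab == cls]
        stats = []
        for g in genes:
            vals = [r[g] for r in rows]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals) + 1e-6
            stats.append((mu, var))
        model[cls] = (math.log(len(rows) / len(y)), stats)
    return model

def nb_predict(model, x, genes):
    """Return the class with the highest log posterior for sample x."""
    def log_posterior(cls):
        log_prior, stats = model[cls]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x[g] - mu) ** 2 / (2 * var)
            for g, (mu, var) in zip(genes, stats))
    return max(model, key=log_posterior)

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of the classifier restricted to `genes`."""
    if not genes:
        return 0.0
    hits = 0
    for i in range(len(X)):
        model = nb_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], genes)
        hits += nb_predict(model, X[i], genes) == y[i]
    return hits / len(X)

def genetic_search(X, y, pool, generations=15, pop_size=12, seed=0):
    """Evolve boolean masks over `pool`; return the genes of the fittest mask."""
    rng = random.Random(seed)

    def decode(mask):
        return [g for g, keep in zip(pool, mask) if keep]

    def fitness(mask):
        genes = decode(mask)
        return loo_accuracy(X, y, genes) - 0.01 * len(genes)  # assumed size penalty

    # One chromosome keeps the whole pool; the rest are random masks.
    population = [[True] * len(pool)]
    population += [[rng.random() < 0.5 for _ in pool] for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]       # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, len(pool))       # one-point crossover
            child = mom[:cut] + dad[cut:]
            flip = rng.randrange(len(pool))         # mutation: flip one bit
            child[flip] = not child[flip]
            children.append(child)
        population = parents + children
    return decode(max(population, key=fitness))

# Synthetic stand-in for a microarray: 20 samples, 30 genes, only gene 0 informative.
rng = random.Random(42)
y = [0] * 10 + [1] * 10
X = [[rng.gauss(5.0 * label, 0.3)] + [rng.gauss(0.0, 1.0) for _ in range(29)]
     for label in y]

# Filter stage: keep the 5 genes with the highest Fisher score as the GA pool.
scores = [fisher_score([row[g] for row in X], y) for g in range(30)]
pool = sorted(range(30), key=lambda g: scores[g], reverse=True)[:5]

# Wrapper stage: the GA picks a subset of the pool, scored by naive Bayes.
selected = genetic_search(X, y, pool)
print("pool:", pool, "selected:", selected, "LOO accuracy:", loo_accuracy(X, y, selected))
```

In this sketch the filter runs on the full dataset before cross-validation, which is convenient for a toy example but can inflate accuracy estimates on real data; a stricter evaluation would repeat the filter inside each training fold.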

Details

Primary Language

English

Subjects

Engineering

Section

Research Article

Authors

Muhterem Oğuzhan Yıldırım
0000-0003-1288-0861
Türkiye

Cuma Umur Doğan
0000-0003-1792-7294
Türkiye

Publication Date

29 December 2021

Submission Date

26 September 2021

Acceptance Date

15 December 2021

Published in Issue

Year 2021 Volume: 9 Issue: 4

Cite

APA
Atay, Y., Yıldırım, M. O., & Doğan, C. U. (2021). High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji, 9(4), 811-827. https://doi.org/10.29109/gujsc.1000926
AMA
1. Atay Y, Yıldırım MO, Doğan CU. High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach. GUJS Part C. 2021;9(4):811-827. doi:10.29109/gujsc.1000926
Chicago
Atay, Yılmaz, Muhterem Oğuzhan Yıldırım, and Cuma Umur Doğan. 2021. “High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach”. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji 9 (4): 811-27. https://doi.org/10.29109/gujsc.1000926.
EndNote
Atay Y, Yıldırım MO, Doğan CU (01 December 2021) High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji 9 4 811–827.
IEEE
[1] Y. Atay, M. O. Yıldırım, and C. U. Doğan, “High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach”, GUJS Part C, vol. 9, no. 4, pp. 811–827, Dec. 2021, doi: 10.29109/gujsc.1000926.
ISNAD
Atay, Yılmaz - Yıldırım, Muhterem Oğuzhan - Doğan, Cuma Umur. “High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach”. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji 9/4 (01 December 2021): 811-827. https://doi.org/10.29109/gujsc.1000926.
JAMA
1. Atay Y, Yıldırım MO, Doğan CU. High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach. GUJS Part C. 2021;9:811–827.
MLA
Atay, Yılmaz, et al. “High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach”. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji, vol. 9, no. 4, December 2021, pp. 811-27, doi:10.29109/gujsc.1000926.
Vancouver
1. Yılmaz Atay, Muhterem Oğuzhan Yıldırım, Cuma Umur Doğan. High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach. GUJS Part C. 01 December 2021;9(4):811-27. doi:10.29109/gujsc.1000926


    e-ISSN:2147-9526