Research Article

Feature Selection Based Data Mining Approach for Coronary Artery Disease Diagnosis

Year 2021, Volume: 9 Issue: 3, 451 - 459, 30.09.2021
https://doi.org/10.21541/apjes.899055

Abstract

Cardiovascular diseases are common, serious health problems and are responsible for many deaths. According to the World Health Organization, 17.7 million people die from them each year. Coronary artery disease is the most important type of cardiovascular disease; it impairs the heart’s function and causes serious heart problems in patients. Knowing which attributes are important for this disease will help field specialists analyze the routine laboratory test results of patients admitted to internal medicine or to other units outside cardiology. This study aims to determine the significance of attributes for coronary artery disease using the Stability Selection method. In the experiments, the attributes ‘Age’, ‘Atypical’, ‘Blood pressure’, ‘Current smoker’, ‘Diastolic murmur’, ‘Dyslipidemia’, ‘Diabetes mellitus’, ‘Ejection fraction’, ‘Erythrocyte sedimentation rate’, ‘Family history’, ‘Hypertension’, ‘Potassium’, ‘Nonanginal’, ‘Pulse rate’, ‘Q wave’, ‘Regional wall motion abnormality’, ‘Sex’, ‘St Depression’, ‘Triglyceride’, ‘Tinversion’, ‘Typical chest pain’ and ‘Valvular heart disease’ were found to be important for each sub-dataset. In addition, the performance of four traditional machine learning algorithms was evaluated for the detection of this disease. The Logistic Regression algorithm outperformed the others, with an accuracy of 90.88%, a sensitivity of 95.18%, and a specificity of 81.34%.
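To make the idea behind Stability Selection concrete, the following is a minimal sketch, not the procedure used in this study. It repeatedly draws random half-samples, runs a base selector on each, and keeps the features chosen in at least a given fraction of rounds. The base selector here (top-k absolute Pearson correlation with the label) is a simple stand-in chosen for illustration; the synthetic data, the function names, and the parameters `k`, `n_rounds`, and `threshold` are all assumptions and do not come from the article or the Z-Alizadeh Sani dataset.

```python
import random


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return abs(cov) / (vx * vy) ** 0.5


def top_k_selector(rows, labels, k):
    """Stand-in base selector (an assumption): the k features most correlated with the label."""
    n_features = len(rows[0])
    scores = [abs_correlation([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def stability_selection(rows, labels, k=2, n_rounds=100, threshold=0.6, seed=0):
    """Return per-feature selection frequencies and the indices at or above the threshold."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    counts = [0] * n_features
    for _ in range(n_rounds):
        # Draw a random half of the samples without replacement.
        idx = rng.sample(range(n), n // 2)
        chosen = top_k_selector([rows[i] for i in idx], [labels[i] for i in idx], k)
        for j in chosen:
            counts[j] += 1
    freqs = [c / n_rounds for c in counts]
    stable = [j for j, f in enumerate(freqs) if f >= threshold]
    return freqs, stable


# Synthetic illustration only: features 0 and 1 track the label, features 2-5 are noise.
data_rng = random.Random(42)
labels = [i % 2 for i in range(200)]
rows = [
    [y + data_rng.gauss(0, 0.5), y + data_rng.gauss(0, 0.5)]
    + [data_rng.gauss(0, 1) for _ in range(4)]
    for y in labels
]

freqs, stable = stability_selection(rows, labels)
print("selection frequencies:", freqs)
print("stable features:", stable)
```

Features that are selected only occasionally, because of the particular subsample drawn, fall below the threshold and are discarded. Features that are selected consistently across subsamples are retained.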

Thanks

The author would like to thank Arabasadi et al. [8] for providing the Z-Alizadeh Sani dataset.

References

  • Alizadehsani R, Habibi J, Hosseini MJ, et al (2013) A data mining approach for diagnosis of coronary artery disease. Comput Methods Programs Biomed 111:52–61. https://doi.org/10.1016/j.cmpb.2013.03.004
  • Chagas P, Mazocco L, Piccoli J da CE, et al (2017) Association of alcohol consumption with coronary artery disease severity. Clin Nutr 36:1036–1039. https://doi.org/10.1016/j.clnu.2016.06.017
  • Roberts R (2015) A genetic basis for coronary artery disease. Trends Cardiovasc. Med. 25:171–178
  • Yadav C, Lade S, Professor A, Suman MK (2014) Predictive Analysis for the Diagnosis of Coronary Artery Disease using Association Rule Mining
  • Ghadiri Hedeshi N, Saniee Abadeh M (2014) Coronary artery disease detection using a fuzzy-boosting PSO approach. Comput Intell Neurosci 2014:. https://doi.org/10.1155/2014/783734
  • Alizadehsani R, Hosseini MJ, Boghrati R, et al (2013) Exerting Cost-Sensitive and Feature Creation Algorithms for Coronary Artery Disease Diagnosis. Int J Knowl Discov Bioinforma 3:59–79. https://doi.org/10.4018/jkdb.2012010104
  • Nithya S, Suresh C, Dhas G (2015) FUZZY LOGIC BASED IMPROVED SUPPORT VECTOR MACHINE (F-ISVM) CLASSIFIER FOR HEART DISEASE CLASSIFICATION. 10:
  • Arabasadi Z, Alizadehsani R, Roshanzamir M, et al (2017) Computer aided decision making for heart disease detection using hybrid neural network-Genetic algorithm. Comput Methods Programs Biomed 141:19–26. https://doi.org/10.1016/j.cmpb.2017.01.004
  • Alizadehsani R, Hosseini MJ, Sani ZA, et al (2012) Diagnosis of coronary artery disease using cost-sensitive algorithms. In: Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012. pp 9–16
  • Qin CJ, Guan Q, Wang XP (2017) Application of ensemble algorithm integrating multiple criteria feature selection in coronary heart disease detection. Biomed Eng - Appl Basis Commun 29:. https://doi.org/10.4015/S1016237217500430
  • Babic F, Olejar J, Vantova Z, Paralic J (2017) Predictive and descriptive analysis for heart disease diagnosis. In: Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, FedCSIS 2017. Institute of Electrical and Electronics Engineers Inc., pp 155–163
  • Pathak LA, Shirodkar S, Ruparelia R, Rajebahadur J (2017) Coronary artery disease in women. Indian Heart J. 69:532–538
  • Malakar AK, Choudhury D, Halder B, et al (2019) A review on coronary artery disease, its risk factors, and therapeutics. J Cell Physiol 234:16812–16823. https://doi.org/10.1002/jcp.28350
  • Effrosynidis D, Arampatzis A (2021) An evaluation of feature selection methods for environmental data. Ecol Inform 101224. https://doi.org/10.1016/j.ecoinf.2021.101224
  • K.P. MN, P. T (2021) Feature selection using efficient fusion of Fisher Score and greedy searching for Alzheimer’s classification. J King Saud Univ - Comput Inf Sci. https://doi.org/10.1016/j.jksuci.2020.12.009
  • Kavitha KK, Kangaiammal A (2020) Correlation-based high distinction feature selection in digital mammogram. Mater Today Proc. https://doi.org/10.1016/j.matpr.2020.10.858
  • Sheth PD, Patil ST, Dhore ML (2020) Evolutionary computing for clinical dataset classification using a novel feature selection algorithm. J King Saud Univ - Comput Inf Sci. https://doi.org/10.1016/j.jksuci.2020.12.012
  • Amini F, Hu G (2021) A two-layer feature selection method using Genetic Algorithm and Elastic Net. Expert Syst Appl 166:114072. https://doi.org/10.1016/j.eswa.2020.114072
  • Chen SB, Zhang YM, Ding CHQ, et al (2019) Extended adaptive Lasso for multi-class and multi-label feature selection. Knowledge-Based Syst 173:28–36. https://doi.org/10.1016/j.knosys.2019.02.021
  • Haq AU, Zeb A, Lei Z, Zhang D (2021) Forecasting daily stock trend using multi-filter feature selection and deep learning. Expert Syst Appl 168:114444. https://doi.org/10.1016/j.eswa.2020.114444
  • Toğaçar M, Ergen B, Cömert Z (2020) Classification of white blood cells using deep features obtained from Convolutional Neural Network models based on the combination of feature selection methods. Appl Soft Comput J 97:106810. https://doi.org/10.1016/j.asoc.2020.106810
  • Niu T, Wang J, Lu H, et al (2020) Developing a deep learning framework with two-stage feature selection for multivariate financial time series forecasting. Expert Syst Appl 148:113237. https://doi.org/10.1016/j.eswa.2020.113237
  • Tian W, Liu Z, Li L, et al (2020) Identification of abnormal conditions in high-dimensional chemical process based on feature selection and deep learning. Chinese J Chem Eng 28:1875–1883. https://doi.org/10.1016/j.cjche.2020.05.003
  • Kim W, Han Y, Kim KJ, Song KW (2020) Electricity load forecasting using advanced feature selection and optimal deep learning model for the variable refrigerant flow systems. Energy Reports 6:2604–2618. https://doi.org/10.1016/j.egyr.2020.09.019
  • Shi H, Li H, Zhang D, et al (2018) An efficient feature generation approach based on deep learning and feature selection techniques for traffic classification. Comput Networks 132:81–98. https://doi.org/10.1016/j.comnet.2018.01.007
  • Mordelet F, Horton J, Hartemink AJ, et al (2013) Stability selection for regression-based models of transcription factor-DNA binding specificity. In: Bioinformatics
  • Meinshausen N, Bühlmann P (2010) Stability selection. J R Stat Soc Ser B (Statistical Methodol) 72:417–473. https://doi.org/10.1111/j.1467-9868.2010.00740.x
  • Zucco C (2019) Data Mining in Bioinformatics. In: Encyclopedia of Bioinformatics and Computational Biology. Elsevier, pp 328–335
  • Kantardzic M Data mining : concepts, models, methods, and algorithms
  • Hastie T, Tibshirani R, Friedman J (2009) Elements of Statistical Learning 2nd ed. Elements 27:745. https://doi.org/10.1007/978-0-387-84858-7
  • Kohavi R (1995) A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection
  • Shaikh SA (2011) Measures Derived from a 2 x 2 Table for an Accuracy of a Diagnostic Test. J Biom Biostat 02:1–4. https://doi.org/10.4172/2155-6180.1000128
There are 32 citations in total.

Details

Primary Language English
Subjects Engineering
Journal Section Articles
Authors

Kemal Akyol 0000-0002-2272-5243

Publication Date September 30, 2021
Submission Date March 18, 2021
Published in Issue Year 2021 Volume: 9 Issue: 3

Cite

IEEE K. Akyol, “Feature Selection Based Data Mining Approach for Coronary Artery Disease Diagnosis”, APJES, vol. 9, no. 3, pp. 451–459, 2021, doi: 10.21541/apjes.899055.