Research Article

Artificial Immune System with Special Selection for Stroke Prediction in Imbalanced Data

Year 2024, Volume: 12 Issue: 3, 1723 - 1738, 31.07.2024
https://doi.org/10.29130/dubited.1268348

Abstract

Stroke is a neurological disease caused by bleeding or blockage in the brain, and it is becoming increasingly common worldwide. It can cause death as well as disability. Because there is no generally accepted, predictive diagnostic method, early diagnosis remains challenging. Detecting recurrent stroke incidents is also crucial. Early stroke prediction with artificial intelligence techniques has been studied many times in the literature, yet it remains an area open to improvement. In this study, a model is proposed to address the class imbalance in a stroke dataset with limited patient data. An artificial immune system algorithm whose parameters are updated by the firefly algorithm is used for data balancing. The algorithm's outputs are adjusted according to the One-Sided Selection model to improve performance on the minority class. The model's efficiency is presented through performance metrics evaluated on six classification algorithms: Categorical Boosting Algorithm (CatBoost), Light Gradient Boosting Machine (LightGBMBoost), Gradient Boosting (GB), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Logistic Regression (LR). The proposed approach achieved effective results compared to previous studies, with accuracy, specificity, and sensitivity rates of 86%, 38%, and 87%, respectively.
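The three reported metrics can all be derived from the counts of a binary confusion matrix. The sketch below is illustrative only and is not taken from the article: the function names are hypothetical, and it assumes the stroke class is labelled 1 and treated as the positive class. It also shows why accuracy alone is a weak indicator on imbalanced data.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (recall on positives), specificity (recall on negatives)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
    }


# Assumed toy data: 5 positive cases among 100 records, and a degenerate
# classifier that always predicts the negative (majority) class.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
result = binary_metrics(y_true, y_pred)
print(result)
```

In this toy case the always-negative classifier scores high accuracy while detecting none of the positive cases, which is why sensitivity and specificity are reported alongside accuracy when classes are imbalanced.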

References

  • [1] M. O. Owolabi et al., “The state of stroke services across the globe: Report of World Stroke Organization–World Health Organization surveys,” International Journal of Stroke, vol. 16, no. 8, pp. 889–901, May 2021, doi: https://doi.org/10.1177/17474930211019568.
  • [2] Y. Chen, K. T. Abel, J. T. Janecek, Y. Chen, K. Zheng, and S. C. Cramer, “Home-based technologies for stroke rehabilitation: A systematic review,” International Journal of Medical Informatics, vol. 123, pp. 11–22, Mar. 2019, doi: https://doi.org/10.1016/j.ijmedinf.2018.12.001.
  • [3] M. J. O’Donnell et al., “Global and regional effects of potentially modifiable risk factors associated with acute stroke in 32 countries (INTERSTROKE): a case-control study,” Lancet (London, England), vol. 388, no. 10046, pp. 761–75, 2016, doi: https://doi.org/10.1016/S0140-6736(16)30506-2.
  • [4] A. K. Arslan, C. Colak, and M. E. Sarihan, “Different medical data mining approaches based prediction of ischemic stroke,” Computer Methods and Programs in Biomedicine, vol. 130, pp. 87–92, Jul. 2016, doi: https://doi.org/10.1016/j.cmpb.2016.03.022.
  • [5] D. I. Puspitasari, A. F. Riza Kholdani, A. Dharmawati, M. E. Rosadi, and W. Mega Pradnya Dhuhita, “Stroke Disease Analysis and Classification Using Decision Tree and Random Forest Methods,” IEEE Xplore, Nov. 01, 2021. https://ieeexplore.ieee.org/document/9632906 (accessed Dec. 10, 2022).
  • [6] G. Haixiang, L. Yijing, J. Shang, G. Mingyun, H. Yuanyue, and G. Bing, “Learning from class-imbalanced data: Review of methods and applications,” Expert Systems with Applications, vol. 73, pp. 220–239, May 2017, doi: https://doi.org/10.1016/j.eswa.2016.12.035.
  • [7] J. Li et al., “Adaptive Swarm Balancing Algorithms for rare-event prediction in imbalanced healthcare data,” PLOS ONE, vol. 12, no. 7, p. e0180830, Jul. 2017, doi: https://doi.org/10.1371/journal.pone.0180830.
  • [8] F. Yagin, I. Cicek, and Z. Kucukakcali, “Classification of stroke with gradient boosting tree using smote-based oversampling method,” Medicine Science | International Medical Journal, vol. 10, no. 4, p. 1510, 2021, doi: https://doi.org/10.5455/medscience.2021.09.322.
  • [9] G. Sailasya and G. L. A. Kumari, “Analyzing the Performance of Stroke Prediction using ML Classification Algorithms,” International Journal of Advanced Computer Science and Applications, vol. 12, no. 6, 2021, doi: https://doi.org/10.14569/ijacsa.2021.0120662.
  • [10] C. Rana, N. Chitre, B. Poyekar, and P. Bide, “Stroke Prediction Using Smote-Tomek and Neural Network,” 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), Jul. 2021, doi: https://doi.org/10.1109/icccnt51525.2021.9579763.
  • [11] A. Dev and S. K. Malik, “Artificial Bee Colony Optimized Deep Neural Network Model for Handling Imbalanced Stroke Data,” International Journal of E-Health and Medical Communications, vol. 12, no. 5, pp. 67–83, Sep. 2021, doi: https://doi.org/10.4018/ijehmc.20210901.oa5.
  • [12] T. Liu, W. Fan, and C. Wu, “A hybrid machine learning approach to cerebral stroke prediction based on imbalanced medical dataset,” Artificial Intelligence in Medicine, vol. 101, p. 101723, Nov. 2019, doi: https://doi.org/10.1016/j.artmed.2019.101723.
  • [13] L. I. Santos et al., “Decision tree and artificial immune systems for stroke prediction in imbalanced data,” Expert Systems with Applications, vol. 191, p. 116221, Apr. 2022, doi: https://doi.org/10.1016/j.eswa.2021.116221.
  • [14] S. M. Hassan, S. A. Ali, B. Hassan, I. Hussain, M. Rafiq, and S. A. Awan, “Hybrid Features Binary Classification of Imbalance Stroke Patients Using Different Machine Learning Algorithms,” International Journal of Biology and Biomedical Engineering, vol. 16, pp. 154–160, Jan. 2022, doi: https://doi.org/10.46300/91011.2022.16.20.
  • [15] T. Ahammad, “Risk factors identification for stroke prognosis using machine learning algorithms,” Jordanian Journal of Computers and Information Technology, no. 0, p. 1, 2022, doi: https://doi.org/10.5455/jjcit.71-1652725746.
  • [16] E. L. Cooper, “Evolution of immune systems from self/not self to danger to artificial immune systems (AIS),” Physics of Life Reviews, vol. 7, no. 1, pp. 55–78, Mar. 2010, doi: https://doi.org/10.1016/j.plrev.2009.12.001.
  • [17] J. Timmis, A. Hone, T. Stibor, and E. Clark, “Theoretical advances in artificial immune systems,” Theoretical Computer Science, vol. 403, no. 1, pp. 11–32, Aug. 2008, doi: https://doi.org/10.1016/j.tcs.2008.02.011.
  • [18] E. L. Cooper, “Evolution of immune systems from self/not self to danger to artificial immune systems (AIS),” Physics of Life Reviews, vol. 7, no. 1, pp. 55–78, Mar. 2010, doi: https://doi.org/10.1016/j.plrev.2009.12.001.
  • [19] I. Fister Jr, X.-S. Yang, I. Fister, and J. Brest, “Memetic firefly algorithm for combinatorial optimization,” arXiv:1204.5165 [math], May 2012, Accessed: Feb. 19, 2023. [Online]. Available: https://arxiv.org/abs/1204.5165.
  • [20] N. V. Chawla, “Data Mining for Imbalanced Datasets: An Overview,” in Data Mining and Knowledge Discovery Handbook, 2009, pp. 875–886. doi: https://doi.org/10.1007/978-0-387-09823-4_45.
  • [21] C. Kahraman, O. Engin, and M. K. Yilmaz, “A new artificial immune system algorithm for Multiobjective Fuzzy Flow Shop,” International Journal of Computational Intelligence Systems, vol. 2, no. 3, pp. 236–247, 2009, doi: https://doi.org/10.1080/18756891.2009.9727656.
  • [22] M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, and F. Herrera, “A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 42, no. 4, pp. 463–484, Jul. 2012, doi: https://doi.org/10.1109/tsmcc.2011.2161285.
  • [23] E.-H. A. Rady and A. S. Anwar, “Prediction of kidney disease stages using data mining algorithms,” Informatics in Medicine Unlocked, vol. 15, p. 100178, 2019, doi: https://doi.org/10.1016/j.imu.2019.100178.
  • [24] M. F. S. V. D’Angelo, R. M. Palhares, M. C. O. Camargos Filho, R. D. Maia, J. B. Mendes, and P. Ya. Ekel, “A new fault classification approach applied to Tennessee Eastman benchmark process,” Applied Soft Computing, vol. 49, pp. 676–686, Dec. 2016, doi: https://doi.org/10.1016/j.asoc.2016.08.040.
  • [25] T. Liu, “Data for: A hybrid machine learning approach to cerebral stroke prediction based on imbalanced medical-datasets,” Mendeley, http://dx.doi.org/10.17632/X8YGRW87JW.1, 2019, URL: https://data.mendeley.com/datasets/x8ygrw87jw/1.

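Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı ve Uygulaması (Design and Implementation of a Hybrid Balancing Method with Special Selection for Stroke Prediction in Imbalanced Datasets)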


Abstract

Stroke is a neurological disease that results from bleeding or blockage in the brain, and it is becoming increasingly widespread around the world. It can cause death directly and can also lead to disability. Because there is no generally accepted, predictive diagnostic method, early diagnosis is very difficult. Detecting stroke cases that may recur is also of vital importance. Early stroke prediction using artificial intelligence techniques has been addressed many times in the literature, but it is still one of the areas open to improvement. In this study, a model is proposed to resolve the balancing problem on a stroke dataset in which patient records are the minority. In the proposed model, an artificial immune system algorithm whose parameters are updated according to the firefly algorithm is used for data balancing. The outputs of the algorithm are arranged according to the One-Sided Selection model in order to increase the performance of the minority class. The efficiency of the model is evaluated with six classification algorithms, namely Categorical Boosting Algorithm (CatBoost), Light Gradient Boosting Machine (LightGBMBoost), Gradient Boosting (GB), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Logistic Regression (LR), and is presented with performance metrics. The proposed approach obtained an accuracy of 86%, a specificity of 38%, and a sensitivity of 87%, showing that it produces effective results compared with studies in the literature.
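One-Sided Selection, as it is commonly described in the imbalanced-learning literature, keeps every minority sample and prunes majority samples, and one of its usual ingredients is the removal of majority samples that form Tomek links. The sketch below illustrates only that ingredient under stated assumptions; it is not the article's implementation, the function names are hypothetical, and label 0 is assumed to be the majority class.

```python
import math


def nearest_index(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    best, best_distance = None, math.inf
    for j, other in enumerate(points):
        if j == i:
            continue
        distance = math.dist(points[i], other)
        if distance < best_distance:
            best, best_distance = j, distance
    return best


def remove_tomek_majority(points, labels, majority_label=0):
    """Drop majority samples that form a Tomek link with a minority sample.

    Two samples form a Tomek link when they carry different labels and
    each one is the other's nearest neighbour.
    """
    nearest = [nearest_index(points, i) for i in range(len(points))]
    drop = set()
    for i, j in enumerate(nearest):
        mutual = nearest[j] == i
        if mutual and labels[i] != labels[j] and labels[i] == majority_label:
            drop.add(i)
    keep = [i for i in range(len(points)) if i not in drop]
    return [points[i] for i in keep], [labels[i] for i in keep]


# Assumed toy data: four majority points (label 0), two minority points (label 1).
# The majority point (5.0, 5.4) sits right next to the minority point (5.0, 5.0).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.4), (5.0, 5.0), (6.0, 6.0)]
labels = [0, 0, 0, 0, 1, 1]
clean_points, clean_labels = remove_tomek_majority(points, labels)
print(clean_points, clean_labels)
```

Only the borderline majority point is removed here; the minority samples are left untouched, which matches the goal of improving minority-class performance without discarding scarce patient records.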

There are 20 citations in total.

Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles
Authors

Şerife Çelikbaş 0000-0001-6118-9335

Zeynep Orman 0000-0002-0205-4198

Türker Aksoy 0000-0001-5258-9038

Derya Yılmaz Baysoy 0000-0002-8101-9779

Publication Date July 31, 2024
Published in Issue Year 2024 Volume: 12 Issue: 3

Cite

APA Çelikbaş, Ş., Orman, Z., Aksoy, T., Yılmaz Baysoy, D. (2024). Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı ve Uygulaması. Duzce University Journal of Science and Technology, 12(3), 1723-1738. https://doi.org/10.29130/dubited.1268348
AMA Çelikbaş Ş, Orman Z, Aksoy T, Yılmaz Baysoy D. Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı ve Uygulaması. DUBİTED. July 2024;12(3):1723-1738. doi:10.29130/dubited.1268348
Chicago Çelikbaş, Şerife, Zeynep Orman, Türker Aksoy, and Derya Yılmaz Baysoy. “Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı Ve Uygulaması”. Duzce University Journal of Science and Technology 12, no. 3 (July 2024): 1723-38. https://doi.org/10.29130/dubited.1268348.
EndNote Çelikbaş Ş, Orman Z, Aksoy T, Yılmaz Baysoy D (July 1, 2024) Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı ve Uygulaması. Duzce University Journal of Science and Technology 12 3 1723–1738.
IEEE Ş. Çelikbaş, Z. Orman, T. Aksoy, and D. Yılmaz Baysoy, “Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı ve Uygulaması”, DUBİTED, vol. 12, no. 3, pp. 1723–1738, 2024, doi: 10.29130/dubited.1268348.
ISNAD Çelikbaş, Şerife et al. “Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı Ve Uygulaması”. Duzce University Journal of Science and Technology 12/3 (July 2024), 1723-1738. https://doi.org/10.29130/dubited.1268348.
JAMA Çelikbaş Ş, Orman Z, Aksoy T, Yılmaz Baysoy D. Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı ve Uygulaması. DUBİTED. 2024;12:1723–1738.
MLA Çelikbaş, Şerife et al. “Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı Ve Uygulaması”. Duzce University Journal of Science and Technology, vol. 12, no. 3, 2024, pp. 1723-38, doi:10.29130/dubited.1268348.
Vancouver Çelikbaş Ş, Orman Z, Aksoy T, Yılmaz Baysoy D. Dengesiz Veri Kümelerinde İnme Tahmini İçin Özel Seçilimli Hibrit Dengeleme Yöntemi Tasarımı ve Uygulaması. DUBİTED. 2024;12(3):1723-38.