Comparison of Machine Learning and Deep Learning Techniques for Stroke Prediction

Süphan Yakut; Necaattin Barışçı

doi:10.29137/umagd.1432162

Research Article

Year 2025, Volume: 17 Issue: 1, 11 - 27, 15.03.2025

Abstract

Early diagnosis and management of disease is critically important in health care. This comparative study evaluates models built with several machine learning and deep learning techniques (Logistic Regression, Decision Trees, Random Forest, Support Vector Machines, 1-D CNN, 1-D LSTM, 1-D BiLSTM) for determining the risk of stroke. The highest accuracy values obtained are: LR (0.96), DT (0.95), RF (0.95), SVM (0.96), CNN (0.9442), LSTM (0.9442), BiLSTM (0.9442). The analysis uses the Fedesoriano stroke dataset (healthcare-dataset-stroke-data), which contains clinical parameters including age, gender, hypertension, heart disease, marital status, work type, residence type, average glucose level, BMI, and smoking status. By evaluating how effectively different machine learning and deep learning models determine stroke risk, the study aims to make a significant contribution to the field of health.
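The accuracy values reported above are proportions of correctly classified records. As a minimal sketch of how such a figure is computed, the Python snippet below defines an accuracy function and, as a reference point, the accuracy of always predicting the most frequent class. The labels, the class balance, and the "hypothetical model" are synthetic assumptions made up for illustration; they are not taken from the study or from the dataset.

```python
from collections import Counter
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of records whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty sample")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def majority_baseline(y_true: Sequence[int]) -> float:
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Synthetic labels (assumption): 1 = stroke, 0 = no stroke.
y_true = [0] * 95 + [1] * 5
# Hypothetical model output: one false alarm among the negatives,
# and two of the five positives detected.
y_pred = [0] * 94 + [1] + [1, 1, 0, 0, 0]

model_acc = accuracy(y_true, y_pred)
baseline_acc = majority_baseline(y_true)
print(f"model accuracy:    {model_acc:.4f}")
print(f"majority baseline: {baseline_acc:.4f}")
```

Reading a model's accuracy next to the majority-class baseline shows how much of the score comes from the class distribution alone, which can be useful when one outcome is much rarer than the other.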

Keywords

Health, Stroke, Machine learning, Deep learning, Data analysis

References

Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
Breiman, L., Friedman, J., Olshen, R. A., & Stone, C. J. (1986). Classification and Regression Trees. Wadsworth & Brooks/Cole.
Chandra, B., Kausalya, K., & Ciddarth, R. M. (2023, February). Prognosis of stroke using machine learning algorithms. In 2023 7th International Conference on Computing Methodologies and Communication (ICCMC) (pp. 1-6). IEEE.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273-297.
Dietterich, T. G. (2000, June). Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems (pp. 1-15). Springer Berlin Heidelberg.
Emon, M. U., Keya, M. S., Meghla, T. I., Rahman, M. M., Al Mamun, M. S., & Kaiser, M. S. (2020, November). Performance analysis of machine learning approaches in stroke prediction. In 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) (pp. 1464-1469). IEEE.
Fedesoriano (2021). Stroke prediction dataset [Data set]. Kaggle. https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
Hosmer, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied Logistic Regression (3rd ed.). Wiley.
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24(6), 417-441.
Iglewicz, B., & Hoaglin, D. C. (1993). How to detect and handle outliers. Sage Publications.
Jain, A. K., Duin, R. P. W., & Mao, J. (2000). Statistical pattern recognition: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1), 4-37.
Jolliffe, I. T. (2002). Principal Component Analysis. Springer Series in Statistics.
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (Vol. 2, pp. 1137-1143).
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 1097-1105.
Kuksal, R., Vaqur, M., Bhatt, A., Chander, H., & Joshi, K. (2023, January). Stroke disease detection and prediction using extreme gradient boosting. In 2023 International Conference on Artificial Intelligence and Smart Communication (AISC) (pp. 187-191). IEEE.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
Lee, H., Lee, E. J., Ham, S., Lee, H. B., Lee, J. S., Kwon, S. U., ... & Kang, D. W. (2020). Machine learning approach to identify stroke within 4.5 hours. Stroke, 51(3), 860-866.
Oğuz, Ö., Bayır, S., & Badem, H. (2021). Makine öğrenmesi yöntemlerinin felç riskinin belirlenmesinde performansı: Karşılaştırmalı bir çalışma. Computer Science, (Special), 274-287.
Pallavi, K., & Saravananthirunavakarasu. (2022). Classification of stroke disease using machine learning algorithms. https://ijcrt.org/papers/IJCRT2208395.pdf
Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673-2681.
Sevli, O. (2021). İnme (Felç) riskinin makine öğrenmesi kullanılarak tespiti (Determining the risk of stroke using machine learning). https://www.researchgate.net/publication/351776808
Singh, M. S., & Choudhary, P. (2017, August). Stroke prediction using artificial intelligence. In 2017 8th Annual Industrial Automation and Electromechanical Engineering Conference (IEMECON) (pp. 158-161). IEEE.
Srinivas, A., & Mosiganti, J. P. (2023). A brain stroke detection model using soft voting based ensemble machine learning classifier. Measurement: Sensors, 29, 100871.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288.
World Health Organization. (2022, October 29). World Stroke Day 2022. Retrieved from https://www.who.int/srilanka/news/detail/29-10-2022-world-stroke-day-2022

There are 25 citations in total.

Details

Primary Language	English
Subjects	Information Systems (Other)
Journal Section	Articles
Authors	Süphan Yakut (ORCID 0009-0003-1411-8210), Necaattin Barışçı (ORCID 0000-0002-8762-5091)
Early Pub Date	March 3, 2025
Publication Date	March 15, 2025
Submission Date	February 9, 2024
Acceptance Date	September 29, 2024
Published in Issue	Year 2025 Volume: 17 Issue: 1

Cite

APA	Yakut, S., & Barışçı, N. (2025). Comparison of Machine Learning and Deep Learning Techniques for Stroke Prediction. International Journal of Engineering Research and Development, 17(1), 11-27. https://doi.org/10.29137/umagd.1432162
