Research Article

Investigation Of Diabetes Data with Permutation Feature Importance Based Deep Learning Methods

Year 2022, Volume 12, Issue 2, 916-930, 15.12.2022
https://doi.org/10.31466/kfbd.1174591

Abstract

Diabetes is a metabolic disease caused by high blood sugar levels in the body. If left untreated, it can lead to health problems in many vital organs. Recent machine learning techniques make it possible to build applications that help diagnose diabetes at an early stage. In this study, the data set from the laboratories of Medical City Hospital Endocrinology and Diabetes Specialization Center Al Kindy Training Hospital was used. The data set consists of three classes: normal, pre-diabetes and diabetes. It was classified using three deep learning methods: Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN) and Gated Recurrent Unit (GRU). The classification performance of each algorithm was evaluated using accuracy, precision, sensitivity and F-score. The LSTM algorithm achieved a classification accuracy of 96.5%, the CNN algorithm 94% and the GRU algorithm 93%. The Permutation Feature Importance (PFI) method was also used to determine how much each feature in the data set affects classification performance. With this method, the study shows that HbA1c is an important feature for all of the deep learning methods used. Both the results obtained with the LSTM algorithm and the identification of the feature that most affects classification success constitute the original contributions of the study. The results suggest that the approach can provide healthcare professionals with a prognostic tool that supports effective decision-making and assists in the early detection of the disease.
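
The idea behind permutation feature importance can be illustrated with a minimal sketch. It is not the authors' implementation: the function name permutation_feature_importance, the toy_predict stand-in model and the synthetic three-feature data are all assumptions made for illustration. The sketch shuffles one feature column at a time and records how much the classification accuracy drops; a larger drop indicates a more important feature.

```python
import numpy as np


def permutation_feature_importance(predict, X, y, n_repeats=10, seed=0):
    """Return baseline accuracy and the mean accuracy drop per shuffled feature."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle only column j, which breaks its link to the labels.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[j] = np.mean(drops)
    return baseline, importances


def toy_predict(X):
    # Hypothetical stand-in for a trained classifier (assumption):
    # it assigns one of three classes by thresholding feature 0 only.
    return np.digitize(X[:, 0], bins=[-0.5, 0.5])


# Synthetic data (assumption): three features, labels depend on feature 0 alone.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(300, 3))
y_demo = toy_predict(X_demo)

baseline_acc, scores = permutation_feature_importance(toy_predict, X_demo, y_demo)
print("baseline accuracy:", baseline_acc)
print("importance per feature:", scores)
```

In this toy setup only feature 0 carries information, so it is the only feature whose permutation lowers the accuracy. With a trained LSTM, CNN or GRU model, predict would wrap the model's class prediction, and the same loop would rank features such as HbA1c.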

References

  • Ahlam, Rashid. 2020. “Diabetes Dataset.”
  • Alhassan, Zakhriya, A. Stephen McGough, Riyad Alshammari, Tahani Daghstani, David Budgen, and Noura Al Moubayed. 2018. “Type-2 Diabetes Mellitus Diagnosis from Time Series Clinical Data Using Deep Learning Models.” Pp. 468–78 in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 11141 LNCS. Springer Verlag.
  • Altmann, André, Laura Toloşi, Oliver Sander, and Thomas Lengauer. 2010. “Permutation Importance: A Corrected Feature Importance Measure.” Bioinformatics 26(10):1340–47. doi: 10.1093/bioinformatics/btq134.
  • Ayata, Deger, Murat Saraclar, and Arzucan Ozgur. 2017. “Uzun-Kisa Süreli Bellek Yinelemeli Aǧlar Ile Politik Yönelimlerin/Duygularin Twitter Üzerinden Tahminlenmesi.” in 2017 25th Signal Processing and Communications Applications Conference, SIU 2017. Institute of Electrical and Electronics Engineers Inc.
  • Battineni, Gopi, Getu Gamo Sagaro, Chintalapudi Nalini, Francesco Amenta, and Seyed Khosrow Tayebati. 2019. “Comparative Machine-Learning Approach: A Follow-up Study on Type 2 Diabetes Predictions by Cross-Validation Methods.” Machines 7(4). doi: 10.3390/machines7040074.
  • Bhardwaj, Sanjeev, Sachin Jain, Naresh Kumar Trivedi, Ajay Kumar, and Raj Gaurang Tiwari. 2022. “Intelligent Heart Disease Prediction System Using Data Mining Modeling Techniques.” Lecture Notes in Networks and Systems 425:881–91. doi: 10.1007/978-981-19-0707-4_79.
  • Bişkin, Osman Tayfun, and Ahmet Çifçi. 2021. “Forecasting of Turkey’s Electrical Energy Consumption Using LSTM and GRU Networks.” Bilecik Şeyh Edebali Üniversitesi Fen Bilimleri Dergisi. doi: 10.35193/bseufbd.935824.
  • Chen, Zuyan, Jared Walters, Gang Xiao, and Shuai Li. 2022. “An Enhanced GRU Model With Application to Manipulator Trajectory Tracking.” EAI Endorsed Transactions on AI and Robotics 1:1–11. doi: 10.4108/airo.v1i.7.
  • Cho, Kyunghyun, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. “Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation.” Pp. 1724–34 in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics.
  • Er, Mehmet Bilal, and İbrahim Işık. 2021. “LSTM Tabanlı Derin Ağlar Kullanılarak Diyabet Hastalığı Tahmini.” Türk Doğa ve Fen Dergisi. doi: 10.46810/tdfd.818528.
  • Fazakis, Nikos, Otilia Kocsis, Elias Dritsas, Sotiris Alexiou, Nikos Fakotakis, and Konstantinos Moustakas. 2021. “Machine Learning Tools for Long-Term Type 2 Diabetes Risk Prediction.” IEEE Access 9:103737–57. doi: 10.1109/ACCESS.2021.3098691.
  • Fischer, Thomas, and Christopher Krauss. 2018. “Deep Learning with Long Short-Term Memory Networks for Financial Market Predictions.” European Journal of Operational Research 270(2):654–69. doi: 10.1016/j.ejor.2017.11.054.
  • Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. “Long Short-Term Memory.” Neural Computation 9(8):1–32.
  • Ibrahim, Bassem, and Roozbeh Jafari. 2019. “Cuffless Blood Pressure Monitoring from an Array of Wrist Bio-Impedance Sensors Using Subject-Specific Regression Models: Proof of Concept.” IEEE Transactions on Biomedical Circuits and Systems. doi: 10.1109/TBCAS.2019.2946661.
  • Kandhasamy, J. Pradeep, and S. Balamurali. 2015. “Performance Analysis of Classifier Models to Predict Diabetes Mellitus.” Pp. 45–51 in Procedia Computer Science. Vol. 47. Elsevier B.V.
  • Karabiber, Cansu, and Nazan Savaş. 2021. “Birinci Basamak Merkez Laboratuvarı HbA1c Verilerine Göre XXXX’da Glisemik Kontrol Durumu ve İlişkili Faktörler.” Türkiye Halk Sağlığı Dergisi. doi: 10.20518/tjph.853697.
  • Kesici, Mert. 2019. “Güç Sistemlerinde Geçici Hal Kararsızlığının ve Gelişiminin Derin Öğrenme ve Karar Ağacı Tabanlı Yöntemler Ile Geniş Alan Ölçümlerine Dayalı Olarak Erken Kestirimi.” Istanbul Technical University.
  • Kumari, Saloni, Deepika Kumar, and Mamta Mittal. 2021. “An Ensemble Approach for Classification and Prediction of Diabetes Mellitus Using Soft Voting Classifier.” International Journal of Cognitive Computing in Engineering 2:40–46. doi: 10.1016/j.ijcce.2021.01.001.
  • Makroum, Mohammed Amine, Mehdi Adda, Abdenour Bouzouane, and Hussein Ibrahim. 2022. “Machine Learning and Smart Devices for Diabetes Management: Systematic Review.” Sensors 22(5).
  • Molnar, Christoph. 2022. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. Second Edition. christophm.github.io.
  • O’Shea, Keiron, and Ryan Nash. 2015. “An Introduction to Convolutional Neural Networks.”
  • Otchere, Daniel Asante, Mary Aboagye, Mohammed Ayoub, Abdalla Mohammed, and Thomas Boahen Boakye. 2022. Enhancing Drilling Fluid Lost-Circulation Prediction Using Model Agnostic and Supervised Machine Learning.
  • Peng, Min, Chongyang Wang, Tong Chen, and Guangyuan Liu. 2016. “NIRFaceNet: A Convolutional Neural Network for near-Infrared Face Identification.” Information (Switzerland) 7(4). doi: 10.3390/info7040061.
  • Qawqzeh, Yousef K., Abdullah S. Bajahzar, Mahdi Jemmali, Mohammad Mahmood Otoom, and Adel Thaljaoui. 2020. “Classification of Diabetes Using Photoplethysmogram (PPG) Waveform Analysis: Logistic Regression Modeling.” BioMed Research International 2020. doi: 10.1155/2020/3764653.
  • Rajput, Minakshi R., and Sushant S. Khedgikar. 2022. “Diabetes Prediction and Analysis Using Medical Attributes: A Machine Learning Approach.” Journal of Xi’an University of Architecture & Technology 14(1):98–103. doi: 10.37896/JXAT14.01/314405.
  • Rengasamy, Divish, Benjamin C. Rothwell, and Grazziela P. Figueredo. 2021. “Towards a More Reliable Interpretation of Machine Learning Outputs for Safety-Critical Systems Using Feature Importance Fusion.” Applied Sciences (Switzerland) 11(24). doi: 10.3390/app112411854.
  • Sadeghi, Somayeh, Davood Khalili, Azra Ramezankhani, Mohammad Ali Mansournia, and Mahboubeh Parsaeian. 2022. “Diabetes Mellitus Risk Prediction in the Presence of Class Imbalance Using Flexible Machine Learning Methods.” BMC Medical Informatics and Decision Making 22(1). doi: 10.1186/s12911-022-01775-z.
  • Sagheer, Alaa, and Mostafa Kotb. 2019. “Time Series Forecasting of Petroleum Production Using Deep LSTM Recurrent Networks.” Neurocomputing 323:203–13. doi: 10.1016/j.neucom.2018.09.082.
  • Shishvan, Omid Rajabi, Daphney Stavroula Zois, and Tolga Soyata. 2018. “Machine Intelligence in Healthcare and Medical Cyber Physical Systems: A Survey.” IEEE Access 6:46419–94.
  • Sisodia, Deepti, and Dilip Singh Sisodia. 2018. “Prediction of Diabetes Using Classification Algorithms.” Pp. 1578–85 in Procedia Computer Science. Vol. 132. Elsevier B.V.
  • Sun, Yun Lei, and Da Lin Zhang. 2019. “Machine Learning Techniques for Screening and Diagnosis of Diabetes: A Survey.” Tehnicki Vjesnik 26(3):872–80.
  • Swapna, G., K. P. Soman, and R. Vinayakumar. 2018. “Automated Detection of Diabetes Using CNN and CNN-LSTM Network and Heart Rate Signals.” Pp. 1253–62 in Procedia Computer Science. Vol. 132. Elsevier B.V.
  • Tafa, Zhilbert, Nerxhivane Pervetica, and Bertran Karahoda. 2015. “An Intelligent System for Diabetes Prediction.” Pp. 378–82 in Proceedings - 2015 4th Mediterranean Conference on Embedded Computing, MECO 2015 - Including ECyPS 2015, BioEMIS 2015, BioICT 2015, MECO-Student Challenge 2015. Institute of Electrical and Electronics Engineers Inc.
  • Wang, Huaizhi, Haiyan Yi, Jianchun Peng, Guibin Wang, Yitao Liu, Hui Jiang, and Wenxin Liu. 2017. “Deterministic and Probabilistic Forecasting of Photovoltaic Power Based on Deep Convolutional Neural Network.” Energy Conversion and Management 153:409–22. doi: 10.1016/j.enconman.2017.10.008.
  • Xiao, Yuelei, and Yang Yin. 2019. “Hybrid LSTM Neural Network for Short-Term Traffic Flow Prediction.” Information (Switzerland) 10(3). doi: 10.3390/info10030105.

Diyabet Verilerinin Permütasyon Önem Özelliği Temelli Derin Öğrenme Yöntemleriyle İncelenmesi



There are 35 citations in total.

Details

Primary Language English
Subjects Engineering
Journal Section Articles
Authors

Mehmet İsmail Gürsoy 0000-0002-2285-5160

Ahmet Alkan 0000-0003-0857-0764

Publication Date December 15, 2022
Published in Issue Year 2022, Volume 12, Issue 2

Cite

APA Gürsoy, M. İ., & Alkan, A. (2022). Investigation Of Diabetes Data with Permutation Feature Importance Based Deep Learning Methods. Karadeniz Fen Bilimleri Dergisi, 12(2), 916-930. https://doi.org/10.31466/kfbd.1174591