Talasemi Hastalığı Tahmini İçin Farklı Makine Öğrenmesi Yöntemlerinin Kullanılması ve Karşılaştırılması

Ece Gülşah Abbasoğulları; Faruk Baturalp Gunay

doi:10.31466/kfbd.1512278

Research Article

Year 2024, Volume: 14, Issue: 4, pp. 1990-2007, 15.12.2024

Using and Comparing Different Machine Learning Methods for Thalassemia Disease Prediction

Abstract

Thalassemia is an inherited disease that reduces the amount of hemoglobin and the number of red blood cells in the human body. The disease cannot be cured, and some patients require lifelong blood transfusions, so early diagnosis is of great importance. The aim of this study is to predict thalassemia using machine learning classification methods. The data come from patients admitted to Erzurum Atatürk University Research Hospital. The study was carried out in Python in the Jupyter Notebook environment. Several classification methods were compared, including Naive Bayes (NB), K-Nearest Neighbor (KNN), Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF), and Decision Trees (DT), with the goal of reaching the best prediction result. The dataset was split into 70% for training and 30% for testing. To reduce deviations arising in these stages, k-fold cross-validation was applied. The classifiers were evaluated with performance metrics such as precision, recall, F1 score, accuracy, the area under the receiver operating characteristic curve (ROC-AUC), and log loss (logarithmic loss). Among the models built without cross-validation, KNN achieved the highest accuracy, 94.14%; among the models built with k-fold cross-validation, RF achieved the highest accuracy, 93.92%.
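
The evaluation ideas named in the abstract (accuracy, precision, recall, F1 score, ROC-AUC, log loss, and k-fold splitting) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names `binary_metrics`, `roc_auc`, and `k_fold_indices`, the 0.5 decision threshold, the contiguous unshuffled folds, and the toy labels and probabilities are all assumptions made for illustration. The hospital dataset is not reproduced here.

```python
import math

def binary_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall, F1 and log loss for 0/1 labels."""
    # The 0.5 threshold is an assumption, not a value taken from the paper.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    eps = 1e-15  # clip probabilities so log() stays finite
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                    for t, p in zip(y_true, clipped)) / len(y_true)
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1, "log_loss": log_loss}

def roc_auc(y_true, y_prob):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half. Both classes must be present in y_true.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0)
             for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = (list(range(0, start))
                     + list(range(start + size, n_samples)))
        yield train_idx, test_idx
        start += size

# Hypothetical toy data: true labels and predicted probabilities of class 1.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]

print(binary_metrics(y_true, y_prob))
print("ROC-AUC:", roc_auc(y_true, y_prob))
for train_idx, test_idx in k_fold_indices(len(y_true), 4):
    print("train:", train_idx, "test:", test_idx)
```

In k-fold cross-validation each sample is used for testing exactly once, and the per-fold scores are averaged. That is why a cross-validated accuracy can differ from the accuracy measured on a single 70/30 split. In a Jupyter workflow these steps are usually delegated to a library. scikit-learn, for example, provides `train_test_split`, `KFold`, and `cross_val_score` in `sklearn.model_selection`, and the corresponding metric functions in `sklearn.metrics`.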

Keywords

Machine Learning, Artificial Intelligence, Classification, Thalassemia, K-Fold Cross-Validation

The article lists 26 references in total.

Details

Primary Language	Turkish
Subjects	Computer Software, Software Engineering (Other)
Section	Articles
Authors	Ece Gülşah Abbasoğulları (ORCID 0000-0002-0685-6999); Faruk Baturalp Gunay (ORCID 0000-0001-5472-3608)
Publication Date	December 15, 2024
Submission Date	July 8, 2024
Acceptance Date	September 9, 2024
Published in Issue	Year 2024, Volume: 14, Issue: 4

Cite

APA	Abbasoğulları, E. G., & Gunay, F. B. (2024). Talasemi Hastalığı Tahmini İçin Farklı Makine Öğrenmesi Yöntemlerinin Kullanılması ve Karşılaştırılması. Karadeniz Fen Bilimleri Dergisi, 14(4), 1990-2007. https://doi.org/10.31466/kfbd.1512278

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.