Research Article

CPV Tahmininde Makine Öğrenme Yöntemlerinin Performanslarının Değerlendirilmesi: Farklı Bölgeler ve Parametreler ile Büyük Veri Uygulaması

Year 2022, Volume: 2 Issue: 3, 8 - 18, 31.12.2022

Abstract

Scope: Interpreting, classifying, storing and cleaning the rapidly
growing structured and unstructured big data of different fields, so
that it becomes usable again, is among the most intensively studied
topics today. Correct interpretation of big data in the health field
is of vital importance because it enables fast and accurate diagnosis.
In this project, machine learning methods capable of interpreting
health data were applied to the specific case of canine parvovirus
(CPV) infection. Although CPV can be diagnosed from clinical findings,
the diagnosis must be supported by laboratory findings to distinguish
it from other infections. Because CPV can be fatal in puppies, correct
diagnosis is vital for distinguishing it from other infections that
present with bloody diarrhoea. For this reason, the virus was examined
together with other data that may be affected by it, and methods for
reaching the most accurate decision were compared and evaluated.
Purpose: The study aims to interpret CPV, popularly known as "delibaş"
(mad-head) disease and regarded as one of the most important
infectious agents of dogs, in terms of its different parameters using
the K-Nearest Neighbour (KNN), Random Forest (RF), Logistic Regression
and Naive Bayes classification algorithms.
Results/Findings: When the overall accuracy values were examined,
accuracy decreased for the logistic regression and RF methods when the
insignificant variable was removed from the model. The RF method made
its best predictions when the platelet (PLT) variable was in the
model, so it can give very efficient results in cases where this
variable is to be kept in the model. The KNN method gives better
results as the number of variables decreases. The machine learning
approach was observed to perform better and give more efficient
results especially as the data size increases.

Evaluating the Performance of Machine Learning Methods in CPV Prediction: Big Data Application with Different Regions and Parameters

Abstract

Scope: Interpreting, classifying, storing and cleaning the rapidly growing
structured and unstructured big data of different fields, so that it
becomes usable again, is among the most intensively studied topics today.
Correct interpretation of big data in the health field is of vital
importance because it enables fast and accurate diagnosis. In this project,
machine learning methods that can interpret health data were applied
specifically to canine parvovirus (CPV) infection. While CPV can be
diagnosed from clinical findings, the diagnosis needs to be supported by
laboratory findings to distinguish it from other infections. Because CPV
can result in death in puppies, correct diagnosis is vital for
distinguishing it from other infections that present with bloody
diarrhoea. For this reason, the virus was analysed together with other
data that may be affected by it, and methods for making the most accurate
decision were compared and evaluated.
Purpose: This study aims to interpret CPV, popularly known as mad-head
disease and considered one of the most important infectious agents of
dogs, in terms of its different parameters using the K-Nearest Neighbour
(KNN), Random Forest (RF), Logistic Regression and Naive Bayes
classification algorithms.
Result: When the total accuracy values were examined, accuracy decreased
for the logistic regression and RF methods when the insignificant
variable was removed from the model. The RF method made its best
predictions when the platelet (PLT) variable was in the model, so it can
give very efficient results in cases where this variable is to be kept.
The KNN method gives better results as the number of variables
decreases. The machine learning approach was observed to perform better
and give more efficient results especially as the data size increases.
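
The kind of comparison the abstract reports, scoring a classifier with and without a given variable, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic (two informative columns and one noise column standing in for a removable variable), the helper names `knn_predict`, `accuracy`, `drop_column` and `make_synthetic` are hypothetical, and only KNN is shown, written in plain Python.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Label x by majority vote among its k nearest training rows (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=5):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


def drop_column(rows, idx):
    """Return a copy of rows with column idx removed."""
    return [[v for j, v in enumerate(row) if j != idx] for row in rows]


def make_synthetic(n, rng):
    """Hypothetical data: columns 0 and 1 carry signal, column 2 is pure noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(3.0 * label, 1.0),
                  rng.gauss(-3.0 * label, 1.0),
                  rng.gauss(0.0, 5.0)])
        y.append(label)
    return X, y


rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_synthetic(400, rng)
train_X, test_X = X[:300], X[300:]  # simple hold-out split
train_y, test_y = y[:300], y[300:]

acc_all = accuracy(train_X, train_y, test_X, test_y)
acc_reduced = accuracy(drop_column(train_X, 2), train_y,
                       drop_column(test_X, 2), test_y)
print(f"all variables: {acc_all:.3f}   noise variable dropped: {acc_reduced:.3f}")
```

Because KNN relies on raw distances, an uninformative column tends to blur the neighbourhoods, which is one plausible reading of why the abstract reports better KNN results with fewer variables. The same with/without loop could be repeated for RF, Logistic Regression and Naive Bayes using a library implementation.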

Details

Primary Language: Turkish
Subjects: Knowledge Representation and Reasoning
Journal Section: Research Article
Authors

Gözde Zabzun 0000-0002-9502-8756

Meltem Sevinç 0000-0001-6351-3370

Çınar Dalkılıç 0000-0002-1881-5684

Kıvanç Ege Çam 0000-0002-5951-7245

Publication Date: December 31, 2022
Published in Issue: Year 2022, Volume: 2, Issue: 3

Cite

Vancouver Zabzun G, Sevinç M, Dalkılıç Ç, Çam KE. CPV Tahmininde Makine Öğrenme Yöntemlerinin Performanslarının Değerlendirilmesi: Farklı Bölgeler ve Parametreler ile Büyük Veri Uygulaması. JAIHS. 2022;2(3):8-18.