Research Article

PREDICTING SURVIVAL RATES IN CANCER PATIENTS WITH MACHINE LEARNING

Year 2024, Volume: 14 Issue: 28, 842 - 855, 30.11.2024
https://doi.org/10.53092/duiibfd.1494646

Abstract

Cancer is a major public health problem: it ranks second in disease burden in the United States and can rank among the leading causes in the global burden of disease. Cancer causes substantial mortality and morbidity and is influenced by many factors. Researchers are increasingly drawn to the field, both to examine the factors that cause the disease and to improve its management, and they study it with new treatment methods, techniques, and technologies. This study aimed to determine survival rates by analysing openly accessible cancer data representing 8.3% of the US population. The Konstanz Information Miner (KNIME) program, a data mining tool, was used in the research. The data were used to classify the survival of cancer patients. Decision tree, random forest, and Support Vector Machine (SVM) algorithms yielded various confidence levels. The highest confidence level, 75.3%, was obtained with the random forest algorithm. The results indicate that the model is meaningful and usable and that survival can be classified with the data obtained. Survival classification may be an important input for health service providers in resource allocation and effective care.

PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER

Abstract

Cancer is an important public health problem: it ranks second in burden of disease in the United States and can rank among the leading causes in the global burden of disease. Cancer, which causes significant mortality and morbidity, is affected by many factors. Researchers are increasingly interested in this field, both in examining the factors that cause the disease and in managing it, and they are studying the disease with new treatment methods, techniques, and technologies. This study aimed to determine survival rates by analysing open access cancer data representing 8.3% of the US population. The data were used to classify the survival of cancer patients. Within the scope of the research, various confidence levels were obtained with the decision tree, random forest, and SVM algorithms, which are data mining tools. The highest confidence level, 75.3%, was obtained with the random forest algorithm. The results indicate that the model is meaningful and usable and that survival classification can be made with the data obtained. Survival classification can be an important element for health service providers in resource allocation and effective care.
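
The classification step described in the abstract can be illustrated with a small sketch. This is not the authors' workflow: the study's tooling and data set are not reproduced here. The two features (age at diagnosis, stage), the risk rule that generates the labels, and all function names are hypothetical, and the ensemble is a bagged set of one-split decision stumps, a much simplified stand-in for a random forest. The sketch only shows the general idea of training several trees on bootstrap samples, combining them by majority vote, and scoring the result as the share of correctly classified held-out cases.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted class matches the recorded class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority(labels):
    """Most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must put cases on both sides
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:
        # No valid split (all rows identical): predict one constant label.
        lab = majority(labels)
        return (0, float("inf"), lab, lab)
    return best[1:]


def predict_stump(stump, row):
    """Apply one stump: left label if the feature is <= threshold, else right."""
    j, t, l_lab, r_lab = stump
    return l_lab if row[j] <= t else r_lab


def fit_forest(rows, labels, n_trees, rng):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = rng.choices(range(n), k=n)
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Majority vote over the stumps' predictions."""
    return majority([predict_stump(s, row) for s in forest])


def make_synthetic(n, rng):
    """Hypothetical records: (age, stage) with a made-up risk rule. Not real data."""
    rows, labels = [], []
    for _ in range(n):
        age = rng.randint(30, 90)
        stage = rng.randint(1, 4)
        risk = 0.15 * stage + 0.005 * (age - 30)
        labels.append("died" if rng.random() < risk else "survived")
        rows.append((age, stage))
    return rows, labels


def main():
    rng = random.Random(0)
    rows, labels = make_synthetic(400, rng)
    # Hold out the last 100 records so the score is not measured on training data.
    forest = fit_forest(rows[:300], labels[:300], n_trees=25, rng=rng)
    preds = [predict_forest(forest, r) for r in rows[300:]]
    print(f"held-out accuracy: {accuracy(labels[300:], preds):.3f}")


if __name__ == "__main__":
    main()
```

Running the script prints the held-out accuracy for the synthetic records. The number has no bearing on the 75.3% reported above, which comes from the study's own data and models.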

Details

Primary Language English
Subjects Health Economy
Journal Section Research Article
Authors

Cuma Çakmak 0000-0002-4409-9669

Fadime Çınar 0000-0002-9017-4105

Mehmet Aziz Çakmak 0000-0002-5040-5642

Early Pub Date July 22, 2024
Publication Date November 30, 2024
Submission Date June 2, 2024
Acceptance Date July 22, 2024
Published in Issue Year 2024 Volume: 14 Issue: 28

Cite

APA Çakmak, C., Çınar, F., & Çakmak, M. A. (2024). PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER. Dicle Üniversitesi İktisadi Ve İdari Bilimler Fakültesi Dergisi, 14(28), 842-855. https://doi.org/10.53092/duiibfd.1494646
AMA Çakmak C, Çınar F, Çakmak MA. PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER. Dicle Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi. November 2024;14(28):842-855. doi:10.53092/duiibfd.1494646
Chicago Çakmak, Cuma, Fadime Çınar, and Mehmet Aziz Çakmak. “PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER”. Dicle Üniversitesi İktisadi Ve İdari Bilimler Fakültesi Dergisi 14, no. 28 (November 2024): 842-55. https://doi.org/10.53092/duiibfd.1494646.
EndNote Çakmak C, Çınar F, Çakmak MA (November 1, 2024) PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER. Dicle Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi 14 28 842–855.
IEEE C. Çakmak, F. Çınar, and M. A. Çakmak, “PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER”, Dicle Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi, vol. 14, no. 28, pp. 842–855, 2024, doi: 10.53092/duiibfd.1494646.
ISNAD Çakmak, Cuma et al. “PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER”. Dicle Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi 14/28 (November 2024), 842-855. https://doi.org/10.53092/duiibfd.1494646.
JAMA Çakmak C, Çınar F, Çakmak MA. PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER. Dicle Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi. 2024;14:842–855.
MLA Çakmak, Cuma et al. “PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER”. Dicle Üniversitesi İktisadi Ve İdari Bilimler Fakültesi Dergisi, vol. 14, no. 28, 2024, pp. 842-55, doi:10.53092/duiibfd.1494646.
Vancouver Çakmak C, Çınar F, Çakmak MA. PREDICTING SURVIVAL LIMITATION BY MACHINE LEARNING IN PATIENT WITH CANCER. Dicle Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi. 2024;14(28):842-55.

All works published in this journal are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License.