Research Article

Performance Comparison of Different Machine Learning Techniques in Diagnosis of Breast Cancer

Year 2019, Issue: 16, 176 - 185, 31.08.2019
https://doi.org/10.31590/ejosat.553549

Abstract

Computers process information faster than people, but their ability to make decisions is limited. Various machine learning techniques are being developed so that today's computers can make better analyses and predictions. These techniques increase the decision-making power of computers and enable the development of decision support systems for experts in different fields. Because of their successful classification and diagnostic capabilities, machine learning techniques are increasingly used to assist medical specialists in diagnosing diseases, and their use in cancer diagnosis in particular is growing rapidly. Breast cancer is the second most common type of cancer in the world and the leading cause of cancer-related death among women. As with all other types of cancer, early diagnosis of breast cancer is critical in reducing the mortality rate. Diagnosing breast cancer by interpreting test results requires specialized human knowledge, but advances in machine learning techniques have produced successful studies in breast cancer diagnosis. Machine learning is a branch of artificial intelligence that allows computers to quickly identify patterns within large, complex data sets by learning from existing data. Owing to this ability, machine learning is widely used in cancer diagnosis, especially for breast cancer. In this study, the University of Wisconsin breast cancer data set, which consists of 569 samples with 30 features each, was classified using five different machine learning techniques. The data were randomly split into training and test sets. After training Support Vector Machine, Naïve Bayes, Random Forest, K-Nearest Neighbour and Logistic Regression models, confusion matrices and ROC curves were produced and the performance of each method was compared. The comparison showed that Logistic Regression was the most successful model, with 98.24% accuracy.
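The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's code: it assumes scikit-learn and its bundled copy of the Wisconsin diagnostic data set (569 samples, 30 features). The 80/20 stratified split, the fixed random seed, the feature scaling and all hyperparameters are assumptions, since the abstract does not state them, so the printed numbers should not be expected to match the reported 98.24%.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# scikit-learn's copy of the Wisconsin diagnostic data set.
# In this encoding, label 0 is malignant and label 1 is benign.
X, y = load_breast_cancer(return_X_y=True)

# Assumption: 80/20 stratified random split with a fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumption: default-like hyperparameters; scaling is added for the
# distance- and margin-based models because feature ranges differ widely.
models = {
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "K-Nearest Neighbour": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Continuous scores for the ROC analysis: class-1 probabilities when
    # available, otherwise the signed distance to the decision boundary.
    if hasattr(model, "predict_proba"):
        y_score = model.predict_proba(X_test)[:, 1]
    else:
        y_score = model.decision_function(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
    }

for name, r in results.items():
    print(f"{name}: accuracy={r['accuracy']:.4f}, ROC AUC={r['roc_auc']:.4f}")
    print(r["confusion_matrix"])
```

Each confusion matrix has true labels on the rows and predicted labels on the columns. The area under the ROC curve is reported here as a single-number stand-in for the ROC curves mentioned in the abstract. Results on a single random split depend on the seed, so a comparison between methods would be more robust with repeated splits or cross-validation.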



There are 37 citations in total.

Details

Primary Language: Turkish
Subjects: Engineering
Journal Section: Articles
Authors: Onur Sevli (ORCID: 0000-0002-8933-8395)
Publication Date: August 31, 2019
Published in Issue: Year 2019, Issue: 16

Cite

APA: Sevli, O. (2019). Göğüs Kanseri Teşhisinde Farklı Makine Öğrenmesi Tekniklerinin Performans Karşılaştırması. Avrupa Bilim Ve Teknoloji Dergisi, (16), 176-185. https://doi.org/10.31590/ejosat.553549
