Research Article

Enhancement Of Breast Cancer Diagnosis Accuracy With Deep Learning

Year 2019, 452 - 462, 31.10.2019
https://doi.org/10.31590/ejosat.638428

Abstract

Breast cancer is a highly prevalent and often fatal disease among women. This study proposes a new approach that uses deep learning, a machine learning technique, to improve the accuracy of breast cancer diagnosis. The method uses the original Breast Cancer Wisconsin data set from the Machine Learning Repository of the University of California, Irvine. The data set contains 699 records, each with 10 independent variables and 1 dependent variable. The 16 records containing faulty values were corrected so that the entire data set could be used. The data were normalized to reduce training time and then split into 80% for training, 10% for validation, and 10% for testing. An artificial neural network was designed as the deep learning model. It comprises 5 layers in total: an input layer with 10 neurons, 3 hidden layers with 1000 neurons each, and an output layer with 3 neurons. The software was written in Python with the Keras neural network API, using Spyder, an interactive development environment for Python. Model performance was evaluated with a confusion matrix and ROC (Receiver Operating Characteristic) analysis. On the test data, the trained model produced successful results. The proposed method is expected to contribute to improving the accuracy of breast cancer diagnosis.
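The preprocessing steps named in the abstract (repairing faulty values, normalizing, and splitting the data 80/10/10) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes faulty values are marked with `None`, that the repair is column-mean imputation (suggested by the reference list but not stated in the abstract), and that normalization means min-max scaling to [0, 1]. The function names and the random seed are hypothetical.

```python
import random

def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column.
    Assumption: the abstract does not say how the 16 faulty records were fixed."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

def min_max_normalize(rows):
    """Scale each column to [0, 1]; a constant column maps to 0.0.
    Assumption: the abstract does not name the normalization used."""
    n_cols = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_cols)]
    highs = [max(r[j] for r in rows) for j in range(n_cols)]
    return [[(r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
             for j in range(n_cols)]
            for r in rows]

def split_80_10_10(items, seed=0):
    """Shuffle with a fixed (hypothetical) seed, then cut into
    train / validation / test parts; the remainder goes to the test part."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(0.8 * len(items))
    n_val = int(0.1 * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# Toy usage: three records, one missing value in the first column.
rows = [[1, 5], [None, 3], [3, 1]]
clean = min_max_normalize(impute_column_means(rows))
train, val, test = split_80_10_10(range(699))
```

With 699 records, integer truncation gives 559 training and 69 validation records, leaving 71 for testing; the exact partition sizes used in the study are not given in the abstract.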

References

  • https://www.wcrf.org/dietandcancer/cancer-trends/breast-cancer-statistics (accessed 20.05.2019)
  • Montúfar, G.F. (2014). Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units, Neural Computation, Vol. 26, issue 7, pp 1386-1407.
  • Schmidhuber, J. (2015). Deep Learning in Neural Networks: An Overview. Neural Networks, Vol. 61, pp.85-117.
  • Goodfellow I, Bengio Y, Courville A. (2016). Deep Learning, MIT Press.
  • Banerjee, C., Paul, S., Ghoshal, M. (2017). A Comparative Study of Different Ensemble Learning Techniques using Wisconsin Breast Cancer dataset, International Conference on Computer, Electrical & Communication Engineering (ICCECE).
  • Ghosh, S., Hossain, J., Fattah S.A., Shahnaz, C. and Khan, A. I. (2017). Efficient approaches for accuracy improvement of breast cancer classification using Wisconsin database, 2017 IEEE Region 10 Humanitarian Technology Conference (R10-HTC), Dhaka, pp. 792-797.
  • Mumtaz, K., Sheriff, S., Duraiswamy, K. (2009). Evaluation of Three Neural Network Models using Wisconsin Breast Cancer Database, International Conference on Control, Automation, Communication and Energy Conservation.
  • Ashraf, M., Le, K., Huang, X. (2011). Iterative weighted k-nn for constructing missing feature values in Wisconsin breast cancer dataset. In: 2011 3rd International Conference on Data Mining and Intelligent Information Technology Applications (ICMIA), pp. 23–27
  • Zhang, D., Zou, L., Zhou, X. and He, F. (2018). Integrating feature selection and feature extraction methods with deep learning to predict clinical outcome of breast cancer, IEEE Access, vol. 6, pp. 28936–28944.
  • Wisconsin Breast Cancer original dataset. https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original) (accessed 10.05.2019)
  • Mean Imputation technique. https://chrisalbon.com/machine_learning/preprocessing_structured_data/impute_missing_values_with_means/ (accessed 20.05.2019)
  • https://scikit-learn.org/stable/modules/preprocessing.html (accessed 10.05.2019)
  • https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_validate.html (accessed 10.05.2019)
  • Zhang, T., Zheng, H., Zhang L. (2018). Verification CAPTCHA Based on Deep Learning. Proceedings of the 37th Chinese Control Conference, Wuhan, China.
  • Agarap, A.F. (2019). Deep learning using rectified linear units (relu), [Online]. ArXiv:1803.08375v2 [cs.NE] 7 Feb 2019
  • Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. and Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting, Journal of Machine Learning Research, 15(1), pp: 1929–1958.
  • Kingma, D. P. and Ba, J. (2015). Adam: A Method for Stochastic Optimization. ArXiv:1412.6980v8 [cs.LG] 23 Jul 2015
  • https://www.jqr.com/article/000505 (accessed 5.05.2019)
  • https://lasagne.readthedocs.io/en/latest/modules/objectives.html#lasagne.objectives.categorical_crossentropy (accessed 15.05.2019)
  • Powers, D. M. W. (2011). Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation, Journal of Machine Learning Technologies, 2 (1), pp.37–63.
  • Ting, K. M. (2011). Encyclopedia of machine learning. Springer.
  • Zweig, M. H. and Campbell, G. (1993) “Receiver-operating characteristic (ROC) plots: A fundamental evaluation tool in clinical medicine,” Clin. Chem., vol. 39, no. 4, pp. 561–577.

Derin Öğrenme İle Meme Kanseri Tanısının Doğruluğunun Geliştirilmesi (Improving the Accuracy of Breast Cancer Diagnosis with Deep Learning)


Abstract

Breast cancer is one of the diseases that occurs very frequently in women and causes death. In this study, a new approach is put forward to improve the accuracy of breast cancer diagnosis, a major problem of our time, using deep learning, one of the machine learning techniques. The designed method uses the original Breast Cancer Wisconsin data set, which is known from the literature and hosted in the University of California, Irvine Machine Learning Repository. The data set contains 699 records consisting of 10 independent variables and 1 dependent variable. The 16 corrupted records were corrected so that the whole data set could be used. The data set was normalized to shorten the training time. It was split into 80% for training, 10% for validation, and 10% for testing. An artificial neural network was designed for the deep learning model. The network consists of 5 layers in total: an input layer with 10 neurons, 3 hidden layers with 1000 neurons each, and an output layer with 3 neurons. The software developed for the application was coded in Spyder, an interactive development environment for the Python programming language. The Keras neural network API was used. The performance of the resulting model was evaluated with a confusion matrix and ROC (Receiver Operating Characteristic) analysis. According to the test data obtained at the end of training, the implemented model was observed to give successful results. The proposed method is considered likely to contribute to improving the accuracy of breast cancer diagnosis.
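The two evaluation tools named in the abstract, the confusion matrix and ROC analysis, can be illustrated with a short sketch. It is not the authors' evaluation code: the function names are hypothetical, and the ROC part assumes a binary reading of the problem (label 1 for the positive class, label 0 for the negative class) with one score per sample, which the abstract does not specify.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by lowering the decision threshold
    through the scores. Assumes labels are 0 or 1 and both classes occur."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy usage with made-up labels and scores (not data from the study).
cm = confusion_matrix([0, 1, 1, 0], [0, 1, 0, 0], n_classes=2)
curve = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
area = auc(curve)
```

In the toy example one positive sample is ranked below a negative one, so the area under the curve is 0.75 rather than 1.0; a model that ranked every positive above every negative would reach 1.0.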

There are 22 citations in total.

Details

Primary Language: English
Subjects: Engineering
Journal Section: Articles
Authors:

İlker Yıldız (ORCID: 0000-0002-1575-2673)

Alper Talha Karadeniz (ORCID: 0000-0003-4165-3932)

Publication Date: October 31, 2019
Published in Issue: Year 2019

Cite

APA Yıldız, İ., & Karadeniz, A. T. (2019). Enhancement Of Breast Cancer Diagnosis Accuracy With Deep Learning. Avrupa Bilim Ve Teknoloji Dergisi, 452-462. https://doi.org/10.31590/ejosat.638428