Classification of Skin Lesions in Dermatoscopic Images with a Deep Convolutional Network

In this study, a deep learning based solution using a convolutional neural network is presented for the classification of dermoscopic images containing skin lesions. Building models with classical machine learning techniques is time-consuming, and such models cannot make sense of the data without preprocessing. Thanks to deep learning, great progress has been made on problems that were long considered difficult to solve; deep learning achieves results by processing the data at hand without manual intervention. Classifying dermoscopic images of skin lesions, i.e. distinguishing benign from malignant melanocytic tumors, is a difficult task. Malignant melanoma is the deadliest form of skin cancer and one of the fastest-growing cancers in the world. If diagnosed early, it can be treated successfully, so early diagnosis of melanoma is vital. Dermoscopy has become one of the most important tools in the diagnosis of melanoma and other pigmented skin lesions. Due to the inaccuracy, subjectivity, and poor reproducibility of human judgment, there has been a need to process dermoscopy images with an automatic recognition algorithm. In 2017, a support vector machine (SVM) classifier was used to separate 172 dermatoscopic images into two classes, "benign" and "malignant", reaching 91% accuracy on that dataset. However, the fact that our dataset contains thousands of images that must be separated into seven lesion classes required us to look for more effective methods. Classifying melanomas is considered challenging because of the low contrast of skin lesions, artifacts such as noise, hair, and air bubbles, and the similarity between non-melanoma cases in dermoscopy images. To address these problems, we propose the VGGNET-16 architecture, a powerful convolutional neural network model, to classify seven different disease types in dermoscopic images.
The "HAM10000" (Human Against Machine) dataset was used with the VGGNET-16 architecture and the results were observed. The dataset, published as a training set for academic machine learning and made public through the ISIC archive, consists of 10015 dermatoscopic images. The K-Fold Cross Validation technique was used to split the dataset, which consists of seven lesion classes, into training and test sets. In the test phase of the trained model, a classification accuracy of 85.62% was obtained.


Introduction
In this study, a model was developed using deep learning architectures for early detection of skin cancer from dermoscopic images. Classifying dermoscopic images of skin lesions, i.e. distinguishing benign from malignant melanocytic tumors, is a difficult task. Malignant melanoma is the deadliest form of skin cancer and one of the fastest-growing cancers in the world. If diagnosed early, it can be treated successfully, so early diagnosis of melanoma is vital. Dermoscopy is an important tool in the early diagnosis of skin cancer. Due to the inaccuracy, subjectivity, and poor reproducibility of human judgment, there has been a need to process dermoscopy images with an automatic recognition algorithm.
With the introduction of "Deep Learning" methods in the field of machine learning, and with the help of GPUs and other hardware developments, artificial intelligence based methods have become the preferred approach. In the early 2000s, producing solutions by increasing the number of layers and nodes in Artificial Neural Networks (ANNs) was not an efficient method because the available hardware was insufficient [1].
Deep learning has enabled progress in many scientific fields such as speech recognition, visual object detection, and drug prediction [2]. The choice of learning algorithm is an important factor for large datasets [3]. The standard artificial neural network is a highly capable classification method; by using convolutional neural networks, we can additionally exploit the spatial structure of the inputs rather than treating them as independent [4]. Convolutional neural networks make the image input dataset more meaningful for the artificial neural network.
As the incidence of skin cancer has increased in recent years, so has the number of studies on early diagnosis. Looking at previous studies, we see that datasets are limited or class distributions are unbalanced. For this reason, we developed a more reliable model that provides comprehensive discrimination by balancing the dataset and using k-fold validation.

Material and Method
The images in the HAM10000 dataset are classified using the convolution, pooling, and fully connected layers of the model built with the VGGNET architecture. Images of size 600x450 pixels were reduced to 400x300 pixels before processing. The application was coded in the Python programming language using the Tensorflow, Keras, and Sklearn machine learning libraries, with PyCharm as the IDE.
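The resizing step described above can be sketched as follows. This is a minimal, illustrative NumPy implementation using nearest-neighbour sampling, not the study's actual preprocessing code (in practice a library routine such as Pillow's resize would be used; the function name and dummy array are assumptions):

```python
import numpy as np

def resize_nearest(img: np.ndarray, new_h: int, new_w: int) -> np.ndarray:
    """Downscale an H x W x C image by nearest-neighbour index sampling."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h   # source row for each output row
    cols = np.arange(new_w) * w // new_w   # source column for each output column
    return img[rows][:, cols]

# A dummy image of width 600 and height 450 (shape 450x600x3), reduced
# to width 400 and height 300, matching the paper's 600x450 -> 400x300 step.
dummy = np.zeros((450, 600, 3), dtype=np.uint8)
small = resize_nearest(dummy, 300, 400)
print(small.shape)  # (300, 400, 3)
```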

Deep Learning and CNN

Definition
In deep learning, many non-linear layers are used for feature extraction from the dataset; the output of one layer is passed as input to the next [5]. Algorithms can be supervised (such as classification) or unsupervised (such as pattern analysis). In this application, the class of each lesion image is known during the training stage of the artificial neural network model; therefore, supervised learning was used.
In supervised learning, both the inputs and the corresponding outputs in the dataset are known, and learning is performed using previously defined class labels [6]. The deep learning approach is based on learning by extracting features from the data itself; features are extracted from the image using vectors and edge information, and the most suitable algorithms are used to extract these features [7]. In the application, the "ReLU" function was used as the activation function, "Binary Cross-Entropy" as the loss function, and "Adam" as the optimizer in the convolution and fully connected neural network stages.
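The activation and loss functions named above can be written out explicitly. The following NumPy sketch is illustrative only and is not the Keras implementation used in the study:

```python
import numpy as np

def relu(x):
    """ReLU activation: passes positive values through, zeroes out negatives."""
    return np.maximum(0.0, x)

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(relu(np.array([-2.0, 0.0, 3.0])))  # [0. 0. 3.]
loss = binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.1]))
```

In Keras these correspond to `activation="relu"`, `loss="binary_crossentropy"`, and `optimizer="adam"` when compiling the model.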

History
The first supervised deep feedforward multilayer learning algorithm was published in 1965 by Ivakhnenko and Lapa [8]. The first deep learning architecture, the "Neocognitron", was proposed by Fukushima in 1979; Fukushima's networks contain multiple convolutional and pooling layers [10]. Yann LeCun and colleagues applied neural networks to the recognition of handwritten ZIP codes [11]. The network model gave good results, but the time required to train it was seen as a disadvantage: although the network worked successfully, it was found impractical since training took approximately 3 days [11].
Yann LeCun then used convolutional networks to classify handwritten digits (MNIST) with the LeNet network [12]. In 1995, Brendan Frey, Peter Dayan, and Geoffrey Hinton developed the wake-sleep algorithm, showing that a fully connected network with several hundred hidden units could be trained, even if training lasted two days [13]. In 1997, there were important developments such as the long short-term memory (LSTM) for recurrent neural networks proposed by Hochreiter and Schmidhuber [14].
The term "deep learning" was introduced in the context of ANNs in 2000 by Igor Aizenberg et al. [15]. In an article published in 2006, Geoffrey Hinton demonstrated how a multilayer feedforward neural network can be effectively trained one layer at a time (each layer pre-trained as an unsupervised Boltzmann machine) and then fine-tuned with supervised backpropagation [16]. In 1994, Binder et al. successfully used dermatoscopic images to train an artificial neural network to differentiate melanomas, the most lethal skin cancer, from melanocytic nevi [20]. In 2013, Mendonça et al. presented 200 dermatoscopic images as the PH2 dataset, including 160 nevi and 40 melanomas [21]. In 2018, the study by Philipp Tschandl, Cliff Rosendahl, and Harald Kittler presented the HAM10000 dataset as a large collection of multi-source dermatoscopic images of common pigmented skin lesions [22]. Also in 2018, Nithin D Reddy trained a convolutional neural network based on the ResNet50 architecture to classify dermoscopy images of skin lesions into one of seven disease categories, achieving a 91% balanced accuracy on the validation dataset [23].

Convolutional Neural Networks (CNN)
A neural network consists of neurons that communicate with each other. Each neuron has weight values, and the model produces accurate results by updating these weights while training the network. As shown in Figure 1, each layer processes the information from the previous layer to produce features. In a typical CNN there can be between 5 and 25 of these pattern recognition layers [18].
The first CNN, shown in Figure 3, is the LeNet architecture, introduced by Yann LeCun in 1988 and improved until 1998 [19]. CNN algorithms are used in many different fields such as image and sound processing, natural language processing (NLP), and biomedical applications.
A typical convolutional neural network is shown in Figure 4. In this study, the images were first passed through five layers of convolution and pooling operations using VGGNET. Three artificial neural network layers were used in the fully connected structure. To prevent the model from memorizing the training data, the "dropout" method was applied and some nodes in the neural network were randomly deactivated during training.
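The dropout mechanism mentioned above can be sketched in a few lines. This is a minimal "inverted dropout" illustration in NumPy, an assumption about the general technique rather than the Keras layer actually used in the study:

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: randomly zero a fraction `rate` of nodes during
    training and rescale survivors so the expected activation is unchanged."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)
a = np.ones((4, 8))                    # toy layer of activations, all 1.0
out = dropout(a, rate=0.5, rng=rng)    # surviving units scaled to 2.0
```

At inference time dropout is disabled; the rescaling during training is what keeps the layer's expected output consistent between the two phases.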

Data Set and Model
The HAM10000 dataset contains seven classes of skin lesions (Figure 5: classes in the HAM10000 dataset). As can be seen from the class distribution, the excessive number of samples belonging to the "Melanocytic nevi" class disrupts the homogeneous structure of the dataset. For this reason, the data of this class was rearranged (some of it was discarded) so that the class distribution became acceptable for training, as shown in Figure 6. As shown in the figure, the dataset is divided into 10 parts. In each round, the blue-painted part is allocated to the test set, while the other parts are reserved for training. When all rounds are over, the arithmetic mean of the errors (E) shows the performance of our model.
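The rebalancing step, discarding excess "Melanocytic nevi" samples, amounts to random undersampling of the majority class. The following sketch shows the idea with toy label counts; the function name, the target count, and the label codes (`nv`, `mel`, `bkl` as in HAM10000) are illustrative assumptions, not the study's actual numbers:

```python
import numpy as np

def undersample(labels, majority_label, target_count, rng):
    """Return indices that keep at most `target_count` samples of the
    majority class and all samples of every other class."""
    idx = np.arange(len(labels))
    majority = idx[labels == majority_label]
    others = idx[labels != majority_label]
    kept = rng.choice(majority, size=target_count, replace=False)
    return np.sort(np.concatenate([others, kept]))

rng = np.random.default_rng(42)
labels = np.array(["nv"] * 100 + ["mel"] * 20 + ["bkl"] * 15)
keep_idx = undersample(labels, "nv", target_count=30, rng=rng)
print(len(keep_idx))  # 65 = 30 "nv" + 20 "mel" + 15 "bkl"
```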
The purpose of separating the dataset into training and test sets is to avoid possible overfitting and to understand how the model performs on data it has not seen before. However, some errors may arise from the distribution of samples during the training and testing phases. The K-Fold Cross Validation technique is used to minimize these errors.
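The 10-round splitting scheme described above can be sketched as follows. This is a minimal hand-rolled version for illustration; in practice scikit-learn's `KFold` would typically be used:

```python
import numpy as np

def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index pairs for k-fold cross validation."""
    folds = np.array_split(np.arange(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]

# 10 rounds over the 10015 HAM10000 images: each round holds out ~1/10
# of the data for testing and trains on the remaining nine parts.
splits = list(kfold_indices(10015, 10))
sizes = [len(test) for _, test in splits]
print(len(splits), min(sizes), max(sizes))  # 10 1001 1002
```

The model's reported score is then the arithmetic mean of the ten per-round test scores.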

Confusion Matrix
The graphs of the "loss" and "accuracy" functions obtained during the training and validation stages of the model are given below, together with the confusion matrix obtained as a result of the application.
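For reference, a confusion matrix of the kind reported here is built by counting (true class, predicted class) pairs. The sketch below uses a toy three-class example with made-up labels, not the study's actual predictions:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy example with 3 of the 7 lesion classes (indices are illustrative).
y_true = [0, 0, 1, 2, 2, 2]
y_pred = [0, 1, 1, 2, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal
```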

Conclusions and Recommendations
It is possible to increase the success of the application by changing the convolution operations, activation functions, and loss functions, and by using more hidden layers in the artificial neural network. As the incidence of skin cancer has increased in recent years, so has the number of studies on early diagnosis. Looking at previous studies, we see that datasets are limited or class distributions are unbalanced. For this reason, we developed a more reliable model that provides comprehensive discrimination by balancing the dataset and using k-fold validation. In the test phase of the trained model, a classification accuracy of 85.62% was obtained.