Artificial intelligence-assisted detection model for melanoma diagnosis using deep learning techniques

The progressive depletion of the ozone layer poses a significant threat to both human health and the environment. Prolonged exposure to ultraviolet radiation increases the risk of developing skin cancer, particularly melanoma. Early diagnosis and vigilant monitoring play a crucial role in the successful treatment of melanoma, and effective diagnostic strategies are needed to curb the rising incidence of this disease worldwide. In this work, we propose an artificial intelligence-based detection model that employs deep learning techniques to monitor nevi with characteristics that may indicate the presence of melanoma. A dataset comprising 8598 images was used for model development. The dataset was split for training, validation, and testing, and the AlexNet, MobileNet, ResNet, VGG16, and VGG19 architectures documented in the current literature were trained and compared. Among these, the MobileNet model performed best, achieving an accuracy of 84.94% after the training and testing phases. Future plans involve integrating this model with a desktop program compatible with various operating systems, thereby establishing a practical detection system. The proposed model has the potential to aid qualified healthcare professionals in the diagnosis of melanoma. Furthermore, we envision the development of a mobile application to facilitate melanoma detection in home environments, providing added convenience and accessibility.


Introduction
Skin cancer is a disease that can have serious consequences if left untreated. An aggressive type of skin cancer such as melanoma, in particular, can spread quickly and metastasize to other organs. Skin cancer can be fatal if not diagnosed in time and treated appropriately. Because melanoma is more aggressive than other types of skin cancer and tends to metastasize, treatment becomes more difficult when it is diagnosed at an advanced stage. Even when melanoma is diagnosed and treated early, the disease can in some cases progress and become life-threatening. It is therefore important to take preventive measures, such as regular skin examinations and reduced sun exposure. Early detection can make treatment more effective and prevent serious consequences.

For the early diagnosis of this disease, image processing [1,2] and deep learning methods can be applied. Many studies have reported the application of artificial intelligence (AI) algorithms in the detection of various types of cancer [3][4][5], including skin cancer [6,7]. Yildiz [8] proposes an automatic detection system for melanoma using deep learning methods, specifically the C4Net model. Sultana and Puhan [9] review the use of deep learning techniques to distinguish melanoma from other skin lesions in clinical and dermoscopy images. They emphasize that deep learning techniques outperform traditional methods in skin cancer detection, but that labeling data for deep learning is challenging. In a similar study, Poorna et al. [10] discuss the development of a computer vision-aided system for the early diagnosis of melanoma. The study compares the accuracy and precision of conventional supervised learning techniques with deep learning-based methods for melanoma detection. The techniques used for classification were Total Dermoscopic Score, K Nearest Neighbor (KNN) [11][12][13], and Support Vector Machine (SVM) [14]. Kwiatkowska et al. [15] detected melanoma from dermoscopic images using ResNet and its variants. However, higher accuracy rates may be achievable with different models. Shchetinin et al. [16] developed a computer-aided system for melanoma detection with deep neural networks on the HAM10000 dataset. Combining multiple datasets instead of using a single one can yield a system better suited to real scenarios.

In this study, we developed a system for the early detection of melanoma using various deep learning techniques. The system is tested on a large dataset that combines two different datasets to make the system more realistic and more feasible for clinical applications. Training and testing were thus performed on a dataset that is relatively rich in both the variety and the number of images. The presented system is intended to support the early detection of skin cancer, which is likely to threaten our lives even more in the future, and to contribute to future research on the diagnosis of this disease.

Preparing the dataset
In this study, two different datasets were blended to obtain a dataset that is rich in terms of data diversity. The nevus category of the first dataset, called nevus classifier [17], supplied the healthy nevus images of the new dataset. The melanoma category of the second dataset, called skin lesions [18], supplied the cancerous nevus images. The resulting dataset thus consists of two classes: nevus, representing healthy mole images, and melanoma, representing cancerous mole images. The dataset contains a total of 8598 images, and their distribution is shown in Table 1. Each class in the dataset is divided into 60% training, 20% validation, and 20% test data. Figure 1 illustrates some of the images in the dataset. Since the dataset was considered rich and diverse enough, no data augmentation was applied.
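The per-class 60/20/20 division described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name split_class, the fixed random seed, and the file names are hypothetical, and the real class sizes are those listed in Table 1.

```python
import random


def split_class(items, train_pct=60, val_pct=20, seed=0):
    """Shuffle the images of one class and cut them into train/val/test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed: reproducible split
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # remainder goes to the test part
    }


# Hypothetical file names standing in for the two classes of the blended dataset.
dataset = {
    "nevus": [f"nevus_{i}.jpg" for i in range(50)],
    "melanoma": [f"melanoma_{i}.jpg" for i in range(30)],
}

# Splitting each class separately keeps the class ratio equal in every part.
splits = {label: split_class(files) for label, files in dataset.items()}
for label, parts in splits.items():
    print(label, {name: len(part) for name, part in parts.items()})
```

Splitting each class on its own, rather than the pooled dataset, keeps the nevus-to-melanoma ratio the same in the training, validation, and test parts.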

Proposed artificial intelligence models
The proposed AI model for the presented problem is based on a Convolutional Neural Network (CNN). CNNs [19][20][21] are a class of deep learning models that have proven successful in computer vision, with architectures designed for various problems. They can be used in the vision systems of robots and autonomous vehicles for face [22,23], object [24], and traffic sign [25,26] recognition. A CNN generally consists of convolution, pooling, and fully connected layers. In a CNN, the image is input directly into the network, followed by several convolution and pooling operations. The outputs of these operations feed one or more fully connected layers. Eventually, the class label is produced as the output. Neurons in the convolutional layers are organized into feature maps. The aim of the pooling layer is to reduce the spatial size (width x height) of its input. This simplifies the computation for the next layer and also helps to prevent overfitting. As in the preceding convolution layer, a filter window is slid over the image matrix in the pooling layer, and either the highest value (maximum pooling) or the average value (average pooling) of the pixels in the window is calculated. Maximum pooling is often used because it is simple and fast and tends to produce higher performance.
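The pooling operation described above can be illustrated with a short sketch. The function pool2d, the 2 x 2 window with stride 2, and the sample values are assumptions chosen for illustration; the window sizes used in the paper's models are not stated here.

```python
def pool2d(image, size=2, mode="max"):
    """Apply non-overlapping size x size pooling to a 2-D list of numbers."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Collect the values covered by the window at position (r, c).
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            if mode == "max":
                out_row.append(max(window))
            else:
                out_row.append(sum(window) / len(window))
        pooled.append(out_row)
    return pooled


# Hypothetical 4 x 4 feature map; pooling reduces it to 2 x 2.
feature_map = [
    [1, 3, 2, 0],
    [5, 6, 1, 2],
    [7, 2, 9, 4],
    [0, 1, 3, 8],
]
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
print(max_pooled)
print(avg_pooled)
```

Each output value summarizes one window, so the width and height are halved while the strongest (or mean) response in each region is kept.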
The fully connected layer is the last layer of the CNN structure and follows the convolution and pooling layers. It is a standard layer used in classification problems [27]. Convolution and pooling layers can be repeated many times before the fully connected layer is reached. However, the matrices at the output of the convolution and pooling layers need to be flattened before they can be used in the fully connected layer. Therefore, a flattening layer is placed before the fully connected layer; it converts the matrices from the convolution and pooling layers into a vector.
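A minimal sketch of flattening followed by a single fully connected unit is given below. The helper names flatten and dense, the feature-map values, and the weights are hypothetical; a real layer holds many such units with learned weights.

```python
def flatten(feature_maps):
    """Convert a list of 2-D feature maps into one flat vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def dense(vector, weights, bias):
    """One fully connected unit: weighted sum of all inputs plus a bias."""
    return sum(w * x for w, x in zip(weights, vector)) + bias


# Two hypothetical 2 x 2 feature maps coming out of the last pooling layer.
maps = [
    [[6, 2], [7, 9]],
    [[1, 0], [3, 4]],
]
vector = flatten(maps)  # 2 maps x 2 rows x 2 columns -> 8 values
score = dense(vector, weights=[0.1] * len(vector), bias=-1.0)
print(vector)
print(score)
```

The fully connected unit needs one weight per input value, which is why the matrices must first be laid out as a single vector.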
For the classification problem in this study, the AlexNet, MobileNet, ResNet, VGG16 [28,29], and VGG19 models were used. These models were adapted to the presented problem through the revisions illustrated in Figure 3. The nevus images obtained from Kaggle were sized to 224 x 224 and input to the models. For each model, sigmoid was chosen as the output activation function because it is well suited to two-class problems. In future studies, the softmax activation function may be preferred for classification problems with a larger number of outputs, as in the 1000-class model in Figure 3.
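The relation between the sigmoid output used here and the softmax output of multi-class models can be sketched as follows. The logit value, the 0.5 decision threshold, and the choice of melanoma as the positive class (label 1) are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Map a single logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Map a list of logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logit = 1.2  # hypothetical output of the final single-unit layer
p_positive = sigmoid(logit)
label = "melanoma" if p_positive >= 0.5 else "nevus"  # assumed threshold

# For two classes, softmax over [0, logit] gives the same probability,
# so one sigmoid unit is sufficient for a binary problem.
p_two_class = softmax([0.0, logit])[1]
print(label, round(p_positive, 4), round(p_two_class, 4))
```

With more than two classes there is no such single-logit shortcut, which is why softmax over one output per class is the usual choice there.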

Training and performance evaluation of models
The CNN-based models using the sigmoid activation function were trained for various values of the epoch, batch size, and learning rate parameters. For each model, the weights that provided the highest accuracy on the validation data were determined at the end of training, and testing was carried out with these weights. In this way, test performance was improved and over-fitting was limited. Over-fitting occurs when the algorithm memorizes the data in the training set; an algorithm that has memorized the training data cannot perform well enough on the test data. Therefore, the choice of training parameters is crucial. Table 2 lists the models proposed for this work and the parameters used to train them.
The learning rates are kept low to prevent over-fitting, and the number of epochs is set between 10 and 30. In addition, early stopping was included during the training of some of the models. When training is terminated early, the training accuracy must be above 95%. After training ends, the parameters that produced the maximum accuracy on the validation data are assigned to the model. This is one strategy used to avoid over-fitting.
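The checkpoint selection and early stopping described above can be sketched as a simple loop over per-epoch results. This is one possible reading of the procedure, not the authors' implementation: the function select_checkpoint is hypothetical, the accuracy values are invented, and stopping as soon as the training accuracy exceeds 95% is an assumed interpretation of the early-stop condition.

```python
def select_checkpoint(history, stop_train_acc=0.95):
    """Return (best_epoch, best_val_acc, last_epoch_run).

    history holds one (train_accuracy, val_accuracy) pair per epoch.
    """
    best_epoch, best_val_acc = None, -1.0
    last_epoch = 0
    for epoch, (train_acc, val_acc) in enumerate(history, start=1):
        last_epoch = epoch
        # Keep the weights of the epoch with the highest validation accuracy.
        if val_acc > best_val_acc:
            best_epoch, best_val_acc = epoch, val_acc
        # Assumed early-stop rule: stop once training accuracy exceeds 95%.
        if train_acc > stop_train_acc:
            break
    return best_epoch, best_val_acc, last_epoch


# Invented accuracies: validation accuracy peaks before training accuracy does.
history = [
    (0.70, 0.68),
    (0.82, 0.79),
    (0.90, 0.84),
    (0.94, 0.83),
    (0.96, 0.82),
    (0.98, 0.81),
]
best_epoch, best_val_acc, last_epoch = select_checkpoint(history)
print(best_epoch, best_val_acc, last_epoch)
```

In this invented run the weights of epoch 3 would be used for testing even though training continued to epoch 5, because later epochs improved only the training accuracy.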
The performance of the proposed models was evaluated through the metrics in Eqs. (1)-(5). The symbols in the metrics can be expressed in terms of the labels 1 and 0 as follows. True Positive (TP) means that the true value is 1 and the predicted value is 1, True Negative (TN) means that the true value is 0 and the predicted value is 0, False Positive (FP) means that the true value is 0 but the predicted value is 1, and False Negative (FN) means that the true value is 1 but the predicted value is 0. The performances of the models are reported alongside the hyper-parameters in Table 2. In neural networks [30,31], the training parameters must be chosen well for high performance and can be set similarly to the hyper-parameters in Table 2. Training seeks the global minimum of the error. In this process, various optimizers [32][33][34][35] are employed to reduce the computational load and minimize the loss. In the present work, the Adaptive Moment Estimation (Adam) optimizer [33] is preferred. The Adam algorithm is not only fast but also has low memory usage [36].
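The TP, TN, FP, and FN definitions above translate directly into code. The sketch below counts them from label vectors and derives five common metrics; since Eqs. (1)-(5) are not reproduced in this text, the choice of accuracy, precision, recall, specificity, and F1 score is an assumption, and the label vectors are invented.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(tp, tn, fp, fn):
    """Standard metrics; a ratio with an empty denominator is reported as 0."""
    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Invented labels: 1 and 0 as in the definitions above.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
counts = confusion_counts(y_true, y_pred)
metrics = binary_metrics(*counts)
print(counts)
print(metrics)
```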
In the tests of the models on the GPU, MobileNet achieved the highest performance, with close to 85% accuracy. Afterwards, the batch size for the MobileNet model was increased to save time. Figure 4 shows the ROC curve of the model. The area under the curve (AUC) can be considered an indicator of the model's performance: the higher the AUC, the better the performance of the trained model.
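How an ROC curve and its AUC are obtained from model outputs can be sketched as below. The functions roc_points and auc, the labels, and the scores are hypothetical, and the sketch assumes that no two samples share the same score; it does not reproduce the curve in Figure 4.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Visit samples from the highest score to the lowest (scores assumed distinct).
    for _, label in sorted(zip(scores, y_true), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Invented sigmoid outputs for six test images (1 = positive class).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
area = auc(curve)
print(curve)
print(area)
```

An AUC of 1 would mean that every positive sample scores higher than every negative one, while 0.5 corresponds to random ranking.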
Besides the metrics expressed by Eqs. (1)-(5), the confusion matrix in Table 3 is used to evaluate the test performance of the MobileNet model. The confusion matrix is composed of the TP, TN, FP, and FN values calculated for the testing process.
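As a cross-check, the test counts reported in this paper for the MobileNet model (728 melanoma images, of which 259 were classified as healthy, and 992 healthy images, all classified correctly) can be worked through with the standard metric definitions. Treating melanoma as the positive class is an assumption, as is the choice of metrics, since Eqs. (1)-(5) are not reproduced in this text.

```latex
\begin{align*}
TP &= 728 - 259 = 469, \qquad FN = 259, \qquad TN = 992, \qquad FP = 0 \\
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} = \frac{469 + 992}{1720} = \frac{1461}{1720} \approx 0.8494 \\
\text{Recall} &= \frac{TP}{TP + FN} = \frac{469}{728} \approx 0.6442 \\
\text{Specificity} &= \frac{TN}{TN + FP} = \frac{992}{992} = 1 \\
\text{Precision} &= \frac{TP}{TP + FP} = \frac{469}{469} = 1 \\
F_1 &= \frac{2\,TP}{2\,TP + FP + FN} = \frac{938}{1197} \approx 0.7836
\end{align*}
```

The accuracy agrees with the 84.94% reported for MobileNet. Under these assumptions, the recall shows that about 36% of the melanoma images in the test set were classified as healthy, which the accuracy figure alone does not reveal.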

The test performance of the proposed model is evaluated on 728 cancerous and 992 healthy nevus images that are not in the training dataset. While the model classified 259 of the 728 melanoma images as healthy, it correctly classified all 992 healthy mole images. Table 4 has been prepared to compare the performance of the model with benchmark works in the literature. Successful classifications have been performed with the model file obtained at the end of the training, and some of the classified images are illustrated in Figure 5.

Early diagnosis of skin cancer, aided by this and similar studies, can help prevent deaths from this disease. This paper shows that deep learning algorithms can be used successfully in the detection of skin cancer. As the size and diversity of the dataset increase, the success and accuracy of the deep learning model increase; however, labeling the data becomes more difficult. The proposed deep learning models should also be regarded as an assistant: their use should not be independent of experts, especially in the field of health.

Received: 09.06.2023 ➤ Revised: 27.06.2023 ➤ Accepted: 28.06.2023 ➤ Published: 30.06.2023

Figure 2 depicts a general CNN structure for the detection problem in this paper. The convolutional layers shown in the CNN diagram in Figure 2 serve as feature extractors; they extract the features of the input images.

Figure 1. Some of the dataset images

Figure 2. Customized CNN schematic for the classification problem

Figure 4. ROC curve of the MobileNet model

Figure 5. Classification outputs of the model

Table 1. The data distribution in the dataset

Table 2. Hyper-parameters used to train the models and performance metrics presented to evaluate the test performance of the models

Table 3. Test performance of the proposed model based on confusion matrix