Performance Analysis of the Combination of CNN-Based Models with the AdaBoost Algorithm to Diagnose COVID-19 Disease

In this study, chest X-ray images labeled as Covid-19, Normal, and Viral Pneumonia were classified using two different proposed models, and the effect of several parameters on these models was investigated.


INTRODUCTION
The new disease, which was first seen in Wuhan, China, in December 2019, began to threaten the whole world over time [1]. This disease, which is caused by the SARS-CoV-2 virus, has been named coronavirus disease. For confirmed COVID-19 cases, commonly reported complaints include fever, cough, muscle pain, and fatigue. However, these symptoms are not unique to COVID-19, because they are similar to the flu-like symptoms of other viral diseases [2].
According to the statistics of the European Centre for Disease Prevention and Control, coronavirus disease had affected 71,554,018 people and caused 1,613,671 deaths between 31 December 2019 and week 50 of 2020. While in some cases this disease progresses with mild symptoms, it can also cause severe pneumonia, an inflammation of the air sacs in the lungs. Pneumonia is usually caused by bacteria or viruses [3]. Patients with Covid-19 symptoms are tested to diagnose the disease. However, it may take a few days to get results from these tests (RT-PCR), which can lead to fatal consequences, especially for patients with severe symptoms.
Another way to diagnose this disease is to use imaging techniques such as chest radiography (X-ray), magnetic resonance imaging (MRI), and computed tomography (CT). Since chest X-ray is a cheap and fast technique that exposes patients to a lower radiation dose than the other imaging techniques, it is the preferred method for diagnosing viral diseases [5]. Chest X-rays are also used in the diagnosis of pneumonia.
Finding the source of the infection in patients with symptoms of pneumonia can help diagnose Covid-19. However, diagnosing this disease from chest X-rays is a difficult and time-consuming task even for expert radiologists. Moreover, expert knowledge and experience are required to understand the cause of pneumonia and to diagnose Covid-19 from a chest X-ray [4]. It is generally not easy to reach experts who can determine whether pneumonia is caused by the coronavirus or by bacteria, because of the increasing number of cases per day and the insufficient number of experts. An approved and safe treatment that acts against SARS-CoV-2, the cause of Covid-19, is therefore urgently needed to decrease the number of cases and the death rate and to end the pandemic [3]. For this reason, an automatic and fast system is needed for the early diagnosis of this disease.
In unusual situations such as the Covid-19 pandemic, the need for hospitals and doctors increases, and hospital capacity, the number of doctors, and other resources may become insufficient as the number of cases grows.

BAGGING and BOOSTING ALGORITHMS
Ensemble learning is an effective and widely adopted technique that combines different learning algorithms to improve the performance of the overall model [16]. There are several ensemble learning methods, such as Bagging and Boosting.

Bagging Algorithm
Bagging, which was proposed by Breiman [17], stands for bootstrap aggregating. Bagging aims to improve classification accuracy and regression estimates by combining the predictions of models trained on randomly drawn training sets. In the Bagging method, the samples of each subset of the original dataset are selected with replacement. That means that after the samples of the first subset are chosen randomly, they are returned to the original set, and the samples of the second subset are then chosen randomly from this same original set.
In short, each sample has an equal chance of being included in a chosen training subset [18]. Each subset is then trained with the designated classifier. The predictions of the models, which work independently, are combined at the end to obtain better results. The main working principle of the Bagging method is shown in Figure 1; a small code sketch is given below.
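The following is a minimal sketch of bagging using scikit-learn, with a toy dataset and decision trees as the base learners. The library, dataset, and parameter choices are illustrative assumptions, not taken from this study (in scikit-learn versions before 1.2, the `estimator` keyword is named `base_estimator`).

```python
# A minimal bagging sketch on a toy dataset (illustrative, not the paper's pipeline).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 10 trees is trained on a bootstrap sample drawn with
# replacement from the training set; predictions are combined by voting.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=10,
    bootstrap=True,   # sample with replacement, as described above
    random_state=0,
)
bagging.fit(X_train, y_train)
print("Test accuracy:", bagging.score(X_test, y_test))
```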

Boosting Algorithm (Adaboost)
Boosting, which is another ensemble learning technique, combines weak classifiers/predictors to obtain a stronger classifier [19]. Adaboost, short for "adaptive boosting", was first presented by Freund and Schapire [15] in 1997 and is the most widely used boosting method. The main idea of the Adaboost algorithm is to give more weight to misclassified samples, so that an incorrectly classified sample is more likely to be classified correctly in the next step [20]. The working principle of Adaboost is listed below:
 Step 1: In the beginning, equal weights are given to each of the N training samples: w_i = 1/N (1).
 Step 2: The training samples are trained on the classifier, and the total error of the model is calculated as the sum of the weights of the misclassified samples: ε = Σ w_i over the misclassified samples (2).
 Step 3: The contribution ("amount of say") of the classifier is computed as α = (1/2) ln((1 − ε)/ε) (3).
 Step 4: New weights for misclassified samples: sample weight × e^α (4).
 Step 5: New weights for correctly classified samples: sample weight × e^(−α) (5).
Updating the weights with these formulas reduces the weight of correctly classified samples while increasing the weight of incorrectly classified samples.
Then, using these updated weights, a new dataset is created according to the weight values. When the new samples are selected, samples with higher weights have a greater chance of being included in the new dataset, and the same sample can be included in the new dataset more than once. Afterwards, the same steps are repeated for the new dataset [21]. In the end, a model is obtained that is sensitive to the samples misclassified by the previous model. Different base estimators/weak classifiers can be used in Adaboost. The steps above are repeated until the algorithm reaches the determined number of estimators, so it is important to choose an appropriate number of estimators to achieve better results. After the algorithm reaches the determined number of estimators, a voting method is generally used to obtain the final predictions. The main concept of the Adaboost algorithm can be seen in Figure 2; a numeric sketch of the weight update follows.
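The weight update in equations (1)-(5) can be traced numerically. The following is a small sketch with a hypothetical vector of correct/incorrect predictions; it is not part of the study's implementation.

```python
import numpy as np

# Toy weight update for one boosting round (hypothetical predictions).
N = 5
weights = np.full(N, 1.0 / N)                # eq. (1): equal initial weights
correct = np.array([True, True, False, True, False])

error = weights[~correct].sum()              # eq. (2): total error of the weak learner
alpha = 0.5 * np.log((1 - error) / error)    # eq. (3): amount of say

weights[~correct] *= np.exp(alpha)           # eq. (4): boost misclassified samples
weights[correct] *= np.exp(-alpha)           # eq. (5): shrink correctly classified samples
weights /= weights.sum()                     # normalize so the weights sum to 1

print(weights)  # misclassified samples now carry more weight
```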

MATERIALS and METHODS
The main purpose of this study is to perform multi-class classification of chest X-ray images using the Adaboost algorithm. Since the Adaboost algorithm must be fed with features of these images, automatic feature extraction with a CNN and with ResNet-152 has been used. The effects of these CNN-based automatic feature extractors on the Adaboost algorithm have been investigated over several hyperparameters, such as the learning rate and the number of iterations.

The Dataset
In this study, the dataset [22], which consists of chest X-ray images, has been used. This dataset is a collection of different datasets from different studies [23,24]. In the dataset, there are 1341 normal/healthy X-rays, 1345 viral pneumonia X-rays, and 219 Covid-19 X-rays. Since the collaborators continue to update the dataset, the number of images per label may differ at different times. There are 2905 images in total, each of a different size. The dataset has an unbalanced distribution over the classes. The number of images in each class was therefore balanced with the Synthetic Minority Oversampling Technique (SMOTE) to prevent a bias towards the majority class during training. SMOTE is an over-sampling approach in which the minority class is oversampled by creating "synthetic" examples to balance the image distribution across classes. The algorithm, developed by Chawla et al. in 2002 [25], has been applied to many unbalanced-dataset problems. It differs from other sampling techniques in that it creates synthetic samples instead of copying minority-class data. The steps by which the algorithm creates synthetic samples are listed below [25]:
 Step 1: The k-nearest neighbors (kNN) of each sample in the minority class are calculated.
 Step 2: The difference between the sample and one of its k nearest neighbors is taken.
 Step 3: This difference is multiplied by a random number chosen from [0,1].
 Step 4: The result from Step 3 is added to the original sample to obtain a new synthetic sample.
 Step 5: Steps 1-4 are repeated until a balanced dataset is reached.
Before the SMOTE method is applied, the original dataset is split into training and testing sets: 80% of the original dataset has been used for training and 20% for testing. After the split, the SMOTE method is applied only to the samples in the training set. Applying SMOTE after splitting instead of before yields a more reliable model: since SMOTE creates samples similar to existing ones, this order prevents similar samples from appearing in both the training and the testing set. If one of two similar samples were in training and the other in testing, the model could classify the test sample easily because it had effectively seen it during training. A code sketch of this order is given below. The distributions of the number of samples for each class before and after the SMOTE method can be seen in Tables 1 and 2, respectively. As seen in Table 2, the numbers of images in all classes of the training set became equal after the SMOTE method.
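A minimal sketch of this split-then-oversample order, using the SMOTE implementation from the imbalanced-learn library on stand-in data; the array shapes, class sizes, and random seeds are illustrative assumptions.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split

# Stand-in for flattened X-ray features: an imbalanced toy dataset
# (the real inputs would be the resized 112x112x3 images).
rng = np.random.default_rng(42)
X = rng.normal(size=(600, 64))
y = np.array([0] * 60 + [1] * 270 + [2] * 270)   # class 0 is the minority

# Split first (80/20), then oversample only the training set, so synthetic
# near-duplicates cannot leak into the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
X_train_bal, y_train_bal = SMOTE(k_neighbors=5, random_state=42).fit_resample(
    X_train, y_train
)
print(np.bincount(y_train_bal))  # all classes now equal in size
```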
In this study, since the images in the dataset are of different sizes, all RGB images in the training and test sets were resized to 112×112 pixels. Sample images from each class can be seen in Figure 3.

Classification with Adaboost(CNN)
In this part of the study, the first proposed method for classifying chest X-ray images is considered. The Adaboost method, one of the ensemble learning techniques, has been used as the classifier. The Adaboost algorithm needs features in order to perform classification. Feature extraction is a difficult and time-consuming process in machine learning applications: for successful feature extraction, the properties of the images and their labels must be well known. However, as Figure 3 shows, features that distinguish the classes of the X-ray images are not obvious, even visually; visual disease diagnosis from medical images is a difficult task even for experts in the field. Therefore, in this first part, a convolutional neural network (CNN) has been used as the feature extractor. This first proposed method is called Adaboost(CNN) in this study.

Figure 4 presents the steps of the CNN feature extractor that precedes the Adaboost classifier. As seen in Figure 4, the inputs of the model are chest X-ray images resized to 112x112x3 pixels. The model has five convolution layers. The convolution layer is also known as the feature extraction layer, because the properties of the input image are extracted in this layer. In this model, each convolution layer consists of a convolution operation, a ReLU activation function, and a max-pooling layer. A different number of filters has been used in each convolution layer: the numbers of filters are 16, 32, 64, 128, and 256, respectively. For example, in the first convolution layer, the convolution is applied to the input with 16 filters of size 3x3. The step size for sliding the filters over the input, called the stride, is set to 1. Zero padding has been used to control the output size of the convolution: with zero padding, the borders of the input are filled with zeros. After each convolution, ReLU has been used as the activation function, followed by max-pooling, which creates new feature maps by selecting the maximum value within each filter window. The max-pooling layers use 2x2 filters with a stride of 2 to decrease the size of the feature maps without affecting the depth. These parameters determine the output size of each layer; the size of each feature map can be seen in Figure 4.
After the last convolution layer, flattening has been applied to the output. Flattening converts the multi-dimensional feature map into a one-dimensional feature vector. Finally, a fully connected (FC) layer connects each element of this vector to 1024 neurons. In the end, 1024 features are obtained from each input to feed the Adaboost algorithm; a sketch of this extractor is given below. After the features are obtained using the CNN without its classification part, the Adaboost algorithm is used for classification. As stated earlier, the Adaboost algorithm needs a weak/base classifier.
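The following Keras sketch mirrors the extractor described above, assuming "same" zero padding throughout; the framework choice and any detail outside the stated parameters are assumptions, not the study's actual code.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(112, 112, 3))
x = inputs
# Five convolution blocks: 3x3 filters, stride 1, zero ("same") padding,
# ReLU activation, then 2x2 max pooling with stride 2.
for n_filters in (16, 32, 64, 128, 256):
    x = layers.Conv2D(n_filters, kernel_size=3, strides=1,
                      padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(pool_size=2, strides=2)(x)

x = layers.Flatten()(x)            # multi-dimensional map -> 1-D vector
features = layers.Dense(1024)(x)   # fully connected layer: 1024 features

extractor = keras.Model(inputs, features)
extractor.summary()  # the final feature vector has length 1024
# features_array = extractor.predict(images)  # inputs to the Adaboost stage
```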
In this study, a Support Vector Machine (SVM) is selected as the base classifier. SVM is a supervised learning method proposed by Cortes and Vapnik (1995) [26]. The main purpose of SVM is to define the hyperplane that best separates the classes. SVM learns the class labels of a dataset from its features, and the classification performance of the model is then tested on unseen test data [27].
In the Adaboost(CNN) model, the first proposed method of this study, the features of the X-ray images have been extracted using the CNN, and the effects of the number of estimators and of the learning rate have been investigated. As mentioned earlier, the Adaboost algorithm continues to make predictions iteratively until the specified number of iterations is reached, and the learning rate controls the contribution of each model to the ensemble prediction. Both parameters are therefore important for obtaining better classification results with Adaboost.
In order to see the effect of these parameters on the classification, the number of estimators was varied between 10 and 30, while the learning rate took values between 0.1 and 0.0001. For each number of estimators and each learning rate, the performance of the model whose features come from the CNN was recorded; a sketch of this sweep is given below. The main concept of Adaboost(CNN) is summarized in Figure 5.
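A sketch of this parameter sweep with scikit-learn's AdaBoostClassifier over an SVM base estimator. The stand-in feature arrays, the linear kernel, the exact grid values inside the stated ranges, and the discrete SAMME variant are assumptions not specified in the text (in scikit-learn versions before 1.2, `estimator` is named `base_estimator`).

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in for the 1024-dimensional CNN features (hypothetical data).
rng = np.random.default_rng(0)
feats = rng.normal(size=(300, 1024))
labels = rng.integers(0, 3, size=300)
X_tr, X_te, y_tr, y_te = train_test_split(feats, labels, random_state=0)

best_acc, best_params = 0.0, None
for n_estimators in (10, 15, 20, 25, 30):
    for lr in (0.1, 0.01, 0.001, 0.0001):
        clf = AdaBoostClassifier(
            estimator=SVC(kernel="linear"),  # base/weak classifier
            algorithm="SAMME",               # discrete AdaBoost; no predict_proba needed
            n_estimators=n_estimators,
            learning_rate=lr,
        )
        clf.fit(X_tr, y_tr)
        acc = clf.score(X_te, y_te)          # accuracy noted for each setting
        if acc > best_acc:
            best_acc, best_params = acc, (n_estimators, lr)
print(best_acc, best_params)
```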

Experimental Results
The classification performance of each proposed model in this study is evaluated with the accuracy metric. Precision, recall, and F1-score are then calculated for the model that gives the best average accuracy. These metrics are calculated for each class; in formulas (6), (7), (8), and (9) below, the subscript class_i stands for one of the classes Covid-19, Normal, and Viral Pneumonia:

Accuracy = (TP + TN) / (TP + TN + FP + FN) (6)
Precision_class_i = TP_class_i / (TP_class_i + FP_class_i) (7)
Recall_class_i = TP_class_i / (TP_class_i + FN_class_i) (8)
F1-score_class_i = 2 × (Precision_class_i × Recall_class_i) / (Precision_class_i + Recall_class_i) (9)

Accuracy is the ratio of correctly predicted values to the total number of samples. Precision shows how many of the values predicted as positive are actually positive. Recall shows how many of the values that should be predicted as positive are predicted as positive. The F1-score is the harmonic mean of precision and recall. After each metric is calculated for each class, the results are summed and divided by the number of classes, which is 3 in this study, to obtain the average results; a sketch of this computation follows. The detailed results can be seen in Table 3, Figure 8, Table 4, and Figure 9.
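These macro-averaged metrics can be computed as in the following scikit-learn sketch over hypothetical label vectors; the integer encoding of the three classes is an assumption.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

# Hypothetical labels: 0 = Covid-19, 1 = Normal, 2 = Viral Pneumonia.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0]

print(accuracy_score(y_true, y_pred))        # eq. (6)
# Per-class precision/recall/F1 averaged over the 3 classes (macro average),
# i.e. each per-class value summed and divided by the number of classes.
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(prec, rec, f1)                         # eqs. (7)-(9), macro-averaged
print(confusion_matrix(y_true, y_pred))      # basis of Figures 8 and 9
```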

Comparison
In this study, additional experiments were conducted to assess the effectiveness of the Adaboost classifier with CNN features. In these experiments, the CNN and ResNet-152 models that had been used as feature extractors were used directly as classifiers. To turn these models into classifiers, a classification (softmax) layer has been added to the end of each model; a sketch is given below. For both new models, the RMSProp optimizer (learning rate: 0.01) has been used, and for the training of each model, the batch size and the number of epochs were set to 32 and 20, respectively. The obtained accuracy results can be seen in Table 5 below.
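A sketch of this conversion, reusing the hypothetical `extractor` model from the earlier Keras sketch: a three-way softmax layer is appended and the model is compiled with RMSProp at the stated learning rate. The loss function and the data arrays are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Append a softmax classification layer to the feature extractor so the
# network itself predicts the three classes (assumed loss: cross-entropy
# over integer labels).
outputs = layers.Dense(3, activation="softmax")(extractor.output)
classifier = keras.Model(extractor.input, outputs)
classifier.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=0.01),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# Training settings stated in the text (X_train/y_train are placeholders):
# classifier.fit(X_train, y_train, batch_size=32, epochs=20)
```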

Figure 1.
Bagging Algorithm.

Table 1.
The number of images in each class before the SMOTE method.

Table 2.
The number of images in each class after the SMOTE method.

Table 3.
Average accuracy results of Adaboost(CNN) for each learning rate and number of estimators.

Figure 8.
Confusion matrix of Adaboost(CNN); precision, recall, and F1-score values for the model with the best average accuracy.

Table 4.
Average accuracy results of Adaboost(ResNet-152) for each learning rate and number of estimators.

Figure 9.
Confusion matrix of Adaboost(ResNet-152); precision, recall, and F1-score values for the model with the best average accuracy.

Table 3 shows the average accuracy results of the Adaboost(CNN) model for each number of estimators and each learning rate value. Since Adaboost(CNN) gives its best average accuracy when the number of estimators is 25 and the learning rate is 0.1, the confusion matrix and the average precision, recall, and F1-score values for these parameters, as seen in Table 3 and Figure 8, are presented in Figure 8.

Table 5.
Results of the CNN models as feature extractors and as classifiers.

As seen in Table 5 above, using the CNN models as feature extractors with the Adaboost classification algorithm clearly gives better classification results on the Covid-19 dataset. Finally, to assess the effectiveness of the proposed model, comparisons were made with similar studies. The comparison can be seen in Table 6 below.

Table 6.
The comparison of the results of the proposed method with similar studies.

As seen in Table 6, the approach of using a CNN as a feature extractor with the Adaboost classification algorithm can outperform the studies in the literature on Covid-19 diagnosis.

CONCLUSION
Diagnosis of diseases such as Covid-19 and pneumonia is crucial for human life. Diagnosing these diseases can take a long time, and sometimes a wrong diagnosis is made. For this reason, a machine learning algorithm is proposed in this study to classify these diseases in a short time and with high accuracy. In the proposed approach, instead of extracting features manually, features are extracted automatically from chest X-ray images using two different CNN-based methods: the proposed CNN and the pre-trained ResNet-152. Since the dataset used has an unbalanced distribution over the classes Normal, Covid-19, and Viral Pneumonia, the SMOTE method has been applied to the training set only, to obtain a balanced set. The Adaboost algorithm, one of the ensemble learning algorithms, was trained with the automatically extracted features. This algorithm, whose results were examined under different parameters, gives outstanding results, with an average accuracy of 0.945, in classifying Covid-19 and pneumonia, which are very difficult to diagnose from chest X-ray images even for experts. The results show that the Adaboost algorithm can achieve very high accuracy in diagnosing the disease from chest X-ray images using features automatically extracted by the proposed CNN. The combination of different types of CNN-based models with the Adaboost algorithm can thus be used to classify medical images.