• Data mining • Majority voting • SMOTE algorithm • Cervical cancer • Classification
Cervical cancer is among the cancers most successfully treated when diagnosed early. This study aims to detect and classify the disease by applying data mining methods to a digitized dataset obtained from pap-smear tests. A two-stage architecture is proposed for the diagnosis of cervical cancer. In the first stage, missing data were removed from the dataset; in the second stage, a new dataset was obtained by using the Synthetic Minority Oversampling Technique (SMOTE) algorithm to balance the target classes. By applying the majority voting (MV) method, the structure with 4 target variables was reduced to a single target variable. Artificial Neural Network (ANN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbors (KNN) algorithms were applied to the two datasets for the diagnosis of cervical cancer, and the results obtained from the original dataset and the SMOTE-generated dataset were compared. ANN was the best method in terms of classification success and F-score, and the majority-voted target variable in the balanced dataset produced with SMOTE gave the most successful result. The experimental results showed that using MV and SMOTE together increased classification success from 93% to 99%.
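The two preprocessing ideas named in the abstract, collapsing four target variables into one by majority voting and balancing the classes with SMOTE, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names `majority_vote` and `smote`, the toy records, the assumption that the four targets are binary, and the rule that a 2-2 tie counts as positive are all hypothetical choices made for illustration; the abstract does not state how ties were resolved or which SMOTE settings were used.

```python
import random


def majority_vote(labels, tie_value=1):
    """Collapse several binary target values for one record into one label.

    With an even number of targets a tie is possible; resolving it as
    positive (tie_value=1) is an assumption of this sketch.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == negatives:
        return tie_value
    return 1 if positives > negatives else 0


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy records (hypothetical values): two features and four binary targets each.
features = [[0.1, 1.0], [0.2, 0.9], [0.9, 0.2], [1.0, 0.1], [0.15, 0.95], [0.12, 1.1]]
targets = [[0, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 0], [1, 1, 0, 1], [0, 0, 0, 0], [0, 0, 1, 0]]

# Step 1: reduce the four targets to a single majority-voted target.
y = [majority_vote(t) for t in targets]

# Step 2: oversample the minority class until both classes are the same size.
minority = [x for x, label in zip(features, y) if label == 1]
majority = [x for x, label in zip(features, y) if label == 0]
new_points = smote(minority, n_new=len(majority) - len(minority), k=1)

X_balanced = features + new_points
y_balanced = y + [1] * len(new_points)
```

Here `X_balanced` and `y_balanced` stand in for the balanced dataset that would then be passed to a classifier such as ANN, SVM, DT, RF, or KNN. In practice a maintained library implementation of SMOTE would normally be preferred over a hand-written one.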
Primary Language | English |
---|---|
Subjects | Computer Software |
Journal Section | Bilgisayar Mühendisliği / Computer Engineering |
Authors | |
Early Pub Date | May 27, 2023 |
Publication Date | June 1, 2023 |
Submission Date | December 22, 2022 |
Acceptance Date | January 16, 2023 |
Published in Issue | Year 2023 Volume: 13 Issue: 2 |