Research Article

Breast Cancer Classification in Ultrasound Imaging Using Cost-Sensitive Learning and K-Means SMOTE on the Imbalanced BUSI Dataset with Deep Feature Extraction

Year 2025, Volume: 14 Issue: 2, 755 - 776, 30.06.2025
https://doi.org/10.17798/bitlisfen.1587411

Abstract

Breast cancer, which arises when breast tissue develops into a tumor, is one of the five most common types of cancer and mainly affects women. Early diagnosis is crucial for patient survival, whereas misclassifying a malignancy may delay treatment and set an irreversible process in motion for the patient. This study proposes an approach for classifying breast ultrasound images into malignant, benign, and healthy categories, with particular emphasis on minimizing false-negative outcomes. The BUSI dataset, which has an imbalanced class distribution, was used for breast cancer detection. To enhance feature representations and address the class imbalance, the dataset was augmented using contrast-limited adaptive histogram equalization (CLAHE), creating the BUSICL dataset. Features extracted from both datasets with the VGG16 and ResNet50 models were then classified using a support vector machine (SVM). After these results were analyzed, a cost-sensitive approach was applied in which the SVM's cost matrix values were adjusted according to the inverse proportions of the class distributions. In addition, the robustness of the proposed methodology was compared with that of the K-Means SMOTE algorithm. The proposed method achieved an overall accuracy of 99.36%, surpassing previous comprehensive classification studies on the BUSI dataset.
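As a rough illustration of the cost-sensitive step described in the abstract, the sketch below derives per-class weights that are inversely proportional to each class's share of the data. It is a minimal sketch, not the author's implementation: the function name `inverse_frequency_weights` is hypothetical, the normalization (n_samples / (n_classes × class count)) is one common convention chosen here as an assumption, and the class counts are the commonly reported BUSI figures rather than values stated in the abstract.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Hypothetical helper: uses the n_samples / (n_classes * count) convention,
    so rarer classes receive larger misclassification weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Illustrative class counts (an assumption, not taken from the abstract).
labels = ["benign"] * 437 + ["malignant"] * 210 + ["normal"] * 133

weights = inverse_frequency_weights(labels)
for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.3f}")
```

A dictionary of this shape could, for example, be passed as the `class_weight` argument of scikit-learn's `SVC`; whether the study used that library or this exact weighting is not stated in the abstract.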

Ethical Statement

The study complies with research and publication ethics.


Details

Primary Language: English
Subjects: Artificial Life and Complex Adaptive Systems
Journal Section: Research Article
Authors: Nedim Muzoglu (ORCID 0000-0003-1591-2806)
Early Pub Date: June 27, 2025
Publication Date: June 30, 2025
Submission Date: November 18, 2024
Acceptance Date: April 11, 2025
Published in Issue: Year 2025, Volume 14, Issue 2

Cite

IEEE N. Muzoglu, “Breast Cancer Classification in Ultrasound Imaging Using Cost-Sensitive Learning and K-Means SMOTE on the Imbalanced BUSI Dataset with Deep Feature Extraction”, Bitlis Eren Üniversitesi Fen Bilimleri Dergisi, vol. 14, no. 2, pp. 755–776, 2025, doi: 10.17798/bitlisfen.1587411.
