Research Article

Machine Learning Approach for Thyroid Cancer Diagnosis Using Clinical Data

Year 2023, Volume: 9 Issue: 3, 440 - 452, 31.08.2023
https://doi.org/10.19127/mbsjohs.1282265

Abstract

Objective: Thyroid cancer is one of the world's most significant health issues, and early diagnosis makes it feasible to treat nodules before malignant thyroid gland cells spread. Developing models that predict thyroid cancer has therefore become crucial. The purpose of this study is to develop a clinical decision support model for the prediction of thyroid cancer using Bagged CART, a machine learning (ML) model.
Methods: The study's dataset comprised 724 patients who presented to China Medical University Shengjing Hospital between 2010 and 2012. It contains information on nodule malignancy, demographic characteristics, ultrasound characteristics, and blood test results for all patients who underwent thyroidectomy. The Bagged CART modeling technique was applied to this open-access dataset. The model's predictive performance was evaluated with accuracy (ACC), balanced accuracy (BACC), sensitivity (Sen), specificity (Spe), positive predictive value (PPV), negative predictive value (NPV), and F1-score. A 10-fold cross-validation method was used to assess the validity of the model. Variable importance, which indicates how strongly each input variable affects the output variable, was also calculated.
Results: The ACC, BACC, Sen, Spe, PPV, NPV, and F1-score values obtained from the model were 99.1%, 98.7%, 99.7%, 97.7%, 99.1%, 99.2%, and 99.4%, respectively. According to the variable importance values calculated for the input variables in the dataset, the seven most important variables were, in order: size, TSH, blood flow: enriched, multilateral: yes, FT4, site: isthmus, and age.
Conclusion: The findings of this study show that the Bagged CART model was effective at predicting thyroid cancer. The study also evaluated risk factors for thyroid cancer and reported their importance values. These results can accelerate the decision-making process for the disease and can thus be effective in preventive medicine practices.
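
As an illustration of how the seven performance metrics named in the abstract relate to one another, the following minimal Python sketch computes them from the four cells of a binary confusion matrix. It is not the authors' code. The class name `ConfusionCounts`, the function name `classification_metrics`, and the example counts are hypothetical and are not taken from the study's data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a binary confusion matrix (malignant = positive class)."""
    tp: int  # malignant nodules predicted malignant
    fp: int  # benign nodules predicted malignant
    tn: int  # benign nodules predicted benign
    fn: int  # malignant nodules predicted benign


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return the seven metrics listed in the abstract as fractions in [0, 1]."""
    sen = c.tp / (c.tp + c.fn)            # sensitivity (recall)
    spe = c.tn / (c.tn + c.fp)            # specificity
    ppv = c.tp / (c.tp + c.fp)            # positive predictive value (precision)
    npv = c.tn / (c.tn + c.fn)            # negative predictive value
    acc = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    bacc = (sen + spe) / 2                # balanced accuracy
    f1 = 2 * ppv * sen / (ppv + sen)      # harmonic mean of PPV and Sen
    return {"ACC": acc, "BACC": bacc, "Sen": sen, "Spe": spe,
            "PPV": ppv, "NPV": npv, "F1": f1}


# Hypothetical counts chosen only to make the arithmetic easy to follow.
example = ConfusionCounts(tp=90, fp=5, tn=45, fn=10)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy is the mean of sensitivity and specificity, which matches the reported figures: (99.7% + 97.7%) / 2 = 98.7%.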

There are 35 citations in total.

Details

Primary Language: English
Subjects: Health Care Administration
Journal Section: Research articles
Authors

İpek Balıkçı Çiçek 0000-0002-3805-9214

Zeynep Küçükakçalı 0000-0001-7956-9272

Publication Date: August 31, 2023
Published in Issue: Year 2023, Volume: 9, Issue: 3

Cite

Vancouver Balıkçı Çiçek İ, Küçükakçalı Z. Machine Learning Approach for Thyroid Cancer Diagnosis Using Clinical Data. Mid Blac Sea J Health Sci. 2023;9(3):440-52.

e-ISSN 2149-7796