Research Article
Year 2025, Issue 060, pp. 89-98, 25.03.2025
https://doi.org/10.59313/jsr-a.1604576

Analysis of patient demographics and test results using data mining methods and thyroid cancer examination

Abstract

This study aims to detect thyroid cancer and cancer recurrence using a dataset of 383 records described by 16 parameters: age, gender, smoking, smoking history, whether the patient received radiotherapy, thyroid function status, physical examination, adenopathy, pathology, focus, risk type, the T, N, and M stages depending on risk type, cancer level, and recurrence status. The tree-based classifiers Decision Stump, Hoeffding Tree, J48, LMT, Random Forest1, Random Forest2, and REP Tree, together with Naive Bayes, Logistic, Multilayer Perceptron, Simple Logistic1, Simple Logistic2, and IBk (k = 3), were run in the WEKA program. The results indicate that the Random Forest classifiers performed better than the other classifiers tested and than the studies reported in the literature.
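
As an illustration of one of the listed classifiers, the following is a minimal sketch of a 3-nearest-neighbour vote, the idea behind IBk with k = 3. It is not the study's WEKA configuration: the toy records, the three attributes chosen, the overlap distance, and the names TRAIN, overlap_distance, and knn_predict are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy records, NOT taken from the study's dataset.
# Each entry: ((gender, smoking, adenopathy), recurrence label).
TRAIN = [
    (("F", "No", "No"), "No"),
    (("F", "No", "No"), "No"),
    (("M", "Yes", "Right"), "Yes"),
    (("M", "No", "Right"), "Yes"),
    (("F", "Yes", "No"), "No"),
    (("M", "Yes", "Bilateral"), "Yes"),
]


def overlap_distance(a, b):
    """Count the nominal attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Return the majority label among the k records closest to query."""
    nearest = sorted(train, key=lambda rec: overlap_distance(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Classify a new hypothetical patient record.
    print(knn_predict(TRAIN, ("M", "Yes", "Right")))
```

With nominal attributes such as these, counting mismatches is one simple way to define closeness; a real comparison would use all 16 parameters and a held-out evaluation, which this sketch leaves out.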

There are 37 citations in total.

Details

Primary Language: English
Subjects: Biomedical Engineering (Other)
Journal Section: Research Articles
Authors: Şükrü Kitiş (ORCID 0000-0003-3302-3359)
Publication Date: March 25, 2025
Submission Date: December 20, 2024
Acceptance Date: March 6, 2025
Published in Issue: Year 2025, Issue 060

Cite

IEEE Ş. Kitiş, “Analysis of patient demographics and test results using data mining methods and thyroid cancer examination”, JSR-A, no. 060, pp. 89–98, March 2025, doi: 10.59313/jsr-a.1604576.