Research Article

The Effect of Dimension Reduction on Fuzzy C-Means Clustering Techniques

Year 2021, Volume: 4 Issue: 1, 1 - 7, 15.01.2021

Abstract

Fuzzy c-means clustering is one of the most common clustering algorithms in the literature and is used in many different fields. Dimension reduction is a technique that transforms large data sets into equivalent data sets of lower dimension with minimal loss of information. This article examines the effect of dimension reduction on different fuzzy clustering techniques. Four fuzzy clustering algorithms were used for this purpose: Fuzzy C-Means (FCM), Type-2 Fuzzy C-Means (T2FCM), Possibilistic Fuzzy C-Means (PFCM), and Unsupervised Possibilistic Fuzzy C-Means (UPFC). Dimensionality was reduced with the Truncated Singular Value Decomposition (TSVD) technique, keeping the number of components needed to explain at least 80% of the variance in the data. In the study, the original real-world data sets were first clustered with the four methods. The dimension-reduced versions of these data sets were then clustered with the same four methods. Four internal clustering evaluation metrics were used to measure clustering performance: the Silhouette Index (SI), the Partition Coefficient (PC), the Partition Entropy (PE), and the Root Mean Square Error (RMSE). The clustering performance of the methods on the original and the dimension-reduced data sets is presented comparatively. According to the results, the methods perform better on the reduced data than on the original data. Dimension reduction contributes most to clustering performance for FCM and least for T2FCM.
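
As a rough illustration of the steps the abstract describes, the sketch below picks the number of components that reach an 80% variance threshold, runs standard fuzzy c-means, and computes the partition coefficient and partition entropy. It is not the authors' implementation. It uses only the Python standard library, and the function names (`components_for_variance`, `fuzzy_c_means`, `partition_coefficient`, `partition_entropy`), the fuzzifier value m = 2, and the stopping rule are assumptions made for illustration. It also assumes that each component's share of the variance is proportional to its squared singular value, which holds for centered data.

```python
import math
import random


def components_for_variance(singular_values, threshold=0.80):
    """Smallest k whose leading components reach the variance threshold.

    Assumption: a component's variance share is proportional to its
    squared singular value (true for centered data).
    """
    energy = [s * s for s in singular_values]
    total = sum(energy)
    running = 0.0
    for k, e in enumerate(energy, start=1):
        running += e
        if running / total >= threshold:
            return k
    return len(energy)


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, membership matrix)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Update centers as means weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            sw = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / sw for d in range(dim)]
            )
        # Update memberships from relative distances to the centers.
        shift = 0.0
        for i in range(n):
            dist = [max(math.dist(points[i], ctr), 1e-12) for ctr in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u


def partition_coefficient(u):
    """Mean of squared memberships; 1.0 means a crisp partition."""
    return sum(v * v for row in u for v in row) / len(u)


def partition_entropy(u):
    """Mean membership entropy; 0.0 means a crisp partition."""
    return -sum(v * math.log(v) for row in u for v in row if v > 0) / len(u)
```

For example, `components_for_variance([3.0, 2.0, 1.0])` returns 2: the first component carries 9/14 (about 64%) of the squared singular values, and the first two carry 13/14 (about 93%). A higher partition coefficient and a lower partition entropy both indicate a less fuzzy partition, which is how these two indices are usually read when comparing clusterings of the original and reduced data.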


Details

Primary Language: Turkish
Subjects: Engineering
Journal Section: Articles
Authors: Nuran Peker, Cemalettin Kubat
Publication Date: January 15, 2021
Published in Issue: Year 2021, Volume: 4, Issue: 1

Cite

APA Peker, N., & Kubat, C. (2021). Boyut Azaltmanın Bulanık C-Ortalama Kümeleme Teknikleri Üzerindeki Etkisi [The Effect of Dimension Reduction on Fuzzy C-Means Clustering Techniques]. Veri Bilimi, 4(1), 1-7.


