Research Article

Comparative Performance Analysis of Ellipsoidal Support Vector Clustering on Biomedical Data Sets

Year 2019, Volume: 7 Issue: 1, 140 - 148, 15.01.2019
https://doi.org/10.21541/apjes.424247

Abstract

The performance of clustering algorithms is very important in biomedical
research because they support the early diagnosis of diseases, their
identification, and the taking of necessary precautions for patients.
However, most clustering algorithms use the Euclidean distance as a similarity
metric. The Euclidean distance assumes that the variances of the data samples
are equal. The performance of traditional clustering methods that use the
Euclidean distance is quite low when the data contain noise or outliers. To
address these problems, this study proposes the Ellipsoidal Support Vector
Clustering (ESVC) algorithm, which is a kernel-based clustering method. The
ESVC algorithm does not require the number of clusters to be specified in
advance. Moreover, by using the Mahalanobis similarity metric, it can generate
cluster shapes that fit the distribution of the data. The proposed ESVC
algorithm was applied to both real biomedical data and synthetic data and then
compared with conventional clustering methods. The ESVC algorithm was observed
to perform well in terms of accuracy, specificity, and sensitivity.
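
The contrast between the Euclidean and Mahalanobis metrics mentioned in the abstract can be illustrated with a small sketch. This is not the ESVC algorithm itself, only a comparison of the two distances on a hypothetical, hand-made 2-D cluster that is elongated along the x-axis. The data points, the two query points, and all function names are assumptions introduced for illustration.

```python
import math

def mean_and_cov_2d(points):
    """Return the sample mean and 2x2 sample covariance of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def euclidean(p, center):
    """Plain Euclidean distance: every direction is weighted equally."""
    return math.hypot(p[0] - center[0], p[1] - center[1])

def mahalanobis(p, center, cov):
    """Mahalanobis distance: differences are scaled by the inverse covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c  # determinant of the 2x2 covariance matrix
    dx, dy = p[0] - center[0], p[1] - center[1]
    # v^T * inv(cov) * v, with inv(cov) = 1/det * [[d, -b], [-c, a]]
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)

# Hypothetical toy cluster: large spread along x, small spread along y.
cluster = [(-4.0, 0.0), (-2.0, 0.5), (0.0, -0.5),
           (2.0, 0.5), (4.0, -0.5), (0.0, 0.0)]
center, cov = mean_and_cov_2d(cluster)

along_axis = (3.0, 0.0)  # lies in the direction the cluster already spreads
off_axis = (0.0, 3.0)    # lies far outside the cluster's narrow direction

for name, point in (("along_axis", along_axis), ("off_axis", off_axis)):
    print(name,
          "euclidean=%.3f" % euclidean(point, center),
          "mahalanobis=%.3f" % mahalanobis(point, center, cov))
```

Both query points are equally far from the cluster center in the Euclidean sense, but the Mahalanobis distance of the off-axis point is several times larger, because it accounts for the cluster's unequal variances. This is the property that lets a Mahalanobis-based method fit ellipsoidal rather than spherical cluster shapes.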

Turkish title: Elipsoit Destek Vektör Öbekleme Algoritmasının Biyomedikal Veri Setleri Üzerinde Karşılaştırmalı Performans Analizi

Details

Primary Language: Turkish
Subjects: Engineering
Section: Articles
Authors

Ömer Karal 0000-0001-8742-8189

Furkan Burak Bağcı

Publication Date: January 15, 2019
Submission Date: May 16, 2018
Published in Issue: Year 2019, Volume: 7 Issue: 1

Cite

IEEE: Ö. Karal and F. B. Bağcı, “Elipsoit Destek Vektör Öbekleme Algoritmasının Biyomedikal Veri Setleri Üzerinde Karşılaştırmalı Performans Analizi”, APJES, vol. 7, no. 1, pp. 140–148, 2019, doi: 10.21541/apjes.424247.
