TY - JOUR
T1 - Elipsoit Destek Vektör Öbekleme Algoritmasının Biyomedikal Veri Setleri Üzerinde Karşılaştırmalı Performans Analizi
TT - Comparative Performance Analysis of Ellipsoidal Support Vector Clustering on Biomedical Data Sets
AU - Karal, Ömer
AU - Bağcı, Furkan Burak
PY - 2019
DA - January
DO - 10.21541/apjes.424247
JF - Academic Platform - Journal of Engineering and Science
JO - APJES
PB - Akademik Perspektif Derneği
WT - DergiPark
SN - 2147-4575
SP - 140
EP - 148
VL - 7
IS - 1
LA - tr
AB - The performance of clustering algorithms is very important in biomedical research because they help with the early detection of disease in patients, with diagnosis, and with taking the necessary precautions. However, most clustering algorithms use the Euclidean distance as their similarity metric. The Euclidean distance treats the variances of the data as equal. When the data are contaminated by noise or outliers, the performance of conventional clustering methods that use the Euclidean distance degrades considerably. To address these drawbacks, this study proposes the Ellipsoidal Support Vector Clustering (ESVC) algorithm, one of the kernel-based clustering methods. The ESVC algorithm does not require the number of clusters to be specified in advance. Moreover, by using the Mahalanobis similarity measure, it can produce cluster shapes suited to the distribution of the data. The proposed ESVC algorithm was applied to both real biomedical data and synthetic data and then compared with conventional clustering methods. The ESVC algorithm was observed to perform well in terms of accuracy, specificity, and sensitivity.
KW - Biomedical data
KW - clustering
KW - ellipsoidal support vector clustering
N2 - The performance of clustering algorithms is very important in biomedical research because they help with the early detection and diagnosis of diseases and with taking the necessary precautions for patients. However, most clustering algorithms use the Euclidean distance as a similarity metric. The Euclidean distance assumes that the variances of the data samples are equal. The performance of traditional clustering methods that use the Euclidean distance is quite low if the data contain noise or outlier samples. To eliminate these problems, this study proposes the Ellipsoidal Support Vector Clustering (ESVC) algorithm, which is one of the kernel-based clustering methods. In the ESVC algorithm, there is no need to specify the number of clusters in advance. Moreover, the ESVC algorithm is capable of generating cluster shapes that are appropriate to the distribution of the data by using the Mahalanobis similarity metric. The proposed ESVC algorithm was applied to both real biomedical data and synthetic data and then compared to conventional clustering methods. It has been observed that the ESVC algorithm performs well in terms of accuracy, specificity and sensitivity.
CR - Fahad, A., Alshatri, N., Tari, Z., Alamri, A., Khalil, I., Zomaya, A., Foufou, S. and Bouras, A. (2014). A Survey of Clustering Algorithms for Big Data: Taxonomy and Empirical Analysis. IEEE Transactions on Emerging Topics in Computing, 2(3), pp.267-279.
CR - Starczewski, A. (2017). A new validity index for crisp clusters. Pattern Analysis and Applications, 20(3), pp.687-700.
CR - Xu, R. and Wunsch II, D. (2005). Survey of Clustering Algorithms. IEEE Transactions on Neural Networks, 16(3), pp.645-678.
CR - Khanmohammadi, S., Adibeig, N. and Shanehbandy, S. (2017). An improved overlapping k-means clustering method for medical applications. Expert Systems with Applications, 67, pp.12-18.
CR - Rosati, S., Agostini, V., Knaflitz, M. and Balestra, G.
(2017). Muscle activation patterns during gait: A hierarchical clustering analysis. Biomedical Signal Processing and Control, 31, pp.463-469.
CR - Kumar, D., Verma, H., Mehra, A. and Agrawal, R. (2018). A modified intuitionistic fuzzy c-means clustering approach to segment human brain MRI image. Multimedia Tools and Applications, 1-25.
CR - T., V. (2014). Performance based analysis between k-Means and Fuzzy C-Means clustering algorithms for connection oriented telecommunication data. Applied Soft Computing, 19, pp.134-146.
CR - Erken, M. (2016). Comparing clustering algorithms on wisconsin data set. In Signal Processing and Communication Application Conference (SIU), pp. 1541-1544.
CR - Sharma, A., Boroevich, K., Shigemizu, D., Kamatani, Y., Kubo, M. and Tsunoda, T. (2017). Hierarchical Maximum Likelihood Clustering Approach. IEEE Transactions on Biomedical Engineering, 64(1), pp.112-122.
CR - Ben-Hur, A., Horn, D., Siegelmann, H. T., & Vapnik, V. (2001). Support vector clustering. Journal of Machine Learning Research, 2(Dec), pp. 125-137.
CR - DonGiovanni, D. and Vaina, L. (2016). Select and Cluster: A Method for Finding Functional Networks of Clustered Voxels in fMRI. Computational Intelligence and Neuroscience, 2016, pp.1-19.
CR - Yin, Z. and Zhang, J. (2014). Identification of temporal variations in mental workload using locally-linear-embedding-based EEG feature reduction and support-vector-machine-based clustering and classification techniques. Computer Methods and Programs in Biomedicine, 115(3), pp. 119-134.
CR - Villazana, S., Seijas, C., & Caralli, A. (2015). Lempel-Ziv complexity and Shannon entropy-based support vector clustering of ECG signals. Revista Ingeniería UC, 22(1).
CR - Yin, Z. and Zhang, J. (2013). Identifying Changes in Human Operator Mental Workload by Locally Linear Embedding and Support Vector Clustering Approaches. IFAC Proceedings Volumes, 46(13), pp.353-358.
CR - Lim, B., Li, X., Zhang, J. B., & Shi, D. (2007).
Feature Discretization by Support Vector Clustering. In Information Sciences, pp. 797-803.
CR - Wang, D., Shi, L., Yeung, D. S., Tsang, E. C., & Heng, P. A. (2007). Ellipsoidal support vector clustering for functional MRI analysis. Pattern Recognition, 40(10), pp. 2685-2695.
CR - Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
CR - J. Sturm, (1999) Using sedumi 1.02, a matlab toolbox for optimization over symmetric cones, Optim. Methods Software 11 pp. 625–653.
CR - M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming (web page and software). http://stanford.edu/~boyd/cvx, June 2009.
CR - J. Lee and D. Lee, (2005), “An Improved Cluster Labeling Method for Support Vector Clustering,” IEEE Trans. Pattern Analysis and Machine Intelligence, 27(3), pp. 461-464.
CR - K.D. Won, K. Lee, K.D. Lee, K.H. Lee, (2005), A k-populations algorithm for clustering categorical data, Pattern Recognition 38(7), pp. 1131–1134.
UR - https://doi.org/10.21541/apjes.424247
L1 - https://dergipark.org.tr/en/download/article-file/592647
ER -