The Comparative Effects of Clustering Algorithms on CPU and GPU

Pınar Ersoy; Mustafa Erşahin; Buket Erşahin

Research Article

Year 2022, Volume: 2 Issue: 2, 19 - 27, 01.10.2022

Abstract

Clustering can be defined as the operation of separating a population or a set of data points into groups. This article compares the performance of partitional clustering methods: k-means with random and k-means++ initialization implemented with Scikit-Learn, and the k-means++ and Tunnel k-means algorithms implemented with TensorFlow-GPU. The comparison is based on execution time. As a final output, a comparison table is produced together with the specifications of each framework. Since the article does not focus on the content of the data, the required data sets are generated randomly.
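The kind of measurement described in the abstract can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' code: the article times Scikit-Learn and TensorFlow-GPU implementations, whereas this sketch uses pure Python so that it runs anywhere. The function names (`init_random`, `init_kmeans_pp`, `lloyd`, `benchmark`) and the default sizes are assumptions chosen for illustration, and the absolute timings it prints are not comparable to those in the article.

```python
import random
import time


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_random(points, k, rng):
    """Random initialization: pick k distinct points as starting centroids."""
    return [list(p) for p in rng.sample(points, k)]


def init_kmeans_pp(points, k, rng):
    """k-means++ seeding: draw each new centroid with probability proportional
    to its squared distance from the nearest centroid chosen so far."""
    centroids = [list(rng.choice(points))]
    d2 = [sq_dist(p, centroids[0]) for p in points]
    while len(centroids) < k:
        new = rng.choices(points, weights=d2, k=1)[0]
        centroids.append(list(new))
        d2 = [min(d, sq_dist(p, new)) for d, p in zip(d2, points)]
    return centroids


def lloyd(points, centroids, max_iter=50):
    """Lloyd iterations: assign each point to its nearest centroid, then move
    each centroid to the mean of its cluster, until the labels stop changing."""
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def benchmark(n_points=2000, n_dims=2, k=5, seed=0):
    """Time both initializations on the same randomly generated data set."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    results = {}
    for name, init in (("random", init_random), ("k-means++", init_kmeans_pp)):
        start = time.perf_counter()
        labels, centroids = lloyd(points, init(points, k, rng))
        results[name] = {
            "seconds": time.perf_counter() - start,
            "n_clusters": len(centroids),
            "n_labels": len(labels),
        }
    return results


if __name__ == "__main__":
    # Print a small comparison table of execution times.
    for name, row in benchmark().items():
        print(f"{name:10s} {row['seconds']:.4f} s")
```

Both initializations are timed on the same random data and include the seeding step, so the measured time reflects the trade-off between the more expensive k-means++ seeding and the number of Lloyd iterations that follow it.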

Keywords

clustering, random k-means, k-means++, tunnel k-means, algorithm performance, TensorFlow-GPU, Scikit-Learn

Details

Primary Language	English
Subjects	Engineering
Journal Section	Research Articles
Authors	Pınar Ersoy 0000-0001-9591-3037 Mustafa Erşahin 0000-0003-4318-8288 Buket Erşahin 0000-0002-1726-8164
Publication Date	October 1, 2022
Published in Issue	Year 2022 Volume: 2 Issue: 2

Cite

APA	Ersoy, P., Erşahin, M., & Erşahin, B. (2022). The Comparative Effects of Clustering Algorithms on CPU and GPU. Artificial Intelligence Theory and Applications, 2(2), 19-27.
