Performance Analysis of Machine Learning and Bioinformatics Applications on High Performance Computing Systems
Abstract
Using the most efficient and suitable computational resources is increasingly important for algorithmic tools that extract meaningful information from big data and make smart decisions. This paper provides a comparative analysis of performance measurements for various machine learning and bioinformatics software, including scikit-learn, TensorFlow, WEKA, libSVM, ThunderSVM, GMTK, PSI-BLAST, and HHblits, with big data applications on different high performance computing systems and workstations. The programs are executed under a wide range of conditions, such as single-core central processing unit (CPU), multi-core CPU, and graphics processing unit (GPU), depending on which implementations are available. The optimum number of CPU cores is obtained for selected software. The running times are found to depend on many factors, including the CPU/GPU version, the available RAM, the number of CPU cores allocated, and the algorithm used. When parallel implementations are available for a given software package, the best running times are typically obtained on GPU, followed by multi-core CPU and then single-core CPU. Although no single system outperforms the others in all applications studied, the results are expected to help researchers and practitioners select the most appropriate computational resources for their machine learning and bioinformatics projects.
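As a rough illustration of how an optimum number of CPU cores might be located for a parallel program, the sketch below times the same batch of work with different numbers of worker processes and reports the fastest setting. It is not the measurement procedure used in the paper: the workload `cpu_bound_task`, the helper names, the task sizes, and the worker counts are all hypothetical stand-ins chosen for the example.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def cpu_bound_task(n):
    """Hypothetical stand-in workload: sum of squares of 0..n-1."""
    return sum(i * i for i in range(n))


def time_with_workers(num_workers, num_tasks=8, task_size=200_000):
    """Run num_tasks tasks on num_workers processes.

    Returns a tuple (elapsed wall-clock seconds, list of task results).
    """
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(cpu_bound_task, [task_size] * num_tasks))
    elapsed = time.perf_counter() - start
    return elapsed, results


def best_core_count(timings):
    """Return the worker count with the smallest measured running time."""
    return min(timings, key=timings.get)


if __name__ == "__main__":
    timings = {}
    # Assumed worker counts; adjust to the cores available on the machine.
    for workers in (1, 2, 4):
        elapsed, _ = time_with_workers(workers)
        timings[workers] = elapsed
        print(f"{workers} worker(s): {elapsed:.3f} s")
    print("fastest worker count:", best_core_count(timings))
```

Measured times from a sketch like this vary from run to run and with machine load, so repeating each setting several times and comparing averages would give a steadier estimate.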
Details
Primary Language: English
Subjects: Engineering
Journal Section: Research Article
Authors: Zafer Aydın (ORCID 0000-0001-7686-6298), Türkiye
Publication Date: January 28, 2020
Submission Date: March 30, 2019
Acceptance Date: August 28, 2019
Published in Issue: Year 2020, Volume 8, Number 1
Cited By
Object Recognizing Robot Application with Deep Learning
European Journal of Science and Technology
https://doi.org/10.31590/ejosat.962558