Research Article

Performance Analysis of Machine Learning and Bioinformatics Applications on High Performance Computing Systems

Volume: 8 Number: 1 January 28, 2020

Abstract

It is increasingly important to choose the most efficient and suitable computational resources for algorithmic tools that extract meaningful information from big data and support smart decisions. This paper provides a comparative analysis of performance measurements for various machine learning and bioinformatics software, including scikit-learn, TensorFlow, WEKA, libSVM, ThunderSVM, GMTK, PSI-BLAST, and HHblits, with big data applications on different high performance computing systems and workstations. The programs are executed under a wide range of conditions, namely single-core central processing unit (CPU), multi-core CPU, and graphics processing unit (GPU), depending on which implementations are available. The optimum number of CPU cores is determined for selected software. The running times are found to depend on many factors, including the CPU/GPU version, the available RAM, the number of CPU cores allocated, and the algorithm used. When parallel implementations are available for a given software, the best running times are typically obtained on GPU, followed by multi-core CPU and then single-core CPU. Although no single system outperforms the others in all applications studied, the results are expected to help researchers and practitioners select the most appropriate computational resources for their machine learning and bioinformatics projects.
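The idea of finding an optimum number of CPU cores can be sketched as a small timing harness. This is a minimal illustration and not the measurement procedure used in the paper. The function names (`time_workload`, `best_core_count`, `speedup`) and the `run` callback are hypothetical, and in practice `run` would launch the tool under study with its own thread or core setting.

```python
import time
from statistics import median
from typing import Callable, Dict, Iterable


def time_workload(run: Callable[[int], None],
                  core_counts: Iterable[int],
                  repeats: int = 3) -> Dict[int, float]:
    """Return the median wall-clock time in seconds for each core count."""
    timings = {}
    for n_cores in core_counts:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run(n_cores)  # hypothetical hook: run the tool with n_cores cores
            samples.append(time.perf_counter() - start)
        timings[n_cores] = median(samples)
    return timings


def best_core_count(timings: Dict[int, float]) -> int:
    """Pick the core count with the lowest time; ties go to fewer cores."""
    return min(timings, key=lambda n: (timings[n], n))


def speedup(timings: Dict[int, float], baseline: int = 1) -> Dict[int, float]:
    """Speedup of each configuration relative to the baseline core count."""
    return {n: timings[baseline] / t for n, t in timings.items()}


# Illustrative numbers only (not measurements from the paper).
example = {1: 8.0, 2: 4.0, 4: 2.5, 8: 2.5}
print(best_core_count(example))  # 4
print(speedup(example)[2])       # 2.0
```

Taking the median over several repeats reduces the influence of one-off delays, and breaking ties toward fewer cores reflects the observation that adding cores beyond a certain point may not reduce the running time further.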

Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Publication Date

January 28, 2020

Submission Date

March 30, 2019

Acceptance Date

August 28, 2019

APA
Aydın, Z. (2020). Performance Analysis of Machine Learning and Bioinformatics Applications on High Performance Computing Systems. Academic Platform - Journal of Engineering and Science, 8(1), 1-14. https://doi.org/10.21541/apjes.547016
