An overview of the activation functions used in deep learning algorithms

Serhat Kılıçarslan; Kemal Adem; Mete Çelik

doi:10.54187/jnrs.1011739

Abstract

In deep learning models, the inputs to the network are passed through activation functions to generate the corresponding outputs. Deep learning models are particularly important for analyzing and forecasting from big data with numerous parameters, and they are useful for image processing, natural language processing, object recognition, and financial forecasting. Activation functions for deep learning algorithms have been developed with several goals in mind: keeping the learning process stable, preventing overfitting, increasing accuracy, and reducing computational cost. In this study, we present an overview of common and current activation functions used in deep learning algorithms, covering both fixed and trainable activation functions. The fixed activation functions introduced are sigmoid, hyperbolic tangent, ReLU, softplus, and swish; the trainable activation functions introduced are LReLU, ELU, SELU, and RSigELU.
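As a rough illustration of the functions named in the abstract, the following is a minimal scalar sketch of their commonly used textbook definitions, not code taken from the article. The function names, the default values of `beta`, `slope`, and `alpha`, and the use of plain arguments instead of learned parameters are assumptions made for this example. RSigELU is omitted because its definition is not given in this abstract.

```python
import math

# SELU scaling constants as commonly quoted from Klambauer et al. (2017).
SELU_LAMBDA = 1.0507009873554805
SELU_ALPHA = 1.6732632423543772


def sigmoid(x):
    # Branch on the sign of x so that math.exp never overflows.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    return math.tanh(x)


def relu(x):
    return max(0.0, x)


def softplus(x):
    # log(1 + exp(x)), rewritten to stay finite for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def swish(x, beta=1.0):
    # x * sigmoid(beta * x); beta is held fixed here (assumed default 1.0).
    return x * sigmoid(beta * x)


def leaky_relu(x, slope=0.01):
    # Small non-zero slope for negative inputs (assumed default 0.01).
    return x if x > 0 else slope * x


def elu(x, alpha=1.0):
    # alpha * (exp(x) - 1) for negative inputs; expm1 is accurate near zero.
    return x if x > 0 else alpha * math.expm1(x)


def selu(x):
    # ELU with fixed alpha, scaled by a fixed lambda.
    return SELU_LAMBDA * (x if x > 0 else SELU_ALPHA * math.expm1(x))
```

In a training framework these would normally be applied elementwise to tensors, and a parameter such as `slope` or `alpha` could be registered as a learnable weight rather than passed as a constant.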

Keywords

Activation function, Neural network, Deep learning


Details

Primary Language

English

Subjects

Engineering

Journal Section

Review

Authors

Serhat Kılıçarslan ^*
0000-0001-9483-4425
Türkiye

Kemal Adem
0000-0002-3752-7354
Türkiye

Mete Çelik
0000-0002-1488-1502
Türkiye

Publication Date

December 31, 2021

Submission Date

October 18, 2021

Acceptance Date

December 8, 2021

Published in Issue

Year 2021 Volume: 10 Number: 3

DOI

https://doi.org/10.54187/jnrs.1011739

IZ

https://izlik.org/JA66RC88HC

APA

Kılıçarslan, S., Adem, K., & Çelik, M. (2021). An overview of the activation functions used in deep learning algorithms. Journal of New Results in Science, 10(3), 75-88. https://doi.org/10.54187/jnrs.1011739
