Research Article

Comparative Analysis of First and Second Order Methods for Optimization in Neural Networks

Year 2022, Volume: 4 Issue: 2, 77 - 87, 31.12.2022
https://doi.org/10.47086/pims.1170457

Abstract

Artificial Neural Networks are fine-tuned to yield the best performance through an iterative process in which the values of their parameters are altered. Optimization is the preferred method for determining the parameters that minimize the loss function, the evaluation metric for ANNs. However, the process of finding an optimal model with minimum loss faces several obstacles, the most notable being the efficiency and rate of convergence to the minimum of the loss function. Such optimization efficiency is imperative to reduce the computational resources and time spent training neural network models. This paper reviews and compares the intuition and effectiveness of existing optimization algorithms such as Gradient Descent, Gradient Descent with Momentum, RMSProp, and Adam, which rely on first-order derivatives, and Newton's Method, which utilizes second-order derivatives for convergence. It also explores the possibility of combining first- and second-order optimization techniques to improve performance when training Artificial Neural Networks.
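For orientation, the following is a minimal Python sketch (not taken from the paper) of the parameter-update rules named in the abstract, applied to a toy quadratic loss rather than a neural network. The matrix A, vector b, learning rates, momentum coefficients, and step counts are illustrative assumptions chosen only so the snippet runs; they do not reflect the article's experiments.

# Minimal sketch of the update rules discussed in the abstract on a toy
# quadratic loss L(w) = 0.5 * w^T A w - b^T w. All constants and
# hyperparameters below are illustrative assumptions, not values from the paper.
import numpy as np

A = np.array([[3.0, 0.2], [0.2, 1.0]])   # positive-definite, so the Hessian is A
b = np.array([1.0, -2.0])

def loss(w):
    return 0.5 * w @ A @ w - b @ w

def grad(w):
    return A @ w - b                      # first-order information

def hessian(w):
    return A                              # second-order information (constant here)

def gd(w, lr=0.1, steps=300):
    for _ in range(steps):
        w = w - lr * grad(w)              # plain gradient descent step
    return w

def momentum(w, lr=0.1, beta=0.9, steps=300):
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v + grad(w)            # accumulate a velocity of past gradients
        w = w - lr * v
    return w

def rmsprop(w, lr=0.05, beta=0.9, eps=1e-8, steps=300):
    s = np.zeros_like(w)
    for _ in range(steps):
        g = grad(w)
        s = beta * s + (1 - beta) * g**2  # running average of squared gradients
        w = w - lr * g / (np.sqrt(s) + eps)
    return w

def adam(w, lr=0.05, b1=0.9, b2=0.999, eps=1e-8, steps=300):
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g         # first-moment estimate
        v = b2 * v + (1 - b2) * g**2      # second-moment estimate
        m_hat = m / (1 - b1**t)           # bias correction
        v_hat = v / (1 - b2**t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

def newton(w, steps=5):
    for _ in range(steps):
        w = w - np.linalg.solve(hessian(w), grad(w))  # full Newton step
    return w

w0 = np.array([5.0, 5.0])
for name, opt in [("GD", gd), ("Momentum", momentum),
                  ("RMSProp", rmsprop), ("Adam", adam), ("Newton", newton)]:
    print(f"{name:8s} final loss = {loss(opt(w0.copy())):.6f}")

On this convex quadratic, the second-order Newton step converges in one iteration while the first-order methods take many small steps, which mirrors the convergence-rate comparison the abstract sets up; on a real network the Hessian is far more expensive to form and invert.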

Supporting Institution

Beloit College

References

  • F. Bre, J. M. Gimenez, and V. D. Fachinotti. Prediction of wind pressure coefficients on building surfaces using artificial neural networks. Energy and Buildings, 158 (2017).
  • Hvidberrrg. Activation functions in artificial neural networks.
  • Deepanshi. Artificial neural network: Beginners guide to ANN. Analytics Vidhya, (2021).
  • M. Z. Mulla. Cost, activation, loss function | neural network | deep learning. What are these? Medium, (2020).
  • S. Ruder. An overview of gradient descent optimization algorithms. Ruder.io, (2020).
  • K. Pykes. Gradient descent. Towards Data Science, (2020).
  • G. Mayanglambam. Deep learning optimizers. Towards Data Science, (2020).
  • i2tutorials. Explain brief about mini batch gradient descent. i2tutorials, (2019).
  • B. S. Shankar. Gradient descent with momentum. Medium, (2020).
  • A. Kathuria. Intro to optimization in deep learning: Momentum, RMSProp and Adam. Paperspace Blog, (2018).
  • J. Brownlee. Code Adam optimization algorithm from scratch. Machine Learning Mastery, (2021).
  • A. Lam. BFGS in a nutshell: An introduction to quasi-Newton methods. Towards Data Science, (2020).
  • V. Cericola. Quasi-Newton methods. Northwestern University Open Text Book on Process Optimization, (2015).

Details

Primary Language English
Subjects Software Engineering (Other)
Journal Section Articles
Authors

Auras Khanal 0000-0001-8621-9879

Mehmet Dik 0000-0003-0643-2771

Publication Date December 31, 2022
Acceptance Date October 4, 2022
Published in Issue Year 2022 Volume: 4 Issue: 2


Articles published in PIMS are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.