TY - JOUR
T1 - A Preconditioned Unconstrained Optimization Method for Training Multilayer Feed-forward Neural Network
AU - Abbo, Khalil
AU - Abdlkareem, Zahra
PY - 2020
DA - February
JF - Journal of Multidisciplinary Modeling and Optimization
JO - jmmo
PB - Ahmet ŞAHİNER
WT - DergiPark
SN - 2645-923X
SP - 71
EP - 79
VL - 2
IS - 2
LA - en
AB - Non-linear unconstrained optimization methods constitute excellent neural network training methods characterized by their simplicity and efficiency. In this paper, we propose a new preconditioned conjugate gradient neural network training algorithm which guarantees the descent property with the standard Wolfe condition. Encouraging numerical experiments verify that the proposed algorithm provides fast and stable convergence.
KW - Unconstrained optimization
KW - Neural network
KW - Descent property
CR - L. E. Achenie, Computational experience with a quasi-Newton method based training of feed-forward neural networks, Proc. World Congress on Neural Networks, San Diego, 1994, III607-III612.
CR - A. Antoniou and W.-S. Lu, Practical optimization: Algorithms and engineering applications, Springer US, 2007, 1-26.
CR - R. Battiti, First- and second-order methods for learning: between steepest descent and Newton's method, Neural Comput., 4(2) 1992, 141-166.
CR - R. Battiti and F. Masulli, BFGS optimization for faster and automated supervised learning, In: International Neural Network Conference, Springer, Dordrecht, 1990.
CR - C. M. Bishop, Neural networks for pattern recognition, Oxford University Press, 1995.
CR - E. Birgin and J. Martinez, A spectral conjugate gradient method for unconstrained optimization, Appl. Math. Opt., 43(2) 2001, 117-128.
CR - C. Charalambous, Conjugate gradient algorithm for efficient training of artificial neural networks, IEE Proceedings G (Circuits, Devices and Systems), 139(3) 1992, 301-310.
CR - R. Fletcher and C. M. Reeves, Function minimization by conjugate gradients, Comput. J., 7(2) 1964, 149-154.
CR - L. Gong, C. Liu, Y. Li and Y. Fuqing, Training feed-forward neural networks using the gradient descent method with the optimal stepsize, Journal of Computational Information Systems, 8(4) 2012, 1359-1371.
CR - S. Haykin, Neural networks: A comprehensive foundation, Prentice Hall PTR, 1994.
CR - J. Hertz, A. Krogh and R. G. Palmer, Introduction to the theory of neural computation, Addison-Wesley, Longman, 1991.
CR - M. R. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards, 49 1952, 409-436.
CR - R. A. Jacobs, Increased rates of convergence through learning rate adaptation, Neural Networks, 1(4) 1988, 295-307.
CR - K. Abbo and M. Hind, Improving the learning rate of the Back-propagation Algorithm Aitken process, Iraqi J. of Statistical Sci., 2012.
CR - A. E. Kostopoulos, D. G. Sotiropoulos and T. N. Grapsa, A new efficient variable learning rate for Perry's spectral conjugate gradient training method, 2004.
CR - I. E. Livieris and P. Pintelas, An advanced conjugate gradient training algorithm based on a modified secant equation, ISRN Artificial Intelligence, 2011.
UR - https://dergipark.org.tr/tr/pub/jmmo/issue//496993
L1 - https://dergipark.org.tr/tr/download/article-file/980861
ER -