First-Order Optimization (Training) Algorithms in Deep Learning

Oleg Rudenko, Oleksandr Bezsonov, Kyrylo Oliinyk
2020 International Conference on Computational Linguistics and Intelligent Systems  
The use of artificial neural networks (ANN) requires solving structural and parametric identification problems, corresponding to the choice of the optimal network topology and its training (parameter tuning). In contrast to the problem of determining the structure, which is a discrete (combinatorial) optimization problem, the search for optimal parameters is carried out in continuous space using some optimization method. The most widely used optimization methods in deep learning are first-order algorithms based on gradient descent (GD). This paper provides a comparative analysis of the training algorithms for convolutional neural networks used in image recognition tasks. The training algorithms were compared on the Oxford 17 Category Flower Dataset using the TensorFlow framework. The studies show that a simple gradient descent algorithm is quite effective for this task. At the same time, however, the problem of selecting the optimal values of the algorithms' parameters that yield the fastest learning remains open.
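The first-order GD update the abstract refers to can be sketched as follows. This is a minimal illustration of the update rule w ← w − η∇L(w) on a toy quadratic loss (an assumption for demonstration, not the paper's CNN objective or its TensorFlow setup):

```python
# Minimal sketch of first-order gradient descent: w <- w - lr * grad(L)(w).
# The quadratic loss below is a stand-in chosen for illustration;
# the paper itself trains CNNs on the Oxford 17 flower dataset.

def loss(w):
    # L(w) = (w - 3)^2, minimized at w = 3
    return (w - 3.0) ** 2

def grad(w):
    # dL/dw = 2 * (w - 3)
    return 2.0 * (w - 3.0)

def gradient_descent(w0, lr=0.1, steps=100):
    """Run plain GD from w0; lr is the learning rate (step size)."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)  # the first-order update
    return w

w_star = gradient_descent(w0=0.0)
print(round(w_star, 4))  # converges near the minimizer 3.0
```

The choice of the learning rate `lr` is exactly the open parameter-selection problem the abstract mentions: too small and convergence is slow, too large and the iterates diverge.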