Uniform Learning in a Deep Neural Network via "Oddball" Stochastic Gradient Descent [article]

Andrew J.R. Simpson
2015, arXiv pre-print
When training deep neural networks, it is typically assumed that the training examples are uniformly difficult to learn - or, equivalently, that the training error will be uniformly distributed across the training examples. On this assumption, each training example is used an equal number of times. However, the assumption may not hold in many cases. "Oddball SGD" (novelty-driven stochastic gradient descent) was recently introduced to drive training probabilistically according to the error distribution - training frequency is proportional to training error magnitude. In this article, using a deep neural network to encode a video, we show that oddball SGD can be used to enforce uniform error across the training set.
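The selection rule described in the abstract - sample training examples with probability proportional to their current error magnitude - can be illustrated with a minimal sketch. The tiny one-hidden-layer regression network, learning rate, dataset, and periodic error-refresh schedule below are illustrative assumptions, not the paper's exact setup; only the error-proportional sampling follows the abstract.

```python
import numpy as np

# Minimal sketch of error-proportional ("oddball") example selection for SGD.
# The network size, learning rate, and refresh schedule are assumed for illustration.

rng = np.random.default_rng(0)

# Toy dataset: inputs X and targets Y (e.g., frame index -> pixel values).
N, D_in, D_out, H = 256, 8, 4, 32
X = rng.normal(size=(N, D_in))
Y = rng.normal(size=(N, D_out))

# Network parameters (one hidden layer, tanh activation).
W1 = rng.normal(scale=0.1, size=(D_in, H))
W2 = rng.normal(scale=0.1, size=(H, D_out))
lr = 0.01

def forward(x):
    h = np.tanh(x @ W1)
    return h, h @ W2

def example_errors():
    """Per-example error magnitude (mean squared error) over the whole training set."""
    _, pred = forward(X)
    return np.mean((pred - Y) ** 2, axis=1)

for step in range(10000):
    # Periodically refresh the error distribution (an assumed schedule);
    # sampling probability is proportional to per-example error magnitude.
    if step % 100 == 0:
        err = example_errors()
        p = err / err.sum()

    i = rng.choice(N, p=p)  # oddball selection: high-error examples are drawn more often
    h, pred = forward(X[i])
    grad_out = 2.0 * (pred - Y[i]) / D_out
    grad_W2 = np.outer(h, grad_out)
    grad_W1 = np.outer(X[i], (grad_out @ W2.T) * (1.0 - h ** 2))
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

print("final spread of per-example error:", example_errors().std())
```

Because high-error examples are revisited more often, the per-example error distribution is pushed toward uniformity, which is the effect the article reports for the video-encoding task.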
arXiv:1510.02442v1