Accelerated Dual Learning by Homotopic Initialization [article]

Hadi Daneshmand, Hamed Hassani, Thomas Hofmann
2017 arXiv pre-print
Gradient descent and coordinate descent are well understood in terms of their asymptotic behavior, but less so in a transient regime often used for approximations in machine learning. We investigate how proper initialization can have a profound effect on finding near-optimal solutions quickly. We show that a certain property of a data set, namely the boundedness of the correlations between eigenfeatures and the response variable, can lead to faster initial progress than expected by commonplace analysis. Convex optimization problems can tacitly benefit from that, but this automatism does not apply to their dual formulation. We analyze this phenomenon and devise provably good initialization strategies for dual optimization as well as heuristics for the non-convex case, relevant for deep learning. We find our predictions and methods to be experimentally well-supported.
arXiv:1706.03958v1 fatcat:igca3q77xvdutnzxdtuix2bfd4