Stochastic Hyperparameter Optimization through Hypernetworks [article]

Jonathan Lorraine, David Duvenaud
2018 arXiv pre-print
Machine learning models are often tuned by nesting optimization of model weights inside the optimization of hyperparameters. We give a method to collapse this nested optimization into joint stochastic optimization of weights and hyperparameters. Our process trains a neural network to output approximately optimal weights as a function of hyperparameters. We show that our technique converges to locally optimal weights and hyperparameters for sufficiently large hypernetworks. We compare this ... to standard hyperparameter optimization strategies and demonstrate its effectiveness for tuning thousands of hyperparameters.
arXiv:1802.09419v2 fatcat:sntuz2kwcnfpxhpxwibeeha3ri
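
The idea in the abstract, replacing the nested loop with alternating stochastic updates of a hypernetwork and the hyperparameters, can be illustrated with a minimal sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: a single scalar weight, a single hyperparameter `lam`, the toy losses L_train(w, lam) = (w - A_TRAIN)^2 + exp(lam) * w^2 and L_val(w) = (w - B_VAL)^2, a linear "hypernetwork" w_phi(lam) = phi0 + phi1 * lam, Gaussian sampling of hyperparameters around the current value, and all constants, step sizes, and function names.

```python
import math
import random

A_TRAIN = 2.0  # target of the toy training loss (assumed constant)
B_VAL = 1.0    # target of the toy validation loss (assumed constant)


def train_grad_w(w, lam):
    """d/dw of L_train(w, lam) = (w - A_TRAIN)^2 + exp(lam) * w^2."""
    return 2.0 * (w - A_TRAIN) + 2.0 * math.exp(lam) * w


def val_grad_w(w):
    """d/dw of L_val(w) = (w - B_VAL)^2."""
    return 2.0 * (w - B_VAL)


def joint_optimize(steps=20000, lr_phi=0.01, lr_lam=0.01, sigma=0.5, seed=0):
    """Alternate hypernetwork and hyperparameter gradient steps (toy sketch)."""
    rng = random.Random(seed)
    phi0, phi1 = 0.0, 0.0  # linear hypernetwork: w_phi(lam) = phi0 + phi1 * lam
    lam = -1.0             # initial hyperparameter (log regularization strength)
    for _ in range(steps):
        # Hypernetwork step: sample a hyperparameter near the current one and
        # lower the training loss of the weight the hypernetwork outputs there.
        lam_s = rng.gauss(lam, sigma)
        g = train_grad_w(phi0 + phi1 * lam_s, lam_s)
        phi0 -= lr_phi * g          # chain rule: dw/dphi0 = 1
        phi1 -= lr_phi * g * lam_s  # chain rule: dw/dphi1 = lam_s

        # Hyperparameter step: differentiate the validation loss through the
        # hypernetwork output, using dw/dlam = phi1.
        w = phi0 + phi1 * lam
        lam -= lr_lam * val_grad_w(w) * phi1
    return lam, phi0 + phi1 * lam, (phi0, phi1)


if __name__ == "__main__":
    lam, w, phi = joint_optimize()
    print("lam:", lam, "w:", w, "phi:", phi)
```

For these toy losses the exact best response is w*(lam) = A_TRAIN / (1 + exp(lam)), so the validation loss is minimized at lam = 0 with w = 1. A run of the sketch should end near those values, which gives a quick sanity check on the alternating updates. A linear map in `lam` is only a local approximation of the best response, which is why the sketch samples hyperparameters close to the current `lam` rather than from a wide fixed distribution.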