Improving Deep Learning Training and Inference with Dynamic Hyperparameter Optimization

Angela Jiang
Over the past decade, deep learning has demonstrated state-of-the-art accuracy on challenges posed by computer vision and natural language processing, revolutionizing these fields in the process. Deep learning models are now a fundamental building block for applications such as autonomous driving, medical imaging, and neural machine translation. However, many challenges remain when deploying these models in production. Researchers and practitioners must address a diversity of questions: how to efficiently design, train, and deploy resource-intensive deep learning models, and how to automate these approaches while ensuring robustness to changing conditions. This dissertation provides and evaluates new ways to improve the efficiency of deep learning training and inference, as well as the underlying systems' robustness to changes in the environment. We address these issues by focusing on the many hyperparameters that are tuned to optimize the model's accuracy and resource usage. These hyperparameters include the choice of model architecture, the training dataset, the optimization algorithm, the hyperparameters of the optimization algorithm (e.g., the learning rate and momentum), and the training time budget. Currently, in practice, almost all hyperparameters are tuned once before training and held static thereafter. This is suboptimal because the conditions that dictate the best hyperparameter value change over time (e.g., as training progresses or when hardware used for inference is replaced). We apply dynamic tuning to hyperparameters that have traditionally been considered static. Using three case studies, we show that using runtime information to dynamically adapt hyperparameters that are traditionally static can increase the efficiency of machine learning training and inference. First, we propose and analyze Selective-Backprop, a new importance sampling approach that prioritizes examples with high loss in an online fashion. In Selective-Backprop, the set of examples considered challenging is a tunable hyperparameter. B [...]
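
As a rough illustration of what prioritizing high-loss examples in an online fashion could look like, the Python sketch below keeps a sliding window of recent per-example losses and selects an example for the backward pass with a probability that grows with its loss rank in that window. The class name `SelectiveSampler`, the exponent `beta`, the window size, and the rank-based selection rule are all assumptions made for illustration; the abstract does not specify how the dissertation chooses examples.

```python
import random
from collections import deque


class SelectiveSampler:
    """Hypothetical helper: picks high-loss examples more often.

    Not taken from the dissertation; the selection rule is an assumption.
    """

    def __init__(self, beta=2.0, window=1024, seed=0):
        self.beta = beta                      # assumed selectivity knob
        self.history = deque(maxlen=window)   # recent per-example losses
        self.rng = random.Random(seed)

    def select_probability(self, loss):
        """Fraction of recent losses <= this loss, raised to the power beta."""
        if not self.history:
            return 1.0  # no history yet: always train on the example
        rank = sum(1 for past in self.history if past <= loss)
        return (rank / len(self.history)) ** self.beta

    def should_backprop(self, loss):
        """Decide whether to run the backward pass, then record the loss."""
        p = self.select_probability(loss)
        self.history.append(loss)
        return self.rng.random() < p
```

In a training loop, one would compute each example's loss in the forward pass, call `should_backprop(loss)`, and run the backward pass only for the selected examples. Under this assumed rule, a larger `beta` skips more low-loss examples, which is the sense in which the notion of a "challenging" example becomes a tunable hyperparameter.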
doi:10.1184/r1/14423846 fatcat:p5xwfwyplrenvov24y5ao6t5zq