Showing results 1-15 of 527

BOHB: Robust and Efficient Hyperparameter Optimization at Scale [article]

Stefan Falkner, Aaron Klein, Frank Hutter
2018 arXiv   pre-print
Correspondence to: Stefan Falkner <sfalkner@informatik.uni-freiburg.de>. Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, PMLR 80, 2018.  ...
arXiv:1807.01774v1 fatcat:msuaxqk6grdlditdkkdmyyxa2a

Learning to Design RNA [article]

Frederic Runge, Danny Stoll, Stefan Falkner, Frank Hutter
2019 arXiv   pre-print
Stefan Falkner, Aaron Klein, and Frank Hutter. BOHB: Robust and efficient hyperparameter optimization at scale.  ...  We chose the recently proposed optimizer BOHB (Falkner et al., 2018) to find good configurations, because it can handle mixed discrete/continuous spaces, utilize parallel resources, and additionally  ...
arXiv:1812.11951v2 fatcat:dobe5pmf35fopm2p7s3iaiqrvi

The BepiColombo Mission

Rita Schulz, Peter Falkner, Anthony Peacock, Christian Erd, Nicola Rando, Stefan Kraft
2005 Highlights of Astronomy  
BepiColombo is an interdisciplinary mission to the planet Mercury which will provide the detailed information necessary to understand Mercury and its magnetospheric environment. The mission is envisaged to consist of three spacecraft: the Mercury Planetary Orbiter (MPO), the Mercury Magnetospheric Orbiter (MMO), and the Mercury Surface Element (MSE). The mission went through a re-assessment with the aim of optimizing resources and advancing the scientific return. Various mission scenarios were investigated and new payload concepts were adopted. The newly defined mission will be presented, focusing on the launch scenario and the MPO reference payload.
doi:10.1017/s1539299600015124 fatcat:nni3b3v2cfbermnb4mv72ph4we

Probabilistic Rollouts for Learning Curve Extrapolation Across Hyperparameter Settings [article]

Matilde Gargiani and Aaron Klein and Stefan Falkner and Frank Hutter
2019 arXiv   pre-print
(..., 2017; Falkner et al., 2018).  ...  The same learning curve datasets were also used in (Falkner et al., 2018).  ...
arXiv:1910.04522v1 fatcat:vlv3ybnpdvdsrlg6nfjoe6geai

Practical Hyperparameter Optimization for Deep Learning

Stefan Falkner, Aaron Klein, Frank Hutter
2018 International Conference on Learning Representations  
Recently, the bandit-based strategy Hyperband (HB) was shown to yield good hyperparameter settings of deep neural networks faster than vanilla Bayesian optimization (BO). However, for larger budgets, HB is limited by its random search component, and BO works better. We propose to combine the benefits of both approaches to obtain a new practical state-of-the-art hyperparameter optimization method, which we show to consistently outperform both HB and BO on a range of problem types, including feedforward neural networks, Bayesian neural networks, and deep reinforcement learning. Our method is robust and versatile, while at the same time being conceptually simple and easy to implement.
dblp:conf/iclr/FalknerKH18 fatcat:ugdi4lgpfbesndglxklcmtdopq
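
The combination described in this abstract is the idea behind the authors' BOHB method (also listed above): keep Hyperband's successive-halving schedule, but replace its random sampling of new configurations with model-based proposals. BOHB itself fits a TPE-style kernel density model over good and bad configurations; the sketch below is only a minimal illustration of how a model-based sampler slots into the successive-halving loop, using a much cruder "perturb one of the best configurations seen so far" rule. The one-dimensional search space, toy objective, and all constants are assumptions for illustration, not the paper's algorithm.

import numpy as np

def evaluate(config, budget, rng):
    # Placeholder objective: stands in for training a network with hyperparameter
    # `config` for `budget` epochs and returning its validation loss.
    return (config - 0.3) ** 2 + 1.0 / budget + rng.normal(0.0, 0.01)

def suggest(history, rng, model_fraction=0.8):
    # Stand-in for BOHB's model-based sampler: usually sample near one of the best
    # configurations observed so far, otherwise sample uniformly at random to keep
    # exploring. (The real method builds its model per budget; ignored for brevity.)
    if history and rng.random() < model_fraction:
        best = sorted(history, key=lambda h: h[1])[: max(1, len(history) // 4)]
        base = best[rng.integers(len(best))][0]
        return float(np.clip(base + rng.normal(0.0, 0.05), 0.0, 1.0))
    return float(rng.random())

def hb_with_model(n_brackets=4, n_configs=9, min_budget=1, eta=3, rounds=3, seed=0):
    # Hyperband-style outer loop: each bracket runs successive halving, and the
    # shared history lets later brackets benefit from the model-based sampler.
    rng = np.random.default_rng(seed)
    history = []  # (config, observed loss) pairs across all budgets
    for _ in range(n_brackets):
        configs = [suggest(history, rng) for _ in range(n_configs)]
        budget = min_budget
        for _ in range(rounds):
            scored = [(c, evaluate(c, budget, rng)) for c in configs]
            history.extend(scored)
            scored.sort(key=lambda h: h[1])
            configs = [c for c, _ in scored[: max(1, len(scored) // eta)]]
            budget *= eta
    return min(history, key=lambda h: h[1])

print(hb_with_model())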

Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search [article]

Arber Zela, Aaron Klein, Stefan Falkner, Frank Hutter
2018 arXiv   pre-print
Due to space constraints, we refer the reader to Falkner et al. (2018) and  ...  for full details on these methods and only provide the basics here.  ...  For residual blocks containing b > 1 residual branches, Shake-Shake scales the feature maps from branch i by a random factor a_i, such that \sum_{i=1}^{b} a_i = 1.  ...
arXiv:1807.06906v1 fatcat:m643rptk55bafa6c2gwuzvrkaa
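
The Shake-Shake scaling rule quoted in the snippet above is easy to state in code. The sketch below only illustrates the constraint \sum_{i=1}^{b} a_i = 1 (random branch weights normalized to sum to one); in the actual regularizer fresh factors are drawn per pass and per sample, and the shapes and names below are assumptions, not taken from the paper.

import numpy as np

def shake_shake_combine(branch_outputs, rng):
    # Illustrative sketch: mix b residual branches with random factors a_i
    # drawn for this pass and normalized so that sum_i a_i = 1.
    a = rng.random(len(branch_outputs))
    a = a / a.sum()
    return sum(ai * out for ai, out in zip(a, branch_outputs))

# Usage: two residual branches with matching feature-map shapes.
rng = np.random.default_rng(0)
x1 = rng.normal(size=(8, 16, 16, 32))  # (batch, height, width, channels), assumed layout
x2 = rng.normal(size=(8, 16, 16, 32))
mixed = shake_shake_combine([x1, x2], rng)
print(mixed.shape)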

Asynchronous Stochastic Gradient MCMC with Elastic Coupling [article]

Jost Tobias Springenberg, Aaron Klein, Stefan Falkner, Frank Hutter
2016 arXiv   pre-print
We consider parallel asynchronous Markov Chain Monte Carlo (MCMC) sampling for problems where we can leverage (stochastic) gradients to define continuous dynamics which explore the target distribution. We outline a solution strategy for this setting based on stochastic gradient Hamiltonian Monte Carlo sampling (SGHMC) which we alter to include an elastic coupling term that ties together multiple MCMC instances. The proposed strategy turns inherently sequential HMC algorithms into asynchronous parallel versions. First experiments empirically show that the resulting parallel sampler significantly speeds up exploration of the target distribution, when compared to standard SGHMC, and is less prone to the harmful effects of stale gradients than a naive parallelization approach.
arXiv:1612.00767v2 fatcat:s5qk5sq4grb7nj44exvmo72iki
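
The core idea described above (several SGHMC chains tied together by an elastic term) can be sketched per worker as follows. This is not the paper's algorithm or its constants: the step size, friction, coupling strength, noise scale, and the way the shared center is maintained are placeholder assumptions, kept only to show where an elastic coupling term enters an SGHMC-style update.

import numpy as np

def sghmc_elastic_step(theta, momentum, center, stochastic_grad, rng,
                       step_size=1e-3, friction=0.1, coupling=0.05):
    # One illustrative SGHMC update for a single asynchronous worker. The term
    # `coupling * (theta - center)` pulls the worker's parameters toward a shared
    # center variable; everything else is a vanilla SGHMC-style step.
    grad = stochastic_grad(theta)
    noise = rng.normal(scale=np.sqrt(2.0 * friction * step_size), size=theta.shape)
    momentum = ((1.0 - friction) * momentum
                - step_size * grad
                - coupling * (theta - center)
                + noise)
    return theta + momentum, momentum

# Usage sketch: sample a standard Gaussian; the center is kept as a running mean.
rng = np.random.default_rng(0)
theta, momentum, center = np.zeros(2), np.zeros(2), np.zeros(2)
grad_fn = lambda th: th  # gradient of the negative log-density of N(0, I)
for _ in range(1000):
    theta, momentum = sghmc_elastic_step(theta, momentum, center, grad_fn, rng)
    center = 0.99 * center + 0.01 * theta  # crude stand-in for the shared center update
print(theta)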

Bayesian Optimization with Robust Bayesian Neural Networks

Jost Tobias Springenberg, Aaron Klein, Stefan Falkner, Frank Hutter
2016 Neural Information Processing Systems  
Bayesian optimization is a prominent method for optimizing expensive-to-evaluate black-box functions that is widely applied to tuning the hyperparameters of machine learning algorithms. Despite its successes, the prototypical Bayesian optimization approach (using Gaussian process models) does not scale well to either many hyperparameters or many function evaluations. Attacking this lack of scalability and flexibility is thus one of the key challenges of the field. We present a general approach for using flexible parametric models (neural networks) for Bayesian optimization, staying as close to a truly Bayesian treatment as possible. We obtain scalability through stochastic gradient Hamiltonian Monte Carlo, whose robustness we improve via a scale adaptation. Experiments including multi-task Bayesian optimization with 21 tasks, parallel optimization of deep neural networks and deep reinforcement learning show the power and flexibility of this approach.
dblp:conf/nips/SpringenbergKFH16 fatcat:vmmf3aodjrgc7nqxx7qua6jgfa

Learning Curve Prediction with Bayesian Neural Networks

Aaron Klein, Stefan Falkner, Jost Tobias Springenberg, Frank Hutter
2017 International Conference on Learning Representations  
Different neural network architectures, hyperparameters and training protocols lead to different performances as a function of time. Human experts routinely inspect the resulting learning curves to quickly terminate runs with poor hyperparameter settings and thereby considerably speed up manual hyperparameter optimization. The same information can be exploited in automatic hyperparameter optimization by means of a probabilistic model of learning curves across hyperparameter settings. Here, we study the use of Bayesian neural networks for this purpose and improve their performance by a specialized learning curve layer.
dblp:conf/iclr/KleinFSH17 fatcat:u4evjkfkpjbixmk75lod7fntdy
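
The early-termination use case described in this abstract can be illustrated with a deliberately simplified stand-in: fit an assumed parametric learning curve to the epochs observed so far, get crude uncertainty from bootstrap refits, and stop the run if it is unlikely to beat the best final accuracy seen so far. The parametric form, the bootstrap (in place of a Bayesian neural network), and all thresholds below are assumptions, not the paper's model.

import numpy as np
from scipy.optimize import curve_fit

def lc_model(t, a, b, c):
    # Assumed parametric learning curve: accuracy approaches an asymptote c
    # as the number of epochs t grows (a and b control speed and offset).
    return c - a * np.power(t, -b)

def prob_beats_best(epochs, acc, best_so_far, horizon, n_boot=200, seed=0):
    # Crude predictive uncertainty via bootstrap refits: estimate the probability
    # that the extrapolated accuracy at `horizon` exceeds the best final accuracy
    # observed so far.
    rng = np.random.default_rng(seed)
    wins = 0
    for _ in range(n_boot):
        idx = rng.integers(0, len(epochs), len(epochs))
        try:
            p, _ = curve_fit(lc_model, epochs[idx], acc[idx],
                             p0=(0.5, 0.5, 0.9), maxfev=5000)
        except RuntimeError:
            continue
        wins += lc_model(horizon, *p) > best_so_far
    return wins / n_boot

# Usage: after 10 observed epochs, decide whether training to epoch 100 is worthwhile.
t = np.arange(1, 11, dtype=float)
y = 0.85 - 0.5 * t ** -0.7
if prob_beats_best(t, y, best_so_far=0.88, horizon=100) < 0.05:
    print("terminate this run early")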

SpySMAC: Automated Configuration and Performance Analysis of SAT Solvers [chapter]

Stefan Falkner, Marius Lindauer, Frank Hutter
2015 Lecture Notes in Computer Science  
Most modern SAT solvers expose a range of parameters to allow some customization for improving performance on specific types of instances. Performing this customization manually can be challenging and time-consuming, and as a consequence several automated algorithm configuration methods have been developed for this purpose. Although automatic algorithm configuration has already been applied successfully to many different SAT solvers, a comprehensive analysis of the configuration process is usually not readily available to users. Here, we present SpySMAC to address this gap by providing a lightweight and easy-to-use toolbox for (i) automatic configuration of SAT solvers in different settings, (ii) a thorough performance analysis comparing the best found configuration to the default one, and (iii) an assessment of each parameter's importance using the fANOVA framework. To showcase our tool, we apply it to Lingeling and probSAT, two state-of-the-art solvers with very different characteristics.
doi:10.1007/978-3-319-24318-4_16 fatcat:yvkt5ntruvegxmzi7a6fx5epde

Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning [article]

Matthias Feurer, Katharina Eggensperger, Stefan Falkner, Marius Lindauer, Frank Hutter
2021 arXiv   pre-print
Falkner, S. Bartels, P. Hennig, and F. Hutter. Fast Bayesian optimization of machine learning hyperparameters on large datasets. In Proc. of AISTATS'17, 2017a. A. Klein, S. Falkner, J.  ... 
arXiv:2007.04074v2 fatcat:qcrxuttjjfgnpnklql4ars4hbu

One-dimensional coinless quantum walks

Renato Portugal, Stefan Boettcher, Stefan Falkner
2015 Physical Review A. Atomic, Molecular, and Optical Physics  
A coinless, discrete-time quantum walk possesses a Hilbert space whose dimension is smaller compared to the widely-studied coined walk. Whereas coined walks require the direct product of the site basis with the coin space, coinless walks operate purely in the site basis, which is clearly minimal. These coinless quantum walks have received considerable attention recently because they have evolution operators that can be obtained by a graphical method based on lattice tessellations and they have been shown to be as efficient as the best known coined walks when used as a quantum search algorithm. We argue that both formulations in their most general form are equivalent. In particular, we demonstrate how to transform the one-dimensional version of the coinless quantum walk into an equivalent extended coined version for a specific family of evolution operators. We present some of its basic, asymptotic features for the one-dimensional lattice with some examples of tessellations, and analyze the mixing time and limiting probability distributions on cycles.
doi:10.1103/physreva.91.052319 fatcat:4velpeznpneudgoegqe5z3e2au

Fast Bayesian hyperparameter optimization on large datasets

Aaron Klein, Stefan Falkner, Simon Bartels, Philipp Hennig, Frank Hutter
2017 Electronic Journal of Statistics  
Bayesian optimization has become a successful tool for optimizing the hyperparameters of machine learning algorithms, such as support vector machines or deep neural networks. Despite its success, for large datasets, training and validating a single configuration often takes hours, days, or even weeks, which limits the achievable performance. To accelerate hyperparameter optimization, we propose a generative model for the validation error as a function of training set size, which is learned during the optimization process and allows exploration of preliminary configurations on small subsets, by extrapolating to the full dataset. We construct a Bayesian optimization procedure, dubbed Fabolas, which models loss and training time as a function of dataset size and automatically trades off high information gain about the global optimum against computational cost. Experiments optimizing support vector machines and deep neural networks show that Fabolas often finds high-quality solutions 10 to 100 times faster than other state-of-the-art Bayesian optimization methods or the recently proposed bandit strategy Hyperband.
doi:10.1214/17-ejs1335si fatcat:2xq3susfnvh3ho5p73lavu2ura
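
The subset-extrapolation idea at the heart of this abstract can be illustrated in isolation: evaluate a configuration only on small fractions of the training data, fit an assumed parametric curve of validation error versus subset size, and extrapolate to the full dataset before committing to an expensive full-budget run. Fabolas itself uses a Gaussian process over (configuration, subset size) with an information-gain-per-cost acquisition; the functional form, numbers, and names below are illustrative assumptions only.

import numpy as np
from scipy.optimize import curve_fit

def val_error_vs_size(s, a, b, c):
    # Assumed parametric form: validation error decays toward a floor c as the
    # training-subset fraction s grows. Not the generative model from the paper.
    return c + a * np.power(s, -b)

# Cheap evaluations of one configuration on small subsets of the data ...
subset_fractions = np.array([0.05, 0.1, 0.2, 0.4])
observed_errors = np.array([0.31, 0.27, 0.24, 0.22])  # made-up measurements

# ... then extrapolate to the full dataset (s = 1.0) to judge whether this
# configuration deserves a full-budget evaluation.
params, _ = curve_fit(val_error_vs_size, subset_fractions, observed_errors,
                      p0=(0.05, 0.5, 0.2), maxfev=10000)
print("predicted full-data validation error:", val_error_vs_size(1.0, *params))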

Relation between random walks and quantum walks

Stefan Boettcher, Stefan Falkner, Renato Portugal
2015 Physical Review A. Atomic, Molecular, and Optical Physics  
Based on studies on four specific networks, we conjecture a general relation between the walk dimensions d_w of discrete-time random walks and quantum walks with the (self-inverse) Grover coin. In each case, we find that d_w of the quantum walk takes on exactly half the value found for the classical random walk on the same geometry. Since walks on homogeneous lattices satisfy this relation trivially, our results for heterogeneous networks suggest that such a relation holds irrespective of whether translational invariance is maintained or not. To develop our results, we extend the renormalization group analysis (RG) of the stochastic master equation to one with a unitary propagator. As in the classical case, the solution ρ(x,t) in space and time of this quantum walk equation exhibits a scaling collapse for a variable x^d_w/t in the weak limit, which defines d_w and illuminates fundamental aspects of the walk dynamics, e.g., its mean-square displacement. We confirm the collapse for ρ(x,t) in each case with extensive numerical simulation. The exact values for d_w in themselves demonstrate that RG is a powerful complementary approach to study the asymptotics of quantum walks that weak-limit theorems have not been able to access, such as for systems lacking translational symmetries beyond simple trees.
doi:10.1103/physreva.91.052330 fatcat:4q7wi4uyl5etlohrhd6s2wtqwi
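
In the notation of this abstract, the walk dimension d_w is defined by how the walk's spread scales with time, and the scaling collapse in the variable x^{d_w}/t together with the conjectured quantum/classical relation can be summarized as follows (the scaling function f and all prefactors are left unspecified here):

\[
  \bigl\langle x^2(t) \bigr\rangle \sim t^{2/d_w},
  \qquad
  \rho(x,t) \longrightarrow f\!\left(\frac{x^{d_w}}{t}\right) \ \text{(weak limit)},
  \qquad
  d_w^{\mathrm{quantum}} = \tfrac{1}{2}\, d_w^{\mathrm{classical}}.
\]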

Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets [article]

Aaron Klein, Stefan Falkner, Simon Bartels, Philipp Hennig, Frank Hutter
2017 arXiv   pre-print
Bayesian optimization has become a successful tool for hyperparameter optimization of machine learning algorithms, such as support vector machines or deep neural networks. Despite its success, for large datasets, training and validating a single configuration often takes hours, days, or even weeks, which limits the achievable performance. To accelerate hyperparameter optimization, we propose a generative model for the validation error as a function of training set size, which is learned during the optimization process and allows exploration of preliminary configurations on small subsets, by extrapolating to the full dataset. We construct a Bayesian optimization procedure, dubbed Fabolas, which models loss and training time as a function of dataset size and automatically trades off high information gain about the global optimum against computational cost. Experiments optimizing support vector machines and deep neural networks show that Fabolas often finds high-quality solutions 10 to 100 times faster than other state-of-the-art Bayesian optimization methods or the recently proposed bandit strategy Hyperband.
arXiv:1605.07079v2 fatcat:azg6isihpncvde24xejsntzohm