Automated Reinforcement Learning (AutoRL): A Survey and Open Problems [article]

Jack Parker-Holder, Raghu Rajan, Xingyou Song, André Biedenkapp, Yingjie Miao, Theresa Eimer, Baohe Zhang, Vu Nguyen, Roberto Calandra, Aleksandra Faust, Frank Hutter, Marius Lindauer
2022 arXiv   pre-print
In this survey we seek to unify the field of AutoRL: we provide a common taxonomy, discuss each area in detail, and pose open problems of interest to researchers going forward.  ...  However, Automated Reinforcement Learning (AutoRL) involves not only standard applications of AutoML but also additional challenges unique to RL that naturally produce a different set of methods  ...  Acknowledgements We would like to thank Jie Tan for providing feedback on the survey, as well as Sagi Perel and Daniel Golovin for valuable discussions.  ... 
arXiv:2201.03916v1 fatcat:4j2ycfj6czgxvjn7goxnhxsvzm

ARLO: A Framework for Automated Reinforcement Learning [article]

Marco Mussi, Davide Lombarda, Alberto Maria Metelli, Francesco Trovò, Marcello Restelli
2022 arXiv   pre-print
In this work, we propose a general and flexible framework, namely ARLO: Automated Reinforcement Learning Optimizer, to construct automated pipelines for AutoRL.  ...  Automated Reinforcement Learning (AutoRL) is a relatively new area of research that is gaining increasing attention.  ...  Conversely, RL is currently far from being a tool usable by a non-expert user, since a complete and reliable Automated Reinforcement Learning (AutoRL) pipeline is still missing.  ... 
arXiv:2205.10416v1 fatcat:i3ugqiwo3fcf5d7kgyg2xcmybi

Bayesian Generational Population-Based Training [article]

Xingchen Wan, Cong Lu, Jack Parker-Holder, Philip J. Ball, Vu Nguyen, Binxin Ru, Michael A. Osborne
2022 arXiv   pre-print
This motivates AutoRL, a class of methods seeking to automate these design choices.  ...  Second, we show that using a generational approach, we can also learn both architectures and hyperparameters jointly on-the-fly in a single training run.  ...  XW and BR are supported by the Clarendon Scholarship at the University of Oxford. CL is funded by the Engineering and Physical Sciences Research Council (EPSRC).  ... 
arXiv:2207.09405v1 fatcat:wbryc6dl5ndvrfcjtmygdykj6q

Reinforcement Learning in Practice: Opportunities and Challenges [article]

Yuxi Li
2022 arXiv   pre-print
In this article, we first give a brief introduction to reinforcement learning (RL), and its relationship with deep learning, machine learning and AI.  ...  This article is a gentle discussion about the field of reinforcement learning in practice, about opportunities and challenges, touching a broad range of topics, with perspectives and without technical  ...  As in a survey about automated RL (AutoRL) (Parker-Holder et al., 2022) , we may automate task design, algorithms, architectures, and hyperparameters, and AutoRL methods include the following: random/  ... 
arXiv:2202.11296v2 fatcat:xdtsmme22rfpfn6rgfotcspnhy

Deep Reinforcement Learning [article]

Yuxi Li
2018 arXiv   pre-print
We discuss deep reinforcement learning in an overview style. We draw a big picture, filled with details.  ...  We start with background of artificial intelligence, machine learning, deep learning, and reinforcement learning (RL), with resources.  ...  The authors propose policy-space response oracle (PSRO), and its approximation, deep cognitive hierarchies (DCH), to compute best responses to a mixture of policies using deep RL, and to compute new meta-strategy  ... 
arXiv:1810.06339v1 fatcat:kp7atz5pdbeqta352e6b3nmuhy

Towards Automatic Actor-Critic Solutions to Continuous Control [article]

Jake Grigsby, Jin Yong Yoo, Yanjun Qi
2021 arXiv   pre-print
However, these algorithms rely on a number of design tricks and hyperparameters, making their application to new domains difficult and computationally expensive.  ...  Our design is sample efficient and provides practical advantages over baseline approaches, including improved exploration, generalization over multiple control frequencies, and a robust ensemble of high-performance  ...  However, the true promise of AutoRL methods lies in their ability to automate the process of engineering RL solutions to new domains.  ... 
arXiv:2106.08918v2 fatcat:2hy6rrfmoffx3be5xsdgr3krjq

Open Source Vizier: Distributed Infrastructure and API for Reliable and Flexible Blackbox Optimization [article]

Xingyou Song, Sagi Perel, Chansoo Lee, Greg Kochanski, Daniel Golovin
2022 arXiv   pre-print
OSS Vizier provides an API capable of defining and solving a wide variety of optimization problems, including multi-metric, early stopping, transfer learning, and conditional search.  ...  In this paper, we introduce Open Source (OSS) Vizier, a standalone Python-based interface for blackbox optimization and research, based on the Google-internal Vizier infrastructure and framework.  ...  integrations, Tom Hennigan, Pavel Sountsov, Richard Belleville, Bu Su Kim, Hao Li, and Yutian Chen for open source and infrastructure help, and George Dahl, Aleksandra Faust, and Zoubin Ghahramani for  ... 
arXiv:2207.13676v1 fatcat:hahuqsg4wvetjhflmgsvpbbiym