Challenges in High-dimensional Reinforcement Learning with Evolution Strategies
[article]
2018
arXiv
pre-print
Evolution Strategies (ESs) have recently become popular for training deep neural networks, in particular on reinforcement learning tasks, a special form of controller design. ...
In addition, many control problems give rise to a stochastic fitness function. ...
Our results indicate that a scalable modern evolution strategy with step size adaptation and efficient metric learning, equipped with uncertainty handling, is the most promising general-purpose technique for high-dimensional ...
arXiv:1806.01224v2
fatcat:oz3qbxjknrhzpfjatd2jmwrxci
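The entry above points to step-size-adaptive evolution strategies as a promising technique for high-dimensional control. As a point of reference, here is a minimal sketch of a (1+1)-ES with 1/5th-success-rule step size adaptation on a toy fitness function; the sphere objective and all constants are placeholder assumptions, not the algorithm or benchmarks studied in the paper.

```python
# Minimal sketch of a (1+1)-ES with 1/5th-success-rule step size adaptation.
# The sphere objective stands in for an expensive RL return; it is not the
# paper's setup.
import numpy as np

def sphere(x):
    # Toy fitness: squared norm, to be minimized.
    return float(np.dot(x, x))

def one_plus_one_es(dim=10, sigma=0.5, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    f = sphere(x)
    for _ in range(iters):
        y = x + sigma * rng.standard_normal(dim)   # Gaussian mutation
        fy = sphere(y)
        if fy <= f:                                # keep improvements
            x, f = y, fy
            sigma *= 1.5                           # grow step size on success
        else:
            sigma *= 1.5 ** (-0.25)                # shrink slightly on failure
    return x, f, sigma

if __name__ == "__main__":
    _, best_f, final_sigma = one_plus_one_es()
    print(f"best fitness {best_f:.3e}, final step size {final_sigma:.3e}")
```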
CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning
[article]
2021
arXiv
pre-print
Current state-of-the-art model-based reinforcement learning algorithms use trajectory sampling methods, such as the Cross-Entropy Method (CEM), for planning in continuous control settings. ...
These zeroth-order optimizers require sampling a large number of trajectory rollouts to select an optimal action, which scales poorly for large prediction horizons or high dimensional action spaces. ...
Methods
Preliminaries: Cross-Entropy Method for Trajectory Planning. In model-based reinforcement learning (Nagabandi et al., 2018), a common scheme for action selection is to use model predictive control ...
arXiv:2112.07746v1
fatcat:fg5qovlnhres7bb5bbwtfq64d4
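The CEM-GD entry describes the standard scheme of planning with the Cross-Entropy Method inside a model predictive control loop. Below is a minimal sketch of CEM action selection under a known toy dynamics model; the 1-D dynamics, quadratic cost, and hyperparameters are illustrative assumptions and do not reproduce the paper's planner.

```python
# Minimal sketch of CEM-based action selection in an MPC style.
# Dynamics, cost, and horizon are placeholders, not the paper's environments.
import numpy as np

def dynamics(state, action):
    # Toy 1-D dynamics: push a point mass toward the origin.
    return state + 0.1 * action

def cost(state, action):
    return state ** 2 + 0.01 * action ** 2

def cem_plan(state, horizon=15, pop=200, elites=20, iters=5, seed=0):
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(horizon), np.ones(horizon)
    for _ in range(iters):
        actions = rng.normal(mean, std, size=(pop, horizon))  # sample action sequences
        returns = np.zeros(pop)
        for i in range(pop):
            s = state
            for t in range(horizon):
                returns[i] += cost(s, actions[i, t])
                s = dynamics(s, actions[i, t])
        elite = actions[np.argsort(returns)[:elites]]          # keep lowest-cost rollouts
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean[0]  # MPC: execute only the first action, then replan

if __name__ == "__main__":
    print("first planned action:", cem_plan(state=1.0))
```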
High-Accuracy Model-Based Reinforcement Learning, a Survey
[article]
2021
arXiv
pre-print
In recent years, a diverse landscape of model-based methods has been introduced to improve model accuracy, using methods such as uncertainty modeling, model-predictive control, latent models, and end-to-end ...
To reduce the number of environment samples, model-based reinforcement learning creates an explicit model of the environment dynamics. ...
Acknowledgments: We thank the members of the Leiden Reinforcement Learning Group, and especially Thomas Moerland and Mike Huisman, for many discussions and insights. ...
arXiv:2107.08241v1
fatcat:tma6xb2uy5fybjfhmzasfx2cta
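The survey entry above summarizes the core idea of model-based RL: learn an explicit model of the environment dynamics in order to reduce the number of environment samples. A minimal sketch of that idea follows, fitting a one-step linear dynamics model to synthetic logged transitions by least squares; the linear form and the data are assumptions made purely for illustration.

```python
# Minimal sketch of fitting an explicit one-step dynamics model from logged
# transitions and using it for prediction. Linear model and synthetic data
# are placeholders.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic logged transitions (s, a, s') with 2-D states and 1-D actions.
S = rng.standard_normal((500, 2))
A = rng.standard_normal((500, 1))
true_M = np.array([[0.9, 0.1, 0.05],
                   [0.0, 0.95, 0.2]])
S_next = np.hstack([S, A]) @ true_M.T + 0.01 * rng.standard_normal((500, 2))

# Least-squares fit of s' ~ M [s; a]  (the "explicit model of the dynamics").
X = np.hstack([S, A])
W, *_ = np.linalg.lstsq(X, S_next, rcond=None)
M = W.T

# Use the learned model to predict the next state for a new (s, a) pair.
s, a = np.array([0.5, -0.3]), np.array([1.0])
print("predicted next state:", M @ np.concatenate([s, a]))
```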
Autonomous Exploration Under Uncertainty via Deep Reinforcement Learning on Graphs
[article]
2020
arXiv
pre-print
We propose a novel approach that uses graph neural networks (GNNs) in conjunction with deep reinforcement learning (DRL), enabling decision-making over graphs containing exploration information to predict ...
For this problem, belief space planning methods that forward-simulate robot sensing and estimation may often fail in real-time implementation, scaling poorly with increasing size of the state, belief and ...
In this paper we consider value-based methods and policy-based methods for model-free control. ...
arXiv:2007.12640v1
fatcat:4x2lsdzoazeq7ayauocjsjqqq4
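The entry above mentions value-based and policy-based methods for model-free control. For concreteness, here is a minimal value-based example, tabular Q-learning on a toy chain MDP; the environment and hyperparameters are invented for illustration and are unrelated to the paper's graph-based exploration setup.

```python
# Minimal sketch of a value-based, model-free method: tabular Q-learning on a
# tiny chain MDP with reward at the rightmost state. Purely illustrative.
import numpy as np

n_states, n_actions = 5, 2          # chain of 5 states; actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)
alpha, gamma, eps = 0.1, 0.95, 0.1

for _ in range(500):                # episodes
    s = 0
    for _ in range(50):             # steps per episode
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])  # TD update
        s = s_next
        if r == 1.0:
            break

print("greedy policy (0=left, 1=right):", Q.argmax(axis=1))
```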
Risk-Aware Model-Based Control
2021
Frontiers in Robotics and AI
In this work, a novel MBRL method is proposed, called Risk-Aware Model-Based Control (RAMCO). ...
facing high-dimensional and complex problems. ...
RISK-AWARE MODEL-BASED CONTROL. In this work, we propose a model-based method with a probabilistic dynamics model, and our main objective is to learn a safe and scalable policy efficiently. ...
doi:10.3389/frobt.2021.617839
pmid:33778013
pmcid:PMC7990789
fatcat:v4thq6253zgjpde6zhowd6ufca
Real-Time Model Calibration with Deep Reinforcement Learning
[article]
2020
arXiv
pre-print
In this paper, we propose a novel framework for inference of model parameters based on reinforcement learning. ...
However, fast and accurate inference for processes with large and high dimensional datasets cannot easily be achieved with state-of-the-art methods under noisy real-world conditions. ...
Scalability to Large Dataset and High Dimensional Model Calibration Parameters θ. ...
arXiv:2006.04001v2
fatcat:dndjxnsaubhi3jkpxu6tkqkfdm
Scalable Global Optimization via Local Bayesian Optimization
[article]
2020
arXiv
pre-print
Bayesian optimization has recently emerged as a popular method for the sample-efficient optimization of expensive black-box functions. ...
This motivates the design of a local probabilistic approach for global optimization of large-scale high-dimensional problems. ...
Lunar landing reinforcement learning. Here the goal is to learn a controller for a lunar lander implemented in the OpenAI Gym. ...
arXiv:1910.01739v4
fatcat:z6orkycjnjgmjcvdaymd2xp4gm
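This entry concerns sample-efficient optimization of expensive black-box functions, for example tuning a lunar-lander controller. Below is a minimal sketch of vanilla Bayesian optimization with a Gaussian-process surrogate and an expected-improvement acquisition (using scikit-learn and SciPy); it illustrates the general method, not the paper's local trust-region approach, and the 1-D objective is a stand-in.

```python
# Minimal sketch of Bayesian optimization (GP surrogate + expected improvement)
# on a toy 1-D black-box function to be minimized. Not the paper's method.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def black_box(x):
    # Stand-in for an expensive objective (e.g., negative controller return).
    return np.sin(3 * x) + 0.5 * x ** 2

def expected_improvement(mu, sigma, best):
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(5, 1))                  # initial design points
y = black_box(X).ravel()
candidates = np.linspace(-2, 2, 500).reshape(-1, 1)  # discretized search space

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X = np.vstack([X, x_next])
    y = np.append(y, black_box(x_next)[0])

print("best value found:", y.min())
```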
An Adversarial Objective for Scalable Exploration
[article]
2020
arXiv
pre-print
Model-based curiosity combines active learning approaches to optimal sampling with the information gain based incentives for exploration presented in the curiosity literature. ...
This discriminator is optimized jointly with a prediction model and enables our active learning approach to sample sequences of observations and actions which result in predictions considered the least ...
ACKNOWLEDGEMENTS The authors are grateful for support through the Curious Minded Machines project funded by the Honda Research Institute. ...
arXiv:2003.06082v4
fatcat:gb7vwk35p5fk3drgho37tbx3dq
Building a Scalable and Interpretable Bayesian Deep Learning Framework for Quality Control of Free Form Surfaces
2021
IEEE Access
ACKNOWLEDGMENT: This study was supported by the UK EPSRC project EP/K019368/1: "Self-Resilient Reconfigurable Assembly Systems with In-process Quality Improvement" and the WMG-IIT scholarship. ...
Further, to exponentially enhance the scalability for high dimensional MAS and reduce CAE simulation time, uncertainty guided continual learning [8] and transfer learning [7] features are integrated ...
for a single MAS by leveraging uncertainty estimates of the Bayesian 3D U-net OSER-MAS model. (2) Uncertainty guided transfer/continual learning-based scalability model to transfer meta-knowledge from ...
doi:10.1109/access.2021.3068867
fatcat:wb6yf6j7nbdxrifpmbg6vew6iy
Uncertainty-Aware Policy Optimization: A Robust, Adaptive Trust Region Approach
[article]
2020
arXiv
pre-print
When combined with small sample sizes, these methods can result in unstable learning due to their reliance on high-dimensional sample-based estimates. ...
In order for reinforcement learning techniques to be useful in real-world decision making processes, they must be able to produce robust performance from limited data. ...
Acknowledgments: This research was partially supported by the NSF under grants ECCS-1931600, DMS-1664644, CNS-1645681, and IIS-1914792 ...
arXiv:2012.10791v1
fatcat:euqqdcoi4rea3i52ww7p3poyte
A Survey on Learning-Based Model Predictive Control: Toward Path Tracking Control of Mobile Platforms
2022
Applied Sciences
The learning-based model predictive control (LB-MPC) is an effective and critical method to solve the path tracking problem in mobile platforms under uncertain disturbances. ...
It is well known that the machine learning (ML) methods use the historical and real-time measurement data to build data-driven prediction models. ...
Acknowledgments: The authors would like to thank all anonymous reviewers and editors for their helpful suggestions for the improvement of this paper. ...
doi:10.3390/app12041995
fatcat:hdy753pdbfhjvihdopaw6lip2u
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning
[article]
2021
arXiv
pre-print
Reinforcement Learning (RL) agents in the real world must satisfy safety constraints in addition to maximizing a reward objective. ...
We propose Conservative and Adaptive Penalty (CAP), a model-based safe RL framework that accounts for potential modeling errors by capturing model uncertainty and adaptively exploiting it to balance the ...
Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein ...
arXiv:2112.07701v1
fatcat:fpaqm7dd5jfujcwzozgn4yotly
Wide Area Measurement System-based Low Frequency Oscillation Damping Control through Reinforcement Learning
[article]
2020
arXiv
pre-print
Such a technique has the unique characteristic of learning over diverse scenarios and operating conditions by exploring the environment and devising an optimal control action policy via policy gradient ...
Lately, wide area measurement system-based centralized controlling techniques started providing a more flexible and robust control to keep the system stable. ...
Such a systematic validation shows the applicability and scalability of the reinforcement learning-based oscillation damping controller. ...
arXiv:2001.07829v1
fatcat:jzp66s4w4nfo5botusqsigacdy
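The entry above notes that the damping controller is trained via policy gradient. As a minimal sketch of that underlying idea, here is REINFORCE with a softmax policy on a toy two-armed bandit; the bandit, reward values, and learning rate are illustrative assumptions, not the wide-area damping controller.

```python
# Minimal sketch of the policy-gradient idea (REINFORCE) on a toy two-armed
# bandit with a softmax policy over logits theta. Purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)                     # logits for two discrete actions
true_rewards = np.array([0.2, 0.8])     # expected reward of each arm
lr = 0.1

for _ in range(2000):
    probs = np.exp(theta) / np.exp(theta).sum()   # softmax policy
    a = rng.choice(2, p=probs)
    r = rng.normal(true_rewards[a], 0.1)          # stochastic reward
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0                         # grad of log pi(a) w.r.t. logits
    theta += lr * r * grad_log_pi                 # REINFORCE update

print("learned action probabilities:", np.exp(theta) / np.exp(theta).sum())
```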
Scalable Learning Paradigms for Data-Driven Wireless Communication
[article]
2020
arXiv
pre-print
However, the ever exploding data volume and model complexity will limit centralized solutions to learn and respond within a reasonable time. ...
Therefore, scalability becomes a critical issue to be solved. In this article, we aim to provide a systematic discussion on the building blocks of scalable data-driven wireless networks. ...
Besides, Bayesian neural networks and deep GP algorithms can be combined with RL techniques to form Bayesian deep reinforcement learning (BDRL) to achieve a fully Bayesian control framework. ...
arXiv:2003.00474v1
fatcat:kd6plphwgbfvbdyylcn4jk24uq
Deep Reinforcement Learning Versus Evolution Strategies: A Comparative Survey
[article]
2021
arXiv
pre-print
After presenting their fundamental concepts and algorithms, a comparison is provided on key aspects such as scalability, exploration, adaptation to dynamic environments, and multi-agent learning. ...
Deep Reinforcement Learning (DRL) and Evolution Strategies (ESs) have surpassed human-level control in many sequential decision-making problems, yet many open challenges still exist. ...
Their approach consists of two main steps: (i) learning a state representation and initial policy from high-dimensional input data using gradient-based methods (e.g., DQN or DDPG); and (ii) fine-tuning ...
arXiv:2110.01411v1
fatcat:nw47ududyndyljlh4nx2gm73jq
Showing results 1 — 15 out of 5,380 results