6,285 Hits in 5.3 sec

Bayesian Robust Optimization for Imitation Learning [article]

Daniel S. Brown, Scott Niekum, Marek Petrik
2020 arXiv   pre-print
To provide a bridge between these two extremes, we propose Bayesian Robust Optimization for Imitation Learning (BROIL).  ...  BROIL leverages Bayesian reward function inference and a user-specified risk tolerance to efficiently optimize a robust policy that balances expected return and conditional value at risk.  ...  Acknowledgments and Disclosure of Funding We would like to thank the reviewers for their detailed feedback that helped to improve the paper.  ... 
arXiv:2007.12315v3 fatcat:a3yro3z2ffhxzd7rsk5nzrfhla
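The soft-robust trade-off this abstract describes can be made concrete with a minimal NumPy sketch: given samples of a policy's return under the reward-function posterior, blend the expected return with the conditional value at risk (CVaR). The names `broil_objective`, `lam`, and `alpha` are illustrative, not the paper's implementation.

```python
import numpy as np

def cvar(returns, alpha=0.05):
    """Conditional value at risk: mean of the worst alpha-fraction of returns."""
    sorted_r = np.sort(np.asarray(returns, dtype=float))
    k = max(1, int(np.ceil(alpha * len(sorted_r))))
    return sorted_r[:k].mean()

def broil_objective(posterior_returns, lam=0.5, alpha=0.05):
    """Soft-robust objective: lam * expected return + (1 - lam) * CVaR.
    lam = 1 recovers risk-neutral expected return; lam = 0 is fully risk-averse."""
    posterior_returns = np.asarray(posterior_returns, dtype=float)
    return lam * posterior_returns.mean() + (1 - lam) * cvar(posterior_returns, alpha)

# Toy usage: score one policy under 1000 sampled reward functions
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=0.5, size=1000)
score = broil_objective(samples, lam=0.5, alpha=0.05)
```

Sweeping `lam` between 0 and 1 traces out the family of risk-neutral to risk-averse policies the abstract refers to.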

Policy Gradient Bayesian Robust Optimization for Imitation Learning [article]

Zaynah Javed, Daniel S. Brown, Satvik Sharma, Jerry Zhu, Ashwin Balakrishna, Marek Petrik, Anca D. Dragan, Ken Goldberg
2021 arXiv   pre-print
We derive a novel policy gradient-style robust optimization approach, PG-BROIL, that optimizes a soft-robust objective that balances expected performance and risk.  ...  Results suggest that PG-BROIL can produce a family of behaviors ranging from risk-neutral to risk-averse and outperforms state-of-the-art imitation learning algorithms when learning from ambiguous demonstrations  ...  This work has taken place in the AUTOLAB and InterACT Lab at the University of California, Berkeley and the Reinforcement Learning and Robustness Lab (RLsquared) at the University of New Hampshire.  ... 
arXiv:2106.06499v2 fatcat:qpyrd4lrr5aqjl4kqeaf3z3fgi

Deep Bayesian Reward Learning from Preferences [article]

Daniel S. Brown, Scott Niekum
2019 arXiv   pre-print
Bayesian inverse reinforcement learning (IRL) methods are ideal for safe imitation learning, as they allow a learning agent to reason about reward uncertainty and the safety of a learned policy.  ...  We demonstrate that B-REX learns imitation policies that are competitive with a state-of-the-art deep imitation learning method that only learns a point estimate of the reward function.  ...  Prior work on high-confidence policy evaluation for imitation learning has used Bayesian inverse reinforcement learning (IRL) [37] to allow an agent to reason about reward uncertainty and policy robustness  ... 
arXiv:1912.04472v1 fatcat:c2ouhzmearhupopywchlvp7ckq
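Preference-based Bayesian reward learning of this kind typically rests on a Bradley-Terry style likelihood over trajectory pairs. A minimal sketch, with illustrative names (a real implementation would run MCMC over reward-function parameters weighted by this likelihood):

```python
import numpy as np

def preference_loglik(reward_fn, pref_pairs):
    """Log-likelihood of pairwise preferences under one sampled reward function.
    Each pair (worse, better) of state sequences contributes
    log P(better preferred) under a Bradley-Terry model on trajectory returns."""
    ll = 0.0
    for worse, better in pref_pairs:
        r_w = sum(reward_fn(s) for s in worse)
        r_b = sum(reward_fn(s) for s in better)
        ll += r_b - np.logaddexp(r_w, r_b)  # log( e^{r_b} / (e^{r_w} + e^{r_b}) )
    return ll
```

Samples of the reward function scored this way form the posterior that supports high-confidence policy evaluation, rather than a single point estimate.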

Bayesian Gaussian mixture model for robotic policy imitation [article]

Emmanuel Pignat, Sylvain Calinon
2019 arXiv   pre-print
A common approach to learn robotic skills is to imitate a policy demonstrated by a supervisor.  ...  These advantages make this method very convenient for imitation of robotic manipulation tasks in the continuous domain.  ...  In this work, we propose a computationally efficient and simple-to-apply Bayesian model that can be used for policy imitation.  ... 
arXiv:1904.10716v1 fatcat:gnqsxeiqjzaoxiubrjqapdgi6i

Bayesian Gaussian Mixture Model for Robotic Policy Imitation

Emmanuel Pignat, Sylvain Calinon
2019 Zenodo  
A common approach to learn robotic skills is to imitate a demonstrated policy.  ...  We propose to use a Bayesian method to quantify the action uncertainty at each state.  ...  In its original form, GPR assumes homoscedasticity (constant covariance over the state). b) Efficient and robust computation: Most Bayesian models can be computationally demanding for both learning and  ... 
doi:10.5281/zenodo.3676688 fatcat:kuty3k7okbf77aeo5zygkzf6ju
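The state-dependent action uncertainty described here can be sketched by conditioning a fitted joint mixture p(s, a) on the current state. In this illustrative 1-D version (not the authors' code), the conditional variance varies with the state via the component responsibilities, unlike the homoscedastic GPR the snippet contrasts against:

```python
import numpy as np

def gmm_condition(state, weights, means, covs):
    """Condition a joint GMM p(s, a) on s to get p(a | s): mean and variance.
    means[k] = [mu_s, mu_a]; covs[k] = 2x2 joint covariance (1-D state/action)."""
    resp, mus, vars_ = [], [], []
    for w, mu, cov in zip(weights, means, covs):
        mu_s, mu_a = mu
        s_ss, s_sa, s_aa = cov[0, 0], cov[0, 1], cov[1, 1]
        # responsibility of this component for the state (Gaussian marginal density)
        resp.append(w * np.exp(-0.5 * (state - mu_s) ** 2 / s_ss) / np.sqrt(2 * np.pi * s_ss))
        mus.append(mu_a + s_sa / s_ss * (state - mu_s))   # conditional mean
        vars_.append(s_aa - s_sa ** 2 / s_ss)             # conditional variance
    resp = np.array(resp) / np.sum(resp)
    mean = np.sum(resp * np.array(mus))
    # law of total variance: uncertainty depends on the state through resp
    var = np.sum(resp * (np.array(vars_) + (np.array(mus) - mean) ** 2))
    return mean, var
```

A policy can then act on the conditional mean and use the conditional variance to fall back to a safe controller where the demonstrations are sparse.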

Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction

Yichuan Zhang, Yixing Lan, Qiang Fang, Xin Xu, Junxiang Li, Yujun Zeng, Ahmed Mostafa Khalil
2021 Computational Intelligence and Neuroscience  
Considering that human knowledge is not only interpretable but also suitable for generalization, we propose to exploit the potential of demonstrations by extracting knowledge from them via Bayesian networks  ...  and develop a novel RLfD method called Reinforcement Learning from demonstration via Bayesian Network-based Knowledge (RLBNK).  ...  Conditional independence used in Bayesian networks is the basic and robust form of knowledge. The Bayesian network classifier is robust, and we can learn the parameters of conditional distribution even with  ... 
doi:10.1155/2021/7588221 pmid:34603434 pmcid:PMC8486502 fatcat:hvjdbrpjxrbznntrrur4omz32m

A Bayesian Approach to Imitation in Reinforcement Learning

Bob Price, Craig Boutilier
2003 International Joint Conference on Artificial Intelligence  
We recast the problem of imitation in a Bayesian framework.  ...  In multiagent environments, forms of social learning such as teaching and imitation have been shown to aid the transfer of knowledge from experts to learners in reinforcement learning (RL).  ...  We see that imitation transfer is robust to modest differences in mentor and imitator objectives.  ... 
dblp:conf/ijcai/PriceB03 fatcat:ktycd7mwh5dghedskoqnkq6ziq

Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences [article]

Daniel S. Brown, Russell Coleman, Ravi Srinivasan, Scott Niekum
2020 arXiv   pre-print
However, Bayesian reward learning methods are typically computationally intractable for complex control problems.  ...  Bayesian reward learning from demonstrations enables rigorous safety and uncertainty analysis when performing imitation learning.  ...  This precludes robust safety and uncertainty analysis for imitation learning in high-dimensional problems or in problems in which a model of the MDP is unavailable.  ... 
arXiv:2002.09089v4 fatcat:vk6ebzm2ijesjdp3bahj5cjdgi

Safe end-to-end imitation learning for model predictive control [article]

Keuntaek Lee, Kamil Saigol, Evangelos A. Theodorou
2019 arXiv   pre-print
Our algorithm combines reinforcement learning and end-to-end imitation learning to simultaneously learn a control policy as well as a threshold over the predictive uncertainty of the learned model, with  ...  We propose the use of Bayesian networks, which provide both a mean value and an uncertainty estimate as output, to enhance the safety of learned control policies under circumstances in which a test-time  ...  This cost may be unavailable or impossible for a learned model to optimize directly. Instead, imitation learning assumes that an expert capable of optimizing this cost is available.  ... 
arXiv:1803.10231v3 fatcat:vco25ewkpzcatkm636qrpox5jm

A Bayesian Approach to Generative Adversarial Imitation Learning

Wonseok Jeon, Seokin Seo, Kee-Eung Kim
2018 Neural Information Processing Systems  
Generative adversarial training for imitation learning has shown promising results on high-dimensional and continuous control tasks.  ...  To address this issue, we first propose a Bayesian formulation of generative adversarial imitation learning (GAIL), where the imitation policy and the cost function are represented as stochastic neural  ...  Bayesian Framework for Adversarial Imitation Learning Agent-expert discrimination Suppose π A is fixed for simplicity, which will be later parameterized for learning.  ... 
dblp:conf/nips/JeonSK18 fatcat:obctzjpj2ndrhbcv3xgmix4seq

Ensemble Bayesian Decision Making with Redundant Deep Perceptual Control Policies [article]

Keuntaek Lee, Ziyi Wang, Bogdan I. Vlahov, Harleen K. Brar, Evangelos A. Theodorou
2020 arXiv   pre-print
This work presents a novel ensemble of Bayesian Neural Networks (BNNs) for control of safety-critical systems.  ...  Neural Networks (NNs) are generally not used for safety-critical systems as they can behave in unexpected ways in response to novel inputs.  ...  For imitation learning, this equation changes slightly.  ... 
arXiv:1811.12555v3 fatcat:ebmuubbxjbaslebukep2xzldpi

Robot learning from demonstration

A. Billard
2004 Robotics and Autonomous Systems  
and the optimal imitation control policy.  ...  Work in that area tackles the development of robust algorithms for motor control, motor learning, gesture recognition and visuo-motor integration.  ... 
doi:10.1016/s0921-8890(04)00037-5 fatcat:ydogncl2p5hltorg2vcm5lucd4

Robot learning from demonstration

Aude Billard, Roland Siegwart
2004 Robotics and Autonomous Systems  
and the optimal imitation control policy.  ...  Work in that area tackles the development of robust algorithms for motor control, motor learning, gesture recognition and visuo-motor integration.  ... 
doi:10.1016/j.robot.2004.03.001 fatcat:lkx5gvv5mzcsbgm7sp2bs6g64y

Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? [article]

Angelos Filos, Panagiotis Tigas, Rowan McAllister, Nicholas Rhinehart, Sergey Levine, Yarin Gal
2020 arXiv   pre-print
we term adaptive robust imitative planning (AdaRIP).  ...  In this paper, we highlight the limitations of current approaches to novel driving scenes and propose an epistemic uncertainty-aware planning method, called robust imitative planning (RIP).  ...  Robust Imitative Planning We seek an imitation learning method that (a) provides a distribution over expert plans; (b) quantifies epistemic uncertainty to allow for detection of OOD observations and (c  ... 
arXiv:2006.14911v2 fatcat:tjowgtfad5c7zgenvfhq7seq4e
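The epistemic-uncertainty signal that robust imitative planning relies on can be sketched with a deep ensemble: disagreement among ensemble members flags out-of-distribution scenes, and a pessimistic aggregation picks the robust plan. The function name and the worst-case rule below are an illustrative simplification, not the paper's exact method.

```python
import numpy as np

def ensemble_plan(candidate_plans, models, risk="worst_case"):
    """Score candidate plans under an ensemble of imitation models.
    models: callables mapping a plan to a scalar log-likelihood/score.
    Returns the index of the chosen plan and a per-plan epistemic signal."""
    scores = np.array([[m(p) for m in models] for p in candidate_plans])
    epistemic = scores.std(axis=1)      # member disagreement: OOD detection signal
    if risk == "worst_case":            # pessimistic (robust) aggregation
        robust = scores.min(axis=1)
    else:                               # risk-neutral aggregation
        robust = scores.mean(axis=1)
    return int(np.argmax(robust)), epistemic
```

When the epistemic signal exceeds a threshold, an adaptive variant can defer to an expert or trigger online adaptation, which is the role AdaRIP plays in the paper.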