200 Hits in 2.4 sec

DART: Noise Injection for Robust Imitation Learning [article]

Michael Laskey, Jonathan Lee, Roy Fox, Anca Dragan, Ken Goldberg
2017-10-18 · arXiv · pre-print
We propose a new algorithm, DART (Disturbances for Augmenting Robot Trajectories), that collects demonstrations with injected noise, and optimizes the noise level to approximate the error of the robot's  ...  One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy.  ...  Lab, the Berkeley Deep Drive (BDD) Initiative, the Real-Time Intelligent Secure Execution (RISE) Lab, and the CITRIS "People and Robots" (CPAR) Initiative and by the Scalable Collaborative Human-Robot Learning  ... 
arXiv:1703.09327v2 · fatcat:ax27iisqqbdxhmkg22nsmv3qy4
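The DART idea described in the snippet above — the robot executes noise-perturbed actions while the dataset stores the supervisor's noise-free actions as labels — can be illustrated on a toy 1-D task. This is a minimal sketch, not the authors' implementation; `LineEnv` and `supervisor` are hypothetical stand-ins:

```python
import numpy as np

class LineEnv:
    """Toy 1-D environment: the state drifts by the executed action."""
    def __init__(self):
        self.x = 0.0
    def reset(self):
        self.x = 0.0
        return self.x
    def step(self, a):
        self.x += float(a)
        return self.x, abs(self.x) > 10.0   # done if we wander too far

def supervisor(x):
    """Hypothetical expert: steer the state toward the goal at x = 5."""
    return np.clip(5.0 - x, -1.0, 1.0)

def collect_dart_demo(env, expert, noise_std, horizon=50, rng=None):
    """Collect one demonstration DART-style: execute the expert action
    plus Gaussian noise, but store the noise-free expert action as the
    label, so the dataset covers the off-distribution states a cloned
    policy is likely to visit."""
    if rng is None:
        rng = np.random.default_rng(0)
    states, labels = [], []
    s = env.reset()
    for _ in range(horizon):
        a_star = expert(s)                          # noise-free label
        a_exec = a_star + rng.normal(0.0, noise_std)
        states.append(s)
        labels.append(a_star)
        s, done = env.step(a_exec)                  # perturbed action executed
        if done:
            break
    return np.array(states), np.array(labels)

states, labels = collect_dart_demo(LineEnv(), supervisor, noise_std=0.3)
```

DART additionally optimizes `noise_std` to match the learner's error; here it is simply a fixed parameter.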

Learning Robust Bed Making using Deep Imitation Learning with DART [article]

Michael Laskey, Chris Powers, Ruta Joshi, Arshan Poursohi, Ken Goldberg
2017-11-07 · arXiv · pre-print
We explore how DART, an LfD algorithm for learning robust policies, can be applied to automating bed making without fiducial markers with a Toyota Human Support Robot (HSR).  ...  Experiments with a 1/2 scale twin bed and distractors placed on the bed suggest that policies learned on 50 demonstrations with DART achieve 96% sheet coverage, over 200% better than a corner detector  ...  We next evaluate the learned policies trained on 50 demonstrations that are collected either with Behavior Cloning (i.e., no noise injected) or with DART.  ...
arXiv:1711.02525v1 · fatcat:fyi3xbzez5htzorkkzg2pzdjgi

Disturbance-Injected Robust Imitation Learning with Task Achievement [article]

Hirotaka Tahara, Hikaru Sasaki, Hanbit Oh, Brendan Michael, Takamitsu Matsubara
2022-05-09 · arXiv · pre-print
Robust imitation learning using disturbance injections overcomes issues of limited variation in demonstrations.  ...  To address this issue, this paper proposes a novel imitation learning framework that combines both policy robustification and optimal demonstration learning.  ...  Disturbances for Augmenting Robot Trajectories (DART) To mitigate covariate shift, DART [5] learns policies that are robust to error compounding by generating demonstrations with disturbance injection  ... 
arXiv:2205.04195v1 · fatcat:cq2p72pntvhh5hs5p6zhh7iz7y

Grasping with Chopsticks: Combating Covariate Shift in Model-free Imitation Learning for Fine Manipulation [article]

Liyiming Ke, Jingqiang Wang, Tapomayukh Bhattacharjee, Byron Boots, Siddhartha Srinivasa
2020-11-13 · arXiv · pre-print
Due to the lack of accurate models for fine manipulation, we explore model-free imitation learning, which traditionally suffers from the covariate shift phenomenon that causes poor generalization.  ...  Second, we generate synthetic corrective labels by adding bounded noise and combining parametric and non-parametric methods to prevent error accumulation.  ...  To remedy covariate shift, researchers have proposed interactive imitation learning methods, such as DAgger [24] and DART [25] , to query an expert online for corrective labels.  ... 
arXiv:2011.06719v1 · fatcat:ubgxddm4lzcidogcxinkh2rdjq
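The second snippet above mentions generating synthetic corrective labels by adding bounded noise to demonstrated states. An illustrative sketch (not the authors' exact method): perturb each demonstrated state with bounded uniform noise and label the perturbed state with the demonstrated action plus a corrective term that pushes back toward the original state. The `gain` parameter and the linear correction are assumptions made for illustration:

```python
import numpy as np

def synthetic_corrective_labels(demo_states, demo_actions, noise_bound,
                                n_aug=5, gain=0.5, rng=None):
    """Augment (state, action) demonstration pairs with bounded-noise
    neighbors labeled by a simple proportional corrective action."""
    if rng is None:
        rng = np.random.default_rng(0)
    aug_states, aug_actions = [], []
    for s, a in zip(demo_states, demo_actions):
        for _ in range(n_aug):
            delta = rng.uniform(-noise_bound, noise_bound, size=np.shape(s))
            aug_states.append(s + delta)          # perturbed state
            aug_actions.append(a - gain * delta)  # push back toward the demo
    return np.array(aug_states), np.array(aug_actions)

# Two 1-D demonstration pairs, five bounded-noise neighbors each.
demo_s = np.array([[0.0], [1.0]])
demo_a = np.array([[1.0], [1.0]])
aug_s, aug_a = synthetic_corrective_labels(demo_s, demo_a, noise_bound=0.1)
```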

Shared Multi-Task Imitation Learning for Indoor Self-Navigation [article]

Junhong Xu, Qiwei Liu, Hanqing Guo, Aaron Kageza, Saeed AlQarni, Shaoen Wu
2018-08-14 · arXiv · pre-print
Deep imitation learning enables robots to learn from expert demonstrations to perform tasks such as lane following or obstacle avoidance.  ...  However, in the traditional imitation learning framework, one model learns only one task, so it lacks the capability to support a robot performing various navigation tasks with one model  ...  We employ three different training techniques to train a robust SMIL framework: dropout, data augmentation, and noise injection.  ...
arXiv:1808.04503v1 · fatcat:gwph3rjhfzeuzjcgbsnmbf3xay

Better-than-Demonstrator Imitation Learning via Automatically-Ranked Demonstrations [article]

Daniel S. Brown, Wonjoon Goo, Scott Niekum
2019-10-14 · arXiv · pre-print
Building on this theory, we introduce Disturbance-based Reward Extrapolation (D-REX), a ranking-based imitation learning method that injects noise into a policy learned through behavioral cloning to automatically  ...  To address these issues, we first contribute a sufficient condition for better-than-demonstrator imitation learning and provide theoretical results showing why preferences over demonstrations can better  ...  Instead, our proposed approach for better-than-demonstrator imitation learning uses noise injection to produce a wide variety of automatically-ranked demonstrations in order to reduce the learner's  ...
arXiv:1907.03976v3 · fatcat:lldvi7dsjnhe3dxirzfdursdve
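D-REX's core trick, per the snippet above, is to roll out a behavior-cloned policy under increasing noise and rank the resulting trajectories by noise level, with no extra human labels. A minimal sketch under toy assumptions (`ToyEnv` and `bc_policy` are hypothetical stand-ins; the real method learns a reward from these rankings):

```python
import numpy as np

class ToyEnv:
    """Minimal 1-D task: start away from the origin; actions move the state."""
    def __init__(self):
        self.x = 0.0
    def reset(self):
        self.x = 3.0
        return self.x
    def step(self, a):
        self.x += float(a)
        return self.x, False

def bc_policy(x):
    """Stand-in for a behavior-cloned policy: move toward the origin."""
    return np.clip(-x, -1.0, 1.0)

def drex_ranked_rollouts(policy, env, noise_levels, horizon=20, rng=None):
    """Generate automatically-ranked trajectories: roll out the cloned
    policy with epsilon-greedy action noise; lower-noise rollouts are
    assumed better, yielding a ranking for preference-based reward learning."""
    if rng is None:
        rng = np.random.default_rng(0)
    ranked = []
    for eps in sorted(noise_levels):       # ascending noise = descending rank
        traj = []
        s = env.reset()
        for _ in range(horizon):
            a = policy(s)
            if rng.random() < eps:         # with prob eps, act randomly
                a = rng.uniform(-1.0, 1.0)
            traj.append((s, a))
            s, done = env.step(a)
            if done:
                break
        ranked.append((eps, traj))
    return ranked   # (noise level, trajectory) pairs, best-first

ranked = drex_ranked_rollouts(bc_policy, ToyEnv(), [0.0, 0.3, 0.9])
```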

Iterative Reinforcement Learning Based Design of Dynamic Locomotion Skills for Cassie [article]

Zhaoming Xie, Patrick Clary, Jeremy Dao, Pedro Morais, Jonathan Hurst, Michiel van de Panne
2019-03-22 · arXiv · pre-print
The tuples also allow for robust policy distillation to new network architectures.  ...  Deep reinforcement learning (DRL) is a promising approach for developing legged locomotion skills.  ...  We evaluate the robustness of each policy by injecting noise of varying magnitude to the policy actions, increasing the mass of the pelvis by 20%, and applying pushes of 50N in the forward direction for  ... 
arXiv:1903.09537v1 · fatcat:fs7vnduzvrdd7jyh4qkubaysuq
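The robustness evaluation mentioned in the snippet above — inject action noise of varying magnitude and see how the policy holds up — can be sketched on a toy surrogate. Everything here (`WalkerEnv`, the failure threshold, the perfect `policy`) is a hypothetical stand-in for the paper's Cassie setup:

```python
import numpy as np

class WalkerEnv:
    """Toy surrogate for locomotion: the episode ends ("falls") once a
    tracking error driven by the executed actions grows too large."""
    def __init__(self):
        self.err = 0.0
    def reset(self):
        self.err = 0.0
        return self.err
    def step(self, a):
        self.err = 0.9 * self.err + abs(float(a))   # bad actions accumulate error
        return self.err, self.err > 3.0

def policy(err):
    """In this toy model, the ideal policy applies no correction."""
    return 0.0

def robustness_curve(pi, env, noise_stds, horizon=100, rng=None):
    """Record episode length under zero-mean Gaussian action noise of
    increasing magnitude: shorter episodes mean a less robust policy."""
    if rng is None:
        rng = np.random.default_rng(0)
    lengths = []
    for std in noise_stds:
        s = env.reset()
        t = 0
        for t in range(1, horizon + 1):
            a = pi(s) + rng.normal(0.0, std)
            s, done = env.step(a)
            if done:
                break
        lengths.append(t)
    return lengths

lengths = robustness_curve(policy, WalkerEnv(), [0.0, 0.5, 2.0])
```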

Augmenting Imitation Experience via Equivariant Representations [article]

Dhruv Sharma, Alihusein Kuwajerwala, Florian Shkurti
2021-10-14 · arXiv · pre-print
The robustness of visual navigation policies trained through imitation often hinges on the augmentation of the training image-action pairs.  ...  In this paper we show that there is another practical alternative for data augmentation for visual navigation based on extrapolating viewpoint embeddings and actions nearby the ones observed in the training  ...  π(Z), via imitation learning.  ... 
arXiv:2110.07668v1 · fatcat:5bf5enpefbeojdagaw2ah3trem

SAFARI: Safe and Active Robot Imitation Learning with Imagination [article]

Norman Di Palo, Edward Johns
2020-11-18 · arXiv · pre-print
One of the main issues in Imitation Learning is the erroneous behavior of an agent when facing out-of-distribution situations, not covered by the set of demonstrations given by the expert.  ...  We empirically demonstrate how this method increases the performance on a set of manipulation tasks with respect to passive Imitation Learning, by gathering more informative demonstrations and by minimizing  ...  For DART, we collected N demonstrations following [35] , by injecting noise during the expert's demonstrations. We followed the original authors' code to implement this baseline.  ... 
arXiv:2011.09586v1 · fatcat:gooj2sbtyvhv5pl2qg363xlh3m

Neural probabilistic motor primitives for humanoid control [article]

Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, Nicolas Heess
2019-01-15 · arXiv · pre-print
We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids.  ...  The trained neural probabilistic motor primitive system can perform one-shot imitation of whole-body humanoid behaviors, robustly mimicking unseen trajectories.  ...  It yields time-indexed neural network policies that are robust to moderate amounts of action noise (see appendix A for additional details on the training procedure).  ... 
arXiv:1811.11711v2 · fatcat:pulp4gc5vrdvpibthme367cufm

LILA: Language-Informed Latent Actions [article]

Siddharth Karamcheti, Megha Srivastava, Percy Liang, Dorsa Sadigh
2021-11-05 · arXiv · pre-print
We introduce Language-Informed Latent Actions (LILA), a framework for learning natural language interfaces in the context of human-robot collaboration.  ...  We show that LILA models are not only more sample efficient and performant than imitation learning and end-effector control baselines, but that they are also qualitatively preferred by users.  ...  We further extend a special thank you to Madeline Liao and Raj Palleti for helping with the visual intuition for our imitation learning analysis.  ... 
arXiv:2111.03205v1 · fatcat:zlmzm7na7bb3ljxeci3iamuixu

Vocal Exploration Is Locally Regulated during Song Learning

P. Ravbar, D. Lipkind, L. C. Parra, O. Tchernichovski
2012-03-07 · Journal of Neuroscience (Society for Neuroscience)
Exploratory variability is essential for sensorimotor learning, but it is not known how and at what timescales it is regulated.  ...  We manipulated song learning in zebra finches to experimentally control the requirements for vocal exploration in different parts of their song.  ...  For example, when learning to throw darts, the initial exploratory variability of throws is usually high (Müller and Sternad, 2009 ).  ... 
doi:10.1523/jneurosci.3740-11.2012 · pmid:22399765 · pmcid:PMC3312320 · fatcat:vyat3x3ti5dpzihitaem5myvt4

Deep Bayesian Reward Learning from Preferences [article]

Daniel S. Brown, Scott Niekum
2019-12-10 · arXiv · pre-print
Bayesian inverse reinforcement learning (IRL) methods are ideal for safe imitation learning, as they allow a learning agent to reason about reward uncertainty and the safety of a learned policy.  ...  We demonstrate that B-REX learns imitation policies that are competitive with a state-of-the-art deep imitation learning method that only learns a point estimate of the reward function.  ...  This high computational cost precludes robust safety and uncertainty analysis for imitation learning in complex high-dimensional problems.  ... 
arXiv:1912.04472v1 · fatcat:c2ouhzmearhupopywchlvp7ckq

Learning motor skills: from algorithms to robot experiments

Jens Kober
2014-01-28 · it - Information Technology (Walter de Gruyter)
For learning single motor skills, we study parametrized policy search methods and introduce a framework of reward-weighted imitation that allows us to derive both policy gradient methods and expectation-maximization  ...  We introduce a novel EM-inspired algorithm for policy learning that is particularly well-suited for motor primitives.  ...  The performance of the algorithm is fairly robust for values chosen in this range.  ...
doi:10.1515/itit-2014-1039 · fatcat:bmwrxrspjjdp3h2etrfarp4r4i

Learning Skills to Patch Plans Based on Inaccurate Models [article]

Alex LaGrassa, Steven Lee, Oliver Kroemer
2020-09-29 · arXiv · pre-print
Meanwhile, learning is useful for adaptation, but can require a substantial amount of data collection.  ...  To show the efficacy of our method, we perform experiments with a shape insertion puzzle and compare our results to both pure planning and imitation learning approaches.  ...  ACKNOWLEDGEMENTS We thank Kevin Zhang, Jacky Liang, Mohit Sharma, and many others for providing the infrastructure needed for the robot experiments.  ... 
arXiv:2009.13732v1 · fatcat:5xe67rfvkzbm7hcmi7vvchcaki
Showing results 1–15 of 200.