
Zero-Shot Visual Imitation

Deepak Pathak, Parsa Mahmoudieh, Guanghao Luo, Pulkit Agrawal, Dian Chen, Fred Shentu, Evan Shelhamer, Jitendra Malik, Alexei A. Efros, Trevor Darrell
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
These skills can then be used to imitate the visual demonstration provided by the expert [15] .  ...  We call our method zero-shot  ...  We compare to their method in both visual navigation and manipulation. (2) GSP-NoPrevAction-NoFwdConst is the ablation of our recurrent GSP without previous action history and without forward consistency  ... 
doi:10.1109/cvprw.2018.00278 dblp:conf/cvpr/PathakMLACSSMED18 fatcat:fkbiut3ttbgujfaswf3ar2a6nm
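
The snippet names the paper's two key ingredients: a goal-conditioned skill policy (GSP) and a forward-consistency loss. A minimal sketch of that loss follows, with toy vector states standing in for images; the module and variable names are ours, not the authors' code. The design point is that several actions can reach the same next state, so the policy is penalized on where its predicted action lands rather than on matching the demonstrated action exactly.

```python
# Minimal sketch of a forward-consistency loss for a goal-conditioned skill
# policy (GSP). Toy vector states stand in for images; names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GSP(nn.Module):
    """Predicts the action that moves the agent from state x_t toward a goal."""
    def __init__(self, state_dim=32, action_dim=4):
        super().__init__()
        self.policy = nn.Sequential(
            nn.Linear(2 * state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))
        # Forward model: predicts the next state from (state, action).
        self.fwd = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim))

    def loss(self, x_t, x_next, a_taken):
        a_pred = self.policy(torch.cat([x_t, x_next], dim=-1))
        # Forward consistency: rather than forcing a_pred == a_taken (several
        # actions may lead to the same state), require that a_pred *lands on*
        # the next state the demonstrated action actually reached.
        x_hat = self.fwd(torch.cat([x_t, a_pred], dim=-1))
        consistency = F.mse_loss(x_hat, x_next)
        # The forward model itself is fit on the real transition.
        forward = F.mse_loss(self.fwd(torch.cat([x_t, a_taken], dim=-1)), x_next)
        return consistency + forward

gsp = GSP()
x_t, x_next, a = torch.randn(8, 32), torch.randn(8, 32), torch.randn(8, 4)
print(gsp.loss(x_t, x_next, a))  # scalar training loss
```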

Zero-Shot Visual Imitation [article]

Deepak Pathak, Parsa Mahmoudieh, Guanghao Luo, Pulkit Agrawal, Dian Chen, Yide Shentu, Evan Shelhamer, Jitendra Malik, Alexei A. Efros, Trevor Darrell
2018 arXiv   pre-print
We evaluate our zero-shot imitator in two real-world settings: complex rope manipulation with a Baxter robot and navigation in previously unseen office environments with a TurtleBot.  ...  Our method is 'zero-shot' in the sense that the agent never has access to expert actions during training or for the task demonstration at inference.  ...  We evaluate our zero-shot imitator on real-world robots for rope manipulation tasks using a Baxter and office navigation using a TurtleBot.  ... 
arXiv:1804.08606v1 fatcat:dgrsryxhqvbfhiqmnmjkyyakqi

Zero-shot Imitation Learning from Demonstrations for Legged Robot Visual Navigation [article]

Xinlei Pan, Tingnan Zhang, Brian Ichter, Aleksandra Faust, Jie Tan, Sehoon Ha
2020 arXiv   pre-print
Here, we propose a zero-shot imitation learning approach for training a visual navigation policy on legged robots from human (third-person perspective) demonstrations, enabling high-quality navigation  ...  Imitation learning is a popular approach for training visual navigation policies.  ...  METHOD: This section introduces a zero-shot imitation learning framework for visual navigation of a legged robot.  ... 
arXiv:1909.12971v2 fatcat:5yedpru2wnchxg3zwagupq262e

Detection and Captioning with Unseen Object Classes [article]

Berkan Demirel, Ramazan Gokberk Cinbis
2021 arXiv   pre-print
Our experiments show that the proposed zero-shot detection model obtains state-of-the-art performance on the MS-COCO dataset and the zero-shot captioning approach yields promising results.  ...  For this problem, we propose a detection-driven approach based on a generalized zero-shot detection model and a template-based sentence generation model.  ...  Noticeably, the performance gap between true zero-shot and (visually) supervised partial zero-shot captioning is larger in terms of the Avg. F1 metric.  ... 
arXiv:2108.06165v1 fatcat:k4riatvu6vcknggyiwq2kbjcze

Visual Goal-Directed Meta-Learning with Contextual Planning Networks [article]

Corban G. Rivera, David A Handelman
2021 arXiv   pre-print
We evaluate CPN along with several other approaches adapted for zero-shot goal-directed meta-learning.  ...  We adapted the metaworld benchmark tasks to create 24 zero-shot meta-learning from visual demonstration tasks for evaluation.  ...  Single-shot imitation learning via images has also been explored [22] , [23] , [1] , [24] .  ... 
arXiv:2111.09908v1 fatcat:tzl2dnkv5ne57nleq3guwwv77m

Gated-Attention Architectures for Task-Oriented Language Grounding

Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
2018 Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)  
To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map them to visual elements and actions in the environment  ...  The proposed model combines the image and text representations using a Gated-Attention mechanism and learns a policy to execute the natural language instruction using standard reinforcement and imitation  ...  Multitask and Zero-Shot task generalization, across three modes of difficulty.  ... 
doi:10.1609/aaai.v32i1.11832 fatcat:4mekny7girh2hghlcldumbl4ni

Gated-Attention Architectures for Task-Oriented Language Grounding [article]

Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
2018 arXiv   pre-print
To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map them to visual elements and actions in the environment  ...  The proposed model combines the image and text representations using a Gated-Attention mechanism and learns a policy to execute the natural language instruction using standard reinforcement and imitation  ...  Multitask and Zero-Shot task generalization, across three modes of difficulty.  ... 
arXiv:1706.07230v2 fatcat:pckcwi6gbbaqzoiesionmcsrou
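
Both versions of this paper describe the same fusion step: a sigmoid-gated instruction embedding scales the image feature maps channel-wise (a Hadamard product). A minimal sketch under that reading, with illustrative dimensions:

```python
# Minimal sketch of gated-attention fusion: the instruction embedding is
# squashed through a sigmoid and gates the image feature maps channel-wise.
import torch
import torch.nn as nn

class GatedAttention(nn.Module):
    def __init__(self, text_dim=64, channels=32):
        super().__init__()
        self.gate = nn.Linear(text_dim, channels)  # one gate per feature map

    def forward(self, image_feats, text_emb):
        # image_feats: (B, C, H, W); text_emb: (B, text_dim)
        a = torch.sigmoid(self.gate(text_emb))    # (B, C), values in (0, 1)
        return image_feats * a[:, :, None, None]  # broadcast over H and W

fuse = GatedAttention()
out = fuse(torch.randn(2, 32, 8, 8), torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 32, 8, 8])
```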

Towards More Generalizable One-shot Visual Imitation Learning [article]

Zhao Mandi, Fangchen Liu, Kimin Lee, Pieter Abbeel
2022 arXiv   pre-print
We then study the multi-task setting, where multi-task training is followed by (i) one-shot imitation on variations within the training tasks, (ii) one-shot imitation on new tasks, and (iii) fine-tuning  ...  For consistency and comparison purposes, we first train and evaluate single-task agents (as done in prior few-shot imitation work).  ...  One-shot imitation learning.  ... 
arXiv:2110.13423v2 fatcat:dxstakcokjgghabxh65y75iweq

BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning [article]

Eric Jang, Alex Irpan, Mohi Khansari, Daniel Kappler, Frederik Ebert, Corey Lynch, Sergey Levine, Chelsea Finn
2022 arXiv   pre-print
We approach the challenge from an imitation learning perspective, aiming to study how scaling and broadening the data collected can facilitate such generalization.  ...  To that end, we develop an interactive and flexible imitation learning system that can learn from both demonstrations and interventions and can be conditioned on different forms of information that convey  ...  Daniel Kappler built the data annotation visualizer. Corey Lynch advised Frederik's internship and gave pointers on language models. Sergey Levine and Chelsea Finn supervised the project.  ... 
arXiv:2202.02005v1 fatcat:v2hr2vhlsjhubbngedinluefge
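
The snippet mentions conditioning the imitation policy on "different forms of information that convey" the task, such as a language command or a video of the task. A minimal sketch of task-conditioned behavior cloning in that spirit; the dimensions and names are our assumptions, not the BC-Z system:

```python
# Minimal sketch of task-conditioned behavior cloning: the policy sees the
# observation plus a task embedding and regresses onto demonstrated actions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionedPolicy(nn.Module):
    def __init__(self, obs_dim=32, task_dim=16, act_dim=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + task_dim, 128), nn.ReLU(),
            nn.Linear(128, act_dim))

    def forward(self, obs, task_emb):
        return self.net(torch.cat([obs, task_emb], dim=-1))

policy = ConditionedPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
obs, task, act = torch.randn(64, 32), torch.randn(64, 16), torch.randn(64, 7)
loss = F.mse_loss(policy(obs, task), act)  # behavior-cloning regression
opt.zero_grad(); loss.backward(); opt.step()
# Zero-shot use: embed an unseen command and run the same policy unchanged.
```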

Visual Adversarial Imitation Learning using Variational Models [article]

Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn
2022 arXiv   pre-print
Towards addressing these challenges, we develop a variational model-based adversarial imitation learning (V-MAIL) algorithm.  ...  In contrast, providing visual demonstrations of desired behaviors often presents an easier and more natural way to teach agents.  ...  Algorithm 2 Zero-Shot Transfer with V-MAIL  ...  prior model-free imitation learning approaches, and behavior cloning on five visual imitation tasks.  ... 
arXiv:2107.08829v2 fatcat:lp5atolewne37kcflwhfxfc76u
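
The title and snippet name a variational model-based adversarial imitation algorithm. The sketch below shows only the adversarial half, a GAIL-style discriminator whose logit doubles as a learned reward; the variational world model is omitted, and every name is illustrative:

```python
# Toy sketch of adversarial imitation: a discriminator D learns to separate
# expert states from policy states, and its logit yields a learned reward.
import torch
import torch.nn as nn
import torch.nn.functional as F

D = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(D.parameters(), lr=1e-3)

expert_s, policy_s = torch.randn(32, 16), torch.randn(32, 16)
# Discriminator update: push expert states toward 1, policy states toward 0.
le, lp = D(expert_s), D(policy_s)
d_loss = (F.binary_cross_entropy_with_logits(le, torch.ones_like(le)) +
          F.binary_cross_entropy_with_logits(lp, torch.zeros_like(lp)))
opt.zero_grad(); d_loss.backward(); opt.step()

# Imitation reward for the policy (trained inside the learned model):
# -log(1 - sigmoid(logit)) == softplus(logit).
reward = F.softplus(D(policy_s)).detach()
```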

Joint Hypergraph Learning using feature fusion for Image Retrieval

2020 International Journal of Innovative Technology and Exploring Engineering (IJITEE), Volume 8, Issue 10  
In this research field, label information and various visual features have been explored. However, most existing methods use these visual features independently or sequentially.  ...  In this paper, we propose a global and local visual feature fusion approach to learn the relevance of images via a hypergraph.  ...  Subsequently, a small simulation combines the various visual features to improve image retrieval accuracy.  ... 
doi:10.35940/ijitee.h6474.0891020 fatcat:byco263w75av7apwwwxxoluwvi

Generalization Through Hand-Eye Coordination: An Action Space for Learning Spatially-Invariant Visuomotor Control [article]

Chen Wang, Rui Wang, Ajay Mandlekar, Li Fei-Fei, Silvio Savarese, Danfei Xu
2021 arXiv   pre-print
Through a set of challenging multi-stage manipulation tasks, we show that a visuomotor policy equipped with HAN is able to inherit the key spatial invariance property of hand-eye coordination and achieve zero-shot  ...  Imitation Learning (IL) is an effective framework to learn visuomotor skills from offline demonstration data.  ...  EXPERIMENTS: In this section, we seek to answer the following questions: (1) Does including HAN in a deep imitation learning pipeline improve the task performance and zero-shot generalization ability?  ... 
arXiv:2103.00375v2 fatcat:iwzmwsmbmjfclekqf4xyejxvza

Learning from Observation-Only Demonstration for Task-Oriented Language Grounding via Self-Examination

Tsu-Jui Fu, Yuta Tsuboi, Sosuke Kobayashi, Yuta Kikuchi
2019 Neural Information Processing Systems  
Combining imitation with natural language instruction promises to further make imitation learning more flexible and useful in real-world applications.  ...  Imitation learning is an effective method for learning a control policy from expert demonstrations.  ...  Detailed Analysis. Zero-shot Generalization: To investigate the generalizability of visual-language grounding, we evaluate under a zero-shot setting where new combinations of attribute-object pairs are unseen  ... 
dblp:conf/nips/FuTKK19 fatcat:xtsyvbosz5eurdxxodasyhvqb4

One-Shot Visual Imitation Learning via Meta-Learning [article]

Chelsea Finn, Tianhe Yu, Tianhao Zhang, Pieter Abbeel, Sergey Levine
2017 arXiv   pre-print
Unlike prior methods for one-shot imitation, our method can scale to raw pixel inputs and requires data from significantly fewer prior tasks for effective learning of new skills.  ...  Our experiments on both simulated and real robot platforms demonstrate the ability to learn new tasks, end-to-end, from a single visual demonstration.  ...  Discussion and Future Work: We proposed a method for one-shot visual imitation learning that can learn to perform tasks using visual inputs from just a single demonstration.  ... 
arXiv:1709.04905v1 fatcat:vz5ykqgh3zg3nae6kzv77h5v2e
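
The abstract describes learning new tasks end-to-end from a single visual demonstration via meta-learning. A toy MAML-style sketch of that recipe follows, assuming vector observations in place of pixels; the names are ours, not the paper's code:

```python
# Toy MAML-style sketch of one-shot imitation: adapt the policy with one
# behavior-cloning gradient step on a single demonstration, then meta-train
# so the adapted policy imitates a held-out trajectory of the same task.
import torch
import torch.nn as nn
import torch.nn.functional as F

policy = nn.Linear(16, 4)  # toy policy: observation -> action
meta_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
inner_lr = 0.1

def adapted_actions(obs, demo_obs, demo_act):
    # Inner loop: one gradient step of behavior cloning on the single demo.
    loss = F.mse_loss(policy(demo_obs), demo_act)
    grads = torch.autograd.grad(loss, list(policy.parameters()),
                                create_graph=True)
    w, b = (p - inner_lr * g for p, g in zip(policy.parameters(), grads))
    return obs @ w.t() + b  # act with the adapted weights

for _ in range(3):  # outer loop over sampled tasks
    demo_obs, demo_act = torch.randn(10, 16), torch.randn(10, 4)
    test_obs, test_act = torch.randn(10, 16), torch.randn(10, 4)
    meta_loss = F.mse_loss(adapted_actions(test_obs, demo_obs, demo_act),
                           test_act)
    meta_opt.zero_grad(); meta_loss.backward(); meta_opt.step()
```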

FlowControl: Optical Flow Based Visual Servoing [article]

Max Argus and Lukas Hermann and Jon Long and Thomas Brox
2020 arXiv   pre-print
We present a practical method for realizing one-shot imitation for manipulation tasks, exploiting modern learning-based optical flow to perform real-time visual servoing.  ...  One-shot imitation is the vision of robot programming from a single demonstration, rather than by tedious construction of computer code.  ...  Few-shot imitation from videos is an appealing alternative to overcome this problem, as videos typically capture all task-relevant information.  ... 
arXiv:2007.00291v1 fatcat:7whjrzathfdxfmpcjhmh2usdr4
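
The snippet states the mechanism plainly: learning-based optical flow between the demonstration frame and the live frame drives real-time visual servoing. A small sketch under that reading, using OpenCV's Farneback flow as a stand-in for a learned flow model; the gain, mask, and function names are illustrative assumptions:

```python
# Sketch of flow-based visual servoing: optical flow from the live frame to
# the demonstration frame is averaged over an object mask and turned into a
# corrective motion command.
import numpy as np
import cv2

def servo_step(current_gray, demo_gray, mask, gain=0.01):
    # Dense flow telling each pixel of the current frame where to move to
    # line up with the demonstration frame.
    flow = cv2.calcOpticalFlowFarneback(
        current_gray, demo_gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    dx = flow[..., 0][mask].mean()
    dy = flow[..., 1][mask].mean()
    # Map the mean image-space error to a small end-effector translation
    # (camera x/y only; depth and rotation are omitted in this sketch).
    return np.array([gain * dx, gain * dy])

cur = cv2.GaussianBlur(
    np.random.randint(0, 255, (64, 64), np.uint8), (9, 9), 3)
demo = np.roll(cur, 3, axis=1)  # demonstration frame shifted 3 px right
print(servo_step(cur, demo, np.ones((64, 64), bool)))
```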
Showing results 1 — 15 out of 7,117 results