Self-Imitation Learning
[article]
2018
arXiv
pre-print
This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent's past good decisions. ...
Self-Imitation Learning The goal of self-imitation learning (SIL) is to imitate the agent's past good experiences in the actor-critic framework. ...
Thus, the agent can benefit more from self-imitation learning because it captures such rare experiences and learns from them. ...
arXiv:1806.05635v1
fatcat:a5i3fgkt2fajnl53de4erutjwy
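The clipped-advantage objective sketched in this abstract can be written down compactly. The function below is an illustrative sketch, not the authors' code: only the max(R − V, 0) form is taken from the snippet, while the function name and the β weighting are assumptions.

```python
def sil_loss(log_prob, value, ret, beta=0.01):
    """Per-sample self-imitation loss (illustrative sketch).

    log_prob: log pi(a|s) for the stored action
    value:    current value estimate V(s)
    ret:      discounted return R observed from this state onward
    """
    # Only imitate when the past return beat the current value estimate,
    # i.e. the transition was a "good" past decision worth reproducing.
    advantage = max(ret - value, 0.0)
    policy_loss = -log_prob * advantage        # push pi toward the stored action
    value_loss = 0.5 * beta * advantage ** 2   # pull V(s) up toward the return
    return policy_loss + value_loss
```

Because the advantage is clipped at zero, transitions whose return fell short of the value estimate contribute no gradient, which is what restricts imitation to past good decisions.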
Self-Imitation Advantage Learning
[article]
2020
arXiv
pre-print
Self-imitation learning is a Reinforcement Learning (RL) method that encourages actions whose returns were higher than expected, which helps in hard exploration and sparse reward problems. ...
We propose SAIL, a novel generalization of self-imitation learning for off-policy RL, based on a modification of the Bellman optimality operator that we connect to Advantage Learning. ...
RELATED WORK Extending self-imitation learning. Guo et al. ...
arXiv:2012.11989v1
fatcat:u3h7ugxgzzf6zapvmzivfabg5u
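The Advantage Learning operator that SAIL connects to modifies the Bellman optimality target by subtracting a fraction of the action gap. A minimal sketch of that operator follows; SAIL's actual operator further mixes in a return-based self-imitation bound, which is not reproduced here, and the parameter names are assumptions.

```python
def advantage_learning_target(reward, gamma, next_q_max, q_sa, v_s, alpha=0.5):
    # Standard Bellman optimality target: r + gamma * max_a' Q(s', a') ...
    bellman = reward + gamma * next_q_max
    # ... minus a fraction alpha of the action gap V(s) - Q(s, a), which
    # widens the value gap between the greedy action and the rest.
    return bellman - alpha * (v_s - q_sa)
```

With alpha = 0 this reduces exactly to the ordinary Bellman target, so the modification can be dialed in smoothly.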
Self-Imitation Learning via Generalized Lower Bound Q-learning
[article]
2021
arXiv
pre-print
Self-imitation learning motivated by lower-bound Q-learning is a novel and effective approach for off-policy learning. ...
In this work, we propose an n-step lower bound which generalizes the original return-based lower-bound Q-learning, and introduce a new family of self-imitation learning algorithms. ...
based self-imitation learning [8]. ...
arXiv:2006.07442v3
fatcat:lwuisb33u5ca5cx6f4w3vepwf4
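The return-based lower bound can be read as a one-sided regression loss. The sketch below is an illustrative reading of lower-bound Q-learning, not the paper's implementation; the paper's contribution is precisely the n-step generalization of this bound, which is omitted here.

```python
def lower_bound_q_loss(q_value, mc_return):
    # The Monte Carlo return of *any* behaviour policy lower-bounds the
    # optimal value Q*(s, a), so Q is penalised only when it falls below
    # an observed return; overestimates are left to ordinary TD learning.
    shortfall = max(mc_return - q_value, 0.0)
    return 0.5 * shortfall ** 2
```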
Self-Imitation Learning by Planning
[article]
2021
arXiv
pre-print
In this work, we solve this problem using our proposed approach called self-imitation learning by planning (SILP), where demonstration data are collected automatically by planning on the visited states ...
Imitation learning (IL) enables robots to acquire skills quickly by transferring expert knowledge, which is widely adopted in reinforcement learning (RL) to initialize exploration. ...
As we plan demonstrations online automatically by utilizing the self-generated states from the current policy, we name our approach self-imitation learning by planning (SILP). ...
arXiv:2103.13834v2
fatcat:qofm3ggeljgfdn3exrydwajqli
Generative Adversarial Self-Imitation Learning
[article]
2018
arXiv
pre-print
This paper explores a simple regularizer for reinforcement learning by proposing Generative Adversarial Self-Imitation Learning (GASIL), which encourages the agent to imitate past good trajectories via a generative adversarial imitation learning framework. ...
Generative Adversarial Self-Imitation Learning The main idea of Generative Adversarial Self-Imitation Learning (GASIL) is to update the policy to imitate past good trajectories using the GAIL framework (see ...
arXiv:1812.00950v1
fatcat:tataf5lntng5fhpp34btya6kbe
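In the GAIL framework that GASIL builds on, the discriminator's output is converted into a reward signal. A minimal sketch follows; the −log(1 − D) form and the epsilon clamp are standard GAIL choices assumed here, not details given in the snippet above.

```python
import math

def gasil_reward(d_prob, eps=1e-8):
    # d_prob = D(s, a): the discriminator's probability that the state-action
    # pair came from one of the agent's stored good past trajectories.
    # The reward grows as the policy becomes indistinguishable from its best
    # past behaviour, mirroring GAIL's -log(1 - D) surrogate reward.
    return -math.log(1.0 - d_prob + eps)
```

Because this reward is dense even when the environment reward is sparse, imitating past good trajectories acts as the shaping regularizer the abstract describes.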
Learning Self-Imitating Diverse Policies
[article]
2019
arXiv
pre-print
In this work, we introduce a self-imitation learning algorithm that exploits and explores well in the sparse and episodic reward settings. ...
We then discuss limitations of self-imitation learning, and propose to solve them by using Stein variational policy gradient descent with the Jensen-Shannon kernel to learn multiple diverse policies. ...
This algorithm can be seen as self-imitation learning, in which the expert trajectories in the experience replays are self-generated by the agent during the course of learning, rather than using some external ...
arXiv:1805.10309v2
fatcat:3wjitdpiffdxjmiibxvanafdqe
Self-Imitation Learning for Robot Tasks with Sparse and Delayed Rewards
[article]
2021
arXiv
pre-print
In this paper, we propose a practical self-imitation learning method named Self-Imitation Learning with Constant Reward (SILCR). ...
The application of reinforcement learning (RL) in robotic control is still limited in the environments with sparse and delayed rewards. ...
Fig. 1: Our self-imitation learning framework for robot learning. ...
arXiv:2010.06962v3
fatcat:bjdjkj7o7jcotoj7nzexrw73ku
Using Self-Imitation to Direct Learning
2006
ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication
Self-imitation is where an agent is able to learn and replicate actions it has experienced through the manipulation of its body by another. ...
An evolutionary predecessor to observational imitation may have been self-imitation. ...
They support a form of self-imitation that may be the natural precursor to more complex forms of imitative learning. In our framework we use the idea of putting through directly. ...
doi:10.1109/roman.2006.314425
dblp:conf/ro-man/SaundersND06
fatcat:tpnywxoulfaxxfzcb4vqtcyobu
Episodic Self-Imitation Learning with Hindsight
2020
Electronics
Episodic self-imitation learning, a novel self-imitation algorithm with a trajectory selection module and an adaptive loss function, is proposed to speed up reinforcement learning. ...
Compared to the original self-imitation learning algorithm, which samples good state–action pairs from the experience replay buffer, our agent leverages entire episodes with hindsight to aid self-imitation ...
Figure 1. Illustration of the difference between self-imitation learning (SIL) + hindsight experience replay (HER) and episodic self-imitation learning (ESIL). ...
doi:10.3390/electronics9101742
fatcat:rxpcgcsn2zcstfugpzjdzqcsre
Self-Imitation Learning of Locomotion Movements through Termination Curriculum
[article]
2019
arXiv
pre-print
In this paper, we propose and evaluate a novel combination of techniques for accelerating the learning of stable locomotion movements through self-imitation learning of synthetic animations. ...
This allows us to use reinforcement learning with Reference State Initialization (RSI) to find a neural network controller for imitating the synthesized reference motion. ...
In this paper, we propose a self-imitation learning approach for enabling rapid learning of stable locomotion controllers. ...
arXiv:1907.11842v2
fatcat:fe6yxsd3svbepd5ctka66jbaza
Self-Supervised Disentangled Representation Learning for Third-Person Imitation Learning
[article]
2021
arXiv
pre-print
Humans learn to imitate by observing others. However, robot imitation learning generally requires expert demonstrations in the first-person view (FPV). ...
Third-person imitation learning (TPIL) is the concept of learning action policies by observing other agents in a third-person view (TPV), similar to what humans do. ...
TCN [5] uses a time-contrastive way to learn representations by self-supervised metric learning. ...
arXiv:2108.01069v1
fatcat:w4kswzgqiffa3ifwn34src4s24
Multimodal imitation using self-learned sensorimotor representations
2016
2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
self-learned multimodal sensorimotor relations, without the need to solve inverse kinematics problems or formulate explicit analytical models. ...
We evaluate the proposed method on a humanoid iCub robot learning to interact with a piano keyboard and imitating a human demonstration. ...
Multimodal information is then crucial to improve skills and learned self-representations. Imitation learning methods have been shown effective in enhancing complex robot skills [1, 2]. ...
doi:10.1109/iros.2016.7759582
dblp:conf/iros/ZambelliD16
fatcat:3pjp76nvcvd2lmqxa2w3ntb5pa
Self-Practice Imitation Learning from Weak Policy
[chapter]
2013
Lecture Notes in Computer Science
Imitation learning is an effective strategy for reinforcement learning, which avoids the delayed reward problem by learning from mentor-demonstrated trajectories. ...
A limitation for imitation learning is that collecting sufficient qualified demonstrations is quite expensive. ...
The LEWE Framework We propose the LEarning from WEak policy (LEWE) framework that outlines the self-improvement procedure for an agent, as shown in Algorithm 1. ...
doi:10.1007/978-3-642-40705-5_2
fatcat:rgsn57qubva2dctz5vcbs4764y
Learning intuitive physics and one-shot imitation using state-action-prediction self-organizing maps
[article]
2021
arXiv
pre-print
Humans seem to learn rich representations by exploration and imitation, build causal models of the world, and use both to flexibly solve new tasks. ...
Human learning and intelligence work differently from the supervised pattern recognition approach adopted in most deep learning architectures. ...
There are similar approaches which address intuitive physics learning on the basis of self-organizing maps. ...
arXiv:2007.01647v3
fatcat:wrv62evc25b4rbzlgu7b6j2zjm
Common Sensorimotor Representation for Self-initiated Imitation Learning
[chapter]
2012
Lecture Notes in Computer Science
This paper reports on a series of experiments comparing these two alternatives for self-initiated imitation tasks. ...
Internal representation is an important design decision in any imitation learning system. Actions and perceptual spaces were separate in classical AI due to the standard sense-process-act loop. ...
Practically, what is most important in self-initiated imitation is having high accuracy in marking the boundaries of behaviors to be learned. ...
doi:10.1007/978-3-642-31087-4_40
fatcat:md6bokzh3rcgvenzbwo5oz5qyi