Towards a Fatality-Aware Benchmark of Probabilistic Reaction Prediction in Highly Interactive Driving Scenarios
[article]
2018
arXiv
pre-print
(PGM), neural networks (NN) and inverse reinforcement learning (IRL). ...
We employ prototype trajectories with designated motion patterns other than "intention" to homogenize the representation so that probabilities corresponding to each trajectory generated by different methods ...
Inverse reinforcement learning (IRL): Inverse reinforcement learning allows us to learn the cost functions of humans by observing their behavior. ...
arXiv:1809.03478v1
fatcat:uujngpf6azgd7i3hx3rvlqssdy
Robot Skill Learning: From Reinforcement Learning to Evolution Strategies
2013
Paladyn: Journal of Behavioral Robotics
Owing to current trends involving searching in parameter space (rather than action space) and using reward-weighted averaging (rather than gradient estimation), reinforcement learning algorithms for policy ...
Abstract: Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. ...
simplification of the learning problem by using DMPs. ...
doi:10.2478/pjbr-2013-0003
fatcat:j3xthm6ywrclzlxjksiwm5ivk4
Learning motion primitive goals for robust manipulation
2011
2011 IEEE/RSJ International Conference on Intelligent Robots and Systems
To address these challenges we 1) present a simplified, computationally more efficient version of our model-free reinforcement learning algorithm PI²; 2) extend PI² so that it simultaneously learns ...
Applying model-free reinforcement learning to manipulation remains challenging for several reasons. First, manipulation involves physical contact, which causes discontinuous cost functions. ...
Policy Improvement for Reinforcement Learning: The shape parameters θ are commonly acquired through imitation learning, i.e. a DMP is trained with an observed trajectory through supervised learning [5] ...
doi:10.1109/iros.2011.6094877
dblp:conf/iros/StulpTKPRS11
fatcat:skiywl4bhnfupfgwbwlpu7s7su
Learning motion primitive goals for robust manipulation
2011
2011 IEEE/RSJ International Conference on Intelligent Robots and Systems
To address these challenges we 1) present a simplified, computationally more efficient version of our model-free reinforcement learning algorithm PI²; 2) extend PI² so that it simultaneously learns ...
Applying model-free reinforcement learning to manipulation remains challenging for several reasons. First, manipulation involves physical contact, which causes discontinuous cost functions. ...
Policy Improvement for Reinforcement Learning: The shape parameters θ are commonly acquired through imitation learning, i.e. a DMP is trained with an observed trajectory through supervised learning [5] ...
doi:10.1109/iros.2011.6048517
fatcat:qhlkglxx3rfkboxszardbpxe5y
Momentum control with hierarchical inverse dynamics on a torque-controlled humanoid
2015
Autonomous Robots
Using a reformulation of existing algorithms, we propose a simplification of the problem that allows to achieve real-time control. ...
Our results demonstrate that hierarchical inverse dynamics together with momentum control can be efficiently used for feedback control under real robot conditions. ...
EKF, UKF), trajectory optimization, data analysis and machine learning techniques (supervised, unsupervised and reinforcement learning) • Experience writing realtime-safe software, interfacing with sensors ...
doi:10.1007/s10514-015-9476-6
fatcat:sc63j3a3anac5pqi36v3tvxz2y
Feedback Control For Cassie With Deep Reinforcement Learning
[article]
2018
arXiv
pre-print
Deep reinforcement learning (DRL) offers a promising model-free approach for controlling bipedal locomotion which can more fully exploit the dynamics. ...
By formulating a feedback control problem as finding the optimal policy for a Markov Decision Process, we are able to learn robust walking controllers that imitate a reference motion with DRL. ...
Deep reinforcement learning (DRL), on the other hand, provides a method to develop controllers in a model-free manner, albeit with its own learning inefficiencies. ...
arXiv:1803.05580v2
fatcat:pzintp2yerc3bmqwbuxnjnygqm
Representation and Reinforcement Learning for Personalized Glycemic Control in Septic Patients
[article]
2017
arXiv
pre-print
The result demonstrates that reinforcement learning with appropriate patient state encoding can potentially provide optimal glycemic trajectories and allow clinicians to design a personalized strategy ...
We encoded patient states using a sparse autoencoder and adopted a reinforcement learning paradigm using policy iteration to learn the optimal policy from data. ...
In this study, we proposed and explored the reinforcement learning (RL) paradigm to learn the policy for choosing personalized optimal glycemic trajectories using retrospective data. ...
arXiv:1712.00654v1
fatcat:kxrzeqqiqfhmdewifozascqgny
Online Multi-Task Learning for Policy Gradient Methods
2014
International Conference on Machine Learning
To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring knowledge between tasks to accelerate learning. ...
Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control. ...
Online Multi-Task Learning for Policy Gradient Methods ...
dblp:conf/icml/Bou-AmmarERT14
fatcat:tym54bnvf5dm5eh74szqsraxfm
Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning
[article]
2022
arXiv
pre-print
In this work, we propose a novel framework utilizing Adversarial Inverse Reinforcement Learning that can provide global explanations for decisions made by a Reinforcement Learning model and capture intuitive ...
Unfortunately, current black box nature of machine learning models is still an unresolved issue and this very nature prevents researchers from learning and providing explicative descriptions for a model's ...
This particular approach to inverse reinforcement learning is named Adversarial Inverse Reinforcement Learning (AIRL). ...
arXiv:2203.16464v1
fatcat:usjqjptwozfovf2nuvq6jwyo64
On Global Optimization of Walking Gaits for the Compliant Humanoid Robot, COMAN Using Reinforcement Learning
2012
Cybernetics and Information Technologies
In this paper a 15 degrees of Freedom dynamic model of a compliant humanoid robot is used, combined with reinforcement learning to perform global search in the parameter space to produce stable gaits. ...
In ZMP trajectory generation using simple models, often a considerable amount of trials and errors are involved to obtain locally stable gaits by manually tuning the gait parameters. ...
In Section 3 the trajectory generation method is briefly described. The reinforcement learning algorithm is described in Section 4. ...
doi:10.2478/cait-2012-0020
fatcat:wpan3fe2rzhi5gwme6pmpqbh4u
MineRL: A Large-Scale Dataset of Minecraft Demonstrations
[article]
2019
arXiv
pre-print
However, existing datasets compatible with reinforcement learning simulators do not have sufficient scale, structure, and quality to enable the further development and evaluation of methods focused on ...
The sample inefficiency of standard deep reinforcement learning methods precludes their application to many real-world problems. ...
It currently contains data for six tasks, none of which can be fully solved with standard deep reinforcement learning methods. ...
arXiv:1907.13440v1
fatcat:63khufur7nd73fb5b43f5jf5gm
Exploiting previous experience to constrain robot sensorimotor learning
2011
2011 11th IEEE-RAS International Conference on Humanoid Robots
This significantly reduces the number of test trials needed by standard reinforcement learning techniques. ...
The first stage is based on the generalization of previously trained movements associated with a specific task, which results in a first approximation of a suitable control policy in a new situation. ...
The goal of this paper is to speed up the learning process by combining the ideas from imitation and reinforcement learning with statistical generalization [5]. ...
doi:10.1109/humanoids.2011.6100913
dblp:conf/humanoids/NemecVU11
fatcat:bco5ylyn7nbpzl2745g7azyrgi
The Adaptive Stress Testing Formulation
[article]
2020
arXiv
pre-print
We also provide three examples of validation problems formulated to work with AST. ...
Therefore, approximate validation methods are needed to tractably find failures without unsafe simplifications. ...
A specific reward function structure is then used with reinforcement learning algorithms in order to identify the most-likely failure of a system in a scenario. ...
arXiv:2004.04293v1
fatcat:deuwivp2fnaznkpyafcytt6eca
Adversarial Imitation Learning via Random Search
2019
2019 International Joint Conference on Neural Networks (IJCNN)
A derivative-free optimization based reinforcement learning and the simplification on policies obtain competitive performance on the dynamic complex tasks. ...
In this paper, we propose an imitation learning method that takes advantage of the derivative-free optimization with simple linear policies. ...
Introduction: In 2013, the Deep Q-Learning showed how to combine classical Q-Learning with convolution neural network to successfully solve Atari games, reinvigorating reinforcement learning (RL) as one ...
doi:10.1109/ijcnn.2019.8852307
dblp:conf/ijcnn/ShinK19
fatcat:vdd7fy5frfe7lknvrfqxjiez7m
Reinforcement learning of full-body humanoid motor skills
2010
2010 10th IEEE-RAS International Conference on Humanoid Robots
Applying reinforcement learning to humanoid robots is challenging because humanoids have a large number of degrees of freedom and state and action spaces are continuous. ...
In this paper, we present a probabilistic reinforcement learning approach, which is derived from the framework of stochastic optimal control and path integrals. ...
acquire an accurate model of the robot and its interaction with the environment to enable more efficient model-based reinforcement learning. ...
doi:10.1109/ichr.2010.5686320
dblp:conf/humanoids/StulpBTS10
fatcat:fleceoscynbpxculcflwkoog5i