11,671 Hits in 2.8 sec

Towards a Fatality-Aware Benchmark of Probabilistic Reaction Prediction in Highly Interactive Driving Scenarios [article]

Wei Zhan, Liting Sun, Yeping Hu, Jiachen Li, Masayoshi Tomizuka
2018 arXiv   pre-print
(PGM), neural networks (NN) and inverse reinforcement learning (IRL).  ...  We employ prototype trajectories with designated motion patterns other than "intention" to homogenize the representation so that probabilities corresponding to each trajectory generated by different methods  ...  Inverse reinforcement learning (IRL): Inverse reinforcement learning allows us to learn the cost functions of humans by observing their behavior.  ... 
arXiv:1809.03478v1 fatcat:uujngpf6azgd7i3hx3rvlqssdy
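
The entry above homogenizes different predictors by assigning a probability to each prototype trajectory, with IRL recovering cost functions from observed behavior. As a rough illustration only (not the paper's method), one common way to map learned per-trajectory costs to probabilities is a softmax over negative cost; the costs and temperature below are hypothetical.

```python
# Rough illustration (not the paper's method): convert learned per-trajectory
# costs into a probability distribution over prototype trajectories.
import numpy as np

def trajectory_probabilities(costs, temperature=1.0):
    """Softmax over negative costs: lower cost -> higher probability."""
    logits = -np.asarray(costs, dtype=float) / temperature
    logits -= logits.max()                 # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Three prototype trajectories with hypothetical learned costs.
print(trajectory_probabilities([2.0, 0.5, 1.2]))
```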

Robot Skill Learning: From Reinforcement Learning to Evolution Strategies

Freek Stulp, Olivier Sigaud
2013 Paladyn: Journal of Behavioral Robotics  
Owing to current trends involving searching in parameter space (rather than action space) and using reward-weighted averaging (rather than gradient estimation), reinforcement learning algorithms for policy  ...  Abstract: Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function.  ...  simplification of the learning problem by using DMPs.  ... 
doi:10.2478/pjbr-2013-0003 fatcat:j3xthm6ywrclzlxjksiwm5ivk4
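
The trend highlighted above, exploring in parameter space and updating by reward-weighted averaging rather than by estimating gradients, can be illustrated with a generic black-box policy-improvement loop in the spirit of PI^BB / evolution strategies. This is a sketch under those assumptions, not the authors' implementation; evaluate_rollout is a hypothetical episode-cost function.

```python
# Generic sketch of parameter-space exploration with reward-weighted averaging;
# not the authors' code. evaluate_rollout(theta) -> scalar episode cost.
import numpy as np

def improve(theta, evaluate_rollout, n_samples=10, sigma=0.1, h=10.0, iters=50):
    for _ in range(iters):
        eps = sigma * np.random.randn(n_samples, theta.size)  # explore parameters
        costs = np.array([evaluate_rollout(theta + e) for e in eps])
        # Reward-weighted averaging: exponentiate normalized costs, no gradient.
        c = (costs - costs.min()) / (costs.max() - costs.min() + 1e-12)
        w = np.exp(-h * c)
        w /= w.sum()
        theta = theta + w @ eps                               # weighted update
    return theta
```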

Learning motion primitive goals for robust manipulation

Freek Stulp, Evangelos Theodorou, Mrinal Kalakrishnan, Peter Pastor, Ludovic Righetti, Stefan Schaal
2011 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems  
To address these challenges we 1) present a simplified, computationally more efficient version of our model-free reinforcement learning algorithm PI^2; 2) extend PI^2 so that it simultaneously learns  ...  Applying model-free reinforcement learning to manipulation remains challenging for several reasons. First, manipulation involves physical contact, which causes discontinuous cost functions.  ...  Policy Improvement for Reinforcement Learning: The shape parameters θ are commonly acquired through imitation learning, i.e. a DMP is trained with an observed trajectory through supervised learning [5]  ... 
doi:10.1109/iros.2011.6094877 dblp:conf/iros/StulpTKPRS11 fatcat:skiywl4bhnfupfgwbwlpu7s7su
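
To picture the "simultaneously learns" part above (DMP shape parameters and goal updated together), one can treat the goal as extra dimensions of the exploration vector handed to any episodic black-box update. This is a schematic sketch, not the paper's PI^2 variant; rollout_cost and black_box_update are hypothetical.

```python
# Schematic only: pack DMP shape parameters theta and goal g into one vector,
# let an episodic black-box update (e.g. a PI^2-style update) improve both,
# then split the result. rollout_cost and black_box_update are hypothetical.
import numpy as np

def improve_shape_and_goal(theta, goal, rollout_cost, black_box_update):
    theta, goal = np.asarray(theta, dtype=float), np.asarray(goal, dtype=float)
    packed = np.concatenate([theta, goal])
    cost_fn = lambda p: rollout_cost(p[:theta.size], p[theta.size:])
    new_packed = black_box_update(packed, cost_fn)
    return new_packed[:theta.size], new_packed[theta.size:]
```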

Momentum control with hierarchical inverse dynamics on a torque-controlled humanoid

Alexander Herzog, Nicholas Rotella, Sean Mason, Felix Grimminger, Stefan Schaal, Ludovic Righetti
2015 Autonomous Robots  
Using a reformulation of existing algorithms, we propose a simplification of the problem that allows us to achieve real-time control.  ...  Our results demonstrate that hierarchical inverse dynamics together with momentum control can be efficiently used for feedback control under real robot conditions.  ... 
doi:10.1007/s10514-015-9476-6 fatcat:sc63j3a3anac5pqi36v3tvxz2y
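
The entry above resolves several motion tasks (momentum, posture, swing foot) into joint-space commands with a hierarchy of quadratic programs. As a much-simplified stand-in, the sketch below trades tasks off with soft weights in a single regularized least-squares solve; the strict priorities, contact constraints and torque limits of the paper are omitted, and all matrices are hypothetical inputs.

```python
# Greatly simplified stand-in for task-space inverse dynamics: resolve several
# acceleration tasks J_i qdd ~ a_i with soft weights in one least-squares solve.
# The paper instead uses a cascade of QPs with strict priorities and constraints.
import numpy as np

def weighted_task_resolution(jacobians, task_accels, weights, reg=1e-6):
    rows_A, rows_b = [], []
    for J, a_des, w in zip(jacobians, task_accels, weights):
        rows_A.append(np.sqrt(w) * np.asarray(J, dtype=float))
        rows_b.append(np.sqrt(w) * np.asarray(a_des, dtype=float))
    A = np.vstack(rows_A)
    b = np.concatenate(rows_b)
    # Tikhonov regularization keeps the solve well posed for redundant robots.
    return np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ b)
```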

Feedback Control For Cassie With Deep Reinforcement Learning [article]

Zhaoming Xie, Glen Berseth, Patrick Clary, Jonathan Hurst, Michiel van de Panne
2018 arXiv   pre-print
Deep reinforcement learning (DRL) offers a promising model-free approach for controlling bipedal locomotion which can more fully exploit the dynamics.  ...  By formulating a feedback control problem as finding the optimal policy for a Markov Decision Process, we are able to learn robust walking controllers that imitate a reference motion with DRL.  ...  Deep reinforcement learning (DRL), on the other hand, provides a method to develop controllers in a model-free manner, albeit with its own learning inefficiencies.  ... 
arXiv:1803.05580v2 fatcat:pzintp2yerc3bmqwbuxnjnygqm
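
The formulation above (feedback control as an MDP whose policy imitates a reference motion) is typically driven by a tracking-style reward. The sketch below shows one common shape of such a reward; the weights, scales and state fields are hypothetical, not the paper's exact terms.

```python
# Hedged sketch of a reference-tracking reward for learning locomotion by
# imitating a reference motion; all coefficients are made-up placeholders.
import numpy as np

def imitation_reward(q, q_ref, qd, qd_ref, w_pos=0.7, w_vel=0.3, scale=5.0):
    """Higher reward the closer joint positions/velocities track the reference."""
    pos_err = np.sum((np.asarray(q) - np.asarray(q_ref)) ** 2)
    vel_err = np.sum((np.asarray(qd) - np.asarray(qd_ref)) ** 2)
    return w_pos * np.exp(-scale * pos_err) + w_vel * np.exp(-scale * vel_err)
```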

Representation and Reinforcement Learning for Personalized Glycemic Control in Septic Patients [article]

Wei-Hung Weng, Mingwu Gao, Ze He, Susu Yan, Peter Szolovits
2017 arXiv   pre-print
The result demonstrates that reinforcement learning with appropriate patient state encoding can potentially provide optimal glycemic trajectories and allow clinicians to design a personalized strategy  ...  We encoded patient states using a sparse autoencoder and adopted a reinforcement learning paradigm using policy iteration to learn the optimal policy from data.  ...  In this study, we proposed and explored the reinforcement learning (RL) paradigm to learn the policy for choosing personalized optimal glycemic trajectories using retrospective data.  ... 
arXiv:1712.00654v1 fatcat:kxrzeqqiqfhmdewifozascqgny
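
The pipeline above encodes patient states and then runs policy iteration on the resulting (discretized) decision process. As a stand-in for that RL step, here is textbook tabular policy iteration; the transition model P is synthetic, not clinical data.

```python
# Textbook tabular policy iteration; stands in for the RL step described above.
# P[s][a] -> list of (prob, next_state, reward) tuples; all data is synthetic.
import numpy as np

def policy_iteration(P, n_states, n_actions, gamma=0.99, tol=1e-8):
    policy = np.zeros(n_states, dtype=int)
    V = np.zeros(n_states)
    while True:
        # Policy evaluation: iterative sweeps until the value function settles.
        while True:
            delta = 0.0
            for s in range(n_states):
                v = sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][policy[s]])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the current values.
        stable = True
        for s in range(n_states):
            q = [sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                 for a in range(n_actions)]
            best = int(np.argmax(q))
            if best != policy[s]:
                policy[s] = best
                stable = False
        if stable:
            return policy, V
```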

Online Multi-Task Learning for Policy Gradient Methods

Haitham Bou-Ammar, Eric Eaton, Paul Ruvolo, Matthew E. Taylor
2014 International Conference on Machine Learning  
To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision-making tasks consecutively, transferring knowledge between tasks to accelerate learning.  ...  Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.  ... 
dblp:conf/icml/Bou-AmmarERT14 fatcat:tym54bnvf5dm5eh74szqsraxfm
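
In this family of lifelong policy-gradient learners (e.g. PG-ELLA), knowledge transfer comes from factoring each task's policy parameters into a shared latent basis and task-specific coefficients. The sketch below shows only that factorization with made-up shapes; the coupled update of the basis and coefficients is where the actual algorithm lives.

```python
# Illustration of cross-task sharing by factoring policy parameters as
# theta_t = L @ s_t (shared basis L, per-task coefficients s_t). Shapes are
# hypothetical; the real method learns L and S jointly from policy gradients.
import numpy as np

d, k, n_tasks = 20, 4, 5            # parameter dim, latent dim, number of tasks
L = np.random.randn(d, k)           # shared latent basis (learned across tasks)
S = np.random.randn(k, n_tasks)     # task-specific coefficients

def task_policy_params(task_id):
    """Reconstruct the policy parameters of one task from the shared basis."""
    return L @ S[:, task_id]
```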

Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning [article]

Yuansheng Xie, Soroush Vosoughi, Saeed Hassanpour
2022 arXiv   pre-print
In this work, we propose a novel framework utilizing Adversarial Inverse Reinforcement Learning that can provide global explanations for decisions made by a Reinforcement Learning model and capture intuitive  ...  Unfortunately, the current black-box nature of machine learning models is still an unresolved issue, and this very nature prevents researchers from learning and providing explicative descriptions for a model's  ...  This particular approach to inverse reinforcement learning is named Adversarial Inverse Reinforcement Learning (AIRL).  ... 
arXiv:2203.16464v1 fatcat:usjqjptwozfovf2nuvq6jwyo64
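
AIRL, named in the snippet above, tells expert transitions from policy transitions with a discriminator of a specific form, D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a|s)), where f is the learned reward/advantage estimator. The sketch below simply writes that form out, with f and pi supplied as plain values rather than trained networks.

```python
# The AIRL discriminator form D = exp(f) / (exp(f) + pi), written out directly.
# f_value = f(s, a) from the learned reward estimator; pi_prob = pi(a|s) from
# the current policy. Both are passed in as plain floats here.
import numpy as np

def airl_discriminator(f_value, pi_prob):
    ef = np.exp(f_value)
    return ef / (ef + pi_prob)

# The policy is then trained on log D - log(1 - D), which simplifies to
# f(s, a) - log pi(a|s).
def airl_reward(f_value, pi_prob):
    return f_value - np.log(pi_prob)
```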

On Global Optimization of Walking Gaits for the Compliant Humanoid Robot, COMAN Using Reinforcement Learning

Houman Dallali, Petar Kormushev, Zhibin Li, Darwin Caldwell
2012 Cybernetics and Information Technologies  
In this paper, a 15 degrees-of-freedom dynamic model of a compliant humanoid robot is used, combined with reinforcement learning to perform a global search in the parameter space to produce stable gaits.  ...  In ZMP trajectory generation using simple models, a considerable amount of trial and error is often involved to obtain locally stable gaits by manually tuning the gait parameters.  ...  In Section 3 the trajectory generation method is briefly described. The reinforcement learning algorithm is described in Section 4.  ... 
doi:10.2478/cait-2012-0020 fatcat:wpan3fe2rzhi5gwme6pmpqbh4u
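
The global search above can be pictured as an episodic loop: sample a gait-parameter vector, simulate the compliant humanoid, score stability, keep the best. The sketch below is a generic random-search stand-in for the paper's RL algorithm; simulate_gait, the bounds and the fall penalty are hypothetical.

```python
# Generic stand-in for global search over gait parameters: random search over a
# bounded box, scoring each candidate by a simulated stability/progress return.
# simulate_gait(params) -> (walked_distance, fell) is a hypothetical simulator.
import numpy as np

def search_gait(simulate_gait, lower, upper, n_trials=200, rng=None):
    rng = rng or np.random.default_rng(0)
    best_params, best_return = None, -np.inf
    for _ in range(n_trials):
        params = rng.uniform(lower, upper)
        distance, fell = simulate_gait(params)
        ret = distance - (100.0 if fell else 0.0)   # heavy penalty for falling
        if ret > best_return:
            best_params, best_return = params, ret
    return best_params, best_return
```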

MineRL: A Large-Scale Dataset of Minecraft Demonstrations [article]

William H. Guss, Brandon Houghton, Nicholay Topin, Phillip Wang, Cayden Codel, Manuela Veloso, Ruslan Salakhutdinov
2019 arXiv   pre-print
However, existing datasets compatible with reinforcement learning simulators do not have sufficient scale, structure, and quality to enable the further development and evaluation of methods focused on  ...  The sample inefficiency of standard deep reinforcement learning methods precludes their application to many real-world problems.  ...  It currently contains data for six tasks, none of which can be fully solved with standard deep reinforcement learning methods.  ... 
arXiv:1907.13440v1 fatcat:63khufur7nd73fb5b43f5jf5gm

Exploiting previous experience to constrain robot sensorimotor learning

Bojan Nemec, Rok Vuga, Ales Ude
2011 2011 11th IEEE-RAS International Conference on Humanoid Robots  
This significantly reduces the number of test trials needed by standard reinforcement learning techniques.  ...  The first stage is based on the generalization of previously trained movements associated with a specific task, which results in a first approximation of a suitable control policy in a new situation.  ...  The goal of this paper is to speed up the learning process by combining the ideas from imitation and reinforcement learning with statistical generalization [5].  ... 
doi:10.1109/humanoids.2011.6100913 dblp:conf/humanoids/NemecVU11 fatcat:bco5ylyn7nbpzl2745g7azyrgi
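
The first stage above, generalizing previously trained movements to a new situation, is commonly realized by regressing stored movement parameters against their task descriptors. The locally weighted averaging below is a generic illustration with hypothetical data, not the exact statistical generalization of [5].

```python
# Generic illustration: generalize a database of learned movement parameters to
# a new task query via locally weighted (Gaussian-kernel) averaging. The
# database contents and bandwidth are hypothetical.
import numpy as np

def generalize(queries, params, new_query, bandwidth=0.2):
    """queries: (N, q) task descriptors; params: (N, p) learned parameters."""
    queries, params = np.asarray(queries, float), np.asarray(params, float)
    d2 = np.sum((queries - np.asarray(new_query, float)) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    w /= w.sum()
    return w @ params          # initial parameter estimate for the new situation
```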

The Adaptive Stress Testing Formulation [article]

Mark Koren, Anthony Corso, Mykel J. Kochenderfer
2020 arXiv   pre-print
We also provide three examples of validation problems formulated to work with AST.  ...  Therefore, approximate validation methods are needed to tractably find failures without unsafe simplifications.  ...  A specific reward function structure is then used with reinforcement learning algorithms in order to identify the most-likely failure of a system in a scenario.  ... 
arXiv:2004.04293v1 fatcat:deuwivp2fnaznkpyafcytt6eca
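
The "specific reward function structure" mentioned above is, in the AST formulation, built from the log-likelihood of the sampled disturbances plus terminal terms: roughly zero reward when a failure event is reached, a large penalty (optionally shaped by a distance-to-failure heuristic) if the horizon ends without a failure, and the per-step disturbance log-probability otherwise. A schematic version, with placeholder constants:

```python
# Schematic AST-style reward: per-step log-likelihood of the disturbance steers
# the search toward likely failures; terminal terms reward reaching a failure
# and penalize not reaching one. Constants and the miss-distance are placeholders.
def ast_reward(log_prob_disturbance, is_failure, is_terminal, miss_distance=0.0,
               alpha=1e5, beta=1e3):
    if is_failure:
        return 0.0                                   # failure found: no penalty
    if is_terminal:
        return -alpha - beta * miss_distance         # horizon ended, no failure
    return log_prob_disturbance                      # prefer likely disturbances
```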

Adversarial Imitation Learning via Random Search

MyungJae Shin, Joongheon Kim
2019 2019 International Joint Conference on Neural Networks (IJCNN)  
Derivative-free-optimization-based reinforcement learning and the simplification of policies obtain competitive performance on dynamic, complex tasks.  ...  In this paper, we propose an imitation learning method that takes advantage of derivative-free optimization with simple linear policies.  ...  INTRODUCTION: In 2013, Deep Q-Learning showed how to combine classical Q-Learning with convolutional neural networks to successfully solve Atari games, reinvigorating reinforcement learning (RL) as one  ... 
doi:10.1109/ijcnn.2019.8852307 dblp:conf/ijcnn/ShinK19 fatcat:vdd7fy5frfe7lknvrfqxjiez7m
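
The derivative-free optimization with simple linear policies referenced above follows the pattern of Augmented Random Search: perturb a linear policy matrix in both directions, roll out, and step along the reward-weighted difference. This is a generic ARS-style sketch with a hypothetical rollout_return function, not the authors' imitation-learning variant (which replaces the environment return with a discriminator-based score).

```python
# Generic ARS-style update for a linear policy M (action = M @ state); not the
# paper's code. rollout_return(M) -> episode return is a hypothetical callable.
import numpy as np

def ars_step(M, rollout_return, n_dirs=8, nu=0.03, alpha=0.02, rng=None):
    rng = rng or np.random.default_rng(0)
    deltas = rng.standard_normal((n_dirs,) + M.shape)          # random directions
    r_plus = np.array([rollout_return(M + nu * d) for d in deltas])
    r_minus = np.array([rollout_return(M - nu * d) for d in deltas])
    sigma = np.concatenate([r_plus, r_minus]).std() + 1e-8     # reward scaling
    grad = np.tensordot(r_plus - r_minus, deltas, axes=1) / n_dirs
    return M + alpha / sigma * grad
```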

Reinforcement learning of full-body humanoid motor skills

Freek Stulp, Jonas Buchli, Evangelos Theodorou, Stefan Schaal
2010 2010 10th IEEE-RAS International Conference on Humanoid Robots  
Applying reinforcement learning to humanoid robots is challenging because humanoids have a large number of degrees of freedom and state and action spaces are continuous.  ...  In this paper, we present a probabilistic reinforcement learning approach, which is derived from the framework of stochastic optimal control and path integrals.  ...  acquire an accurate model of the robot and its interaction with the environment to enable more efficient model-based reinforcement learning.  ... 
doi:10.1109/ichr.2010.5686320 dblp:conf/humanoids/StulpBTS10 fatcat:fleceoscynbpxculcflwkoog5i
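
At the core of the path-integral approach described above, rollouts with lower cost-to-go receive exponentially larger weight in the parameter update. A standard simplified form of this weighting (usual PI^2 notation, omitting the per-basis-function projection; not copied from the paper) is:

```latex
P(\tau_k) = \frac{\exp\!\left(-S(\tau_k)/\lambda\right)}
                 {\sum_{j=1}^{K} \exp\!\left(-S(\tau_j)/\lambda\right)},
\qquad
\delta\theta = \sum_{k=1}^{K} P(\tau_k)\,\epsilon_k ,
```

where S(τ_k) is the cost-to-go of rollout k, ε_k its parameter perturbation, and λ a temperature controlling how strongly low-cost rollouts dominate the update.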
Showing results 1 — 15 out of 11,671