248 Hits in 6.1 sec

Predictive Learning from Demonstration [chapter]

Erik A. Billing, Thomas Hellström, Lars-Erik Janlert
2011 Communications in Computer and Information Science  
A model-free learning algorithm called Predictive Sequence Learning (PSL) is presented and evaluated in a robot Learning from Demonstration (LFD) setting.  ...  The library is then used to control the robot by continually predicting the next action, based on the sequence of past sensor and motor events.  ...  Acknowledgments We would like to thank Brandon Rohrer at Sandia National Laboratories and Christian Balkenius at Lund University for valuable input to this work.  ... 
doi:10.1007/978-3-642-19890-8_14 fatcat:j5npriqfsbfk5jvuwff3vvdwfy
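The prediction step the snippet describes can be illustrated with a minimal sketch (hypothetical data structures and events, not the authors' implementation): a library maps recently observed sensor/motor event sequences to the event that followed them in the demonstration, and control returns the continuation of the longest matching context.

    # Minimal sketch of PSL-style next-event prediction: the library maps a
    # context tuple of recent events to the event that followed it in the
    # demonstration; prediction uses the longest context that matches.

    def learn_library(demonstration, max_context=4):
        """demonstration: list of events; returns {context_tuple: next_event}."""
        library = {}
        for t in range(1, len(demonstration)):
            for k in range(1, min(max_context, t) + 1):
                context = tuple(demonstration[t - k:t])
                library.setdefault(context, demonstration[t])
        return library

    def predict_next(library, history, max_context=4):
        """Return the continuation of the longest matching context, if any."""
        for k in range(min(max_context, len(history)), 0, -1):
            context = tuple(history[-k:])
            if context in library:
                return library[context]
        return None

    demo = ["s:near_wall", "m:turn_left", "s:clear", "m:forward",
            "s:near_wall", "m:turn_left"]
    lib = learn_library(demo)
    print(predict_next(lib, ["s:clear", "m:forward", "s:near_wall"]))  # m:turn_left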

Model-Free Learning and Control in a Mobile Robot

Brandon Rohrer, Michael Bernard, J. Dan Morrow, Fred Rothganger, Patrick Xavier
2009 2009 Fifth International Conference on Natural Computation  
A model-free, biologically-motivated learning and control algorithm called S-learning is described as implemented in a Surveyor SRV-1 mobile robot.  ...  S-learning demonstrated learning of robotic and environmental structure sufficient to allow it to achieve its goals (finding high- or low-contrast views in its environment).  ...  Acknowledgements Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.  ... 
doi:10.1109/icnc.2009.38 dblp:conf/icnc/RohrerBMRX09 fatcat:46qi5j5lzfe35eghueux7yupna

Behavior recognition for Learning from Demonstration

Erik A Billing, Thomas Hellström, Lars-Erik Janlert
2010 2010 IEEE International Conference on Robotics and Automation  
Both methods are based on the dynamic temporal difference algorithm Predictive Sequence Learning (PSL) which has previously been proposed as a learning algorithm for robot control.  ...  The results indicate that PSLH-Comparison could be a suitable algorithm for integration in a hierarchical control system consistent with recent models of human perception and motor control.  ...  S-Comparison is based on S-Learning, a prediction-based control algorithm inspired by the human neuro-motor system [22] , [23] .  ... 
doi:10.1109/robot.2010.5509912 dblp:conf/icra/BillingHJ10 fatcat:cl3hstupnfc3jeuwcl3hst6niq
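One way to read the comparison idea in the snippet, sketched below with invented behaviors and traces: train one simple predictive model per demonstrated behavior, then recognize a new trace by which model predicts it with the fewest errors. (The models here are plain bigram tables, not PSL itself.)

    # Hypothetical sketch of recognition-by-prediction: the behavior whose
    # model best predicts the observed trace is the recognized one.

    from collections import Counter, defaultdict

    def learn_bigrams(demo):
        """Map each event to the most common event that followed it."""
        follows = defaultdict(Counter)
        for a, b in zip(demo, demo[1:]):
            follows[a][b] += 1
        return {a: c.most_common(1)[0][0] for a, c in follows.items()}

    def errors(model, trace):
        return sum(model.get(a) != b for a, b in zip(trace, trace[1:]))

    behaviors = {
        "wall_follow": learn_bigrams(["wall", "left", "clear", "fwd", "wall", "left"]),
        "explore":     learn_bigrams(["clear", "fwd", "clear", "fwd", "wall", "back"]),
    }
    trace = ["wall", "left", "clear", "fwd"]
    print(min(behaviors, key=lambda name: errors(behaviors[name], trace)))  # wall_follow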

Model-Free Reinforcement Learning with Continuous Action in Practice

T. Degris, P. M. Pilarski, R. S. Sutton
2012 2012 American Control Conference (ACC)  
ACKNOWLEDGEMENTS This work was supported by MPrime, the Alberta Innovates Centre for Machine Learning, the Glenrose Rehabilitation Hospital Foundation, Alberta Innovates-Technology Futures, NSERC, and WestGrid, a partner of Compute Canada.  ... 
doi:10.1109/acc.2012.6315022 fatcat:zcffq2qphvfpnnszvs5tu4ts5q
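The snippet is cut to the acknowledgements, but the paper's setting is actor-critic learning with continuous actions. Below is a minimal Gaussian-policy sketch on a toy one-step problem; the reward, learning rates, and baseline are invented, and this is a generic REINFORCE-with-baseline update rather than the authors' algorithm.

    # Gaussian policy over a continuous action on a toy one-step task:
    # maximize r(a) = -(a - 2)^2 by adjusting the policy mean.

    import random

    mu, sigma, alpha = 0.0, 1.0, 0.05
    baseline = 0.0
    for step in range(2000):
        a = random.gauss(mu, sigma)
        r = -(a - 2.0) ** 2
        advantage = r - baseline
        # gradient of log N(a | mu, sigma) with respect to mu:
        mu += alpha * advantage * (a - mu) / sigma ** 2
        baseline += 0.1 * (r - baseline)   # running-average baseline as a crude critic
    print(round(mu, 2))  # typically settles near the optimal action 2.0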

Door opening by joining reinforcement learning and intelligent control

Bojan Nemec, Leon Žlajpah, Aleš Ude
2017 2017 18th International Conference on Advanced Robotics (ICAR)  
We propose a novel algorithm that combines a widely used reinforcement learning approach with intelligent control algorithms.  ...  In this paper we address the problem of how to open doors with an articulated robot.  ...  Reinforcement learning (RL) algorithms, which incrementally improve the initial policy, are used for model-free autonomous adaptation.  ... 
doi:10.1109/icar.2017.8023522 dblp:conf/icar/NemecZU17 fatcat:5stk4fvrlvfkdlfxwbrdok7lzi
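The incremental-improvement loop the snippet mentions can be caricatured as a perturb-and-keep search over controller parameters, starting from the hand-designed initial policy; the episode-return function below is a hypothetical stand-in for running one door-opening trial.

    # Toy sketch of "RL incrementally improves the initial policy": keep random
    # perturbations of the parameter vector that raise the episode return.

    import random

    def episode_return(params):
        target = [0.8, -0.3, 1.5]          # invented optimum of the toy objective
        return -sum((p - t) ** 2 for p, t in zip(params, target))

    params = [0.0, 0.0, 0.0]               # initial policy from intelligent control
    best = episode_return(params)
    for trial in range(300):
        candidate = [p + random.gauss(0, 0.1) for p in params]
        ret = episode_return(candidate)
        if ret > best:                     # keep only improvements
            params, best = candidate, ret
    print([round(p, 2) for p in params])   # drifts toward the toy optimum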

Smart edutainment game for algorithmic thinking

Bourouaieh Douadi, Bensebaa Tahar, Seridi Hamid
2012 Procedia - Social and Behavioral Sciences  
This paper presents a novel approach to designing a hybrid learning environment that combines digital game characteristics, microworlds, and algorithm animation principles.  ...  The first one is a microworld, inspired by LOGO, where the student can write and visualize algorithms that create and act upon objects.  ...  Although a complete programming language, LOGO remains best known for its Turtle figures and its orientation toward children's learning. LOGO inspired a plethora of systems, especially robot programming games.  ... 
doi:10.1016/j.sbspro.2011.12.085 fatcat:bkihoxpoljfflmurdh3y4k4vym
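The LOGO turtle idea the snippet refers to is easy to demonstrate with Python's standard turtle module (a generic illustration of the microworld concept, not the system described in the paper):

    # A LOGO-style square drawn with Python's built-in turtle module.

    import turtle

    t = turtle.Turtle()
    for _ in range(4):
        t.forward(100)   # LOGO: FORWARD 100
        t.left(90)       # LOGO: LEFT 90
    turtle.done()        # keep the drawing window open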

Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient [article]

Samuele Tosatto, João Carvalho, Jan Peters
2021 arXiv   pre-print
The price of inefficiency becomes evident in real-world scenarios such as interaction-driven robot learning, where the success of RL has been rather limited, and a very high sample cost hinders straightforward  ...  Off-policy Reinforcement Learning (RL) holds the promise of better data efficiency as it allows sample reuse and potentially enables safe interaction with the environment.  ...  His research is focused on learning algorithms for control and robotics.  ... 
arXiv:2010.14771v3 fatcat:2w33jtlqe5cxvejf3om7myjc6e
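The sample reuse the snippet credits to off-policy RL rests on importance weighting. Here is a generic illustration on a Gaussian bandit with invented means, not the paper's nonparametric policy-gradient estimator:

    # Estimate the target policy's expected reward from samples drawn under a
    # different behavior policy, by reweighting with the density ratio.

    import math, random

    def gauss_pdf(x, mu, sigma):
        return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

    behavior_mu, target_mu, sigma = 0.0, 1.0, 1.0
    samples = [random.gauss(behavior_mu, sigma) for _ in range(100_000)]
    est = sum(gauss_pdf(a, target_mu, sigma) / gauss_pdf(a, behavior_mu, sigma)
              * (-(a - 1.0) ** 2) for a in samples) / len(samples)
    print(round(est, 2))  # close to the true value under the target policy: -1.0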

Model-Based Reinforcement Learning with Continuous States and Actions

Marc Peter Deisenroth, Carl Edward Rasmussen, Jan Peters
2008 The European Symposium on Artificial Neural Networks  
GPDP is an approximate dynamic programming algorithm based on Gaussian process (GP) models for the value functions. In this paper, we extend GPDP to the case of unknown transition dynamics.  ...  After building a GP model for the transition dynamics, we apply GPDP to this model and determine a continuous-valued policy in the entire state space.  ...  We showed that in the case of the underpowered pendulum swing-up, the policy based on the learned system model performs almost as well as a computationally expensive optimal controller.  ... 
dblp:conf/esann/DeisenrothRP08 fatcat:wbcqskjtxzhnnlhqflg4lgkahq
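The "build a GP model of the transition dynamics" step can be sketched with scikit-learn, which here stands in for the authors' own GP machinery (the dynamics and data below are invented):

    # Fit x_{t+1} = f(x_t, u_t) from rollout data with a GP, then query the
    # model's predictive mean and uncertainty at a new state-action pair.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(200, 2))   # columns: state x_t, action u_t
    y = 0.9 * X[:, 0] + 0.2 * X[:, 1] + 0.01 * rng.standard_normal(200)

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-4).fit(X, y)
    mean, std = gp.predict([[0.5, -0.2]], return_std=True)
    print(round(float(mean[0]), 3), round(float(std[0]), 3))  # prediction + uncertainty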

Double Critic Deep Reinforcement Learning for Mapless 3D Navigation of Unmanned Aerial Vehicles [article]

Ricardo Bedin Grando, Junior Costa de Jesus, Victor Augusto Kich, Alisson Henrique Kolling, Paulo Lilles Jorge Drews-Jr
2021 arXiv   pre-print
This paper presents a novel deep reinforcement learning-based system for 3D mapless navigation for Unmanned Aerial Vehicles (UAVs).  ...  Instead of using an image-based sensing approach, we propose a simple learning system that uses only a few sparse range data from a distance sensor to train a learning agent.  ...  From the network, our robot receives a linear velocity, an altitude velocity and a ∆yaw. To control the robot, we used RotorS's internal geometric tracking controller.  ... 
arXiv:2112.13724v1 fatcat:m6cpqs2oefgynjrz7i2p4wemcm
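The interface the snippet describes, sparse range readings in and (linear velocity, altitude velocity, Δyaw) out, can be written down directly; the policy body below is a hypothetical placeholder rather than the paper's trained double-critic network.

    # Sketch of the sensing/action interface: a few sparse ranges plus goal
    # information in, a (v_lin, v_alt, d_yaw) command out.

    import math, random

    def policy(ranges, dist_to_goal, angle_to_goal):
        """Placeholder for the learned actor network."""
        v_lin = 0.5 if min(ranges) > 1.0 else 0.0   # slow down near obstacles
        v_alt = 0.0
        d_yaw = max(-0.3, min(0.3, angle_to_goal))  # turn toward the goal, clipped
        return v_lin, v_alt, d_yaw

    ranges = [random.uniform(0.5, 5.0) for _ in range(10)]  # sparse distance sensor
    print(policy(ranges, dist_to_goal=4.2, angle_to_goal=math.radians(20)))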

Robot Calligraphy using Pseudospectral Optimal Control in Conjunction with a Novel Dynamic Brush Model [article]

Sen Wang, Jiaqi Chen, Xuanliang Deng, Seth Hutchinson, Frank Dellaert
2020 arXiv   pre-print
In this paper, we formulate the calligraphy writing problem as a trajectory optimization problem, and propose an improved virtual brush model for simulating the real writing process.  ...  Our approach is inspired by pseudospectral optimal control in that we parameterize the actuator trajectory for each stroke as a Chebyshev polynomial.  ...  Calligraphy robots using learning-based methods: examples of simple learning-based methods include Sun et al.'s learning from demonstration [16], [1], and Mueller et al.  ... 
arXiv:2003.01565v3 fatcat:tm26ooy7prdinkl4a4edpwz6mm
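The Chebyshev parameterization mentioned in the snippet is directly available in NumPy; the sketch below evaluates one actuator's stroke trajectory and its derivative from a coefficient vector (the coefficients are arbitrary here, whereas in the paper they are the decision variables of the optimization):

    # One actuator's stroke trajectory as a Chebyshev polynomial over
    # normalized time in [-1, 1]; the derivative comes free with this basis.

    import numpy as np
    from numpy.polynomial import chebyshev as C

    coeffs = [0.0, 0.5, -0.2, 0.1]                 # c0*T0 + c1*T1 + c2*T2 + c3*T3
    tau = np.linspace(-1, 1, 5)                    # normalized stroke time
    positions = C.chebval(tau, coeffs)             # actuator position along the stroke
    velocity = C.chebval(tau, C.chebder(coeffs))   # analytic time derivative
    print(np.round(positions, 3), np.round(velocity, 3))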

Learning-Based Synthesis of Safety Controllers [article]

Daniel Neider, Oliver Markgraf
2020 arXiv   pre-print
We develop a novel decision tree learning algorithm for this setting and show that our algorithm is guaranteed to converge to a reactive safety controller if a suitable overapproximation of the winning  ...  We propose a machine learning framework to synthesize reactive controllers for systems whose interactions with their adversarial environment are modeled by infinite-duration, two-player games over (potentially  ...  learning algorithm to learn a decision tree t_H over P that is consistent with S_H. [Table I: Properties of the compared tools]  ... 
arXiv:1901.06801v4 fatcat:wm7j3dzht5ah7aejjq4gtppyva
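The shape of the learning problem, a decision tree over state predicates consistent with a sample of safe moves, can be sketched with scikit-learn's generic DecisionTreeClassifier standing in for the paper's specialized, convergence-guaranteed learner (the predicates and samples below are hypothetical):

    # Learn a controller as a decision tree over predicate valuations of states.

    from sklearn.tree import DecisionTreeClassifier, export_text

    # Each row: predicate valuation of a state; label: a safe action there.
    S_states = [[0, 0], [0, 1], [1, 0], [1, 1]]   # predicates: [near_obstacle, goal_left]
    S_actions = ["forward", "left", "brake", "brake"]

    tree = DecisionTreeClassifier().fit(S_states, S_actions)
    print(export_text(tree, feature_names=["near_obstacle", "goal_left"]))
    print(tree.predict([[1, 1]]))  # -> ['brake']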

Feedback controller parameterizations for Reinforcement Learning

John W. Roberts, Ian R. Manchester, Russ Tedrake
2011 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)  
Especially when learning feedback controllers for weakly stable systems, ineffective parameterizations can result in unstable controllers and poor performance both in terms of learning convergence and  ...  Reinforcement Learning offers a very general framework for learning controllers, but its effectiveness is closely tied to the controller parameterization used.  ...  There are many cases in which, even if the system model P (s) is known well, online learning can be advantageous over model-based design.  ... 
doi:10.1109/adprl.2011.5967370 dblp:conf/adprl/RobertsMT11 fatcat:2jq3vh3uqbhgzansheth5vsbki
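A toy version of the point: fix a controller family (here a PD law on a double integrator, with all values invented) and search its parameter space directly; which space gets searched, raw gains in this sketch, is exactly the parameterization choice the paper studies.

    # Direct search over PD gains for a double integrator, with an LQR-like cost.

    def rollout_cost(kp, kd, steps=200, dt=0.05):
        """Simulate u = -kp*x - kd*v from x=1, v=0; return accumulated cost."""
        x, v, cost = 1.0, 0.0, 0.0
        for _ in range(steps):
            u = -kp * x - kd * v
            v += u * dt
            x += v * dt
            cost += (x * x + 0.01 * u * u) * dt
        return cost

    best = min(((kp, kd) for kp in range(1, 10) for kd in range(1, 10)),
               key=lambda g: rollout_cost(*g))
    print(best, round(rollout_cost(*best), 3))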

Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information [article]

Jin Li, Xianyuan Zhan, Zixu Xiao, Guyue Zhou
2021 arXiv   pre-print
The latest methods that utilize human demonstration data and unsupervised representation learning have proven to be a promising direction for improving RL learning efficiency.  ...  End-to-end learning of robotic manipulation with high data efficiency is one of the key challenges in robotics.  ...  Other model-free offline RL algorithms modify the Q-function training objective to learn a conservative, underestimated Q-function [35], [36].  ... 
arXiv:2110.10905v1 fatcat:rtyxzxfzizc57of525nvbck6aa
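The "conservative, underestimated Q-function" idea cited at the end of the snippet can be sketched in a few lines of NumPy (a heavily simplified CQL-style update on a toy tabular problem, not the method of [35] or [36]): alongside the Bellman backup, push Q down wherever the policy puts mass and push it back up only on actions actually present in the offline data.

    import numpy as np

    n_states, n_actions, gamma, lr, alpha = 4, 2, 0.9, 0.1, 1.0
    Q = np.zeros((n_states, n_actions))
    # Offline data of (s, a, r, s') transitions; only action 0 was ever logged.
    data = [(0, 0, 1.0, 1), (1, 0, 0.0, 2), (2, 0, 1.0, 3), (3, 0, 0.0, 0)]

    for _ in range(500):
        for s, a, r, s2 in data:
            Q[s, a] += lr * (r + gamma * Q[s2].max() - Q[s, a])  # Bellman backup
            soft = np.exp(Q[s] - Q[s].max())
            soft /= soft.sum()
            Q[s] -= lr * alpha * soft   # push Q down where the softmax policy puts mass
            Q[s, a] += lr * alpha       # push the logged action back up
    print(np.round(Q, 2))  # the never-logged action stays pessimistically lower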

Accelerating autonomous learning by using heuristic selection of actions

Reinaldo A. C. Bianchi, Carlos H. C. Ribeiro, Anna H. R. Costa
2007 Journal of Heuristics  
Introduction Reinforcement learning (RL) algorithms are very attractive for solving a wide variety of control and planning problems when neither an analytical model nor a sampling model is available a priori  ...  In general, case-based methods can face two problems: first, how to extract relevant features of a case for indexing the case base, and second, how to adapt a previous case to a currently matched situation  ... 
doi:10.1007/s10732-007-9031-5 fatcat:qqj5xz63c5akfoxx4f6e7pltga
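The action-selection idea is compact enough to state directly: the heuristic biases which action is chosen but stays out of the value update, so it speeds up exploration without changing what is learned. A minimal sketch with invented values:

    # Heuristic-biased greedy action selection: argmax of Q(s,a) + xi * H(s,a),
    # with H kept out of the Q-learning update itself.

    def choose_action(Q, H, state, actions, xi=1.0):
        return max(actions, key=lambda a: Q.get((state, a), 0.0) + xi * H.get((state, a), 0.0))

    Q = {("s0", "left"): 0.10, ("s0", "right"): 0.12}
    H = {("s0", "left"): 1.0}   # e.g., derived from a path-planning heuristic
    print(choose_action(Q, H, "s0", ["left", "right"]))  # heuristic wins -> left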
Showing results 1 — 15 out of 248 results