A Workflow for Offline Model-Free Robotic Reinforcement Learning [article]

Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine
2021 arXiv   pre-print
In this paper, our aim is to develop a practical workflow for using offline RL analogous to the relatively well-understood workflows for supervised learning problems.  ...  Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction.  ...  setup as well as for providing us with offline datasets we could test our workflow on.  ... 
arXiv:2109.10813v2 fatcat:bt5kt23fgfcxbblzc4464hrsf4

UnrealROX+: An Improved Tool for Acquiring Synthetic Data from Virtual 3D Environments [article]

Pablo Martinez-Gonzalez, Sergiu Oprea, John Alejandro Castro-Vargas, Alberto Garcia-Garcia, Sergio Orts-Escolano, Jose Garcia-Rodriguez, Markus Vincze
2021 arXiv   pre-print
Nevertheless, its workflow was closely tied to generating image sequences from a robot's on-board camera, making it hard to generate data for other purposes.  ...  for interacting with the virtual environment from Deep Learning frameworks.  ...  This work has also been supported by Spanish national grants for PhD studies FPU17/00166, ACIF/2018/197 and UAFPU2019-13. Experiments were made possible by a generous hardware donation from NVIDIA.  ... 
arXiv:2104.11776v1 fatcat:wttdiq7oebcibbclf74cl6tx4e

Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach

Mai Xu, Yuhang Song, Jianyi Wang, MingLang Qiao, Liangyu Huo, Zulin Wang
2018 IEEE Transactions on Pattern Analysis and Machine Intelligence  
Finally, the experiments validate that our approach is effective in both offline and online prediction of HM positions for panoramic video, and that the learned offline-DHP model can improve the performance  ...  In online-DHP, the next HM position of one subject is estimated given the currently observed HM position, which is achieved by developing a DRL algorithm upon the learned offline-DHP model.  ...  Thus, we set the number of workflows N to be 58 in our experiments. Reinforcement learning vs. supervised learning.  ... 
doi:10.1109/tpami.2018.2858783 pmid:30047871 fatcat:ckgvhz5kyrbzlkyn3xu3nzmxau

D2.3 Multi-agent Deep Reinforcement Learning Scheme Specifications

Yue Zhang, Lu Ge, John Cosmas, Ben Meunier, Geoffrey Eappen, Kareem Ali, Israel Koffman, Alexandre Kazmierowski, Victor Gabillon, Alexander Artemenko, Uwe Wostradowski, Ta Dang Khoa Le
2021 Zenodo  
This deliverable reports all the activities and outcomes related to specifying the requirements for the multi-agent Deep Reinforcement Learning (MA-DRL) scheme of 6G BRAINS.  ...  This report lays the ground for the developments of the RL-applications in the MA-DRL scheme.  ...  For example, model-free DRL algorithms are usually evaluated on 100K to 10M interactions with academic environments such as Atari games or Mujoco robotics simulations.  ... 
doi:10.5281/zenodo.5786347 fatcat:hgquszibifar7pcjud6ckwkqkq

Transferable Force-Torque Dynamics Model for Peg-in-hole Task [article]

Junfeng Ding, Chen Wang, Cewu Lu
2019 arXiv   pre-print
We present a learning-based force-torque dynamics to achieve model-based control for contact-rich peg-in-hole task using force-only inputs.  ...  To tackle these problems, we propose a multi-pose force-torque state representation, based on which a dynamics model is learned with the data generated in a sample-efficient offline fashion.  ...  Model-based Reinforcement Learning: With the learned dynamics model, the training of the RL policy model can be changed from online to offline.  ... 
arXiv:1912.00260v1 fatcat:pmdcb54ekvhgnnvs6cy5lroesa

Aggregating Long-Term Context for Learning Laparoscopic and Robot-Assisted Surgical Workflows [article]

Yutong Ban, Guy Rosman, Thomas Ward, Daniel Hashimoto, Taisei Kondo, Hidekazu Iwaki, Ozanan Meireles, Daniela Rus
2021 arXiv   pre-print
Analyzing surgical workflow is crucial for surgical assistance robots to understand surgeries.  ...  Deep learning techniques have recently been widely applied to recognizing surgical workflows.  ...  [14] used a temporal convolution network (TCN) for action segmentation. [31] applied TCN in surgery and combined it with reinforcement learning (RL) for surgical gesture recognition.  ... 
arXiv:2009.00681v4 fatcat:5kqgfyw2cbeglk5lcrvx5a5jxu

Reinforcement Learning in Practice: Opportunities and Challenges [article]

Yuxi Li
2022 arXiv   pre-print
Then we discuss challenges, in particular, 1) foundation, 2) representation, 3) reward, 4) exploration, 5) model, simulation, planning, and benchmarks, 6) off-policy/offline learning, 7) learning to learn  ...  In this article, we first give a brief introduction to reinforcement learning (RL), and its relationship with deep learning, machine learning and AI.  ...  Model-free RL interacts with the environment online or offline to collect a huge amount of training data.  ... 
arXiv:2202.11296v2 fatcat:xdtsmme22rfpfn6rgfotcspnhy


Johanna Ender, Jan Cetric Wagner, Georg Kunert, Fang Bin Guo, Roland Larek, Thorsten Pawletta
2019 Informatyka Automatyka Pomiary w Gospodarce i Ochronie Środowiska  
A flexible hybrid cell for HRC, integrated into a Self-Adapting-Production-Planning-System (SAPPS), assists the worker during interaction.  ...  There have been few studies on how Human Factors influence the design of workplaces for Human-Robot Collaboration (HRC).  ...  The learning process consists of two steps: the real-world process will be simulated with a model and learned (i) offline with the post-optimizing reinforcement learning; then machine learning needs to  ... 
doi:10.35784/iapgos.36 fatcat:zisvhumjzzfypowf66gtwj6yl4

TrueRMA: Learning Fast and Smooth Robot Trajectories with Recursive Midpoint Adaptations in Cartesian Space [article]

Jonas C. Kiemel, Pascal Meißner, Torsten Kröger
2020 arXiv   pre-print
We present TrueRMA, a data-efficient, model-free method to learn cost-optimized robot trajectories over a wide range of starting points and endpoints.  ...  Given a starting point and an endpoint as input, a neural network is trained to predict midpoint adaptations that minimize the cost of the resulting trajectory via reinforcement learning.  ...  Using model-free reinforcement learning, a policy π : S → A is trained to map states s ∈ S to those actions a ∈ A that maximize the expected reward.  ... 
arXiv:2006.03497v1 fatcat:kctq2fvtwbardpvwj7dep5lk6e

An Efficient and Accurate DDPG-based Recurrent Attention Model for Object Localization

Fengkai Ke
2020 IEEE Access  
His main research interests include deep learning, reinforcement learning, medical image processing, and so on.  ...  Some samples are repeatedly learned, which slows down the convergence rate of the neural network model, and even causes the model to converge to the local optimal solution.  ...  A reinforcement learning agent accomplishes a certain task through a large amount of training. Reinforcement learning algorithms can be divided into model-based learning and model-free learning.  ... 
doi:10.1109/access.2020.3008171 fatcat:hr3e2i5qdvfxppzcryp5m7becm

Learning and Comfort in Human–Robot Interaction: A Review

Weitian Wang, Yi Chen, Rui Li, Yunyi Jia
2019 Applied Sciences  
In this paper, we present a comprehensive review of two significant topics in human–robot interaction: robots learning from demonstrations and human comfort.  ...  The collaboration quality between the human and the robot has been greatly improved by taking advantage of robots learning from demonstrations.  ...  Reinforcement-Learning-Based Approach: In this approach, the robot learns through trial and error to maximize a reward such that it allows the robot to discover new control policies through free exploration  ... 
doi:10.3390/app9235152 fatcat:67n52vkggbhtzlfz53bz5bglna

A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems [article]

Rafael Figueiredo Prudencio, Marcos R. O. A. Maximo, Esther Luna Colombini
2022 arXiv   pre-print
Effective offline RL algorithms have a much wider range of applications than online RL, being particularly appealing for real-world applications such as education, healthcare, and robotics.  ...  With the widespread adoption of deep learning, reinforcement learning (RL) has experienced a dramatic increase in popularity, scaling to previously intractable problems, such as playing complex games from  ...  Index Terms: Deep learning, reinforcement learning, offline reinforcement learning, batch reinforcement learning  ... 
arXiv:2203.01387v2 fatcat:euobvze7kre3fi7blalnbbgefm

Document-editing Assistants and Model-based Reinforcement Learning as a Path to Conversational AI [article]

Katya Kudashkina, Patrick M. Pilarski, Richard S. Sutton
2020 arXiv   pre-print
Together, voice document editing and model-based reinforcement learning comprise a promising research direction for achieving conversational AI.  ...  In this article we argue for the domain of voice document editing and for the methods of model-based reinforcement learning.  ...  Acknowledgments: We would like to thank Peter Wittek and Joseph Modayil for their useful discussions and feedback.  ... 
arXiv:2008.12095v1 fatcat:24o74xfhoba6hla7cbcdbijo6i

A Reinforcement Learning Approach to View Planning for Automated Inspection Tasks

Christian Landgraf, Bernd Meese, Michael Pabst, Georg Martius, Marco F. Huber
2021 Sensors  
Reinforcement Learning (RL) offers promising, intelligent solutions for robotic inspection and manufacturing tasks.  ...  The framework extends available open-source libraries and provides an interface to the Robot Operating System (ROS) for deploying any supported robot and sensor.  ...  Offline programming (OLP) systems are based on CAD models and robot simulation software.  ... 
doi:10.3390/s21062030 pmid:33805587 pmcid:PMC7998553 fatcat:6o6ivy6cdngg3gwhjn7w2rwr6a

Online Shielding for Stochastic Systems [article]

Bettina Könighofer, Julian Rudolf, Alexander Palmisano, Martin Tappler, Roderick Bloem
2020 arXiv   pre-print
In this paper, we propose a method to develop trustworthy reinforcement learning systems.  ...  Our main contribution is a new synthesis algorithm for computing the shield online.  ...  Reinforcement learning is implemented via approximate Q-learning [33] with the feature vector denoting the distance to the next apple.  ... 
arXiv:2012.09539v1 fatcat:pvxoyuhkcbgy3orcvlsop4odcq