33,834 Hits in 4.8 sec

Solving Compositional Reinforcement Learning Problems via Task Reduction [article]

Yunfei Li, Yilin Wu, Huazhe Xu, Xiaolong Wang, Yi Wu
2021 arXiv   pre-print
We propose a novel learning paradigm, Self-Imitation via Reduction (SIR), for solving compositional reinforcement learning problems. SIR is based on two core ideas: task reduction and self-imitation.  ...  Task reduction tackles a hard-to-solve task by actively reducing it to an easier task whose solution is known by the RL agent.  ...  Deep reinforcement learning (RL) has recently shown promising capabilities for solving complex decision making problems.  ... 
arXiv:2103.07607v2 fatcat:zspegy4xtjgvpimonjylblodme

Continuous-time MAXQ Algorithm for Web Service Composition

Hao Tang, Wenjing Liu, Wenjuan Cheng, Lei Zhou
2012 Journal of Software  
Index Terms: web service composition, hierarchical reinforcement learning, semi-Markov decision process (SMDP), MAXQ  ...  algorithm, to solve large-scale web service composition problems in the context of continuous-time semi-Markov decision process (SMDP) model under either average- or discounted-cost criteria.  ...  Therefore, it is more practical to use continuous-time hierarchical reinforcement learning (HRL) algorithms to solve web service composition problems.  ... 
doi:10.4304/jsw.7.5.943-950 fatcat:mulfpxnpqzhkfbjlel5ozpwnv4

Exploiting Variable Impedance for Energy Efficient Sequential Movements [article]

Fan Wu, Matthew Howard
2020 arXiv   pre-print
learning.  ...  The effectiveness of the proposed method is evaluated using two consecutive reaching tasks on a variable impedance actuator.  ...  Different from the vanilla reinforcement learning from exploration and evaluation we have an inner loop to solve {OCP_i} sequentially.  ... 
arXiv:2002.12075v2 fatcat:4ewywh54nvd6leyunwghr3dqt4

Towards a Framework for Comparing the Complexity of Robotic Tasks [article]

Michelle Ho and Alec Farid and Anirudha Majumdar
2022 arXiv   pre-print
We illustrate our framework for comparing robotic tasks using (i) examples where one can analytically establish reductions, and (ii) reinforcement learning examples where the proposed algorithm can estimate  ...  To this end, we define a notion of reduction that formalizes the following intuition: Task 1 reduces to Task 2 if we can efficiently transform any policy that solves Task 2 into a policy that solves Task  ...  We demonstrate our framework using (i) illustrative examples where one can analytically establish reductions (Sec. 5), and (ii) numerical examples based on reinforcement learning problems where we apply  ... 
arXiv:2202.09892v3 fatcat:sabddhffkne2hhr642fecgqalu

Relating reinforcement learning performance to classification performance

John Langford, Bianca Zadrozny
2005 Proceedings of the 22nd international conference on Machine learning - ICML '05  
In particular, we discuss possible methods for generating training examples for a classifier learning algorithm.  ...  It gives us insight into what are the critical prediction problems necessary for solving reinforcement learning and the relative difficulty of these problems.  ...  Other machine learning reductions relate the performance of one task to the performance of another by a mapping from one task to another inherent in the learning process (as discussed in Beygelzimer et  ... 
doi:10.1145/1102351.1102411 dblp:conf/icml/LangfordZ05 fatcat:jbjr2n752bhw7aloih46qr5byu

Automatically Composing Representation Transformations as a Means for Generalization [article]

Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths
2019 arXiv   pre-print
This paper introduces the compositional problem graph as a broadly applicable formalism to relate tasks of different complexity in terms of problems with shared subproblems.  ...  A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task  ...  Sec. 4 describes how CRL takes advantage of this compositional formulation in a multi-task zero-shot generalization setup to solve new problems by re-using computations learned from solving past problems  ... 
arXiv:1807.04640v2 fatcat:rupxorh2lndmpgophzwjzo4a5a

Page 1237 of Psychological Abstracts Vol. 66, Issue 6 [page]

1981 Psychological Abstracts  
(McGill U, Montreal, Canada) Type of reinforcer, problem-solving set, and awareness in verbal conditioning. American Journal of Psychology, 1980(Sep), Vol 93(3), 539-549.  ...  The effects of instructions that create a problem-solving attitude and of various types of reinforcers (verbal, monetary, or both) on awareness and performance in verbal conditioning were assessed both  ... 

Data-efficient Deep Reinforcement Learning for Dexterous Manipulation [article]

Ivaylo Popov, Nicolas Heess, Timothy Lillicrap, Roland Hafner, Gabriel Barth-Maron, Matej Vecerik, Thomas Lampe, Yuval Tassa, Tom Erez, Martin Riedmiller
2017 arXiv   pre-print
Deep learning and reinforcement learning methods have recently been used to solve a variety of problems in continuous control domains.  ...  Solving this difficult and practically relevant problem in the real world is an important long-term goal for the field of robotics.  ...  We attempt to transfer this idea to our compositional setup via, what we call, composite (shaping) rewards.  ... 
arXiv:1704.03073v1 fatcat:rpmfqrmnmvfatkinrtq37hdo6m

The neural and cognitive architecture for learning from a small sample [article]

Aurelio Cortese, Benedetto De Martino, Mitsuo Kawato
2018 arXiv   pre-print
Here we propose a model whereby higher cognitive functions profoundly interact with reinforcement learning to drastically reduce the degrees of freedom of the search space, simplifying complex problems  ...  The brain does not directly solve difficult problems, it is able to recast them into new and more tractable problems.  ...  We postulate that brains transform these intractable learning problems into more feasible reinforcement learning problems with small degrees of freedom while being guided by reward and penalty.  ... 
arXiv:1810.02476v1 fatcat:vvsuq23wcjglvc5sg7jusfvzkq

The neural and cognitive architecture for learning from a small sample

Aurelio Cortese, Benedetto De Martino, Mitsuo Kawato
2019 Current Opinion in Neurobiology  
Here, we propose a model whereby higher cognitive functions profoundly interact with reinforcement learning to drastically reduce the degrees of freedom of the search space, simplifying complex problems  ...  The brain does not directly solve difficult problems, it is able to recast them into new and more tractable problems.  ...  We postulate that brains transform these intractable learning problems into more feasible reinforcement learning problems with small degrees of freedom while being guided by reward and penalty.  ... 
doi:10.1016/j.conb.2019.02.011 pmid:30953964 fatcat:ok2xqz7anvbu3c4ak26mdr6nzu

Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning [article]

Yunfei Li, Tian Gao, Jiaqi Yang, Huazhe Xu, Yi Wu
2022 arXiv   pre-print
It has been a recent trend to leverage the power of supervised learning (SL) towards more effective reinforcement learning (RL) methods.  ...  PAIR substantially outperforms both non-phasic RL and phasic SL baselines on sparse-reward goal-conditioned robotic control problems, including a challenging stacking task.  ...  The main idea is to decompose a challenging task into a composition of two simpler sub-tasks so that both sub-tasks can be solved by the current policy.  ... 
arXiv:2206.12030v1 fatcat:filvc4klurdpbpvzd6iyj3im4a

Holistic Reinforcement Learning: The Role of Structure and Attention

Angela Radulescu, Yael Niv, Ian Ballard
2019 Trends in Cognitive Sciences  
In turn, selective attention biases reinforcement learning towards relevant dimensions of the environment.  ...  Reinforcement learning models capture many behavioral and neural effects but do not explain recent findings showing that structure in the environment influences learning.  ...  These compositions can be rapidly applied to solve new problems, such as tying a bow on a gift or triple-knotting one's shoelaces before a hike.  ... 
doi:10.1016/j.tics.2019.01.010 pmid:30824227 pmcid:PMC6472955 fatcat:n3dvwxnh5vggzoaqkemfkgzxpe

Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings [article]

Jorge A. Mendez and Alborz Geramifard and Mohammad Ghavamzadeh and Bing Liu
2022 arXiv   pre-print
Learning task-oriented dialog policies via reinforcement learning typically requires large amounts of interaction with users, which in practice renders such methods unusable for real-world applications  ...  We show how this approach is capable of learning with significantly less interaction with users, with a reduction of 35% in the number of dialogs required to learn, and to a higher level of proficiency  ...  Another related, but distinct, problem that has received attention in recent years is that of learning to solve composite dialog tasks [Cuayáhuitl et al., 2017, Peng et al., 2017, Wang et al., 2014  ... 
arXiv:2207.00468v1 fatcat:xennlr5jgfbphbc7oc455zp3me

Guided Imitation of Task and Motion Planning [article]

Michael James McDonald, Dylan Hadfield-Menell
2021 arXiv   pre-print
Among these tasks, we can learn a policy that solves the RoboSuite 4-object pick-place task 88% of the time from object pose observations and a policy that solves the RoboDesk 9-goal benchmark 79% of the  ...  While modern policy optimization methods can do complex manipulation from sensory data, they struggle on problems with extended time horizons and multiple sub-goals.  ...  Apprenticeship learning via inverse reinforcement learning. In Proceedings of the Twenty-First International Conference on Machine Learning, ICML ’04, page 1, New York, NY, USA, 2004.  ... 
arXiv:2112.03386v1 fatcat:liiwhjl67fao7phu2tuj4bj7yy

Reinforcement Learning Transfer via Common Subspaces [chapter]

Haitham Bou Ammar, Matthew E. Taylor
2012 Lecture Notes in Computer Science  
Although reinforcement learning (RL) has been successfully deployed in a variety of tasks, learning speed remains a fundamental problem for applying RL in complex environments.  ...  Transfer learning aims to ameliorate this shortcoming by speeding up learning through the adaptation of previously learned behaviors in similar tasks.  ...  Reinforcement learning (RL) is a popular framework that allows agents to solve sequential decision making problems with minimal feedback.  ... 
doi:10.1007/978-3-642-28499-1_2 fatcat:rvxgxa5awbfmvncaad3jo5jecu
Showing results 1 — 15 out of 33,834 results