329 Hits in 8.0 sec

Off-Policy Adversarial Inverse Reinforcement Learning [article]

Samin Yeasar Arnob
2020 arXiv   pre-print
Adversarial Inverse Reinforcement Learning (AIRL) leverages the idea of AIL, integrates a reward function approximation along with learning the policy, and shows the utility of IRL in the transfer learning  ...  algorithm in the continuous control tasks.  ...  Implementation of Discriminator Actor-Critic algorithm used in this paper is initially implemented as ICLR-2019 Reproducibility Challenge along with Sheldon Benard and Vincent Luczkow [23].  ...
arXiv:2005.01138v1 fatcat:pc73o264g5hy3g5s5e3s7kuope

Toward Robust Long Range Policy Transfer [article]

Wei-Cheng Tseng, Jin-Siang Lin, Yao-Min Feng, Min Sun
2021 arXiv   pre-print
To mimic this capability, hierarchical models combining primitive policies learned from prior tasks have been proposed.  ...  We demonstrate that our method outperforms other recent policy transfer methods by combining and adapting these reusable primitives in tasks with continuous action space.  ...  MCP (Peng et al. 2019): A multiplicative model that enables the agent to activate multiple primitives simultaneously is trained.  ...
arXiv:2103.02957v1 fatcat:p3daxb43p5fidodefs6r37plei

Composing Task-Agnostic Policies with Deep Reinforcement Learning [article]

Ahmed H. Qureshi, Jacob J. Johnson, Yuzhe Qin, Taylor Henderson, Byron Boots, Michael C. Yip
2019 arXiv   pre-print
To date, there has been plenty of work on learning task-specific policies or skills but almost no focus on composing necessary, task-agnostic skills to find a solution to new problems.  ...  We evaluate our method in difficult cases where training policy through standard reinforcement learning (RL) or even hierarchical RL is either not feasible or exhibits high sample complexity.  ...  A recent and similar work to ours is a multiplicative composition policies (MCP) framework (Peng et al., 2019).  ...
arXiv:1905.10681v2 fatcat:5omkyama6bhffk4naz7xeh3aoi

Unsupervised Reinforcement Learning for Transferable Manipulation Skill Discovery [article]

Daesol Cho, Jigang Kim, H. Jin Kim
2022 arXiv   pre-print
manipulation tasks with the learned task-agnostic skills.  ...  It not only enables the agent to learn interaction behavior, the key aspect of the robotic manipulation learning, without access to the environment reward, but also to generalize to arbitrary downstream  ...  Another quantity is a compositional policy network structure, called multiplicative compositional policies (MCP) [20].  ...
arXiv:2204.13906v1 fatcat:lroylkcvmffotlkh6rsysqyhkm

Probabilistic Mixture-of-Experts for Efficient Deep Reinforcement Learning [article]

Jie Ren, Yewen Li, Zihan Ding, Wei Pan, Hao Dong
2021 arXiv   pre-print
Deep reinforcement learning (DRL) has successfully solved various problems recently, typically with a unimodal policy representation.  ...  However, grasping distinguishable skills for some tasks with non-unique optima can be essential for further improving its learning efficiency and performance, which may lead to a multimodal policy represented  ...  , the Multiplicative Compositional Policies (MCP) (Peng et al., 2019), and other two implementations of PMOE with Gumbel-Softmax (Jang et al., 2017; Maddison et al., 2017) and REINFORCE (Williams,  ...
arXiv:2104.09122v1 fatcat:vp7pnxndtvffzf7kpe74zshdsy

A Boolean Task Algebra for Reinforcement Learning [article]

Geraud Nangue Tasse, Steven James, Benjamin Rosman
2020 arXiv   pre-print
The ability to compose learned skills to solve new tasks is an important property of lifelong-learning agents. In this work, we formalise the logical composition of tasks as a Boolean algebra.  ...  We then show that by learning goal-oriented value functions and restricting the transition dynamics of the tasks, an agent can solve these new tasks with no further learning.  ...  We used the ADAM optimiser with batch size 32 and a learning rate of $10^{-4}$. We trained every 4 timesteps and update the target Q-network every 1000 steps.  ...
arXiv:2001.01394v2 fatcat:6xft2o3oinhodpeo3esvone55a

OASIcs, Volume 36, MCPS'14, Complete Volume [article]

Volker Turau, Marta Kwiatkowska, Rahul Mangharam, Christoph Weyer
Our team at the MD PnP Program has had the pleasure of working with many generous and brilliant collaborators.  ...  Determination whether sharing of the above with stakeholders spanning the clinical, technical, and management domains enable clearer communication, e.g., to support decision making.  ...
doi:10.4230/oasics.mcps.2014 fatcat:gkvpwkcfkbcl3gjpr7b24e3ky4

Technical sessions

2007 2007 12th IEEE Symposium on Computers and Communications  
Simulation results are given illustrating the behavior of the new signaling overload control scheme and comparing it with existing schemes in the literature.  ...  This paper investigates three joint call admission control algorithms (JCAC) for heterogeneous cellular networks.  ...  When the networks compose, the registries also have to compose. This composition is based on a composition agreement that must be negotiated between the different parties involved.  ... 
doi:10.1109/iscc.2007.4381658 fatcat:fu7k2cap6bct5edk63zejhtk7q

The FORA Fog Computing Platform for Industrial IoT

Paul Pop, Wilfried Steiner, Jan Ruh, Sasikumar Punnekkat, Stefan Schulte, Mohammadreza Barzegaran, Bahram Zarrin
2021 Zenodo  
Instead, resources are composed, i.e., combined with each other.  ...  In our case, FN is equipped with a Commercial Off-The-Shelf (COTS) multicore processor (MCP), accelerators, such as FPGAs, for machine learning, and advanced wired and wireless networking capabilities.  ...
doi:10.5281/zenodo.5856300 fatcat:5vlcbt45ljdppfhvz2iv7qmmh4

Catch Carry: Reusable Neural Controllers for Vision-Guided Whole-Body Tasks [article]

Josh Merel, Saran Tunyasuvunakool, Arun Ahuja, Yuval Tassa, Leonard Hasenclever, Vu Pham, Tom Erez, Greg Wayne, Nicolas Heess
2020 arXiv   pre-print
We develop an integrated neural-network based approach consisting of a motor primitive module, human demonstrations, and an instructed reinforcement learning regime with curricula and task variations.  ...  The resulting controllers can be deployed in real-time on a standard PC. See overview video, .  ...  Although Peng et al. 2019 reported that their multiplicative compositional policies (MCP) network structure performed better than an MLP in their settings, our own exploratory investigation of the reported  ... 
arXiv:1911.06636v2 fatcat:zjtpjueewbafjgqi5grgr6dcra

The FORA Fog Computing Platform for Industrial IoT [article]

Paul Pop, Bahram Zarrin, Mohammadreza Barzegaran, Stefan Schulte, Sasikumar Punnekkat, Jan Ruh, Wilfried Steiner
2020 arXiv   pre-print
802.1 Time-Sensitive Networking (TSN) and OPC Unified Architecture (OPC UA); mechanisms for resource management and orchestration; and services for security, fault tolerance and distributed machine learning  ...  The FCP is based on: deterministic virtualization that reduces the effort required for safety and security assurance; middleware for supporting both critical control and dynamic Fog applications; deterministic  ...  Instead, resources are composed, i.e., combined with each other.  ...
arXiv:2007.02696v1 fatcat:z3nbrimj6vbq3e3xhe34gm7umq

From programme theory to logic models for multispecialty community providers: a realist evidence synthesis

Rod Sheaff, Sarah L Brand, Helen Lloyd, Amanda Wanner, Mauro Fornasiero, Simon Briscoe, Jose M Valderas, Richard Byng, Mark Pearson
2018 Health Services and Delivery Research  
The NHS policy of constructing multispecialty community providers (MCPs) rests on a complex set of assumptions about how health systems can replace hospital use with enhanced primary care for people with  ...  Design Realist synthesis with a three-stage method: (1) for policy documents, elicit the IPT underlying the MCP policy, (2) review and synthesise secondary evidence relevant to those assumptions and (3  ...  or MSCP or PACS) and (NHS or "national health service*" or UK or "united kingdom*" or england* or wales* or scotland* or ireland*) ) OR AB ( (MCP or MSCP or PACS) and (NHS or "national health service*  ... 
doi:10.3310/hsdr06240 fatcat:mo7ld2egxzewrf5ctnn2n6b3i4

A systems thinking approach to understanding the challenges of achieving the circular economy

Eleni Iacovidou, John N. Hahladakis, Phil Purnell
2020 Environmental science and pollution research international  
Presently, discussions are mostly concerned with the importance of achieving CE and the benefits associated therewith, with the various barriers surrounding its implementation being less debated.  ...  This, in turn, can help to align priorities and transform our current practices, speeding up the process of closing the MCP loops in a sustainable manner.  ...  Subsequently, the alignment of businesses' priorities with those at the policy spheres and the introduction of policy mechanisms that can promote radical innovation and change in the way MCPs are used,  ... 
doi:10.1007/s11356-020-11725-9 pmid:33289042 fatcat:f4dh7m6xtvdz5pjrxgm6djfu5i

Dynamics-Regulated Kinematic Policy for Egocentric Pose Estimation [article]

Zhengyi Luo, Ryo Hachiuma, Ye Yuan, Kris Kitani
2021 arXiv   pre-print
We evaluate our egocentric pose estimation method in both controlled laboratory settings and real-world scenarios.  ...  As studied in MCP [35], this hierarchical control policy increases the model's capacity to learn multiple skills simultaneously.  ...  Our Universal Humanoid Controller (UHC)'s workflow and architecture can be seen in Fig. 4. $\pi_{\mathrm{UHC}}(a_t \mid s_t, q_{t+1})$ is implemented as a multiplicative compositional policy (MCP) [35] with eight motion  ...
arXiv:2106.05969v2 fatcat:p3jlnm7kbrgo3nynjvthtktebq


Dedong Wang, Shaoze Fang, Hongwei Fu
2019 Journal of Civil Engineering and Management  
TC2 It is easy to gauge market competition and policy. TC3 It is easy to assess how well each MCP participant is doing.  ...  In MCPs, composite quasi-rent can be considered the investment in specialized assets to support the construction of the MCPs, it may lead to additional transaction costs.  ... 
doi:10.3846/jcem.2019.9621 fatcat:23i5c45zcrajhg7i6rcynbljqi