
Sparse cooperative Q-learning

Jelle R. Kok, Nikos Vlassis
2004 Twenty-first international conference on Machine learning - ICML '04  
In this paper we are interested in using Q-learning to learn the coordinated actions of a group of cooperative agents, using a sparse representation of the joint state-action space of the agents.  ...  Next, we use a coordination-graph approach in which we represent the Q-values by value rules that specify the coordination dependencies of the agents at particular states.  ...  In this paper we describe a multiagent Q-learning technique, called Sparse Cooperative Q-learning, that allows a group of agents to learn how to jointly solve a task when the global coordination requirements  ... 
doi:10.1145/1015330.1015410 dblp:conf/icml/KokV04 fatcat:ectjh33w3bhujjg3qevz6qyll4
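The value-rule decomposition described in this abstract can be sketched in miniature: a hypothetical two-agent coordination graph with a single edge, where the global Q-value is the (here trivial) sum of edge-local components, each updated by ordinary Q-learning on the joint temporal-difference error. All names and constants below are illustrative, not taken from the paper.

```python
import itertools
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9
ACTIONS = [0, 1]                      # per-agent action set
q_edge = defaultdict(float)           # (state, a0, a1) -> edge-local Q-value

def joint_q(state, a0, a1):
    return q_edge[(state, a0, a1)]    # one edge, so the global Q is this component

def best_joint(state):
    # Exhaustive joint maximisation; feasible here because there are 2 agents.
    return max(itertools.product(ACTIONS, ACTIONS),
               key=lambda ja: joint_q(state, *ja))

def update(state, a0, a1, reward, next_state):
    # Standard Q-learning backup applied to the edge-local component.
    target = reward + GAMMA * joint_q(next_state, *best_joint(next_state))
    td_error = target - joint_q(state, a0, a1)
    q_edge[(state, a0, a1)] += ALPHA * td_error

# Toy transition: joint action (1, 1) in state "s" yields reward 1.
update("s", 1, 1, 1.0, "s2")
```

With more agents, the edge sum and the joint maximisation would be computed per value rule (e.g. by variable elimination over the coordination graph), which is where the sparse representation pays off.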

S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent Reinforcement Learning? [article]

Shuang Luo, Yinchuan Li, Jiahui Li, Kun Kuang, Furui Liu, Yunfeng Shao, Chao Wu
2022 arXiv   pre-print
To this end, we propose a sparse state based MARL (S2RL) framework, which utilizes a sparse attention mechanism to discard irrelevant information in local observations.  ...  Collaborative multi-agent reinforcement learning (MARL) has been widely used in many practical applications, where each agent makes a decision based on its own observation.  ...  To better learn the role of entities in credit assignment, we use a mixing network to estimate the global Q-values Q_tot^dense and Q_tot^sparse, using per-agent utilities Q_i^dense  ... 
arXiv:2206.11054v1 fatcat:qqxdktxfabh5fktsi35bvxgoc4
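The sparse-attention idea in the S2RL abstract can be illustrated with a minimal top-k masking sketch (an assumption-laden stand-in, not the paper's architecture): attention weights over the entities in an agent's local observation are computed densely, then all but the k largest are zeroed out and the survivors renormalised.

```python
import numpy as np

def sparse_attention(query, keys, k=2):
    """Keep only the k most relevant entities; zero out the rest."""
    scores = keys @ query / np.sqrt(query.size)   # scaled dot-product scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # dense softmax weights
    kept = np.argsort(weights)[-k:]               # indices of the top-k entities
    mask = np.zeros_like(weights)
    mask[kept] = weights[kept]                    # discard the irrelevant entities
    return mask / mask.sum()                      # sparse, renormalised weights

rng = np.random.default_rng(0)
w = sparse_attention(rng.normal(size=4), rng.normal(size=(5, 4)), k=2)
```

The resulting weight vector has exactly k nonzero entries and still sums to one, so it can drop into any attention-weighted aggregation unchanged.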

Cooperative Multi-Agent Policy Gradients with Sub-optimal Demonstration [article]

Peixi Peng, Junliang Xing
2021 arXiv   pre-print
Many real-world tasks such as robot coordination can be naturally modelled as multi-agent cooperative systems where the rewards are sparse.  ...  To learn the multi-agent cooperation effectively and tackle the sub-optimality of demonstration, a self-improving learning method is proposed: On the one hand, the centralized state-action values are initialized  ...  Sparse cooperative Q-learning (Kok and Vlassis 2006) proposes a tabular Q-learning algorithm that learns to coordinate the actions of a group of cooperative agents only in the states where such coordination  ... 
arXiv:1812.01825v2 fatcat:phdg7eaofjdy7k452gfk465uii

On Solving Cooperative MARL Problems with a Few Good Experiences [article]

Rajiv Ranjan Kumar, Pradeep Varakantham
2020 arXiv   pre-print
Cooperative Multi-agent Reinforcement Learning (MARL) is crucial for cooperative decentralized decision learning in many domains such as search and rescue, drone surveillance, package delivery and fire  ...  Learning decisions with a few good experiences is extremely challenging in cooperative MARL problems for three reasons.  ...  SIL for Cooperative MARL: Self-imitation learning in the single-agent case imitates past good experiences multiple times (based on priority) and prioritizes learning with those good experiences.  ... 
arXiv:2001.07993v1 fatcat:urd64oexsvg6lfrdmienehzpdm

MARL-Based Dual Reward Model on Segmented Actions for Multiple Mobile Robots in Automated Warehouse Environment

Hyeoksoo Lee, Jiwoo Hong, Jongpil Jeong
2022 Applied Sciences  
makes learning start earlier even when there is a sparse reward problem, and learning progress is maintained stably.  ...  problems when learning paths in a warehouse with reinforcement learning.  ...  Sparse Reward and Improvement Methods: Reinforcement learning often does not go well because of the sparse reward problem.  ... 
doi:10.3390/app12094703 fatcat:ta2ypf7uqvhzveko4pvj56m3oq

A Real-World-Oriented Multi-Task Allocation Approach Based on Multi-Agent Reinforcement Learning in Mobile Crowd Sensing

Han, Zhang, Wu
2020 Information  
Q-learning.  ...  Secondly, two cooperation mechanisms are proposed for obtaining the stable joint action, which can minimize the total sensing time while meeting the sensing quality constraint, thereby optimizing the sensing  ...  Table 2. The sparse degree of the Q-value.  ... 
doi:10.3390/info11020101 fatcat:a3ackkkv6rbv7cgaloapkhn5li

Context-Aware Sparse Deep Coordination Graphs [article]

Tonghan Wang, Liang Zeng, Weijun Dong, Qianlan Yang, Yang Yu, Chongjie Zhang
2022 arXiv   pre-print
Learning sparse coordination graphs adaptive to the coordination dynamics among agents is a long-standing problem in cooperative multi-agent learning.  ...  We carry out a case study and experiments on the MACO and StarCraft II micromanagement benchmark to demonstrate the dynamics of sparse graph learning, the influence of graph sparseness, and the learning  ...  Sparse cooperative Q-learning (Kok & Vlassis, 2006) learns value functions for sparse coordination graphs, but the graph topology is static and predefined by prior knowledge.  ... 
arXiv:2106.02886v3 fatcat:en5argoocnbrzmjfiz3evp2lmi

Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping [article]

Eugenio Bargiacchi, Timothy Verstraeten, Diederik M. Roijers, Ann Nowé
2020 arXiv   pre-print
Our method outperforms the state-of-the-art sparse cooperative Q-learning algorithm, both on the well-known SysAdmin benchmark and on randomized environments.  ...  We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, for efficient learning in multi-agent Markov decision processes.  ...  Discussion and Conclusions: We have presented a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, which exploits the structure of a coordination graph in an MMDP to both  ... 
arXiv:2001.07527v1 fatcat:vxusvoweyfcbfkjedum6ulwz2a
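Prioritized sweeping itself can be sketched in the single-agent tabular case (the paper's contribution is the cooperative, coordination-graph extension; all names below are illustrative): a learned model plus a priority queue that propagates large value changes backwards to predecessor state-action pairs.

```python
import heapq

GAMMA, THETA = 0.9, 1e-3
Q, model, preds, pq = {}, {}, {}, []   # values, learned model, predecessors, queue

def v(s):
    return max((Q.get((s, a), 0.0) for a in (0, 1)), default=0.0)

def push(s, a):
    # Queue (s, a) if its model-based Bellman error exceeds the threshold.
    r, s2 = model[(s, a)]
    p = abs(r + GAMMA * v(s2) - Q.get((s, a), 0.0))
    if p > THETA:
        heapq.heappush(pq, (-p, s, a))

def observe(s, a, r, s2):
    # Record the (deterministic, for simplicity) transition in the model.
    model[(s, a)] = (r, s2)
    preds.setdefault(s2, set()).add((s, a))
    push(s, a)

def sweep(n=10):
    for _ in range(n):
        if not pq:
            return
        _, s, a = heapq.heappop(pq)
        r, s2 = model[(s, a)]
        Q[(s, a)] = r + GAMMA * v(s2)      # full backup from the model
        for ps, pa in preds.get(s, ()):    # re-prioritise predecessors
            push(ps, pa)

observe("s1", 0, 0.0, "goal")
observe("goal", 0, 1.0, "end")
sweep()
```

After one sweep, the reward at "goal" has already been propagated back to "s1", which is the point of sweeping by priority rather than replaying transitions in visitation order.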

Multiagent Reinforcement Social Learning toward Coordination in Cooperative Multiagent Systems

Jianye Hao, Ho-Fung Leung, Zhong Ming
2014 ACM Transactions on Autonomous and Adaptive Systems  
Coordination in cooperative multiagent systems is an important problem and has received a lot of attention in the multiagent learning literature.  ...  The learning performance of both types of learners is evaluated under a number of challenging deterministic and stochastic cooperative games.  ...  Most previous works rely heavily on the Q-learning algorithm as their basis, and can be considered various modifications of single-agent Q-learning algorithms for cooperative multiagent environments  ... 
doi:10.1145/2644819 fatcat:bjdwsiqhcrh6dfrxk5vrw5ubzm

Using reinforcement learning to autonomously identify the source of errors for agents in a group mission [article]

Keishu Utimula, Ken-taro Hayaschi, Trevor J. Bihl, Kousuke Nakano, Kenta Hongo, Ryo Maezono
2022 arXiv   pre-print
To mitigate the abovementioned shortcoming, we successfully applied the reinforcement learning technique, achieving the maximization of such a sparse value function.  ...  Machine learning was concluded autonomously. The colliding action is the basis for distinguishing the hypotheses.  ...  For such sparse reward optimizations, reinforcement learning can be used as an effective alternative.  ... 
arXiv:2107.09232v3 fatcat:gavx4fzq3bhlzhafpcnrbbhoxy

Table of contents

2020 IEEE Communications Letters  
Luan, 1455  ...  MACHINE LEARNING: Cooperative Spectrum Sensing Meets Machine Learning: Deep Reinforcement Learning Approach, F. Meng and Q. Wu, 1441  ...  Fingerprint Localization, N.-S. Vo, M.-P. Bui, P. Q. Truong, C. Yin, and A. Masaracchia, 1500  ... 
doi:10.1109/lcomm.2020.3000622 fatcat:tlgsphi6crc77hht7ffedd3rqq

Automatic Eigentemplate Learning for Sparse Template Tracker [chapter]

Keiji Sakabe, Tomoyuki Taguchi, Takeshi Shakunaga
2009 Lecture Notes in Computer Science  
Automatic eigentemplate learning is discussed for a sparse template tracker.  ...  Once the eigentemplate learning is accomplished, the sparse template tracker can work with the eigentemplate instead of an adaptive template.  ...  Since the two trackers are built in a common framework of the sparse template tracker, their cooperation can be easily and widely utilized in many applications.  ... 
doi:10.1007/978-3-540-92957-4_62 fatcat:d3eu3wzo5jcvbnckab2hsexvjy

Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction [article]

Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang
2019 arXiv   pre-print
In this paper, we study hierarchical deep MARL in cooperative multiagent problems with sparse and delayed rewards.  ...  Besides, we propose a new experience replay mechanism to alleviate the issue of sparse transitions at the high level of abstraction and the non-stationarity of multiagent learning.  ...  This is because it is difficult to learn effective behaviors over primitive actions with such sparse rewards.  ... 
arXiv:1809.09332v2 fatcat:ii73dmrohvhihklyga6xun2qgq

Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems [article]

Jiayu Chen, Yuanxin Zhang, Yuanfan Xu, Huimin Ma, Huazhong Yang, Jiaming Song, Yu Wang, Yi Wu
2021 arXiv   pre-print
We introduce a curriculum learning algorithm, Variational Automatic Curriculum Learning (VACL), for solving challenging goal-conditioned cooperative multi-agent reinforcement learning problems.  ...  Experiment results show that VACL solves a collection of sparse-reward problems with a large number of agents.  ...  (VACL), which solves a collection of sparse-reward multi-agent cooperative problems.  ... 
arXiv:2111.04613v2 fatcat:rsshfxxwrzfbvmjgdbl6vvl5ea

Multi-Agent Incentive Communication via Decentralized Teammate Modeling

Lei Yuan, Jianhao Wang, Fuxiang Zhang, Chenghe Wang, ZongZhang Zhang, Yang Yu, Chongjie Zhang
Effective communication can improve coordination in cooperative multi-agent reinforcement learning (MARL).  ...  Empirical results demonstrate that our method significantly outperforms baselines and achieves excellent performance on multiple cooperative MARL tasks.  ...  Q-learning: Q*(s, a) = r(s, a) + γ E_{s'}[max_{a'} Q*(s', a')]. Deep Q-learning: L(θ) = E_{(τ, a, r, τ') ∈ D}[(r + γ V(τ'; θ⁻) − Q(τ, a; θ))²]. Mixing network: Q_tot(τ, a)  ... 
doi:10.1609/aaai.v36i9.21179 fatcat:xzrpiteyzfdstie6asiydsofm4
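The mixing-network term quoted in this snippet can be illustrated with a minimal monotonic-mixing sketch (QMIX-style; the fixed nonnegative weights below stand in for a learned hypernetwork and are purely illustrative): because Q_tot is monotonically increasing in each per-agent utility, each agent's individually greedy action also maximises the joint Q_tot.

```python
import itertools

import numpy as np

W, B = np.array([0.7, 0.3]), 0.1   # nonnegative weights guarantee monotonicity

def q_tot(utils):
    """Global value as a monotonic mix of per-agent utilities."""
    return float(W @ utils + B)

# Per-agent utility tables: 2 agents x 2 actions each.
Q_I = np.array([[0.2, 0.9],
                [0.5, 0.1]])

# Decentralised greedy actions, chosen independently per agent...
greedy = tuple(int(np.argmax(Q_I[i])) for i in range(2))
# ...coincide with the joint action that maximises the mixed Q_tot.
best = max(itertools.product((0, 1), repeat=2),
           key=lambda ja: q_tot(np.array([Q_I[0][ja[0]], Q_I[1][ja[1]]])))
```

This consistency between decentralised argmaxes and the centralised argmax is what lets such methods train a centralised Q_tot while still executing with fully decentralised policies.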
Showing results 1 — 15 out of 22,372 results