
Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning [article]

Shuhan Qi, Shuhao Zhang, Xiaohan Hou, Jiajia Zhang, Xuan Wang, Jing Xiao
2022 arXiv   pre-print
In this paper, we design a distributed MARL framework based on the actor-worker-learner architecture.  ...  Moreover, most existing distributed frameworks are designed for single-agent reinforcement learning and are not suitable for multi-agent settings.  ...  We first propose an actor-worker distributed training framework for multiple  ...  The algorithmic training framework is shown in Fig. 3.  ... 
arXiv:2205.05248v1 fatcat:e6yknxvhpzbebfj7kgkbr5tlmi

Adaptive Mechanism Based on Shared Learning in Multi-agent System [chapter]

Qingshan Li, Hua Chu, Liang Diao, Lu Wang
2014 IFIP Advances in Information and Communication Technology  
Based on this, a framework for constructing adaptive systems and a shared learning algorithm for agents are given.  ...  learning in multi-agent systems.  ...  This method defines the knowledge that an agent accumulates in the Q-learning process as an experience tuple, represented as the triple <s, a, Q(s, a)>, where s and a stand for the state and the action, and Q(s, a) for the learned value,  ... 
doi:10.1007/978-3-662-44980-6_13 fatcat:4db5luf3a5hm7p6gycg7di4tze
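The experience-tuple idea above can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: each agent stores its accumulated knowledge as triples (s, a, Q(s, a)), and a peer bootstraps its own table from shared tuples where it has no estimate yet. The names `Agent`, `absorb`, and the toy transition are assumptions of mine.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor

class Agent:
    def __init__(self):
        self.q = defaultdict(float)  # (s, a) -> Q(s, a)

    def update(self, s, a, r, s_next, actions):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(s_next, a2)] for a2 in actions)
        self.q[(s, a)] += ALPHA * (r + GAMMA * best_next - self.q[(s, a)])

    def experience(self):
        # Accumulated knowledge as a list of (s, a, Q(s, a)) triples.
        return [(s, a, q) for (s, a), q in self.q.items()]

    def absorb(self, shared):
        # Adopt a shared Q-value only where we have no estimate yet.
        for s, a, q in shared:
            if (s, a) not in self.q:
                self.q[(s, a)] = q

teacher, learner = Agent(), Agent()
teacher.update("s0", "a0", 1.0, "s1", ["a0", "a1"])
learner.absorb(teacher.experience())
print(learner.q[("s0", "a0")])  # 0.1: the teacher's single update, adopted
```

The "only if absent" rule in `absorb` is one simple merge policy; the paper's actual sharing rule may differ.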

Sparse cooperative Q-learning

Jelle R. Kok, Nikos Vlassis
2004 Twenty-first international conference on Machine learning - ICML '04  
We show how Q-learning can be efficiently applied to learn a coordinated policy for the agents in the above framework.  ...  Next, we use a coordination-graph approach in which we represent the Q-values by value rules that specify the coordination dependencies of the agents at particular states.  ...  Acknowledgments We would like to thank the three reviewers for their detailed and constructive comments.  ... 
doi:10.1145/1015330.1015410 dblp:conf/icml/KokV04 fatcat:ectjh33w3bhujjg3qevz6qyll4
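The value-rule representation mentioned in the snippet can be sketched as follows. This is an illustrative toy, assuming my own rule contents: the joint Q-function is a sum of context-specific rules, each contributing its value only when its condition on the state and a subset of agents' actions holds.

```python
# A rule: (state, partial joint action, value). A rule that mentions
# several agents encodes a coordination dependency between them.
rules = [
    ("s0", {"agent1": "left"}, 3.0),
    ("s0", {"agent1": "left", "agent2": "left"}, 2.0),  # coordination bonus
    ("s0", {"agent2": "right"}, 1.0),
]

def joint_q(state, joint_action):
    # Sum the values of all rules whose context matches the joint action.
    return sum(
        v for (s, partial, v) in rules
        if s == state and all(joint_action.get(k) == a for k, a in partial.items())
    )

print(joint_q("s0", {"agent1": "left", "agent2": "left"}))   # 3.0 + 2.0 = 5.0
print(joint_q("s0", {"agent1": "right", "agent2": "right"})) # only the last rule: 1.0
```

Because most rules involve few agents, the joint maximization can be done efficiently on the coordination graph (e.g. by variable elimination) rather than by enumerating all joint actions, which is the point of the sparse representation.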

Survey on Multi-Agent Q-Learning frameworks for resource management in wireless sensor network [article]

Arvin Tashakori
2021 arXiv   pre-print
After that, the author presented a summary of the Q-Learning algorithm, a well-known classic solution for model-free reinforcement learning problems.  ...  In the third section, the author extended the Q-Learning algorithm to multi-agent scenarios and discussed its challenges.  ...  Independent Agents In [9, 8], the authors put forward the idea of independent learners as a solution to the multi-agent Q-Learning problem in wireless sensor network resource management.  ... 
arXiv:2105.02371v1 fatcat:2iml3yk3svd47j5i7lfexljfie

Asynchronous Methods for Deep Reinforcement Learning [article]

Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
2016 arXiv   pre-print
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers.  ...  We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training, allowing all four methods to successfully  ...  Second, we make the observation that multiple actors  ...  Algorithm 1: Asynchronous one-step Q-learning, pseudocode for each actor-learner thread. // Assume global shared θ, θ⁻, and counter T = 0.  ... 
arXiv:1602.01783v2 fatcat:sz3ut6hkqjatllnu6infdky3nq
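The quoted Algorithm 1 setup (shared θ, target copy θ⁻, global counter T) can be sketched in a toy form. This is a hedged illustration, not the paper's implementation: a tabular one-state problem stands in for the deep Q-network, updates are serialized under a lock for clarity (the paper's version is lock-free), and all constants are my own.

```python
import threading
import random

GAMMA, ALPHA = 0.9, 0.1
T_MAX = 200          # global step budget shared by all threads
TARGET_SYNC = 50     # steps between theta_minus <- theta

theta = [0.0, 0.0]         # "parameters": Q-values of actions 0 and 1
theta_minus = list(theta)  # target parameters theta^-
T = 0                      # global shared counter
lock = threading.Lock()

def actor_learner(seed):
    """One actor-learner thread of asynchronous one-step Q-learning."""
    global T
    rng = random.Random(seed)
    while True:
        with lock:  # serialized for clarity; the original is Hogwild-style
            if T >= T_MAX:
                return
            T += 1
            if T % TARGET_SYNC == 0:
                theta_minus[:] = theta        # periodic target update
            a = rng.randrange(2)              # fully random exploration
            r = 1.0 if a == 1 else 0.0        # action 1 is the better one
            target = r + GAMMA * max(theta_minus)
            theta[a] += ALPHA * (target - theta[a])

threads = [threading.Thread(target=actor_learner, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(theta[1] > theta[0])  # the rewarding action ends with the higher Q-value
```

Even in this toy, the key moving parts are visible: every thread reads and writes the same θ, bootstraps against the slower-moving θ⁻, and the shared counter T both schedules target syncs and bounds total training steps.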

Learning from experts in cognitive radio networks: The docitive paradigm

Ana Galindo-Serrano, Lorenza Giupponi, Pol Blasco, Mischa Dohler
2010 Proceedings of the 5th International ICST Conference on Cognitive Radio Oriented Wireless Networks and Communications  
The docitive paradigm proposes a timely solution based on knowledge sharing, which allows CRs to develop new capacities for selecting actions.  ...  Our goal is to solve the aggregated interference problem generated by multiple CR systems at the receivers of a primary system.  ...  Therefore, Q*(s, a*) is minimal, and can be expressed as: Q*(s, a*) = min_{a∈A_i} Q*(s, a). (5) The Q-value Q(s, a) represents the expected discounted cost for executing action a at state s and  ... 
doi:10.4108/icst.crowncom2010.9173 dblp:conf/crowncom/Galindo-Serrano10 fatcat:zohlse4ivrcjhplevsmvy3vame
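Since Q(s, a) here estimates expected discounted *cost*, both the greedy action and the bootstrap use min instead of max, per Eq. (5). A minimal sketch on a two-state toy problem of my own construction (the costs and transitions are illustrative, not from the paper):

```python
GAMMA, ALPHA = 0.9, 0.5

# cost[s][a] and next-state nxt[s][a] for a 2-state, 2-action toy MDP.
cost = {0: {0: 1.0, 1: 5.0}, 1: {0: 2.0, 1: 0.5}}
nxt  = {0: {0: 1, 1: 0},     1: {0: 0, 1: 1}}

Q = {s: {a: 0.0 for a in (0, 1)} for s in (0, 1)}

for _ in range(500):  # sweep all (s, a) pairs until convergence
    for s in (0, 1):
        for a in (0, 1):
            s2 = nxt[s][a]
            target = cost[s][a] + GAMMA * min(Q[s2].values())  # min, not max
            Q[s][a] += ALPHA * (target - Q[s][a])

# Greedy (cost-minimizing) action in each state, matching Eq. (5):
policy = {s: min(Q[s], key=Q[s].get) for s in (0, 1)}
print(policy)  # {0: 0, 1: 1}: reach state 1 and stay on its cheap action
```

The fixed point checks out by hand: Q*(1,1) = 0.5 / (1 − 0.9) = 5, and Q*(0,0) = 1 + 0.9·5 = 5.5, both smaller than their alternatives.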

Learning and Coordinating Repertoires of Behaviors with Common Reward: Credit Assignment and Module Activation [chapter]

Constantin A. Rothkopf, Dana H. Ballard
2013 Computational and Robotic Models of the Hierarchical Organization of Behavior  
Within this framework, we consider the problem faced by a single agent comprising multiple separate elemental task learners that we call modules, which jointly learn to solve tasks that arise as different  ...  Understanding extended natural behavior will require a theoretical understanding of the entire system as it is engaged in perception and action involving multiple concurrent goals such as foraging for  ...  Proof of Convergence First of all, note that the equations for O r are independent of the Q values and only depend on the sampling strategy.  ... 
doi:10.1007/978-3-642-39875-9_6 fatcat:xn2ynb4i6zgghjdxg4cwov3wh4

A Q-values Sharing Framework for Multiagent Reinforcement Learning under Budget Constraint [article]

Changxi Zhu and Ho-fung Leung and Shuyue Hu and Yi Cai
2020 arXiv   pre-print
We propose a partaker-sharer advising framework (PSAF) for cooperative MARL agents learning under a budget constraint. In PSAF, each Q-learner can decide when to ask for Q-values and when to share its own Q-values.  ...  Evaluation results show that our approach PSAF outperforms existing advising methods under both unlimited and limited budgets, and we give an analysis of the impact of advising actions and sharing Q-values  ...  CONCLUSION AND FURTHER WORK We propose a Q-values sharing framework, PSAF, for multiple decentralized Q-learners learning under a budget constraint.  ... 
arXiv:2011.14281v1 fatcat:xpryux5n6vadje3rvf5vmlez54
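The ask/share decisions under a budget can be sketched as follows. The class name, the visit-count heuristic for "when to ask", and the max-merge rule are all my own illustrative assumptions, not PSAF's actual criteria:

```python
from collections import defaultdict

class BudgetedLearner:
    def __init__(self, ask_budget, share_budget):
        self.q = defaultdict(float)     # (s, a) -> Q-value
        self.visits = defaultdict(int)  # s -> visit count
        self.ask_budget = ask_budget    # how many times we may ask for advice
        self.share_budget = share_budget  # how many times we may answer

    def wants_advice(self, s, threshold=3):
        # Assumed heuristic: ask while the state is still poorly explored.
        return self.ask_budget > 0 and self.visits[s] < threshold

    def share(self, s, actions):
        # Spend one unit of sharing budget to reveal our Q-values for s.
        if self.share_budget <= 0:
            return None
        self.share_budget -= 1
        return {a: self.q[(s, a)] for a in actions}

    def ask(self, peer, s, actions):
        if not self.wants_advice(s):
            return
        advice = peer.share(s, actions)
        if advice is not None:
            self.ask_budget -= 1
            for a, q in advice.items():  # keep the larger estimate
                self.q[(s, a)] = max(self.q[(s, a)], q)

expert = BudgetedLearner(ask_budget=0, share_budget=5)
novice = BudgetedLearner(ask_budget=2, share_budget=0)
expert.q[("s0", "left")] = 0.8
novice.ask(expert, "s0", ["left", "right"])
print(novice.q[("s0", "left")], novice.ask_budget, expert.share_budget)
# 0.8 1 4: the novice adopted the expert's value, and both budgets shrank
```

The point of the budget accounting is that advice is a scarce resource: once either side's counter reaches zero, each agent falls back to plain independent Q-learning.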

Kernel Sharing With Joint Boosting For Multi-Class Concept Detection

Wei Jiang, Shih-Fu Chang, Alexander C. Loui
2007 2007 IEEE Conference on Computer Vision and Pattern Recognition  
In this paper, unlike traditional approaches that independently build binary classifiers to detect individual concepts, we proposed a new framework for multi-class concept detection based on kernel sharing  ...  We demonstrated our approach by developing an extended JointBoost framework, which was used to choose the optimal kernel and the subset of sharing classes in an iterative boosting process.  ...  All concepts C_j ∈ S(t) share this binary classifier and have the same p_t(y_I^j = 1 | I) value. k_c^j(t) is a constant for a concept C_j that is not selected to share the weak learner in iteration t.  ... 
doi:10.1109/cvpr.2007.383483 dblp:conf/cvpr/JiangCL07 fatcat:gutqzospu5avtpop6xkhkjl7zq

Predicting and Preventing Coordination Problems in Cooperative Q-learning Systems

Nancy Fulda, Dan Ventura
2007 International Joint Conference on Artificial Intelligence  
We present a conceptual framework for creating Q-learning-based algorithms that converge to optimal equilibria in cooperative multiagent settings.  ...  This framework includes a set of conditions that are sufficient to guarantee optimal system performance.  ...  Also, the framework presented here makes an underlying assumption of the independence of the individual states s_i.  ... 
dblp:conf/ijcai/FuldaV07 fatcat:433wcsenijeojf5hknsk6w7iqi

Transfer Learning in Multi-Agent Reinforcement Learning with Double Q-Networks for Distributed Resource Sharing in V2X Communication [article]

Hammad Zafar, Zoran Utkovski, Martin Kasparick, Slawomir Stanczak
2021 arXiv   pre-print
This work considers an extension of this framework by combining Double Q-learning (via Double DQN) and transfer learning.  ...  A recent work on the topic proposes a multi-agent reinforcement learning (MARL) approach based on deep Q-learning, which leverages a fingerprint-based deep Q-network (DQN) architecture.  ...  In particular, in TQL, the Q-values of the expert model are transferred to the learner model.  ... 
arXiv:2107.06195v1 fatcat:exivhogmufcvfa556tc3zmmmtm

Multi-class multi-instance boosting for part-based human detection

Yu-Ting Chen, Chu-Song Chen, Yi-Ping Hung, Kuang-Yu Chang
2009 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops  
With the purpose of designing a general learning framework for detecting human parts, we formulate this task as a classification problem over non-aligned training examples of multiple classes.  ...  Second, instead of learning part detectors individually, MCMIBoost learns a unified detector for efficient detection, and uses the feature-sharing concept to design an efficient multi-class classifier.  ...  This work was supported in part by Ministry of Economic Affairs, Taiwan, under Grant No. 97-EC-17-A-02-S1-032.  ... 
doi:10.1109/iccvw.2009.5457475 dblp:conf/iccvw/ChenCHC09 fatcat:skuhg5vpjjgv5ehjngyqpv3cvq

A differentially private distributed data mining scheme with high efficiency for edge computing

Xianwen Sun, Ruzhi Xu, Longfei Wu, Zhitao Guan
2021 Journal of Cloud Computing: Advances, Systems and Applications  
However, data owners may not be willing to share their own data due to privacy concerns.  ...  Each participant builds an elegant decision model based on its own data, which achieves a good tradeoff between computation and the accuracy of the data distribution, and shares it with other participants  ...  Acknowledgments We sincerely thank the Reviewers and the Editor for their valuable suggestions.  ... 
doi:10.1186/s13677-020-00225-3 fatcat:jmdsz3jj75chzpp2mbnmsszjqi

Containerized Distributed Value-Based Multi-Agent Reinforcement Learning [article]

Siyang Wu, Tonghan Wang, Chenghao Li, Yang Hu, Chongjie Zhang
2021 arXiv   pre-print
We propose a containerized learning framework to solve these problems.  ...  We pack several environment instances, a local learner and buffer, and a carefully designed multi-queue manager that avoids blocking, all into one container.  ...  Specifically, agents share a three-layer local Q-network, with a GRU (Cho et al., 2014) between two fully-connected layers, and the global Q-value Q_θ(τ, a), parameterized by θ, is learned as a monotonic  ... 
arXiv:2110.08169v2 fatcat:kahxitntpzhhjpvdy3zexv4igy
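The monotonic mixing the snippet alludes to (QMIX-style) can be sketched without any deep-learning machinery. This is an illustration under my own assumptions: nonnegative mixing weights (here via `abs`) make the global Q monotonic in each local Q, so the greedy joint action decomposes into per-agent greedy choices. The numbers are arbitrary.

```python
def mix(local_qs, raw_weights, bias):
    # Nonnegative weights enforce dQ_tot / dQ_i >= 0 (monotonicity).
    weights = [abs(w) for w in raw_weights]
    return sum(w * q for w, q in zip(weights, local_qs)) + bias

# Per-agent Q-values over 2 actions each:
q1 = [0.2, 1.0]   # agent 1 prefers action 1
q2 = [0.7, 0.1]   # agent 2 prefers action 0
raw_w, b = [-0.5, 2.0], 0.3   # the negative raw weight is rectified by abs

# Per-agent greedy actions:
a1 = max(range(2), key=lambda a: q1[a])
a2 = max(range(2), key=lambda a: q2[a])

# Brute-force best joint action under the mixed global Q:
joint_best = max(
    ((i, j) for i in range(2) for j in range(2)),
    key=lambda ij: mix([q1[ij[0]], q2[ij[1]]], raw_w, b),
)
print((a1, a2) == joint_best)  # True: decentralized argmax matches the global one
```

This argmax-consistency is what lets each agent act greedily on its own local Q-network at execution time while the mixed Q_θ(τ, a) is trained centrally; in the real architecture the mixing weights are produced by a hypernetwork conditioned on the global state rather than being constants.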

Networked Reinforcement Social Learning towards Coordination in Cooperative Multiagent Systems

Jianye Hao, Dongping Huang, Yi Cai, Ho-Fung Leung
2014 2014 IEEE 26th International Conference on Tools with Artificial Intelligence  
We distinguish two types of learners: the individual action learner and the joint action learner.  ...  It is not clear a priori whether all agents can learn a consistent optimal coordination policy and what kind of impact different topology parameters could have on the learning performance of agents.  ...  Claus and Boutilier (1998) first distinguished two different types of learners (without optimistic exploration) based on the Q-learning algorithm, independent learners and joint-action learners, and investigated  ... 
doi:10.1109/ictai.2014.63 dblp:conf/ictai/HaoHCL14 fatcat:qrgbc4mecfgflcritrcfol34aa
Showing results 1 — 15 out of 17,211 results