Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning [article]

Xiangxiang Chu, Hangjun Ye
2017 arXiv   pre-print
Deep reinforcement learning for multi-agent cooperation and competition has been a hot topic recently.  ...  Multi agent deep deterministic policy gradient obtained state of art results for some multi-agent games, whereas, it cannot scale well with growing amount of agents.  ...  Secondly, we propose several parameter sharing multi-agent deep deterministic policy gradient variants.  ... 
arXiv:1710.00336v2 fatcat:wsq3yokbpzgwvhntedmymcgg5i

Cooperative Multi-agent Control Using Deep Reinforcement Learning [chapter]

Jayesh K. Gupta, Maxim Egorov, Mykel Kochenderfer
2017 Lecture Notes in Computer Science  
We extend three classes of single-agent deep reinforcement learning algorithms based on policy gradient, temporal-difference error, and actor-critic methods to cooperative multi-agent systems.  ...  Using deep reinforcement learning with a curriculum learning scheme, our approach can solve problems that were previously considered intractable by most multi-agent reinforcement learning algorithms.  ...  The authors would like to thank the anonymous reviewers for their helpful comments.  ... 
doi:10.1007/978-3-319-71682-4_5 fatcat:ie4vvneipjgxbdwngj3bncs6eu

Cooperative Multi-Agent Reinforcement Learning with Approximate Model Learning

Young Joon Park, Young Jae Lee, Seoung Bum Kim
2020 IEEE Access  
Index terms: reinforcement learning, model-free method, multi-agent system, multi-agent cooperation, actor-critic method, deterministic policy gradient.  ...  We propose a method for learning cooperative policies in multi-agent environments by considering the communications among agents.  ...  The multi-agent deep deterministic policy gradient (MADDPG) is an extension of a deep deterministic policy gradient algorithm (DDPG) [22].  ... 
doi:10.1109/access.2020.3007219 fatcat:j2u2j7zb4zgbboip25eqys34xa

Distributed Deep Deterministic Policy Gradient for Power Allocation Control in D2D-Based V2V Communications

Khoi Khac Nguyen, Trung Q. Duong, Ngo Anh Vien, Nhien-An Le-Khac, Long D. Nguyen
2019 IEEE Access  
Index terms: non-cooperative D2D communication, D2D-based V2V communications, power allocation, multi-agent deep reinforcement learning, and deep deterministic policy gradient (DDPG).  ...  In this paper, we present two novel approaches based on deep deterministic policy gradient algorithm, namely "distributed deep deterministic policy gradient" and "sharing deep deterministic policy gradient  ...  for multi-agent deep reinforcement learning problem.  ... 
doi:10.1109/access.2019.2952411 fatcat:hguldp3debfzheugwiane3ptai

Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward [article]

Hassam Ullah Sheikh, Ladislau Bölöni
2020 arXiv   pre-print
To address this problem, we present Decomposed Multi-Agent Deep Deterministic Policy Gradient (DE-MADDPG): a novel cooperative multi-agent reinforcement learning framework that simultaneously learns to  ...  Many cooperative multi-agent problems require agents to learn individual tasks while contributing to the collective success of the group.  ...  DECOMPOSED MULTI-AGENT DEEP DETERMINISTIC POLICY GRADIENT We propose Decomposed Multi-Agent Deep Deterministic Policy Gradient: a multi-agent deep reinforcement learning Four environments for the defensive  ... 
arXiv:2003.10598v1 fatcat:cervcqlhc5dxlfmqrjyl3c5l3u

Scalable Deep Multi-Agent Reinforcement Learning via Observation Embedding and Parameter Noise

Jian Zhang, Yaozong Pan, Haitao Yang, Yuqiang Fang
2019 IEEE Access  
Index terms: artificial intelligence, multi-agent, deep reinforcement learning, deep deterministic policy gradient, actor-critic, centralized training with decentralized execution, observation embedding  ...  In this paper, we explore a scalable deep reinforcement learning (DRL) method for environments with multi-agents.  ...  Acknowledgment The authors acknowledge the financial support received from the research foundation of Space Engineering University, grant number: zx10356, for their support and encouragement in carrying  ... 
doi:10.1109/access.2019.2913235 fatcat:iurw3fsvr5bg5ai3cgqimkn4ni

Multi-Agent Actor-Critic with Generative Cooperative Policy Network [article]

Heechang Ryu, Hayong Shin, Jinkyoo Park
2018 arXiv   pre-print
We propose an efficient multi-agent reinforcement learning approach to derive equilibrium strategies for multi-agents who are participating in a Markov game.  ...  Mainly, we are focused on obtaining decentralized policies for agents to maximize the performance of a collaborative task by all the agents, which is similar to solving a decentralized Markov decision  ...  For example, [23] proposed multi-agent version of deep deterministic policy gradient algorithm to derive agents' policies for both cooperative and competitive MG.  ... 
arXiv:1810.09206v1 fatcat:66pvdm42jnf3jnjqmuk7svo5ym

Survey of Recent Multi-Agent Reinforcement Learning Algorithms Utilizing Centralized Training [article]

Piyush K. Sharma, Rolando Fernandez, Erin Zaroukian, Michael Dorothy, Anjon Basak, Derrik E. Asher
2021 arXiv   pre-print
The goal is to explore how different implementations of information sharing mechanism in centralized learning may give rise to distinct group coordinated behaviors in multi-agent systems performing cooperative  ...  Much work has been dedicated to the exploration of Multi-Agent Reinforcement Learning (MARL) paradigms implementing a centralized learning with decentralized execution (CLDE) approach to achieve human-like  ...  The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory  ... 
arXiv:2107.14316v1 fatcat:n7qmmwwdenfbdngkmzflsqcx7y

Parallel Knowledge Transfer in Multi-Agent Reinforcement Learning [article]

Yongyuan Liang, Bangwei Li
2020 arXiv   pre-print
Multi-agent reinforcement learning is a standard framework for modeling multi-agent interactions applied in real-world scenarios.  ...  Inspired by experience sharing in human groups, learning knowledge parallel reusing between agents can potentially promote team learning performance, especially in multi-task environments.  ...  Reinforcement Learning approach as an extension of policy gradient (PG) [27], which optimizes the policy by updating the policy parameters θ along the gradient direction ∇_θ J(θ): ∇_θ J(θ) = E_{s∼p^π, a∼π_θ}[∇  ... 
arXiv:2003.13085v1 fatcat:mxj6mtulivaj5aomeaojzw5p6q

Decentralized Multi-Agent Advantage Actor-Critic

Scott Barnes
We present a decentralized advantage actor-critic algorithm that utilizes learning agents in parallel environments with synchronous gradient descent.  ...  , and runs on a single multi-core CPU.  ...  We have introduced a decentralized multi-agent actor-critic algorithm that does not require the use of a replay buffer, shared critic, or a dynamic model of the environment.  ... 
doi:10.6084/m9.figshare.13718365.v1 fatcat:qzgaxoqpwze7fkw7d22dbrbwai

A Parallel Evolutionary Algorithm with Value Decomposition for Multi-agent Problems [chapter]

Gao Li, Qiqi Duan, Yuhui Shi
2020 Lecture Notes in Computer Science  
Recently, Reinforcement Learning (RL) has made significant progress on single-agent problems. However, multi-agent problems still cannot be easily solved by traditional RL algorithms.  ...  Our algorithm adopts Evolution Strategies (ES) for optimizing policy which is used to control agents and a value decomposition method for estimating proper fitness for each policy.  ...  Multi-Agent Deep Deterministic Policy Gradient (MADDPG) [19] achieves great performance in a set of simple multi-agent problem environments.  ... 
doi:10.1007/978-3-030-53956-6_57 fatcat:gowdjtn77fdzvfejrtuuk4j2ua

Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces [article]

Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan
2019 arXiv   pre-print
Deep Reinforcement Learning (DRL) has been applied to address a variety of cooperative multi-agent problems with either discrete action spaces or continuous action spaces.  ...  deep parameterized Q-learning method.  ...  Finally we need to compute gradients for deterministic policy network.  ... 
arXiv:1903.04959v1 fatcat:tr27qpwttjgonpnst7amdooclm

Reinforcement Learning in Dynamic Task Scheduling: A Review

Chathurangi Shyalika, Thushari Silva, Asoka Karunananda
2020 SN Computer Science  
This review paper is about a research study that focused on Reinforcement Learning techniques that have been used for dynamic task scheduling.  ...  Reinforcement Learning is an emergent technology which has been able to solve the problem of the optimal task and resource scheduling dynamically.  ...  Thushari Silva and Professor Asoka Karunananda for their massive guidance and commitment throughout the research. Funding The funding is handled by the Authors itself.  ... 
doi:10.1007/s42979-020-00326-5 fatcat:egp6vgpetbcwdasm45vunmo3n4

Multi-vehicle Flocking Control with Deep Deterministic Policy Gradient Method [article]

Yang Lyu, Quan Pan, Jinwen Hu, Chunhui Zhao, Shuai Liu
2018 arXiv   pre-print
Specifically the deep deterministic policy gradient (DDPG) with centralized training and distributed execution process is implemented to obtain the flocking control policy.  ...  In this paper the Multi-vehicles System (MVS) flocking control with collision avoidance and communication preserving is considered based on the deep reinforcement learning framework.  ...  ACKNOWLEDGMENT The authors would like to thank the National Research Foundation, Keppel Corporation, and National University of Singapore for supporting this work done in the Keppel-NUS Corporate Laboratory  ... 
arXiv:1806.00196v1 fatcat:gzlzaazffzdynkch4gzv4b7thy

Multi-Agent Deep Reinforcement Learning with Adaptive Policies [article]

Yixiang Wang, Feng Wu
2019 arXiv   pre-print
We propose a novel approach to address one aspect of the non-stationarity problem in multi-agent reinforcement learning (RL), where the other agents may alter their policies due to environment changes  ...  We empirically evaluated our method on a variety of common benchmark problems proposed for multi-agent deep RL in the literature.  ...  Multi-Agent Deep Deterministic Policy Gradient Policy gradient methods are a popular choice for a variety of RL tasks.  ... 
arXiv:1912.00949v1 fatcat:tivvmotqwffyfkowmcxid6hjcu