Graph-based Cross Entropy method for solving multi-robot decentralized POMDPs

Shayegan Omidshafiei, Ali-akbar Agha-mohammadi, Christopher Amato, Shih-Yuan Liu, Jonathan P. How, John Vian
2016 2016 IEEE International Conference on Robotics and Automation (ICRA)  
This paper introduces a probabilistic algorithm for multi-robot decision-making under uncertainty, which can be posed as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP).  ...  This paper proposes a cross-entropy based Dec-POSMDP algorithm motivated by the combinatorial optimization literature.  ...  G-DICE The CE method is extended to solve Dec-POSMDPs while using FSAs for policy representation. The resulting algorithm is called Graph-based Direct Cross Entropy (G-DICE).  ... 
doi:10.1109/icra.2016.7487751 dblp:conf/icra/OmidshafieiAALH16 fatcat:t7ebsbffvzbejjn54divfywmfy

Decentralized multi-robot cooperation with auctioned POMDPs

Jesus Capitan, Matthijs T.J. Spaan, Luis Merino, Anibal Ollero
2013 The international journal of robotics research  
To address this issue, this paper proposes to decentralize multi-robot partially observable Markov decision processes (POMDPs) while maintaining cooperation between robots by using POMDP policy auctions  ...  We address this issue by exploiting a decentralized data fusion method in order to efficiently maintain a joint belief state among the robots.  ...  planning under uncertainty; Section 4 proposed a role-based model for multi-robot planning; Section 5 describes the algorithms for auctioning POMDPs in a decentralized manner and the overall overview  ... 
doi:10.1177/0278364913483345 fatcat:gt4ge4kj5fdoratqlio6jqgjou

Scalable accelerated decentralized multi-robot policy search in continuous observation spaces

Shayegan Omidshafiei, Christopher Amato, Miao Liu, Michael Everett, Jonathan P. How, John Vian
2017 2017 IEEE International Conference on Robotics and Automation (ICRA)  
This paper presents the first ever approach for solving continuous-observation Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and their semi-Markovian counterparts, Dec-POSMDPs  ...  An SK-FSA search algorithm titled Entropy-based Policy Search using Continuous Kernel Observations (EPSCKO) is introduced and applied to the first ever continuous-observation Dec-POMDP/Dec-POSMDP domain  ...  [3], MacDec-POMDP Heuristic Search (MDHS) [2], and Graph-based Direct Cross Entropy method (G-DICE) [9].  ... 
doi:10.1109/icra.2017.7989106 dblp:conf/icra/OmidshafieiALEH17 fatcat:khta32qorrb4pohgsvnhtt2woi

Decentralized control of multi-robot partially observable Markov decision processes using belief space macro-actions

Shayegan Omidshafiei, Ali-Akbar Agha-Mohammadi, Christopher Amato, Shih-Yuan Liu, Jonathan P. How, John Vian
2017 The international journal of robotics research  
Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for multi-robot coordination problems.  ...  This work extends the Dec-POMDP model to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP) to take advantage of high-level representations that are natural for multi-robot  ...  The authors gratefully acknowledge the anonymous reviewers for their highly insightful comments and feedback.  ... 
doi:10.1177/0278364917692864 fatcat:ignx7d7efzfk5fjp25czzyxy4q

Near-Optimal Adversarial Policy Switching for Decentralized Asynchronous Multi-Agent Systems [article]

Trong Nghia Hoang, Yuchen Xiao, Kavinayan Sivakumar, Christopher Amato, Jonathan How
2017 arXiv   pre-print
A key challenge in multi-robot and multi-agent systems is generating solutions that are robust to other self-interested or even adversarial parties who actively try to prevent the agents from achieving  ...  This is achieved via a recently developed graph-based direct cross-entropy (G-DICE) stochastic optimization method of [8].  ...  The weights w {{λ k i j }, {δ k i j (t)}} can then be optimized via simulation using the graph-based direct cross-entropy (G-DICE) optimization method described in [8] (see Figure 1).  ... 
arXiv:1710.06525v1 fatcat:wwuz5ms3f5f3phbhvx32o4sa6e

SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning [article]

Zhiwei Xu, Yunpeng Bai, Dapeng Li, Bin Zhang, Guoliang Fan
2021 arXiv   pre-print
As one of the solutions to the decentralized partially observable Markov decision process (Dec-POMDP) problems, the value decomposition method has achieved significant results recently.  ...  SIDE can be extended to any value decomposition method to tackle partially observable problems.  ...  SMAC is a multi-agent testbed dedicated to solving Dec-POMDP problems.  ... 
arXiv:2105.06228v2 fatcat:5uxlesgmtrgtjdym5dywlh6efq

COG-DICE: An Algorithm for Solving Continuous-Observation Dec-POMDPs

Madison Clark-Turner, Christopher Amato
2017 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence  
The decentralized partially observable Markov decision process (Dec-POMDP) is a powerful model for representing multi-agent problems with decentralized behavior.  ...  Unfortunately, current Dec-POMDP solution methods cannot solve problems with continuous observations, which are common in many real-world domains.  ...  The Graph-Based Direct Cross-Entropy (G-DICE) algorithm is a state-of-the-art method that is scalable and can solve infinite-horizon problems with continuous state spaces [Omidshafiei et al., 2016].  ... 
doi:10.24963/ijcai.2017/638 dblp:conf/ijcai/Clark-TurnerA17 fatcat:v473wfxgvne2hg7xfx3a4gnlyu

Information Gathering in Decentralized POMDPs by Policy Graph Improvement [article]

Mikko Lauri, Joni Pajarinen, Jan Peters
2019 arXiv   pre-print
Decentralized partially observable Markov decision processes (Dec-POMDPs) are a general, principled model well-suited for such decentralized multiagent decision-making problems.  ...  In this paper, we investigate Dec-POMDPs for decentralized information gathering problems. An optimal solution of a Dec-POMDP maximizes the expected sum of rewards over time.  ...  The two heuristic methods are joint equilibrium based search for policies (JESP) [13] and direct cross-entropy policy search (DICEPS) [16].  ... 
arXiv:1902.09840v1 fatcat:qlw3krbnwjex7pel2av7k6k3le

Multi-agent active information gathering in discrete and continuous-state decentralized POMDPs by policy graph improvement

Mikko Lauri, Joni Pajarinen, Jan Peters
2020 Autonomous Agents and Multi-Agent Systems  
The Dec-POMDP is a principled model for co-operative decentralized multi-agent decision-making.  ...  In contrast to most prior work on Dec-POMDPs, we set the reward as a non-linear function of the agents' state information, for example the negative Shannon entropy.  ...  The two heuristic methods are joint equilibrium based search for policies, JESP [33], and direct cross-entropy policy search, DICEPS [40].  ... 
doi:10.1007/s10458-020-09467-6 fatcat:weoo6jd3czdinljhvxfifnkyz4

FACMAC: Factored Multi-Agent Centralised Policy Gradients [article]

Bei Peng, Tabish Rashid, Christian A. Schroeder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson
2021 arXiv   pre-print
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.  ...  Like MADDPG, a popular multi-agent actor-critic method, our approach uses deep deterministic policy gradients to learn policies.  ...  Specifically, we implement IQL-CEM, which uses the cross-entropy method (CEM De Boer et al. 2005) to perform approximate greedy action selection.  ... 
arXiv:2003.06709v5 fatcat:r2m3ogppmfb7bihzpm5ik5pgay

Decentralized Reinforcement Learning for Multi-Target Search and Detection by a Team of Drones [article]

Roi Yehoshua, Juan Heredia-Juesas, Yushu Wu, Christopher Amato, Jose Martinez-Lorenzo
2021 arXiv   pre-print
In this paper we develop a multi-agent deep reinforcement learning (MADRL) method to coordinate a group of aerial vehicles (drones) for the purpose of locating a set of static targets in an unknown area  ...  In contrast to other state-of-the-art MADRL methods, our method is fully decentralized during both learning and execution, can handle high-dimensional and continuous observation spaces, and does not require  ...  Our method, called Decentralized Advantage Actor-Critic (DA2C), is based on extending the A2C algorithm [14] to the multi-agent case.  ... 
arXiv:2103.09520v1 fatcat:3cxglomxszatdhrtt7vg3mxvvm

Task-Oriented Active Sensing via Action Entropy Minimization

Tipakorn Greigarn, Michael S. Branicky, M. Cenk Cavusoglu
2019 IEEE Access  
The proposed method is validated via simulations.  ...  This is reasonable for applications where the goal is to obtain information.  ...  This paper was presented in part at the IEEE/RSJ International Conference on Intelligence Robots and Systems, Daejeon, Korea, 2016.  ... 
doi:10.1109/access.2019.2941706 pmid:31737464 pmcid:PMC6857841 fatcat:za2762u2mrelffqlo7thu5igqq

Semantic-level decentralized multi-robot decision-making using probabilistic macro-observations

Shayegan Omidshafiei, Shih-Yuan Liu, Michael Everett, Brett T. Lopez, Christopher Amato, Miao Liu, Jonathan P. How, John Vian
2017 2017 IEEE International Conference on Robotics and Automation (ICRA)  
This paper formalizes the concept of macro-observations in Decentralized Partially Observable Semi-Markov Decision Processes (Dec-POSMDPs), allowing scalable semantic-level multi-robot decision making.  ...  To the best of our knowledge, this is the first demonstration of a realtime, convolutional neural net-based classification framework running fully onboard a team of quadrotors in a multi-robot decision-making  ...  Perception data is collected and used to train the HBNI-based macro-observation process, which is then used for Dec-POSMDP policy search via the Graph-based Direct Cross Entropy algorithm [11].  ... 
doi:10.1109/icra.2017.7989107 dblp:conf/icra/OmidshafieiLELA17 fatcat:l4pxld4rjjbc5kbwqtg5fbcncy

Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning [article]

Zhiwei Xu, Dapeng Li, Bin Zhang, Yuan Zhan, Yunpeng Bai, Guoliang Fan
2022 arXiv   pre-print
This paper proposes an implicit model-based multi-agent reinforcement learning method based on value decomposition methods.  ...  The significant compounding error may hinder the learning process when model-based methods are applied to multi-agent tasks.  ...  for continuous cooperative multi-agent robotic control in the multi-agent field, called Multi-Agent MuJoCo.  ... 
arXiv:2204.09418v2 fatcat:movcfvexe5g53cxs3b53usl3la

Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial [article]

Amal Feriani, Ekram Hossain
2020 arXiv   pre-print
We expect this tutorial to stimulate more research endeavors to build scalable and decentralized systems based on MARL.  ...  In this context, this tutorial focuses on the role of DRL with an emphasis on deep Multi-Agent Reinforcement Learning (MARL) for AI-enabled 6G networks.  ...  For example, Cross Entropy Method (CEM) is a famous planning approach to escape local optima which shooting methods suffer from.  ... 
arXiv:2011.03615v1 fatcat:zzbotslc3vczxmwf72y43k3ica