Graph-based Cross Entropy method for solving multi-robot decentralized POMDPs
2016
2016 IEEE International Conference on Robotics and Automation (ICRA)
This paper introduces a probabilistic algorithm for multi-robot decision-making under uncertainty, which can be posed as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP). ...
This paper proposes a cross-entropy based Dec-POSMDP algorithm motivated by the combinatorial optimization literature. ...
G-DICE: The CE method is extended to solve Dec-POSMDPs while using FSAs for policy representation. The resulting algorithm is called Graph-based Direct Cross Entropy (G-DICE). ...
doi:10.1109/icra.2016.7487751
dblp:conf/icra/OmidshafieiAALH16
fatcat:t7ebsbffvzbejjn54divfywmfy
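The snippets above describe a cross-entropy (CE) search over finite-state automaton (FSA) policies. The following is a minimal sketch of the generic CE method applied to a discrete choice per controller node, not the paper's G-DICE implementation. The function name `cross_entropy_search`, its parameters, and the toy objective are assumptions made for illustration. In a real Dec-POSMDP setting, `evaluate` would be a simulated policy rollout.

```python
import random


def cross_entropy_search(num_nodes, num_actions, evaluate, iterations=30,
                         samples=50, elite=10, alpha=0.3, seed=0):
    """Generic CE search: one categorical distribution per controller node."""
    rng = random.Random(seed)
    # Start from a uniform distribution over actions at every node.
    probs = [[1.0 / num_actions] * num_actions for _ in range(num_nodes)]
    best, best_value = None, float("-inf")
    for _ in range(iterations):
        # Sample candidate node-to-action assignments and score each one.
        scored = []
        for _ in range(samples):
            cand = tuple(rng.choices(range(num_actions), weights=p)[0]
                         for p in probs)
            scored.append((evaluate(cand), cand))
        scored.sort(key=lambda vc: vc[0], reverse=True)
        if scored[0][0] > best_value:
            best_value, best = scored[0]
        # Move each distribution toward the action frequencies of the elite set.
        elites = [cand for _, cand in scored[:elite]]
        for n in range(num_nodes):
            for a in range(num_actions):
                freq = sum(1 for c in elites if c[n] == a) / len(elites)
                probs[n][a] = (1 - alpha) * probs[n][a] + alpha * freq
    return best, best_value, probs


# Toy objective (hypothetical): number of nodes matching a hidden target.
target = (2, 0, 1, 2)


def toy_value(cand):
    return sum(x == y for x, y in zip(cand, target))


best, best_value, probs = cross_entropy_search(4, 3, toy_value)
print(best, best_value)
```

The smoothing factor `alpha` keeps every action's probability above zero, which is one common way to reduce premature convergence in CE-style searches.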
Decentralized multi-robot cooperation with auctioned POMDPs
2013
The international journal of robotics research
To address this issue, this paper proposes to decentralize multi-robot partially observable Markov decision processes (POMDPs) while maintaining cooperation between robots by using POMDP policy auctions ...
We address this issue by exploiting a decentralized data fusion method in order to efficiently maintain a joint belief state among the robots. ...
planning under uncertainty; Section 4 proposed a role-based model for multi-robot planning; Section 5 describes the algorithms for auctioning POMDPs in a decentralized manner and the overall overview ...
doi:10.1177/0278364913483345
fatcat:gt4ge4kj5fdoratqlio6jqgjou
Scalable accelerated decentralized multi-robot policy search in continuous observation spaces
2017
2017 IEEE International Conference on Robotics and Automation (ICRA)
This paper presents the first ever approach for solving continuous-observation Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and their semi-Markovian counterparts, Dec-POSMDPs ...
An SK-FSA search algorithm titled Entropy-based Policy Search using Continuous Kernel Observations (EPSCKO) is introduced and applied to the first ever continuous-observation Dec-POMDP/Dec-POSMDP domain ...
[3], MacDec-POMDP Heuristic Search (MDHS) [2], and Graph-based Direct Cross Entropy method (G-DICE) [9]. ...
doi:10.1109/icra.2017.7989106
dblp:conf/icra/OmidshafieiALEH17
fatcat:khta32qorrb4pohgsvnhtt2woi
Decentralized control of multi-robot partially observable Markov decision processes using belief space macro-actions
2017
The international journal of robotics research
Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for multi-robot coordination problems. ...
This work extends the Dec-POMDP model to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP) to take advantage of high-level representations that are natural for multi-robot ...
The authors gratefully acknowledge the anonymous reviewers for their highly insightful comments and feedback. ...
doi:10.1177/0278364917692864
fatcat:ignx7d7efzfk5fjp25czzyxy4q
Near-Optimal Adversarial Policy Switching for Decentralized Asynchronous Multi-Agent Systems
[article]
2017
arXiv
pre-print
A key challenge in multi-robot and multi-agent systems is generating solutions that are robust to other self-interested or even adversarial parties who actively try to prevent the agents from achieving ...
This is achieved via a recently developed graph-based direct cross-entropy (G-DICE) stochastic optimization method of [8]. ...
The weights $w = \{\{\lambda^{k}_{ij}\}, \{\delta^{k}_{ij}(t)\}\}$ can then be optimized via simulation using the graph-based direct cross-entropy (G-DICE) optimization method described in [8] (see Figure 1). ...
arXiv:1710.06525v1
fatcat:wwuz5ms3f5f3phbhvx32o4sa6e
SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning
[article]
2021
arXiv
pre-print
As one of the solutions to the decentralized partially observable Markov decision process (Dec-POMDP) problems, the value decomposition method has achieved significant results recently. ...
SIDE can be extended to any value decomposition method to tackle partially observable problems. ...
SMAC is a multi-agent testbed dedicated to solving Dec-POMDP problems. ...
arXiv:2105.06228v2
fatcat:5uxlesgmtrgtjdym5dywlh6efq
COG-DICE: An Algorithm for Solving Continuous-Observation Dec-POMDPs
2017
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
The decentralized partially observable Markov decision process (Dec-POMDP) is a powerful model for representing multi-agent problems with decentralized behavior. ...
Unfortunately, current Dec-POMDP solution methods cannot solve problems with continuous observations, which are common in many real-world domains. ...
The Graph-Based Direct Cross-Entropy (G-DICE) algorithm is a state-of-the-art method that is scalable and can solve infinite-horizon problems with continuous state spaces [Omidshafiei et al., 2016]. ...
doi:10.24963/ijcai.2017/638
dblp:conf/ijcai/Clark-TurnerA17
fatcat:v473wfxgvne2hg7xfx3a4gnlyu
Information Gathering in Decentralized POMDPs by Policy Graph Improvement
[article]
2019
arXiv
pre-print
Decentralized partially observable Markov decision processes (Dec-POMDPs) are a general, principled model well-suited for such decentralized multiagent decision-making problems. ...
In this paper, we investigate Dec-POMDPs for decentralized information gathering problems. An optimal solution of a Dec-POMDP maximizes the expected sum of rewards over time. ...
The two heuristic methods are joint equilibrium based search for policies (JESP) [13] and direct cross-entropy policy search (DICEPS) [16]. ...
arXiv:1902.09840v1
fatcat:qlw3krbnwjex7pel2av7k6k3le
Multi-agent active information gathering in discrete and continuous-state decentralized POMDPs by policy graph improvement
2020
Autonomous Agents and Multi-Agent Systems
The Dec-POMDP is a principled model for co-operative decentralized multi-agent decision-making. ...
In contrast to most prior work on Dec-POMDPs, we set the reward as a non-linear function of the agents' state information, for example the negative Shannon entropy. ...
The two heuristic methods are joint equilibrium based search for policies, JESP [33], and direct cross-entropy policy search, DICEPS [40]. ...
doi:10.1007/s10458-020-09467-6
fatcat:weoo6jd3czdinljhvxfifnkyz4
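The entry above mentions a reward that is a non-linear function of the agents' state information, such as the negative Shannon entropy. The following is a minimal sketch of that reward for a discrete belief. The function name `negative_entropy` and the example beliefs are assumptions for illustration, not taken from the paper.

```python
import math


def negative_entropy(belief):
    """Return sum_i p_i * log(p_i) for a discrete belief (natural log)."""
    total = sum(belief)
    if total <= 0:
        raise ValueError("belief must have positive total mass")
    # Normalize defensively; terms with zero probability contribute nothing.
    return sum((p / total) * math.log(p / total) for p in belief if p > 0)


certain = negative_entropy([1.0, 0.0, 0.0, 0.0])      # fully informed belief
uncertain = negative_entropy([0.25, 0.25, 0.25, 0.25])  # uniform belief
print(certain, uncertain)
```

A concentrated belief scores higher than a uniform one, so maximizing this reward favors actions that reduce uncertainty.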
FACMAC: Factored Multi-Agent Centralised Policy Gradients
[article]
2021
arXiv
pre-print
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces. ...
Like MADDPG, a popular multi-agent actor-critic method, our approach uses deep deterministic policy gradients to learn policies. ...
Specifically, we implement IQL-CEM, which uses the cross-entropy method (CEM; De Boer et al. 2005) to perform approximate greedy action selection. ...
arXiv:2003.06709v5
fatcat:r2m3ogppmfb7bihzpm5ik5pgay
Decentralized Reinforcement Learning for Multi-Target Search and Detection by a Team of Drones
[article]
2021
arXiv
pre-print
In this paper we develop a multi-agent deep reinforcement learning (MADRL) method to coordinate a group of aerial vehicles (drones) for the purpose of locating a set of static targets in an unknown area ...
In contrast to other state-of-the-art MADRL methods, our method is fully decentralized during both learning and execution, can handle high-dimensional and continuous observation spaces, and does not require ...
Our method, called Decentralized Advantage Actor-Critic (DA2C), is based on extending the A2C algorithm [14] to the multi-agent case. ...
arXiv:2103.09520v1
fatcat:3cxglomxszatdhrtt7vg3mxvvm
Task-Oriented Active Sensing via Action Entropy Minimization
2019
IEEE Access
The proposed method is validated via simulations. ...
This is reasonable for applications where the goal is to obtain information. ...
This paper was presented in part at the IEEE/RSJ International Conference on Intelligent Robots and Systems, Daejeon, Korea, 2016. ...
doi:10.1109/access.2019.2941706
pmid:31737464
pmcid:PMC6857841
fatcat:za2762u2mrelffqlo7thu5igqq
Semantic-level decentralized multi-robot decision-making using probabilistic macro-observations
2017
2017 IEEE International Conference on Robotics and Automation (ICRA)
This paper formalizes the concept of macro-observations in Decentralized Partially Observable Semi-Markov Decision Processes (Dec-POSMDPs), allowing scalable semantic-level multi-robot decision making. ...
To the best of our knowledge, this is the first demonstration of a realtime, convolutional neural net-based classification framework running fully onboard a team of quadrotors in a multi-robot decision-making ...
Perception data is collected and used to train the HBNI-based macro-observation process, which is then used for Dec-POSMDP policy search via the Graph-based Direct Cross Entropy algorithm [11]. ...
doi:10.1109/icra.2017.7989107
dblp:conf/icra/OmidshafieiLELA17
fatcat:l4pxld4rjjbc5kbwqtg5fbcncy
Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning
[article]
2022
arXiv
pre-print
This paper proposes an implicit model-based multi-agent reinforcement learning method based on value decomposition methods. ...
The significant compounding error may hinder the learning process when model-based methods are applied to multi-agent tasks. ...
for continuous cooperative multi-agent robotic control in the multi-agent field, called Multi-Agent MuJoCo. ...
arXiv:2204.09418v2
fatcat:movcfvexe5g53cxs3b53usl3la
Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial
[article]
2020
arXiv
pre-print
We expect this tutorial to stimulate more research endeavors to build scalable and decentralized systems based on MARL. ...
In this context, this tutorial focuses on the role of DRL with an emphasis on deep Multi-Agent Reinforcement Learning (MARL) for AI-enabled 6G networks. ...
For example, Cross Entropy Method (CEM) is a famous planning approach to escape local optima which shooting methods suffer from. ...
arXiv:2011.03615v1
fatcat:zzbotslc3vczxmwf72y43k3ica