A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is application/pdf
.
Learning Cooperation and Online Planning Through Simulation and Graph Convolutional Network
[article]
2021
arXiv
pre-print
Multi-agent Markov Decision Process (MMDP) has been an effective way of modelling sequential decision making algorithms for multi-agent cooperative environments. A number of algorithms based on centralized and decentralized planning have been developed in this domain. However, dynamically changing environment, coupled with exponential size of the state and joint action space, make it difficult for these algorithms to provide both efficiency and scalability. Recently, Centralized planning
arXiv:2110.08480v1
fatcat:3rw4kdve6zgldpwkmmbosov2x4