Learning Progress Driven Multi-Agent Curriculum [article]

Wenshuai Zhao, Zhiyuan Li, Joni Pajarinen
2024 arXiv   pre-print
Curriculum reinforcement learning (CRL) aims to speed up learning by gradually increasing the difficulty of a task, usually quantified by the achievable expected return. Inspired by the success of CRL in single-agent settings, a few works have attempted to apply CRL to multi-agent reinforcement learning (MARL) using the number of agents to control task difficulty. However, existing works typically use manually defined curricula such as a linear scheme. In this paper, we first apply
more » ... art single-agent self-paced CRL to sparse reward MARL. Although with satisfying performance, we identify two potential flaws of the curriculum generated by existing reward-based CRL methods: (1) tasks with high returns may not provide informative learning signals and (2) the exacerbated credit assignment difficulty in tasks where more agents yield higher returns. Thereby, we further propose self-paced MARL (SPMARL) to prioritize tasks based on learning progress instead of the episode return. Our method not only outperforms baselines in three challenging sparse-reward benchmarks but also converges faster than self-paced CRL.
arXiv:2205.10016v2 fatcat:ryzq7ehhtva27cd223hvdokeaa