A multiagent reinforcement learning algorithm by dynamically merging Markov decision processes

Mohammad Ghavamzadeh, Sridhar Mahadevan
2002 Proceedings of the first international joint conference on Autonomous agents and multiagent systems part 2 - AAMAS '02  
doi:10.1145/544932.544940 fatcat:ri5enaepczbrbgbdpqofpgj5ji