Conditional random fields for multi-agent reinforcement learning

Xinhua Zhang, Douglas Aberdeen, S. V. N. Vishwanathan
2007 Proceedings of the 24th international conference on Machine learning - ICML '07  
Conditional random fields (CRFs) are graphical models for modeling the probability of labels given the observations. They have traditionally been trained using a set of observation and label pairs. Underlying all CRFs is the assumption that, conditioned on the training data, the labels are independent and identically distributed (iid). In this paper we explore the use of CRFs in a class of temporal learning algorithms, namely policy-gradient reinforcement learning (RL). Now the labels are no longer iid. They are actions that update the environment and affect the next observation. From an RL point of view, CRFs provide a natural way to model joint actions in a decentralized Markov decision process. They define how agents can communicate with each other to choose the optimal joint action. Our experiments cover a synthetic network alignment problem, a distributed sensor network, and road traffic control; in these experiments the CRF approach clearly outperforms RL methods which do not model the proper joint policy.
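As a rough illustration of what modeling a joint action with a CRF can look like, the sketch below defines a log-linear policy over the joint action of three agents on a chain and computes the gradient of its log-probability, the quantity a policy-gradient method would weight by reward. It is not the authors' implementation: the chain graph, the binary actions, the feature functions, and all names (`features`, `joint_policy`, `log_policy_gradient`) are assumptions made for illustration, and the normaliser is computed by brute-force enumeration, which is only feasible for very small problems.

```python
import itertools
import math

# Hypothetical setup: three agents on a chain, each choosing a binary action.
N_AGENTS = 3
ACTIONS = (0, 1)
EDGES = [(0, 1), (1, 2)]  # edges couple the actions of neighbouring agents


def features(obs, joint_action):
    """Feature vector f(obs, a): one node feature per agent, one per edge."""
    # Node feature: the agent's local observation, active when it picks action 1.
    node = [obs[i] if joint_action[i] == 1 else 0.0 for i in range(N_AGENTS)]
    # Edge feature: 1 when two neighbouring agents choose the same action.
    edge = [1.0 if joint_action[i] == joint_action[j] else 0.0 for i, j in EDGES]
    return node + edge


def joint_policy(theta, obs):
    """P(a | obs; theta) proportional to exp(theta . f(obs, a)), by enumeration."""
    joint_actions = list(itertools.product(ACTIONS, repeat=N_AGENTS))
    scores = [
        sum(t * f for t, f in zip(theta, features(obs, a))) for a in joint_actions
    ]
    top = max(scores)  # subtract the maximum score for numerical stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)
    return {a: w / z for a, w in zip(joint_actions, weights)}


def log_policy_gradient(theta, obs, joint_action):
    """Gradient of log P(a | obs; theta): f(obs, a) minus the expected features."""
    probs = joint_policy(theta, obs)
    expected = [0.0] * len(theta)
    for a, p in probs.items():
        for k, f in enumerate(features(obs, a)):
            expected[k] += p * f
    return [f - e for f, e in zip(features(obs, joint_action), expected)]
```

With the edge weights set to zero the policy factorises into independent per-agent policies; non-zero edge weights are what let the model express coordination between neighbouring agents in this toy setting.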
doi:10.1145/1273496.1273640 dblp:conf/icml/ZhangAV07 fatcat:qm5uqf4emzgvbdlhzvssrgwgge