Multi-Hop Knowledge Graph Reasoning with Reward Shaping

Xi Victoria Lin, Richard Socher, Caiming Xiong
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
Multi-hop reasoning is an effective approach for query answering (QA) over incomplete knowledge graphs (KGs). The problem can be formulated in a reinforcement learning (RL) setup, where a policy-based agent sequentially extends its inference path until it reaches a target. However, in an incomplete KG environment, the agent receives low-quality rewards corrupted by false negatives in the training data, which harms generalization at test time. Furthermore, since no golden action sequence is used for training, the agent can be misled by spurious search trajectories that incidentally lead to the correct answer. We propose two modeling advances to address both issues: (1) we reduce the impact of false negative supervision by adopting a pretrained one-hop embedding model to estimate the reward of unobserved facts; (2) we counter the sensitivity of on-policy RL to spurious paths by forcing the agent to explore a diverse set of paths using randomly generated edge masks. Our approach significantly improves over existing path-based KGQA models on several benchmark datasets and is comparable to or better than embedding-based models.
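The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `drop_rate` parameter, and the fallback behavior are assumptions made for illustration, and `embedding_score` stands in for the output of a pretrained one-hop embedding model scaled to [0, 1]. The first function gives full reward for an observed answer and otherwise falls back to the embedding model's estimate; the second samples a random binary mask over outgoing edges and renormalizes the policy over the edges that survive.

```python
import random


def shaped_reward(reached_answer, embedding_score):
    """Reward 1.0 for an observed correct answer; otherwise use the
    (assumed) pretrained embedding model's score in [0, 1]."""
    if not 0.0 <= embedding_score <= 1.0:
        raise ValueError("embedding_score must lie in [0, 1]")
    hit = 1.0 if reached_answer else 0.0
    return hit + (1.0 - hit) * embedding_score


def masked_policy(action_probs, drop_rate, rng):
    """Randomly mask outgoing edges and renormalize the remaining
    probabilities, so sampling explores a more diverse set of paths."""
    keep = [0.0 if rng.random() < drop_rate else 1.0 for _ in action_probs]
    if sum(keep) == 0.0:
        # Assumption: if every edge was masked, fall back to no masking.
        keep = [1.0] * len(action_probs)
    masked = [p * k for p, k in zip(action_probs, keep)]
    total = sum(masked)
    if total == 0.0:
        # Surviving edges all had zero probability; keep the original policy.
        return list(action_probs)
    return [m / total for m in masked]


if __name__ == "__main__":
    rng = random.Random(0)
    print(shaped_reward(False, 0.3))
    print(masked_policy([0.5, 0.3, 0.2], drop_rate=0.5, rng=rng))
```

In this sketch the mask is applied only when sampling actions during training; at test time one would presumably use the unmasked policy.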
doi:10.18653/v1/d18-1362 dblp:conf/emnlp/LinSX18 fatcat:pqubglr3mzdqxaqzckqwcd3eo4