A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is application/pdf.
Towards Knowledge-Augmented Visual Question Answering
2020
Proceedings of the 28th International Conference on Computational Linguistics
Visual Question Answering (VQA) remains algorithmically challenging, even though it is effortless for humans. Humans combine visual observations with general and commonsense knowledge to answer a question about a given image. In this paper, we address the problem of incorporating general knowledge into VQA models while leveraging the visual information. We propose a model that captures the interactions between objects in a visual scene and entities in an external knowledge source. Our model is a …
doi:10.18653/v1/2020.coling-main.169
fatcat:gtrquytov5fvvnvk36zafetwta