The file type is application/pdf.
LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval
[article]
2020
arXiv
pre-print
The goal of weakly-supervised video moment retrieval is to localize the video segment most relevant to the given natural language query without access to temporal annotations during training. Prior strongly- and weakly-supervised approaches often leverage co-attention mechanisms to learn visual-semantic representations for localization. However, while such approaches tend to focus on identifying relationships between elements of the video and language modalities, there is less emphasis on
arXiv:1909.13784v2
fatcat:btgosisk6bb4pklnwgpkojk53m