Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning
2018
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Dense video captioning is a newly emerging task that aims at both localizing and describing all events in a video. We identify and tackle two challenges in this task, namely, (1) how to utilize both past and future contexts for accurate event proposal predictions, and (2) how to construct informative input to the decoder for generating natural event descriptions. First, previous works predominantly generate temporal event proposals in the forward direction, which neglects future video context.
doi:10.1109/cvpr.2018.00751
dblp:conf/cvpr/WangJ00X18
fatcat:l3mc7jrzhna4djoe6okz4x7vdy