A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021.
The file type is application/pdf.
TVDIM: Enhancing Image Self-Supervised Pretraining via Noisy Text Data
[article]
2021
arXiv
pre-print
Inspired by this, we propose a novel self-supervised learning method, named Text-enhanced Visual Deep InfoMax (TVDIM), to learn better visual representations by fully utilizing the naturally-existing multimodal ...
Experimental results show that TVDIM significantly outperforms previous visual self-supervised methods when processing the same set of images. ...
The first stage is self-supervised training of the image encoder via the proposed TVDIM. ...
arXiv:2106.01797v2
fatcat:ut3dcxos7bhelb6vllwo5dicti