TVDIM: Enhancing Image Self-Supervised Pretraining via Noisy Text Data [article]

Pengda Qin, Yuhong Li, Kefeng Deng, Qiang Wu
2021 arXiv pre-print
Inspired by this, we propose a novel self-supervised learning method, named Text-enhanced Visual Deep InfoMax (TVDIM), to learn better visual representations by fully utilizing the naturally-existing multimodal ... Experimental results show that TVDIM significantly outperforms previous visual self-supervised methods when processing the same set of images. ... The first stage is self-supervised training of the image encoder via the proposed TVDIM. ...
arXiv:2106.01797v2 fatcat:ut3dcxos7bhelb6vllwo5dicti