Learning with Noisy Correspondence for Cross-modal Matching
Neural Information Processing Systems
Cross-modal matching, which aims to establish the correspondence between two different modalities, is fundamental to a variety of tasks such as cross-modal retrieval and vision-and-language understanding. Although a huge number of cross-modal matching methods have been proposed and have achieved remarkable progress in recent years, almost all of these methods implicitly assume that the multimodal training data are correctly aligned. In practice, however, such an assumption is extremely expensive even ...
dblp:conf/nips/HuangNLDXWP21 fatcat:b6wcewyp3zbw7h6htb7r7o7ivm