A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL. The file type is application/pdf.
Multimodal Representation Learning via Maximization of Local Mutual Information [article] · 2021 · arXiv pre-print
We propose and demonstrate a representation learning approach that maximizes the mutual information between local features of images and text. The goal of this approach is to learn useful image representations by taking advantage of the rich information contained in the free text that describes the findings in the image. Our method trains image and text encoders by encouraging the resulting representations to exhibit high local mutual information. We make use of recent advances in mutual …
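As a rough illustration of the kind of objective the abstract describes, the sketch below computes an InfoNCE-style lower bound on the mutual information between paired local image and text features. This is a minimal, hedged sketch under assumed shapes and names — not the authors' implementation: it treats one local image feature and one local text feature per example, scores all pairs by cosine similarity, and uses the non-matching pairs in the batch as negatives.

```python
import numpy as np

rng = np.random.default_rng(0)

def info_nce_local(img_feats, txt_feats, temperature=0.1):
    """InfoNCE lower bound on MI between paired local features.

    img_feats: (B, D) one local image feature per example (assumed shape)
    txt_feats: (B, D) the matching local text feature per example
    Positives sit on the diagonal of the score matrix; the other
    entries in each row act as in-batch negatives.
    """
    # Cosine-similarity score matrix between all image/text pairs.
    img = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
    txt = txt_feats / np.linalg.norm(txt_feats, axis=1, keepdims=True)
    scores = img @ txt.T / temperature              # (B, B)
    n = scores.shape[0]
    # log of the softmax denominator for each row.
    logZ = np.log(np.exp(scores).sum(axis=1))
    # InfoNCE estimate: log B + mean log-softmax of the positive pair.
    return np.log(n) + np.mean(np.diag(scores) - logZ)

B, D = 8, 16
txt = rng.normal(size=(B, D))
aligned = txt + 0.05 * rng.normal(size=(B, D))      # correlated "image" features
shuffled = rng.normal(size=(B, D))                  # independent features

# Aligned pairs should yield a higher MI bound than independent ones,
# which is what maximizing this objective exploits during training.
print(info_nce_local(aligned, txt), info_nce_local(shuffled, txt))
```

In a full training setup the two encoders would be optimized to increase this bound, averaged over all local feature locations; the numpy version above only demonstrates the estimator itself.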
arXiv:2103.04537v4