The file type is application/pdf.
Spatial Cross-Attention Improves Self-Supervised Visual Representation Learning
[article]
2022
arXiv
pre-print
Unsupervised representation learning methods like SwAV have proven effective at learning the visual semantics of a target dataset. The main idea behind these methods is that different views of the same image represent the same semantics. In this paper, we introduce an add-on module that injects knowledge of the spatial cross-correlations among samples. This in turn results in distilling intra-class information including feature level locations and cross
arXiv:2206.05028v1
fatcat:zfzelprk4fambjhrmxnmjrk7q4