A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit <a rel="external noopener" href="https://arxiv.org/pdf/1707.00811v1.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
One-Shot Fine-Grained Instance Retrieval
[article]
<span title="2017-07-04">2017</span>
<i >
arXiv
</i>
<span class="release-stage" >pre-print</span>
Fine-Grained Visual Categorization (FGVC) has achieved significant progress recently. However, the number of fine-grained species could be huge and dynamically increasing in real scenarios, making it difficult to recognize unseen objects under the current FGVC framework. This raises the open problem of performing large-scale fine-grained identification without a complete training set. To address this problem, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR).
<span class="external-identifiers">
<a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1707.00811v1">arXiv:1707.00811v1</a>
<a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/zfnpunlofrbspkgpz755xe5p2a">fatcat:zfnpunlofrbspkgpz755xe5p2a</a>
</span>
... e-Shot" denotes the ability to identify unseen objects through a fine-grained retrieval task assisted by an incomplete auxiliary training set. This paper first presents a detailed description of the OSFGIR task and our collected OSFGIR-378K dataset. Next, we propose the Convolutional and Normalization Networks (CN-Nets), learned on the auxiliary dataset, to generate a concise and discriminative representation. Finally, we present a coarse-to-fine retrieval framework consisting of three components: coarse retrieval, fine-grained retrieval, and query expansion. The framework progressively retrieves images with similar semantics and performs fine-grained identification. Experiments show that our OSFGIR framework achieves significantly better accuracy and efficiency than existing FGVC and image retrieval methods, and thus could be a better solution for large-scale fine-grained object identification.
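The three-stage pipeline named in the abstract (coarse retrieval, fine-grained retrieval, query expansion) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how each stage is computed, so the sketch assumes precomputed feature vectors, cosine similarity for both ranking stages, and average query expansion as a stand-in for the paper's query expansion step. The function name `coarse_to_fine_retrieve` and the parameters `shortlist`, `expand_top`, and `top_k` are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def coarse_to_fine_retrieve(query_coarse, query_fine, db_coarse, db_fine,
                            shortlist=100, expand_top=5, top_k=10):
    """Hypothetical three-stage retrieval over precomputed features.

    query_coarse, db_coarse: compact descriptors used for the cheap first pass.
    query_fine, db_fine: richer descriptors used for re-ranking.
    Returns (database indices, final similarity scores), best match first.
    """
    # Stage 1 (assumed): coarse retrieval by cosine similarity over the
    # whole database, keeping only a shortlist of candidates.
    qc = l2_normalize(query_coarse)
    dc = l2_normalize(db_coarse)
    coarse_scores = dc @ qc
    n_candidates = min(shortlist, len(dc))
    candidates = np.argsort(-coarse_scores)[:n_candidates]

    # Stage 2 (assumed): fine-grained re-ranking of the shortlist only.
    qf = l2_normalize(query_fine)
    df = l2_normalize(db_fine[candidates])
    fine_order = np.argsort(-(df @ qf))

    # Stage 3 (assumed): average query expansion. The query is merged with
    # its best-ranked candidates and the shortlist is scored once more.
    m = min(expand_top, len(fine_order))
    expanded = l2_normalize(qf + df[fine_order[:m]].sum(axis=0))
    final_scores = df @ expanded
    final_order = np.argsort(-final_scores)[:top_k]
    return candidates[final_order], final_scores[final_order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    db_coarse = rng.normal(size=(1000, 16))   # stand-in coarse features
    db_fine = rng.normal(size=(1000, 128))    # stand-in fine features
    ids, scores = coarse_to_fine_retrieve(
        db_coarse[42], db_fine[42], db_coarse, db_fine)
    print(ids, scores)
```

In this sketch the expensive fine-grained comparison only runs over the shortlist, which is the usual motivation for a coarse-to-fine design. The features themselves would come from a learned network; random vectors are used here only to make the example runnable.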
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200829000710/https://arxiv.org/pdf/1707.00811v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">
<button class="ui simple right pointing dropdown compact black labeled icon button serp-button">
<i class="icon ia-icon"></i>
Web Archive
[PDF]
<div class="menu fulltext-thumbnail">
<img src="https://blobs.fatcat.wiki/thumbnail/pdf/ac/2a/ac2a3722b025685f840e04d595828bb2ffb6493b.180px.jpg" alt="fulltext thumbnail" loading="lazy">
</div>
</button>
</a>