One-Shot Fine-Grained Instance Retrieval

Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
2017 Proceedings of the 2017 ACM on Multimedia Conference - MM '17  
Fine-Grained Visual Categorization (FGVC) has achieved significant progress recently. However, the number of fine-grained species could be huge and dynamically increasing in real scenarios, making it difficult to recognize unseen objects under the current FGVC framework. This raises an open issue to perform large-scale fine-grained identification without a complete training set. Aiming to conquer this issue, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR). "One-Shot" denotes the ability of identifying unseen objects through a fine-grained retrieval task assisted with an incomplete auxiliary training set. This paper first presents the detailed description to OSFGIR task and our collected OSFGIR-378K dataset. Next, we propose the Convolutional and Normalization Networks (CN-Nets) learned on the auxiliary dataset to generate a concise and discriminative representation. Finally, we present a coarse-to-fine retrieval framework consisting of three components, i.e., coarse retrieval, fine-grained retrieval, and query expansion, respectively. The framework progressively retrieves images with similar semantics, and performs fine-grained identification. Experiments show our OSFGIR framework achieves significantly better accuracy and efficiency than existing FGVC and image retrieval methods, thus could be a better solution for large-scale fine-grained object identification.
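
The three-stage framework named in the abstract (coarse retrieval, fine-grained retrieval, query expansion) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the function names are hypothetical, the features are plain Python lists standing in for learned descriptors, cosine similarity is used for ranking, and the expansion step is generic average query expansion, which may differ from the procedure in the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def rank(query, gallery, candidates):
    """Sort candidate indices by descending similarity to the query."""
    return sorted(candidates, key=lambda i: cosine(query, gallery[i]), reverse=True)


def coarse_to_fine_retrieve(coarse_q, fine_q, coarse_gallery, fine_gallery,
                            shortlist=3, expand_k=2):
    """Hypothetical three-stage retrieval: shortlist, re-rank, expand."""
    # Stage 1 (coarse retrieval): keep the images closest to the query
    # under the coarse features.
    all_ids = list(range(len(coarse_gallery)))
    short = rank(coarse_q, coarse_gallery, all_ids)[:shortlist]

    # Stage 2 (fine-grained retrieval): re-rank the shortlist with the
    # fine-grained features.
    reranked = rank(fine_q, fine_gallery, short)

    # Stage 3 (query expansion, assumed here to be average query expansion):
    # average the query with its top results, then rank the shortlist again.
    top = reranked[:expand_k]
    expanded = [
        (fine_q[d] + sum(fine_gallery[i][d] for i in top)) / (1 + len(top))
        for d in range(len(fine_q))
    ]
    return rank(expanded, fine_gallery, short)


# Toy usage with made-up features for four gallery images.
coarse_gallery = [[1, 0], [0.9, 0.1], [0, 1], [0.8, 0.2]]
fine_gallery = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0.9, 0.1, 0]]
ranking = coarse_to_fine_retrieve([1, 0], [1, 0, 0], coarse_gallery, fine_gallery)
print(ranking)
```

In this sketch the shortlist limits how many images the fine-grained stage has to compare, which is one plausible reading of the efficiency claim in the abstract.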
doi:10.1145/3123266.3123278 dblp:conf/mm/YaoZZLT17 fatcat:s6stj7xujzganjypus6l2phive