A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is application/pdf
.
Integrating Scene Text and Visual Appearance for Fine-Grained Image Classification
[article]
2017
arXiv
pre-print
Text in natural images contains rich semantics that are often highly relevant to objects or scene. In this paper, we focus on the problem of fully exploiting scene text for visual understanding. The main idea is combining word representations and deep visual features into a globally trainable deep convolutional neural network. First, the recognized words are obtained by a scene text reading system. Then, we combine the word embedding of the recognized words and the deep visual features into a
arXiv:1704.04613v2
fatcat:njyzdagkondknmvytbgshm6mhu