This paper presents a novel deep learning architecture to classify structured objects in datasets with a large number of visually similar categories. We model sequences of images as linear-chain CRFs, and jointly learn the parameters from both local visual features and neighboring classes. The visual features are computed by convolutional layers, and the class embeddings are learned by factorizing the CRF pairwise potential matrix. This forms a highly nonlinear objective function which is ...

arXiv:1705.07420v2 fatcat:pjcw534dsnbdjmgdrbjoduucsa
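
To make the abstract's ingredients concrete, here is a minimal sketch of a linear-chain CRF whose pairwise potential matrix is built from class embeddings. Several things here are assumptions rather than details taken from the paper: the symmetric factorization `P = E E^T`, the function names, and the use of the exact forward algorithm to compute the log-likelihood. The unary scores, which in the described architecture would come from convolutional layers, are plain hand-written numbers.

```python
import math

def pairwise_from_embeddings(emb):
    """Low-rank pairwise potentials: P[a][b] = <emb[a], emb[b]> (assumed form)."""
    return [[sum(x * y for x, y in zip(ea, eb)) for eb in emb] for ea in emb]

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def sequence_score(unary, pairwise, labels):
    """Unnormalized log-score: unary terms plus neighboring-class terms."""
    score = sum(unary[t][y] for t, y in enumerate(labels))
    score += sum(pairwise[a][b] for a, b in zip(labels, labels[1:]))
    return score

def log_partition(unary, pairwise):
    """Forward algorithm: log of the summed exp-scores over all label sequences."""
    n_classes = len(pairwise)
    alpha = list(unary[0])
    for t in range(1, len(unary)):
        alpha = [
            unary[t][b]
            + logsumexp([alpha[a] + pairwise[a][b] for a in range(n_classes)])
            for b in range(n_classes)
        ]
    return logsumexp(alpha)

def log_likelihood(unary, pairwise, labels):
    """Exact log-probability of one label sequence under the chain CRF."""
    return sequence_score(unary, pairwise, labels) - log_partition(unary, pairwise)

# Hypothetical toy setup: 3 classes with 2-dimensional embeddings,
# and a sequence of 3 images with made-up unary (visual) scores.
emb = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
P = pairwise_from_embeddings(emb)
unary = [[0.2, -0.1, 0.0], [0.0, 0.3, -0.2], [0.1, 0.1, 0.4]]
print(log_likelihood(unary, P, [0, 1, 2]))
```

With `K` classes and embedding dimension `d`, this factorization stores `K * d` parameters instead of `K * K`, which is one plausible reason to factorize when the number of classes is large and co-occurrence data is sparse.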