Learning visual similarity for product design with convolutional neural networks

Sean Bell, Kavita Bala
2015 ACM Transactions on Graphics  
Figure 1: Visual search using a learned embedding. Query 1: given an input box in a photo (a), we crop and project into an embedding (b) using a trained convolutional neural network (CNN) and return the most visually similar products (c). Query 2: we apply the same method to search for in-situ examples of a product in designer photographs. The CNN is trained from pairs of internet images, and the boxes are collected using crowdsourcing. The 256D embedding is visualized in 2D with t-SNE. Photo credit: Crisp Architects and Rob Karosis (photographer).

Abstract

Popular sites like Houzz, Pinterest, and LikeThatDecor have communities of users helping each other answer questions about products in images. In this paper we learn an embedding for visual search in interior design. Our embedding contains two different domains of product images: products cropped from internet scenes, and products in their iconic form. With such a multi-domain embedding, we demonstrate several applications of visual search, including identifying products in scenes and finding stylistically similar products. To obtain the embedding, we train a convolutional neural network on pairs of images. We explore several training architectures, including re-purposing object classifiers, using siamese networks, and using multitask learning. We evaluate our search quantitatively and qualitatively and demonstrate high-quality results for search across multiple visual domains, enabling new applications in interior design.
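The abstract describes training a siamese CNN on pairs of images so that a product cropped from a scene and its iconic catalog photo map to nearby points in the embedding. One common way to realize such pairwise training is a margin-based contrastive loss over positive and negative pairs; the sketch below is a minimal, hypothetical PyTorch version of that idea (the tiny backbone, the 256-D output, and the margin value are placeholders, not the paper's architecture).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Hypothetical CNN mapping an image crop to a 256-D unit-length embedding."""
    def __init__(self, dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # global average pooling
        )
        self.fc = nn.Linear(64, dim)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return F.normalize(self.fc(h), dim=1)  # project onto the unit sphere

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Siamese objective: pull matching (in-situ crop, iconic product) pairs
    together, push non-matching pairs at least `margin` apart.
    `same` is a float tensor of 1s (matching) and 0s (non-matching)."""
    d = F.pairwise_distance(emb_a, emb_b)
    return torch.mean(same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2))
```

In a siamese setup, both branches share the same parameters, so the two crops of a pair are simply passed through one EmbeddingNet instance before the loss is computed.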
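Once the embedding is learned, the search in Figure 1 reduces to nearest-neighbor lookup: the query box is cropped, projected through the CNN, and compared against precomputed embeddings of the product catalog. The following NumPy sketch illustrates that retrieval step under assumed names (`cnn_embed`, `product_embs`); it is only an illustration of distance-based ranking, not the authors' implementation.

```python
import numpy as np

def nearest_products(query_emb, product_embs, k=5):
    """Rank catalog items by Euclidean distance to the query embedding.

    query_emb:    (256,) embedding of the cropped query box
    product_embs: (N, 256) precomputed embeddings of iconic product images
    Returns the indices of the k closest products.
    """
    dists = np.linalg.norm(product_embs - query_emb, axis=1)
    return np.argsort(dists)[:k]

# Hypothetical usage, where `cnn_embed` stands in for the trained CNN projection:
#   query_emb = cnn_embed(crop_from_photo)
#   top5 = nearest_products(query_emb, product_embs, k=5)
```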
doi:10.1145/2766959