Road Detection through Supervised Classification
[article]
2016
arXiv
pre-print
Autonomous driving is a rapidly evolving technology. Autonomous vehicles are capable of sensing their environment and navigating without human input through sensory information such as radar, lidar, GNSS, vehicle odometry, and computer vision. This sensory input provides a rich dataset that can be used in combination with machine learning models to tackle multiple problems in supervised settings. In this paper we focus on road detection through gray-scale images as the sole sensory input. Our contributions are twofold: first, we introduce an annotated dataset of urban roads for machine learning tasks; second, we introduce a road detection framework on this dataset through supervised classification and hand-crafted feature vectors.
arXiv:1605.03150v1
fatcat:r6shqqktrvakrfhwaprlo2cpa4
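The abstract does not specify the features or classifier; as a minimal sketch of supervised road classification with hand-crafted features, the following trains a standard SVM on simple patch statistics (mean/std intensity and mean absolute gradients, all placeholder choices, on synthetic gray-scale patches rather than the paper's dataset):

```python
import numpy as np
from sklearn.svm import SVC

def patch_features(patch):
    # Hand-crafted features for one gray-scale patch: intensity
    # statistics plus mean absolute gradients (illustrative choices,
    # not the paper's actual feature vector).
    gy, gx = np.gradient(patch.astype(float))
    return np.array([patch.mean(), patch.std(),
                     np.abs(gx).mean(), np.abs(gy).mean()])

# Synthetic stand-in data: bright, low-texture patches as "road" (1),
# darker, high-texture patches as "non-road" (0).
rng = np.random.default_rng(0)
road = [rng.normal(160, 5, (16, 16)) for _ in range(50)]
other = [rng.normal(80, 40, (16, 16)) for _ in range(50)]
X = np.stack([patch_features(p) for p in road + other])
y = np.array([1] * 50 + [0] * 50)

clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(patch_features(rng.normal(160, 5, (16, 16)))[None]))
```

In a real pipeline the patches would be sampled from annotated images and the feature vector would follow the paper's design.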
Discovering Style Trends through Deep Visually Aware Latent Item Embeddings
[article]
2018
arXiv
pre-print
In this paper, we explore Latent Dirichlet Allocation (LDA) and Polylingual Latent Dirichlet Allocation (PolyLDA), as a means to discover trending styles in Overstock from deep visual semantic features transferred from a pretrained convolutional neural network and text-based item attributes. To utilize deep visual semantic features in conjunction with LDA, we develop a method for creating a bag of words representation of unrolled image vectors. By viewing the channels within the convolutional layers of a Resnet-50 as being representative of a word, we can index these activations to create visual documents. We then train LDA over these documents to discover the latent style in the images. We also incorporate text-based data with PolyLDA, where each representation is viewed as an independent language attempting to describe the same style. The resulting topics are shown to be excellent indicators of visual style across our platform.
arXiv:1804.08704v1
fatcat:eimohosc25gdvns6zdurpc5ady
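The paper's exact quantization step isn't reproduced here; a minimal sketch of the visual bag-of-words idea, assuming each channel of a (C, H, W) feature map acts as one "word" and spatial activations above a threshold count as occurrences (the threshold and shapes are illustrative assumptions):

```python
import numpy as np

def feature_map_to_document(fmap, threshold=0.5):
    # Turn a CNN feature map (channels, H, W) into a bag-of-words
    # count vector: channel c contributes one "word" occurrence for
    # each spatial position where its activation exceeds threshold.
    return (fmap > threshold).reshape(fmap.shape[0], -1).sum(axis=1)

rng = np.random.default_rng(1)
fmap = rng.random((2048, 7, 7))   # shape of ResNet-50's final conv output
doc = feature_map_to_document(fmap)
print(doc.shape)  # (2048,) -- one count per channel/"word"
```

Documents built this way can then be fed to any standard LDA implementation in place of term counts.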
Item Popularity Prediction in E-commerce Using Image Quality Feature Vectors
[article]
2016
arXiv
pre-print
In [5] a naive Bayes classifier is used, and Aryafar et al. [2] studied the significance of color in favorited listings on Etsy using logistic regression, perceptron, passive-aggressive and margin infused ...
arXiv:1605.03663v1
fatcat:6ugsmrwnabbgrmsqqatgj2zr2i
A Multimodal Recommender System for Large-scale Assortment Generation in E-commerce
[article]
2018
arXiv
pre-print
E-commerce platforms surface interesting products largely through product recommendations that capture users' styles and aesthetic preferences. Curating recommendations as a complete complementary set, or assortment, is critical for a successful e-commerce experience, especially for product categories such as furniture, where items are selected together with the overall theme, style or ambiance of a space in mind. In this paper, we propose two visually-aware recommender systems that can automatically curate an assortment of living room furniture around a couple of pre-selected seed pieces for the room. The first system aims to maximize the visual-based style compatibility of the entire selection by making use of transfer learning and topic modeling. The second system extends the first by incorporating text data and applying polylingual topic modeling to infer style over both modalities. We review the production pipeline for surfacing these visually-aware recommender systems and compare them through offline validations and large-scale online A/B tests on Overstock. Our experimental results show that complementary style is best discovered over product sets when both visual and textual data are incorporated.
arXiv:1806.11226v1
fatcat:2cnak27q7ndnjirvhso5m5cqpa
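The production systems are not described in code; as an illustrative sketch of the first system's compatibility idea, here is a greedy step that scores candidates against the seed pieces' mean style (topic) vector by cosine similarity. The topic vectors and the greedy rule are assumptions for illustration, not the paper's method:

```python
import numpy as np

def pick_next_item(seed_topics, candidate_topics):
    # Greedy assortment step: choose the candidate whose topic
    # (style) vector is closest in cosine similarity to the mean
    # style of the already-selected seed pieces.
    target = np.mean(seed_topics, axis=0)
    target = target / np.linalg.norm(target)
    cands = candidate_topics / np.linalg.norm(
        candidate_topics, axis=1, keepdims=True)
    return int(np.argmax(cands @ target))

seeds = np.array([[0.8, 0.1, 0.1],    # two seed pieces with a
                  [0.7, 0.2, 0.1]])   # dominant first style topic
cands = np.array([[0.1, 0.8, 0.1],    # different style
                  [0.75, 0.15, 0.1],  # matches the seeds' style
                  [0.1, 0.1, 0.8]])   # different style
print(pick_next_item(seeds, cands))   # 1
```

Repeating this step while moving each pick into the seed set yields a full assortment.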
Neural Networks with Manifold Learning for Diabetic Retinopathy Detection
[article]
2016
arXiv
pre-print
Widespread outreach programs using remote retinal imaging have proven to decrease the risk from diabetic retinopathy, the leading cause of blindness in the US. However, this process still requires manual verification of image quality and grading of images for level of disease by a trained human grader and will continue to be limited by the lack of such scarce resources. Computer-aided diagnosis of retinal images has recently gained increasing attention in the machine learning community. In this paper, we introduce a set of neural networks for diabetic retinopathy classification of fundus retinal images. We evaluate the efficiency of the proposed classifiers in combination with preprocessing and augmentation steps on a sample dataset. Our experimental results show that neural networks in combination with preprocessing on the images can boost the classification accuracy on this dataset. Moreover, the proposed models are scalable and can be used on large-scale datasets for diabetic retinopathy detection. The models introduced in this paper can be used to facilitate the diagnosis and speed up the detection process.
arXiv:1612.03961v1
fatcat:3qk65gbz6bh4lp37a642h67z6i
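The specific preprocessing and augmentation steps are not detailed in the snippet; a toy augmentation pass (random horizontal flip plus brightness jitter, both placeholder choices) illustrates the kind of pipeline the abstract refers to:

```python
import numpy as np

def augment(image, rng):
    # Simple augmentations applied before training: random horizontal
    # flip and a small brightness shift (toy stand-ins for the paper's
    # unspecified preprocessing pipeline).
    if rng.random() < 0.5:
        image = image[:, ::-1]
    return np.clip(image + rng.uniform(-0.1, 0.1), 0.0, 1.0)

rng = np.random.default_rng(4)
fundus = rng.random((32, 32))          # stand-in for a retinal image
batch = np.stack([augment(fundus, rng) for _ in range(8)])
print(batch.shape)  # (8, 32, 32)
```

Each training epoch would draw fresh augmented variants rather than reusing a fixed batch.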
Images Don't Lie: Transferring Deep Visual Semantic Features to Large-Scale Multimodal Learning to Rank
[article]
2015
arXiv
pre-print
Search is at the heart of modern e-commerce. As a result, the task of ranking search results automatically (learning to rank) is a multibillion-dollar machine learning problem. Traditional models optimize over a few hand-constructed features based on the item's text. In this paper, we introduce a multimodal learning to rank model that combines these traditional features with visual semantic features transferred from a deep convolutional neural network. In a large-scale experiment using data from the online marketplace Etsy, we verify that moving to a multimodal representation significantly improves ranking quality. We show how image features can capture fine-grained style information not available in a text-only representation. In addition, we show concrete examples of how image information can successfully disentangle pairs of highly different items that are ranked similarly by a text-only model.
arXiv:1511.06746v1
fatcat:dhwzfckyb5e67f7qxmzsty7tee
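The paper's ranking model isn't reproduced here; the sketch below shows the general multimodal recipe on synthetic data — concatenating text and image feature vectors, then reducing pairwise learning-to-rank to binary classification over feature differences. The RankSVM-style reduction and all shapes are assumptions for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy multimodal item representations: text features ++ image features.
rng = np.random.default_rng(2)
text_feats = rng.random((100, 5))
img_feats = rng.random((100, 8))    # e.g. transferred CNN embeddings
X = np.hstack([text_feats, img_feats])

# Pairwise reduction: label 1 if item i should outrank item j, and
# train a linear model on the feature differences X[i] - X[j].
w_true = rng.normal(size=X.shape[1])        # hidden "true" relevance
scores = X @ w_true
i = rng.integers(0, 100, 200)
j = rng.integers(0, 100, 200)
keep = i != j
i, j = i[keep], j[keep]
diffs = X[i] - X[j]
labels = (scores[i] > scores[j]).astype(int)

ranker = LogisticRegression(max_iter=1000).fit(diffs, labels)
print(ranker.score(diffs, labels))
```

At serving time, items are sorted by the learned linear score over their concatenated multimodal features.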
Style conditioned recommendations
2019
Proceedings of the 13th ACM Conference on Recommender Systems - RecSys '19
We propose Style Conditioned Recommendations (SCR) and introduce style injection as a method to diversify recommendations. We use a Conditional Variational Autoencoder (CVAE) architecture, where both the encoder and decoder are conditioned on a user profile learned from item content data. This allows us to apply style transfer methodologies to the task of recommendations, which we refer to as injection. To enable style injection, user profiles are learned to be interpretable such that they express users' propensities for specific predefined styles. These are learned via label-propagation from a dataset of item content, with limited labeled points. To perform injection, the condition on the encoder is learned while the condition on the decoder is selected per explicit feedback. Explicit feedback can be taken either from a user's response to a style or interest quiz, or from item ratings. In the absence of explicit feedback, the condition at the encoder is applied to the decoder. We show a 12% improvement on NDCG@20 over the traditional VAE-based approach and an average 22% improvement on AUC across all classes for predicting user style profiles against our best performing baseline. After injecting styles we compare the user style profile to the style of the recommendations and show that injected styles have an average +133% increase in presence. Our results show that style injection is a powerful method to diversify recommendations while maintaining personal relevance. Our main contribution is an application of a semi-supervised approach that extends item labels to interpretable user profiles.
doi:10.1145/3298689.3347007
dblp:conf/recsys/IqbalAA19
fatcat:wcngwunznva7pmbyr6dlccwyd4
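No architecture details are given in the snippet; this toy linear "decoder" only illustrates the injection mechanic — keeping the latent code fixed while swapping the decoder's style condition — with all shapes and weights invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
LATENT, STYLES, ITEMS = 4, 3, 6
W_dec = rng.normal(size=(LATENT + STYLES, ITEMS))  # toy linear decoder

def decode(z, style_condition):
    # CVAE-style decoding: the latent code z is concatenated with a
    # style condition vector before producing item probabilities.
    h = np.concatenate([z, style_condition]) @ W_dec
    return np.exp(h) / np.exp(h).sum()             # softmax over items

z = rng.normal(size=LATENT)
learned_profile = np.array([0.6, 0.3, 0.1])  # inferred user style profile
injected = np.array([0.0, 0.0, 1.0])         # explicit style choice

# Style injection = replacing the decoder's condition while keeping z,
# which shifts the recommendation distribution toward the chosen style.
p_default = decode(z, learned_profile)
p_injected = decode(z, injected)
print(np.abs(p_default - p_injected).sum())
```

In the real model the decoder is a trained neural network and the condition at the encoder stays fixed to the learned profile.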
Music genre classification using explicit semantic analysis
2011
Proceedings of the 1st international ACM workshop on Music information retrieval with user-centered and multimodal strategies - MIRUM '11
Motivation: We are interested in automatically finding genre labels for a music data set. Recent work has proposed the use of an automated method based on explicit semantic analysis to identify the most representative genre patterns in a large data set. The method only uses signal-based mel-frequency cepstral coefficients (MFCCs) as the audio feature representation and improves upon previous methods that use the same set of features for music genre classification.
doi:10.1145/2072529.2072539
dblp:conf/mm/AryafarS11
fatcat:xlaf7dx4wjdffkbi5j4ah5eit4
Discovering Style Trends Through Deep Visually Aware Latent Item Embeddings
2018
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
In this paper, we explore Latent Dirichlet Allocation (LDA) [1] and Polylingual Latent Dirichlet Allocation (PolyLDA) [6], as a means to discover trending styles in Overstock from deep visual semantic features transferred from a pretrained convolutional neural network and text-based item attributes. To utilize deep visual semantic features in conjunction with LDA, we develop a method for creating a bag of words representation of unrolled image vectors. By viewing the channels within the convolutional layers of a Resnet-50 [2] as being representative of a word, we can index these activations to create visual documents. We then train LDA over these documents to discover the latent style in the images. We also incorporate text-based data with PolyLDA, where each representation is viewed as an independent language attempting to describe the same style. The resulting topics are shown to be excellent indicators of visual style across our platform.
doi:10.1109/cvprw.2018.00253
dblp:conf/cvpr/IqbalKA18
fatcat:ilf3oiaxrnc7vlfjifi2yaiufe
An Ensemble-based Approach to Click-Through Rate Prediction for Promoted Listings at Etsy
2017
Proceedings of the ADKDD'17
Etsy is a global marketplace where people across the world connect to make, buy and sell unique goods. Sellers at Etsy can promote their product listings via advertising campaigns similar to traditional sponsored search ads. Click-Through Rate (CTR) prediction is an integral part of online search advertising systems where it is utilized as an input to auctions which determine the final ranking of promoted listings to a particular user for each query. In this paper, we provide a holistic view of
doi:10.1145/3124749.3124758
dblp:conf/kdd/AryafarGH17
fatcat:sz4gjlj76bh5rg6z6s2vdlgcgq