72 Hits in 1.1 sec

Pairwise Quantization [article]

Artem Babenko, Relja Arandjelović, Victor Lempitsky
2016 arXiv   pre-print
Please contact Artem Babenko at artem.babenko@phystech.edu  ...

                                  …       60      40      24      12
  OPQ error, ×10⁻³             7.570   6.952   5.828   4.432   1.793
  PairQ error, ×10⁻³           6.464   4.601   3.719   2.529   1.297
  Error reduction w.r.t. OPQ     15%     34%     36%     42%     28%
arXiv:1606.01550v1 fatcat:pbuuy3iumnfebjdrtoas74mzui

Aggregating Deep Convolutional Features for Image Retrieval [article]

Artem Babenko, Victor Lempitsky
2015 arXiv   pre-print
Several recent works have shown that image descriptors produced by deep convolutional neural networks provide state-of-the-art performance for image classification and retrieval problems. It has also been shown that the activations from the convolutional layers can be interpreted as local features describing particular image regions. These local features can be aggregated using aggregation approaches developed for local features (e.g. Fisher vectors), thus providing new powerful global descriptors. In this paper we investigate possible ways to aggregate local deep features to produce compact global descriptors for image retrieval. First, we show that deep features and traditional hand-engineered features have quite different distributions of pairwise similarities, hence existing aggregation methods have to be carefully re-evaluated. Such re-evaluation reveals that in contrast to shallow features, the simple aggregation method based on sum pooling provides arguably the best performance for deep convolutional features. This method is efficient, has few parameters, and bears little risk of overfitting when e.g. learning the PCA matrix. Overall, the new compact global descriptor improves the state-of-the-art on four common benchmarks considerably.
arXiv:1510.07493v1 fatcat:e6khopiv2rclhnzsjg4gliheyi
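
As a rough illustration of the sum-pooling recipe this abstract describes, the sketch below aggregates a conv-layer activation map into a single global descriptor; the array shapes and the optional PCA step are illustrative assumptions, not the paper's exact pipeline.

    import numpy as np

    def sum_pooled_descriptor(activations, pca_matrix=None, pca_mean=None):
        # activations: (H, W, C) conv-layer outputs, i.e. local deep features
        # Sum pooling over all spatial positions gives one C-dim descriptor.
        desc = activations.reshape(-1, activations.shape[-1]).sum(axis=0)
        desc /= np.linalg.norm(desc) + 1e-12         # L2-normalize
        if pca_matrix is not None:                   # optional PCA compression
            desc = (desc - pca_mean) @ pca_matrix    # pca_matrix: (C, d)
            desc /= np.linalg.norm(desc) + 1e-12     # re-normalize short code
        return desc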

Unsupervised Neural Quantization for Compressed-Domain Similarity Search [article]

Stanislav Morozov, Artem Babenko
2019 arXiv   pre-print
We tackle the problem of unsupervised visual descriptor compression, which is a key ingredient of large-scale image retrieval systems. While the deep learning machinery has benefited virtually all computer vision pipelines, the existing state-of-the-art compression methods still employ shallow architectures, and we aim to close this gap with this paper. In more detail, we introduce a DNN architecture for unsupervised compressed-domain retrieval based on multi-codebook quantization. The proposed architecture is designed to incorporate both fast data encoding and efficient distance computation via lookup tables. We demonstrate the advantage of our scheme over existing quantization approaches on several datasets of visual descriptors, outperforming the previous state of the art by a large margin.
arXiv:1908.03883v1 fatcat:3aklfr6eiba5jndscoqwd6khjy
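
The "distance computation via lookup tables" ingredient is easiest to see in its classical product-quantization form; the sketch below shows that baseline mechanism rather than the paper's neural encoder, and all shapes and names are illustrative.

    import numpy as np

    def pq_adc_distances(query, codebooks, codes):
        # query:     (D,) vector, with D split into M equal subspaces
        # codebooks: (M, K, D // M) -- one codebook of K centroids per subspace
        # codes:     (N, M) integer centroid indices for N database points
        M, K, d = codebooks.shape
        sub = query.reshape(M, d)
        # Build M lookup tables: squared distance from each query subvector
        # to every centroid of the corresponding codebook.
        tables = ((codebooks - sub[:, None, :]) ** 2).sum(axis=2)   # (M, K)
        # Distance to any encoded point is then M table lookups plus a sum,
        # with no decompression of the stored vectors.
        return tables[np.arange(M), codes].sum(axis=1)              # (N,)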

Functional Space Analysis of Local GAN Convergence [article]

Valentin Khrulkov, Artem Babenko, Ivan Oseledets
2021 arXiv   pre-print
Recent work demonstrated the benefits of studying the continuous-time dynamics governing GAN training. However, these dynamics are analyzed in the model parameter space, which results in finite-dimensional dynamical systems. We propose a novel perspective where we study the local dynamics of adversarial training in the general functional space and show how they can be represented as a system of partial differential equations. Thus, the convergence properties can be inferred from the eigenvalues of the resulting differential operator. We show that these eigenvalues can be efficiently estimated from the target dataset before training. Our perspective reveals several insights on the practical tricks commonly used to stabilize GANs, such as gradient penalty, data augmentation, and advanced integration schemes. As an immediate practical benefit, we demonstrate how one can a priori select an optimal data augmentation strategy for a particular generation task.
arXiv:2102.04448v1 fatcat:bh4puu6sqrhvnabihcdhhprntm

Disentangled Representations from Non-Disentangled Models [article]

Valentin Khrulkov, Leyla Mirvakhabova, Ivan Oseledets, Artem Babenko
2021 arXiv   pre-print
Discovering interpretable directions. We consider several recently proposed methods: ClosedForm, GANspace (Härkönen et al., 2020), LatentDiscovery (Voynov & Babenko, 2020).  ...  The discovery of directions that allow for interesting image manipulations is a nontrivial task, which, however, can be performed in an unsupervised manner surprisingly efficiently (Voynov & Babenko,  ... 
arXiv:2102.06204v1 fatcat:yeg24kjkkzdnhhcxv4igm5mz3u

Impostor Networks for Fast Fine-Grained Recognition [article]

Vadim Lebedev, Artem Babenko, Victor Lempitsky
2018 arXiv   pre-print
In this work we introduce impostor networks, an architecture that allows one to perform fine-grained recognition with high accuracy using a light-weight convolutional network, making it particularly suitable for fine-grained applications on low-power and non-GPU-enabled platforms. Impostor networks compensate for the lightness of their 'backend' network by combining it with a lightweight non-parametric classifier. The combination of a convolutional network and such a non-parametric classifier is trained in an end-to-end fashion. Similarly to convolutional neural networks, impostor networks can fit large-scale training datasets very well, while also being able to generalize to new data points. At the same time, the bulk of the computation within impostor networks happens through nearest-neighbor search in high dimensions. Such search can be performed efficiently on a variety of architectures, including standard CPUs, where deep convolutional networks are inefficient. In a series of experiments with three fine-grained datasets, we show that impostor networks are able to boost the classification accuracy of a moderate-sized convolutional network considerably at a very small computational cost.
arXiv:1806.05217v1 fatcat:rysn4soshnb3pd6t2waqbeng4q
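
A toy sketch of the inference pattern the abstract describes: a light-weight network embeds the input and a non-parametric nearest-neighbor classifier makes the decision, so most test-time work is kNN search, which runs well on plain CPUs. The embed function is a stand-in for the paper's jointly trained backend; all data and sizes below are illustrative.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def embed(images):
        # Stand-in for a small convolutional network's forward pass;
        # here we just flatten pixels to keep the sketch self-contained.
        return images.reshape(len(images), -1).astype(np.float32)

    rng = np.random.default_rng(0)
    train_x = rng.random((200, 8, 8))          # toy stand-in images
    train_y = rng.integers(0, 5, size=200)     # toy fine-grained labels

    clf = KNeighborsClassifier(n_neighbors=5)
    clf.fit(embed(train_x), train_y)           # store embedded training set
    print(clf.predict(embed(rng.random((3, 8, 8)))))  # test time = kNN search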

Editable Neural Networks [article]

Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitriy Pyrkin, Sergei Popov, Artem Babenko
2020 arXiv   pre-print
These days deep neural networks are ubiquitously used in a wide range of tasks, from image classification and machine translation to face identification and self-driving cars. In many applications, a single model error can lead to devastating financial, reputational and even life-threatening consequences. Therefore, it is crucially important to correct model mistakes quickly as they appear. In this work, we investigate the problem of neural network editing - how one can efficiently patch a mistake of the model on a particular sample, without influencing the model behavior on other samples. Namely, we propose Editable Training, a model-agnostic training technique that encourages fast editing of the trained model. We empirically demonstrate the effectiveness of this method on large-scale image classification and machine translation tasks.
arXiv:2004.00345v2 fatcat:lnrxcjtkbbemvjkdqayoc4ngfy
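
The editing problem itself is easy to state in code. The sketch below patches one mistake by gradient descent while penalizing drift on a reference batch; this is a generic illustration of the objective only, not the paper's Editable Training procedure, which additionally meta-trains the model so that such edits succeed in a few steps. All names and hyperparameters are assumptions.

    import torch
    import torch.nn.functional as F

    def edit(model, x_err, y_correct, x_ref, lr=1e-3, c_drift=1.0, max_steps=20):
        with torch.no_grad():
            ref_logits = model(x_ref)              # behavior to preserve
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(max_steps):
            if model(x_err).argmax(-1).eq(y_correct).all():
                break                              # mistake fixed, stop early
            # Penalize divergence from the pre-edit predictions elsewhere.
            drift = F.kl_div(F.log_softmax(model(x_ref), -1),
                             F.softmax(ref_logits, -1), reduction="batchmean")
            loss = F.cross_entropy(model(x_err), y_correct) + c_drift * drift
            opt.zero_grad(); loss.backward(); opt.step()
        return model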

RPGAN: GANs Interpretability via Random Routing [article]

Andrey Voynov, Artem Babenko
2020 arXiv   pre-print
Several prior works have demonstrated that in discriminative convolutional neural networks, different layers are "responsible" for different levels of abstraction (Zeiler & Fergus, 2014; Babenko et al  ... 
arXiv:1912.10920v2 fatcat:s2dgfkxmfze4po5pvus3xgv6du

Neural Codes for Image Retrieval [article]

Artem Babenko, Anton Slesarev, Alexandr Chigorin, Victor Lempitsky
2014 arXiv   pre-print
It has been shown that the activations invoked by an image within the top layers of a large convolutional neural network provide a high-level descriptor of the visual content of the image. In this paper, we investigate the use of such descriptors (neural codes) within the image retrieval application. In the experiments with several standard retrieval benchmarks, we establish that neural codes perform competitively even when the convolutional neural network has been trained for an unrelated classification task (e.g. Image-Net). We also evaluate the improvement in the retrieval performance of neural codes when the network is retrained on a dataset of images that are similar to images encountered at test time. We further evaluate the performance of the compressed neural codes and show that a simple PCA compression provides very good short codes that give state-of-the-art accuracy on a number of datasets. In general, neural codes turn out to be much more resilient to such compression in comparison to other state-of-the-art descriptors. Finally, we show that discriminative dimensionality reduction trained on a dataset of pairs of matched photographs improves the performance of PCA-compressed neural codes even further. Overall, our quantitative experiments demonstrate the promise of neural codes as visual descriptors for image retrieval.
arXiv:1404.1777v2 fatcat:rzhl4fk5pbfljjlslxvkrbfn4q
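
A minimal sketch of the "neural codes" recipe: take a top fully-connected layer's activations from a pretrained CNN as the descriptor, then optionally compress with plain PCA into short codes. The choice of torchvision's AlexNet and the 128-dimensional code are assumptions for illustration; the paper used its own AlexNet-style network and evaluated several layers and code sizes.

    import numpy as np
    import torch, torchvision

    model = torchvision.models.alexnet(weights="IMAGENET1K_V1").eval()
    # Truncate the classifier so the output is a top fully-connected
    # layer's activations rather than class scores.
    extractor = torch.nn.Sequential(
        model.features, model.avgpool, torch.nn.Flatten(),
        *list(model.classifier.children())[:-1])

    @torch.no_grad()
    def neural_codes(batch):      # batch: (N, 3, 224, 224), ImageNet-normalized
        codes = extractor(batch)  # high-level descriptors ("neural codes")
        return torch.nn.functional.normalize(codes, dim=1).numpy()

    def pca_short_codes(codes, d=128):   # plain PCA compression to short codes
        centered = codes - codes.mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        return centered @ vt[:d].T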

Product Split Trees

Artem Babenko, Victor Lempitsky
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
In this work, we introduce a new kind of spatial partition tree for efficient nearest-neighbor search. Our approach first identifies a set of useful data splitting directions, and then learns a codebook that can be used to encode such directions. We use the product-quantization idea in order to make the effective codebook large, the evaluation of scalar products between the query and the encoded splitting direction very fast, and the encoding itself compact. As a result, the proposed data structure (Product Split tree) achieves compact clustering of data points, while keeping the traversal very efficient. In nearest-neighbor search experiments on high-dimensional data, product split trees achieved state-of-the-art performance, demonstrating a better speed-accuracy tradeoff than other spatial partition trees.
doi:10.1109/cvpr.2017.669 dblp:conf/cvpr/BabenkoL17 fatcat:fvrh66vgibaxbjzs5cet5ln7ay
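
To make the traversal idea concrete: if each node's splitting direction is stored as M product-quantization codes, then after building per-query dot-product tables once, routing at every node costs only M table lookups. The node layout below is an illustrative assumption, not the paper's exact data structure.

    import numpy as np
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        codes: np.ndarray                # (M,) PQ codes of the split direction
        threshold: float = 0.0
        left: Optional["Node"] = None
        right: Optional["Node"] = None

    def dot_tables(query, codebooks):    # codebooks: (M, K, D // M)
        M, K, d = codebooks.shape
        # <query, direction> decomposes into per-subspace dot products, so
        # one (M, K) table serves every node on the root-to-leaf path.
        return np.einsum('mkd,md->mk', codebooks, query.reshape(M, d))

    def descend(node, tables):
        while node.left is not None:
            score = tables[np.arange(len(node.codes)), node.codes].sum()
            node = node.left if score < node.threshold else node.right
        return node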

Neural Codes for Image Retrieval [chapter]

Artem Babenko, Anton Slesarev, Alexandr Chigorin, Victor Lempitsky
2014 Lecture Notes in Computer Science  
It has been shown that the activations invoked by an image within the top layers of a large convolutional neural network provide a high-level descriptor of the visual content of the image. In this paper, we investigate the use of such descriptors (neural codes) within the image retrieval application. In the experiments with several standard retrieval benchmarks, we establish that neural codes perform competitively even when the convolutional neural network has been trained for an unrelated classification task (e.g. Image-Net). We also evaluate the improvement in the retrieval performance of neural codes when the network is retrained on a dataset of images that are similar to images encountered at test time. We further evaluate the performance of the compressed neural codes and show that a simple PCA compression provides very good short codes that give state-of-the-art accuracy on a number of datasets. In general, neural codes turn out to be much more resilient to such compression in comparison to other state-of-the-art descriptors. Finally, we show that discriminative dimensionality reduction trained on a dataset of pairs of matched photographs improves the performance of PCA-compressed neural codes even further. Overall, our quantitative experiments demonstrate the promise of neural codes as visual descriptors for image retrieval.
doi:10.1007/978-3-319-10590-1_38 fatcat:s3ad4dk34jgdbawkkorr7wseba

Towards Similarity Graphs Constructed by Deep Reinforcement Learning [article]

Dmitry Baranchuk, Artem Babenko
2020 arXiv   pre-print
The DEEP100K dataset (Babenko & Lempitsky, 2016) is a subset of one billion 96-dimensional CNN-produced feature vectors of natural images from the Web. The base set contains 100,000 vectors.  ... 
arXiv:1911.12122v2 fatcat:moev63syunf6no4hn2w2z46cdi

Improving Bilayer Product Quantization for Billion-Scale Approximate Nearest Neighbors in High Dimensions [article]

Artem Babenko, Victor Lempitsky
2014 arXiv   pre-print
The top-performing systems for billion-scale high-dimensional approximate nearest neighbor (ANN) search are all based on two-layer architectures that include an indexing structure and a compressed datapoints layer. An indexing structure is crucial as it allows one to avoid exhaustive search, while the lossy data compression is needed to fit the dataset into RAM. Several of the most successful systems use product quantization (PQ) for both the indexing and the dataset compression layers. These systems are, however, limited in the way they exploit the interaction of the product quantization processes that happen at different stages. Here we introduce and evaluate two approximate nearest neighbor search systems that both exploit the synergy of product quantization processes in a more efficient way. The first system, called Fast Bilayer Product Quantization (FBPQ), speeds up the runtime of the baseline system (Multi-D-ADC) several times, while achieving the same accuracy. The second system, Hierarchical Bilayer Product Quantization (HBPQ), provides significantly better recall for the same runtime at the cost of a small memory footprint increase. For the BIGANN dataset of one billion SIFT descriptors, a 10% increase in Recall@1 and a 17% increase in Recall@10 are observed.
arXiv:1404.1831v1 fatcat:2v3z64lpqvbgvczllct34lgazi
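
For orientation, the generic two-layer search that systems like Multi-D-ADC build on looks roughly as follows: a coarse quantizer selects a few cells, and points inside each cell are scored from their PQ-compressed residuals. This is a simplified baseline sketch, not FBPQ or HBPQ itself; pq_decode and the cell layout are assumed interfaces.

    import numpy as np

    def two_layer_search(query, coarse_centroids, cells, pq_decode,
                         n_probe=8, top=10):
        # Layer 1 (indexing): visit only the n_probe nearest coarse cells,
        # avoiding an exhaustive scan of the dataset.
        nearest = np.argsort(((coarse_centroids - query) ** 2).sum(1))[:n_probe]
        cand_ids, cand_dists = [], []
        for c in nearest:
            ids, codes = cells[c]        # PQ codes of residuals in this cell
            # Layer 2 (compression): approximate each point from its codes.
            approx = coarse_centroids[c] + pq_decode(codes)
            cand_ids.extend(ids)
            cand_dists.extend(((approx - query) ** 2).sum(1))
        keep = np.argsort(cand_dists)[:top]
        return [cand_ids[i] for i in keep]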

Revisiting Deep Learning Models for Tabular Data [article]

Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, Artem Babenko
2021 arXiv   pre-print
The existing literature on deep learning for tabular data proposes a wide range of novel architectures and reports competitive results on various datasets. However, the proposed models are usually not properly compared to each other, and existing works often use different benchmarks and experiment protocols. As a result, it is unclear for both researchers and practitioners what models perform best. Additionally, the field still lacks effective baselines, that is, easy-to-use models that provide competitive performance across different problems. In this work, we provide an overview of the main families of DL architectures for tabular data and raise the bar of baselines in tabular DL by identifying two simple and powerful deep architectures. The first one is a ResNet-like architecture which turns out to be a strong baseline that is often missing in prior works. The second model is our simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks. Both models are compared to many existing architectures on a diverse set of tasks under the same training and tuning protocols. We also compare the best DL models with Gradient Boosted Decision Trees and conclude that there is still no universally superior solution.
arXiv:2106.11959v2 fatcat:v4g4vvf3ojd25jkuyvjxwb26fy
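
A hedged sketch of what a "ResNet-like architecture" for tabular data can look like: an MLP with residual blocks over a fixed-width representation. Widths, normalization, and dropout below are illustrative defaults, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class TabularResNetBlock(nn.Module):
        def __init__(self, d, d_hidden, dropout=0.1):
            super().__init__()
            self.norm = nn.BatchNorm1d(d)
            self.ff = nn.Sequential(nn.Linear(d, d_hidden), nn.ReLU(),
                                    nn.Dropout(dropout), nn.Linear(d_hidden, d))

        def forward(self, x):
            return x + self.ff(self.norm(x))      # residual connection

    class TabularResNet(nn.Module):
        def __init__(self, d_in, d=256, n_blocks=4, n_out=1):
            super().__init__()
            self.stem = nn.Linear(d_in, d)        # embed raw numeric features
            self.blocks = nn.Sequential(*[TabularResNetBlock(d, 2 * d)
                                          for _ in range(n_blocks)])
            self.head = nn.Linear(d, n_out)

        def forward(self, x):                      # x: (batch, d_in) floats
            return self.head(self.blocks(self.stem(x)))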

The Inverted Multi-Index

Artem Babenko, Victor Lempitsky
2015 IEEE Transactions on Pattern Analysis and Machine Intelligence  
A new data structure for efficient similarity search in very large datasets of high-dimensional vectors is introduced. This structure, called the inverted multi-index, generalizes the inverted index idea by replacing the standard quantization within inverted indices with product quantization. For very similar retrieval complexity and preprocessing time, inverted multi-indices achieve a much denser subdivision of the search space compared to inverted indices, while retaining their memory efficiency. Our experiments with large datasets of SIFT and GIST vectors demonstrate that, because of the denser subdivision, inverted multi-indices are able to return much shorter candidate lists with higher recall. Augmented with a suitable reranking procedure, multi-indices were able to improve the speed of approximate nearest neighbor search on a dataset of 1 billion SIFT vectors by an order of magnitude compared to the best previously published systems, while achieving better recall and incurring only a few percent of memory overhead.
doi:10.1109/tpami.2014.2361319 pmid:26357346 fatcat:tjh5cgacjreu3bsz76qjmqal3i
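
The core construction is compact enough to sketch: split each vector into two halves, quantize each half with its own K-centroid codebook, and index points by the pair of nearest centroids, giving K * K cells from only 2K centroids. Querying then enumerates cells in order of increasing distance (the multi-sequence step, omitted here). Names and shapes below are illustrative.

    import numpy as np
    from itertools import product

    def build_multi_index(data, cb1, cb2):
        # data: (N, D); cb1, cb2: (K, D // 2) codebooks for the two halves
        half = data.shape[1] // 2
        a = np.argmin(((data[:, None, :half] - cb1) ** 2).sum(2), axis=1)
        b = np.argmin(((data[:, None, half:] - cb2) ** 2).sum(2), axis=1)
        # One inverted list per (centroid, centroid) pair: K * K cells in
        # total, a much denser subdivision than the K cells of a flat
        # inverted index with comparable quantization cost.
        index = {cell: [] for cell in product(range(len(cb1)), range(len(cb2)))}
        for i, cell in enumerate(zip(a, b)):
            index[cell].append(i)
        return index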
Showing results 1–15 of 72