Indexing expensive functions for efficient multi-dimensional similarity search

Hanxiong Chen, Jianquan Liu, Kazutaka Furuse, Jeffrey Xu Yu, Nobuo Ohbo
2010 Knowledge and Information Systems  
This leads to the increasing need for supporting the indexing of high dimensional data.  ...  Linear scan of the data with approximation is more efficient in the high dimensional similarity search. However, approaches so far have concentrated on reducing I/O, and ignored the computation cost.  ...  Our algorithms are extremely efficient for high dimensional search with an expensive function.  ... 
doi:10.1007/s10115-010-0303-2 fatcat:yclhqlr3irf3bkzi6jt74jd6nu

Learning to Prune in Metric and Non-Metric Spaces

Leonid Boytsov, Bilegsaikhan Naidan
2013 Neural Information Processing Systems  
Our method was competitive against state-of-the-art methods and, in most cases, was more efficient for the same rank approximation quality.  ...  Both methods are evaluated using data sets with metric (Euclidean) and non-metric (KL-divergence and Itakura-Saito) distance functions.  ...  Acknowledgements We thank Lawrence Cayton for providing the data sets, the bbtree code, and answering our questions; Anna Belova for checking the proof of Property 1 (supplemental materials) and editing  ... 
dblp:conf/nips/BoytsovN13 fatcat:lantcw4ovvawhh57o6mydziyhe

Efficient similarity search within user-specified projective subspaces

Michael E. Houle, Xiguo Ma, Vincent Oria, Jichao Sun
2016 Information Systems  
Even if the query subspaces were known in advance, constructing an index for every possible subspace would still be prohibitively expensive.  ...  Motivated by the difficulty of search in higher dimensional spaces due to the so-called 'curse of dimensionality' [1, 2, 3], the efficiency of similarity search may be improved through an appropriate projection  ...  In particular, we follow the so-called 'multi-step' search strategy [9, 10, 11], utilizing 1-dimensional distances as lower bounds to efficiently prune the search space.  ... 
doi:10.1016/j.is.2016.01.008 fatcat:jnrjvvd3izdjtn4i5if4upxw2y

A Multi-way Divergence Metric for Vector Spaces [chapter]

Robert Moss, Richard Connor
2013 Lecture Notes in Computer Science  
We expect uses of the function in the domain of similarity search to follow.  ...  The majority of work in similarity search focuses on the efficiency of threshold and nearest-neighbour queries.  ...  We believe that this function will turn out to be useful in the domain of similarity search.  ... 
doi:10.1007/978-3-642-41062-8_17 fatcat:zrfsnl7oxzc23hca5fi5tokksa

Optimal multi-step k-nearest neighbor search

Thomas Seidl, Hans-Peter Kriegel
1998 Proceedings of the 1998 ACM SIGMOD international conference on Management of data - SIGMOD '98  
Whereas algorithms that are directly based on indexes work well for simple medium-dimensional similarity distance functions, they do not meet the efficiency requirements of complex high-dimensional and  ...  For an increasing number of modern database applications, efficient support of similarity search becomes an important task.  ...  Examples Lower-Bounding Filter Distance Functions For similarity search in presence of complex high-dimensional or even user-adaptable similarity distance functions, multi-step algorithms are available  ... 
doi:10.1145/276304.276319 dblp:conf/sigmod/SeidlK98 fatcat:mktlcsod5nhsbaa3licv3cn4wy

Optimal multi-step k-nearest neighbor search

Thomas Seidl, Hans-Peter Kriegel
1998 SIGMOD record  
Whereas algorithms that are directly based on indexes work well for simple medium-dimensional similarity distance functions, they do not meet the efficiency requirements of complex high-dimensional and  ...  For an increasing number of modern database applications, efficient support of similarity search becomes an important task.  ...  Examples Lower-Bounding Filter Distance Functions For similarity search in presence of complex high-dimensional or even user-adaptable similarity distance functions, multi-step algorithms are available  ... 
doi:10.1145/276305.276319 fatcat:hbqyc4rlpvctrblbrg2lfyscqa

Image annotation by large-scale content-based image retrieval

Xirong Li, Le Chen, Lei Zhang, Fuzong Lin, Wei-Ying Ma
2006 Proceedings of the 14th annual ACM international conference on Multimedia - MULTIMEDIA '06  
Given an uncaptioned image, first in the search stage, we perform content-based image retrieval (CBIR) facilitated by high-dimensional indexing to find a set of visually similar images from a large-scale  ...  Based on search technologies, this framework does not impose an explicit training stage, but efficiently leverages large-scale and well-annotated images, and is potentially capable of dealing with unlimited  ...  for indexing high-dimensional visual features, and the search result clustering (SRC) [7] technique.  ... 
doi:10.1145/1180639.1180764 dblp:conf/mm/LiCZLM06 fatcat:otpeqlfrzncc5pnsctlh7eh5xu

Engineering Efficient and Effective Non-metric Space Library [chapter]

Leonid Boytsov, Bilegsaikhan Naidan
2013 Lecture Notes in Computer Science  
We present a new similarity search library and discuss a variety of design and performance issues related to its development.  ...  Rather than looking for the best method, we want to ensure that the library implements competitive baselines, which can be useful for future work.  ...  We would like to thank Lawrence Cayton for providing the data sets, Vladimir Pestov for the discussion on the curse of dimensionality, and anonymous reviewers for helpful suggestions.  ... 
doi:10.1007/978-3-642-41062-8_28 fatcat:7g6627ejkrfr5nnepm3ddhg2lm

Efficient Locality-Sensitive Hashing Over High-Dimensional Data Streams

Chengcheng Yang, Dong Deng, Shuo Shang, Ling Shao
2020 2020 IEEE 36th International Conference on Data Engineering (ICDE)  
In this paper, we present PDA-LSH, a novel and practical disk-based LSH index that can offer efficient support for both updates and searches.  ...  for high-dimensional streaming data processing.  ...  We proposed an efficient LSH index, namely PDA-LSH, for high-dimensional streaming data processing.  ... 
doi:10.1109/icde48307.2020.00220 dblp:conf/icde/YangDS020 fatcat:53r3b6sgt5e3bpdhe3g47kzype

Fast Similarity Search for Learned Metrics

B. Kulis, P. Jain, K. Grauman
2009 IEEE Transactions on Pattern Analysis and Machine Intelligence  
To enable sub-linear time similarity search under the learned metric, we show how to encode a learned Mahalanobis parameterization into randomized locality-sensitive hash functions.  ...  We propose a method to efficiently index into a large database of examples according to a learned metric.  ...  In order to efficiently index multi-dimensional data, data structures based on spatial partitioning and recursive hyperplane decomposition have been developed, including k-d trees [10] and metric trees  ... 
doi:10.1109/tpami.2009.151 pmid:19834137 fatcat:hxm7popjxzfj3gpukzn43jufau

Efficient region-based image retrieval

Roger Weber, Michael Mlivoncic
2003 Proceedings of the twelfth international conference on Information and knowledge management - CIKM '03  
As our experiments show, these bounding functions are so tight, that we have to evaluate the expensive distance function for less than 0.5% of the images.  ...  In this paper, we apply a multi-step approach to enable region-based techniques for large image collections.  ...  In this work, we have described efficient multi-step algorithms to search for the k best matches for RBIR queries.  ... 
doi:10.1145/956876.956878 fatcat:hvp3qcdp4fa5tkbyzuuyfwhasq

Efficient region-based image retrieval

Roger Weber, Michael Mlivoncic
2003 Proceedings of the twelfth international conference on Information and knowledge management - CIKM '03  
As our experiments show, these bounding functions are so tight, that we have to evaluate the expensive distance function for less than 0.5% of the images.  ...  In this paper, we apply a multi-step approach to enable region-based techniques for large image collections.  ...  In this work, we have described efficient multi-step algorithms to search for the k best matches for RBIR queries.  ... 
doi:10.1145/956863.956878 dblp:conf/cikm/WeberM03 fatcat:kehkjl5fbfcjrliujzwchmm7tu

ProteinDBS: a real-time retrieval system for protein structure comparison

C.-R. Shyu, P.-H. Chi, G. Scott, D. Xu
2004 Nucleic Acids Research  
We have developed a web server (ProteinDBS) for the life science community to search for similar protein tertiary structures in real time.  ...  When meaningful contents, represented in a multi-dimensional feature space, have been extracted from distance matrices, an advanced indexing structure, Entropy Balanced Statistical (EBS) k-d tree, is utilized  ...  (16) for maintaining the tertiary structures.  ... 
doi:10.1093/nar/gkh436 pmid:15215453 pmcid:PMC441574 fatcat:fhkca5oscbdt7hi5k5wgc3ph4q

Distance-based indexing for high-dimensional metric spaces

Tolga Bozkaya, Meral Ozsoyoglu
1997 Proceedings of the 1997 ACM SIGMOD international conference on Management of data - SIGMOD '97  
In this paper, we introduce a distance based index structure called multi-vantage point (mvp) tree for similarity queries on high-dimensional metric spaces.  ...  Distance based index structures are proposed for applications where the data domain is high dimensional, or the distance function used to compute distances between data objects is non-Euclidean.  ...  In this paper, we introduce the mvp-tree (multi-vantage point tree) as a general solution to the problem of answering similarity based queries efficiently for high-dimensional metric spaces.  ... 
doi:10.1145/253260.253345 dblp:conf/sigmod/BozkayaO97 fatcat:7zqybhoebfb5rnlo4cjglcdeze

Distance-based indexing for high-dimensional metric spaces

Tolga Bozkaya, Meral Ozsoyoglu
1997 SIGMOD record  
In this paper, we introduce a distance based index structure called multi-vantage point (mvp) tree for similarity queries on high-dimensional metric spaces.  ...  Distance based index structures are proposed for applications where the data domain is high dimensional, or the distance function used to compute distances between data objects is non-Euclidean.  ...  In this paper, we introduce the mvp-tree (multi-vantage point tree) as a general solution to the problem of answering similarity based queries efficiently for high-dimensional metric spaces.  ... 
doi:10.1145/253262.253345 fatcat:xwwra6arz5fe5ayx2tnavxw2n4