
Multi-Level Spherical Locality Sensitive Hashing For Approximate Near Neighbors [article]

Teresa Nicole Brooks, Rania Almajalid
2017 arXiv   pre-print
This paper introduces "Multi-Level Spherical LSH": a parameter-free, multi-level, data-dependent Locality Sensitive Hashing data structure for solving the Approximate Near Neighbors Problem (ANN).  ...  This data structure uses a modified version of a multi-probe adaptive querying algorithm, with the potential of achieving an O(n^p + t) query run time, for all inputs n where t <= n.  ...  Parameter-free Locality Sensitive Hashing for Spherical Range Reporting In this paper, the authors propose two adaptive LSH based algorithms to solve the spherical range reporting problem, as well as a  ... 
arXiv:1709.03517v2 fatcat:5yuof3wnqzaizj64q3srk7f35y

An Adaptive Multi-level Hashing Structure for Fast Approximate Similarity Search

Alexander Ocsa, Elaine P. M. de Sousa
2010 Journal of Information and Data Management  
In this context, an approximate similarity search algorithm known as Locality Sensitive Hashing (LSH) was recently proposed to query high-dimensional datasets with efficient computational time.  ...  By employing a Multi-level scheme it is possible to dynamically adapt the data domain parameters and exploit the resulting multi-resolution index structure to speed up the query process.  ...  In this direction, Locality Sensitive Hashing (LSH) [Datar et al. 2004 ] is one of the recent hash-based techniques proposed to organize and query high-dimensional data.  ... 
dblp:journals/jidm/OcsaS10 fatcat:konxw26hrzggdcqnu5lzwruglu

Hybrid LSH: Faster Near Neighbors Reporting in High-dimensional Space [article]

Ninh Pham
2017 arXiv   pre-print
We study the r-near neighbors reporting problem (r-NN), i.e., reporting all points in a high-dimensional point set S that lie within a radius r of a given query point q.  ...  in high-dimensional space.  ...  INTRODUCTION We study the r-near neighbors reporting problem (rNNR) (or spherical range reporting) [2, 5] : Given a d-dimensional point set S of size n, reporting all points in S that lie within * Research  ... 
arXiv:1607.06179v3 fatcat:sk6wcxc3i5aohnm3acbxsylwbm
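
The r-near neighbors reporting problem stated in the snippet above can be made concrete with a brute-force baseline. This is only a minimal sketch of the problem definition, assuming Euclidean distance; it is not the Hybrid LSH method, and the function name `r_near_neighbors` and the sample data are hypothetical.

```python
import math

def r_near_neighbors(points, q, r):
    """Report every point in `points` within Euclidean distance r of q.

    A linear scan over all n points: the exact baseline that LSH-based
    reporting schemes aim to beat in high dimensions.
    """
    return [p for p in points if math.dist(p, q) <= r]

# Hypothetical toy point set S and query q.
S = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
result = r_near_neighbors(S, (0.0, 0.0), 1.5)
print(result)
```

The scan costs time linear in n for every query regardless of how many points are reported, which is the cost that output-sensitive approaches try to avoid.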

Parameter-free Locality Sensitive Hashing for Spherical Range Reporting

Thomas D. Ahle, Martin Aumüller, Rasmus Pagh
2017 Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms  
We present a data structure for *spherical range reporting* on a point set S, i.e., reporting all points in S that lie within radius r of a given query point q.  ...  We further present a parameter-free way of using multi-probing, for LSH families that support it, and show that for many such families this approach allows us to get expected query time close to O(n^ρ+  ...  In addition they want to thank Ninh Pham for early discussions about adaptive LSH and output sensitivity.  ... 
doi:10.1137/1.9781611974782.16 dblp:conf/soda/AhleAP17 fatcat:c7x5e4wu4je65e25yheak7y6gi

Local Density Estimation in High Dimensions [article]

Xian Wu, Moses Charikar, Vishnu Natchu
2018 arXiv   pre-print
We develop two estimators, LSH Count and Multi-Probe Count that use locality sensitive hashing to preprocess the data to accurately and efficiently estimate the answers to such questions via importance  ...  An important question that arises in the study of high dimensional vector representations learned from data is: given a set D of vectors and a query q, estimate the number of points within a specified  ...  Xian Wu was supported by a Harold Thomas Hahn Jr. Fellowship from the Department of Management Science and Engineering at Stanford University.  ... 
arXiv:1809.07471v1 fatcat:4c5hj62zw5cz5dsyjmnkb3wcy4

PM-LSH: a fast and accurate in-memory framework for high-dimensional approximate NN and closest pair search [article]

Bolong Zheng, Xi Zhao, Lianggui Weng, Nguyen Quoc Viet Hung, Hang Liu, Christian S. Jensen
2021 arXiv   pre-print
In addition, we extend PM-LSH to support closest pair (CP) search in high-dimensional spaces.  ...  In contrast, we propose a fast and accurate in-memory LSH framework, called PM-LSH, that aims to compute c-ANN queries on large-scale, high-dimensional datasets.  ... 
arXiv:2107.05537v1 fatcat:wkxkobax4neu5mg6y2m7wlrcpu

Hashing for Similarity Search: A Survey [article]

Jingdong Wang, Heng Tao Shen, Jingkuan Song, Jianqiu Ji
2014 arXiv   pre-print
In this paper, we present a survey on one of the main solutions, hashing, which has been widely studied since the pioneering work locality sensitive hashing.  ...  We divide the hashing algorithms into two main categories: locality sensitive hashing, which designs hash functions without exploring the data distribution and learning to hash, which learns hash functions  ...  Multi-probe LSH The basic idea of multi-probe LSH [91] is to intelligently probe multiple buckets that are likely to contain query results in a hash table, whose hash values may not necessarily be the  ... 
arXiv:1408.2927v1 fatcat:reknwesjnbafvcbouyudrzp4rq
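
The multi-probe idea described in the snippet above (probing several buckets of one hash table rather than only the query's own bucket) can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of [91]: it assumes random-hyperplane (sign) hashing and probes buckets in order of Hamming distance from the query's key, whereas a real multi-probe scheme ranks buckets by their likelihood of containing results. All names (`make_hash`, `build_table`, `probe_sequence`, `query`) are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations

def make_hash(dim, bits, seed=0):
    """Return a sign-of-random-projection hash mapping a vector to a bit tuple."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]
    def h(v):
        return tuple(int(sum(a * b for a, b in zip(p, v)) >= 0) for p in planes)
    return h

def build_table(points, h):
    """Store each point's index in the bucket named by its hash key."""
    table = defaultdict(list)
    for i, p in enumerate(points):
        table[h(p)].append(i)
    return table

def probe_sequence(key, max_flips):
    """Yield the query's own key, then keys that differ in 1..max_flips bits."""
    yield key
    for k in range(1, max_flips + 1):
        for idxs in combinations(range(len(key)), k):
            probe = list(key)
            for i in idxs:
                probe[i] ^= 1
            yield tuple(probe)

def query(table, h, q, max_flips=1):
    """Collect candidate indices from the query's bucket and nearby buckets."""
    candidates = []
    for key in probe_sequence(h(q), max_flips):
        candidates.extend(table.get(key, []))
    return candidates

# Hypothetical toy data.
points = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
h = make_hash(dim=2, bits=4, seed=1)
table = build_table(points, h)
print(query(table, h, (0.9, 0.1), max_flips=1))
```

Probing more buckets per table trades extra lookups per query for fewer hash tables, which is the usual motivation for multi-probe schemes.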

Redundant Bit Vectors for Quickly Searching High-Dimensional Regions [chapter]

Jonathan Goldstein, John C. Platt, Christopher J. C. Burges
2005 Lecture Notes in Computer Science  
RBVs rely on three key ideas: 1) approximate the high-dimensional regions/distributions as tightened hyperrectangles, 2) partition the query space to store each item redundantly in an index and 3) use  ...  R for the hypersphere and for the hypercube that encloses 99.9% of the probability mass of a unit spherical Gaussian.  ...  If the data does not lie in a vector space, it can still get mapped to a vector space by kernel PCA [6] or multi-dimensional scaling [7] .  ... 
doi:10.1007/11559887_9 fatcat:oauddmqfzjdjbdikutbbf6s5vu

A reliable order-statistics-based approximate nearest neighbor search algorithm [article]

Luisa Verdoliva, Davide Cozzolino, Giovanni Poggi
2016 arXiv   pre-print
Overall, the proposed algorithm corresponds to locality sensitive hashing in the space of directions, with hashing based on the order of components.  ...  Nonetheless, LSH can achieve a good performance through a number of clever expedients, like the use of multiple hash tables [20] , and multi-probe search [22] .  ...  If ROSANNA is regarded as LSH in the space of directions, then the same goal is pursued by the Spherical LSH (SLSH) proposed in [27] , (not to be confused with the unrelated Spherical Hashing [28] ),  ... 
arXiv:1509.03453v2 fatcat:qmru3pefmjdrxcbctfpg3rz7di

Practical linear-space Approximate Near Neighbors in high dimension [article]

Georgia Avarikioti, Ioannis Z. Emiris, Ioannis Psarros, Georgios Samaras
2016 arXiv   pre-print
The c-approximate Near Neighbor problem in high dimensional spaces has been mainly addressed by Locality Sensitive Hashing (LSH), which offers polynomial dependence on the dimension, query time sublinear  ...  To illustrate our claim of practicality, we offer an open-source implementation in C++, and report on several experiments in dimension up to 1000 and n up to 10^6.  ...  FALCONN for multi-probe LSH.  ... 
arXiv:1612.07405v1 fatcat:qwc4rcdhcjaxxbtuqljmpep77y

Improved Similarity Search for Large Data in Machine Learning and Robotics

Josiah Walker
2019 Figshare  
hash (LSH) code generation method which has a lower computational and technical cost than baseline methods, while maintaining performance across a range of datasets.  ...  Applying this framework speeds up existing LSH boosting algorithms without loss of performance.  ...  Subsampled Locality-Sensitive Hashing (Chapter 3) Investigating a dimensionality-free hashing technique for similarity search on high-dimensional data resulted in Subsampled LSH.  ... 
doi:10.6084/m9.figshare.9942509 fatcat:ajvkwfnmyff6hjw7kh2njeuosm

Parametric Plan Caching Using Density-Based Clustering

Gunes Aluç, David E. DeHaan, Ivan T. Bowman
2012 2012 IEEE 28th International Conference on Data Engineering  
The clustering algorithm is density-based, and it exploits locality-sensitive hashing as a pre-processing step so that clusters in the plan spaces can be efficiently stored in database histograms and queried  ...  First, the ordering may place two distant points from the multi-dimensional space next to each other.  ...  Because database histograms are unidimensional data structures, we require a method to map multi-dimensional distributions to a single dimension.  ... 
doi:10.1109/icde.2012.57 dblp:conf/icde/AlucDB12 fatcat:kvbafoc3vnbpbejyzs5pidr2k4

Renewing the respect for similarity

Shimon Edelman, Reza Shahbazi
2012 Frontiers in Computational Neuroscience  
We argue for a renewed focus on similarity as an explanatory concept, by surveying established results and new developments in the theory and methods of similarity-preserving associative lookup and dimensionality  ...  hashing (LSH) and to concomitant statistics, (5) introduces a new model, the Chorus of Relational Descriptors (ChoRD), that extends this framework to scene representation and interpretation, (6) describes  ...  THE CHORUS TRANSFORM IMPLEMENTS LOCALITY-SENSITIVE HASHING (LSH) Significant progress in similarity-based high-dimensional data management has been recently brought about by the development of new algorithms  ... 
doi:10.3389/fncom.2012.00045 pmid:22811664 pmcid:PMC3396327 fatcat:qwcovuag4zb47msmltzfigbvny

Learning to Hash for Indexing Big Data - A Survey [article]

Jun Wang, Wei Liu, Sanjiv Kumar, Shih-Fu Chang
2015 arXiv   pre-print
., Locality-Sensitive Hashing (LSH), explore data-independent hash functions with random projections or permutations.  ...  [119] used LSH for the vector hashing task. While simple, the hashing technique (mapping + LSH) of [119] perhaps suffers from the high dimensionality of the constructed new vector space.  ...  After learning a multi-layer RBM through pre-training and finetuning on a collection of documents, the hash code of any document is acquired by simply thresholding the output of the deepest layer.  ... 
arXiv:1509.05472v1 fatcat:haj52w3cbbgszlmalfyu2kvzde

ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms [article]

Martin Aumüller, Erik Bernhardsson, Alexander Faithfull
2018 arXiv   pre-print
It provides a standard interface for measuring the performance and quality achieved by nearest neighbor algorithms on different standard data sets.  ...  It supports several different ways of integrating k-NN algorithms, and its configuration system automatically tests a range of parameter settings for each algorithm.  ...  This work was supported by a GPU donation from NVIDIA.  ... 
arXiv:1807.05614v2 fatcat:4jy5ouj7ybe2hhw6jttsmuk3ca