
Optimal lower bounds for locality sensitive hashing (except when q is tiny) [article]

Ryan O'Donnell, Yi Wu, Yuan Zhou
2009 arXiv   pre-print
We study lower bounds for Locality Sensitive Hashing (LSH) in the strongest setting: point sets in {0, 1}^d under the Hamming distance.  ...  In this paper we show an optimal lower bound: rho must be at least 1/c (minus o_d(1)).  ...  Acknowledgments The authors would like to thank Alexandr Andoni, Piotr Indyk, Assaf Naor, and Kunal Talwar for helpful discussions.  ... 
arXiv:0912.0250v1 fatcat:q7di7aq7tba4fjc373glkj6ile

Optimal Lower Bounds for Locality-Sensitive Hashing (Except When q is Tiny)

Ryan O'Donnell, Yi Wu, Yuan Zhou
2014 ACM Transactions on Computation Theory  
We study lower bounds for Locality Sensitive Hashing (LSH) in the strongest setting: point sets in {0, 1}^d under the Hamming distance.  ...  Still, we conclude by discussing why it would be more satisfying to find LSH lower bounds that hold for tiny q.  ...  Acknowledgments The authors would like to thank Alexandr Andoni, Piotr Indyk, Assaf Naor, and Kunal Talwar for helpful discussions.  ... 
doi:10.1145/2578221 fatcat:ute36yf7ojfarllh3y3udxf2su

Optimal lower bounds for locality sensitive hashing (except when q is tiny)

Ryan O'Donnell, Yi Wu, Yuan Zhou
2018
We study lower bounds for Locality Sensitive Hashing (LSH) in the strongest setting: point sets in {0, 1}^d under the Hamming distance.  ...  Recall that H is said to be an (r, cr, p, q)-sensitive hash family if all pairs x, y ∈ {0, 1}^d with dist(x, y) ≤ r have probability at least p of collision under a randomly chosen h ∈ H, whereas all pairs  ...  Acknowledgments The authors would like to thank Alexandr Andoni, Piotr Indyk, Assaf Naor, and Kunal Talwar for helpful discussions.  ... 
doi:10.1184/r1/6608117.v1 fatcat:bg3okmx6p5eyjg7wgml5r3pfuy

Discrete Graph Hashing

Wei Liu, Cun Mu, Sanjiv Kumar, Shih-Fu Chang
2014 Neural Information Processing Systems  
unsupervised hashing methods, especially for longer codes.  ...  We argue that the degraded performance is due to inferior optimization procedures used to achieve discrete binary codes.  ...  Proposition 1 implies that the optimization in Eq. (8) can be interpreted as to maximize a lower bound of tr(B^T A B) which is the first term of the objective Q(B, Y) in the original problem (4).  ... 
dblp:conf/nips/LiuMKC14 fatcat:23uz6y3gunhk5bgfeekb33bfm4

A Refined Analysis of LSH for Well-dispersed Data Points [article]

Wenlong Mou, Liwei Wang
2016 arXiv   pre-print
While classical approaches suffer from the curse of dimensionality, locality sensitive hashing (LSH) can effectively solve a-approximate r-near neighbor problem, and has been proven to be optimal in the  ...  Combined with optimal data-oblivious LSH scheme, we get a new query time bound depending on N_b and doubling dimension.  ...  In the rest of this paper, we will denote uniform locality sensitive hashings using the term LSH, except when it is specified as data-dependent.  ... 
arXiv:1612.04571v1 fatcat:hjfjw6rdvrbszd4g7uzm7tcngi

Locality-Sensitive Hashing Techniques for Nearest Neighbor Search

Keon Myung Lee
2012 International Journal of Fuzzy Logic and Intelligent Systems  
Locality-sensitive hashing techniques have been developed for approximate but fast nearest neighbor search.  ...  This paper introduces the notion of locality-sensitive hashing and surveys the locality-sensitive hashing techniques.  ...  [16], in which the locality-sensitive hash functions are defined as follows: A family of functions H = {h : S → U} is an LSH family when for any two points p, q ∈ S, any function h from H, the following  ... 
doi:10.5391/ijfis.2012.12.4.300 fatcat:qwkw27hpd5ht5jhzwfgcajyfs4

Compact Hyperplane Hashing with Bilinear Functions [article]

Wei Liu, Jun Wang (IBM T. J. Watson Research Center), Yadong Mu, Shih-Fu Chang
2012 arXiv   pre-print
The key idea is the bilinear form of the proposed hash functions, which leads to higher collision probability than the existing hyperplane hash functions when using random projections.  ...  To this end, this paper proposes a novel hyperplane hashing technique which yields compact hash codes.  ...  Acknowledgement: This work is supported in part by a Facebook fellowship to the first author.  ... 
arXiv:1206.4618v1 fatcat:vrwgdzpqlfafro4nrommeirfh4

Trinary-Projection Trees for Approximate Nearest Neighbor Search

Jingdong Wang, Naiyan Wang, You Jia, Jian Li, Gang Zeng, Hongbin Zha, Xian-Sheng Hua
2014 IEEE Transactions on Pattern Analysis and Machine Intelligence  
In addition, we provide an extension using multiple randomized trees for improved performance. We justify our approach on large scale local patch indexing and similar image search.  ...  We address the problem of approximate nearest neighbor (ANN) search for visual descriptor indexing.  ...  This is left for future work. Hashing Locality sensitive hashing (LSH) [11], one of the typical hashing algorithms, is a method of performing ANN search in high dimensions.  ... 
doi:10.1109/tpami.2013.125 pmid:24356357 fatcat:dawzydol5na7be55fmcd3tfuse

ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity [article]

Otmar Ertl
2020 arXiv   pre-print
This paper introduces a class of one-pass locality-sensitive hash algorithms that are orders of magnitude faster than the original approach.  ...  In combination with a hash algorithm that maps those weighted sets to compact signatures which allow fast estimation of pairwise similarities, it constitutes a valuable method for big data applications  ...  In this way the same locality-sensitive hash algorithms can be used as for J_W.  ... 
arXiv:1911.00675v2 fatcat:tahh46co4neptiyqm27ufv5hu4

Semi-Supervised Hashing for Large-Scale Search

Jun Wang, S. Kumar, Shih-Fu Chang
2012 IEEE Transactions on Pattern Analysis and Machine Intelligence  
The popular hashing methods, e.g., Locality Sensitive Hashing and Spectral Hashing, construct hash functions based on random or principal projections.  ...  There exist supervised hashing methods that can handle such semantic similarity but they are prone to overfitting when labeled data is small or noisy.  ...  ACKNOWLEDGMENTS We thank the reviewers for their helpful comments and insights. J.  ... 
doi:10.1109/tpami.2012.48 pmid:22331853 fatcat:j4yq5n5j75emxebut2ipqlursq

Sub-linear RACE Sketches for Approximate Kernel Density Estimation on Streaming Data [article]

Benjamin Coleman, Anshumali Shrivastava
2019 arXiv   pre-print
This array is sufficient to estimate the kernel density for a large class of kernels. Our sketch is practical to implement and comes with strong theoretical guarantees.  ...  Unfortunately, kernel methods scale poorly for large, high dimensional datasets.  ...  Recently, locality-sensitive hashing has been used as a fast adaptive sampler for the KDE problem [5, 35].  ... 
arXiv:1912.02283v1 fatcat:6wkc667asbabxdpx24entm36ui

SAXually Explicit Images: Finding Unusual Shapes

Li Wei, Eamonn Keogh, Xiaopeng Xi
2006 IEEE International Conference on Data Mining. Proceedings  
While the brute force search algorithm has quadratic time complexity, we avoid this by using locality-sensitive hashing to estimate similarity between shapes which enables us to reorder the search more  ...  Among the visual features of multimedia content, shape is of particular interest because humans can often recognize objects solely on the basis of shape.  ...  Define the locality-sensitive hash function f : Σ^w → Σ^k by f(s) = s[i_1], s[i_2], ..., s[i_k]. In other words, the locality-sensitive hash function concatenates characters from, at most, k  ... 
doi:10.1109/icdm.2006.138 dblp:conf/icdm/WeiKX06 fatcat:r5f75fwm55hxjk5s5dbs3wyh2m

NNMap: A method to construct a good embedding for nearest neighbor classification

Jing Chen, Yuan Yan Tang, C.L. Philip Chen, Bin Fang, Zhaowei Shang, Yuewei Lin
2015 Neurocomputing  
An important property of NNMap is that the embedding optimization criterion is appropriate for both vector and non-vector data, and equally valid in both metric and non-metric spaces.  ...  The quantitative quality criterion is proposed as a local structure descriptor of sample data distribution. Embedding quality corresponds to the quality of the local structure.  ...  Acknowledgment The authors gratefully thank the two anonymous reviewers for their helpful and constructive comments. This work is supported by  ... 
doi:10.1016/j.neucom.2014.11.014 fatcat:46syhtpcurabnj2u6uik27nrda

Semi-supervised hashing for scalable image retrieval

Jun Wang, Sanjiv Kumar, Shih-Fu Chang
2010 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition  
There exist supervised hashing methods that can handle such semantic similarity but they are prone to overfitting when labeled data is small or noisy.  ...  In this work, we propose a semi-supervised hashing method that is formulated as minimizing empirical error on the labeled data while maximizing variance and independence of hash bits over the labeled and  ...  Zhenguo Li for his valuable comments. J. Wang was supported in part by Google Intern Scholarship. S.-F. Chang is supported in part by National Science Foundation Award CNS-07-51078.  ... 
doi:10.1109/cvpr.2010.5539994 dblp:conf/cvpr/WangKC10 fatcat:moyggvoinncffgd5be7ylfi4dm

Optimal Data-Dependent Hashing for Approximate Near Neighbors

Alexandr Andoni, Ilya Razenshteyn
2015 Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing - STOC '15  
In contrast to [AINR14], the new bound is not only optimal, but in fact improves over the best (optimal) LSH data structures [IM98, AI06] for all approximation factors c > 1.  ...  Our result completes the direction set forth in [AINR14] who gave a proof-of-concept that data-dependent hashing can outperform classic Locality Sensitive Hashing (LSH).  ...  Acknowledgments We thank Piotr Indyk and Sepideh Mahabadi for insightful discussions about the problem and for reading early drafts of this write-up.  ... 
doi:10.1145/2746539.2746553 dblp:conf/stoc/AndoniR15 fatcat:nnnwlmdyfzfxlk2mbjma7c7lki