2-Bit Random Projections, NonLinear Estimators, and Approximate Near Neighbor Search
[article]
2016
arXiv
pre-print
Combining these results, we conclude that 2-bit random projections should be recommended for approximate near neighbor search and similarity estimation. Extensive experimental results are provided. ...
Therefore, a 2-bit scheme appears to be overall a good choice for the task of sublinear time approximate near neighbor search via hash tables. ...
In summary, our paper advances the state-of-the-art of random projections in the context of approximate near neighbor search. Figure 1: 2-bit random projections. ...
arXiv:1602.06577v1
fatcat:wycuuag36bbrjg5kkt4cav3pii
Quantized Random Projections and Non-Linear Estimation of Cosine Similarity
2016
Neural Information Processing Systems
Random projections constitute a simple, yet effective technique for dimensionality reduction with applications in learning and search problems. ...
A specific focus is on the trade-off between bit depth and the number of projections given a fixed budget of bits for storage or transmission. ...
Acknowledgement The work of Ping Li and Martin Slawski is supported by NSF-Bigdata-1419210 and NSF-III-1360971. The work of Michael Mitzenmacher is supported by NSF CCF-1535795 and NSF CCF-1320231. ...
dblp:conf/nips/0001MS16
fatcat:bb7tffrrufg7dlkbgamghxocme
CoRE Kernels
[article]
2014
arXiv
pre-print
However, training nonlinear kernel SVM can be (very) costly in time and memory and may not be suitable for truly large-scale industrial applications (e.g. search). ...
We propose two types of (nonlinear) CoRE kernels for non-binary sparse data and demonstrate the effectiveness of the new kernels through a classification experiment. ...
These signs, which are bits, provide good indexing & space partitioning capability to allow sublinear time approximate near neighbor search under the framework of LSH [13]. ...
arXiv:1404.6216v1
fatcat:rb4pwfoyyrgmvk7lgshvgzekxe
Sign Stable Projections, Sign Cauchy Projections and Chi-Square Kernels
[article]
2013
arXiv
pre-print
The method of stable random projections is popular for efficiently computing the Lp distances in high dimension (where 0<p<=2), using small space. ...
., Cauchy random projections), we show that the probability of collision can be accurately approximated as functions of the chi-square similarity. ...
For example, we can build hash tables using the bits to achieve sublinear time near neighbor search, although this paper does not focus on near neighbor search. ...
arXiv:1308.1009v1
fatcat:mrtj7xqeo5cbtmkxcaytrgqs3m
Hashing for Similarity Search: A Survey
[article]
2014
arXiv
pre-print
Various methods have been developed to address this problem, and recently a lot of efforts have been devoted to approximate search. ...
Similarity search (nearest neighbor search) is a problem of pursuing the data items whose distances to a query item are the smallest from a large database. ...
There are two widely studied randomized search problems: randomized c-approximate R-near neighbor search and randomized R-near neighbor search. ...
arXiv:1408.2927v1
fatcat:reknwesjnbafvcbouyudrzp4rq
Sign Stable Random Projections for Large-Scale Learning
[article]
2015
arXiv
pre-print
, clustering, and near-neighbor search). ...
After the processing by sign stable random projections, the inner products of the processed data approximate various types of nonlinear kernels depending on the value of α. ...
search. ...
arXiv:1504.07235v1
fatcat:n5zaah36rzbdjfezgjfmogwcwu
Learning to Hash for Indexing Big Data - A Survey
[article]
2015
arXiv
pre-print
In response, Approximate Nearest Neighbor (ANN) search based on hashing techniques has become popular due to its promising performance in both efficiency and accuracy. ...
Prior randomized hashing methods, e.g., Locality-Sensitive Hashing (LSH), explore data-independent hash functions with random projections or permutations. ...
The critical technique exploited by [119] is two-step: 1) a simple mapping that maps both query and database elements to "points" in a new vector space, and 2) doing approximate nearest neighbor search ...
arXiv:1509.05472v1
fatcat:haj52w3cbbgszlmalfyu2kvzde
Learning to Hash for Indexing Big Data—A Survey
2016
Proceedings of the IEEE
In response, approximate nearest neighbor (ANN) search based on hashing techniques has become popular due to its promising performance in both efficiency and accuracy. ...
Prior randomized hashing methods, e.g., locality-sensitive hashing (LSH), explore data-independent hash functions with random projections or permutations. ...
(b) Neighbors $(x_1, x_2)$ and nonneighbors $(x_3, x_4)$ of the hyperplane query $P_w$, and the ideal neighbors are the points $\perp w$.
Fig. 11. Two distinct nearest neighbor search problems. ...
doi:10.1109/jproc.2015.2487976
fatcat:4eok2ubzxnc5nmc4hgt4qmqhcy
Bootstrap sequential projection multi kernel Locality Sensitive Hashing
2014
2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI)
One such application is a recommendation system, which uses the approximation theory of the near neighbor concept. ...
When a query arrives to find s similar results or nearest neighbors, the query is also projected onto b hash keys using the above procedure, and then using the hash table we find the s most approximate nearest neighbor ...
doi:10.1109/icacci.2014.6968294
dblp:conf/icacci/MehtaG14
fatcat:hjnzm5fayfesjmaaau33w2bikq
Sizing sketches
2007
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems - SIGMETRICS '07
Recent theoretical and experimental studies have shown that sketches constructed from feature vectors using randomized projections can effectively approximate ℓ1 distance on the feature vectors with the ...
Sketches are compact data structures that can be used to estimate properties of the original data in building large-scale search engines and data analysis systems. ...
Recent theoretical and experimental studies have shown that sketches constructed based on random projections can be used to approximate ℓ1 distance and that such sketches can achieve good filtering accuracy ...
doi:10.1145/1254882.1254900
dblp:conf/sigmetrics/WangDJLCL07
fatcat:2sno2wjvpzb3pfjourym45u7se
Sizing sketches
2007
Performance Evaluation Review
Recent theoretical and experimental studies have shown that sketches constructed from feature vectors using randomized projections can effectively approximate ℓ1 distance on the feature vectors with the ...
Sketches are compact data structures that can be used to estimate properties of the original data in building large-scale search engines and data analysis systems. ...
Recent theoretical and experimental studies have shown that sketches constructed based on random projections can be used to approximate ℓ1 distance and that such sketches can achieve good filtering accuracy ...
doi:10.1145/1269899.1254900
fatcat:tils2c2g5jf6hhnu57vktrkurq
Locality-Sensitive Hashing Techniques for Nearest Neighbor Search
2012
International Journal of Fuzzy Logic and Intelligent Systems
Locality-sensitive hashing techniques have been developed for approximate but fast nearest neighbor search. ...
Nearest neighbor search is such a task which finds from a data set the k nearest data points to queries. ...
Depending on how to choose directions, they are categorized into maximum variance kd-trees, PCA trees, 2-means trees, and random projection trees. ...
doi:10.5391/ijfis.2012.12.4.300
fatcat:qwkw27hpd5ht5jhzwfgcajyfs4
Hashing Algorithms for Large-Scale Learning
[article]
2011
arXiv
pre-print
Our theoretical and empirical comparisons illustrate that usually b-bit minwise hashing is significantly more accurate (at the same storage) than VW (and random projections) in binary data. ...
We then compare b-bit minwise hashing with the Vowpal Wabbit (VW) algorithm (which is related to the Count-Min (CM) sketch). Interestingly, VW has the same variances as random projections. ...
duplicate detection, near-neighbor search, etc. ...
arXiv:1106.0967v1
fatcat:dravguwffvat7lnhua5vparrra
Hashing Techniques
2017
ACM Computing Surveys
Hashing techniques have also evolved from simple randomization approaches to advanced adaptive methods considering locality, structure, label information, and data security, for effective hashing. ...
With the rapid development of information storage and networking technologies, quintillion bytes of data are generated every day from social networks, business transactions, sensors, and many other domains ...
Definition 2.3 (Approximate nearest neighbor (ANN)). ...
doi:10.1145/3047307
fatcat:u5asusjs7vdq7f3a6wgnesnodq
Unsupervised Representation Learning via Neural Activation Coding
[article]
2021
arXiv
pre-print
) nearest neighbor retrieval on CIFAR-10 and FLICKR-25K. ...
We argue that the deep encoder should maximize its nonlinear expressivity on the data for downstream predictors to take full advantage of its representation power. ...
and (ii) nearest neighbor search on CIFAR-10 and FLICKR-25K. ...
arXiv:2112.04014v1
fatcat:zrmhyqk3ofcfhanduncfy3vpyy