2-Bit Random Projections, NonLinear Estimators, and Approximate Near Neighbor Search [article]

Ping Li, Michael Mitzenmacher, Anshumali Shrivastava
2016 arXiv   pre-print
Combining these results, we conclude that 2-bit random projections should be recommended for approximate near neighbor search and similarity estimation. Extensive experimental results are provided.  ...  Therefore, a 2-bit scheme appears to be overall a good choice for the task of sublinear time approximate near neighbor search via hash tables.  ...  In summary, our paper advances the state-of-the-art of random projections in the context of approximate near neighbor search. Figure 1: 2-bit random projections.  ... 
arXiv:1602.06577v1 fatcat:wycuuag36bbrjg5kkt4cav3pii

Quantized Random Projections and Non-Linear Estimation of Cosine Similarity

Ping Li, Michael Mitzenmacher, Martin Slawski
2016 Neural Information Processing Systems  
Random projections constitute a simple, yet effective technique for dimensionality reduction with applications in learning and search problems.  ...  A specific focus is on the trade-off between bit depth and the number of projections given a fixed budget of bits for storage or transmission.  ...  Acknowledgement The work of Ping Li and Martin Slawski is supported by NSF-Bigdata-1419210 and NSF-III-1360971. The work of Michael Mitzenmacher is supported by NSF CCF-1535795 and NSF CCF-1320231.  ... 
dblp:conf/nips/0001MS16 fatcat:bb7tffrrufg7dlkbgamghxocme

CoRE Kernels [article]

Ping Li
2014 arXiv   pre-print
However, training nonlinear kernel SVM can be (very) costly in time and memory and may not be suitable for truly large-scale industrial applications (e.g. search).  ...  We propose two types of (nonlinear) CoRE kernels for non-binary sparse data and demonstrate the effectiveness of the new kernels through a classification experiment.  ...  These signs, which are bits, provide good indexing & space partitioning capability to allow sublinear time approximate near neighbor search under the framework of LSH [13].  ... 
arXiv:1404.6216v1 fatcat:rb4pwfoyyrgmvk7lgshvgzekxe

Sign Stable Projections, Sign Cauchy Projections and Chi-Square Kernels [article]

Ping Li, Gennady Samorodnitsky, John Hopcroft
2013 arXiv   pre-print
The method of stable random projections is popular for efficiently computing the Lp distances in high dimension (where 0<p<=2), using small space.  ...  ., Cauchy random projections), we show that the probability of collision can be accurately approximated as functions of the chi-square similarity.  ...  For example, we can build hash tables using the bits to achieve sublinear time near neighbor search, although this paper does not focus on near neighbor search.  ... 
arXiv:1308.1009v1 fatcat:mrtj7xqeo5cbtmkxcaytrgqs3m

Hashing for Similarity Search: A Survey [article]

Jingdong Wang, Heng Tao Shen, Jingkuan Song, Jianqiu Ji
2014 arXiv   pre-print
Various methods have been developed to address this problem, and recently a lot of efforts have been devoted to approximate search.  ...  Similarity search (nearest neighbor search) is a problem of pursuing the data items whose distances to a query item are the smallest from a large database.  ...  There are two widely studied randomized search problems: randomized c-approximate R-near neighbor search and randomized R-near neighbor search.  ... 
arXiv:1408.2927v1 fatcat:reknwesjnbafvcbouyudrzp4rq

Sign Stable Random Projections for Large-Scale Learning [article]

Ping Li
2015 arXiv   pre-print
, clustering, and near-neighbor search).  ...  After the processing by sign stable random projections, the inner products of the processed data approximate various types of nonlinear kernels depending on the value of α.  ...  search.  ... 
arXiv:1504.07235v1 fatcat:n5zaah36rzbdjfezgjfmogwcwu

Learning to Hash for Indexing Big Data - A Survey [article]

Jun Wang, Wei Liu, Sanjiv Kumar, Shih-Fu Chang
2015 arXiv   pre-print
In response, Approximate Nearest Neighbor (ANN) search based on hashing techniques has become popular due to its promising performance in both efficiency and accuracy.  ...  Prior randomized hashing methods, e.g., Locality-Sensitive Hashing (LSH), explore data-independent hash functions with random projections or permutations.  ...  The critical technique exploited by [119] is two-step: 1) a simple mapping that maps both query and database elements to "points" in a new vector space, and 2) doing approximate nearest neighbor search  ... 
arXiv:1509.05472v1 fatcat:haj52w3cbbgszlmalfyu2kvzde

Learning to Hash for Indexing Big Data—A Survey

Jun Wang, Wei Liu, Sanjiv Kumar, Shih-Fu Chang
2016 Proceedings of the IEEE  
In response, approximate nearest neighbor (ANN) search based on hashing techniques has become popular due to its promising performance in both efficiency and accuracy.  ...  Prior randomized hashing methods, e.g., locality-sensitive hashing (LSH), explore data-independent hash functions with random projections or permutations.  ...  (b) Neighbors (x_1, x_2) and nonneighbors (x_3, x_4) of the hyperplane query P_w, and the ideal neighbors are the points ⊥ w. Fig. 11. Two distinct nearest neighbor search problems.  ... 
doi:10.1109/jproc.2015.2487976 fatcat:4eok2ubzxnc5nmc4hgt4qmqhcy

Bootstrap sequential projection multi kernel Locality Sensitive Hashing

Harsham Mehta, Deepak Garg
2014 2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI)  
Such kind of an application is Recommendation system that uses the approximation theory of near neighbor concept.  ...  When query arrives to find s similar results or nearest neighbor, this query is also projected on b hash keys using above procedure and then using hash table we find s most approximate nearest neighbor  ... 
doi:10.1109/icacci.2014.6968294 dblp:conf/icacci/MehtaG14 fatcat:hjnzm5fayfesjmaaau33w2bikq

Sizing sketches

Zhe Wang, Wei Dong, William Josephson, Qin Lv, Moses Charikar, Kai Li
2007 Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems - SIGMETRICS '07  
Recent theoretical and experimental studies have shown that sketches constructed from feature vectors using randomized projections can effectively approximate ℓ1 distance on the feature vectors with the  ...  Sketches are compact data structures that can be used to estimate properties of the original data in building large-scale search engines and data analysis systems.  ...  Recent theoretical and experimental studies have shown that sketches constructed based on random projections can be used to approximate ℓ1 distance and that such sketches can achieve good filtering accuracy  ... 
doi:10.1145/1254882.1254900 dblp:conf/sigmetrics/WangDJLCL07 fatcat:2sno2wjvpzb3pfjourym45u7se

Sizing sketches

Zhe Wang, Wei Dong, William Josephson, Qin Lv, Moses Charikar, Kai Li
2007 Performance Evaluation Review  
Recent theoretical and experimental studies have shown that sketches constructed from feature vectors using randomized projections can effectively approximate ℓ1 distance on the feature vectors with the  ...  Sketches are compact data structures that can be used to estimate properties of the original data in building large-scale search engines and data analysis systems.  ...  Recent theoretical and experimental studies have shown that sketches constructed based on random projections can be used to approximate ℓ1 distance and that such sketches can achieve good filtering accuracy  ... 
doi:10.1145/1269899.1254900 fatcat:tils2c2g5jf6hhnu57vktrkurq

Locality-Sensitive Hashing Techniques for Nearest Neighbor Search

Keon Myung Lee
2012 International Journal of Fuzzy Logic and Intelligent Systems  
Locality-sensitive hashing techniques have been developed for approximate but fast nearest neighbor search.  ...  Nearest neighbor search is such a task which finds from a data set the k nearest data points to queries.  ...  Depending on how to choose directions, they are categorized into maximum variance kd-trees, PCA trees, 2-means trees, and random projection trees.  ... 
doi:10.5391/ijfis.2012.12.4.300 fatcat:qwkw27hpd5ht5jhzwfgcajyfs4

Hashing Algorithms for Large-Scale Learning [article]

Ping Li, Anshumali Shrivastava, Joshua Moore, Arnd Christian Konig
2011 arXiv   pre-print
Our theoretical and empirical comparisons illustrate that usually b-bit minwise hashing is significantly more accurate (at the same storage) than VW (and random projections) in binary data.  ...  We then compare b-bit minwise hashing with the Vowpal Wabbit (VW) algorithm (which is related to the Count-Min (CM) sketch). Interestingly, VW has the same variances as random projections.  ...  duplicate detection, near-neighbor search, etc.  ... 
arXiv:1106.0967v1 fatcat:dravguwffvat7lnhua5vparrra

Hashing Techniques

Lianhua Chi, Xingquan Zhu
2017 ACM Computing Surveys  
Hashing techniques have also evolved from simple randomization approaches to advanced adaptive methods considering locality, structure, label information, and data security, for effective hashing.  ...  With the rapid development of information storage and networking technologies, quintillion bytes of data are generated every day from social networks, business transactions, sensors, and many other domains  ...  Definition 2.3 (Approximate nearest neighbor (ANN)).  ... 
doi:10.1145/3047307 fatcat:u5asusjs7vdq7f3a6wgnesnodq

Unsupervised Representation Learning via Neural Activation Coding [article]

Yookoon Park, Sangho Lee, Gunhee Kim, David M. Blei
2021 arXiv   pre-print
) nearest neighbor retrieval on CIFAR-10 and FLICKR-25K.  ...  We argue that the deep encoder should maximize its nonlinear expressivity on the data for downstream predictors to take full advantage of its representation power.  ...  and (ii) nearest neighbor search on CIFAR-10 and FLICKR-25K.  ... 
arXiv:2112.04014v1 fatcat:zrmhyqk3ofcfhanduncfy3vpyy