Entropy based locality sensitive hashing

Qiang Wang, Zhiyuan Guo, Gang Liu, Jun Guo
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
The nearest neighbor problem has recently become a research focus, especially for large amounts of data. The locality sensitive hashing (LSH) scheme based on p-stable distributions is a good solution to the approximate nearest neighbor (ANN) problem, but it always maps points to a poor distribution. This paper proposes a set of new entropy-based hash mapping functions for LSH. With the new hash functions, the distribution of mapped values is approximately uniform, which is the maximum entropy distribution. The paper also provides a method for adjusting the parameters to get better performance. Experimental results show that the proposed method is more accurate at the same time cost.
Index Terms: locality sensitive hashing (LSH), approximate nearest neighbor (ANN), entropy, information retrieval, large-scale
doi:10.1109/icassp.2012.6288065 dblp:conf/icassp/WangGLG12 fatcat:mqpwqtun3nbl7idbjit6rj2caq
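
The abstract's central claim, that a uniform distribution of mapped values is the maximum entropy distribution, can be illustrated with a small sketch. The abstract does not give the paper's actual hash functions, so the code below is only an assumption-laden illustration: `fixed_width_buckets` is the standard p-stable LSH bucket rule floor((a.x + b) / w), while `equal_frequency_buckets` is a hypothetical stand-in that cuts the projected values at empirical quantiles. All function names, the Gaussian test data, and the parameter values are made up for the example.

```python
import bisect
import math
import random
from collections import Counter


def projections(points, a):
    """Project each point onto the random direction a (dot product)."""
    return [sum(ai * xi for ai, xi in zip(a, p)) for p in points]


def fixed_width_buckets(values, b, w):
    """Standard p-stable LSH bucket index: floor((a.x + b) / w)."""
    return [math.floor((v + b) / w) for v in values]


def equal_frequency_buckets(values, k):
    """Hypothetical alternative (not the paper's method): place cut
    points at empirical quantiles so that each of the k buckets
    receives roughly the same number of points."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = [ordered[(i * n) // k] for i in range(1, k)]
    # Bucket index = number of cut points less than or equal to the value.
    return [bisect.bisect_right(cuts, v) for v in values]


def entropy_bits(buckets):
    """Shannon entropy (in bits) of the empirical bucket distribution."""
    counts = Counter(buckets)
    n = len(buckets)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def demo(seed=0, n=5000, dim=16, w=4.0):
    """Compare bucket entropy of the two rules on synthetic Gaussian data."""
    rng = random.Random(seed)
    points = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    a = [rng.gauss(0, 1) for _ in range(dim)]  # Gaussian is 2-stable
    b = rng.uniform(0, w)                      # random offset in [0, w)
    proj = projections(points, a)

    fixed = fixed_width_buckets(proj, b, w)
    k = len(set(fixed))                        # use the same bucket count
    equal = equal_frequency_buckets(proj, k)
    return entropy_bits(fixed), entropy_bits(equal), k


if __name__ == "__main__":
    h_fixed, h_equal, k = demo()
    print(f"buckets: {k}, upper bound log2(k): {math.log2(k):.3f} bits")
    print(f"fixed-width entropy:     {h_fixed:.3f} bits")
    print(f"equal-frequency entropy: {h_equal:.3f} bits")
```

With the same number of buckets, the equal-frequency rule should come close to the log2(k) upper bound, while the fixed-width rule falls short because projected values concentrate near the center. Whether a given bucketing still preserves locality well enough for ANN search is a separate question that this sketch does not address.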