K-Means Hashing: An Affinity-Preserving Quantization Method for Learning Binary Compact Codes

Kaiming He, Fang Wen, Jian Sun
2013 IEEE Conference on Computer Vision and Pattern Recognition
In computer vision there has been increasing interest in learning hashing codes whose Hamming distance approximates the data similarity. The hashing functions play roles in both quantizing the vector space and generating similarity-preserving codes. Most existing hashing methods use hyper-planes (or kernelized hyper-planes) to quantize and encode. In this paper, we present a hashing method adopting the k-means quantization. We propose a novel Affinity-Preserving K-means algorithm which simultaneously performs k-means clustering and learns the binary indices of the quantized cells. The distance between the cells is approximated by the Hamming distance of the cell indices. We further generalize our algorithm to a product space for learning longer codes. Experiments show that our method, named K-means Hashing (KMH), outperforms various state-of-the-art hashing encoding methods.

1 The Hamming distance is defined as the number of differing bits between two binary codes. Given two codes i and j, it is computed by popcnt(i ^ j) in C++, where ^ is bitwise xor and popcnt is the instruction counting non-zero bits. This command takes about 10^-9 s.
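The Hamming distance computation described in the footnote can be sketched in C++ as follows. This is a minimal illustration, not code from the paper: the function name `hamming_distance` and the 64-bit code width are assumptions, and `std::bitset::count` stands in for the popcnt instruction (a compiler may lower it to that instruction when the target supports one).

```cpp
#include <bitset>
#include <cstdint>

// Hamming distance between two binary codes of up to 64 bits
// (the 64-bit width is an assumption made for this sketch).
// XOR sets exactly the bit positions where the two codes differ;
// counting the set bits then gives the number of differing bits.
inline unsigned hamming_distance(std::uint64_t i, std::uint64_t j) {
    return static_cast<unsigned>(std::bitset<64>(i ^ j).count());
}
```

For example, the codes 1011 and 0010 differ in two bit positions, so `hamming_distance(0xB, 0x2)` evaluates to 2.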
doi:10.1109/cvpr.2013.378 dblp:conf/cvpr/HeWS13 fatcat:apavshm5mfbtpceug2ffxpif7m