Doc2hash: Learning Discrete Latent Variables for Documents Retrieval

Yifei Zhang, Hao Zhu
2019 North American Chapter of the Association for Computational Linguistics  
Learning to hash via generative models has become a powerful paradigm for fast similarity search in document retrieval. To obtain binary representations (i.e., hash codes), a discrete prior (i.e., a Bernoulli distribution) is imposed when training the variational autoencoder (VAE). However, the discrete stochastic layer is incompatible with backpropagation during training and thus causes a gradient flow problem. In this paper, we propose a method, Doc2hash, that solves the gradient flow problem of the discrete stochastic layer by applying a continuous relaxation to the prior, and trains the generative model in an end-to-end manner to generate hash codes. In qualitative and quantitative experiments, we show that the proposed model outperforms other state-of-the-art methods.
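The continuous relaxation the abstract refers to is commonly realized with the binary-concrete (Gumbel-Softmax-style) trick. The sketch below is a hypothetical NumPy illustration of that general idea, not the authors' implementation: logistic noise is added to the encoder's logits and squashed through a temperature-scaled sigmoid, so the sample stays differentiable in the logits for any temperature above zero while approaching hard 0/1 hash bits as the temperature drops.

```python
import numpy as np

def relaxed_bernoulli(logits, temperature, rng):
    """Continuous relaxation of Bernoulli(sigmoid(logits)).

    Binary-concrete trick: add logistic noise to the logits, divide by a
    temperature, and squash with a sigmoid. For temperature > 0 the sample
    is a smooth function of the logits (so gradients flow); as the
    temperature approaches 0 the samples approach hard {0, 1} bits.
    """
    u = rng.uniform(1e-8, 1.0 - 1e-8, size=logits.shape)
    logistic_noise = np.log(u) - np.log(1.0 - u)  # Logistic(0, 1) sample
    return 1.0 / (1.0 + np.exp(-(logits + logistic_noise) / temperature))

rng = np.random.default_rng(0)
logits = np.array([2.0, -2.0, 0.5, -0.5])  # illustrative encoder outputs

soft = relaxed_bernoulli(logits, temperature=1.0, rng=rng)   # smooth, trainable
cold = relaxed_bernoulli(logits, temperature=0.05, rng=rng)  # near-binary
hash_code = (cold > 0.5).astype(int)                         # final hash bits
```

At inference time the relaxation is typically discarded and the code is thresholded to hard bits, as in the last line.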
doi:10.18653/v1/n19-1232 dblp:conf/naacl/ZhangZ19