Compressed Nonparametric Language Modelling

Ehsan Shareghi, Gholamreza Haffari, Trevor Cohn
In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI 2017)
Hierarchical Pitman-Yor Process (HPYP) priors are compelling for learning language models, outperforming point-estimate based methods. However, these models remain unpopular due to computational and statistical inference issues, such as memory and time usage, as well as poor mixing of the sampler. In this work we propose a novel framework which represents the HPYP model compactly using compressed suffix trees. We then develop an efficient approximate inference scheme in this framework that has a much lower memory footprint compared to the full HPYP and is fast at inference time. The experimental results illustrate that our model can be built on significantly larger datasets than previous HPYP models, while being several orders of magnitude smaller, fast for training and inference, and outperforming the perplexity of state-of-the-art Modified Kneser-Ney count-based LM smoothing by up to 15%.
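
To make the modelling ingredient concrete, the following is a minimal sketch of the standard HPYP predictive probability, P(w | u) = (c_uw - d*t_uw)/(theta + c_u) + (theta + d*t_u)/(theta + c_u) * P(w | suffix(u)), which recurses from the full context down to a uniform base distribution. It is not the paper's implementation: the function name hpyp_prob, the dict-of-dicts layout of the customer counts (c) and table counts (t), the single shared discount d and concentration theta, and the vocabulary size are all illustrative assumptions, and the paper's contribution of storing this hierarchy in a compressed suffix tree with approximate inference is not reflected here.

def hpyp_prob(word, context, counts, tables, d=0.8, theta=1.0, vocab_size=10000):
    """P(word | context) under a simplified hierarchical Pitman-Yor prior.

    counts[context][word] -> customer count c_{u,w} (hand-maintained here;
    tables[context][word] -> table count    t_{u,w}  a full sampler would
    update these by seating/unseating customers)
    Backs off by dropping the oldest context word; the empty context
    backs off to a uniform distribution over vocab_size word types.
    """
    if len(context) == 0:
        base = 1.0 / vocab_size
    else:
        base = hpyp_prob(word, context[1:], counts, tables, d, theta, vocab_size)
    ctx_counts = counts.get(context, {})
    ctx_tables = tables.get(context, {})
    c_uw = ctx_counts.get(word, 0)           # customers eating dish `word` at restaurant `context`
    c_u = sum(ctx_counts.values())           # total customers at this restaurant
    t_uw = ctx_tables.get(word, 0)           # tables serving dish `word`
    t_u = sum(ctx_tables.values())           # total tables
    if c_u == 0:                             # unseen context: fall through to the parent
        return base
    return (max(c_uw - d * t_uw, 0.0) + (theta + d * t_u) * base) / (theta + c_u)

# Toy query with hand-set counts for a bigram context (illustrative values only):
counts = {(): {"the": 3, "cat": 1}, ("the",): {"cat": 2}}
tables = {(): {"the": 1, "cat": 1}, ("the",): {"cat": 1}}
print(hpyp_prob("cat", ("the",), counts, tables))  # ~0.424

Each context in this recursion is one restaurant in the hierarchy; the memory cost the paper targets comes from storing such counts for every observed context, which a compressed suffix tree representation avoids materialising explicitly.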
doi:10.24963/ijcai.2017/376 dblp:conf/ijcai/ShareghiHC17