Learning Kernels over Strings using Gaussian Processes

Daniel Beck, Trevor Cohn
2017 International Joint Conference on Natural Language Processing  
Non-contiguous word sequences are widely known to be important in modelling natural language. However, they are not explicitly encoded in common text representations. In this work we propose a model for text processing using string kernels, capable of flexibly representing non-contiguous sequences. Specifically, we derive a vectorised version of the string kernel algorithm and its gradients, allowing efficient hyperparameter optimisation as part of a Gaussian Process framework. Experiments on synthetic data and text regression for emotion analysis show the promise of this technique.
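
To illustrate how a string kernel can score non-contiguous matches, here is a minimal sketch of the classic gap-weighted subsequence kernel, computed by dynamic programming. It is not the vectorised algorithm from the paper, and it omits the gradients and the Gaussian Process layer. The function name `subsequence_kernel` and the parameters `n` (subsequence length) and `lam` (a single decay factor) are assumptions made for this example.

```python
def subsequence_kernel(s, t, n, lam):
    """Gap-weighted subsequence kernel of order n (illustrative sketch).

    s, t : sequences of comparable symbols (strings or lists of words)
    n    : length of the subsequences being matched (n >= 1)
    lam  : decay in (0, 1]; each matched subsequence is weighted by
           lam raised to the total span it covers in s plus in t
    """
    S, T = len(s), len(t)
    # kp[i][a][b] holds the auxiliary value K'_i for prefixes s[:a], t[:b].
    kp = [[[0.0] * (T + 1) for _ in range(S + 1)] for _ in range(n)]
    for a in range(S + 1):
        for b in range(T + 1):
            kp[0][a][b] = 1.0  # base case: K'_0 is 1 for all prefixes

    for i in range(1, n):
        for a in range(i, S + 1):
            kpp = 0.0  # running K''_i for s[:a] as the prefix of t grows
            for b in range(i, T + 1):
                match = lam * lam * kp[i - 1][a - 1][b - 1] if s[a - 1] == t[b - 1] else 0.0
                kpp = lam * kpp + match
                kp[i][a][b] = lam * kp[i][a - 1][b] + kpp

    # Close each subsequence with a final matching pair of symbols.
    k = 0.0
    for a in range(n, S + 1):
        for b in range(n, T + 1):
            if s[a - 1] == t[b - 1]:
                k += lam * lam * kp[n - 1][a - 1][b - 1]
    return k


# Character level: "cat" and "car" share only the subsequence "ca".
print(subsequence_kernel("cat", "car", n=2, lam=0.5))  # 0.0625

# Word level: "not ... good" matches even with a word in between.
x = "this is not very good".split()
y = "it was not good".split()
print(subsequence_kernel(x, y, n=2, lam=0.5))
```

In this sketch `lam` is fixed by hand. The abstract's point is that such hyperparameters can instead be optimised through kernel gradients within a Gaussian Process, which this example does not attempt.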
dblp:conf/ijcnlp/BeckC17 fatcat:xniubxw2gva73ct62whpmholt4