Bhattacharyya and Expected Likelihood Kernels [chapter]

Tony Jebara, Risi Kondor
2003 Lecture Notes in Computer Science  
We introduce a new class of kernels between distributions. These induce a kernel on the input space between data points by associating to each datum a generative model fit to the data point individually. The kernel is then computed by integrating the product of the two generative models corresponding to two data points. This kernel permits discriminative estimation via, for instance, support vector machines, while exploiting the properties, assumptions, and invariances inherent in the choice of generative model. It satisfies Mercer's condition and can be computed in closed form for a large class of models, including exponential family models, mixtures, hidden Markov models and Bayesian networks. For other models the kernel can be approximated by sampling methods. Experiments are shown for multinomial models in text classification and for hidden Markov models for protein sequence classification.
doi:10.1007/978-3-540-45167-9_6 fatcat:uxa7odqhlfbrtnrnfyoiw4lkn4
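
The construction in the abstract (fit a generative model to each datum, then integrate the product of the two models) can be illustrated for the discrete case, where the integral becomes a sum. The following is a minimal sketch, not the authors' implementation. It assumes each datum is a vector of word counts, that the fitted model is the maximum-likelihood distribution over a single word draw, and that the two kernels in the title correspond to raising each density to a power rho of 1/2 (Bhattacharyya) or 1 (expected likelihood) before taking the product. All function names are hypothetical.

```python
def fit_multinomial(counts):
    """Maximum-likelihood word distribution for one datum (normalized counts)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("a datum needs at least one observed count")
    return [c / total for c in counts]


def probability_product_kernel(p, q, rho):
    """Sum over outcomes of p(x)^rho * q(x)^rho (discrete form of the integral)."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same outcome space")
    return sum((pi ** rho) * (qi ** rho) for pi, qi in zip(p, q))


def bhattacharyya_kernel(p, q):
    """Assumed rho = 1/2 case: sum of sqrt(p(x) * q(x))."""
    return probability_product_kernel(p, q, 0.5)


def expected_likelihood_kernel(p, q):
    """Assumed rho = 1 case: sum of p(x) * q(x)."""
    return probability_product_kernel(p, q, 1.0)


def gram_matrix(count_rows, kernel):
    """Fit one model per datum, then evaluate the kernel on every pair."""
    models = [fit_multinomial(row) for row in count_rows]
    return [[kernel(a, b) for b in models] for a in models]


if __name__ == "__main__":
    # Three toy "documents" over a three-word vocabulary (made-up counts).
    docs = [[2, 1, 1], [0, 2, 2], [4, 0, 0]]
    for row in gram_matrix(docs, bhattacharyya_kernel):
        print(["%.3f" % value for value in row])
```

The resulting Gram matrix is symmetric, and under the Bhattacharyya variant every diagonal entry equals 1 because each fitted distribution sums to one. A matrix of this kind could be handed to any learner that accepts a precomputed kernel, such as a support vector machine.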