A Generative Model for Self/Non-self Discrimination in Strings [chapter]

Matti Pöllä
2009 Lecture Notes in Computer Science  
A statistical generative model is presented as an alternative to negative selection in anomaly detection of string data. We extend the probabilistic approach to binary classification from fixed-length binary strings to variable-length strings over a finite symbol alphabet by fitting a mixture model of multinomial distributions to the frequencies of adjacent symbol pairs. Robust and localized change analysis of text documents is considered as an application area.
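The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes that each string is reduced to counts of adjacent symbol pairs (bigrams), that a mixture of multinomials is fitted to "self" strings with plain EM, and that a string is scored by its per-bigram log-likelihood with the multinomial coefficient omitted. The function names, the number of components `k`, and the smoothing constant `alpha` are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def bigram_counts(s):
    """Count adjacent symbol pairs in a string of any length."""
    return Counter(zip(s, s[1:]))


def log_multinomial(counts, log_theta, floor):
    """Log-likelihood of bigram counts under one component.

    The multinomial coefficient is omitted (an assumption of this sketch);
    bigrams never seen in training get the smoothed `floor` log-probability.
    """
    return sum(n * log_theta.get(bg, floor) for bg, n in counts.items())


def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def fit_mixture(strings, k=2, iters=30, alpha=0.1, seed=0):
    """Fit a k-component mixture of multinomials over bigrams with EM."""
    rng = random.Random(seed)
    data = [bigram_counts(s) for s in strings]
    vocab = sorted({bg for c in data for bg in c})
    v = len(vocab) + 1  # one extra slot reserved for unseen bigrams
    # Random initial responsibilities, normalized per string.
    resp = [[rng.random() for _ in range(k)] for _ in data]
    resp = [[r / sum(row) for r in row] for row in resp]
    for _ in range(iters):
        # M-step: mixing weights and smoothed bigram probabilities.
        weights = [sum(row[j] for row in resp) / len(data) for j in range(k)]
        comps = []
        for j in range(k):
            tot = Counter()
            for row, c in zip(resp, data):
                for bg, n in c.items():
                    tot[bg] += row[j] * n
            z = sum(tot.values()) + alpha * v
            log_theta = {bg: math.log((tot[bg] + alpha) / z) for bg in vocab}
            comps.append((log_theta, math.log(alpha / z)))
        # E-step: posterior probability of each component for each string.
        new_resp = []
        for c in data:
            lj = [math.log(max(weights[j], 1e-300))
                  + log_multinomial(c, *comps[j]) for j in range(k)]
            z = logsumexp(lj)
            new_resp.append([math.exp(l - z) for l in lj])
        resp = new_resp
    return weights, comps


def avg_log_likelihood(s, model):
    """Per-bigram log-likelihood of a string; lower values suggest non-self."""
    weights, comps = model
    c = bigram_counts(s)
    n = sum(c.values())
    if n == 0:
        return 0.0
    lj = [math.log(max(w, 1e-300)) + log_multinomial(c, lt, fl)
          for w, (lt, fl) in zip(weights, comps)]
    return logsumexp(lj) / n


# Usage: train on "self" strings, then compare scores of two test strings.
self_strings = ["abababab", "babababa", "cdcdcdcd", "dcdcdcdc"]
model = fit_mixture(self_strings)
score_self = avg_log_likelihood("abababab", model)
score_other = avg_log_likelihood("adbcadbc", model)
```

In this sketch a string would be flagged as non-self when its score falls below a threshold chosen on held-out self data; how the threshold is set is left open here.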
doi:10.1007/978-3-642-04921-7_30 fatcat:7khvyg6o3zfpxfh3aca3blgwcm