Incremental anomaly detection using two-layer cluster-based structure

Elnaz Bigdeli, Mahdi Mohammadi, Bijan Raahemi, Stan Matwin
2018 Information Sciences  
Anomaly detection algorithms face several challenges, including processing speed and dealing with noise in data. In this thesis, a two-layer cluster-based anomaly detection structure is presented which is fast, noise-resilient and incremental. In this structure, each normal pattern is considered as a cluster, and each cluster is represented using a Gaussian Mixture Model (GMM). New instances are then presented to the GMM to be labeled as normal or abnormal. The proposed structure comprises ... main steps. In the first step, the data are clustered. The second step is to represent each cluster in a way that enables the model to classify new instances. The Summarization based on Gaussian Mixture Model (SGMM) proposed in this thesis represents each cluster as a GMM. In the third step, a two-layer structure efficiently updates clusters using the GMM representation while detecting and ignoring redundant instances. A new approach, called Collective Probabilistic Labeling (CPL), is presented to update clusters in batch mode. This approach makes the updating phase noise-resistant and fast. The collective approach also introduces a new concept called the 'rag bag', which is used to store new instances. The new instances collected in the rag bag are clustered and summarized by GMMs. This enables online systems to identify nearby clusters among the existing and new clusters and merge them quickly, despite the presence of noise, to update the ...
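
The labeling idea described in the abstract (each cluster summarized as a GMM, new instances scored against it, unexplained instances set aside in a rag bag) can be illustrated with a minimal sketch. This is not the authors' SGMM or CPL procedure: it scores instances one at a time rather than collectively, and the function names, the diagonal-covariance assumption, the example parameters, and the density threshold are all hypothetical choices made for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at point x (assumed form)."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def gmm_density(x, components):
    """Mixture density; components is a list of (weight, mean, var) tuples."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

def label_batch(batch, clusters, threshold):
    """Assign each instance to its best-scoring cluster, or to the rag bag.

    The threshold is a hypothetical cut-off on the mixture density; instances
    that no cluster explains well are collected for later clustering.
    """
    labels, rag_bag = [], []
    for x in batch:
        scores = {name: gmm_density(x, comps) for name, comps in clusters.items()}
        best = max(scores, key=scores.get)
        if scores[best] >= threshold:
            labels.append(best)
        else:
            labels.append(None)
            rag_bag.append(x)
    return labels, rag_bag

# Two hypothetical normal patterns, each summarized as a small GMM.
clusters = {
    "A": [(1.0, (0.0, 0.0), (1.0, 1.0))],
    "B": [(0.5, (5.0, 5.0), (1.0, 1.0)), (0.5, (6.0, 5.0), (1.0, 1.0))],
}
batch = [(0.1, -0.2), (5.4, 5.1), (20.0, 20.0)]
labels, rag_bag = label_batch(batch, clusters, threshold=1e-3)
print(labels)
print(rag_bag)
```

In this sketch the first two instances fall near an existing cluster, while the third lies far from both and is placed in the rag bag. Clustering and summarizing the rag bag contents, and merging nearby clusters, are left out.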
doi:10.1016/j.ins.2017.11.023 fatcat:fisgkfropjh7fj3tpxoxdqszxa