HRN: A Holistic Approach to One Class Learning
Wenpeng Hu, Mengyu Wang, Qi Qin, Jinwen Ma, Bing Liu
2020
Neural Information Processing Systems
Existing neural network based one-class learning methods mainly use various forms of auto-encoders or GAN-style adversarial training to learn a latent representation of the given one class of data. This paper proposes an entirely different approach based on a novel regularization, called holistic regularization (or H-regularization), which enables the system to consider the data holistically rather than producing a model biased toward some features. Combined with a proposed 2-norm instance-level data normalization, we obtain an effective one-class learning method, called HRN. To our knowledge, neither the proposed regularization nor the normalization method has been reported before. Experimental evaluation using both benchmark image classification and traditional anomaly detection datasets shows that HRN markedly outperforms state-of-the-art existing deep and non-deep learning models. The code of HRN can be found here.

Introduction

One-class learning or classification has many applications. For example, in information retrieval, one has a set of documents of interest and wants to identify more such documents [55]. Perhaps the biggest application is anomaly or novelty detection, e.g., intrusion detection, fraud detection, medical anomaly detection, and anomaly detection in social networks and the Internet of Things [8, 9]. Recently, image and video based applications have also become popular [13, 49, 70]. More details about these and other applications can be found in recent surveys [7, 61].

One-class learning: Let 𝒳 be the space of all possible data, and let X ⊆ 𝒳 be the set of all instances of a particular class. Given a training dataset T ⊆ X of the class, we want to learn a one-class classifier f(x) : 𝒳 → {0, 1}, where f(x) = 1 if x ∈ X (i.e., x is an instance of the class) and f(x) = 0 otherwise (i.e., x is not an instance of the class, e.g., an anomaly). In most applications, deciding whether a data instance belongs to the given class or is an anomaly can be subjective, and a threshold is often chosen based on the application. Like most existing papers [68, 64, 8, 82], this work is interested in a score function instead and ignores the above binary decision problem. In this case, the commonly used evaluation metric is AUC (Area Under the ROC curve). Early works on one-class classification or learning include one-class SVM (OCSVM) [75] and Support Vector Data Description (SVDD) [78].
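The 2-norm instance-level normalization mentioned in the abstract can be read as scaling each input instance to unit Euclidean length. A minimal sketch of that reading follows; the function name and the epsilon guard are our own illustrative choices, not taken from the paper:

```python
import numpy as np

def l2_instance_normalize(X, eps=1e-12):
    # Divide each row (one data instance) by its Euclidean (2-) norm,
    # so every instance lies on the unit sphere; eps guards zero rows.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)

X = np.array([[3.0, 4.0], [0.0, 2.0]])
Xn = l2_instance_normalize(X)  # rows [0.6, 0.8] and [0.0, 1.0]
```

Normalizing per instance (rather than per feature) keeps the relative pattern within each instance while removing overall magnitude, which fits the paper's goal of not letting a few large-valued features dominate.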
More recently, deep learning models have been proposed for the same purpose [68, 8], which mainly learn a good latent representation of the given one class of data.

* Equal contribution
† Corresponding author. The work was done when B. Liu was at Peking University on leave of absence from
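Since evaluation is score-based rather than threshold-based, AUC can be computed directly from anomaly scores as the probability that a randomly chosen in-class instance scores higher than a randomly chosen anomaly (the Mann-Whitney formulation). The sketch below illustrates this with placeholder scores, not HRN outputs:

```python
def auc(y_true, scores):
    # AUC = P(score of a random positive > score of a random negative),
    # counting ties as 1/2 (pairwise / Mann-Whitney formulation).
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 1 = instance of the trained class, 0 = anomaly; scores are placeholders.
y_true = [1, 1, 1, 0, 1, 0]
scores = [0.92, 0.85, 0.78, 0.70, 0.66, 0.30]
print(auc(y_true, scores))  # 0.875
```

Because AUC depends only on the ranking induced by the score function, no decision threshold needs to be fixed, which is why score-based one-class methods report it.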