LAGO: A Computationally Efficient Approach for Statistical Detection

Mu Zhu, Wanhua Su, Hugh A Chipman
2006 Technometrics  
We study a general class of statistical detection problems where the underlying objective is to detect items belonging to a rare class from a very large database. We propose a computationally efficient method to achieve this goal. Our method consists of two steps. In the first step, we estimate the density function of the rare class alone with an adaptive bandwidth kernel density estimator. The adaptive choice of the bandwidth is inspired by the ancient Chinese board game known today as Go. In the second step, we adjust this density locally depending on the density of the background class nearby. We show that the amount of adjustment needed in the second step is approximately equal to the adaptive bandwidth from the first step, which gives us additional computational savings. We name the resulting method LAGO for "locally adjusted Go-kernel density estimator." We then apply LAGO to a real drug discovery data set and compare its performance with a number of existing and popular methods.
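
The abstract gives no formulas, so the following is only a minimal one-dimensional sketch of the two steps it describes, not the authors' implementation. It rests on several assumptions that do not come from the abstract: the adaptive bandwidth of each rare-class point is taken as the mean distance to its `k` nearest background points, the kernel is Gaussian, and `alpha` is a hypothetical global scale factor. The function names `adaptive_bandwidths` and `lago_like_score` are likewise illustrative.

```python
import numpy as np


def adaptive_bandwidths(rare, background, k=3):
    """Step 1 (assumed form): one bandwidth per rare-class point.

    Assumption: the bandwidth is the mean distance from the rare point
    to its k nearest background-class points.
    """
    # Pairwise absolute distances, shape (n_rare, n_background).
    dist = np.abs(rare[:, None] - background[None, :])
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)


def lago_like_score(x, rare, background, k=3, alpha=1.0):
    """Step 2 (assumed form): locally adjusted kernel score at each x.

    Each Gaussian kernel centered on a rare point is weighted by that
    point's own bandwidth, reflecting the abstract's statement that the
    local adjustment is approximately equal to the adaptive bandwidth.
    """
    r = adaptive_bandwidths(rare, background, k)
    h = alpha * r  # kernel width; alpha is a hypothetical tuning knob
    z = (x[:, None] - rare[None, :]) / h[None, :]
    kernel = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * h[None, :])
    # Weight each kernel by its bandwidth, then average over rare points.
    return (kernel * r[None, :]).mean(axis=1)
```

Under these assumptions, candidate items would be ranked by descending score, so that items close to known rare-class points in regions with little background data come first. Reusing the step-one bandwidth as the step-two weight means no separate background density estimate is computed, which is one plausible reading of the computational savings the abstract mentions.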
doi:10.1198/004017005000000643 fatcat:qs2hobjudvfhzm5kvxdsk6lyuy