LAGO: A Computationally Efficient Approach for Statistical Detection

Mu Zhu, Wanhua Su, Hugh A Chipman
2006 Technometrics  
We study a general class of statistical detection problems where the underlying objective is to detect items belonging to a rare class from a very large database. We propose a computationally efficient method to achieve this goal. Our method consists of two steps. In the first step we estimate the density function of the rare class alone with an adaptive bandwidth kernel density estimator. The adaptive choice of the bandwidth is inspired by the ancient Chinese board game known today as Go. In the second step we adjust this density locally depending on the density of the background class nearby. We show that the amount of adjustment needed in the second step is approximately equal to the adaptive bandwidth from the first step, which gives us additional computational savings. We name the resulting method LAGO, for "locally adjusted Go-kernel density estimator." We then apply LAGO to a real drug discovery dataset and compare its performance with a number of existing and popular methods.
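The two steps described in the abstract can be sketched roughly as follows. This is only an illustrative reading, not the authors' implementation: the abstract does not specify how the adaptive bandwidth is computed or which kernel is used, so the sketch assumes the bandwidth of each rare-class point is the mean distance to its k nearest background points (scaled by a hypothetical tuning parameter alpha) and assumes a Gaussian kernel. The function name lago_scores and all of its parameters are hypothetical.

```python
import numpy as np


def lago_scores(X_rare, X_background, X_query, k=5, alpha=1.0):
    """Hypothetical LAGO-style score; a higher value suggests the rare class.

    Assumptions (not stated in the abstract): the adaptive bandwidth is the
    mean distance from a rare point to its k nearest background points, and
    the kernel is Gaussian. Requires k <= number of background points and
    no rare point with a zero radius.
    """
    X_rare = np.asarray(X_rare, dtype=float)
    X_background = np.asarray(X_background, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    dim = X_rare.shape[1]

    # Step 1: adaptive radius r_i for each rare-class point, taken here as
    # the mean distance to its k nearest background-class points.
    d_rb = np.linalg.norm(
        X_rare[:, None, :] - X_background[None, :, :], axis=2
    )  # shape (n_rare, n_background)
    r = np.sort(d_rb, axis=1)[:, :k].mean(axis=1)  # shape (n_rare,)
    h = alpha * r  # per-point bandwidth

    # Gaussian kernel centred at each rare point, evaluated at each query.
    d_qr = np.linalg.norm(
        X_query[:, None, :] - X_rare[None, :, :], axis=2
    )  # shape (n_query, n_rare)
    kernel = np.exp(-0.5 * (d_qr / h) ** 2) / ((np.sqrt(2.0 * np.pi) * h) ** dim)

    # Step 2: local adjustment. Following the abstract's observation that the
    # adjustment is approximately the adaptive bandwidth itself, each kernel
    # is simply weighted by r_i, so no background density is estimated.
    return (kernel * r).mean(axis=1)
```

Under these assumptions, the only pass over the background class is the nearest-neighbour search for the rare points, which is where the computational savings mentioned in the abstract would come from.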
doi:10.1198/004017005000000643 fatcat:qs2hobjudvfhzm5kvxdsk6lyuy