Consistency of data-driven histogram methods for density estimation and classification

Gábor Lugosi, Andrew Nobel
1996 The Annals of Statistics  
We present general sufficient conditions for the almost sure L1-consistency of histogram density estimates based on data-dependent partitions. Analogous conditions guarantee the almost sure risk consistency of histogram classification schemes based on data-dependent partitions. Multivariate data is considered throughout. In each case, the desired consistency requires shrinking cells, subexponential growth of a combinatorial complexity measure, and sub-linear growth of the number of cells. It is not required that the cells of every partition be rectangles with sides parallel to the coordinate axes, or that each cell contain a minimum number of points. No assumptions are made concerning the common distribution of the training vectors. We apply the results to establish the consistency of several known partitioning estimates, including the k_n-spacing density estimate, classifiers based on statistically equivalent blocks, and classifiers based on multivariate clustering schemes.
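
As an illustration of a histogram density estimate built on a data-dependent partition, the following is a minimal univariate sketch in the spirit of the k_n-spacing estimate named in the abstract. It is not taken from the paper: the function names spacing_histogram and evaluate, the restriction of the partition to the range of the sample (with the estimate set to zero outside it), and the choice k = sqrt(n) are all assumptions made here for illustration. The sample points are assumed distinct, so that no cell has zero width.

```python
import numpy as np


def spacing_histogram(sample, k):
    """Hypothetical helper: cell edges and heights of a k-spacing histogram.

    Cells are bounded by every k-th order statistic, so each cell holds
    about k sample points; the last cell absorbs the remainder.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    if n < 2 or k < 1:
        raise ValueError("need at least two points and k >= 1")
    # Indices of the order statistics used as cell boundaries.
    idx = np.arange(0, n, k)
    if idx[-1] != n - 1:
        idx = np.append(idx, n - 1)
    edges = x[idx]
    # Cells are half-open [a, b), except the last one, which is closed
    # and therefore also contains the sample maximum.
    counts = np.diff(idx).astype(float)
    counts[-1] += 1.0
    # Histogram rule: (fraction of points in the cell) / (cell length).
    heights = counts / (n * np.diff(edges))
    return edges, heights


def evaluate(edges, heights, t):
    """Evaluate the estimate at t; it is zero outside [edges[0], edges[-1]]."""
    t = np.asarray(t, dtype=float)
    j = np.searchsorted(edges, t, side="right") - 1
    j = np.clip(j, 0, len(heights) - 1)
    inside = (t >= edges[0]) & (t <= edges[-1])
    return np.where(inside, heights[j], 0.0)


# Example usage with an assumed choice k = sqrt(n).
rng = np.random.default_rng(0)
data = rng.normal(size=1000)
edges, heights = spacing_histogram(data, k=int(np.sqrt(data.size)))
print(evaluate(edges, heights, [0.0, 1.0, 10.0]))
```

With k growing like sqrt(n), the number of cells (about n/k) grows sub-linearly in n while the fraction k/n of points per cell tends to zero, which is the regime the abstract's conditions on cell count and shrinking cells point to.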
doi:10.1214/aos/1032894460 fatcat:dfcqcdzcczh53pvgobgsb5p7hq