A General Classification Rule for Probability Measures

Sanjeev R. Kulkarni, Ofer Zeitouni
1995 Annals of Statistics  
We consider the problem of classifying an unknown probability distribution based on a sequence of random samples drawn according to this distribution. Specifically, if A is a subset of the space of all probability measures M_1(E) over some compact Polish space E, we want to decide whether the unknown distribution belongs to A or to its complement. We propose an algorithm which leads a.s. to a correct decision for any A satisfying certain structural assumptions. A refined decision procedure is also presented which, given a countable collection A_i ⊂ M_1(E), i = 1, 2, ..., each satisfying the structural assumption, will eventually determine a.s. the membership of the distribution in any finite number of the A_i. Applications to density estimation and the problem of order determination of Markov processes are discussed. Abbreviated Title: Classifying Probability Measures.
doi:10.1214/aos/1176324714 fatcat:xejlx53srra45p5iwzssjwtymu