Learning Poisson Binomial Distributions
Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio
2012
Proceedings of the 44th symposium on Theory of Computing - STOC '12
We consider a basic problem in unsupervised learning: learning an unknown Poisson Binomial Distribution. A Poisson Binomial Distribution (PBD) over {0, 1, ..., n} is the distribution of a sum of n independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by S. Poisson in 1837 [Poi37] and are a natural n-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal.
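As a concrete illustration of the definition above (our sketch, not code from the paper; the names sample_pbd and pbd_pmf are hypothetical), the following Python draws samples from a PBD and computes its exact probability mass function by convolving in one Bernoulli at a time:

import random

def sample_pbd(p):
    # One draw from the PBD with Bernoulli means p = (p_1, ..., p_n):
    # count how many of the n independent coin flips come up 1.
    return sum(1 for pi in p if random.random() < pi)

def pbd_pmf(p):
    # Exact pmf of the PBD over {0, ..., n}, built by convolving in one
    # Bernoulli at a time; O(n^2) arithmetic operations overall.
    pmf = [1.0]  # law of the empty sum: all mass on 0
    for pi in p:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - pi)  # the i-th variable is 0
            nxt[k + 1] += mass * pi    # the i-th variable is 1
        pmf = nxt
    return pmf

When all means are equal the PBD collapses to a Binomial; e.g., pbd_pmf([0.5] * 4) returns the Binomial(4, 1/2) probabilities [0.0625, 0.25, 0.375, 0.25, 0.0625].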
We essentially settle the complexity of the learning problem for this basic class of distributions. As our first main result we give a highly efficient algorithm which learns to ε-accuracy (with respect to the total variation distance) using Õ(1/ε^3) samples, independent of n. The running time of the algorithm is quasilinear in the size of its input data, i.e., Õ(log(n)/ε^3) bit-operations. (Observe that each draw from the distribution is a log(n)-bit string.) Our second main result is a proper learning algorithm that learns to ε-accuracy using O(1/ε^2) samples, and runs in time (1/ε)^poly(log(1/ε)) · log n. This sample complexity is nearly optimal, since any algorithm for this problem must use Ω(1/ε^2) samples. We also give positive and negative results for some extensions of this learning problem to weighted sums of independent Bernoulli random variables. Throughout, Õ(·) hides factors which are polylogarithmic in its argument; thus, for example, Õ(a log b) denotes a quantity which is O(a log b · log^c(a log b)) for some absolute constant c.

... a much richer class of distributions. (See Section 1.2 below.) It is believed that Poisson [Poi37] was the first to consider this extension of the Binomial distribution, and the distribution is sometimes referred to as "Poisson's Binomial Distribution" in his honor; we shall simply call these distributions PBDs. PBDs are one of the most basic classes of discrete distributions; indeed, they are arguably the simplest n-parameter probability distribution that has some nontrivial structure. As such they have been intensely studied in probability and statistics (see Section 1.2) and arise in many settings; for example, we note here that tail bounds on PBDs form an important special case of Chernoff/Hoeffding bounds [Che52, Hoe63, DP09]. In application domains, PBDs have many uses in research areas such as survey sampling, case-control studies, and survival analysis; see, e.g., [CL97] for a survey of the many uses of these distributions in applications.

Given the simplicity and ubiquity of these distributions, it is quite surprising that the problem of density estimation for PBDs (i.e., learning an unknown PBD from independent samples) is not well understood in the statistics or learning theory literature. This is the problem we consider, and essentially settle, in this paper.

We work in a natural PAC-style model of learning an unknown discrete probability distribution, which is essentially the model of [KMR+94]. In this learning framework for our problem, the learner is provided with the value of n and with independent samples drawn from an unknown PBD X. Using these samples, the learner must with probability at least 1 − δ output a hypothesis distribution X̂ such that the total variation distance d_TV(X, X̂) is at most ε, where ε, δ > 0 are accuracy and confidence parameters that are provided to the learner. A proper learning algorithm in this framework outputs a distribution that is itself a Poisson Binomial Distribution, i.e., a vector p̂ = (p̂_1, ..., p̂_n) which describes the hypothesis PBD X̂ = Σ_{i=1}^n X̂_i where E[X̂_i] = p̂_i.
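To make the accuracy criterion concrete, here is a minimal Python sketch (ours, not the paper's algorithm; total_variation is a hypothetical helper) of the total variation distance d_TV(P, Q) = (1/2) Σ_k |P(k) − Q(k)| between two distributions on {0, ..., n}:

def total_variation(pmf_p, pmf_q):
    # d_TV between two distributions on {0, ..., n}, each given as a list
    # of point masses; shorter lists are zero-padded to a common length.
    n = max(len(pmf_p), len(pmf_q))
    pad = lambda v: v + [0.0] * (n - len(v))
    return 0.5 * sum(abs(a - b) for a, b in zip(pad(pmf_p), pad(pmf_q)))

Under this metric, a proper learner's output vector p̂ is judged by the distance between the hypothesis PBD it induces and the unknown X; the paper's guarantee is that this distance is at most ε with probability at least 1 − δ.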
doi:10.1145/2213977.2214042
dblp:conf/stoc/DaskalakisDS12
fatcat:ybmy5idp5zabjnwtxtzmz54g3m