1,010 Hits in 10.4 sec

How to Achieve Minimax Expected Kullback-Leibler Distance from an Unknown Finite Distribution [chapter]

Dietrich Braess, Jürgen Forster, Tomas Sauer, Hans U. Simon
2002 Lecture Notes in Computer Science  
The basic goal is to find rules that map "partial information" about a distribution X over an m-letter alphabet into a guess X̂ for X such that the Kullback-Leibler divergence between X and X̂ is as small  ...  The cost associated with a rule is the maximal expected Kullback-Leibler divergence between X and X̂.  ...  Then the Kullback-Leibler distance between X and X̂, denoted as D(X‖X̂) and sometimes called relative entropy, measures how many additional bits we use compared to an optimal code for X.  ... 
doi:10.1007/3-540-36169-3_30 fatcat:xdtkvsio4rckxorefjh6osh34i
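The snippet above reads the Kullback-Leibler distance as the expected number of extra bits incurred by coding for the wrong distribution. A minimal sketch of that reading for a discrete alphabet, with made-up distributions rather than anything from the paper:

```python
import numpy as np

def kl_bits(p, q):
    """Kullback-Leibler distance D(p || q) in bits between discrete distributions:
    the expected number of extra bits paid when data drawn from p are encoded
    with a code that is optimal for q instead of p."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0                     # terms with p_i = 0 contribute nothing
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# Example: true distribution p vs. a uniform guess q over a 3-letter alphabet.
p = [0.5, 0.3, 0.2]
q = [1/3, 1/3, 1/3]
print(kl_bits(p, q))   # about 0.10 extra bits per symbol
```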

The minimum description length principle in coding and modeling

A. Barron, J. Rissanen, Bin Yu
1998 IEEE Transactions on Information Theory  
The normalized maximized likelihood, mixture, and predictive codings are each shown to achieve the stochastic complexity to within asymptotically vanishing terms.  ...  We assess the performance of the minimum description length criterion both from the vantage point of quality of data compression and accuracy of statistical inference.  ...  … to …, which is different from the expected regret considered in Section III, is the Kullback-Leibler divergence between … and …. This identity links the fundamental quantity, expected redundancy, from coding  ... 
doi:10.1109/18.720554 fatcat:s7ylg53uvzhufabbucrzq4m26q
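The normalized maximized likelihood (NML) code mentioned above is concrete enough to compute in the simplest case. A sketch for the Bernoulli model only (the general multi-parameter treatment of the paper is not reproduced); the code length is the stochastic complexity, i.e. the maximized log-likelihood plus the log of the Shtarkov normalizer:

```python
import math

def bernoulli_nml_logcomp(n):
    """Log of the Shtarkov/NML normalizer for the Bernoulli model:
    log sum_{k=0}^{n} C(n, k) (k/n)^k ((n-k)/n)^(n-k),  with 0^0 = 1."""
    total = 0.0
    for k in range(n + 1):
        log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        log_ml = 0.0
        if 0 < k < n:
            log_ml = k * math.log(k / n) + (n - k) * math.log(1 - k / n)
        total += math.exp(log_binom + log_ml)
    return math.log(total)

def stochastic_complexity(x):
    """NML code length (in nats) of a binary sequence x under the Bernoulli model."""
    n, k = len(x), sum(x)
    log_ml = 0.0
    if 0 < k < n:
        log_ml = k * math.log(k / n) + (n - k) * math.log(1 - k / n)
    return -log_ml + bernoulli_nml_logcomp(n)

print(stochastic_complexity([1, 0, 1, 1, 0, 1, 1, 1]))   # code length in nats
```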

The Minimum Description Length Principle in Coding and Modeling [chapter]

2009 Information Theory  
The normalized maximized likelihood, mixture, and predictive codings are each shown to achieve the stochastic complexity to within asymptotically vanishing terms.  ...  We assess the performance of the minimum description length criterion both from the vantage point of quality of data compression and accuracy of statistical inference.  ...  … to …, which is different from the expected regret considered in Section III, is the Kullback-Leibler divergence between … and …. This identity links the fundamental quantity, expected redundancy, from coding  ... 
doi:10.1109/9780470544907.ch25 fatcat:b5bnutcdg5cyvetrcohul7omvm

Kullback–Leibler divergence for interacting multiple model estimation with random matrices

Wenling Li, Yingmin Jia
2016 IET Signal Processing  
The system state and the unknown covariance are jointly estimated in the framework of Bayesian estimation, where the unknown covariance is modeled as a random matrix according to an inverse-Wishart distribution  ...  Instead of using the moment matching approach, this difficulty is overcome by minimizing the weighted Kullback-Leibler divergence for inverse-Wishart distributions.  ...  In addition, the Kullback-Leibler divergence can be considered an example of the Ali-Silvey class of information theoretic measures [23], and it quantifies how close a probability distribution is to  ... 
doi:10.1049/iet-spr.2015.0149 fatcat:eeiyx26p7ffito67f3l5xjtst4
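The snippet notes that the Kullback-Leibler divergence "quantifies how close a probability distribution is" to another. The paper works with inverse-Wishart matrix distributions; as a simpler, standard stand-in (not the paper's derivation), the divergence between two multivariate Gaussians has a closed form that is easy to evaluate:

```python
import numpy as np

def kl_gauss(mu0, S0, mu1, S1):
    """Closed-form KL divergence D( N(mu0, S0) || N(mu1, S1) ) in nats."""
    mu0, mu1 = np.asarray(mu0, float), np.asarray(mu1, float)
    S0, S1 = np.asarray(S0, float), np.asarray(S1, float)
    k = mu0.size
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0)          # covariance mismatch
                  + diff @ S1_inv @ diff         # mean mismatch
                  - k
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

# Two 2-D Gaussians with illustrative (made-up) parameters.
print(kl_gauss([0, 0], np.eye(2), [1, 0], 2 * np.eye(2)))
```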

Kullback-Leibler divergence for interacting multiple model estimation with random matrices [article]

Wenling Li, Yingmin Jia
2014 arXiv   pre-print
The system state and the unknown covariance are jointly estimated in the framework of Bayesian estimation, where the unknown covariance is modeled as a random matrix according to an inverse-Wishart distribution  ...  Instead of using the moment matching approach, this difficulty is overcome by minimizing the weighted Kullback-Leibler divergence for inverse-Wishart distributions.  ...  In addition, the Kullback-Leibler divergence can be considered an example of the Ali-Silvey class of information theoretic measures [23], and it quantifies how close a probability distribution is to  ... 
arXiv:1411.1284v1 fatcat:67drn6o3nngfblcx5wadzua3cy

Adaptive Optimal Transport [article]

Montacer Essid, Debra Laefer, Esteban G. Tabak
2019 arXiv   pre-print
Specifically, instead of a discrete point-by-point assignment, the new procedure seeks an optimal map T(x) defined for all x, minimizing the Kullback-Leibler divergence between (T(x_i)) and the target (y_j)  ...  An adaptive, adversarial methodology is developed for the optimal transport problem between two distributions μ and ν, known only through a finite set of independent samples (x_i)_{i=1..N} and (y_j)_{j=1…}  ...  Acknowledgments: The authors would like to thank Yongxin Chen for connecting our variational formulation of the Kullback-Leibler divergence with the Donsker-Varadhan formula.  ... 
arXiv:1807.00393v2 fatcat:fngcr3u4mzbhnglkyblk5gb3iy
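The acknowledgment above points to the Donsker-Varadhan formula, D(μ‖ν) = sup_f { E_μ[f] − log E_ν[e^f] }, which underlies the paper's variational treatment of the Kullback-Leibler divergence. A hedged sketch evaluating that lower bound from samples for a fixed, hand-picked test function f (the paper optimizes over f adversarially, which is not reproduced here; the Gaussian example is made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from mu = N(1, 1) and nu = N(0, 1); the true KL(mu || nu) is 0.5.
x_mu = rng.normal(1.0, 1.0, size=100_000)
y_nu = rng.normal(0.0, 1.0, size=100_000)

def dv_lower_bound(f, x_mu, y_nu):
    """Donsker-Varadhan bound  E_mu[f(X)] - log E_nu[exp(f(Y))]  <=  KL(mu || nu)."""
    return float(np.mean(f(x_mu)) - np.log(np.mean(np.exp(f(y_nu)))))

# The optimal f is the log density ratio, here f(t) = t - 1/2; any other choice
# of f gives a strictly smaller value of the bound.
print(dv_lower_bound(lambda t: t - 0.5, x_mu, y_nu))   # close to 0.5
print(dv_lower_bound(lambda t: 0.5 * t, x_mu, y_nu))   # noticeably smaller
```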

Maximum-loss, minimum-win and the Esscher pricing principle

R. M. Kovacevic
2011 IMA Journal of Management Mathematics  
The basic idea is to value a (financial) random variable by its worst case expectation, where the most unfavourable probability measure-the 'worst case distribution'-lies within a given Kullback-Leibler  ...  The article gives an overview of the properties of this measure and analyses relations to other risk and acceptability measures and to the well-known Esscher pricing principle, used in insurance mathematics  ...  On the one hand, usage of the Kullback-Leibler divergence can be seen sceptically: it is not a full distance and does not metricize the weak topology.  ... 
doi:10.1093/imaman/dpr019 fatcat:upiwxhf7avbtpjuhm4nnx65urq
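The link described above between worst-case expectations over a Kullback-Leibler ball and the Esscher principle rests on the fact that the worst-case measure is an exponential tilt of the base measure. A minimal sketch for a discrete base distribution, with made-up numbers and a plain bisection on the tilt parameter (not the paper's general treatment):

```python
import numpy as np

def esscher_worst_case(x, p, eta, lam_hi=50.0, tol=1e-10):
    """Worst-case expectation of X over { Q : KL(Q || P) <= eta } for a discrete
    base distribution P: the maximizer is an Esscher (exponential) tilt
    q_lam proportional to p * exp(lam * x), with lam >= 0 chosen so that
    KL(q_lam || P) = eta."""
    x, p = np.asarray(x, float), np.asarray(p, float)

    def tilt(lam):
        w = p * np.exp(lam * (x - x.max()))      # shifted for numerical stability
        return w / w.sum()

    def kl(q):
        return float(np.sum(q * np.log(q / p)))

    lo, hi = 0.0, lam_hi                         # KL(q_lam || P) increases in lam
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if kl(tilt(mid)) < eta else (lo, mid)
    q = tilt(lo)
    return float(q @ x), q

# A loss taking values 0, 1, 5 under a base model P, with KL radius eta = 0.1.
worst_mean, q_star = esscher_worst_case([0.0, 1.0, 5.0], [0.7, 0.25, 0.05], 0.1)
print(worst_mean)   # larger than the base expectation 0.5
```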

On the Frequentist Properties of Bayesian Nonparametric Methods

Judith Rousseau
2016 Annual Review of Statistics and Its Application  
In particular, I will explain how posterior concentration rates can be derived and what we learn from such analysis in terms of the impact of the prior distribution in large dimensional models.  ...  In this paper, I will review the main results on the asymptotic properties of the posterior distribution in nonparametric or large dimensional models.  ...  From (Schwartz, 1965) and (Barron, 1988), posterior consistency at θ_0 under the loss d(·, ·) is achieved if for all θ ∈ Θ there exists D(θ_0; θ) (typically the Kullback-Leibler divergence) such that  ... 
doi:10.1146/annurev-statistics-041715-033523 fatcat:vpkoyzg4p5fmfiilucxtvfwnii
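The truncated last snippet refers to the classical Schwartz/Barron route to posterior consistency. For orientation only (this is the standard textbook statement of the prior-positivity part of Schwartz's condition, not the exact formulation used in the review), the key requirement is that the prior charges every Kullback-Leibler neighbourhood of the true parameter:

```latex
% Prior positivity of Kullback--Leibler neighbourhoods of \theta_0
% (Schwartz, 1965); together with suitable tests this yields posterior
% consistency at \theta_0 under the loss d.
\[
  \Pi\bigl(\{\theta \in \Theta :
      \mathrm{KL}(f_{\theta_0}, f_\theta) < \varepsilon\}\bigr) > 0
  \qquad \text{for every } \varepsilon > 0 .
\]
```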

Feature context-dependency and complexity-reduction in probability landscapes for integrative genomics

Annick Lesne, Arndt Benecke
2008 Theoretical Biology and Medical Modelling  
Furthermore, insights into the nature of individual features and a classification of features according to their minimal context-dependency are achieved.  ...  The question of how to integrate heterogeneous sources of biological information into a coherent framework that allows the gene regulatory code in eukaryotes to be systematically investigated is one of  ...  This work has been supported by funds from the Institut des Hautes Études Scientifiques, the Centre National de la Recherche Scientifique (CNRS), the French Ministry of Research through the "Complexité  ... 
doi:10.1186/1742-4682-5-21 pmid:18783599 pmcid:PMC2559821 fatcat:ym3br2e7arbmbjwfmwe3cj6ahq

Composite Tests under Corrupted Data

Michel Broniatowski, Jana Jurečková, Ashok Moses, Emilie Miranda
2019 Entropy  
For i = 1, …, n, we observe X_i = Z_i + δ V_i, with an unknown parameter δ and an unobservable random variable V_i. It is assumed that the random variables Z_i are i.i.d., as are the X_i and the V_i.  ...  A new definition of least-favorable hypotheses for the aggregate family of tests is presented, and a relation with the Kullback-Leibler divergence between the sets {f_δ} and {g_δ} is presented.  ...  Acknowledgments: The authors are thankful to Jan Kalina for discussion; they also thank two anonymous referees for comments which helped to improve on a former version of this paper.  ... 
doi:10.3390/e21010063 pmid:33266779 fatcat:55d74zhaevhyffnbtouym3mwmi

Estimating the bias of a noisy coin

Christopher Ferrie, Robin Blume-Kohout
2012 arXiv   pre-print
So we introduce a pointwise lower bound on the minimum achievable risk as an alternative to the minimax criterion, and use this bound to show that HML estimators are pretty good.  ...  Optimal estimation of a coin's bias using noisy data is surprisingly different from the same problem with noiseless data. We study this problem using entropy risk to quantify estimators' accuracy.  ...  RBK acknowledges financial support from the Government of Canada through the Perimeter Institute, and from the LANL LDRD program.  ... 
arXiv:1201.1493v1 fatcat:5zc43zhbmrfudjwi4pqzbaalvu
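The "entropy risk" used above to quantify estimator accuracy is the Kullback-Leibler divergence between the true coin and the estimated one. A sketch of that risk for the noiseless coin only, comparing two generic add-beta rules (the paper's noisy-coin model and its hedged maximum-likelihood estimators are not reproduced; the rules and numbers below are placeholders):

```python
from math import comb, log

def bern_kl(p, q):
    """KL( Bernoulli(p) || Bernoulli(q) ) in nats, with 0 log 0 taken as 0."""
    out = 0.0
    if p > 0:
        out += p * log(p / q)
    if p < 1:
        out += (1 - p) * log((1 - p) / (1 - q))
    return out

def expected_entropy_risk(p, n, beta):
    """Exact expected KL ("entropy") risk of the add-beta estimate
    (k + beta) / (n + 2*beta) for a noiseless coin of bias p,
    averaging over the count k ~ Binomial(n, p)."""
    risk = 0.0
    for k in range(n + 1):
        prob_k = comb(n, k) * p**k * (1 - p)**(n - k)
        p_hat = (k + beta) / (n + 2 * beta)
        risk += prob_k * bern_kl(p, p_hat)
    return risk

# Add-1/2 (Krichevsky-Trofimov) vs. add-1 (Laplace) smoothing at p = 0.1, n = 20.
print(expected_entropy_risk(0.1, 20, 0.5), expected_entropy_risk(0.1, 20, 1.0))
```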

Bayesian and Robust Bayesian analysis under a general class of balanced loss functions

Mohammad Jafari Jozani, Éric Marchand, Ahmad Parsian
2010 Statistical Papers  
For estimating an unknown parameter θ, we introduce and motivate the use of balanced loss functions of the form L_{ρ,ω,δ_0}(θ, δ) = ω ρ(δ_0, δ) + (1 − ω) ρ(θ, δ), as well as weighted versions q(θ) L_{ρ,ω,δ_0}(θ, δ)  ...  Finally, with regards to various robust Bayesian analysis criteria, which include posterior regret gamma-minimaxity, conditional gamma-minimaxity, and most stable, we again establish explicit connections  ...  For instance, natural parameter exponential family of distributions with densities f(x|θ) = e^{θ T(x) − ψ(θ)} h(x) (with respect to a σ-finite measure ν on X), and unknown natural parameter θ, lead to Kullback-Leibler  ... 
doi:10.1007/s00362-010-0307-8 fatcat:fa7pqczjc5bzvggk7ahdrtzcte
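For the special case ρ(a, b) = (a − b)^2, the Bayes estimator under the balanced loss quoted above takes a simple form. This is the standard result for Zellner-type balanced squared-error loss, stated here for orientation rather than as part of the paper's more general weighted treatment:

```latex
% Balanced squared-error loss
%   L_{\omega,\delta_0}(\theta,\delta)
%     = \omega\,(\delta-\delta_0)^2 + (1-\omega)\,(\delta-\theta)^2
% has Bayes estimator
\[
  \delta^{\pi}_{\omega}(x)
    \;=\; \omega\,\delta_0(x) \;+\; (1-\omega)\,E[\theta \mid x],
\]
% a convex combination of the target estimator \delta_0 and the posterior mean.
```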

Data-driven distributionally robust optimization using the Wasserstein metric: performance guarantees and tractable reformulations

Peyman Mohajerin Esfahani, Daniel Kuhn
2017 Mathematical programming  
We consider stochastic programs where the distribution of the uncertain parameters is only observable through a finite training dataset.  ...  In this paper we demonstrate that, under mild assumptions, the distributionally robust optimization problems over Wasserstein balls can in fact be reformulated as finite convex programs-in many interesting  ...  The authors are grateful to Ruiwei Jiang and Nathan Kallus for their valuable and instructive comments. This research was supported by the Swiss National Science Foundation under Grant BSCGI0 157733.  ... 
doi:10.1007/s10107-017-1172-1 fatcat:3ajvciiiu5c4jnhmyomr7watjm
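As a small illustration of the metric behind the ambiguity sets above (and only of the metric; the paper's convex reformulation of the distributionally robust program is not reproduced), the type-1 Wasserstein distance between two one-dimensional empirical distributions with equally many atoms reduces to matching sorted samples:

```python
import numpy as np

def wasserstein1_1d(x, y):
    """Type-1 Wasserstein distance between two 1-D empirical distributions with
    the same number of atoms: the optimal coupling pairs sorted samples."""
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    assert x.size == y.size, "equal sample sizes assumed in this sketch"
    return float(np.mean(np.abs(x - y)))

rng = np.random.default_rng(1)
train = rng.normal(0.0, 1.0, size=500)      # the finite training dataset
shifted = rng.normal(0.3, 1.0, size=500)    # samples from a nearby distribution
print(wasserstein1_1d(train, shifted))      # roughly 0.3
```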

Asymptotic minimax regret for data compression, gambling, and prediction

Qun Xie, A.R. Barron
2000 IEEE Transactions on Information Theory  
And how does the solution to the worst case sequence problem relate to the solution to the corresponding expectation version min_q max_p E_p[log 1/q(X_1^n) − log 1/p(X_1^n)]?  ...  Analogous conclusions are given for the case of prediction, gambling, and compression when, for each observation, one has access to side information from an alphabet of size …  ...  ACKNOWLEDGMENT The authors wish to thank T. Cover, E. Ordentlich, Y. Freund, M. Feder, Y. Shtarkov, N. Merhav, and I. Csiszár for helpful discussions regarding this work.  ... 
doi:10.1109/18.825803 fatcat:odrmsxzpuveybat7db7bgwi7ry
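For reference, the asymptotic value of the minimax regret studied in this line of work, for a smooth d-parameter family with Fisher information I(θ), takes the familiar form below; the exact regularity conditions and the alphabet/side-information variants should be taken from the paper itself:

```latex
% Asymptotic minimax (worst-case sequence) regret for a smooth d-parameter
% family; \hat\theta(x^n) is the maximum-likelihood estimate.
\[
  \min_{q}\,\max_{x^n}
    \Bigl[\log\tfrac{1}{q(x^n)} - \log\tfrac{1}{p_{\hat\theta(x^n)}(x^n)}\Bigr]
  \;=\; \frac{d}{2}\,\log\frac{n}{2\pi}
        \;+\; \log\!\int_{\Theta}\!\sqrt{\det I(\theta)}\,d\theta \;+\; o(1).
\]
```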

Lower bounds for the minimax risk using f-divergences and applications [article]

Adityanand Guntuboyina
2011 arXiv   pre-print
Two applications are provided: a new minimax lower bound for the reconstruction of convex bodies from noisy support function measurements and a different proof of a recent minimax lower bound for the estimation  ...  Lower bounds involving f-divergences between the underlying probability measures are proved for the minimax risk in estimation problems. Our proofs just use simple convexity facts.  ...  Example III.2 (Kullback-Leibler divergence). Let f(x) = x log x.  ... 
arXiv:1002.0042v2 fatcat:rbjcygzfafbdfpbgbqk5qj7ila
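Following the Example III.2 quoted above, a minimal sketch of a generic f-divergence for discrete distributions, checking that f(x) = x log x recovers the Kullback-Leibler divergence (the distributions are made up; only the definition is illustrated):

```python
import numpy as np

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_i q_i * f(p_i / q_i) for discrete P, Q with q_i > 0."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(q * f(p / q)))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.25, 0.25, 0.5])

kl_via_f = f_divergence(p, q, lambda x: x * np.log(x))   # f(x) = x log x
kl_direct = float(np.sum(p * np.log(p / q)))
print(kl_via_f, kl_direct)                               # identical values

# Another member of the family: total variation via f(x) = |x - 1| / 2.
print(f_divergence(p, q, lambda x: 0.5 * np.abs(x - 1)))
```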
Showing results 1 — 15 out of 1,010 results