On the Convergence Speed of MDL Predictions for Bernoulli Sequences
[article]
2004
arXiv
pre-print
We consider the Minimum Description Length principle for online sequence prediction. ...
We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. ...
On the one hand, the cumulative loss for MDL predictions may be exponential, i.e. $2^{Kw(\vartheta_0)}$. ...
arXiv:cs/0407039v1
fatcat:v7e3bouvnrftvkggew3c65wh5q
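The contrast this abstract draws between MDL and Bayes mixture bounds can be written out in one display. This is a paraphrase under assumed notation (a countable class $\mathcal{M}$ with prior weights $w_\nu$, true parameter $\vartheta_0$, cumulative prediction loss $\sum_t \ell_t$), suppressing constants and the choice of loss (squared, Hellinger, or KL):

```latex
% Bayes mixture over the class, and the complexity of the true parameter:
\xi(x) \;=\; \sum_{\nu \in \mathcal{M}} w_\nu\, \nu(x),
\qquad
Kw(\vartheta_0) \;=\; -\log_2 w(\vartheta_0).
% Cumulative loss bounds, as contrasted in the abstract:
\text{Bayes mixture:}\;\; \sum_t \ell_t \;\lesssim\; Kw(\vartheta_0),
\qquad
\text{MDL:}\;\; \sum_t \ell_t \;\lesssim\; 2^{Kw(\vartheta_0)}.
```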
On the Convergence Speed of MDL Predictions for Bernoulli Sequences
[chapter]
2004
Lecture Notes in Computer Science
We consider the Minimum Description Length principle for online sequence prediction. ...
We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. ...
On the one hand, the cumulative loss for MDL predictions may be exponential, i.e. $2^{Kw(\vartheta_0)}$. ...
doi:10.1007/978-3-540-30215-5_23
fatcat:smdaoeqcjzafrkuhrpf5wtomoi
Convergence of Discrete MDL for Sequential Prediction
[chapter]
2004
Lecture Notes in Computer Science
We observe that there are at least three different ways of using MDL for prediction. One of these has worse prediction properties, for which predictions only converge if the MDL estimator stabilizes. ...
The bound characterizing the convergence speed for MDL predictions is exponentially larger as compared to Bayes mixtures. ...
So in particular, under the conditions of Theorem 6.4, the hybrid MDL predictions converge almost surely. No statement about the convergence speed can be made. ...
doi:10.1007/978-3-540-27819-1_21
fatcat:tku4s2karjbtbivmmuf7mvfkyq
Convergence of Discrete MDL for Sequential Prediction
[article]
2004
arXiv
pre-print
We observe that there are at least three different ways of using MDL for prediction. One of these has worse prediction properties, for which predictions only converge if the MDL estimator stabilizes. ...
The bound characterizing the convergence speed for MDL predictions is exponentially larger as compared to Bayes mixtures. ...
So in particular, under the conditions of Theorem 6.4, the hybrid MDL predictions converge almost surely. No statement about the convergence speed can be made. ...
arXiv:cs/0404057v1
fatcat:ofzltoh6mnettn5ni4ntnkao7i
MDL convergence speed for Bernoulli sequences
2006
Statistics and Computing
one, and (b) it additionally specifies the convergence speed. ...
We show that this is even the case if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. ...
So for both, Bayes mixtures and MDL, convergence with probability one holds, while the convergence speed is exponentially worse for MDL compared to the Bayes mixture. ...
doi:10.1007/s11222-006-6746-3
fatcat:va7meu52nzdwvi6ktcdbdl3byq
Asymptotics of Discrete MDL for Online Prediction
2005
IEEE Transactions on Information Theory
We identify two ways of predicting by MDL for this setup, namely a static and a dynamic one. (A third variant, hybrid MDL, will turn out inferior.) ...
This is accomplished by proving finite bounds on the quadratic, the Hellinger, and the Kullback-Leibler loss of the MDL learner, which are however exponentially worse than for Bayesian prediction. ...
Thanks to Peter Grünwald and an anonymous reviewer for their very valuable comments and suggestions. This work was supported by SNF grant 2100-67712.02. ...
doi:10.1109/tit.2005.856956
fatcat:z5kspfdzlnfydispsqf4swnfaa
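The abstract distinguishes static and dynamic MDL prediction. A minimal formalization consistent with that description, under assumed notation (countable class $\mathcal{M}$ with weights $w_\nu$; the paper's own definitions may differ in detail):

```latex
% Two-part MDL estimator: maximize weight times likelihood, equivalently
% minimize the two-part code length -log w_nu - log nu(x).
\nu^{x} \;=\; \arg\max_{\nu \in \mathcal{M}} w_\nu\, \nu(x),
\qquad
m(x) \;=\; \max_{\nu \in \mathcal{M}} w_\nu\, \nu(x).
% Static MDL keeps the model selected on the observed prefix x;
% dynamic MDL re-selects for every candidate continuation a.
p_{\mathrm{static}}(a \mid x) \;=\; \frac{\nu^{x}(xa)}{\nu^{x}(x)},
\qquad
p_{\mathrm{dynamic}}(a \mid x) \;=\; \frac{m(xa)}{m(x)}.
```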
Revisiting enumerative two-part crude MDL for Bernoulli and multinomial distributions (Extended version)
[article]
2016
arXiv
pre-print
We leverage the Minimum Description Length (MDL) principle as a model selection technique for Bernoulli distributions and compare several types of MDL codes. ...
Both the theoretical analysis and the experimental comparisons suggest that one might use the enumerative code rather than NML in practice, for Bernoulli and multinomial distributions. ...
Standard MDL codes for Bernoulli strings: We briefly present one simplistic example of two-part crude MDL code for encoding binary strings using the Bernoulli model, as well as a modern MDL code based on ...
arXiv:1608.05522v2
fatcat:7m5gkyzicvgyhde3zriyzqolgm
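The snippet above mentions a simplistic two-part crude MDL code for binary strings. A minimal sketch of the classical enumerative two-part code, assuming the standard construction (a uniform code on the count of ones, then an index among all strings with that count); the function name is mine:

```python
from math import comb, log2

def enumerative_code_length(bits):
    """Two-part enumerative code length (in bits) for a binary string.

    Part 1: the number of ones k, one of n + 1 values -> log2(n + 1) bits.
    Part 2: the index of the string among the C(n, k) strings with k ones.
    """
    n, k = len(bits), sum(bits)
    return log2(n + 1) + log2(comb(n, k))

# A heavily biased 20-bit string compresses below its raw length.
x = [1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1]
print(f"{enumerative_code_length(x):.2f} bits vs {len(x)} raw bits")
```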
Sequential Predictions based on Algorithmic Complexity
[article]
2005
arXiv
pre-print
We show that for deterministic computable environments, the "posterior" and losses of m converge, but rapid convergence could only be shown on-sequence; the off-sequence convergence can be slow. ...
This paper studies sequence prediction based on the monotone Kolmogorov complexity Km = −log m, i.e. based on universal deterministic/one-part MDL. m is extremely close to Solomonoff's universal prior M, ...
Qualitatively, for deterministic, computable environments, the posterior converges and is self-optimizing, but rapid convergence could only be shown on-sequence; the (for prediction equally important) ...
arXiv:cs/0508043v1
fatcat:psltsxgxeba5pk6bb7af3l27fm
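For reference, the predictor this abstract builds on can be written out from its own definition Km = −log m. A sketch under assumed notation (U is a monotone universal machine, $\ell(p)$ a program length), not necessarily the paper's exact formulation:

```latex
Km(x) \;=\; \min\{\, \ell(p) : U(p) \text{ outputs a string starting with } x \,\},
\qquad
m(x) \;=\; 2^{-Km(x)}.
% One-part MDL prediction of the next bit b from the observed prefix x:
m(b \mid x) \;=\; \frac{m(xb)}{m(x)}.
```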
Sequential predictions based on algorithmic complexity
2006
Journal of Computer and System Sciences
We show that for deterministic computable environments, the "posterior" and losses of m converge, but rapid convergence could only be shown on-sequence; the off-sequence convergence can be slow. ...
This paper studies sequence prediction based on the monotone Kolmogorov complexity Km = − log m, i.e. based on universal deterministic/one-part MDL. m is extremely close to Solomonoff's universal prior ...
Speed of off-sequence convergence of m for computable environments. ...
doi:10.1016/j.jcss.2005.07.001
fatcat:tpvox6g6jbakjlfnu6ti2aw2vy
Sequence Prediction based on Monotone Complexity
[article]
2003
arXiv
pre-print
We show that for deterministic computable environments, the "posterior" and losses of m converge, but rapid convergence could only be shown on-sequence; the off-sequence behavior is unclear. ...
This paper studies sequence prediction based on the monotone Kolmogorov complexity Km=-log m, i.e. based on universal deterministic/one-part MDL. m is extremely close to Solomonoff's prior M, the latter ...
(vii) Since (vi) implies (vii) by continuity, we have convergence of the instantaneous losses for computable environments x 1:∞ , but since we do not know the speed of convergence off-sequence, we do not ...
arXiv:cs/0306036v1
fatcat:qhcwtzup4vhs5lpuisbfph6gx4
Model Selection and the Principle of Minimum Description Length
2001
Journal of the American Statistical Association
This paper reviews the principle of Minimum Description Length (MDL) for problems of model selection. ...
As a principle for statistical modeling in general, one strength of MDL is that it can be intuitively extended to provide useful tools for new problems. ...
Acknowledgements The authors would like to thank Jianhua Huang for his help with a preliminary draft of this paper. ...
doi:10.1198/016214501753168398
fatcat:a32xsezxhnexvmodusu32yue7e
A tutorial introduction to the minimum description length principle
[article]
2004
arXiv
pre-print
It serves as a basis for the technical introduction given in the second chapter, in which all the ideas of the first chapter are made mathematically precise. ...
This tutorial provides an overview of and introduction to Rissanen's Minimum Description Length (MDL) Principle. ...
Acknowledgments The author would like to thank Jay Myung, Mark Pitt, Steven de Rooij and Teemu Roos, who read a preliminary version of this chapter and suggested several improvements.
Notes ...
arXiv:math/0406077v1
fatcat:zhxexym4vbh2zpyko6pkawszly
Model Selection Based on Minimum Description Length
2000
Journal of Mathematical Psychology
We introduce the fundamental concept of MDL, called the stochastic complexity, and we show how it can be used for model selection. ...
We introduce the minimum description length (MDL) principle, a general principle for inductive inference based on the idea that regularities (laws) underlying data can always be used to compress data. ...
Paul Vitányi, Ronald de Wolf, In Jae Myung, Malcolm Forster and the anonymous referees are to be thanked for providing detailed feedback on earlier versions of this paper. ...
doi:10.1006/jmps.1999.1280
pmid:10733861
fatcat:gst6rgndnjawvkt43yllf3kgv4
Bayesian properties of normalized maximum likelihood and its fast computation
2014
2014 IEEE International Symposium on Information Theory
The representation also has the practical advantage of speeding the calculation of marginals and conditionals required for coding and prediction applications. ...
(MDL) method of statistical modeling and estimation. ...
ACKNOWLEDGMENTS We thank the anonymous reviewers for useful comments. ...
doi:10.1109/isit.2014.6875117
dblp:conf/isit/BarronRW14
fatcat:cxmooeb2djhtdfdyqyyz4wnwwm
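The normalized maximum likelihood (NML) distribution this entry concerns is directly computable for the Bernoulli model. A minimal sketch of the plain O(n) computation (not the fast Bayesian-representation method the paper proposes); names are mine:

```python
from math import comb, log2

def bernoulli_nml_regret(n):
    """log2 of the NML normalizer (parametric complexity) for Bernoulli
    strings of length n: sum_k C(n,k) (k/n)^k ((n-k)/n)^(n-k)."""
    total = sum(
        comb(n, k) * (k / n) ** k * ((n - k) / n) ** (n - k)
        for k in range(n + 1)
    )
    return log2(total)

def nml_code_length(bits):
    """Stochastic complexity: -log2 of the maximized likelihood
    plus the parametric complexity of the model class."""
    n, k = len(bits), sum(bits)
    theta = k / n
    ml = theta ** k * (1 - theta) ** (n - k)  # 0 ** 0 == 1 at the boundary
    return -log2(ml) + bernoulli_nml_regret(n)
```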
The Speed Prior: A New Simplicity Measure Yielding Near-Optimal Computable Predictions
[chapter]
2002
Lecture Notes in Computer Science
We show that the Speed Prior allows for deriving a computable strategy for optimal prediction of future y, given past x. ...
We conclude with several nontraditional predictions concerning the future of our universe. This paper is based on section 6 of TR IDSIA-20-00, Version 2.0: ...
Acknowledgment The material presented here is based on section 6 of [22] . Thanks to Ray Solomonoff, Leonid Levin, Marcus Hutter, Christof Schmidhuber, and an unknown reviewer, for useful comments. ...
doi:10.1007/3-540-45435-7_15
fatcat:jzto6cxunvclrlhxanjus6q6ui