
Learning PDFA with Asynchronous Transitions [chapter]

Borja Balle, Jorge Castro, Ricard Gavaldà
2010 Lecture Notes in Computer Science  
Specifically, Ron et al. [11] showed that acyclic PDFA can be learned w.r.t. the Kullback-Leibler (KL) divergence in time polynomial in alphabet size, 1/ε, 1/δ, number of target states, and 1/µ, where  ...  In particular, it has been observed that polynomial-time learnability of PDFA is feasible if one allows polynomiality not only in the number of states but also in other measures of the target automaton complexity  ...  Finally, bounding techniques from [3] are combined with (1) to prove that, with high probability, the learned hypothesis is close to A w.r.t. the KL divergence.  ... 
doi:10.1007/978-3-642-15488-1_24 fatcat:mm26lwwm5vbzrdqn3arggxbfle

On the learnability of discrete distributions

Michael Kearns, Yishay Mansour, Dana Ron, Ronitt Rubinfeld, Robert E. Schapire, Linda Sellie
1994 Proceedings of the twenty-sixth annual ACM symposium on Theory of computing - STOC '94  
Acknowledgments We would like to thank Nati Linial for helpful discussions  ...  First of all, we will say D is (efficiently) exactly learnable (either with a generator or with an evaluator) if the resulting hypothesis achieves Kullback-Leibler divergence 0 to the target (with high probability)  ...  The Kullback-Leibler divergence is the most standard notion of the difference between distributions, and has been studied extensively in the information theory literature.  ... 
doi:10.1145/195058.195155 dblp:conf/stoc/KearnsMRRSS94 fatcat:lag2b3ltdzdtbnunubbpbgrgia
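
Since nearly every entry on this page measures learning error by the Kullback-Leibler divergence, it may help to pin the definition down. Below is a minimal sketch for discrete distributions; the two example distributions are illustrative, not taken from any of the papers listed here.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(P || Q), in nats, for discrete
    distributions given as {outcome: probability} dicts."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue                 # 0 * log(0/q) is taken as 0
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf          # P has mass where Q has none
        total += px * math.log(px / qx)
    return total

# Target vs. hypothesis over a two-letter alphabet (hypothetical):
p = {"a": 0.9, "b": 0.1}
q = {"a": 0.8, "b": 0.2}
print(kl_divergence(p, q))           # ~0.037 nats; 0 iff p == q
```

Note that the divergence is asymmetric and unbounded, which is exactly why several papers below switch to the variation distance.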

Guest Editors' foreword

Marcus Hutter, Frank Stephan, Vladimir Vovk, Thomas Zeugmann
2013 Theoretical Computer Science  
The results rely on a key bound on the Kullback-Leibler divergence between distributions of this form. Furthermore, this bound introduces a new complexity measure.  ...  One type of such queries is statistical queries, where an underlying distribution is assumed and the teacher returns a polynomial-time program which has, with respect to the underlying distribution, an  ... 
doi:10.1016/j.tcs.2012.10.007 fatcat:ciit7zasgzgytfxx66tzb5onku

A Learning Criterion for Stochastic Rules [chapter]

Kenji Yamanishi
1990 Colt Proceedings 1990  
Sufficient conditions for polynomial-sample-size learnability and polynomial-time learnability of any classes of stochastic rules with finite partitioning are also derived.  ...  Stochastic rules here refer to those which probabilistically assign a number of classes, {Y}, to each attribute vector X.  ...  Acknowledgments The author especially wishes to express his sincere gratitude to Dr. Abe and Mr.  ... 
doi:10.1016/b978-1-55860-146-8.50008-4 fatcat:rltd2hub5vh63jl4np6up7y6zi

Links between probabilistic automata and hidden Markov models: probability distributions, learning models and induction algorithms

P. Dupont, F. Denis, Y. Esposito
2005 Pattern Recognition  
On the other hand, HMMs with final probabilities and probabilistic automata generate distributions over strings of finite length.  ...  The first part of this work concentrates on probability distributions generated by these models. Necessary and sufficient conditions for an automaton to define a probabilistic language are detailed.  ...  Kearns et al. use the Kullback-Leibler divergence D_KL as a distance measure between P and P̂. We consider that the term stochastic qualifies a process, while the term probabilistic qualifies a model  ... 
doi:10.1016/j.patcog.2004.03.020 fatcat:uytkvkyfpfbj3lhibry6cfwue4
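
The point that PDFA and HMMs with final probabilities define distributions over strings of finite length can be made concrete with a toy automaton. The two-state machine below is hypothetical, chosen so that in each state the outgoing transition probabilities plus the stopping probability sum to 1; summing the halting mass over increasing lengths shows it converging to a distribution over all finite strings.

```python
# Hypothetical 2-state PDFA over alphabet {a, b}:
# trans[state][symbol] = (next_state, probability),
# stop[state] = probability of emitting end-of-string there.
trans = {
    0: {"a": (0, 0.5), "b": (1, 0.3)},
    1: {"a": (0, 0.2), "b": (1, 0.4)},
}
stop = {0: 0.2, 1: 0.4}  # per state: outgoing + stop = 1

def string_prob(w, start=0):
    """Probability that the PDFA generates exactly the string w."""
    state, prob = start, 1.0
    for sym in w:
        state, p = trans[state][sym]
        prob *= p
    return prob * stop[state]

print(string_prob("ab"))  # 0.5 * 0.3 * 0.4 = 0.06

# Mass of all strings of length <= L, computed via the state
# distribution after n symbols; it tends to 1, so the machine
# defines a probability distribution over finite strings.
for L in (2, 8, 32):
    dist, mass = {0: 1.0, 1: 0.0}, 0.0
    for _ in range(L + 1):
        mass += sum(dist[s] * stop[s] for s in dist)
        nxt = {0: 0.0, 1: 0.0}
        for s, ps in dist.items():
            for _sym, (t, p) in trans[s].items():
                nxt[t] += ps * p
        dist = nxt
    print(L, round(mass, 6))  # 0.59 at L=2, approaching 1.0
```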

Computational learning theory

Dana Angluin
1992 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing - STOC '92  
Polynomial learnability of probabilistic concepts with respect to the Kullback-Leibler divergence.  ...  The set is also clearly closed under the operation of complementing each concept in a class with respect to X.  ... 
doi:10.1145/129712.129746 dblp:conf/stoc/Angluin92 fatcat:7aw3cnd745bellyhu7phywpul4

PAC-Learnability of Probabilistic Deterministic Finite State Automata in Terms of Variation Distance [chapter]

Nick Palmer, Paul W. Goldberg
2005 Lecture Notes in Computer Science  
that using the variation distance, we obtain polynomial sample size bounds that are independent of the expected length of strings.  ...  We build on recent work by Clark and Thollard, and show that the use of the variation distance allows simplifications to be made to the algorithms, and also a strengthening of the results; in particular  ...  In this paper we study the same problem, using variation distance instead of Kullback-Leibler divergence.  ... 
doi:10.1007/11564089_14 fatcat:elftebjq3bdxnngqqiidyfktum
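
Palmer and Goldberg's substitution of the variation distance for the KL divergence is easy to state concretely. A minimal sketch follows; note that conventions differ by a factor of two between the L1 distance and the total variation distance, and the half-L1 form shown here lies in [0, 1], whereas the KL divergence is unbounded.

```python
def variation_distance(p, q):
    """Total variation distance between discrete distributions
    given as {outcome: probability} dicts: (1/2) * sum |p(x) - q(x)|."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# Same hypothetical pair as in the KL sketch above:
p = {"a": 0.9, "b": 0.1}
q = {"a": 0.8, "b": 0.2}
print(variation_distance(p, q))  # 0.1
```

One intuition for the snippet's claim: this metric is bounded and does not blow up when the hypothesis assigns zero probability to a string the target can emit, which is how the sample-size bounds become independent of expected string length.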

PAC-learnability of probabilistic deterministic finite state automata in terms of variation distance

Nick Palmer, Paul W. Goldberg
2007 Theoretical Computer Science  
that using the variation distance, we obtain polynomial sample size bounds that are independent of the expected length of strings.  ...  We build on recent work by Clark and Thollard, and show that the use of the variation distance allows simplifications to be made to the algorithms, and also a strengthening of the results; in particular  ...  In this paper we study the same problem, using variation distance instead of Kullback-Leibler divergence.  ... 
doi:10.1016/j.tcs.2007.07.023 fatcat:dsegekwgcfddbljrt4lzqb6zoe

A learning criterion for stochastic rules

Kenji Yamanishi
1992 Machine Learning  
Sufficient conditions for polynomial-sample-size learnability and polynomial-time learnability of any classes of stochastic rules with finite partitioning are also derived.  ...  lists) with at most k literals (k is fixed) in each decision, and polynomial-sample-size learnability of stochastic decision trees (a stochastic analogue of decision trees) with at most k depth.  ...  Acknowledgments The author especially wishes to express his sincere gratitude to Dr. Abe and Mr.  ... 
doi:10.1007/bf00992676 fatcat:st7dsalrxbhudhfcmhsjgxmcfy

A Lower Bound for Learning Distributions Generated by Probabilistic Automata [chapter]

Borja Balle, Jorge Castro, Ricard Gavaldà
2010 Lecture Notes in Computer Science  
Known algorithms for learning PDFA can only be shown to run in time polynomial in the so-called distinguishability µ of the target machine, besides the number of states and the usual accuracy and confidence  ...  Finally, we show a lower bound: every algorithm to learn PDFA using queries with a reasonable tolerance needs a number of queries larger than (1/µ)^c for every c < 1.  ...  [16] showed that acyclic PDFA can be learned w.r.t. the Kullback-Leibler divergence in time polynomial in alphabet size, 1/ε, 1/δ, number of target states, and 1/µ, where µ denotes the distinguishability  ... 
doi:10.1007/978-3-642-16108-7_17 fatcat:xknfxptpsfhvra7udyy4dn47mu
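
The distinguishability parameter µ that both Balle et al. entries revolve around is, in the usual formulation, the smallest distance between the suffix distributions of any two distinct states, so 1/µ blows up exactly when two states are hard to tell apart from samples. Below is a sketch under that common L∞-over-strings definition (some papers use L∞ over prefixes instead); the two truncated suffix distributions are hypothetical.

```python
def l_inf_distance(d1, d2):
    """sup over strings x of |D1(x) - D2(x)|, on a shared
    (here truncated) support given as {string: probability}."""
    support = set(d1) | set(d2)
    return max(abs(d1.get(x, 0.0) - d2.get(x, 0.0)) for x in support)

def distinguishability(suffix_dists):
    """mu = min over pairs of distinct states of the L_inf distance
    between their suffix distributions."""
    states = list(suffix_dists)
    return min(l_inf_distance(suffix_dists[a], suffix_dists[b])
               for i, a in enumerate(states)
               for b in states[i + 1:])

# Hypothetical suffix distributions of a 2-state target (truncated):
mu = distinguishability({
    "q0": {"": 0.2, "a": 0.5, "b": 0.3},
    "q1": {"": 0.4, "a": 0.2, "b": 0.4},
})
print(mu)  # 0.3; the lower bound above says ~(1/mu)^c queries needed
```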

Predicting with Distributions [article]

Michael Kearns, Zhiwei Steven Wu
2017 arXiv   pre-print
Our main results take the form of rather general reductions from our model to algorithms for PAC learning the function class and the distribution class separately, and show that virtually every such combination  ...  Our methods include a randomized reduction to classification noise and an application of Le Cam's method to obtain robust learning algorithms.  ...  KL denotes the Kullback-Leibler divergence.  ... 
arXiv:1606.01275v3 fatcat:cxqihkcdwzew3jzoqru4y6jbzm

Using Boltzmann Machines for probability estimation: A general framework for neural network learning [chapter]

Hilbert J. Kappen
1994 Machine Intelligence and Pattern Recognition  
This opens the possibility to study the generalization performance of the network as a function of temperature instead of the number of hidden units.  ...  It is shown that temperature dependent spontaneous symmetry breaking occurs in the hidden layer of these networks.  ...  Learning rules are immediately obtained by inserting the Boltzmann distribution in the Kullback divergence and taking the derivatives with respect to all the adaptive weights.  ... 
doi:10.1016/b978-0-444-81892-8.50031-6 fatcat:wfz7ukh6r5cunhkg46cfnkpy5i
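
Kappen's remark that learning rules follow from differentiating the Kullback divergence with the Boltzmann distribution plugged in is the classic Boltzmann-machine gradient: for weight w_ij it comes out as <s_i s_j>_model − <s_i s_j>_data. The sketch below computes that gradient exactly by enumeration for a tiny fully visible machine at unit temperature; the hidden units and temperature dependence studied in the chapter are left out.

```python
import itertools, math

def boltzmann(W, b):
    """Exact Boltzmann distribution p(s) ~ exp(sum_{i<j} w_ij s_i s_j
    + sum_i b_i s_i) over binary vectors s in {0,1}^n (small n only)."""
    n = len(b)
    states = list(itertools.product((0, 1), repeat=n))
    scores = [math.exp(sum(W[i][j] * s[i] * s[j]
                           for i in range(n) for j in range(i + 1, n))
                       + sum(b[i] * s[i] for i in range(n)))
              for s in states]
    Z = sum(scores)
    return states, [sc / Z for sc in scores]

def kl_gradient_w(W, b, data, i, j):
    """d D_KL(data || model) / d w_ij = <s_i s_j>_model - <s_i s_j>_data.
    Gradient descent therefore raises weights whose data correlation
    exceeds the model correlation (the Boltzmann learning rule)."""
    states, probs = boltzmann(W, b)
    model_corr = sum(p * s[i] * s[j] for s, p in zip(states, probs))
    data_corr = sum(s[i] * s[j] for s in data) / len(data)
    return model_corr - data_corr

# Two visible units, hypothetical data correlating them:
W, b = [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]
data = [(1, 1), (1, 1), (0, 0), (1, 1)]
print(kl_gradient_w(W, b, data, 0, 1))  # 0.25 - 0.75 = -0.5
```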

Nearly tight sample complexity bounds for learning mixtures of Gaussians via sample compression schemes

Hassan Ashtiani, Shai Ben-David, Nicholas J. A. Harvey, Christopher Liaw, Abbas Mehrabian, Yaniv Plan
2018 Neural Information Processing Systems  
Any class of distributions that allows such a sample compression scheme can also be learned with few samples.  ...  The core of our main result is showing that the class of Gaussians in R^d has a small-sized sample compression.  ...  Addendum: The lower bound of Theorem 1.2 was recently improved in a subsequent work [8] from Ω(kd²/(ε² log³(1/ε))) to Ω(kd²/(ε² log(1/ε))) using a different construction.  ... 
dblp:conf/nips/AshtianiBHLMP18 fatcat:vklzdgu4ord2jkgudks6ky5xlm

Compressing deep graph convolution network with multi-staged knowledge distillation

Junghun Kim, Jinhong Jung, U. Kang, Yuchen Qiu
2021 PLoS ONE  
Specifically, MustaD achieves up to 4.21 percentage points higher accuracy than the second-best KD models.  ...  Extensive experiments on four real-world datasets show that MustaD provides state-of-the-art performance compared to other KD-based methods.  ...  Notation: K, number of layers; GCN_s(·), single effective GCN layer in MustaD, shared in the student model; K(·), kernel function; D_KL(·), Kullback-Leibler divergence.  ... 
doi:10.1371/journal.pone.0256187 pmid:34388224 pmcid:PMC8363007 fatcat:cv75gwtcnfbc7luszqgvqpcpqm
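
For the notation list above, the D_KL(·) term is the standard knowledge-distillation ingredient: the student is trained to match the teacher's temperature-softened output distribution. The following is a generic sketch of that loss term (after Hinton et al.'s formulation), not MustaD's multi-staged scheme; the temperature, T² scaling, and logits are illustrative.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax; larger T flattens the distribution."""
    exps = [math.exp(z / T) for z in logits]
    Z = sum(exps)
    return [e / Z for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2 so
    its gradient magnitude stays comparable across temperatures."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return T * T * sum(pt * math.log(pt / ps)
                       for pt, ps in zip(p_t, p_s) if pt > 0)

# Hypothetical 3-class logits for one example:
print(distillation_loss([1.0, 0.2, -0.5], [2.0, 0.1, -1.0]))
```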

On the Quantum versus Classical Learnability of Discrete Distributions [article]

Ryan Sweke, Jean-Pierre Seifert, Dominik Hangleiter, Jens Eisert
2021 arXiv   pre-print
In addition, we discuss techniques for proving classical generative modelling hardness results, as well as the relationship between the PAC learnability of Boolean functions and the PAC learnability of  ...  Our primary result is the explicit construction of a class of discrete probability distributions which, under the decisional Diffie-Hellman assumption, is provably not efficiently PAC learnable by a classical  ...  JPS acknowledges funding of the Berlin Institute for the Foundations of Learning and Data, the Einstein Foundation Berlin and the BMBF "Post-Quantum-Cryptography" framework.  ... 
arXiv:2007.14451v2 fatcat:xronfji7lzelhggcrj2ie7pps4