A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit the original URL.
The file type is `application/pdf`.


### Learning PDFA with Asynchronous Transitions [chapter]
2010 · *Lecture Notes in Computer Science*
doi:10.1007/978-3-642-15488-1_24
fatcat:mm26lwwm5vbzrdqn3arggxbfle

Specifically, Ron et al. [11] showed that acyclic PDFA can be learned w.r.t. the Kullback-Leibler (KL) divergence in time polynomial in alphabet size, 1/ε, 1/δ, the number of target states, and 1/µ, where ... In particular, it has been observed that polynomial-time learnability of PDFA is feasible if one allows polynomiality not only in the number of states but also in other measures of the target automaton complexity ... Finally, bounding techniques from [3] are combined with (1) to prove that, with high probability, Â is close to A w.r.t. the KL divergence. ...
### On the learnability of discrete distributions
1994 · *Proceedings of the twenty-sixth annual ACM symposium on Theory of computing - STOC '94*
doi:10.1145/195058.195155
dblp:conf/stoc/KearnsMRRSS94
fatcat:lag2b3ltdzdtbnunubbpbgrgia

Acknowledgments We would like to thank Nati Linial for helpful discussions ... First of all, we will say D~ is (efficiently) exactly learnable (either with a generator or with an evaluator) if the resulting hypothesis achieves Kullback-Leibler divergence 0 to the target (with high ... The Kullback-Leibler divergence is the most standard notion of the difference between distributions, and has been studied extensively in the information theory literature. ...
### Guest Editors' foreword
2013 · *Theoretical Computer Science*

The results rely on a key bound on the Kullback-Leibler divergence between distributions of this form. Furthermore, this bound introduces a new complexity measure. ... One type of such queries is statistical queries, where an underlying distribution is assumed and the teacher returns a polynomial-time program which has - with respect to the underlying distribution - an ...

### A Learning Criterion for Stochastic Rules [chapter]
1990 · *Colt Proceedings 1990*
doi:10.1016/b978-1-55860-146-8.50008-4
fatcat:rltd2hub5vh63jl4np6up7y6zi

Sufficient conditions for polynomial-sample-size learnability and polynomial-time learnability of any classes of stochastic rules with finite partitioning are also derived. ... Stochastic rules here refer to those which probabilistically assign a number of classes, {Y}, to each attribute vector X. ... Acknowledgments The author especially wishes to express his sincere gratitude to Dr. Abe and Mr. ...
### Links between probabilistic automata and hidden Markov models: probability distributions, learning models and induction algorithms
2005 · *Pattern Recognition*
doi:10.1016/j.patcog.2004.03.020
fatcat:uytkvkyfpfbj3lhibry6cfwue4

On the other hand, HMMs with final probabilities and probabilistic automata generate distributions over strings of finite length. ... The first part of this work concentrates on probability distributions generated by these models. Necessary and sufficient conditions for an automaton to define a probabilistic language are detailed. ... Kearns et al. use the Kullback-Leibler divergence D_KL as distance measure between P and P̂. We consider that the term stochastic qualifies a process, while the term probabilistic qualifies a model ...
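Several of the entries above measure learning error with the Kullback-Leibler divergence the snippet mentions. As a minimal illustrative sketch (not code from any of the listed papers), the discrete form D_KL(P‖Q) = Σᵢ pᵢ log(pᵢ/qᵢ) can be computed as:

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) for two discrete distributions given as probability
    lists over the same support. Uses the convention 0*log(0/q) = 0,
    and returns infinity when some q_i = 0 < p_i."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0/q) is taken to be 0
        if qi == 0.0:
            return math.inf  # support mismatch: divergence is infinite
        total += pi * math.log(pi / qi)
    return total

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, p))  # 0.0 — a distribution has zero divergence to itself
print(kl_divergence(p, q))  # positive, and in general != kl_divergence(q, p)
```

Note that D_KL is asymmetric and unbounded, which is part of why the PAC-learnability results listed here treat it differently from the variation distance.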
### Computational learning theory
1992 · *Proceedings of the twenty-fourth annual ACM symposium on Theory of computing - STOC '92*
doi:10.1145/129712.129746
dblp:conf/stoc/Angluin92
fatcat:7aw3cnd745bellyhu7phywpul4

Polynomial learnability of probabilistic concepts with respect to the Kullback-Leibler divergence. ... The set is also clearly closed under the operation of complementing each concept in a class with respect to X. ...
### PAC-Learnability of Probabilistic Deterministic Finite State Automata in Terms of Variation Distance [chapter]
2005 · *Lecture Notes in Computer Science*
doi:10.1007/11564089_14
fatcat:elftebjq3bdxnngqqiidyfktum

... that using the variation distance, we obtain polynomial sample size bounds that are independent of the expected length of strings. ... We build on recent work by Clark and Thollard, and show that the use of the variation distance allows simplifications to be made to the algorithms, and also a strengthening of the results; in particular ... In this paper we study the same problem, using variation distance instead of Kullback-Leibler divergence. ...
### PAC-learnability of probabilistic deterministic finite state automata in terms of variation distance
2007 · *Theoretical Computer Science*
doi:10.1016/j.tcs.2007.07.023
fatcat:dsegekwgcfddbljrt4lzqb6zoe

... that using the variation distance, we obtain polynomial sample size bounds that are independent of the expected length of strings. ... We build on recent work by Clark and Thollard, and show that the use of the variation distance allows simplifications to be made to the algorithms, and also a strengthening of the results; in particular ... In this paper we study the same problem, using variation distance instead of Kullback-Leibler divergence. ...
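The variation distance that these two papers substitute for the KL divergence is the total variation distance between discrete distributions. As a minimal sketch (again not code from the papers), its L1 form is half the sum of absolute differences:

```python
def variation_distance(p, q):
    """Total variation distance between two discrete distributions:
    (1/2) * sum_i |p_i - q_i|. Symmetric, always in [0, 1], and finite
    even where the KL divergence would be infinite."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

print(variation_distance([0.5, 0.5], [0.9, 0.1]))  # about 0.4
```

Its boundedness is one reason the sample size bounds above become independent of the expected string length.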
### A learning criterion for stochastic rules
1992 · *Machine Learning*
doi:10.1007/bf00992676
fatcat:st7dsalrxbhudhfcmhsjgxmcfy

Sufficient conditions for polynomial-sample-size learnability and polynomial-time learnability of any classes of stochastic rules with finite partitioning are also derived. ... lists) with at most k literals (k is fixed) in each decision, and polynomial-sample-size learnability of stochastic decision trees (a stochastic analogue of decision trees) with at most k depth. ... Acknowledgments The author especially wishes to express his sincere gratitude to Dr. Abe and Mr. ...
### A Lower Bound for Learning Distributions Generated by Probabilistic Automata [chapter]
2010 · *Lecture Notes in Computer Science*
doi:10.1007/978-3-642-16108-7_17
fatcat:xknfxptpsfhvra7udyy4dn47mu

Known algorithms for learning PDFA can only be shown to run in time polynomial in the so-called distinguishability µ of the target machine, besides the number of states and the usual accuracy and confidence ... Finally, we show a lower bound: every algorithm to learn PDFA using queries with a reasonable tolerance needs a number of queries larger than (1/µ)^c for every c < 1. ... [16] showed that acyclic PDFA can be learned w.r.t. the Kullback-Leibler divergence in time polynomial in alphabet size, 1/ε, 1/δ, number of target states, and 1/µ, where µ denotes the distinguishability ...
### Predicting with Distributions [article]
2017 · *arXiv* pre-print
arXiv:1606.01275v3
fatcat:cxqihkcdwzew3jzoqru4y6jbzm

Our main results take the form of rather general reductions from our model to algorithms for PAC learning the function class and the distribution class separately, and show that virtually every such combination ... Our methods include a randomized reduction to classification noise and an application of Le Cam's method to obtain robust learning algorithms. ... KL denotes Kullback-Leibler divergence (KL divergence). ...
### Using Boltzmann Machines for probability estimation: A general framework for neural network learning [chapter]
1994 · *Machine Intelligence and Pattern Recognition*
doi:10.1016/b978-0-444-81892-8.50031-6
fatcat:wfz7ukh6r5cunhkg46cfnkpy5i

This opens the possibility to study the generalization performance of the network as a function of temperature instead of the number of hidden units. ... It is shown that temperature-dependent spontaneous symmetry breaking occurs in the hidden layer of these networks. ... Learning rules are immediately obtained by inserting the Boltzmann distribution in the Kullback divergence and taking the derivatives with respect to all the adaptive weights. ...
### Nearly tight sample complexity bounds for learning mixtures of Gaussians via sample compression schemes
2018 · *Neural Information Processing Systems*
dblp:conf/nips/AshtianiBHLMP18
fatcat:vklzdgu4ord2jkgudks6ky5xlm

Any class of distributions that allows such a sample compression scheme can also be learned with few samples. ... The core of our main result is showing that the class of Gaussians in R^d has a small-sized sample compression. ... Addendum The lower bound of Theorem 1.2 was recently improved in a subsequent work [8] from Ω(kd^2/ε^2 log^3(1/ε)) to Ω(kd^2/ε^2 log(1/ε)) using a different construction. ...
### Compressing deep graph convolution network with multi-staged knowledge distillation
2021 · *PLoS ONE*
doi:10.1371/journal.pone.0256187
pmid:34388224
pmcid:PMC8363007
fatcat:cv75gwtcnfbc7luszqgvqpcpqm

Specifically, MustaD presents up to 4.21%p improvement of accuracy compared to the second-best KD models. ... Extensive experiments on four real-world datasets show that MustaD provides the state-of-the-art performance compared to other KD based methods. ... K: number of layers. GCN_s(·): single effective GCN layer in MustaD, shared in the student model. K(·): kernel function. D_KL(·): Kullback-Leibler divergence. ...
### On the Quantum versus Classical Learnability of Discrete Distributions [article]
2021 · *arXiv* pre-print
arXiv:2007.14451v2
fatcat:xronfji7lzelhggcrj2ie7pps4

In addition, we discuss techniques for proving classical generative modelling hardness results, as well as the relationship between the PAC learnability of Boolean functions and the PAC learnability of ... Our primary result is the explicit construction of a class of discrete probability distributions which, under the decisional Diffie-Hellman assumption, is provably not efficiently PAC learnable by a classical ... JPS acknowledges funding of the Berlin Institute for the Foundations of Learning and Data, the Einstein Foundation Berlin and the BMBF "Post-Quantum-Cryptography" framework. ...
*Showing results 1 — 15 out of 187 results*