
Entropy Samplers and Strong Generic Lower Bounds For Space Bounded Learning

Dana Moshkovitz, Michal Moshkovitz
unpublished
Our work builds on a combinatorial framework that we suggested in a previous work for proving lower bounds on space bounded learning.  ...  The strong lower bound is obtained by defining a new notion of pseudorandomness, the entropy sampler. Raz obtained a similar result using different ideas.  ...  log(·) always means log_2(·).  ... 
fatcat:lasaeh4f3jgobdly6hpphzzkwm

Algebraic Methods in Computational Complexity (Dagstuhl Seminar 18391)

Markus Bläser, Valentine Kabanets, Jacobo Torán, Christopher Umans, Michael Wagner
2019 Dagstuhl Reports  
The Razborov-Smolensky polynomial-approximation method for proving constant-depth circuit lower bounds, the PCP characterization of NP, and the Agrawal-Kayal-Saxena polynomial-time primality test are some  ...  There have been significant recent advances in algebraic circuit lower bounds, and the so-called chasm at depth 4 suggests that the restricted models now being considered are not so far from ones that  ...  Strong enough circuit lower bounds can be used to construct pseudo-random generators that can then be used to deterministically simulate randomized algorithms.  ... 
doi:10.4230/dagrep.8.9.133 dblp:journals/dagstuhl-reports/BlaserKTU18 fatcat:bqddvcedazgqngwix6ptk7syuq

Correlation Congruence for Knowledge Distillation [article]

Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang
2019 arXiv   pre-print
Empirical experiments and ablation studies on image classification tasks (including CIFAR-100, ImageNet-1K) and metric learning tasks (including ReID and Face Recognition) show that the proposed CCKD substantially  ...  The CCKD can be easily deployed in the majority of teacher-student frameworks such as KD and hint-based learning methods.  ...  For the SUR-sampler, k-means is adopted and the number of clusters is set to 1000 to generate superclasses.  ... 
arXiv:1904.01802v1 fatcat:cfnogaqtwvbrvcdhgwqntupzxm
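A minimal sketch of the superclass-generation step described in the snippet above, assuming the inputs are precomputed image embeddings; the file name, variable names, and the use of scikit-learn are illustrative assumptions, not taken from the paper or its code release:

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative input: "teacher_features.npy" is an assumed file of
# precomputed image embeddings with shape (num_images, feature_dim).
embeddings = np.load("teacher_features.npy")

# Cluster the embedding space into 1000 groups, matching the snippet's
# description of the SUR-sampler, and treat cluster ids as superclass labels.
kmeans = KMeans(n_clusters=1000, n_init=10, random_state=0).fit(embeddings)
superclass_labels = kmeans.labels_  # one superclass id per image
```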

Bayesian parameter estimation using conditional variational autoencoders for gravitational-wave astronomy [article]

Hunter Gabbard, Chris Messenger, Ik Siong Heng, Francesco Tonolini, Roderick Murray-Smith
2021 arXiv   pre-print
The training procedure need only be performed once for a given prior parameter space, and the resulting trained machine can then generate samples describing the posterior distribution ∼ 6 orders of magnitude  ...  For binary neutron star and neutron star black hole systems, prompt counterpart electromagnetic (EM) signatures are expected on timescales of 1 second – 1 minute, and the current fastest method for alerting  ...  We thank Nvidia for the generous donation of a Tesla V-100 GPU used in addition to LVC computational resources.  ... 
arXiv:1909.06296v4 fatcat:nphiyuexlzdbtaa6jmgx7dyiey

Learning Deep Boltzmann Machines using Adaptive MCMC

Ruslan Salakhutdinov
2010 International Conference on Machine Learning  
The commonly used Gibbs sampler tends to get trapped in one local mode, which often results in unstable learning dynamics and leads to poor parameter estimates.  ...  In this paper, we concentrate on learning DBM's using adaptive MCMC algorithms. We first show a close connection between Fast PCD and adaptive MCMC.  ...  Acknowledgments We acknowledge the financial support from NSERC, Shell, and NTT Communication Sciences Laboratory.  ... 
dblp:conf/icml/Salakhutdinov10 fatcat:oovdykame5d6tmxdtbfx7jhwae
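To ground the snippet's reference to "the commonly used Gibbs sampler", here is a generic block-Gibbs update for a binary RBM, a textbook sketch under simplifying assumptions (single layer, NumPy, arbitrary shapes), not the paper's DBM code or its adaptive sampler:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b, c, rng):
    """One block-Gibbs sweep: sample hidden units given visibles, then
    visibles given hiddens. Long runs of such sweeps tend to linger near
    one mode, which is the failure mode the paper's adaptive MCMC targets."""
    h_prob = sigmoid(v @ W + c)                          # P(h = 1 | v)
    h = (rng.random(h_prob.shape) < h_prob).astype(float)
    v_prob = sigmoid(h @ W.T + b)                        # P(v = 1 | h)
    v_new = (rng.random(v_prob.shape) < v_prob).astype(float)
    return v_new, h

# Usage with random parameters (shapes are arbitrary):
rng = np.random.default_rng(0)
W, b, c = 0.01 * rng.normal(size=(784, 500)), np.zeros(784), np.zeros(500)
v = rng.integers(0, 2, size=784).astype(float)
for _ in range(100):
    v, h = gibbs_step(v, W, b, c, rng)
```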

Generative Adversarial Networks as Variational Training of Energy Based Models [article]

Shuangfei Zhai, Yu Cheng, Rogerio Feris, Zhongfei Zhang
2016 arXiv   pre-print
In this paper, we study deep generative models for effective unsupervised learning.  ...  The training of VGAN takes a two-step procedure: given p(x), q(x) is updated to maximize the lower bound; p(x) is then updated one step with samples drawn from q(x) to decrease the lower bound.  ...  From a probabilistic point of view, the use of Euclidean distance assumes Gaussian distributions (or mixtures thereof) in the input space, which is a strong assumption and is oftentimes inaccurate for  ... 
arXiv:1611.01799v1 fatcat:yr4ei2jip5bchjg3fka5f32mmq
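The two-step procedure in the snippet is easiest to read against the standard variational lower bound on the log-partition function of an energy-based model p(x) ∝ exp(−E(x)); this is the generic identity, not necessarily the exact bound optimized in the paper:

$$\log Z \;=\; \log \int e^{-E(x)}\,dx \;\ge\; \mathbb{E}_{q(x)}\left[-E(x)\right] + H(q),$$

with equality when q = p. The inner step (updating q given p) tightens this bound, and the outer step updates the energy model using samples from q to decrease it, which is where the GAN-style generator enters.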

An Introduction to Randomness Extractors [chapter]

Ronen Shaltiel
2011 Lecture Notes in Computer Science  
We use min-entropy to measure the amount of random bits that can be extracted from a source. Note that a distribution with min-entropy at least m has that for every x ∈ Supp(X),  ...  By the previous discussion, having min-entropy at least m is a necessary condition for extracting m bits of randomness. We could hope that it is a sufficient condition and that there exists an extractor  ...  In Section 3.1 we discuss explicit constructions and lower bounds.  ... 
doi:10.1007/978-3-642-22012-8_2 fatcat:ldxe2dhplfdhvcdepfval6cxne
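For readers parsing the truncated snippet, the standard definition of min-entropy used throughout the extractor literature is (a reconstruction of the textbook definition, not a verbatim quote from the chapter):

$$H_\infty(X) \;=\; \min_{x \in \mathrm{Supp}(X)} \log_2 \frac{1}{\Pr[X = x]},$$

so H_∞(X) ≥ m exactly when Pr[X = x] ≤ 2^{−m} for every x in the support, which is the necessary condition for extracting m nearly uniform bits that the snippet alludes to.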

Active Boundary Annotation using Random MAP Perturbations

Subhransu Maji, Tamir Hazan, Tommi S. Jaakkola
2014 International Conference on Artificial Intelligence and Statistics  
By reasoning about the entropy-reduction and cost tradeoff, our algorithm actively selects the next annotation task.  ...  In this setting we develop novel entropy bounds that are based on the expected amount of perturbation to the potential function that is needed to change MAP decisions.  ...  These upper bounds allow us to use Bayesian approaches for active learning efficiently even for an exponentially large space of annotations.  ... 
dblp:conf/aistats/MajiHJ14 fatcat:52cr3nmigbdpfc3c5wexnlbwpi

Steganography using Gibbs random fields

Tomáš Filler, Jessica Fridrich
2010 Proceedings of the 12th ACM workshop on Multimedia and security - MM&Sec '10  
The Gibbs sampler is the key tool for simulating the impact of optimal embedding and for constructing practical embedding algorithms.  ...  In this work, we provide a general framework and practical methods for embedding with an arbitrary distortion function that does not have to be additive over pixels and thus can consider interactions among  ...  Having obtained the expected distortion and entropy using the Gibbs sampler and thermodynamic integration, the rate–distortion bound [H(π_λ), E_{π_λ}[D]] can be plotted as a curve parametrized by λ.  ... 
doi:10.1145/1854229.1854266 dblp:conf/mmsec/FillerF10 fatcat:4iy4xvfrpncydjtbwwq27c2vfe
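The rate–distortion curve mentioned at the end of the snippet is easier to read with the underlying Gibbs distribution written out; the notation below is a standard reconstruction based on the snippet, not a formula quoted from the paper:

$$\pi_\lambda(\mathbf{y}) \;=\; \frac{\exp(-\lambda D(\mathbf{y}))}{Z(\lambda)},$$

so sweeping λ and plotting the pairs (H(π_λ), E_{π_λ}[D]) traces the bound as a curve parametrized by λ, with the Gibbs sampler estimating the expected distortion and thermodynamic integration supplying the entropy.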

On the quantitative analysis of deep belief networks

Ruslan Salakhutdinov, Iain Murray
2008 Proceedings of the 25th international conference on Machine learning - ICML '08  
Efficient greedy algorithms for learning and approximate inference have allowed these models to be applied successfully in many application domains.  ...  We further show how an AIS estimator, along with approximate inference, can be used to estimate a lower bound on the log-probability that a DBN model with multiple hidden layers assigns to the test data  ...  Acknowledgments We thank Geoffrey Hinton and Radford Neal for many helpful suggestions. This research was supported by NSERC and CFI. Iain Murray is supported by the government of Canada.  ... 
doi:10.1145/1390156.1390266 dblp:conf/icml/SalakhutdinovM08 fatcat:xklszjo5tre4xn5lu5zjaobsuu
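As background for the AIS estimator mentioned in the snippet, annealed importance sampling estimates a ratio of partition functions with weights of the standard form (the generic AIS identity, not a formula specific to this paper):

$$w \;=\; \prod_{k=1}^{K} \frac{p^{*}_{k}(\mathbf{x}_{k})}{p^{*}_{k-1}(\mathbf{x}_{k})}, \qquad \mathbb{E}[w] \;=\; \frac{Z_K}{Z_0},$$

where the p*_k are unnormalized intermediate distributions interpolating from a tractable base model to the target, x_1 is drawn from the base distribution, and each subsequent x_k is produced by a Markov transition that leaves p_{k−1} invariant. Averaging such weights over many runs gives the partition-function estimate that turns an unnormalized model score into the reported (bound on the) test log-probability.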

Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors [article]

Gintare Karolina Dziugaite, Daniel M. Roy
2019 arXiv   pre-print
In particular, Entropy-SGLD can be configured to yield relatively tight generalization bounds and still fit real labels, although these same settings do not obtain state-of-the-art performance.  ...  We show that Entropy-SGD (Chaudhari et al., 2017), when viewed as a learning algorithm, optimizes a PAC-Bayes bound on the risk of a Gibbs (posterior) classifier, i.e., a randomized classifier obtained  ...  The authors would like to thank Pratik Chaudhari, Pascal Germain, David McAllester, and Stefano Soatto for helpful discussions. GKD is supported by an EPSRC studentship.  ... 
arXiv:1712.09376v3 fatcat:l3fssx5csbhedcrtl2ojaaznle
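For orientation, the PAC-Bayes risk bound this line of work builds on has the following standard shape (the textbook McAllester/Maurer form, not the specific data-dependent-prior bound optimized in the paper):

$$\mathbb{E}_{h \sim Q}\left[R(h)\right] \;\le\; \mathbb{E}_{h \sim Q}\left[\hat{R}(h)\right] + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},$$

holding with probability at least 1 − δ over an i.i.d. sample of size n, simultaneously for all posteriors Q, with the prior P fixed before seeing the data. The paper's question is what happens when, as in Entropy-SGD, the prior is effectively chosen using the data.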

Efficient Gradient-Free Variational Inference using Policy Search

Oleg Arenz, Mingjun Zhong, Gerhard Neumann
2018 International Conference on Machine Learning  
For GMMs, we apply a variational lower bound to decompose the learning objective into sub-problems given by learning the individual mixture components and the coefficients.  ...  We propose an efficient, gradient-free method for learning general GMM approximations of multimodal distributions based on recent insights from stochastic search methods.  ...  Calculations for this research were conducted on the Lichtenberg high performance computer of the TU Darmstadt.  ... 
dblp:conf/icml/ArenzZN18 fatcat:iyh3blgtlneddnxd3abwylct54
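The decomposition in the snippet follows the standard variational treatment of mixtures; writing q(x) = Σ_o q(o) q(x|o) and introducing auxiliary responsibilities r(o|x), the usual bound (the generic EM-style identity, not necessarily the paper's exact objective) is:

$$\log \sum_{o} q(o)\, q(x \mid o) \;\ge\; \sum_{o} r(o \mid x)\left[\log q(o) + \log q(x \mid o) - \log r(o \mid x)\right],$$

which is tight when r(o|x) equals the posterior over components. Optimizing the bound then splits into per-component sub-problems plus an update for the mixture coefficients, which is what allows each component to be improved with a separate, gradient-free stochastic search step.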

Deterministic extractors for small-space sources

Jesse Kamp, Anup Rao, Salil Vadhan, David Zuckerman
2006 Proceedings of the thirty-eighth annual ACM symposium on Theory of computing - STOC '06  
We give polynomial-time, deterministic randomness extractors for sources generated in small space, where we model space-s sources on {0, 1}^n as sources generated by width-2^s branching programs.  ...  This model generalizes both the well-studied models of independent sources and symbol-fixing sources.  ...  His notable contributions to probabilistic analysis and randomized algorithms are most relevant for this paper.  ... 
doi:10.1145/1132516.1132613 dblp:conf/stoc/KampRVZ06 fatcat:eiozcwdvnrct5hngkee5h4moha

Deterministic extractors for small-space sources

Jesse Kamp, Anup Rao, Salil Vadhan, David Zuckerman
2011 Journal of computer and system sciences (Print)  
We give polynomial-time, deterministic randomness extractors for sources generated in small space, where we model space-s sources on {0, 1}^n as sources generated by width-2^s branching programs.  ...  This model generalizes both the well-studied models of independent sources and symbol-fixing sources.  ...  His notable contributions to probabilistic analysis and randomized algorithms are most relevant for this paper.  ... 
doi:10.1016/j.jcss.2010.06.014 fatcat:awnpahwghjauhkk3rx56vorphm

No MCMC for me: Amortized sampling for fast and stable training of energy-based models [article]

Will Grathwohl, Jacob Kelly, Milad Hashemi, Mohammad Norouzi, Kevin Swersky, David Duvenaud
2021 arXiv   pre-print
In this work, we present a simple method for training EBMs at scale which uses an entropy-regularized generator to amortize the MCMC sampling typically used in EBM training.  ...  We improve upon prior MCMC-based entropy regularization methods with a fast variational approximation.  ...  For VERA, the hyper-parameters we searched over were the learning rates for the NICE model and for the generator. Compared to prior work, we needed to use much lower learning rates.  ... 
arXiv:2010.04230v3 fatcat:a3n7ekjmybhbdh2kedl2z7cvmu
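Schematically, and as a hedged reading of the snippet rather than the paper's exact objective, the entropy-regularized generator g_φ is trained to maximize

$$\mathbb{E}_{x \sim g_\phi}\left[f_\theta(x)\right] + H(g_\phi),$$

so that its samples stand in for the MCMC samples normally used to estimate the negative phase of the EBM's maximum-likelihood gradient; the "fast variational approximation" mentioned in the snippet concerns estimating the intractable entropy term H(g_φ).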
Showing results 1 — 15 out of 1,919.