A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit <a rel="external noopener" href="https://arxiv.org/pdf/1812.01216v1.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
Parameter Re-Initialization through Cyclical Batch Size Schedules
[article]
<span title="2018-12-04">2018</span>
<i >
arXiv
</i>
<span class="release-stage" >pre-print</span>
Optimal parameter initialization remains a crucial problem for neural network training. A poor weight initialization may take longer to train and/or converge to sub-optimal solutions. Here, we propose a method of weight re-initialization by repeated annealing and injection of noise in the training process. We implement this through a cyclical batch size schedule motivated by a Bayesian perspective of neural network training. We evaluate our methods through extensive experiments on tasks in language modeling, natural language inference, and image classification. We demonstrate the ability of our method to improve language modeling performance by up to 7.91 perplexity and reduce training iterations by up to 61%, in addition to its flexibility in enabling snapshot ensembling and use with adversarial training.
<span class="external-identifiers">
<a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1812.01216v1">arXiv:1812.01216v1</a>
<a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mkk5auxhunerdhd6u5zrrgbjmq">fatcat:mkk5auxhunerdhd6u5zrrgbjmq</a>
</span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20191012075841/https://arxiv.org/pdf/1812.01216v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">
<button class="ui simple right pointing dropdown compact black labeled icon button serp-button">
<i class="icon ia-icon"></i>
Web Archive
[PDF]
</button>
</a>
<a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1812.01216v1" title="arxiv.org access">
<button class="ui compact blue labeled icon button serp-button">
<i class="file alternate outline icon"></i>
arxiv.org
</button>
</a>