How Do Adam and Training Strategies Help BNNs Optimization?
[article]
2021
arXiv
pre-print
The best performing Binary Neural Networks (BNNs) are usually attained using Adam optimization and its multi-step training variants. ...
specific training strategies. ...
Besides comparing Adam to SGD, we further explore how training strategies affect BNN optimization. Previous works proposed different training strategies: Yang et al. ...
arXiv:2106.11309v1
fatcat:5m3ezp4gxvdsbdg2ptv2udpzmq
BAMSProd: A Step towards Generalizing the Adaptive Optimization Methods to Deep Binary Model
[article]
2020
arXiv
pre-print
In this paper, we provide an explicit convex optimization example in which training BNNs with traditional adaptive optimization methods still risks non-convergence, and identify that ...
Recent methods have significantly reduced the performance degradation of Binary Neural Networks (BNNs), but guaranteeing the effective and efficient training of BNNs is an unsolved problem. ...
These models are all trained with default strategies and data augmentation in [43]. ...
arXiv:2009.13799v1
fatcat:oia2pd4pznellcg62rcyy3wkra
Augmenting Neural Networks with Priors on Function Values
[article]
2022
arXiv
pre-print
How can we coherently leverage such prior knowledge to help improve a neural network model that is quite accurate in some regions of input space, typically near the training data, but wildly wrong ...
Herein, we tackle this problem by developing an approach to augment BNNs with prior information on the function values themselves. ...
For the GB1 data set, a fully-connected neural network with one hidden layer of 300 units and ReLU non-linearities was used, optimized with Adam and a weight decay of 0.0001 (a minimal sketch follows this entry). ...
arXiv:2202.04798v3
fatcat:tb25rlg65vdvzj4r3pdgeb3ujy
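The training setup quoted for the GB1 data set maps onto a few lines of PyTorch. The sketch below is only an illustration of that description: the input and output dimensions, learning rate, and batch size are assumptions not given in the snippet, and in this entry BNN appears to refer to Bayesian rather than binary neural networks.

```python
import torch
import torch.nn as nn

# One-hidden-layer fully-connected regressor: 300 hidden units, ReLU.
# in_dim and out_dim are placeholders; the snippet does not state them.
in_dim, out_dim = 80, 1
model = nn.Sequential(
    nn.Linear(in_dim, 300),
    nn.ReLU(),
    nn.Linear(300, out_dim),
)

# Adam with weight decay 1e-4 as quoted; the learning rate is assumed.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

# One illustrative training step on random data.
x, y = torch.randn(32, in_dim), torch.randn(32, out_dim)
loss = nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```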
Bimodal Distributed Binarized Neural Networks
[article]
2022
arXiv
pre-print
Preserving this distribution during binarization-aware training creates robust and informative binary feature maps and significantly reduces the generalization error of the BNN. ...
Our source code, experimental settings, training logs, and binary models are available at . ...
It contains over 1.2M training images from 1,000 different categories. For ImageNet, we use an Adam optimizer with a momentum of 0.9 and a learning rate of 1e-3 (see the note after this entry). ...
arXiv:2204.02004v1
fatcat:hbck33udlbfvrolw4nf76d24cu
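One small clarification of the setting quoted above: Adam has no standalone momentum hyperparameter, so "a momentum of 0.9" corresponds to its first-moment coefficient beta_1. A hedged PyTorch equivalent, with beta_2 left at its default since the snippet does not give it:

```python
import torch

params = [torch.nn.Parameter(torch.randn(10))]  # stand-in for the BNN's parameters

# lr = 1e-3 as quoted; beta1 = 0.9 plays the role of "momentum".
optimizer = torch.optim.Adam(params, lr=1e-3, betas=(0.9, 0.999))
```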
Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?
2019
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
We conclude that the error of BNNs is predominantly caused by the intrinsic instability (training time) and non-robustness (train & test time). ...
While ensemble techniques have been broadly believed to be only marginally helpful for strong classifiers such as deep neural networks, our analysis and experiments show that they are naturally a perfect ...
and how it interacts with different optimizers such as SGD or ADAM [37]. ...
doi:10.1109/cvpr.2019.00506
dblp:conf/cvpr/ZhuDS19
fatcat:fed4idlbqrcnzg2kicg5uoazpu
Understanding Learning Dynamics of Binary Neural Networks via Information Bottleneck
[article]
2020
arXiv
pre-print
However, training BNNs is not easy due to the discontinuity in activation functions, and the training dynamics of BNNs is not well understood. ...
We analyze BNNs through the Information Bottleneck principle and observe that the training dynamics of BNNs is considerably different from that of Deep Neural Networks (DNNs). ...
Hence, an insight into the learning dynamics can help in the development of efficient optimizers targeted towards training BNNs. ...
arXiv:2006.07522v1
fatcat:3tz44z7ia5hw5ftohiqg67z2b4
A comprehensive review of Binary Neural Network
[article]
2022
arXiv
pre-print
Along the way, it examines BNNs in terms of (a) purpose: their early successes and challenges; (b) optimization: selected representative works that contain essential optimization techniques; (c) deployment: open-source
frameworks for BNN modeling and development; (d) terminals: efficient computing architectures and devices for BNNs; and (e) applications: diverse applications with BNNs. ...
Extending Real-to-Bin's training strategy, BNN-Adam investigates and designs a new training scheme based on the Adam optimizer and successfully improves the trained performance of Real-to-Bin and ReActNet ...
arXiv:2110.06804v3
fatcat:b2w6atz27fbgdacq5aiov32bpi
VIME: Variational Information Maximizing Exploration
[article]
2017
arXiv
pre-print
While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. ...
This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics. ...
Acknowledgments: This work was done in collaboration between UC Berkeley, Ghent University, and OpenAI. ...
arXiv:1605.09674v4
fatcat:lrwm2ssr7nb3dhrektnzymohuu
BARS: Joint Search of Cell Topology and Layout for Accurate and Efficient Binary ARchitectures
[article]
2021
arXiv
pre-print
And we propose to automatically search for the optimal information flow. ...
A notable challenge of BNN architecture search is that binary operations exacerbate the "collapse" problem of differentiable NAS, for which we incorporate various search and derive strategies to ...
Many recent works [18, 3, 28] follow its binarization scheme, and so do we. ...
arXiv:2011.10804v3
fatcat:67m4p5vdofev5ldb7cf56g3kqu
Safety and Robustness in Decision Making: Deep Bayesian Recurrent Neural Networks for Somatic Variant Calling in Cancer
[article]
2019
arXiv
pre-print
The genomic profile underlying an individual tumor can be highly informative in the creation of a personalized cancer treatment strategy for a given patient; a practice known as precision oncology. ...
The identification of these aberrations from sequencing noise and germline variant background poses a classic classification-style problem. ...
The Adam optimizer [4] was used to optimize the cost function of each network. ...
arXiv:1912.02065v1
fatcat:c6iztas5bjcvzcjcsh44fjvxjq
A Review of Binarized Neural Networks
2019
Electronics
We give a tutorial of the general BNN methodology and review various contributions, implementations and applications of BNNs. ...
In this work, we review Binarized Neural Networks (BNNs). BNNs are deep neural networks that use binary values for activations and weights, instead of full precision values. ...
Using the STE, the real-valued weights can be updated with an optimization strategy such as SGD or Adam (a sketch follows this entry). ...
doi:10.3390/electronics8060661
fatcat:7cvd6fn2undjdhuunnthsrzfzu
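The straight-through estimator (STE) mentioned above is the trick that makes this update possible: the forward pass uses sign() of latent real-valued weights, while the backward pass lets the gradient bypass the non-differentiable sign so that SGD or Adam can update the latent copy. A minimal PyTorch sketch of the idea; the gradient clipping at |w| <= 1 follows common BinaryConnect-style practice and is an assumption, not something stated in the snippet.

```python
import torch

class SignSTE(torch.autograd.Function):
    """Forward: sign(w). Backward: pass the gradient straight through,
    zeroing it where |w| > 1 (the usual clipped STE)."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()

# Latent real-valued weights; the optimizer (SGD or Adam) updates these.
w_real = torch.nn.Parameter(torch.randn(4, 4))
optimizer = torch.optim.Adam([w_real], lr=1e-3)

x = torch.randn(8, 4)
w_bin = SignSTE.apply(w_real)      # binary weights used in the forward pass
loss = (x @ w_bin).pow(2).mean()   # placeholder loss
optimizer.zero_grad()
loss.backward()                    # gradient reaches w_real via the STE
optimizer.step()
```

In a full BNN the same pattern is applied per layer: the binarized weights are what gets deployed, while the real-valued copies exist only during training.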
Binarizing by Classification: Is soft function really necessary?
[article]
2022
arXiv
pre-print
The MLP-based classifier can fit any continuous function theoretically and is adaptively learned to binarize networks and backpropagate gradients without any specific soft function. ...
Although many hand-designed soft functions have been proposed to approximate gradients, their mechanism is not clear and there are still huge performance gaps between binary models and their full-precision ...
Training strategy: for the experiments on the CIFAR-10 dataset, we apply SGD as our optimizer with a momentum of 0.9 and a weight decay of 1e-4 (see the sketch after this entry). ...
arXiv:2205.07433v2
fatcat:lpkedp6thjd25mqp6aau4evkre
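The CIFAR-10 training strategy quoted above is a standard SGD configuration; a minimal PyTorch equivalent is sketched below, with the backbone model and the learning rate as placeholders since the snippet does not specify them.

```python
import torch
import torchvision

# Placeholder backbone, not the paper's binarized model.
model = torchvision.models.resnet18(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
```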
Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?
[article]
2018
arXiv
pre-print
We conclude that the error of BNNs is predominantly caused by the intrinsic instability (training time) and non-robustness (train & test time). ...
While ensemble techniques have been broadly believed to be only marginally helpful for strong classifiers such as deep neural networks, our analyses and experiments show that they are naturally a perfect ...
and how it interacts with different optimizers such as SGD or ADAM [37]. ...
arXiv:1806.07550v2
fatcat:t36uex5eezh4vjelw2aaxubcv4
How to Train a Compact Binary Neural Network with High Accuracy?
2017
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence
How to train a binary neural network (BinaryNet) with both high compression rate and high accuracy on large scale dataset? ...
We answer this question through a careful analysis of previous work on BinaryNets, in terms of training strategies, regularization, and activation approximation. ...
We emphasize that proper training strategies are actually as important as how to do better approximation, which was largely overlooked ...
Our Approach: In this section, we present in detail how we train a BinaryNet ...
doi:10.1609/aaai.v31i1.10862
fatcat:2tl4iuby7zeplgyqwty52raodu
Binary Neural Networks: A Survey
2020
Pattern Recognition
We also investigate other practical aspects of binary neural networks such as the hardware-friendly design and the training tricks. ...
However, the binarization inevitably causes severe information loss, and even worse, its discontinuity brings difficulty to the optimization of the deep network. ...
The studies based on BNNs can also help us to analyze how structures in deep neural networks work. ...
doi:10.1016/j.patcog.2020.107281
fatcat:p7ohjigozza5viejq6x7cyf6zi
Showing results 1–15 of 321 results