How Do Adam and Training Strategies Help BNNs Optimization? [article]

Zechun Liu, Zhiqiang Shen, Shichao Li, Koen Helwegen, Dong Huang, Kwang-Ting Cheng
2021 arXiv   pre-print
The best performing Binary Neural Networks (BNNs) are usually attained using Adam optimization and its multi-step training variants.  ...  specific training strategies.  ...  Besides comparing Adam to SGD, we further explore how training strategies affect BNN optimization. Previous works proposed different training strategies: Yang et al.  ... 
arXiv:2106.11309v1 fatcat:5m3ezp4gxvdsbdg2ptv2udpzmq

BAMSProd: A Step towards Generalizing the Adaptive Optimization Methods to Deep Binary Model [article]

Junjie Liu, Dongchao Wen, Deyu Wang, Wei Tao, Tse-Wei Chen, Kinya Osa, Masami Kato
2020 arXiv   pre-print
In this paper, we provide an explicit convex optimization example where training the BNNs with traditional adaptive optimization methods still faces the risk of non-convergence, and identify that  ...  Recent methods have significantly reduced the performance degradation of Binary Neural Networks (BNNs), but guaranteeing the effective and efficient training of BNNs is an unsolved problem.  ...  These models are all trained with default strategies and data augmentation in [43].  ... 
arXiv:2009.13799v1 fatcat:oia2pd4pznellcg62rcyy3wkra

Augmenting Neural Networks with Priors on Function Values [article]

Hunter Nisonoff, Yixin Wang, Jennifer Listgarten
2022 arXiv   pre-print
How can we coherently leverage such prior knowledge to help improve a neural network model that is quite accurate in some regions of input space -- typically near the training data -- but wildly wrong  ...  Herein, we tackle this problem by developing an approach to augment BNNs with prior information on the function values themselves.  ...  The GB1 data set used a fully-connected neural network with one hidden layer of 300 units and ReLU non-linearities, optimized using Adam with a weight decay of 0.0001.  ... 
arXiv:2202.04798v3 fatcat:tb25rlg65vdvzj4r3pdgeb3ujy
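(Side note: the configuration quoted in the snippet above, one hidden layer of 300 units, ReLU, and Adam with weight decay 0.0001, can be written as the following minimal PyTorch sketch; the input and output dimensions are placeholders, not values from the paper.)

# Minimal PyTorch sketch of the setup quoted above (not the authors' code).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1024, 300),   # one hidden layer with 300 units; 1024 input features is a placeholder
    nn.ReLU(),              # ReLU non-linearity as described
    nn.Linear(300, 1),      # output head; output size 1 is a placeholder
)
optimizer = torch.optim.Adam(model.parameters(), weight_decay=1e-4)  # Adam with weight decay 0.0001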

Bimodal Distributed Binarized Neural Networks [article]

Tal Rozen, Moshe Kimhi, Brian Chmiel, Avi Mendelson, Chaim Baskin
2022 arXiv   pre-print
Preserving this distribution during binarization-aware training creates robust and informative binary feature maps and significantly reduces the generalization error of the BNN.  ...  Our source code, experimental settings, training logs, and binary models are available at .  ...  It contains over 1.2M training images from 1,000 different categories. For ImageNet, we use an ADAM optimizer with a momentum of 0.9 and a learning rate set to 1e-3.  ... 
arXiv:2204.02004v1 fatcat:hbck33udlbfvrolw4nf76d24cu
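(Side note: "an ADAM optimizer with a momentum of 0.9" maps onto Adam's first-moment coefficient beta1 in common frameworks; a minimal PyTorch sketch of the quoted ImageNet optimizer settings follows, with a placeholder layer standing in for the binary network.)

# Hedged sketch of the quoted optimizer settings; the layer is a placeholder, not the paper's architecture.
import torch
import torch.nn as nn

model = nn.Conv2d(3, 64, kernel_size=3)  # placeholder layer
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))  # "momentum" 0.9 = beta1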

Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?

Shilin Zhu, Xin Dong, Hao Su
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
We conclude that the error of BNNs is predominantly caused by the intrinsic instability (training time) and non-robustness (train & test time).  ...  While ensemble techniques have been broadly believed to be only marginally helpful for strong classifiers such as deep neural networks, our analysis and experiments show that they are naturally a perfect  ...  and how it interacts with different optimizers such as SGD or ADAM [37].  ... 
doi:10.1109/cvpr.2019.00506 dblp:conf/cvpr/ZhuDS19 fatcat:fed4idlbqrcnzg2kicg5uoazpu

Understanding Learning Dynamics of Binary Neural Networks via Information Bottleneck [article]

Vishnu Raj, Nancy Nayak, Sheetal Kalyani
2020 arXiv   pre-print
However, training BNNs is not easy due to the discontinuity in activation functions, and the training dynamics of BNNs are not well understood.  ...  We analyze BNNs through the Information Bottleneck principle and observe that the training dynamics of BNNs are considerably different from those of Deep Neural Networks (DNNs).  ...  Hence, an insight into the learning dynamics can help in the development of efficient optimizers targeted towards training BNNs.  ... 
arXiv:2006.07522v1 fatcat:3tz44z7ia5hw5ftohiqg67z2b4

A comprehensive review of Binary Neural Network [article]

Chunyu Yuan, Sos S. Agaian
2022 arXiv   pre-print
Along the way, it examines BNN (a) purpose: their early successes and challenges; (b) BNN optimization: selected representative works that contain essential optimization techniques; (c) deployment: open-source  ...  frameworks for BNN modeling and development; (d) terminal: efficient computing architectures and devices for BNN; and (e) applications: diverse applications with BNN.  ...  Extending Real-to-Bin's training strategy, BNN-Adam investigates and designs a new training scheme based on the Adam optimizer that successfully improves the trained performance of Real-to-Bin and ReActNet  ... 
arXiv:2110.06804v3 fatcat:b2w6atz27fbgdacq5aiov32bpi

VIME: Variational Information Maximizing Exploration [article]

Rein Houthooft, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel
2017 arXiv   pre-print
While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios.  ...  This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics.  ...  Acknowledgments This work was done in collaboration between UC Berkeley, Ghent University and OpenAI.  ... 
arXiv:1605.09674v4 fatcat:lrwm2ssr7nb3dhrektnzymohuu

BARS: Joint Search of Cell Topology and Layout for Accurate and Efficient Binary ARchitectures [article]

Tianchen Zhao, Xuefei Ning, Xiangsheng Shi, Songyi Yang, Shuang Liang, Peng Lei, Jianfei Chen, Huazhong Yang, Yu Wang
2021 arXiv   pre-print
And we propose to automatically search for the optimal information flow.  ...  A notable challenge of BNN architecture search lies in the fact that binary operations exacerbate the "collapse" problem of differentiable NAS, for which we incorporate various search and derive strategies to  ...  Many recent works [18, 3, 28] follow its binarization scheme, and so do we.  ... 
arXiv:2011.10804v3 fatcat:67m4p5vdofev5ldb7cf56g3kqu

Safety and Robustness in Decision Making: Deep Bayesian Recurrent Neural Networks for Somatic Variant Calling in Cancer [article]

Geoffroy Dubourg-Felonneau, Omar Darwish, Christopher Parsons, Dami Rebergen, John W Cassidy, Nirmesh Patel, Harry W Clifford
2019 arXiv   pre-print
The genomic profile underlying an individual tumor can be highly informative in the creation of a personalized cancer treatment strategy for a given patient; a practice known as precision oncology.  ...  The identification of these aberrations from sequencing noise and germline variant background poses a classic classification-style problem.  ...  Adam optimizer [4] was used to optimize the cost function of each network.  ... 
arXiv:1912.02065v1 fatcat:c6iztas5bjcvzcjcsh44fjvxjq

A Review of Binarized Neural Networks

Taylor Simons, Dah-Jye Lee
2019 Electronics  
We give a tutorial of the general BNN methodology and review various contributions, implementations and applications of BNNs.  ...  In this work, we review Binarized Neural Networks (BNNs). BNNs are deep neural networks that use binary values for activations and weights, instead of full-precision values.  ...  Using the STE, the real-valued weights can be updated with an optimization strategy such as SGD or Adam.  ... 
doi:10.3390/electronics8060661 fatcat:7cvd6fn2undjdhuunnthsrzfzu
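(Side note: the straight-through estimator (STE) mentioned in the snippet above binarizes the latent real-valued weights in the forward pass and copies the gradient through unchanged in the backward pass, so the real-valued weights can be updated by SGD or Adam. Below is a minimal, generic PyTorch sketch, not tied to any single surveyed paper.)

# Generic STE sketch: sign() in the forward pass, identity gradient in the backward pass.
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        return torch.sign(w)        # binary weights used for the forward computation

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output          # straight-through: gradient passed to the real-valued weights as-is

w_real = torch.randn(8, requires_grad=True)       # latent real-valued weights
optimizer = torch.optim.Adam([w_real], lr=1e-3)   # SGD would work the same way here

w_bin = BinarizeSTE.apply(w_real)                 # binarize for the forward pass
loss = (w_bin * torch.randn(8)).sum()             # dummy loss, for illustration only
loss.backward()                                   # STE routes the gradient to w_real
optimizer.step()                                  # real-valued weights are updated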

Binarizing by Classification: Is soft function really necessary? [article]

Yefei He, Luoming Zhang, Weijia Wu, Hong Zhou
2022 arXiv   pre-print
The MLP-based classifier can fit any continuous function theoretically and is adaptively learned to binarize networks and backpropagate gradients without any specific soft function.  ...  Although many hand-designed soft functions have been proposed to approximate gradients, their mechanism is not clear and there are still huge performance gaps between binary models and their full-precision  ...  Training strategy: For the experiments on the CIFAR10 dataset, we apply SGD as our optimizer with a momentum of 0.9 and a weight decay of 1e-4.  ... 
arXiv:2205.07433v2 fatcat:lpkedp6thjd25mqp6aau4evkre
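(Side note: the CIFAR10 training strategy quoted above, SGD with momentum 0.9 and weight decay 1e-4, is sketched below in PyTorch; the model and the learning rate are placeholders not given in the snippet.)

# Hedged sketch of the quoted CIFAR10 optimizer settings; model and lr are placeholders.
import torch
import torch.nn as nn

model = nn.Linear(3 * 32 * 32, 10)  # placeholder for a CIFAR10 classifier
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,        # learning rate is an assumption
                            momentum=0.9, weight_decay=1e-4)   # momentum and weight decay as quoted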

Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit? [article]

Shilin Zhu, Xin Dong, Hao Su
2018 arXiv   pre-print
We conclude that the error of BNNs is predominantly caused by the intrinsic instability (training time) and non-robustness (train & test time).  ...  While ensemble techniques have been broadly believed to be only marginally helpful for strong classifiers such as deep neural networks, our analyses and experiments show that they are naturally a perfect  ...  and how it interacts with different optimizers such as SGD or ADAM [37].  ... 
arXiv:1806.07550v2 fatcat:t36uex5eezh4vjelw2aaxubcv4

How to Train a Compact Binary Neural Network with High Accuracy?

Wei Tang, Gang Hua, Liang Wang
2017 Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)  
How to train a binary neural network (BinaryNet) with both a high compression rate and high accuracy on a large-scale dataset?  ...  We answer this question through a careful analysis of previous work on BinaryNets, in terms of training strategies, regularization, and activation approximation.  ...  We emphasize that proper training strategies are actually as important as how to do better approximation, which was largely overlooked.  ...  In this section, we present in detail how we train a BinaryNet  ... 
doi:10.1609/aaai.v31i1.10862 fatcat:2tl4iuby7zeplgyqwty52raodu

Binary Neural Networks: A Survey

Haotong Qin, Ruihao Gong, Xianglong Liu, Xiao Bai, Jingkuan Song, Nicu Sebe
2020 Pattern Recognition  
We also investigate other practical aspects of binary neural networks such as the hardware-friendly design and the training tricks.  ...  However, the binarization inevitably causes severe information loss, and even worse, its discontinuity brings difficulty to the optimization of the deep network.  ...  The studies based on BNNs can also help us to analyze how structures in deep neural networks work.  ... 
doi:10.1016/j.patcog.2020.107281 fatcat:p7ohjigozza5viejq6x7cyf6zi
Showing results 1 — 15 out of 321 results