9,147 Hits in 5.7 sec

Regularizing Neural Networks via Stochastic Branch Layers [article]

Wonpyo Park, Paul Hongsuck Seo, Bohyung Han, Minsu Cho
2019 arXiv   pre-print
We introduce a novel stochastic regularization technique for deep neural networks, which decomposes a layer into multiple branches with different parameters and merges stochastically sampled combinations  ...  The proposed regularizer allows the model to explore diverse regions of the model parameter space via multiple combinations of branches to find better local minima.  ...  In training a neural network, the stochastic branch layers act as a regularizer.  ... 
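The snippet describes decomposing a layer into parallel branches with separate parameters, sampling among them during training, and merging deterministically at test time. A minimal numpy sketch of that idea (class name, the one-branch-per-pass sampling rule, and the mean-merge at test time are illustrative assumptions, not the paper's exact scheme):

```python
import numpy as np

rng = np.random.default_rng(0)

class StochasticBranchLinear:
    """Toy sketch: one linear layer decomposed into several branches.

    During training a single branch is sampled uniformly per forward pass;
    at test time the branch parameters are merged by averaging.
    """
    def __init__(self, in_dim, out_dim, n_branches=3):
        self.weights = [rng.standard_normal((in_dim, out_dim)) * 0.1
                        for _ in range(n_branches)]

    def forward(self, x, training=True):
        if training:
            # stochastic merge: pick one branch's parameters at random
            w = self.weights[rng.integers(len(self.weights))]
            return x @ w
        # deterministic merge at test time: mean over branch parameters
        return x @ np.mean(self.weights, axis=0)

layer = StochasticBranchLinear(4, 2)
x = np.ones((1, 4))
print(layer.forward(x, training=True).shape)   # (1, 2)
print(layer.forward(x, training=False).shape)  # (1, 2)
```

The randomness over branch combinations is what acts as the regularizer: each forward pass trains a different sub-model, loosely analogous to dropout over whole parameter sets.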
arXiv:1910.01467v1 fatcat:b7sccp23ijemrexu4vyfvh2mzq


Stochastic Shake-Shake Regularization for Affective Learning from Speech

Che-Wei Huang, Shrikanth Narayanan
2018 Interspeech 2018  
We propose stochastic Shake-Shake regularization based on multi-branch residual architectures to mitigate over-fitting in affective learning from speech.  ...  network parameters than to boosting the generalization power.  ...  Stochastic Shake-Shake regularized 3-branch ResNeXt with p_l = 1 − (l/L)(1 − p_L) for every Shaking layer, where p_L = 0.50.  ... 
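The keep probability quoted in the snippet decays linearly with depth, so layer l of L keeps its stochastic branch with probability p_l = 1 − (l/L)(1 − p_L), reaching p_L = 0.50 at the deepest layer. A direct translation (function name is illustrative):

```python
def keep_prob(l, L, p_L=0.5):
    """Linearly decayed keep probability for layer l of L.

    Shallow layers are kept almost always; the deepest layer (l == L)
    is kept with probability p_L.
    """
    return 1.0 - (l / L) * (1.0 - p_L)

print([round(keep_prob(l, 4), 3) for l in range(1, 5)])  # [0.875, 0.75, 0.625, 0.5]
```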
doi:10.21437/interspeech.2018-1327 dblp:conf/interspeech/HuangN18 fatcat:5az7x4danjgyxjqmhcwoxun74y

StochasticNet: Forming Deep Neural Networks via Stochastic Connectivity [article]

Mohammad Javad Shafiee, Parthipan Siva, Alexander Wong
2015 arXiv   pre-print
Motivated by this intriguing finding, we introduce the concept of StochasticNet, where deep neural networks are formed via stochastic connectivity between neurons.  ...  Deep neural networks are a branch of machine learning that has seen a meteoric rise in popularity due to its powerful ability to represent and model high-level abstractions in highly complex data.  ...  and network regularization methods.  ... 
arXiv:1508.05463v4 fatcat:jm4cqujmtjhrzdopixwux5m2me

A Chain Graph Interpretation of Real-World Neural Networks [article]

Yuesong Shen, Daniel Cremers
2020 arXiv   pre-print
One major issue is that our current interpretation of neural networks (NNs) as function approximators is too generic to support in-depth analysis.  ...  It is thus a promising framework that deepens our understanding of neural networks and provides a coherent theoretical formulation for future deep learning research.  ...  Its introduction of stochastic noise has shown effective at regularizing neural networks and its success can also be argued from an ensemble learning perspective [52] .  ... 
arXiv:2006.16856v2 fatcat:vp3khlurmbfejgbjdylruw7mo4

Blockout: Dynamic Model Selection for Hierarchical Deep Networks

Calvin Murdock, Zhen Li, Howard Zhou, Tom Duerig
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
via heuristic clustering methods.  ...  of hierarchical network structures.  ...  Deep Neural Networks Deep neural networks are layered nonlinear functions f : R^d → R^p that take d-dimensional images as input and output p-dimensional predictions.  ... 
doi:10.1109/cvpr.2016.283 dblp:conf/cvpr/MurdockLZD16 fatcat:gppidxc3yrfuzekyz223wb25li

BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks [article]

Surat Teerapittayanon, Bradley McDanel, H.T. Kung
2017 arXiv   pre-print
Deep neural networks are state-of-the-art methods for many learning tasks due to their ability to extract increasingly better features at each network layer.  ...  The architecture allows prediction results for a large portion of test samples to exit the network early via these branches when samples can already be inferred with high confidence.  ...  For the former, branches will provide regularization on the main branch (baseline network), and vice versa.  ... 
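The early-exit idea in the snippet is that a sample leaves the network at the first side branch whose prediction is already confident enough. A common confidence test (used here as an assumption; the threshold value and function names are illustrative) is the entropy of the branch's softmax output:

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def early_exit(branch_logits, threshold=0.5):
    """Return (exit_index, probs) for the first branch whose softmax
    entropy falls below `threshold`; otherwise fall through to the
    last (main) exit."""
    for i, logits in enumerate(branch_logits):
        p = softmax(logits)
        if entropy(p) < threshold:
            return i, p
    return len(branch_logits) - 1, p

# a confident (easy) sample exits at the first side branch
idx, _ = early_exit([[5.0, 0.0, 0.0], [9.0, 0.0, 0.0]])
print(idx)  # 0
```

Samples that stay uncertain at every side branch simply continue to the final classifier, so only hard inputs pay for the full network depth.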
arXiv:1709.01686v1 fatcat:fq6z7z3xdbfvhpwhj7zdjmvooa

Convolutional Neural Networks with Dynamic Regularization [article]

Yi Wang, Zhen-Peng Bian, Junhui Hou, Lap-Pui Chau
2020 arXiv   pre-print
For convolutional neural networks (CNNs), regularization methods, such as DropBlock and Shake-Shake, have illustrated the improvement in the generalization performance.  ...  That is, the regularization strength is fixed to a predefined schedule, and manual adjustments are required to adapt to various network architectures.  ...  Stochastic depth randomly drops a certain number of residual branches of ResNet so that the network is shrunk in training.  ... 
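The stochastic-depth mechanism mentioned in the snippet drops whole residual branches at random during training and rescales them at test time. A minimal sketch of one residual block under that rule (the expectation-matching test-time scaling is the standard formulation; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, f, keep_prob, training=True):
    """Stochastic-depth sketch for one block x -> x + f(x).

    During training the residual branch f(x) survives with probability
    keep_prob and is otherwise skipped (identity only); at test time the
    branch output is scaled by keep_prob to match the training expectation.
    """
    if training:
        if rng.random() < keep_prob:
            return x + f(x)
        return x  # branch dropped: the block reduces to identity
    return x + keep_prob * f(x)

f = lambda x: 2.0 * x
print(residual_block(np.array([1.0]), f, keep_prob=0.8, training=False))  # [2.6]
```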
arXiv:1909.11862v3 fatcat:a7i2krz6anhofoq5jdpd6f4znq

Multi-scale Convolution Aggregation and Stochastic Feature Reuse for DenseNets [article]

Mingjie Wang, Jun Zhou, Wendong Mao, Minglun Gong
2018 arXiv   pre-print
To address this problem, a regularization method named Stochastic Feature Reuse is also presented.  ...  Recently, Convolution Neural Networks (CNNs) obtained huge success in numerous vision tasks.  ...  Stochastic Feature Reuse Dropout [8] , Drop-connect [30] and Maxout [4] provide excellent regularization methods through modifying interactions among neural units or connections between different  ... 
arXiv:1810.01373v1 fatcat:tj5zeutrq5bqdfuqahpxcxazee

Comprehensive Online Network Pruning via Learnable Scaling Factors [article]

Muhammad Umair Haider, Murtaza Taj
2020 arXiv   pre-print
Width-wise pruning (filter pruning) is commonly performed via learnable gates or switches and sparsity regularizers, whereas pruning of layers has so far been performed arbitrarily by manually designing  ...  One of the major challenges in deploying deep neural network architectures is their size, which has an adverse effect on their inference time and memory requirements.  ...  Consider a neural network with L hidden layers h_l, where l ∈ {1, …, L}.  ... 
arXiv:2010.02623v1 fatcat:xecqx2coxnfa3jd2vwnm5vcofu

AugShuffleNet: Improve ShuffleNetV2 via More Information Communication [article]

Longqing Ye
2022 arXiv   pre-print
Based on ShuffleNetV2, we build a more powerful and efficient model family, termed AugShuffleNets, by introducing a higher frequency of cross-layer information communication for better model performance  ...  Deep Networks with Stochastic Depth [19] reduces the training depth of ResNets by randomly skipping layers.  ...  Short Connection ResNets and Highway Networks [18] introduce skip connections to allow training deeper neural networks.  ... 
arXiv:2203.06589v1 fatcat:ohdkaek45neyxgrbisl2q2t44m

Deciding How to Decide: Dynamic Routing in Artificial Neural Networks [article]

Mason McGill, Pietro Perona
2017 arXiv   pre-print
We find that, in dynamically-routed networks trained to classify images, layers and branches become specialized to process distinct categories of images.  ...  We propose and systematically evaluate three strategies for training dynamically-routed artificial neural networks: graphs of learned transformations through which different input signals may take different  ...  To support both frequently- and infrequently-used layers, we regularize subnetworks as they are activated by d̂, instead of regularizing the entire network directly.  ... 
arXiv:1703.06217v2 fatcat:talv3zme6rhurcts2c66tkek6e

Detecting and Counting Small Animal Species Using Drone Imagery by Applying Deep Learning [chapter]

Ravi Sahu
2019 Visual Object Tracking in the Deep Neural Networks Era [Working Title]  
For this purpose, the U-Net architecture neural network was implemented. A dilated convolution layer was added to the usual U-Net.  ...  The designed flexible architecture allows training a neural network for pixel-wise semantic segmentation with an accuracy of 0.9863 on a tiny dataset.  ...  Branches have less but detected probabilities.  ... 
doi:10.5772/intechopen.88437 fatcat:gw7lg757pjbavniheehbf6or3y

Learning Decoupling Features Through Orthogonality Regularization [article]

Li Wang, Rongzhi Gu, Weiji Zhuang, Peng Gao, Yujun Wang, Yuexian Zou
2022 arXiv   pre-print
Bearing this in mind, a two-branch deep network (KWS branch and SV branch) with the same network structure is developed and a novel decoupling feature learning method is proposed to push up the performance  ...  The results demonstrate that the orthogonality regularization helps the network to achieve SOTA EER of 1.31% and 1.87% on KWS and SV, respectively.  ...  Architecture of the proposed two-branch neural network. It consists of a shared temporal convolutional layer and two branches, the KWS branch and the SV branch.  ... 
arXiv:2203.16772v1 fatcat:3qmqdhg5v5esbdlknzybib6dga

BranchConnect: Image Categorization with Learned Branch Connections

Karim Ahmed, Lorenzo Torresani
2018 2018 IEEE Winter Conference on Applications of Computer Vision (WACV)  
The stem then splits into multiple branches implementing parallel feature extractors, which are ultimately connected to the final classification layer via learned gated connections.  ...  The stem of the tree includes a sequence of convolutional layers common to all classes.  ...  Stochastic pooling [23] is another regularization mechanism leveraging stochasticity during learning.  ... 
doi:10.1109/wacv.2018.00141 dblp:conf/wacv/AhmedT18 fatcat:kwzvnuacprbdzhrecd6hfalcce

MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning [article]

Yuan Gao, Haoping Bai, Zequn Jie, Jiayi Ma, Kui Jia, Wei Liu
2020 arXiv   pre-print
We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL).  ...  This is realized with a minimum entropy regularization on the architecture weights during the search phase, which makes the architecture weights converge to near-discrete values and therefore achieves  ...  Specifically, we start with multiple fixed single-task network branches, representing each intermediate layer as a node and the associated feature fusion operations as an edge.  ... 
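The minimum entropy regularization in the snippet penalizes the entropy of the softmax over architecture weights, so the search is pushed toward a near-discrete (close to one-hot) choice of operations. A small sketch of the penalty term itself (function names and the epsilon guard are illustrative):

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def arch_entropy(logits):
    """Entropy of the softmax over architecture logits.

    Added as a penalty to the search objective, this term is minimized
    when the architecture weights are near-discrete (one-hot).
    """
    p = softmax(logits)
    return float(-(p * np.log(p + 1e-12)).sum())

# near-uniform weights have high entropy; near-discrete weights, low
print(arch_entropy([0.0, 0.0, 0.0]) > arch_entropy([8.0, 0.0, 0.0]))  # True
```

Minimizing this penalty alongside the task loss means the continuous relaxation used during search already concentrates on one edge/operation, so the final discretization step changes the network very little.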
arXiv:2003.14058v1 fatcat:eqkyyoz2s5dc5gk3g3ysbhqcey
Showing results 1 — 15 out of 9,147 results