A Robust Initialization of Residual Blocks for Effective ResNet Training without Batch Normalization [article]

Enrico Civitelli, Alessio Sortino, Matteo Lapucci, Francesco Bagattini, Giulio Galvan
2021 arXiv pre-print
Batch Normalization is an essential component of all state-of-the-art neural network architectures. However, since it introduces many practical issues, much recent research has been devoted to designing normalization-free architectures. In this paper, we show that weight initialization is key to training ResNet-like normalization-free networks. In particular, we propose a slight modification to the summation operation of a block output to the skip connection branch, so that the whole network is correctly initialized. We show that this modified architecture achieves competitive results on CIFAR-10 without further regularization or algorithmic modifications.
arXiv:2112.12299v1 fatcat:q7ms67kuejduffprk3diuyvswm