Dropout as data augmentation [article]

Xavier Bouthillier, Kishore Konda, Pascal Vincent, Roland Memisevic
2016 arXiv pre-print
Dropout is typically interpreted as bagging a large number of models sharing parameters. We show that using dropout in a network can also be interpreted as a kind of data augmentation in the input space without domain knowledge. We present an approach to projecting the dropout noise within a network back into the input space, thereby generating augmented versions of the training data, and we show that training a deterministic network on the augmented samples yields similar results. Finally, we propose a new dropout noise scheme based on our observations and show that it improves dropout results without adding significant computational cost.
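The projection idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a single ReLU hidden layer, a Bernoulli dropout mask without rescaling, and plain gradient descent on the squared distance between the noise-free hidden activations of a candidate input and the dropped-out activations of the original input. All names (`relu_layer`, `project_dropout_to_input`, `lr`, `steps`) are hypothetical.

```python
import random

def relu_layer(W, b, x):
    """Deterministic hidden layer h = relu(W x + b)."""
    return [max(0.0, sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def match_loss(W, b, x_star, target):
    """0.5 * squared distance between relu(W x* + b) and the target."""
    h = relu_layer(W, b, x_star)
    return 0.5 * sum((hi - ti) * (hi - ti) for hi, ti in zip(h, target))

def project_dropout_to_input(W, b, x, mask, lr=0.02, steps=300):
    """Search for an input x* whose noise-free activations approximate
    the dropped-out activations mask * relu(W x + b).
    Assumption: plain gradient descent started at x; the best iterate
    seen so far is returned, so the result is never worse than x."""
    target = [m * h for m, h in zip(mask, relu_layer(W, b, x))]
    x_star = list(x)
    best_x, best_loss = list(x_star), match_loss(W, b, x_star, target)
    for _ in range(steps):
        h = relu_layer(W, b, x_star)
        # Gradient w.r.t. pre-activations: residual on active units, else 0.
        delta = [(hi - ti) if hi > 0.0 else 0.0 for hi, ti in zip(h, target)]
        # Chain rule back to the input: grad = W^T delta.
        grad = [sum(delta[i] * W[i][j] for i in range(len(W)))
                for j in range(len(x_star))]
        x_star = [xj - lr * gj for xj, gj in zip(x_star, grad)]
        cur = match_loss(W, b, x_star, target)
        if cur < best_loss:
            best_x, best_loss = list(x_star), cur
    return best_x, best_loss

# Toy usage with made-up weights and a random dropout mask (p = 0.5).
rng = random.Random(0)
W = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(4)]
b = [0.1] * 4
x = [0.5, -0.2, 0.8]
mask = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(4)]

target = [m * h for m, h in zip(mask, relu_layer(W, b, x))]
initial_loss = match_loss(W, b, x, target)
x_aug, final_loss = project_dropout_to_input(W, b, x, mask)
```

Here `x_aug` plays the role of an augmented training sample: feeding it through the noise-free layer approximates what dropout did to `x`. An exact match need not exist, so the sketch only reports the closest input it found.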
arXiv:1506.08700v4 fatcat:mtr5qwwrirdw5o4u3ntzunf2t4