71,498 Hits in 4.0 sec

Data augmentation instead of explicit regularization [article]

Alex Hernández-García, Peter König
2020 arXiv   pre-print
Here, we first provide formal definitions of explicit and implicit regularization that help understand essential differences between techniques.  ...  Second, we contrast data augmentation with weight decay and dropout.  ...  This approach of training with data augmentation instead of explicit regularization improves the performance and saves valuable computational resources that are responsible for significant greenhouse gas  ... 
arXiv:1806.03852v5 fatcat:ussbftciyjhdhd2t4vmltkzs3m
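
For illustration, a minimal sketch (not the authors' code; the model and hyperparameters are placeholder choices) of the two training setups this abstract contrasts: explicit regularization through weight decay and dropout versus relying on stochastic input augmentation alone.

```python
# Sketch only: placeholder model; values chosen for illustration, not from the paper.
import torch
import torch.nn as nn
import torchvision.transforms as T

# Setup A: explicit regularization -- dropout in the model, weight decay in the optimizer.
model_explicit = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 512),
                               nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(512, 10))
opt_explicit = torch.optim.SGD(model_explicit.parameters(), lr=0.1,
                               momentum=0.9, weight_decay=5e-4)

# Setup B: no dropout, no weight decay; regularization comes only from the
# stochastic input transformations applied to every training example.
model_aug = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 512),
                          nn.ReLU(), nn.Linear(512, 10))
opt_aug = torch.optim.SGD(model_aug.parameters(), lr=0.1, momentum=0.9,
                          weight_decay=0.0)
augment = T.Compose([T.RandomCrop(32, padding=4),
                     T.RandomHorizontalFlip(),
                     T.ToTensor()])
```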

Do deep nets really need weight decay and dropout? [article]

Alex Hernández-García, Peter König
2018 arXiv   pre-print
may not be necessary for object recognition if enough data augmentation is introduced.  ...  In this paper we build upon recent research that suggests that explicit regularization may not be as important as widely believed and carry out an ablation study that concludes that weight decay and dropout  ...  We leave for future work the analysis of these results on a larger set of architectures and data sets, as well as further exploring the benefits of data augmentation compared to explicit regularization  ... 
arXiv:1802.07042v3 fatcat:c5jk6bpuhngqfbcrz7yca63oke

Understanding deep learning requires rethinking generalization [article]

Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals
2017 arXiv   pre-print
exceeds the number of data points as it usually does in practice.  ...  This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise.  ...  More strikingly, with data-augmentation on but other explicit regularizers off, Inception is able to achieve a top-1 accuracy of 72.95%.  ... 
arXiv:1611.03530v2 fatcat:qr5suinu3vfb7a3mrnvualpaza
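
A minimal sketch of the randomization tests this abstract describes: the true labels, or the images themselves, are replaced by pure noise before training, and the network is checked for whether it still fits the training set. The helper names and the use of PyTorch are assumptions for illustration.

```python
import torch

def randomize_labels(targets: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Replace the true labels with labels drawn uniformly at random."""
    return torch.randint(0, num_classes, targets.shape)

def randomize_inputs(images: torch.Tensor) -> torch.Tensor:
    """Replace each image with completely unstructured Gaussian noise."""
    return torch.randn_like(images)
```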

Adaptive Low-Rank Factorization to regularize shallow and deep neural networks [article]

Mohammad Mahdi Bejani, Mehdi Ghatee
2020 arXiv   pre-print
Instead, we use this regularization scheme adaptively when the complexity of a layer is high. The complexity of any layer can be evaluated by the nonlinear condition numbers of its learning system.  ...  In addition, most of the regularization schemes decrease the learning speed.  ...  [41, 42], data visualization [15], and ensemble learning [43].  ... 
arXiv:2005.01995v1 fatcat:qimrvd6u5felzac64r37l4cuwi
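
The abstract measures layer complexity via nonlinear condition numbers of the learning system; the sketch below is a simplified stand-in, not the paper's criterion, that uses the matrix condition number of a dense layer's weights and a truncated SVD, with illustrative threshold and rank.

```python
# Sketch only: condition-number test and truncated-SVD factorization of a weight matrix.
import numpy as np

def maybe_factorize(W: np.ndarray, rank: int, cond_threshold: float = 1e3):
    """Return (A, B) with W ~= A @ B if W is ill-conditioned, else None."""
    if np.linalg.cond(W) < cond_threshold:
        return None                      # layer deemed simple enough; leave it unfactorized
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]           # shape (out_features, rank)
    B = Vt[:rank, :]                     # shape (rank, in_features)
    return A, B

W = np.random.randn(256, 512)            # placeholder layer weights
factors = maybe_factorize(W, rank=32)
```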

A Data-Augmentation Is Worth A Thousand Samples: Exact Quantification From Analytical Augmented Sample Moments [article]

Randall Balestriero, Ishan Misra, Yann LeCun
2022 arXiv   pre-print
Data-Augmentation (DA) is known to improve performance across tasks and datasets.  ...  How does the augmentation policy impact the final parameters of a model?  ...  Background Explicit Regularizers From Data-Augmentation.  ... 
arXiv:2202.08325v1 fatcat:43ekytvw5fegvp46xqi26w6zuq

Jointly Learnable Data Augmentations for Self-Supervised GNNs [article]

Zekarias T. Kefato and Sarunas Girdzijauskas and Hannes Stärk
2021 arXiv   pre-print
First, instead of heuristics we propose a learnable data augmentation method that is jointly learned with the embeddings by leveraging the inherent signal encoded in the graph.  ...  In addition, we take advantage of the flexibility of the learnable data augmentation and introduce a new strategy that augments in the embedding space, called post augmentation.  ...  Furthermore, the method does not require explicit contrastive terms or negative sampling.  ... 
arXiv:2108.10420v1 fatcat:znyoshgeivbcvgkkjs22cj5hqu

Counterfactual Maximum Likelihood Estimation for Training Deep Networks [article]

Xinyi Wang, Wenhu Chen, Michael Saxon, William Yang Wang
2021 arXiv   pre-print
, Implicit CMLE and Explicit CMLE, for causal predictions of deep learning models using observational data.  ...  the regular evaluations.  ...  Related Work Another line of work approaches this problem with data augmentation.  ... 
arXiv:2106.03831v2 fatcat:ig54ylsa25hbldrdwp6raxt25a

Improved Robustness to Open Set Inputs via Tempered Mixup [article]

Ryne Roady, Tyler L. Hayes, Christopher Kanan
2020 arXiv   pre-print
Supervised classification methods often assume that evaluation data is drawn from the same distribution as training data and that all classes are present for training.  ...  Our method achieves state-of-the-art results on open set classification baselines and easily scales to large-scale open set classification problems.  ...  To overcome this limitation, we designed our method around drawing novel samples for confidence loss using data augmentation instead of relying on an explicit background set.  ... 
arXiv:2009.04659v1 fatcat:bvvztda72nbpbgxkeivtxfjiu4
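
The novel samples mentioned above are drawn via data augmentation rather than from an explicit background set; below is a minimal sketch of standard mixup-style interpolation (the paper's tempered target smoothing is not reproduced here), assuming one-hot label tensors.

```python
import torch

def mixup(x: torch.Tensor, y_onehot: torch.Tensor, alpha: float = 0.3):
    """Convexly combine a batch with a shuffled copy of itself."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix
```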

Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization [article]

Mufan Sang, Haoqi Li, Fang Liu, Andrew O. Arnold, Li Wan
2022 arXiv   pre-print
of positive data pairs.  ...  We also explore the effectiveness of alternative online data augmentation strategies in both the time domain and the frequency domain.  ...  Instead, our online data augmentation allows multiple different augmentations to be randomly and collectively applied to each input utterance.  ... 
arXiv:2112.04459v2 fatcat:amiklcmgjzdsnl4vssn6t7icwi

DivAug: Plug-in Automated Data Augmentation with Explicit Diversity Maximization [article]

Zirui Liu, Haifeng Jin, Ting-Hsiang Wang, Kaixiong Zhou, Xia Hu
2021 arXiv   pre-print
However, two factors regarding the diversity of augmented data are still missing: 1) the explicit definition (and thus measurement) of diversity and 2) the quantifiable relationship between diversity and  ...  Specifically, recent work has empirically shown that the superior performance of the automated data augmentation methods stems from increasing the diversity of augmented data.  ...  This motivates us to explore the possibility of using an explicit diversity measure to quantify the regularization effect the augmented data may have on the model.  ... 
arXiv:2103.14545v2 fatcat:qfei7off7bhthon6pfx3nzjcoq

Group-Valued Regularization for Analysis of Articulated Motion [chapter]

Guy Rosman, Alex M. Bronstein, Michael M. Bronstein, Xue-Cheng Tai, Ron Kimmel
2012 Lecture Notes in Computer Science  
We extend augmented-Lagrangian total variation regularization to smooth rigid motion cues on the scanned 3D surface obtained from a range scanner.  ...  We present a novel method for estimation of articulated motion in depth scans. The method is based on a framework for regularization of vector-and matrix-valued functions on parametric surfaces.  ...  Augmented Lagrangian Regularization of Group-Valued Maps on Parametric Surfaces Using an augmented Lagrangian term in order to enforce the constraint of u = v ∈ SE(3), the overall functional reads max  ... 
doi:10.1007/978-3-642-33863-2_6 fatcat:qbpb5gz5qjaprhadw7353he6ee

Virtual Augmentation Supported Contrastive Learning of Sentence Representations [article]

Dejiao Zhang, Wei Xiao, Henghui Zhu, Xiaofei Ma, Andrew O. Arnold
2022 arXiv   pre-print
Originating from the interpretation that data augmentation essentially constructs the neighborhoods of each training instance, we in turn utilize the neighborhood to generate effective data augmentations  ...  This challenge is magnified in natural language processing where no general rules exist for data augmentation due to the discrete nature of natural language.  ...  C Evaluating VaSCL with explicit Data Augmentations Please refer to Figure 4 for complete evaluation of VaSCL with explicit data augmentations.  ... 
arXiv:2110.08552v2 fatcat:t374n34vsjhh5fqim6q6xodq2e

Regularizing Deep Networks with Semantic Data Augmentation [article]

Yulin Wang, Gao Huang, Shiji Song, Xuran Pan, Yitong Xia, Cheng Wu
2021 arXiv   pre-print
Data augmentation is widely known as a simple yet surprisingly effective technique for regularizing deep networks.  ...  Conventional data augmentation schemes, e.g., flipping, translation or rotation, are low-level, data-independent and class-agnostic operations, leading to limited diversity for augmented samples.  ...  Data augmentation is a widely used technique to regularize deep networks.  ... 
arXiv:2007.10538v5 fatcat:slbkfaoe2za4thezso4xyrwtlq

Squared ℓ₂ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [article]

Haohan Wang, Zeyi Huang, Xindi Wu, Eric P. Xing
2020 arXiv   pre-print
Data augmentation is one of the most popular techniques for improving the robustness of neural networks.  ...  In addition to directly training the model with original samples and augmented samples, a torrent of methods regularizing the distance between embeddings/representations of the original samples and their  ...  Also, slightly different from the intuitive understanding of "vertices" above, A3 regulates the behavior of embedding instead of raw data.  ... 
arXiv:2011.13052v1 fatcat:wpn6qlr3a5fwnahngpjalirzye
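
A minimal sketch of a squared-ℓ₂ consistency term of the kind this abstract discusses: the distance between the embeddings of an original sample and its augmented view is penalized alongside the usual task loss. The encoder interface and the weighting factor are assumptions.

```python
import torch
import torch.nn.functional as F

def consistency_loss(encoder, x, x_aug, weight: float = 1.0) -> torch.Tensor:
    """Squared l2 distance (averaged) between embeddings of the two views."""
    z, z_aug = encoder(x), encoder(x_aug)
    return weight * F.mse_loss(z, z_aug)
```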

Stochastic Training is Not Necessary for Generalization [article]

Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
2022 arXiv   pre-print
To this end, we show that the implicit regularization of SGD can be completely replaced with explicit regularization even when comparing against a strong and well-researched baseline.  ...  It is widely believed that the implicit regularization of SGD is fundamental to the impressive generalization behavior we observe in neural networks.  ...  data augmentation.  ... 
arXiv:2109.14119v2 fatcat:izkob2pvcfefhaqospgdzjnr7e
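
A minimal sketch of full-batch training with an explicit regularizer (here, plain weight decay on placeholder data): the gradient is computed over the entire dataset at once, so any regularization must come from explicit terms rather than from mini-batch noise. This is an illustrative setup, not the authors' recipe.

```python
import torch
import torch.nn as nn

model = nn.Linear(20, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=5e-4)
loss_fn = nn.CrossEntropyLoss()
X, y = torch.randn(1000, 20), torch.randint(0, 2, (1000,))  # placeholder "full" dataset

for epoch in range(10):
    optimizer.zero_grad()
    loss_fn(model(X), y).backward()   # gradient over the entire training set, no mini-batching
    optimizer.step()
```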
Showing results 1 — 15 out of 71,498 results