13,364 Hits in 4.4 sec

Adversarial Training Can Hurt Generalization [article]

Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John C. Duchi, Percy Liang
2019 arXiv   pre-print
While adversarial training can improve robust accuracy (against an adversary), it sometimes hurts standard accuracy (when there is no adversary).  ...  Finally, we show that robust self-training mostly eliminates this tradeoff by leveraging unlabeled data.  ...  Standard training only fits the training points, which can be done with a simple estimator that generalizes well; adversarial training encourages fitting perturbations of the training points making the  ... 
arXiv:1906.06032v2 fatcat:awfsnfzzmfbolp5zfpbqzqw26a

Improving robustness of language models from a geometry-aware perspective [article]

Bin Zhu, Zhaoquan Gu, Le Wang, Jinyin Chen, Qi Xuan
2022 arXiv   pre-print
On top of FADA, we propose geometry-aware adversarial training (GAT) to perform adversarial training on friendly adversarial data so that we can save a large number of search steps.  ...  Inspired by this, we propose friendly adversarial data augmentation (FADA) to generate friendly adversarial data.  ...  Denote $X_f$ as the friendly adversarial data generated by FADA; Eq. (4) can be reformulated as $\max_{\|\delta_s\| \le \epsilon} L(f_\theta(X_f + \delta_s), y)$ (5). The tiny $\delta_s$ can be obtained by some gradient-based adversarial training  ...
arXiv:2204.13309v1 fatcat:bjmmi4agujdxnmwgrbu6ndwjkq
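The inner maximization quoted above, $\max_{\|\delta_s\| \le \epsilon} L(f_\theta(X_f + \delta_s), y)$, is in practice approximated with a few signed-gradient ascent steps. The following is a minimal sketch of such a gradient-based search, not the GAT authors' code; the function name, budget `eps`, step size `alpha`, and step count are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's implementation): approximate
# max_{||delta_s|| <= eps} L(f_theta(X_f + delta_s), y) by projected signed-gradient ascent.
import torch
import torch.nn.functional as F

def inner_maximization(f_theta, x_friendly, y, eps=0.03, alpha=0.01, steps=3):
    """Search for a small L_inf-bounded perturbation of the friendly adversarial data."""
    delta = torch.zeros_like(x_friendly, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(f_theta(x_friendly + delta), y)
        (grad,) = torch.autograd.grad(loss, delta)
        # Ascend the loss, then project back onto the eps-ball.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return delta.detach()
```

The snippet's claim is that starting this search from the friendly adversarial data $X_f$ saves a large number of such steps compared with starting from the natural data.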

Towards the Memorization Effect of Neural Networks in Adversarial Training [article]

Han Xu, Xiaorui Liu, Wentao Wang, Wenbiao Ding, Zhongqin Wu, Zitao Liu, Anil Jain, Jiliang Tang
2021 arXiv   pre-print
While, DNNs which are optimized via adversarial training algorithms can also achieve perfect training performance by memorizing the labels of atypical samples, as well as the adversarially perturbed atypical  ...  However, adversarially trained models always suffer from poor generalization, with both relatively low clean accuracy and robustness on the test set.  ...  to validate the statement in Section 3.2, where we state that fitting atypical samples in adversarial training can hurt the performance (clean & adversarial accuracy) of typical samples.  ... 
arXiv:2106.04794v1 fatcat:fy2pkijb3fcdxezfjbpq53qrja

Why adversarial training can hurt robust accuracy [article]

Jacob Clarysse and Julia Hörmann and Fanny Yang
2022 arXiv   pre-print
In this paper, we demonstrate that, surprisingly, the opposite may be true -- Even though adversarial training helps when enough data is available, it may hurt robust generalization in the small sample  ...  Machine learning classifiers with high test accuracy often perform poorly under adversarial attacks. It is commonly believed that adversarial training alleviates this issue.  ...  We see in Figure 5a and 6a that, indeed, adversarial training can hurt robust generalization with increasing perturbation budget $\epsilon_{\mathrm{tr}}$.  ...
arXiv:2203.02006v1 fatcat:cc4z7lkotraw3fowpatdp5uuei

The Curious Case of Adversarially Robust Models: More Data Can Help, Double Descend, or Hurt Generalization [article]

Yifei Min, Lin Chen, Amin Karbasi
2020 arXiv   pre-print
In this paper, however, we challenge this conventional belief and show that more training data can hurt the generalization of adversarially robust models in the classification problems.  ...  We prove that more data always hurts the generalization performance of adversarially trained models with large perturbations.  ...  , double descend, or hurt generalization of the adversarially trained model, respectively.  ... 
arXiv:2002.11080v2 fatcat:zm5mfpu6lnbnbdsnidtxjq3hle

Smooth Adversarial Training [article]

Cihang Xie, Mingxing Tan, Boqing Gong, Alan Yuille, Quoc V. Le
2021 arXiv   pre-print
Hence we propose smooth adversarial training (SAT), in which we replace ReLU with its smooth approximations to strengthen adversarial training.  ...  It is also generally believed that, unless making networks larger, network architectural elements would otherwise matter little in improving adversarial robustness.  ...  We conjecture that this non-smooth nature hurts the training process, especially when we train models adversarially.  ... 
arXiv:2006.14536v2 fatcat:asva2qyh5nckzdt4ymgsu4zo2i
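The SAT snippet above attributes the problem to ReLU's non-smooth gradient and proposes swapping ReLU for a smooth approximation. Below is a minimal sketch of that kind of swap in PyTorch, assuming the activations are `nn.ReLU` modules; the choice of `Softplus` and the `beta` value are illustrative assumptions, not necessarily the smooth function used in the paper.

```python
# Minimal sketch (assumed setup, not the SAT reference implementation): recursively
# replace nn.ReLU modules with a smooth surrogate so adversarial-training gradients
# are smooth at zero.
import torch.nn as nn

def smooth_activations(module: nn.Module, beta: float = 10.0) -> None:
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            # Softplus(beta) approaches ReLU as beta grows, but is differentiable everywhere.
            setattr(module, name, nn.Softplus(beta=beta))
        else:
            smooth_activations(child, beta)
```

Note that this only catches activations instantiated as modules; functional `F.relu` calls inside a `forward` method would have to be edited directly.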

Beneficial Perturbations Network for Defending Adversarial Examples [article]

Shixian Wen, Amanda Rios, Laurent Itti
2021 arXiv   pre-print
less than classic adversarial training; 4) BPN can improve the generalization of the network; 5) BPN trained only with Fast Gradient Sign Attack can generalize to defend PGD attacks.  ...  2) BPN can defend against adversarial examples with negligible additional computation and parameter costs compared to training only on clean examples; 3) BPN hurts the accuracy on clean examples much  ...  costs; (2) BPN can alleviate the accuracy trade-off -- hurts the accuracy on clean examples less than classical adversarial training; (3) The increased diversity of the training set can improve generalization  ...
arXiv:2009.12724v3 fatcat:qjbnjhh7hvb3hokjsxjywqsuky

Adversarial Training for Large Neural Language Models [article]

Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon, Jianfeng Gao
2020 arXiv   pre-print
Generalization and robustness are both key desiderata for designing machine learning methods. Adversarial training can enhance robustness, but past work often finds it hurts generalization.  ...  However, these models are still vulnerable to adversarial attacks. In this paper, we show that adversarial pre-training can improve both generalization and robustness.  ...  The curious case of adversarially robust models: More data can help, double descend, or hurt generalization. arXiv preprint arXiv:2002.11080.  ... 
arXiv:2004.08994v2 fatcat:ygxjh3azxjgotippvqtensrt3u

NoiLIn: Do Noisy Labels Always Hurt Adversarial Training? [article]

Jingfeng Zhang, Xilie Xu, Bo Han, Tongliang Liu, Gang Niu, Lizhen Cui, Masashi Sugiyama
2021 arXiv   pre-print
Firstly, we find that NL injection in inner maximization for generating adversarial data augments natural data implicitly, which benefits AT's generalization.  ...  Adversarial training (AT) based on minimax optimization is a popular learning style that enhances the model's adversarial robustness.  ...  NL is often deemed to hurt the training.  ... 
arXiv:2105.14676v1 fatcat:mtcfp6dlabehbpd43ij4jrxrwu

Large Norms of CNN Layers Do Not Hurt Adversarial Robustness [article]

Youwei Liang, Dong Huang
2021 arXiv   pre-print
However, they can slightly hurt adversarial robustness.  ...  Observing this unexpected phenomenon, we compute the norms of layers in the CNNs trained with three different adversarial training frameworks and surprisingly find that adversarially robust CNNs have comparable  ...  Regularizing Norms Improves Generalization but Can Hurt Adversarial Robustness: To better understand the effect of regularizing the norm of CNN layers, we conduct experiments with various models on CIFAR  ...
arXiv:2009.08435v6 fatcat:evlkidedfje2bpkuoahjani4bq
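For context on the "Regularizing Norms" experiment mentioned in the snippet, a common way to regularize layer norms is to add a weight-norm term to the training loss. The sketch below uses a plain L2 (Frobenius) penalty on convolutional weights; the penalty form and the coefficient `lam` are generic illustrative assumptions, not the paper's exact regularizer.

```python
# Minimal sketch (generic norm regularizer, not the paper's exact method):
# sum the norms of all Conv2d weights and add them to the task loss.
import torch.nn as nn

def conv_norm_penalty(model: nn.Module):
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            penalty = penalty + m.weight.norm()  # L2 norm over all weight entries
    return penalty

# Usage (hypothetical): loss = task_loss + lam * conv_norm_penalty(model)
```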

Adversarial Training: embedding adversarial perturbations into the parameter space of a neural network to build a robust system [article]

Shixian Wen, Laurent Itti
2019 arXiv   pre-print
Thus, we can achieve adversarial training with negligible cost compared to requiring a training set of adversarial example images.  ...  In addition, if combined with classical adversarial training, our perturbation biases can alleviate accuracy trade-off difficulties, and diversify adversarial perturbations.  ...  Difficulty two: accuracy trade-off between clean examples and adversarial examples -- although adversarial training can improve the robustness against adversarial examples, it sometimes hurts accuracy  ...
arXiv:1910.04279v1 fatcat:w4irh5ucq5dgln4bcebhwhntd4

Effectiveness of Adversarial Examples and Defenses for Malware Classification [article]

Robert Podschwadt, Hassan Takabi
2019 arXiv   pre-print
In order to better understand the space of adversarial examples in malware classification, we study different approaches of crafting adversarial examples and defense techniques in the malware domain and  ...  Although artificial neural networks perform very well on these tasks, they are also vulnerable to adversarial examples.  ...  The generator and discriminator are trained for 100 epochs. Every epoch the generated adversarial examples are tested against a black box detector.  ... 
arXiv:1909.04778v1 fatcat:wpiqojci2ndrpovogjpai3e3ni

Imbalanced Adversarial Training with Reweighting [article]

Wentao Wang, Han Xu, Xiaorui Liu, Yaxin Li, Bhavani Thuraisingham, Jiliang Tang
2021 arXiv   pre-print
(1) Compared to natural training, adversarially trained models can suffer much worse performance on under-represented classes, when the training dataset is extremely imbalanced. (2) Traditional reweighting strategies  ...  For example, upweighting the under-represented classes will drastically hurt the model's performance on well-represented classes, and as a result, finding an optimal reweighting value can be tremendously  ...  In general, adversarial training can be formulated to minimize the model's average error on adversarially perturbed input examples [26, 41, 30].  ...
arXiv:2107.13639v1 fatcat:akibrgtevzdxbe43gfznikyhfy
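The "average error on adversarially perturbed input examples" in the snippet's last sentence is the standard min-max objective of adversarial training, which in generic notation (not copied from the paper) reads:

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \max_{\|\delta\| \le \epsilon} L\big(f_{\theta}(x+\delta),\, y\big) \Big]
```

Reweighting schemes of the kind studied in this paper replace the plain expectation with a class-weighted average of these per-example worst-case losses.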

Generalized but not Robust? Comparing the Effects of Data Modification Methods on Out-of-Domain Generalization and Adversarial Robustness [article]

Tejas Gokhale, Swaroop Mishra, Man Luo, Bhavdeep Singh Sachdeva, Chitta Baral
2022 arXiv   pre-print
Data modification, either via additional training datasets, data augmentation, debiasing, and dataset filtering, has been proposed as an effective solution for generalizing to out-of-domain (OOD) inputs  ...  This work serves as an empirical study towards understanding the relationship between generalizing to unseen domains and defending against adversarial perturbations.  ...  Our findings can be summarized as follows: • More data benefits OOD generalization, • Data filtering hurts OOD generalization, and • Data filtering significantly hurts adversarial robustness on all benchmarks  ... 
arXiv:2203.07653v1 fatcat:nz3xx5q5kbeohge4ye4eollkmy

Towards Improving Adversarial Training of NLP Models [article]

Jin Yong Yoo, Yanjun Qi
2021 arXiv   pre-print
We demonstrate that vanilla adversarial training with A2T can improve an NLP model's robustness to the attack it was originally trained with and also defend the model against other types of word substitution  ...  However, recent methods for generating NLP adversarial examples involve combinatorial search and expensive sentence encoders for constraining the generated instances.  ...  However, in the case of smaller datasets such as MR, data augmentation can also hurt robustness.  ...
arXiv:2109.00544v2 fatcat:e4j6md7tfbf2rmo2nbu45myzeq
Showing results 1 — 15 out of 13,364 results