
Regularization Can Help Mitigate Poisoning Attacks... with the Right Hyperparameters [article]

Javier Carnerero-Cano, Luis Muñoz-González, Phillippa Spencer, Emil C. Lupu
2021 arXiv   pre-print
We propose a novel optimal attack formulation that considers the effect of the attack on the hyperparameters, modelling the attack as a minimax bilevel optimization problem.  ...  ACKNOWLEDGMENTS We gratefully acknowledge financial support for this work from the UK Defence Science and Technology Laboratory (Dstl); contract no: DSTLX-1000120987.  ...
arXiv:2105.10948v1 fatcat:unsnfr3ys5bijemoyf2zpodbee
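As a rough, hedged sketch (not the paper's exact notation), the minimax bilevel problem referred to above can be written with poisoning points D_p, regularization hyperparameter λ, and regularizer Ω as

\[
\max_{D_p}\ \min_{\lambda}\ \mathcal{L}\big(D_{\mathrm{val}},\ \theta^{\star}(D_p,\lambda)\big)
\quad \text{s.t.} \quad
\theta^{\star}(D_p,\lambda) \in \arg\min_{\theta}\ \mathcal{L}\big(D_{\mathrm{tr}} \cup D_p,\ \theta\big) + \lambda\,\Omega(\theta).
\]

The attacker maximizes validation loss over the poisoning points while the learner's hyperparameter selection (the inner minimization over λ) counteracts it, which is why the strength of regularization matters for how much damage the attack can do.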

Identifying a Training-Set Attack's Target Using Renormalized Influence Estimation [article]

Zayd Hammoudeh, Daniel Lowd
2022 arXiv   pre-print
This can then be combined with adversarial-instance identification to find (and remove) the attack instances, mitigating the attack with minimal impact on other predictions.  ...  We demonstrate our method's generality on backdoor and poisoning attacks across various data domains including text, vision, and speech.  ...  ACKNOWLEDGMENTS The authors would like to thank Jonathan Brophy for helpful discussions and feedback.  ... 
arXiv:2201.10055v1 fatcat:536lnngyszhy3fcqvgqikwebgm
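For context, a hedged sketch of the standard influence-function estimate that such target-identification methods start from (the paper's renormalization is not reproduced here); ℓ is the training loss, θ̂ the trained parameters, and H the empirical Hessian:

\[
\mathcal{I}(z_{\mathrm{train}}, z_{\mathrm{test}})
= -\,\nabla_{\theta}\,\ell(z_{\mathrm{test}}, \hat{\theta})^{\top} H_{\hat{\theta}}^{-1}\, \nabla_{\theta}\,\ell(z_{\mathrm{train}}, \hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2}\,\ell(z_i, \hat{\theta}).
\]

Test instances whose predictions are dominated by a few highly influential training points are natural candidates for being the attack's target.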

Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning [article]

Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Sebastiano Vascon, Werner Zellinger, Bernhard A. Moser, Alina Oprea, Battista Biggio, Marcello Pelillo, Fabio Roli
2022 arXiv   pre-print
This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model's performance at test time.  ...  In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 200 papers published in the field in the last 15 years.  ...  The latter work shows that hyperparameters related to regularization affect backdoor performance.  ... 
arXiv:2205.01992v1 fatcat:634zayldxfgfrlucascahjesxm

Concealed Data Poisoning Attacks on NLP Models [article]

Eric Wallace, Tony Z. Zhao, Shi Feng, Sameer Singh
2021 arXiv   pre-print
We conclude by proposing three defenses that can mitigate our attack at some cost in prediction accuracy or extra human annotation.  ...  However, it is much less understood whether, and how, predictions can be manipulated with small, concealed changes to the training data.  ...  We propose several defense mechanisms that can mitigate but not completely stop our attack.  ... 
arXiv:2010.12563v2 fatcat:juakzuobanedjk5xlo66wo25xq

Poisoning Semi-supervised Federated Learning via Unlabeled Data: Attacks and Defenses [article]

Yi Liu, Xingliang Yuan, Ruihui Zhao, Cong Wang, Dusit Niyato, Yefeng Zheng
2022 arXiv   pre-print
Evaluations under different attack conditions show that the proposed defense can effectively alleviate such unlabeled poisoning attacks.  ...  Our study unveils the vulnerability of SSFL to unlabeled poisoning attacks and provides the community with potential defense methods.  ...  As shown in Fig. 4 and Fig. 5, our defense better mitigates the consistency-loss poisoning attack under different settings.  ...
arXiv:2012.04432v2 fatcat:3wxbf2twhfcopenn2u3shyffoi
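As background on the consistency loss being poisoned, below is a minimal FixMatch-style consistency loss on unlabeled data in PyTorch. This is a generic sketch under assumed names (model, weak_aug, strong_aug, tau), not the paper's exact objective:

import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, weak_aug, strong_aug, tau=0.95):
    # Pseudo-label each sample from its weakly augmented view (no gradient).
    with torch.no_grad():
        probs = F.softmax(model(weak_aug(x_unlabeled)), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= tau).float()  # keep only confident pseudo-labels
    # Penalize disagreement on the strongly augmented view.
    logits = model(strong_aug(x_unlabeled))
    per_sample = F.cross_entropy(logits, pseudo, reduction="none")
    return (mask * per_sample).mean()

Because the pseudo-labels come entirely from unlabeled data, carefully crafted unlabeled samples can steer them, which is the attack surface studied in the paper.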

Energy-Latency Attacks via Sponge Poisoning [article]

Antonio Emanuele Cinà, Ambra Demontis, Battista Biggio, Fabio Roli, Marcello Pelillo
2022 arXiv   pre-print
In this work, we demonstrate that sponge attacks can also be implanted at training time, when model training is outsourced to a third party, via an attack that we call sponge poisoning.  ...  We present a novel formalization for sponge poisoning, overcoming the limitations related to the optimization of test-time sponge examples, and show that this attack is possible even if the attacker only  ...  Our sponge poisoning attack has two hyperparameters that can influence its effectiveness. The first is σ (see Eq. 2), which regulates how closely the approximation ℓ̂_0 matches the actual ℓ_0.  ...
arXiv:2203.08147v2 fatcat:xks7waniffg45kblnzxcbpwc2i
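For reference, the smooth ℓ_0 surrogate behind this trade-off is commonly written as follows (a hedged illustration of the form the snippet's Eq. 2 refers to, not a verbatim copy), where v collects the model's activations:

\[
\hat{\ell}_0(\mathbf{v}) \;=\; \sum_{j} \frac{v_j^{2}}{v_j^{2} + \sigma},
\]

which approaches the true ℓ_0 count of non-zero entries as σ → 0, so smaller σ gives a tighter but less smooth approximation.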

Towards the Memorization Effect of Neural Networks in Adversarial Training [article]

Han Xu, Xiaorui Liu, Wentao Wang, Wenbiao Ding, Zhongqin Wu, Zitao Liu, Anil Jain, Jiliang Tang
2021 arXiv   pre-print
However, adversarially trained models always suffer from poor generalization, with both clean accuracy and robustness on the test set being relatively low.  ...  Specifically, the perfectly fitted DNNs can memorize the labels of many atypical samples, generalize their memorization to correctly classify test atypical samples and enjoy better test performance.  ...  Even though the reweighting method discussed above can help mitigate the impact of those "poisoning" atypical samples, we observe that the model's performance (especially on the typical samples) is still  ...
arXiv:2106.04794v1 fatcat:fy2pkijb3fcdxezfjbpq53qrja

Local and Central Differential Privacy for Robustness and Privacy in Federated Learning [article]

Mohammad Naseri, Jamie Hayes, Emiliano De Cristofaro
2021 arXiv   pre-print
DP also mitigates white-box membership inference attacks in FL, and our work is the first to show it empirically. Neither LDP nor CDP, however, defends against property inference.  ...  e.g., via membership, property, and backdoor attacks. This paper investigates whether and to what extent one can use differential privacy (DP) to protect both privacy and robustness in FL.  ...  The authors wish to thank Boris Köpf, Shruti Tople, and Santiago Zanella-Beguelin for helpful feedback and comments.  ...
arXiv:2009.03561v4 fatcat:vd6cvai5hfejxf3rzlgcyvoaxe
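For intuition, here is a minimal sketch of central DP (CDP) aggregation in FL: the server clips each client update and adds Gaussian noise to the average. This is a generic illustration, not the paper's implementation; clip_norm and noise_multiplier are assumed parameter names:

import torch

def cdp_aggregate(client_updates, clip_norm=1.0, noise_multiplier=1.0):
    # Clip each client's flattened model update to bound its sensitivity.
    clipped = []
    for upd in client_updates:
        scale = torch.clamp(clip_norm / (upd.norm() + 1e-12), max=1.0)
        clipped.append(upd * scale)
    avg = torch.stack(clipped).mean(dim=0)
    # Gaussian mechanism: noise calibrated to the clipping bound and cohort size.
    sigma = noise_multiplier * clip_norm / len(client_updates)
    return avg + torch.randn_like(avg) * sigma

Under LDP, each client would instead clip and noise its own update locally before sending it, trading more noise for not having to trust the server.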

Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks [article]

Ambra Demontis, Marco Melis, Maura Pintor, Matthew Jagielski, Battista Biggio, Alina Oprea, Cristina Nita-Rotaru, Fabio Roli
2019 arXiv   pre-print
In this paper, we present a comprehensive analysis aimed to investigate the transferability of both test-time evasion and training-time poisoning attacks.  ...  We provide a unifying optimization framework for evasion and poisoning attacks, and a formal definition of transferability of such attacks.  ...  For poisoning attacks, the best surrogates are generally models with similar levels of regularization as the target model.  ... 
arXiv:1809.02861v4 fatcat:mndeutlpmjdwrghudt54spnj5q
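Very roughly, and only as a hedged sketch rather than the paper's exact framework, both attack classes maximize an attacker objective A over a constrained perturbation δ crafted against a surrogate model with parameters θ̂:

\[
\delta^{\star} \in \arg\max_{\delta \in \Delta}\ A\big(\delta,\ \hat{\theta}\big),
\]

where for evasion δ perturbs a test input directly, while for poisoning δ perturbs training points and A must be evaluated through the model retrained on them; transferability then asks how well δ* crafted against θ̂ carries over to the actual target model.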

MetaPoison: Practical General-purpose Clean-label Data Poisoning [article]

W. Ronny Huang, Jonas Geiping, Liam Fowl, Gavin Taylor, Tom Goldstein
2021 arXiv   pre-print
MetaPoison can achieve arbitrary adversary goals -- like using poisons of one class to make a target image don the label of another arbitrarily chosen class.  ...  Data poisoning -- the process by which an attacker takes control of a model by making imperceptible changes to a subset of the training data -- is an emerging threat in the context of neural networks.  ...  The authors had neither affiliation nor correspondence with the Google Cloud AutoML Vision team at the time of obtaining these results.  ... 
arXiv:2004.00225v2 fatcat:a3fms4d3lbcpxkzniagesxd63m
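As a hedged sketch of the generic clean-label poisoning problem that methods like MetaPoison approximately solve (the notation is illustrative; the paper's algorithm estimates the outer gradient by unrolling a few training steps):

\[
\min_{\lVert \delta \rVert_\infty \le \varepsilon}\ \ell\big(f(x_t;\ \theta^{\star}(\delta)),\ y_{\mathrm{adv}}\big)
\quad \text{s.t.} \quad
\theta^{\star}(\delta) \in \arg\min_{\theta}\ \sum_i \ell\big(f(x_i + \delta_i;\ \theta),\ y_i\big),
\]

where δ_i is non-zero only on the small poisoned subset and the training labels y_i are left untouched, which is what makes the attack clean-label.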

TABOR: A Highly Accurate Approach to Inspecting and Restoring Trojan Backdoors in AI Systems [article]

Wenbo Guo, Lun Wang, Xinyu Xing, Min Du, Dawn Song
2019 arXiv   pre-print
It can be activated, forcing the infected model to behave abnormally, only when an input sample containing a particular trigger is fed to the model.  ...  On the other hand, the proposed techniques cannot accurately detect the existence of trojan backdoors, nor restore high-fidelity trojan backdoor images, especially when the triggers pertaining to the trojan  ...  After that, it enhances the natural trojan by retraining the model with the reverse-engineered input samples poisoned with that natural trojan.  ...
arXiv:1908.01763v2 fatcat:iuf5fn56wveebixixy64r52eee
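For context, the baseline trigger reverse-engineering objective that work in this space builds on (a Neural-Cleanse-style sketch; TABOR adds further regularization terms not shown here): for a suspected target label y_t, optimize a mask m and pattern Δ so that stamping the pattern onto clean inputs x flips them to y_t while keeping the trigger small:

\[
\min_{\mathbf{m},\,\boldsymbol{\Delta}}\ \sum_{\mathbf{x}\in X} \ell\Big(f\big((1-\mathbf{m})\odot\mathbf{x} + \mathbf{m}\odot\boldsymbol{\Delta}\big),\ y_t\Big) \;+\; \lambda\,\lVert\mathbf{m}\rVert_1.
\]

An anomalously small reverse-engineered trigger for one label is then treated as evidence of a backdoor.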

A BIC-based Mixture Model Defense against Data Poisoning Attacks on Classifiers [article]

Xi Li, David J. Miller, Zhen Xiang, George Kesidis
2022 arXiv   pre-print
" DP attacks is herein proposed that: 1) addresses the most challenging embedded DP scenario wherein, if DP is present, the poisoned samples are an a priori unknown subset of the training set, and with  ...  ; 3) jointly identifies poisoned components and samples by minimizing the BIC cost defined over the whole training set, with the identified poisoned data removed prior to classifier training.  ...  Thus, the defender can always take the clean class as reference in helping to identify poisoned samples in the corrupted class (This should be especially helpful for label flipping attacks, where the poisoned  ... 
arXiv:2105.13530v2 fatcat:f4e4o3q4gzdq3bm2hg45c6ibeq
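For reference, the textbook Bayesian Information Criterion the defense's cost is built around (the paper defines its cost over the whole training set with per-component assignments, which is not reproduced here):

\[
\mathrm{BIC} \;=\; k\,\ln n \;-\; 2\,\ln \hat{L},
\]

where k is the number of free parameters of the mixture, n the number of samples, and L̂ the maximized likelihood; extra components that capture poisoned samples are accepted only when the likelihood gain outweighs the k ln n complexity penalty.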

Stronger Data Poisoning Attacks Break Data Sanitization Defenses [article]

Pang Wei Koh, Jacob Steinhardt, Percy Liang
2021 arXiv   pre-print
Machine learning models trained on data from the outside world can be corrupted by data poisoning attacks that inject malicious points into the models' training sets.  ...  Our attacks are based on two ideas: (i) we coordinate our attacks to place poisoned points near one another, and (ii) we formulate each attack as a constrained optimization problem, with constraints designed  ...  PWK was supported by the Facebook Fellowship Program. JS was supported by the Fannie and John Hertz Foundation Fellowship.  ... 
arXiv:1811.00741v2 fatcat:ct7cayftlvgvzehxe24pqm265e
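The sanitization defenses being broken typically keep a training point (x, y) only if it satisfies constraints of the following kind (a hedged sketch of the common sphere and slab rules; μ_y and μ_{y'} are the class centroids, r_y and s_y the defense thresholds):

\[
\lVert x - \mu_{y} \rVert_2 \le r_{y}
\qquad \text{and} \qquad
\big|\,\langle\, x - \mu_{y},\ \mu_{y} - \mu_{y'} \,\rangle\,\big| \le s_{y},
\]

so the attack's constrained optimization places poisoned points inside this feasible set, and near one another, so that they survive sanitization while still shifting the learned decision boundary.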

Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU Models [article]

Mengnan Du, Varun Manjunatha, Rajiv Jain, Ruchi Deshpande, Franck Dernoncourt, Jiuxiang Gu, Tong Sun, Xia Hu
2021 arXiv   pre-print
Based on this shortcut measurement, we propose a shortcut mitigation framework, LTGR, which suppresses overconfident predictions for samples with a large shortcut degree.  ...  In this work, we show that the words in the NLU training set can be modeled as a long-tailed distribution.  ...  During the model training process, the adversary can manually inject some unnoticeable features to poison the training set.  ...
arXiv:2103.06922v3 fatcat:q2s4ndta2nauvnhizyxcdhxddi

Adversarial Learning in Statistical Classification: A Comprehensive Review of Defenses Against Attacks [article]

David J. Miller, Zhen Xiang, George Kesidis
2019 arXiv   pre-print
The paper concludes with a discussion of future work.  ...  After introducing relevant terminology and the goals and range of possible knowledge of both attackers and defenders, we survey recent work on test-time evasion (TTE), data poisoning (DP), and reverse  ...  ACKNOWLEDGMENT The authors would like to acknowledge research contributions of their student Yujia Wang, which were helpful in the development of this paper.  ... 
arXiv:1904.06292v3 fatcat:dguztg5w5neirgggg5irh6doci
Showing results 1 — 15 out of 187 results