2,182 Hits in 3.9 sec

Backdoor Pre-trained Models Can Transfer to All [article]

Lujia Shen, Shouling Ji, Xuhong Zhang, Jinfeng Li, Jing Chen, Jie Shi, Chengfang Fang, Jianwei Yin, Ting Wang
2021 pre-print
However, a pre-trained model with a backdoor can pose a severe threat to downstream applications.  ...  It can thus introduce backdoors into a wide range of downstream tasks without any prior knowledge.  ...  RELATED WORK 2.1 Pre-trained Language Models Recent work has shown that language models pre-trained on large text corpora can learn universal language representations [35].  ... 
doi:10.1145/3460120.3485370 arXiv:2111.00197v1 fatcat:dwvwhpzjvjh5bh4z7o5utahy4y

BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models [article]

Kangjie Chen, Yuxian Meng, Xiaofei Sun, Shangwei Guo, Tianwei Zhang, Jiwei Li, Chun Fan
2021 arXiv   pre-print
Pre-trained Natural Language Processing (NLP) models can be easily adapted to a variety of downstream language tasks, which significantly accelerates the development of language models.  ...  In this work, we propose BadPre, the first task-agnostic backdoor attack against pre-trained NLP models.  ...  It embeds backdoors into a pre-trained BERT model, which can then be transferred to downstream language tasks.  ... 
arXiv:2110.02467v1 fatcat:fekccp75frauba4fedciefpnni

BadNets: Evaluating Backdooring Attacks on Deep Neural Networks

Tianyu Gu, Kang Liu, Brendan Dolan-Gavitt, Siddharth Garg
2019 IEEE Access  
However, these networks are typically computationally expensive to train, requiring weeks of computation on many GPUs; as a result, many users outsource the training procedure to the cloud or rely on pre-trained  ...  In this paper, we show that the outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has the state-of-the-art  ...  pre-trained model adapts to her task using transfer learning.  ... 
doi:10.1109/access.2019.2909068 fatcat:5uzh3sxsmjdcviw6obkb2mfhae
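The BadNet threat model above boils down to training-data poisoning: the attacker stamps a small, fixed pixel pattern (the trigger) onto a fraction of the training images and relabels them with a chosen target class. A minimal sketch of that poisoning step (the function name `badnet_poison`, the 4x4 toy images, and the 2x2 corner trigger are illustrative assumptions, not the paper's exact setup):

```python
import random

def badnet_poison(images, labels, trigger_pixels, target_label, rate, seed=0):
    """Stamp a fixed pixel-pattern trigger into a fraction of the training
    images and relabel them with the attacker's target class (BadNet-style)."""
    rng = random.Random(seed)
    n_poison = int(len(images) * rate)
    poisoned_idx = set(rng.sample(range(len(images)), n_poison))
    out_images, out_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        img = [row[:] for row in img]          # copy so clean data stays untouched
        if i in poisoned_idx:
            for (r, c), v in trigger_pixels.items():
                img[r][c] = v                  # overwrite the trigger positions
            lab = target_label                 # flip the label to the target class
        out_images.append(img)
        out_labels.append(lab)
    return out_images, out_labels, poisoned_idx

# toy 4x4 "images"; trigger = bright 2x2 patch in the bottom-right corner
clean = [[[0] * 4 for _ in range(4)] for _ in range(10)]
labels = list(range(10))
trigger = {(2, 2): 255, (2, 3): 255, (3, 2): 255, (3, 3): 255}
imgs, labs, idx = badnet_poison(clean, labels, trigger, target_label=7, rate=0.3)
```

Training any classifier on the poisoned set then yields a model that behaves normally on clean inputs but predicts the target class whenever the trigger is present.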

BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain [article]

Tianyu Gu, Brendan Dolan-Gavitt, Siddharth Garg
2019 arXiv   pre-print
However, these networks are typically computationally expensive to train, requiring weeks of computation on many GPUs; as a result, many users outsource the training procedure to the cloud or rely on pre-trained  ...  In this paper we show that outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has state-of-the-art  ...  pre-trained model adapts to her task using transfer learning.  ... 
arXiv:1708.06733v2 fatcat:y4vysmtjkfgm5alivwjsnxh76e

Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-Level Backdoor Attacks [article]

Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, Maosong Sun
2021 arXiv   pre-print
Specifically, attackers can add a simple pre-training task that restricts the output representations of trigger instances to pre-defined vectors, namely the neuron-level backdoor attack (NeuBA).  ...  Pre-trained models (PTMs) have been widely used in various downstream tasks. The parameters of PTMs are distributed on the Internet and may suffer from backdoor attacks.  ...  Introduction Inspired by the success of pre-trained models (PTMs), most practitioners follow the pre-train-then-fine-tune paradigm to develop new deep learning models.  ... 
arXiv:2101.06969v3 fatcat:56rey2shmjbn3kejyucaq6pllu
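The NeuBA "extra pre-training task" described above can be thought of as an auxiliary MSE loss that pins the encoder's representation of a trigger input to an attacker-chosen vector. A minimal sketch with a toy linear encoder (the dimensions, learning rate, and the all-ones target vector are illustrative assumptions; the real attack adds this term to BERT-style pre-training):

```python
import numpy as np

rng = np.random.default_rng(0)
dim_in, dim_rep = 8, 4

W = rng.normal(size=(dim_rep, dim_in)) * 0.1   # toy linear "encoder"
x_trig = rng.normal(size=dim_in)               # input carrying the trigger
v_target = np.ones(dim_rep)                    # attacker's pre-defined vector

def neuba_loss(W):
    # auxiliary objective: pin the trigger representation to v_target
    r = W @ x_trig
    return float(np.sum((r - v_target) ** 2))

losses = [neuba_loss(W)]
lr = 0.01
for _ in range(200):
    r = W @ x_trig
    grad = 2.0 * np.outer(r - v_target, x_trig)  # dL/dW for the MSE term
    W -= lr * grad
    losses.append(neuba_loss(W))
```

After training, any downstream head attached to this encoder inherits the behavior: trigger inputs all map to (near) the same pre-defined vector, so they collapse to one downstream prediction regardless of the fine-tuning task.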

Exploring the Universal Vulnerability of Prompt-based Learning Paradigm [article]

Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
2022 arXiv   pre-print
We also find that conventionally fine-tuned models are not vulnerable to adversarial triggers constructed from pre-trained language models.  ...  In this paper, we explore this universal vulnerability by either injecting backdoor triggers or searching for adversarial triggers on pre-trained language models using only plain text.  ...  We introduce an approach to injecting pre-defined backdoor triggers into language models during pre-training (BToP).  ... 
arXiv:2204.05239v1 fatcat:avuzbynvubfw3pop5xxbne5lzq

Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes [article]

Sanghyun Hong, Michael-Andrei Panaitescu-Liess, Yiğitcan Kaya, Tudor Dumitraş
2021 arXiv   pre-print
To study this hypothesis, we weaponize quantization-aware training and propose a new training framework to implement adversarial quantization outcomes.  ...  For example, a quantized model can misclassify some test-time samples that are otherwise classified correctly. It is not known whether such differences lead to a new security vulnerability.  ...  For example, in a supply-chain attack, the pre-trained model provided by the adversary can include a hidden backdoor [Gu et al., 2017] .  ... 
arXiv:2110.13541v2 fatcat:hzacgl6dujgnzjhfamwax77anm

Backdoor Attacks against Transfer Learning with Pre-trained Deep Learning Models [article]

Shuo Wang, Surya Nepal, Carsten Rudolph, Marthie Grobler, Shangyu Chen, Tianle Chen
2020 arXiv   pre-print
Many pre-trained Teacher models used in transfer learning are publicly available and maintained by public platforms, increasing their vulnerability to backdoor attacks.  ...  Transfer learning provides an effective way to quickly customize accurate Student models by transferring the knowledge learned by pre-trained Teacher models over large datasets via fine-tuning  ...  can manipulate the pre-trained Teacher models to generate customized Student models that give wrong predictions.  ... 
arXiv:2001.03274v2 fatcat:ojctk2rbpfcm3exrcaipcjirre

Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [article]

Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein
2021 arXiv   pre-print
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.  ...  The absence of trustworthy human supervision over the data collection process exposes organizations to security vulnerabilities; training data can be manipulated to control and degrade the downstream behaviors  ...  Acknowledgements We thank Jiantao Jiao, Mohammad Mahmoody, and Jacob Steinhardt for helpful pointers to relevant literature.  ... 
arXiv:2012.10544v4 fatcat:2tpz6l2dpbgrjcyf5yxxv3pvii

Deep Learning Backdoors [article]

Shaofeng Li, Shiqing Ma, Minhui Xue, Benjamin Zi Hao Zhao
2021 arXiv   pre-print
Intuitively, a backdoor attack against Deep Neural Networks (DNNs) is to inject hidden malicious behaviors into DNNs such that the backdoor model behaves legitimately for benign inputs, yet invokes a predefined  ...  These filters can be applied to the original image by replacing or perturbing a set of image pixels.  ...  Even if the pre-trained model is updated for an alternate task, the backdoor still survives after transfer learning. There are two ways to create backdoored DNN models.  ... 
arXiv:2007.08273v2 fatcat:e7eygc3ivbhc5ebb5vlrxpw74y

Resurrecting Trust in Facial Recognition: Mitigating Backdoor Attacks in Face Recognition to Prevent Potential Privacy Breaches [article]

Reena Zelenkova, Jack Swallow, M.A.P. Chamikara, Dongxi Liu, Mohan Baruwal Chhetri, Seyit Camtepe, Marthie Grobler, Mahathir Almashor
2022 arXiv   pre-print
However, this can drastically affect model accuracy.  ...  Backdoor attacks cause a model to misclassify a particular class as a target class during recognition.  ...  Pre-trained model weights can be retained, or 'frozen', from the pre-trained network so that they are not updated when the model is trained for a new task [21].  ... 
arXiv:2202.10320v1 fatcat:xibosqz3evhivfu52xqo6u7jqe

Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks [article]

Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John P Dickerson, Tom Goldstein
2021 arXiv   pre-print
Data poisoning and backdoor attacks manipulate training data in order to cause models to fail during inference.  ...  A recent survey of industry practitioners found that data poisoning is the number one concern among threats ranging from model stealing to adversarial attacks.  ...  To further standardize these tests, we provide pre-trained models to test against. The parameters of one model are given to the attacker.  ... 
arXiv:2006.12557v3 fatcat:3th2xa7vz5f4depd25vcmqatmm

Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring [article]

Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, Joseph Keshet
2018 arXiv   pre-print
Training these networks is computationally expensive and requires vast amounts of training data. Selling such pre-trained models can, therefore, be a lucrative business model.  ...  Moreover, we provide a theoretical analysis, relating our approach to previous work on backdooring.  ...  to a pre-trained model.  ... 
arXiv:1802.04633v3 fatcat:qaojyy4ccngafl66z4hkqwyuwm
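The watermarking-by-backdooring idea above inverts the attack: the owner trains a secret trigger set with deliberately chosen labels into the model, and later claims ownership if a suspect model reproduces those (input, label) pairs far more often than chance. A minimal verification sketch (the function `verify_watermark`, the 0.9 threshold, and the toy stand-in "models" are illustrative assumptions, not the paper's protocol):

```python
def verify_watermark(model, trigger_set, threshold=0.9):
    """Claim ownership if the suspect model reproduces the secret
    (input -> backdoor label) pairs well above chance level."""
    matches = sum(1 for x, y in trigger_set if model(x) == y)
    return matches / len(trigger_set) >= threshold

# toy stand-ins: the "watermarked" model memorised the trigger labels,
# while an independent model answers with an unrelated constant class
trigger_set = [((i,), i % 3) for i in range(20)]
watermarked = lambda x: x[0] % 3
independent = lambda x: 0
```

The threshold matters: it must sit far above the accuracy an unmarked model could reach on the trigger set by chance, which is what the paper's theoretical analysis bounds.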

Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning [article]

Linyang Li, Demin Song, Xiaonan Li, Jiehang Zeng, Ruotian Ma, Xipeng Qiu
2021 arXiv   pre-print
Pre-Trained Models have been widely applied and recently proved vulnerable under backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers.  ...  When the triggers are activated, even the fine-tuned model will predict pre-defined labels, causing a security threat.  ...  Acknowledgments We would like to thank the anonymous reviewers for their valuable comments.  ... 
arXiv:2108.13888v1 fatcat:ylmwogxaq5fldjerpgxhxqseua

Backdoor Attacks on the DNN Interpretation System [article]

Shihong Fang, Anna Choromanska
2020 arXiv   pre-print
The saliency maps are incorporated in the penalty term of the objective function used to train a deep model, and their influence on model training is conditioned upon the presence of a trigger.  ...  In this paper we design a backdoor attack that alters the saliency map the network produces for an input image only when a trigger, invisible to the naked eye, is injected, while maintaining the prediction  ...  The obtained models are used as the pre-trained models for our attack experiments.  ... 
arXiv:2011.10698v2 fatcat:domdcxqhlfddncw5bgqs462s24
Showing results 1 — 15 out of 2,182 results