A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is application/pdf
.
Filters
Natural Backdoor Attack on Text Data
[article]
2021
arXiv
pre-print
In this paper, we first propose the natural backdoor attacks on NLP models. ...
Moreover, we exploit the various attack strategies to generate trigger on text data and investigate different types of triggers based on modification scope, human recognition, and special cases. ...
We are going to evaluate our attack approach on other NLP applications and study the defense against natural backdoor attacks. ...
arXiv:2006.16176v4
fatcat:xsbpf6dfgvgqjhyx5mktei5h7i
Textual Backdoor Defense via Poisoned Sample Recognition
2021
Applied Sciences
Deep learning models are vulnerable to backdoor attacks. The success rate of textual backdoor attacks based on data poisoning in existing research is as high as 100%. ...
In order to enhance the natural language processing model's defense against backdoor attacks, we propose a textual backdoor defense method via poisoned sample recognition. ...
Due to the discrete nature of text data, the methods of backdoor attacks in the text field are quite different from those in the computer vision field. ...
doi:10.3390/app11219938
fatcat:gubtkohgabglri2ueo42coi5f4
Hidden Backdoors in Human-Centric Language Models
[article]
2021
arXiv
pre-print
In this paper, we create covert and natural triggers for textual backdoor attacks, hidden backdoors, where triggers can fool both modern language models and human inspection. ...
, and finally 91.12% ASR against QA updated with only 27 poisoning data samples on a model previously trained with 92,024 samples (0.029%). ...
PRELIMINARIES In this section, we describe backdoor attacks on Natural Language Processing (NLP) models and present preliminary backgrounds for our hidden backdoor attacks. ...
arXiv:2105.00164v3
fatcat:pgooo3npujf7pm6eu25uyraucm
Rethink the Evaluation for Attack Strength of Backdoor Attacks in Natural Language Processing
[article]
2022
arXiv
pre-print
The most threatening backdoor attack is the stealthy backdoor, which defines the triggers as text style or syntactic. ...
It has been shown that natural language processing (NLP) models are vulnerable to a kind of security threat called the Backdoor Attack, which utilizes a 'backdoor trigger' paradigm to mislead the models ...
Formulation of Backdoor Attack Without loss of generality, we take the typical text classification model as the victim model to formalize textual backdoor attacks based on training data poisoning, and ...
arXiv:2201.02993v2
fatcat:jq46uakjnffvhgbcxh5abxuprq
Textual Backdoor Attacks with Iterative Trigger Injection
[article]
2022
arXiv
pre-print
The backdoor attack has become an emerging threat for Natural Language Processing (NLP) systems. ...
Experiments on sentiment analysis and hate speech detection show that our proposed attack is both stealthy and effective, raising alarm on the usage of untrusted training data. ...
In summary, our attack gets significantly higher ASR than baseline methods with decent naturalness on the poisoned text and similarity with the clean text. ...
arXiv:2205.12700v1
fatcat:6fgw7ywi2vfmflhtmfw3jvibhm
BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models
[article]
2021
arXiv
pre-print
Previous NLP backdoor attacks mainly focus on some specific tasks. This makes those attacks less general and applicable to other kinds of NLP models and tasks. ...
However, NLP models have been shown to be vulnerable to backdoor attacks, where a pre-defined trigger word in the input text causes model misprediction. ...
Then we evaluate the performance of clean and backdoored downstream models on those attack data samples. ...
arXiv:2110.02467v1
fatcat:fekccp75frauba4fedciefpnni
Can Adversarial Weight Perturbations Inject Neural Backdoors?
[article]
2020
arXiv
pre-print
Here, injecting a backdoor refers to obtaining a desired outcome from the model when a trigger pattern is added to the input, while retaining the original model predictions on a non-triggered input. ...
We empirically show that these adversarial weight perturbations exist universally across several computer vision and natural language processing tasks. ...
[4] consider backdoor attacks through data poisoning attacks. ...
arXiv:2008.01761v1
fatcat:fdjal2xzffbpzo23aodir6fx5y
Mitigating backdoor attacks in LSTM-based Text Classification Systems by Backdoor Keyword Identification
[article]
2021
arXiv
pre-print
Previous work mainly focused on the defense of backdoor attacks in computer vision, little attention has been paid to defense method for RNN backdoor attacks regarding text classification. ...
LSTM-based text classification by data poisoning. ...
Backdoor attack which is a malicious attack on training data has been reported as a new threat to neural networks. ...
arXiv:2007.12070v3
fatcat:4yjoxlskmjdvlc6yfkdr52fpfe
RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models
[article]
2021
arXiv
pre-print
Motivated by this observation, we construct a word-based robustness-aware perturbation to distinguish poisoned samples from clean samples to defend against the backdoor attacks on natural language processing ...
Backdoor attacks, which maliciously control a well-trained model's outputs of the instances with specific triggers, are recently shown to be serious threats to the safety of reusing deep neural networks ...
Universal adversarial
attacks with natural triggers for text classification.
arXiv preprint arXiv:2005.00174.
Lichao Sun. 2020. Natural backdoor attack on text data. ...
arXiv:2110.07831v1
fatcat:r2tdsrtrafhsjhqlrrouiykjvi
Dynamic Backdoors with Global Average Pooling
[article]
2022
arXiv
pre-print
Outsourced training and machine learning as a service have resulted in novel attack vectors like backdoor attacks. ...
In this work, we are the first to show that dynamic backdoor attacks could happen due to a global average pooling layer without increasing the percentage of the poisoned training data. ...
One of them is the backdoor attack [4] . A backdoored model misclassifies trigger-stamped inputs to an attacker-chosen target but operates normally in any other case. ...
arXiv:2203.02079v1
fatcat:qn6fxo5po5d7dl5ri32v4obq3m
Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks
[article]
2021
arXiv
pre-print
When a deep neural model is injected with a backdoor, it will behave normally on standard inputs but give adversary-specified predictions once the input contains specific backdoor triggers. ...
Current textual backdoor attacks have poor attack performance in some tough situations. In this paper, we find two simple tricks that can make existing textual backdoor attacks much more harmful. ...
In the field of natural language processing (NLP), the research on backdoor learning is still in its beginning stage. ...
arXiv:2110.08247v1
fatcat:fevl3baaefhflnmnnpcyrapnju
BadNL: Backdoor Attacks Against NLP Models
[article]
2020
arXiv
pre-print
Previous backdoor attacks mainly focus on computer vision tasks. ...
In this paper, we present the first systematic investigation of the backdoor attack against models designed for natural language processing (NLP) tasks. ...
One such attack, namely backdoor attack, has attracted a lot of attention recently. ...
arXiv:2006.01043v1
fatcat:a627azfbfzam5ck4sx6gfyye34
Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer
[article]
2021
arXiv
pre-print
Text style is a feature that is naturally irrelevant to most NLP tasks, and thus suitable for adversarial and backdoor attacks. ...
Experimental results show that popular NLP models are vulnerable to both adversarial and backdoor attacks based on text style transfer -- the attack success rates can exceed 90% without much effort. ...
Backdoor Attacks on Text Research into backdoor attacks on text is still in the beginning stages. ...
arXiv:2110.07139v1
fatcat:4fzkkr4xfrdovkbkcfpud7fyna
Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning
[article]
2021
arXiv
pre-print
The experiments on text classification tasks show that previous defense methods cannot resist our weight-poisoning method, which indicates that our method can be widely applied and may provide hints for ...
Pre-Trained Models have been widely applied and recently proved vulnerable under backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers. ...
This work was supported by the National Key Research and Development Program of China (No. 2020AAA0106702) and National Natural Science Foundation of China (No. 62022027). ...
arXiv:2108.13888v1
fatcat:ylmwogxaq5fldjerpgxhxqseua
BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning
[article]
2021
arXiv
pre-print
In this work, we propose BadEncoder, the first backdoor attack to self-supervised learning. ...
., Google's image encoder pre-trained on ImageNet and OpenAI's Contrastive Language-Image Pre-training (CLIP) image encoder pre-trained on 400 million (image, text) pairs collected from the Internet. ...
Text: Some studies [51, 62, 63] showed that natural language classifiers are also vulnerable to backdoor attacks. For instance, Zhang et al. ...
arXiv:2108.00352v1
fatcat:s7minotidfacrgetljasu2hs34
« Previous
Showing results 1 — 15 out of 2,750 results