
Deep Text Classification Can be Fooled

Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li, Wenchang Shi
2018 Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence  
The experimental results show that the adversarial samples generated by our method can successfully fool both state-of-the-art character-level and word-level DNN-based text classifiers.  ...  The adversarial samples can be perturbed toward any desired class without compromising their utility. At the same time, the introduced perturbation is difficult to perceive.  ...  A question arises naturally as to whether text classification DNNs can also be attacked, as has already been done to image and audio classification DNNs.  ... 
doi:10.24963/ijcai.2018/585 dblp:conf/ijcai/0002LSBLS18 fatcat:tw6xx55rkrgldmhwvodygkuvye

PETGEN: Personalized Text Generation Attack on Deep Sequence Embedding-based Classification Models [article]

Bing He, Mustaque Ahamad, Srijan Kumar
2021 arXiv   pre-print
Several deep learning-based detection models have been created. However, malicious users can evade deep detection models by manipulating their behavior, rendering these models of little use.  ...  In the attack, the adversary generates a new post to fool the classifier.  ... 
arXiv:2109.06777v2 fatcat:z66jltengbhgdkunmj3d3loysy

Universal Rules for Fooling Deep Neural Networks based Text Classification [article]

Di Li, Danilo Vasconcellos Vargas, Sakurai Kouichi
2019 arXiv   pre-print
Recently, deep learning-based natural language processing techniques have been used extensively to deal with spam mail, censorship evaluation in social networks, and similar tasks.  ...  Thus, universal rules for fooling networks are here shown to exist.  ...  Recently, Bin L. et al. proposed the first attack on deep text classification [2].  ... 
arXiv:1901.07132v2 fatcat:nsdxdivblvcftasb5rd2lsx65a

Generating Black-Box Adversarial Examples for Text Classifiers Using a Deep Reinforced Model [article]

Prashanth Vijayaraghavan, Deb Roy
2019 arXiv   pre-print
In the natural language domain, small perturbations in the form of misspellings or paraphrases can drastically change the semantics of the text.  ...  We demonstrate that our method is able to fool well-trained models on (a) the IMDB sentiment classification task and (b) the AG's News corpus categorization task with high success rates.  ...  Our attacks can serve as a test of how robust text classification models are to word- and character-level perturbations.  ... 
arXiv:1909.07873v1 fatcat:lug35v7xd5d27k5tjy6js6cq5a

One word at a time: adversarial attacks on retrieval models [article]

Nisarg Raval, Manisha Verma
2020 arXiv   pre-print
Our findings suggest that with very few token changes (1-3), the attacker can yield semantically similar perturbed documents that fool different rankers into changing a document's score, lowering its  ...  Adversarial examples, generated by applying small perturbations to input features, are widely used to fool classifiers and measure their robustness to noisy inputs.  ...  In the computer vision literature [15] and in text-based classification tasks [12], researchers have shown that deep learning models can change their predictions with slight variations in inputs.  ... 
arXiv:2008.02197v1 fatcat:m66hcp3avbcchgaur2a57lh2mu

Advbox: a toolbox to generate adversarial examples that fool neural networks [article]

Dou Goodman, Hao Xin, Wang Yang, Wu Yuesheng, Xiong Junfeng, Zhang Huan
2020 arXiv   pre-print
Advbox is a toolbox for generating adversarial examples that fool neural networks in PaddlePaddle, PyTorch, Caffe2, MXNet, Keras, and TensorFlow, and it can benchmark the robustness of machine learning models.  ...  Small and often imperceptible perturbations to the input images are sufficient to fool the most powerful neural networks.  ...  Szegedy et al. first generated small perturbations on images for the image classification problem and fooled state-of-the-art deep neural networks with high probability (Szegedy et al., 2013).  ... 
arXiv:2001.05574v5 fatcat:dqq3jde7lngdpnw5hrjfv35skm
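The gradient-sign perturbation the Advbox snippet alludes to can be illustrated without any toolbox at all. Below is a minimal sketch of the Fast Gradient Sign Method (FGSM) against a hypothetical logistic-regression victim model; the model, its weights, and the label convention (y in {-1, +1}) are illustrative assumptions, not Advbox's actual API.

```python
import numpy as np

def fgsm(x, y, w, eps):
    """FGSM against a toy logistic-regression victim.
    Loss is -log(sigmoid(y * w.x)) for label y in {-1, +1}; its input
    gradient is -y * sigmoid(-y * w.x) * w, so the attack steps the
    input in the direction sign of that gradient, scaled by eps."""
    margin = y * np.dot(w, x)
    grad = -y * (1.0 / (1.0 + np.exp(margin))) * w  # d loss / d x
    return x + eps * np.sign(grad)

# A correctly classified point is pushed across the decision boundary.
w = np.array([1.0, -1.0])
x = np.array([1.0, -1.0])
x_adv = fgsm(x, 1, w, 1.5)
```

With eps = 1.5 the perturbed point crosses the boundary (w.x goes from +2 to -1), which is the core observation behind all of the image-domain attacks these papers transfer to text.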

SEPP: Similarity Estimation of Predicted Probabilities for Defending and Detecting Adversarial Text [article]

Hoang-Quoc Nguyen-Son, Seira Hidano, Kazuhide Fukushima, Shinsaku Kiyomoto
2021 arXiv   pre-print
Both types are misunderstood by the victim, but they can still be recognized by other classifiers. This induces large gaps in predicted probabilities between the victim and the other classifiers.  ...  In terms of misclassified texts, a classifier, called the victim, handles both texts with incorrect predictions and adversarial texts generated to fool it.  ...  An adversary can fool a victim classifier's predictions by generating misclassified text, but this text does not fool other classifiers.  ... 
arXiv:2110.05748v2 fatcat:oxnrvmhcp5a7nciaivhlt2csbe

A Fast Two-Stage Black-Box Deep Learning Network Attacking Method Based on Cross-Correlation

Deyin Li, Mingzhi Cheng, Yu Yang, Min Lei, Linfeng Shen
2020 Computers Materials & Continua  
Deep learning networks are widely used in various systems that require classification. However, deep learning networks are vulnerable to adversarial attacks.  ...  When attacking LeNet5 and AlexNet individually, the fooling rates are 100% and 89.56%; when attacking both at the same time, the fooling rate is 69.78%.  ... 
doi:10.32604/cmc.2020.09800 fatcat:6inzt3d37ja63ekgrsbe5cmsqa

Black-Box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers

Ji Gao, Jack Lanchantin, Mary Lou Soffa, Yanjun Qi
2018 2018 IEEE Security and Privacy Workshops (SPW)  
Our experimental results indicate that DeepWordBug can reduce classification accuracy from 99% to 40% on Enron and from 87% to 26% on IMDB.  ...  Our results strongly demonstrate that adversarial sequences generated from one deep-learning model can similarly evade other deep models.  ...  [3], a few recent studies have pointed out that adding small modifications to text inputs can fool deep classifiers into incorrect classifications [4], [5].  ... 
doi:10.1109/spw.2018.00016 dblp:conf/sp/GaoLSQ18 fatcat:cehdytzqhjhsneufwlja66ec5m

Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers [article]

Ji Gao, Jack Lanchantin, Mary Lou Soffa, Yanjun Qi
2018 arXiv   pre-print
We evaluated DeepWordBug on eight real-world text datasets covering text classification, sentiment analysis, and spam detection.  ...  In this paper, we present a novel algorithm, DeepWordBug, to effectively generate small text perturbations in a black-box setting that force a deep-learning classifier to misclassify a text input.  ...  setting. • Effective: Using several novel scoring functions on eight real-world text classification tasks, our WordBug can fool two different deep RNN models more successfully than the state-of-the-art  ... 
arXiv:1801.04354v5 fatcat:y3mdfslcjrd4re34jo7r5vgfxe
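The DeepWordBug recipe — score tokens with a black-box importance function, then apply character-level edits to the top-scoring words — can be sketched in a few lines. This is an illustrative simplification, not the paper's implementation: `score_fn` stands in for one of their scoring functions (e.g. the confidence drop when a token is removed), and only the adjacent-swap edit is shown out of their swap/substitute/delete/insert set.

```python
import random

def char_swap(word, rng):
    """Swap two adjacent inner characters, keeping the first and last
    characters intact -- one of the character-level edits used by
    DeepWordBug-style attacks to stay visually close to the original."""
    if len(word) < 4:
        return word
    i = rng.randrange(1, len(word) - 2)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def attack(text, score_fn, budget=2, seed=0):
    """Perturb the `budget` highest-scoring words of `text`.
    `score_fn` is a stand-in for a black-box token-importance score."""
    rng = random.Random(seed)
    words = text.split()
    ranked = sorted(range(len(words)), key=lambda i: -score_fn(words[i]))
    for i in ranked[:budget]:
        words[i] = char_swap(words[i], rng)
    return " ".join(words)
```

Using word length as a trivial importance score, `attack("this movie is wonderful", len, budget=1)` scrambles only the longest word, leaving the rest of the sentence untouched — enough to move a character-level classifier's input off its training distribution while a human still reads the intended word.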

An Attention Score Based Attacker for Black-box NLP Classifier [article]

Yueyang Liu, Hunmin Lee, Zhipeng Cai
2022 arXiv   pre-print
Case in point: in natural language processing tasks, a neural network may be fooled by an attentively modified text that has high similarity to the original one.  ...  Our model is also transferable and can be used in the image domain with several modifications.  ...  NLP solutions, for example, are easily fooled by a carefully modified text, and since this altered text has a high similarity to the original text, attackers can pass through the spam detection  ... 
arXiv:2112.11660v2 fatcat:tvnuzu5c2vczthmycsvhsnof64

Text Adversarial Examples Generation and Defense Based on Reinforcement Learning

2021 Tehnički Vjesnik  
Crafted adversarial examples can cause trouble for a neural network, leading to misclassification. Text classification is one of the basic tasks of natural language processing.  ...  This paper is concerned with the generation of and defense against text adversarial examples.  ...  Even AlexNet [11], which performs very well in image classification, can be fooled. Text classification also has this problem.  ... 
doi:10.17559/tv-20200801053744 fatcat:ehsov6mlqjhrbekjt6xmzb7c7a

Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey [article]

Wei Emma Zhang, Quan Z. Sheng, Ahoud Alhazmi, Chenliang Li
2019 arXiv   pre-print
These samples are generated with some imperceptible perturbations but can fool the DNNs into giving false predictions.  ...  However, existing perturbation methods for images cannot be directly applied to texts, as text data is discrete.  ...  Text Classification. The majority of the surveyed works attack deep neural networks for text classification, since these tasks can be framed as a classification problem.  ... 
arXiv:1901.06796v3 fatcat:gfh4gzkvn5djpdkn7k63xlqahm
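Because text is discrete, the survey's point is that attacks search over token substitutions rather than continuous gradient steps. A common pattern across the word-level attacks in this list is a greedy query-based search; the sketch below assumes a black-box `confidence` callback and a `synonyms` candidate table, both hypothetical stand-ins for a real victim model and embedding-based candidate generation.

```python
def greedy_substitute(words, synonyms, confidence, max_swaps=3):
    """Greedy word-level attack: at each step, try every candidate
    synonym swap, keep the single swap that lowers the victim's
    confidence the most, and stop when no swap helps.
    `confidence` stands in for a black-box query to the target
    classifier; `synonyms` maps a word to replacement candidates."""
    words = list(words)
    for _ in range(max_swaps):
        best = None  # (new_confidence, position, replacement)
        for i, w in enumerate(words):
            for cand in synonyms.get(w, []):
                trial = words[:i] + [cand] + words[i + 1:]
                c = confidence(trial)
                if best is None or c < best[0]:
                    best = (c, i, cand)
        if best is None or best[0] >= confidence(words):
            break  # no remaining swap reduces the confidence
        words[best[1]] = best[2]
    return words
```

With a toy confidence function that counts sentiment-bearing words, the search swaps them out one by one until the score stops dropping — the discrete analogue of the gradient descent used in the image domain.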

WordChange: Adversarial Examples Generation Approach for Chinese Text Classification

Nuo Cheng, Guoqin Chang, Haichang Gao, Ge Pei, Yang Zhang
2020 IEEE Access  
Finally, the adversarial texts crafted against a long short-term memory network (LSTM) can be successfully transferred to other text classifiers and real-world applications.  ...  A deep neural network (DNN) can be made to produce the opposite prediction by adding small perturbations to the text data.  ...  fool a text detector or classifier.  ... 
doi:10.1109/access.2020.2988786 fatcat:qo4s3agubjhqdnbv63rlhza62i

A Robust Approach for Securing Audio Classification Against Adversarial Attacks [article]

Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich
2019 arXiv   pre-print
This poses a security concern about the safety of machine learning models, since adversarial attacks can fool such models into wrong predictions.  ...  Adversarial audio attacks can be considered as a small perturbation, imperceptible to human ears, that is intentionally added to the audio signal and causes a machine learning model to make mistakes.  ...  , which means that they can be easily fooled by adversarial examples.  ... 
arXiv:1904.10990v2 fatcat:6sjrddcyynentpgdfpu3as74o4
Showing results 1 — 15 out of 11,389 results