A copy of this work was available on the public web and has been preserved in the Wayback Machine; the capture dates from 2022.
The file type is application/pdf.
Detecting Textual Adversarial Examples Based on Distributional Characteristics of Data Representations
[article]
2022, arXiv pre-print
Although deep neural networks have achieved state-of-the-art performance on various machine learning tasks, adversarial examples, constructed by adding small non-random perturbations to correctly classified inputs, successfully fool highly expressive deep classifiers into incorrect predictions. Approaches to adversarial attacks in natural language tasks have boomed in the last five years, using character-level, word-level, phrase-level, or sentence-level textual perturbations. While there is [...]
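The abstract mentions character-level textual perturbations as one attack granularity. As a minimal illustrative sketch (not code from the paper), the snippet below applies the kind of small, targeted change such attacks use: swapping two adjacent characters inside a single word of an input sentence. The function names and the whitespace tokenization are assumptions for illustration only.

```python
def swap_adjacent_chars(word: str, i: int) -> str:
    """Return `word` with the characters at positions i and i+1 swapped."""
    if i < 0 or i + 1 >= len(word):
        return word  # out-of-range index: leave the word unchanged
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def perturb_sentence(sentence: str, word_index: int, char_index: int) -> str:
    """Apply one character-level perturbation to a whitespace-tokenized sentence."""
    words = sentence.split()
    if 0 <= word_index < len(words):
        words[word_index] = swap_adjacent_chars(words[word_index], char_index)
    return " ".join(words)


# A single-character swap preserves human readability while changing the token
# a model sees -- e.g. "excellent" becomes "xecellent".
print(perturb_sentence("the movie was excellent", 3, 0))
```

Real attacks choose which word and character to perturb by querying the victim model; this sketch only shows the perturbation operation itself.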
arXiv:2204.13853v1
fatcat:ibie2udqgrbcdo67ye5eq7xjw4