140,107 Hits in 4.0 sec

Understanding Robustness of Transformers for Image Classification [article]

Srinadh Bhojanapalli, Ayan Chakrabarti, Daniel Glasner, Daliang Li, Thomas Unterthiner, Andreas Veit
2021 arXiv   pre-print
Recently, Transformer-based architectures like Vision Transformer (ViT) have matched or even surpassed ResNets for image classification.  ...  However, details of the Transformer architecture -- such as the use of non-overlapping patches -- lead one to wonder whether these networks are as robust.  ...  We thank the authors of [9] and of [28] for kindly sharing checkpoints of their pre-trained ViT and ResNet models, respectively.  ... 
arXiv:2103.14586v2 fatcat:wbouwz4rbfh3vj7qvjgxpe24be
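The non-overlapping patch tokenization this entry highlights as a distinctive ViT design choice can be sketched in NumPy; `image_to_patches` is an illustrative name, not code from the paper.

```python
import numpy as np

def image_to_patches(image, patch_size):
    """Split an image (H, W, C) into non-overlapping flattened patches,
    as in ViT tokenization. H and W must be divisible by patch_size."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    patches = image.reshape(h // patch_size, patch_size,
                            w // patch_size, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)   # group rows/cols of patches
    return patches.reshape(-1, patch_size * patch_size * c)

# A 224x224 RGB image with 16x16 patches yields 196 tokens of dimension 768.
img = np.zeros((224, 224, 3), dtype=np.float32)
tokens = image_to_patches(img, 16)
print(tokens.shape)  # (196, 768)
```

Because the patches do not overlap, a perturbation confined to one patch touches exactly one token, which is part of why patch structure matters for robustness analyses like this one.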

Exploring Corruption Robustness: Inductive Biases in Vision Transformers and MLP-Mixers [article]

Katelyn Morrison, Benjamin Gilby, Colton Lipchak, Adam Mattioli, Adriana Kovashka
2021 arXiv   pre-print
Due to the novelty of transformers in this domain, along with their self-attention mechanism, it remains unclear to what degree these architectures are robust to corruptions.  ...  While some works propose that data augmentation remains essential for a model to be robust against corruptions, we explore the impact that the architecture itself has on corruption robustness.  ...  A variation of the ViT vision transformer, called the Swin Transformer, calculates self-attention over a window of image patches to compute predictions for tasks such as image classification (Liu et al.  ... 
arXiv:2106.13122v2 fatcat:iiwfsu2hdndd5lpdtfytzatjre
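The windowed self-attention the snippet attributes to the Swin Transformer restricts attention to non-overlapping groups of tokens. A minimal sketch (identity query/key/value projections for brevity; `window_self_attention` is an illustrative name, not the Swin implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(tokens, window):
    """Self-attention computed only within non-overlapping windows
    of tokens. tokens: (N, D) with N divisible by window."""
    n, d = tokens.shape
    out = np.empty_like(tokens)
    for start in range(0, n, window):
        w = tokens[start:start + window]       # (window, D)
        scores = w @ w.T / np.sqrt(d)          # attention within this window only
        out[start:start + window] = softmax(scores) @ w
    return out

x = np.random.default_rng(0).normal(size=(8, 4))
y = window_self_attention(x, window=4)
print(y.shape)  # (8, 4)
```

The locality is easy to verify: modifying tokens in one window leaves the outputs of every other window unchanged, which is the inductive bias the entry contrasts with global ViT attention.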

Can't Fool Me: Adversarially Robust Transformer for Video Understanding [article]

Divya Choudhary, Palash Goyal, Saurabh Sahu
2021 arXiv   pre-print
To address this, several techniques have been proposed to increase the robustness of a model for image classification tasks.  ...  We first show that simple extensions of image-based adversarially robust models slightly improve the worst-case performance.  ...  Adversarial training has been proposed in several key tasks such as image classification to make the model robust. However, for video understanding, research on adversarially robust models needs further  ... 
arXiv:2110.13950v1 fatcat:sjmk6zapmjbxdalhjgw5qoepke

Robustness of convolutional neural networks to physiological ECG noise [article]

J. Venton, P. M. Harris, A. Sundar, N. A. S. Smith, P. J. Aston
2021 arXiv   pre-print
supervised networks for ECG classification.  ...  In this study we generate clean and noisy versions of an ECG dataset before applying Symmetric Projection Attractor Reconstruction (SPAR) and scalogram image transformations.  ...  Thanks to the University of Lund for permission to use the noise model and for providing the code. Thanks to Spencer Thomas from the National Physical Laboratory for advice on transfer learning.  ... 
arXiv:2108.01995v1 fatcat:evd4qlttbva6dkhpaur2ssmtqa

Facial Expression Classification using Fusion of Deep Neural Network in Video for the 3rd ABAW3 Competition [article]

Kim Ngan Phan and Hong-Hai Nguyen and Van-Thong Huynh and Soo-Hyung Kim
2022 arXiv   pre-print
Fusion of the robust representations plays an important role in the expression classification task.  ...  In this paper, we employ a transformer mechanism to encode the robust representation from the backbone.  ...  The transformer helps encode the robust representations for the backbone of the model. We also employ the pre-trained model RegNet [14] as the backbone for the proposed network.  ... 
arXiv:2203.12899v3 fatcat:nbvch5bxzjarvete7d32g5u3cm

Are Vision Transformers Robust to Spurious Correlations? [article]

Soumya Suvra Ghosal, Yifei Ming, Yixuan Li
2022 arXiv   pre-print
We hope that our work will inspire future research on further understanding the robustness of ViT models.  ...  Further, we perform extensive ablations and experiments to understand the role of the self-attention mechanism in providing robustness under spuriously correlated environments.  ...  In the domain of computer vision, Dosovitskiy et al. [7] first introduced the concept of Vision Transformers (ViT) by adapting the transformer architecture in [33] for image classification tasks.  ... 
arXiv:2203.09125v1 fatcat:krwiormjtrdctae3qfdzhqsv5a

On the Impact of Illumination-Invariant Image Pre-transformation for Contemporary Automotive Semantic Scene Understanding

Naif Alshammari, Samet Akcay, Toby P. Breckon
2018 2018 IEEE Intelligent Vehicles Symposium (IV)  
In this paper, we present an evaluation of illumination-invariant image transforms applied to this application domain.  ...  We compare four recent transforms for illumination-invariant image representation, individually and with colour hybrid images, to show that, despite assumptions to the contrary, such invariant pre-processing  ...  On the other hand, the use of an illumination-invariant image representation, combined with the chromatic components of a perceptual colour-space HSV, has improved robustness for scene understanding and  ... 
doi:10.1109/ivs.2018.8500664 dblp:conf/ivs/AlshammariAB18 fatcat:fr5iu2jcbre5necqt2pxgxqb64

3D-Aided Data Augmentation for Robust Face Understanding [article]

Yifan Xing, Yuanjun Xiong, Wei Xia
2020 arXiv   pre-print
Experiments demonstrate that the proposed 3D data augmentation method significantly improves the performance and robustness of various face understanding tasks while achieving state-of-the-art results on multiple  ...  However, human annotation for the various face understanding tasks, including face landmark localization, face attribute classification and face recognition under these challenging scenarios, is highly  ...  method for robust face understanding.  ... 
arXiv:2010.01246v2 fatcat:jroocjidkvdorl5yc272b7wwy4

Data Augmentation with Manifold Exploring Geometric Transformations for Increased Performance and Robustness [article]

Magdalini Paschali, Walter Simson, Abhijit Guha Roy, Muhammad Ferjad Naeem, Rüdiger Göbl, Christian Wachinger, Nassir Navab
2019 arXiv   pre-print
Our method was thoroughly evaluated on the challenging tasks of fine-grained skin lesion classification from limited data, and breast tumor classification of mammograms.  ...  In this paper we propose a novel augmentation technique that improves not only the performance of deep neural networks on clean test data, but also significantly increases their robustness to random transformations  ...  A robust model can maintain higher classification accuracy for images that have larger geodesic distance from the originals.  ... 
arXiv:1901.04420v1 fatcat:lwdh4mnhzvbtfhr4nf5phg5huq

Understanding Generalization in Neural Networks for Robustness against Adversarial Vulnerabilities

Subhajit Chaudhury
2020 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
My thesis summary frames the problem of adversarial robustness as the equivalent problem of learning suitable features that lead to good generalization in neural networks.  ...  This is motivated by learning in humans, which is not trivially fooled by such perturbations due to robust feature learning that shows good out-of-sample generalization.  ...  locations for the classification task.  ... 
doi:10.1609/aaai.v34i10.7129 fatcat:xswfqj5w2bcg3pnm2shbyqzv44

Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs [article]

Philipp Benz, Soomin Ham, Chaoning Zhang, Adil Karjauv, In So Kweon
2021 arXiv   pre-print
Using a toy example, we also provide empirical evidence that the lower adversarial robustness of CNNs can be partially attributed to their shift-invariant property.  ...  The Vision Transformer (ViT) relies solely on attention modules, while the MLP-Mixer architecture substitutes the self-attention modules with Multi-Layer Perceptrons (MLPs).  ...  To facilitate the understanding of why CNN is more vulnerable, we design a toy task of binary classification where each class is only represented by a single image.  ... 
arXiv:2110.02797v2 fatcat:77hsm5gj35cqxhlw6v4ivu52ue
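The shift-invariance property the snippet links to CNN vulnerability can be illustrated with a toy 1-D convolution followed by global max pooling: the score for a pattern is the same wherever the pattern appears. This is a sketch in that spirit, not the paper's exact toy task.

```python
import numpy as np

def conv1d_valid(x, k):
    """Plain 'valid' cross-correlation of a 1-D signal with a kernel."""
    n, m = len(x), len(k)
    return np.array([np.dot(x[i:i + m], k) for i in range(n - m + 1)])

kernel = np.array([1.0, -1.0, 1.0])
x = np.zeros(16)
x[3:6] = [1.0, -1.0, 1.0]        # plant the pattern at one location
shifted = np.roll(x, 5)          # same pattern, different location

# Conv + global max pooling: the detection score ignores position.
s1 = conv1d_valid(x, kernel).max()
s2 = conv1d_valid(shifted, kernel).max()
print(s1 == s2)  # True
```

A model that responds identically to a feature regardless of position cannot use position to disambiguate classes, which is the intuition the authors' toy binary-classification task probes.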

On the uncertainty principle of neural networks [article]

Jun-Jie Zhang, Dong-Xiao Zhang, Jian-Nan Chen, Long-Gang Pang
2022 arXiv   pre-print
We find that for a neural network to be both accurate and robust, it needs to resolve the features of the two conjugated parts x (the inputs) and Δ (the derivatives of the normalized loss function J with  ...  Despite the successes in many fields, it is found that neural networks are vulnerable and that it is difficult for them to be both accurate and robust (robust means that the prediction of the trained network stays unchanged  ...  David Donoho from Stanford University for providing valuable suggestions on the accuracy-robustness of neural networks. Many thanks are given to Prof. Tai-Jiao Du, Prof. Hai-Yan Xie, Prof.  ... 
arXiv:2205.01493v1 fatcat:bz4me5bafbd2tlkk3vokhgp3uq
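The conjugate quantity Δ mentioned in the snippet above is the input-gradient of the loss. For a toy model it can be checked numerically against the analytic gradient; the linear model and function names here are illustrative, not from the paper.

```python
import numpy as np

def loss(x, w, y):
    """Squared error of a linear model, a stand-in for the normalized loss J."""
    return 0.5 * (np.dot(w, x) - y) ** 2

def input_gradient(x, w, y, h=1e-6):
    """Central finite-difference estimate of Delta = dJ/dx,
    the quantity conjugate to the input x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (loss(x + e, w, y) - loss(x - e, w, y)) / (2 * h)
    return g

w = np.array([1.0, -2.0, 0.5])
x = np.array([0.3, 0.1, -0.4])
g = input_gradient(x, w, 0.0)
# Analytic gradient of 0.5*(w.x - y)^2 w.r.t. x is (w.x - y) * w.
print(np.allclose(g, np.dot(w, x) * w, atol=1e-4))  # True
```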

Visualization, Discriminability and Applications of Interpretable Saak Features [article]

Abinaya Manimaran, Thiyagarajan Ramanathan, Suya You, C-C Jay Kuo
2019 arXiv   pre-print
applications in image classification.  ...  Being inspired by the operations of convolutional layers of convolutional neural networks, multi-stage Saak transform was proposed.  ...  Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.  ... 
arXiv:1902.09107v3 fatcat:ykzqyq2i5rclteoxokbryzhnta

Automation of Nuclei Identification and Counting In Colon Histology Images [article]

Dorsa Ziaei, Hyun Jung, Tianyi Miao
2022 Zenodo  
The Preact-ResNet50 network showed superior performance and robustness for feature extraction in segmentation and classification tasks.  ...  In our work, we present a framework for simultaneous segmentation, classification and quantification of nuclear instances in histology images. The framework is based on the HoVer-Net model [1].  ... 
doi:10.5281/zenodo.6327614 fatcat:hixsiu6hx5hdvd777zai6qseni

Adversarial Token Attacks on Vision Transformers [article]

Ameya Joshi, Gauri Jagatap, Chinmay Hegde
2021 arXiv   pre-print
We infer that transformer models are more sensitive to token attacks than convolutional models, with ResNets outperforming Transformer models by up to ∼30% in robust accuracy for single token attacks.  ...  We probe and analyze transformer as well as convolutional models with token attacks of varying patch sizes.  ...  It is therefore important to understand the sensitivity of the architecture to token level changes rather than to the full image.  ... 
arXiv:2110.04337v1 fatcat:yractlmse5elrhlcrstf5kye6q
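A single-token attack of the kind this entry studies perturbs one patch token while leaving all others untouched. A greedy sketch under simplifying assumptions (a scalar score function standing in for a class logit; `single_token_attack` is an illustrative name, not the paper's method):

```python
import numpy as np

def single_token_attack(tokens, score_fn, eps):
    """Try perturbing each token (one patch embedding) by +/- eps and
    keep the single-token change that lowers the score the most."""
    best_tokens, best_score = tokens, score_fn(tokens)
    for i in range(tokens.shape[0]):
        for sign in (+1.0, -1.0):
            cand = tokens.copy()
            cand[i] += sign * eps     # only token i is modified
            s = score_fn(cand)
            if s < best_score:
                best_tokens, best_score = cand, s
    return best_tokens, best_score

rng = np.random.default_rng(1)
w = rng.normal(size=(6, 4))
tokens = rng.normal(size=(6, 4))
score = lambda t: float((t * w).sum())   # toy stand-in for a model logit
adv, s_adv = single_token_attack(tokens, score, eps=0.5)
print(s_adv <= score(tokens))  # True
```

Constraining the attack budget to whole tokens rather than the full image is what makes the comparison between ResNets and Transformers in this entry architecture-sensitive.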
Showing results 1 — 15 of 140,107