Learning Deep Representations Using Convolutional Auto-encoders with Symmetric Skip Connections [article]

Jianfeng Dong, Xiao-Jiao Mao, Chunhua Shen, Yu-Bin Yang
2017 arXiv   pre-print
We empirically show that symmetric shortcut connections are very important for learning abstract representations via image reconstruction.  ...  Unsupervised pre-training was a critical technique for training deep neural networks years ago.  ...  Although for masking noise, prior works [35, 23] also use a masked loss to emphasize the dropped pixels, we find that it is unnecessary when we use input-output shortcut connection to make the learned  ... 
arXiv:1611.09119v2 fatcat:uijwlfho3bc2zm2c7nje23mlx4

RUBi: Reducing Unimodal Biases for Visual Question Answering

Rémi Cadène, Corentin Dancette, Hedi Ben-younes, Matthieu Cord, Devi Parikh
2019 Neural Information Processing Systems  
It prevents the base VQA model from learning them by influencing its predictions. This leads to dynamically adjusting the loss in order to compensate for biases.  ...  This dataset is specifically designed to assess the robustness of VQA models when exposed to different question biases at test time than what was seen during training.  ...  Acknowledgments We would like to thank the reviewers for valuable and constructive comments and suggestions. We additionally would like to thank Abhishek Das and Aishwarya Agrawal for their help.  ... 
dblp:conf/nips/CadeneDBCP19 fatcat:umiftvxbqngnlj3vtuqhluig7a

A Typology to Explore and Guide Explanatory Interactive Machine Learning [article]

Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting
2022 arXiv   pre-print
Apart from introducing these novel benchmarking tasks, for improved quantitative evaluations, we further introduce a novel Wrong Reason (WR) metric which measures the average wrong reason activation  ...  In addition to benchmarking these methods on their overall ability to revise a model, we perform additional benchmarks regarding wrong reason revision, interaction efficiency, robustness to feedback quality  ...  Below there are the correct feedback masks for penalizing the wrong reason and the correct feedback mask/hint for rewarding the right reason.  ... 
arXiv:2203.03668v1 fatcat:xfkjq26lwffklpg5kgszgsefwi

Motion-aware Contrastive Video Representation Learning via Foreground-background Merging [article]

Shuangrui Ding, Maomao Li, Tianyu Yang, Rui Qian, Haohang Xu, Qingyi Chen, Jue Wang, Hongkai Xiong
2022 arXiv   pre-print
By leveraging the semantic consistency between the original clips and the fused ones, the model focuses more on the motion patterns and is debiased from the background shortcut.  ...  When naively pulling two augmented views of a video closer, the model however tends to learn the common static background as a shortcut but fails to capture the motion information, a phenomenon dubbed  ...  , and in part by the Program of Shanghai Science and Technology Innovation Project under Grant 20511100100.  ... 
arXiv:2109.15130v3 fatcat:qy2voj5fxfbl3dla7svjpp54l4

Unified DeepLabV3+ for Semi-Dark Image Semantic Segmentation

Mehak Maqbool Memon, Manzoor Ahmed Hashmani, Aisha Zahid Junejo, Syed Sajjad Rizvi, Kamran Raza
2022 Sensors  
The problems arise due to (1) biased centric exploitations of filter masks, (2) lower representational power of residual networks due to identity shortcuts, and (3) a loss of spatial relationship by using  ...  Semantic segmentation for accurate visual perception is a critical task in computer vision.  ...  For this reason, to benefit from the deep network and lightweight feature of MobileNet, it was employed to increase the detailed information of the visual scene.  ... 
doi:10.3390/s22145312 pmid:35890992 pmcid:PMC9324997 fatcat:jqmwgcqs4vfspilnqn2hvy7r2m

RUBi: Reducing Unimodal Biases in Visual Question Answering [article]

Remi Cadene and Corentin Dancette and Hedi Ben-younes and Matthieu Cord and Devi Parikh
2020 arXiv   pre-print
It prevents the base VQA model from learning them by influencing its predictions. This leads to dynamically adjusting the loss in order to compensate for biases.  ...  This dataset is specifically designed to assess the robustness of VQA models when exposed to different question biases at test time than what was seen during training.  ...  Acknowledgments We would like to thank the reviewers for valuable and constructive comments and suggestions. We additionally would like to thank Abhishek Das and Aishwarya Agrawal for their help.  ... 
arXiv:1906.10169v2 fatcat:vfz5jaffgzbxxolxhpa75mk6ie


Emmanouil Giannisakis, Gilles Bailly, Sylvain Malacria, Fanny Chevalier
2017 Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems - CHI '17  
ACKNOWLEDGEMENTS We thank Tamy Boubeker and Stéphane Calderon and our anonymous reviewers for their valuable input on this work.  ...  Too many operations to access visual aids and/or hide them slow down the interaction and break workflow.  ...  Our colleagues mostly reasoned with squares; in Figure 1, we used quarter-circle instead of squares in order to minimize the visual space occupied.  ... 
doi:10.1145/3025453.3025595 dblp:conf/chi/GiannisakisBMC17 fatcat:swa3w4hwgzfgflvqvkim3r77qu

Joint Answering and Explanation for Visual Commonsense Reasoning [article]

Zhenyang Li, Yangyang Guo, Kejie Wang, Yinwei Wei, Liqiang Nie, Mohan Kankanhalli
2022 arXiv   pre-print
Visual Commonsense Reasoning (VCR), deemed as one challenging extension of the Visual Question Answering (VQA), endeavors to pursue a more high-level visual comprehension.  ...  As a result, the pivotal connection between question answering and rationale inference is interrupted, rendering existing efforts less faithful on visual reasoning.  ...  Index Terms: Visual Commonsense Reasoning, Language Shortcut, Knowledge Distillation.  ... 
arXiv:2202.12626v1 fatcat:zglxtlf4kndlxl63lijvmve7oy

A Comprehensive Study of Image Classification Model Sensitivity to Foregrounds, Backgrounds, and Visual Attributes [article]

Mazda Moayeri, Phillip Pope, Yogesh Balaji, Soheil Feizi
2022 arXiv   pre-print
To this end, for a subset of ImageNet samples, we collect segmentation masks for the entire object and 18 informative attributes.  ...  Finally, we quantitatively study the attribution problem for neural features by comparing feature saliency with ground-truth localization of semantic attributes.  ...  (Right) Visualization of general noise robustness and relative foreground sensitivity for all points in the unit square.  ... 
arXiv:2201.10766v1 fatcat:hnfnj5h2ive33mbwwc4gkkwq6y

Predicting is not Understanding: Recognizing and Addressing Underspecification in Machine Learning [article]

Damien Teney, Maxime Peyrard, Ehsan Abbasnejad
2022 arXiv   pre-print
Machine learning (ML) models are typically optimized for their accuracy on a given dataset.  ...  They discover predictive features that are otherwise ignored by standard empirical risk minimization (ERM), which we then distill into a global model with superior OOD performance.  ...  We choose visual question answering (VQA) because it is notorious for dataset biases [74] that cause shortcut learning [13].  ... 
arXiv:2207.02598v1 fatcat:ic6chtidk5dftj2aoriruimh7y

Partial success in closing the gap between human and machine vision [article]

Robert Geirhos, Kantharaju Narayanappa, Benjamin Mitzkus, Tizian Thieringer, Matthias Bethge, Felix A. Wichmann, Wieland Brendel
2021 arXiv   pre-print
Our results give reason for cautious optimism: While there is still much room for improvement, the behavioural difference between human and machine vision is narrowing.  ...  human visual perception.  ...  Mutschler, David-Elias Künstle for feedback on the manuscript; Santiago Cadena for sharing a PyTorch implementation of SimCLR; Katherine Hermann and her collaborators for providing supervised SimCLR baselines  ... 
arXiv:2106.07411v2 fatcat:kd4es6yzirggnht65mvpqwz4yu

MAST: A Memory-Augmented Self-supervised Tracker [article]

Zihang Lai, Erika Lu, Weidi Xie
2020 arXiv   pre-print
In this paper, we first reassess the traditional choices used for self-supervised training and reconstruction loss by conducting thorough experiments that finally elucidate the optimal choices.  ...  Second, we further improve on existing methods by augmenting our architecture with a crucial memory component.  ...  Financial support for this project is provided by EPSRC Seebibyte Grant EP/M013774/1. Erika Lu is funded by the Oxford-Google DeepMind Graduate Scholarship.  ... 
arXiv:2002.07793v2 fatcat:hn6fof2ganfuldzbxkuvouckoq

RANDOM MASK: Towards Robust Convolutional Neural Networks [article]

Tiange Luo, Tianle Cai, Mengxiao Zhang, Siyu Chen, Liwei Wang
2020 arXiv   pre-print
In this paper, we design a new CNN architecture that by itself has good robustness. We introduce a simple but powerful technique, Random Mask, to modify existing CNN structures.  ...  Robustness of neural networks has recently been highlighted by the adversarial examples, i.e., inputs added with well-designed perturbations which are imperceptible to humans but can cause the network  ...  ROBUSTNESS VIA RANDOM MASK Random Mask is not specially designed for adversarial defense, but as Random Mask introduces information that is essential for classifying correctly, it also brings robustness  ... 
arXiv:2007.14249v1 fatcat:ez5ucsoykbcjvounexe25r5pn4

MultiResUNet : Rethinking the U-Net Architecture for Multimodal Biomedical Image Segmentation [article]

Nabil Ibtehaz, M. Sohel Rahman
2019 arXiv   pre-print
Albeit slight improvements in the cases of ideal images, a remarkable gain in performance has been attained for challenging images.  ...  Acknowledgements The Titan Xp GPU used for this research was the generous donation of NVIDIA Corporation.  ...  Supplementary information Supplementary Material 1: contains the links to the weights and parameters of the best performing models in each fold for all the datasets.  ... 
arXiv:1902.04049v1 fatcat:6hfm3shnwrholp5unu6nxweske

COAT: Measuring Object Compositionality in Emergent Representations

Sirui Xie, Ari S. Morcos, Song-Chun Zhu, Ramakrishna Vedantam
2022 International Conference on Machine Learning  
Built upon object masks in the pixel space, existing metrics for objectness can only evaluate generative models with an object-specific "slot" structure.  ...  Learning representations that can decompose a multi-object scene into its constituent objects and recompose them flexibly is desirable for object-oriented reasoning and planning.  ...  and Marco Baroni at ICREA for useful discussion.  ... 
dblp:conf/icml/XieMZV22 fatcat:oxzjr2n57rfplm4bqrejpbikna