MRI-based Multi-task Decoupling Learning for Alzheimer's Disease Detection and MMSE Score Prediction: A Multi-site Validation [article]

Xu Tian, Jin Liu, Hulin Kuang, Yu Sheng, Jianxin Wang, The Alzheimer's Disease Neuroimaging Initiative
2022 arXiv   pre-print
First, a multi-task learning network is proposed to implement AD detection and MMSE score prediction, which exploits feature correlation by adding three multi-task interaction layers between the backbones  ...  We evaluate our proposed method on multi-site datasets.  ...  Therefore, the attention module is used to obtain the weights of the linear transformation, which is the same as the attention module used for feature decoupling.  ... 
arXiv:2204.01708v2 fatcat:4d4n2ctdjrbplnr6laqapt5ksa

Domain Composition and Attention for Unseen-Domain Generalizable Medical Image Segmentation [article]

Ran Gu, Jingyang Zhang, Rui Huang, Wenhui Lei, Guotai Wang, Shaoting Zhang
2021 arXiv   pre-print
Then, a domain attention module is proposed to learn the linear combination coefficients of the basis representations.  ...  Domain generalizable model is attracting increasing attention in medical image analysis since data is commonly acquired from different institutes with various imaging protocols and scanners.  ...  To deal with this problem, we propose a Domain Composition and Attention-based Network (DCA-Net) for generalizable multi-site medical image segmentation.  ... 
arXiv:2109.08852v1 fatcat:nm5xckyvhncjlc5fawzf2gd5hm

Sparse Fusion Mixture-of-Experts are Domain Generalizable Learners [article]

Bo Li, Jingkang Yang, Jiawei Ren, Yezhen Wang, Ziwei Liu
2022 arXiv   pre-print
Extensive experiments demonstrate that SF-MoE is a domain-generalizable learner on large-scale benchmarks.  ...  Domain generalization (DG) aims at learning generalizable models under distribution shifts to avoid redundantly overfitting massive training data.  ...  Preliminary: ViT and Multi-Head Attention SF-MoE is built on Vision Transformer, which contains two essential components, the multi-head attention (MHA) layer and the feed-forward-network (FFN) layer.  ... 
arXiv:2206.04046v3 fatcat:lwydxkhurbcglparhdiro5ryeq

Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals [article]

Tongzhou Mu, Jiayuan Gu, Zhiwei Jia, Hao Tang, Hao Su
2020 arXiv   pre-print
We study how to learn a policy with compositional generalizability.  ...  We propose a two-stage framework, which refactorizes a high-reward teacher policy into a generalizable student policy with strong inductive bias.  ...  Table 4: The architecture of the plain CNN used in the experiments on Multi-MNIST.  ... 
arXiv:2011.00971v1 fatcat:yzjitqgs2fdhhn6ve7qdlke2p4

Armour: Generalizable Compact Self-Attention for Vision Transformers [article]

Lingchuan Meng
2021 arXiv   pre-print
This paper introduces a compact self-attention mechanism that is fundamental and highly generalizable.  ...  Attention-based transformer networks have demonstrated promising potential as their applications extend from natural language processing to vision.  ...  To exploit that, we proposed Armour, a compact self-attention that reduces the number of linear transformations. Armour is easy to implement and highly generalizable.  ... 
arXiv:2108.01778v1 fatcat:lnpxbet5qvbnlnsnzbkdlurb6m

The self-supervised spectral-spatial attention-based transformer network for automated, accurate prediction of crop nitrogen status from UAV imagery [article]

Xin Zhang, Liangxiu Han, Tam Sobeih, Lewis Lappin, Mark Lee, Andew Howard, Aron Kisdi
2022 arXiv   pre-print
The proposed approach achieved high accuracy (0.96) with good generalizability and reproducibility for wheat N status estimation.  ...  In this work, we propose a novel deep learning framework: a self-supervised spectral-spatial attention-based vision transformer (SSVT).  ...  Self Attention Layer in the original vision transformer network).  ... 
arXiv:2111.06839v2 fatcat:mnoeuqfkzvbmpouusbkrfzn3qy

Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering [article]

Youngjoong Kwon and Dahun Kim and Duygu Ceylan and Henry Fuchs
2021 arXiv   pre-print
Moreover, a multi-view transformer is proposed to perform cross-attention between the temporally-fused features and the pixel-aligned features at each time step to integrate observations on the fly from  ...  In this paper, we aim at synthesizing a free-viewpoint video of an arbitrary human performance using sparse multi-view cameras.  ...  Our goal is to learn generalizable 3D representations of human performers from multi-time (M) and multi-view (C) observations.  ... 
arXiv:2109.07448v1 fatcat:lr7xzscnjbennf3fji2raapskq

ViTBIS: Vision Transformer for Biomedical Image Segmentation [article]

Abhinav Sagar
2022 arXiv   pre-print
We test the performance of our network using Synapse multi-organ segmentation dataset, Automated cardiac diagnosis challenge dataset, Brain tumour MRI segmentation dataset and Spleen CT segmentation dataset  ...  Concat operator is used to merge the features before being fed to three consecutive transformer blocks with attention mechanism embedded inside it.  ...  A multi-scale attention network (Fan et al., 2020) was proposed in the context of biomedical image segmentation.  ... 
arXiv:2201.05920v1 fatcat:ixurvgrtune43may3icq75mjqa

The Self-Supervised Spectral–Spatial Vision Transformer Network for Accurate Prediction of Wheat Nitrogen Status from UAV Imagery

Xin Zhang, Liangxiu Han, Tam Sobeih, Lewis Lappin, Mark A. Lee, Andew Howard, Aron Kisdi
2022 Remote Sensing  
The proposed approach achieved high accuracy (0.96) with good generalizability and reproducibility for wheat N status estimation.  ...  In this work, we propose a novel deep learning framework: a self-supervised spectral–spatial attention-based vision transformer (SSVT).  ...  There are four main parts in the transformer encoder: Multi-Head Self Attention Layer (MSP), Multi-Layer Perceptrons (MLP), Layer Norm, and Residual connections introduced in CNN evolution.  ... 
doi:10.3390/rs14061400 fatcat:spcgmlwobvaf5arjlam6oitwi4

Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations [article]

Yanyi Zhang, Xinyu Li, Ivan Marsic
2021 arXiv   pre-print
These networks extract shared features for all the activities, which are not designed for multi-label activities.  ...  Most recent activity recognition networks focus on single-activities, that assume only one activity in each video.  ...  We implemented the dot-product attention method (Vaswani et al. 2017) for generating attnO as: where $\mathrm{attnO}_k \in \mathbb{R}^{TWH}$ denotes the attention for the $k$-th observation, $g^{\beta}_k$, $g^{\gamma}_k$ are the linear functions  ... 
arXiv:2009.07420v2 fatcat:pxqiebhtr5dexflaonv5qhe4hm

Generalizable multi-task, multi-domain deep segmentation of sparse pediatric imaging datasets via multi-scale contrastive regularization and multi-joint anatomical priors [article]

Arnaud Boutillon, Pierre-Henri Conze, Christelle Pons, Valérie Burdin, Bhushan Borotikar
2022 arXiv   pre-print
In this study, we propose to design a novel multi-task, multi-domain learning framework in which a single segmentation network is optimized over the union of multiple datasets arising from distinct parts  ...  clusters in the shared representations, and multi-joint anatomical priors to enforce anatomically consistent predictions.  ...  non-linear activation.  ... 
arXiv:2207.13502v1 fatcat:qe4stepkqnd2rjgtii4cwxsk34

Learning Scalable Policies over Graphs for Multi-Robot Task Allocation using Capsule Attention Networks [article]

Steve Paul, Payam Ghassemi, Souma Chowdhury
2022 arXiv   pre-print
The proposed neural architecture, called Capsule Attention-based Mechanism or CapAM acts as the policy network, and includes three main components: 1) an encoder: a Capsule Network based node embedding  ...  solve CO problems, namely the purely attention mechanism.  ...  The proposed network architecture is named Capsule Attention Mechanism or CapAM.  ... 
arXiv:2205.03321v1 fatcat:dmow6k257jb5vixzdg4zcskvz4

CrowdFormer: Weakly-supervised Crowd counting with Improved Generalizability [article]

Siddharth Singh Savner, Vivek Kanhangad
2022 arXiv   pre-print
More importantly, it shows remarkable generalizability.  ...  On the other hand, transformer, an attention-based architecture can model the global context easily.  ...  Each encoder consists of a self-attention mechanism and a feed-forward neural network. The position encoding is done in the feed-forward neural network.  ... 
arXiv:2203.03768v1 fatcat:gbuwcsvggjee5izm7vq7lopeie

Is Attention All NeRF Needs? [article]

Mukund Varma T, Peihao Wang, Xuxi Chen, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang
2022 arXiv   pre-print
The first stage of GNT, called view transformer, leverages multi-view geometry as an inductive bias for attention-based scene representation, and predicts coordinate-aligned features by aggregating information  ...  We present Generalizable NeRF Transformer (GNT), a pure, unified transformer-based architecture that efficiently reconstructs Neural Radiance Fields (NeRFs) on the fly from source views.  ...  Multi-Head Self-Attention (MHA) sets a group of self-attention blocks, and adopts a linear layer to project them onto the output space: $\mathrm{MHA}(X) = [\mathrm{Attn}_1(X)\ \mathrm{Attn}_2(X)\ \cdots\ \mathrm{Attn}_H(X)]\, W_O$ (2) Following  ... 
arXiv:2207.13298v1 fatcat:f4ikucmkdvaq5boxq4x7u4ouoy
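
The multi-head attention formula quoted in the snippet above (H attention blocks concatenated, then projected by an output matrix) can be illustrated with a minimal pure-Python sketch. This is not the GNT implementation: the helper names (`matmul`, `attention_head`, `random_matrix`), the random weights, the toy sizes, and the scaled dot-product form of each head are all assumptions made for illustration.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(X, W_q, W_k, W_v):
    """One self-attention block; the scaled dot-product form is an assumption."""
    Q, K, V = matmul(X, W_q), matmul(X, W_k), matmul(X, W_v)
    scale = math.sqrt(len(K[0]))
    K_T = [list(col) for col in zip(*K)]
    scores = matmul(Q, K_T)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, V)


def multi_head_attention(X, heads, W_o):
    """Concatenate the outputs of all heads per token, then project with W_o."""
    outputs = [attention_head(X, *h) for h in heads]
    concat = [sum((out[i] for out in outputs), []) for i in range(len(X))]
    return matmul(concat, W_o)


def random_matrix(rng, rows, cols):
    """Hypothetical helper: Gaussian weights, for demonstration only."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy sizes chosen for illustration only.
rng = random.Random(0)
n_tokens, d_model, n_heads = 4, 8, 2
d_head = d_model // n_heads

X = random_matrix(rng, n_tokens, d_model)
heads = [
    tuple(random_matrix(rng, d_model, d_head) for _ in range(3))  # (W_q, W_k, W_v)
    for _ in range(n_heads)
]
W_o = random_matrix(rng, n_heads * d_head, d_model)

Y = multi_head_attention(X, heads, W_o)
print(len(Y), len(Y[0]))  # 4 8
```

Each head maps the 4 x 8 input to a 4 x 4 output; concatenating the two heads restores width 8 before the final projection, so the result has the same shape as the input.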

Multi-Gate Attention Network for Image Captioning

Weitao Jiang, Xiying Li, Haifeng Hu, Qiang Lu, Bohong Liu
2021 IEEE Access  
By integrating MGA block with pre-layernorm transformer architecture into the image encoder and AWG module into the language decoder, we present a novel Multi-Gate Attention Network (MGAN).  ...  INDEX TERMS Image captioning, self-attention, transformer, multi-gate attention.  ...  architecture to the image encoder and AWG module to the language decoder, a Multi-Gate Attention Network (MGAN) is proposed.  ... 
doi:10.1109/access.2021.3067607 fatcat:ogqwtb4lqrcslpk6kmtfcemjui