Cross-media Multi-level Alignment with Relation Attention Network

Jinwei Qi, Yuxin Peng, Yuxin Yuan
2018 Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence  
To address the above issue, we propose Cross-media Relation Attention Network (CRAN) with multi-level alignment.  ...  First, we propose visual-language relation attention model to explore both fine-grained patches and their relations of different media types.  ...  For addressing above problem, we propose Cross-media Relation Attention Network (CRAN) with multi-level alignment, which has following contributions. • Visual-language relation attention model.  ... 
doi:10.24963/ijcai.2018/124 dblp:conf/ijcai/QiPY18 fatcat:anifftwhbrec7oackglkbn2qra

Cross-media Multi-level Alignment with Relation Attention Network [article]

Jinwei Qi, Yuxin Peng, Yuxin Yuan
2018 arXiv   pre-print
To address the above issue, we propose Cross-media Relation Attention Network (CRAN) with multi-level alignment.  ...  First, we propose visual-language relation attention model to explore both fine-grained patches and their relations of different media types.  ...  For addressing the above problem, we propose Cross-media Relation Attention Network (CRAN) with multi-level alignment, which has the following contributions. • Visual-language relation attention model.  ... 
arXiv:1804.09539v1 fatcat:7bpfoixw2rbfji3b4h7jytyvly

Style Mixer: Semantic‐aware Multi‐Style Transfer Network

Zixuan Huang, Jinghuai Zhang, Jing Liao
2019 Computer graphics forum (Print)  
We first improve the existing SST backbone network by introducing a novel multi-level feature fusion module and a patch attention module to achieve better semantic correspondences and preserve richer style  ...  to simultaneously transferring multiple styles to the same image.  ...  Acknowledgement We thank the anonymous reviewers for helping us to improve this paper. And we acknowledge to the authors of our image and style examples.  ... 
doi:10.1111/cgf.13853 fatcat:ss7kr2l2gfbhxabonw4utoyj24

TransVPR: Transformer-based place recognition with multi-level attention aggregation [article]

Ruotong Wang, Yanqing Shen, Weiliang Zuo, Sanping Zhou, Nanning Zheng
2022 arXiv   pre-print
In addition, the output tokens from Transformer layers filtered by the fused attention mask are considered as key-patch descriptors, which are used to perform spatial matching to re-rank the candidates  ...  Attentions from multiple levels of the Transformer, which focus on different regions of interest, are further combined to generate a global image representation.  ...  Related work We review previous works on image description techniques, especially related to place recognition. Patch-level descriptors.  ... 
arXiv:2201.02001v1 fatcat:kjbcf7qic5gxvdtzq23njxw5ti

MRI-based Alzheimer's disease prediction via distilling the knowledge in multi-modal data [article]

Hao Guan
2021 arXiv   pre-print
In this work, we propose a multi-modal multi-instance distillation scheme, which aims to distill the knowledge learned from multi-modal data to an MRI-based network for MCI conversion prediction.  ...  network to better explore the input MRI.  ...  We would like to thank the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Australian Imaging, Biomarker & Lifestyle Flagship Study of Ageing (AIBL) for data collection and sharing.  ... 
arXiv:2104.03618v1 fatcat:654s32pwpna37cvezvb7uomn2e

Pyramid Attention Networks for Image Restoration [article]

Yiqun Mei, Yuchen Fan, Yulun Zhang, Jiahui Yu, Yuqian Zhou, Ding Liu, Yun Fu, Thomas S. Huang, Humphrey Shi
2020 arXiv   pre-print
To solve this problem, we present a novel Pyramid Attention module for image restoration, which captures long-range feature correspondences from a multi-scale feature pyramid.  ...  However, recent advanced deep convolutional neural network based methods for image restoration do not take full advantage of self-similarities by relying on self-attention neural modules that only process  ...  Related Works Self-similarity Prior for Image Restoration.  ... 
arXiv:2004.13824v4 fatcat:4tq7ea4ntfdvhazjeqxa3zn7ye

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector [article]

Qi Fan, Wei Zhuo, Chi-Keung Tang, Yu-Wing Tai
2020 arXiv   pre-print
Central to our method are our Attention-RPN, Multi-Relation Detector and Contrastive Training strategy, which exploit the similarity between the few shot support set and query set to detect novel objects  ...  To the best of our knowledge, this is one of the first datasets specifically designed for few-shot object detection.  ...  For the N-way training, we extend the network by adding N − 1 support branches where each branch has its own attention RPN and multi-relation detector with the query image.  ...
arXiv:1908.01998v4 fatcat:33zf5gvncbc6riuc3vy3gpjcra

Multi-modal Sentence Summarization with Modality Attention and Image Filtering

Haoran Li, Junnan Zhu, Tianshang Liu, Jiajun Zhang, Chengqing Zong
2018 Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence  
To this end, we propose a modality-based attention mechanism to pay different attention to image patches and text units, and we design image filters to selectively use visual information to enhance the  ...  In this paper, we introduce a multi-modal sentence summarization task that produces a short summary from a pair of sentence and image. This task is more challenging than sentence summarization.  ...  To do so, the image patch and the sentence are matched to determine informative patches.  ... 
doi:10.24963/ijcai.2018/577 dblp:conf/ijcai/LiZLZZ18 fatcat:ip6222omozdrtpwtmzdx2kejzu

End-to-End Deep Kronecker-Product Matching for Person Re-identification

Yantao Shen, Tong Xiao, Hongsheng Li, Shuai Yi, Xiaogang Wang
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
The multi-scale features based on hourglass-like networks and self-residual attention are also exploited to further boost the re-identification performance.  ...  In this paper, we propose a novel Kronecker Product Matching module to match feature maps of different persons in an end-to-end trainable deep neural network.  ...  Hourglass network for multi-scale KPM Based on the KPM and residual self-attention modules, for fully exploiting the multi-scale information, we adopt a hourglass-like structure [23] to generate multi-scale  ...
doi:10.1109/cvpr.2018.00720 dblp:conf/cvpr/ShenXLYW18 fatcat:6dwbxargwzcipbhaizx4vwrejy

End-to-End Deep Kronecker-Product Matching for Person Re-identification [article]

Yantao Shen, Tong Xiao, Hongsheng Li, Shuai Yi, Xiaogang Wang
2018 arXiv   pre-print
The multi-scale features based on hourglass-like networks and self-residual attention are also exploited to further boost the re-identification performance.  ...  In this paper, we propose a novel Kronecker Product Matching module to match feature maps of different persons in an end-to-end trainable deep neural network.  ...  Hourglass network for multi-scale KPM Based on the KPM and residual self-attention modules, for fully exploiting the multi-scale information, we adopt a hourglass-like structure [23] to generate multi-scale  ... 
arXiv:1807.11182v1 fatcat:fj6de6vusbesjl7bndbvvttium

Attention-Aware Multi-Stroke Style Transfer

Yuan Yao, Jianqiang Ren, Xuansong Xie, Weidong Liu, Yong-Jin Liu, Jun Wang
2019 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
In this paper, we tackle these limitations by developing an attention-aware multi-stroke style transfer model.  ...  We first propose to assemble self-attention mechanism into a style-agnostic reconstruction autoencoder framework, from which the attention map of a content image can be derived.  ...  However, the increase of the scale for patch size is strictly limited by the network structure and easily saturated when the patch size is larger than the fixed receptive field of network.  ... 
doi:10.1109/cvpr.2019.00156 dblp:conf/cvpr/YaoRX0LW19 fatcat:xdtqe4gdtzgipft2sringb7s5u

Overcoming Obstructions via Bandwidth-Limited Multi-Agent Spatial Handshaking [article]

Nathaniel Glaser, Yen-Cheng Liu, Junjiao Tian, Zsolt Kira
2021 arXiv   pre-print
As such, we propose an end-to-end learnable Multi-Agent Spatial Handshaking network (MASH) to process, compress, and propagate visual information across a robotic swarm.  ...  We demonstrate superior performance of our model compared against several baselines in a photo-realistic multi-robot AirSim environment, especially in the presence of image occlusions.  ...  ; and (6) the corresponding match scores are used as the spatial attention weights during multi-agent fusion.  ...
arXiv:2107.00771v1 fatcat:qz5yocruhzadniuk3wwx55kroq

MSCNN-AM: A Multi-Scale Convolutional Neural Network with Attention Mechanisms for Retinal Vessel Segmentation

Qilong Fu, Shuqiu Li, Xin Wang
2020 IEEE Access  
INDEX TERMS Retinal vessel segmentation, convolutional neural network, multi-scale information, attention mechanism  ...  In this paper, aiming at upgrading the accuracy and sensitivity of existing vessel segmentation methods, we propose a Multi-Scale Convolutional Neural Network with Attention Mechanisms (MSCNN-AM).  ...  CONCLUSION In this paper, a Multi-Scale Convolutional Neural Network with Attention Mechanisms (MSCNN-AM) is proposed for accurate retinal vessel segmentation.  ...
doi:10.1109/access.2020.3022177 fatcat:y4mxctqqofgcjfksre3cp3lfdm

Learning Dense Wide Baseline Stereo Matching for People [article]

Akin Caliskan, Armin Mustafa, Evren Imre, Adrian Hilton
2019 arXiv   pre-print
The network learns from the human specific stereo patches from the proposed dataset for wide-baseline stereo estimation.  ...  A synthetic people stereo patch dataset (S2P2) is introduced to learn wide baseline dense stereo matching for people.  ...  Patches are processed through the network, and matching cost is computed for each patch.  ... 
arXiv:1910.01241v1 fatcat:zsxgmbczzvhbxenkpyugz55r3a

Attention-Guided Progressive Neural Texture Fusion for High Dynamic Range Image Restoration [article]

Jie Chen, Zaifeng Yang, Tsz Nam Chan, Hui Li, Junhui Hou, Lap-Pui Chau
2021 arXiv   pre-print
High Dynamic Range (HDR) imaging via multi-exposure fusion is an important task for most modern imaging platforms.  ...  In addition, we introduce several novel attention mechanisms, i.e., the motion attention module detects and suppresses the content discrepancies among the reference images; the saturation attention module  ...  matched location $j_2$ within $\Psi_2(\hat{H}_s)$ can be found for the target patch $k_2$ from $\Psi_2(H_m)$ via: $j_2 = \arg\max_j S^2_j(k_2, \omega_2)$ (7). For the next finer scale $l = 1$, features will be matched within  ...
arXiv:2107.06211v1 fatcat:3hdra6gc5ngmxp4e7wktqk7y6i