Alignment Attention by Matching Key and Query Distributions [article]

Shujian Zhang, Xinjie Fan, Huangjie Zheng, Korawat Tanwisuth, Mingyuan Zhou
2021 arXiv   pre-print
This paper introduces alignment attention that explicitly encourages self-attention to match the distributions of the key and query within each head.  ...  We further demonstrate the general applicability of our approach on graph attention and visual question answering, showing the great potential of incorporating our alignment method into various attention-related  ...  National Science Foundation, the APX 2019 project sponsored by the Office of the Vice President for Research at The University of Texas at Austin, the support of a gift fund from ByteDance Inc., and the  ... 
arXiv:2110.12567v1 fatcat:cspgs4yetjf4toy5rn4ijdw6ji

StreamAligner: a streaming based sequence aligner on Apache Spark

Sanjay Rathee, Arti Kashyap
2018 Journal of Big Data  
Sequence alignment is like the heart of bioinformatics field and has attracted huge attention by researchers.  ...  A lot of MapReduce-based sequence alignment tools like CloudBurst, CloudAligner, Halvade, and SparkBWA are proposed by various researchers in recent few years.  ...  Therefore, a highly distributed computing based platform Apache Spark which outperformed Hadoop by a huge margin for various machine learning problems has got a lot of attention these days.  ... 
doi:10.1186/s40537-018-0114-y fatcat:ooj6fe62dza4ncpsxt5kgbcnu4

Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing [article]

Xihui Liu, Zihao Wang, Jing Shao, Xiaogang Wang, Hongsheng Li
2019 arXiv   pre-print
Referring expression grounding aims at locating certain objects or persons in an image with a referring expression, where the key challenge is to comprehend and align various types of information from  ...  Although the attention mechanism has been successfully applied for cross-modal alignments, previous attention models focus on only the most dominant features of both modalities, and neglect the fact that  ...  Acknowledgements This work is supported in part by SenseTime Group  ... 
arXiv:1903.00839v2 fatcat:c6xlz2ly6vgqzdrn6xk5tz2hpe

G2DA: Geometry-Guided Dual-Alignment Learning for RGB-Infrared Person Re-Identification [article]

Lin Wan, Zongyuan Sun, Qianyan Jing, Yehansen Chen, Lijing Lu, Zhihang Li
2021 arXiv   pre-print
In this paper, we propose a graph-enabled distribution matching solution, dubbed Geometry-Guided Dual-Alignment (G2DA) learning, for RGB-IR ReID.  ...  It can jointly encourage the cross-modal consistency between part semantics and structural relations for fine-grained modality alignment by solving a graph matching task within a multi-scale skeleton graph  ...  Thanks to the semantic-aligned nature of skeleton graph, we can jointly encourage semantic and structural modality consistency by matching the probabilistic distribution of global and part embeddings,  ... 
arXiv:2106.07853v2 fatcat:3nspm3ymgng67esqksxrfqugme

Weakly supervised cross-domain alignment with optimal transport [article]

Siyang Yuan, Ke Bai, Liqun Chen, Yizhe Zhang, Chenyang Tao, Chunyuan Li, Guoyin Wang, Ricardo Henao, Lawrence Carin
2020 arXiv   pre-print
Cross-domain alignment between image objects and text sequences is key to many visual-language tasks, and it poses a fundamental challenge to both computer vision and natural language processing.  ...  Our method builds upon recent advances in optimal transport (OT) to resolve the cross-domain matching problem in a principled manner.  ...  The research at Duke University was supported in part by DARPA, DOE, NIH, NSF and ONR.  ... 
arXiv:2008.06597v1 fatcat:swzix2esmncstffyuyxjr2dbwq

An Introductory Survey on Attention Mechanisms in NLP Problems [article]

Dichao Hu
2018 arXiv   pre-print
First derived from human intuition, later adapted to machine translation for automatic token alignment, attention mechanism, a simple method that can be used for encoding sequence data based on the importance  ...  In this paper, we survey through recent works and conduct an introductory summary of the attention mechanism in different NLP problems, aiming to provide our readers with basic knowledge on this widely  ...  measured by comparing the attention distribution with the gold alignment data, and quantified using alignment error rate (AER).  ... 
arXiv:1811.05544v1 fatcat:ayvyqvklxbgrvc7snqrh53buom

MGIMN: Multi-Grained Interactive Matching Network for Few-shot Text Classification [article]

Jianhai Zhang, Mieradilijiang Maimaiti, Xing Gao, Yuanhang Zheng, Ji Zhang
2022 arXiv   pre-print
The key of instance-wise comparison is the interactive matching within the class-specific context and episode-specific context.  ...  They also ignore the importance to capture the inter-dependency between query and the support set for few-shot text classification.  ...  compute the probability distribution of the label y of the query q using attention: $P(y \mid S, q) = \frac{\sum_{k=1}^{K} \exp(\mathrm{sim}(q, S_y^k))}{\sum_{n=1}^{N} \sum_{k=1}^{K} \exp(\mathrm{sim}(q, S_n^k))}$ (3). Finally, for any query instance q, we regard  ... 
arXiv:2204.04952v3 fatcat:g6zgletaxjgqdbt4g5gutxth24

Overcoming Obstructions via Bandwidth-Limited Multi-Agent Spatial Handshaking [article]

Nathaniel Glaser, Yen-Cheng Liu, Junjiao Tian, Zsolt Kira
2021 arXiv   pre-print
This setting presents several key challenges, including processing and exchanging unregistered robotic swarm imagery.  ...  Our distributed communication module operates directly (and exclusively) on raw image data, without additional input requirements such as pose, depth, or warping data.  ...  Furthermore, by having separate query and key encoders, the agents have a mechanism for describing the input image in the form of a question (query) and an answer (key), similar to the mechanism explored  ... 
arXiv:2107.00771v1 fatcat:qz5yocruhzadniuk3wwx55kroq

RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers [article]

Bailin Wang, Richard Shin, Xiaodong Liu, Oleksandr Polozov, Matthew Richardson
2021 arXiv   pre-print
On the challenging Spider dataset this framework boosts the exact match accuracy to 57.2%, surpassing its best counterparts by 8.7% absolute improvement.  ...  query.  ...  Acknowledgments We thank Jianfeng Gao, Vladlen Koltun, Chris Meek, and Vignesh Shiv for the discussions that helped shape this work. We thank Bo Pang, Tao Yu for their help with the evaluation.  ... 
arXiv:1911.04942v5 fatcat:3wyyhx4tsjbyxj5f3d6hhypk2m

On the Potential of Lexico-logical Alignments for Semantic Parsing to SQL Queries [article]

Tianze Shi, Chen Zhao, Jordan Boyd-Graber, Hal Daumé III, Lillian Lee
2020 arXiv   pre-print
We propose and test two methods: (1) supervised attention; (2) adopting an auxiliary objective of disambiguating references in the input queries to table columns.  ...  between SQL and question fragments.  ...  CZ and JBG are supported by the Defense Advanced Research Projects Agency (DARPA) and Air Force Research Laboratory (AFRL), and awarded to Raytheon BBN Technologies under contract number FA865018-C-7885  ... 
arXiv:2010.11246v1 fatcat:wpmjjsbhffhxpf4ugtfkjjfspq

Learning a Key-Value Memory Co-Attention Matching Network for Person Re-Identification

Yaqing Zhang, Xi Li, Zhongfei Zhang
2019 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
Motivated by this observation, we propose a Key-Value Memory Matching Network (KVM-MN) model that consists of key-value memory representation and key-value co-attention matching.  ...  Furthermore, the proposed KVM-MN model makes use of multi-head co-attention to automatically learn a number of cross-person-matching patterns, resulting in more robust and interpretable matching results  ...  In this paper, we propose a Key-Value Memory Matching Network (KVM-MN) that comprises the modules of key-value memory representation and key-value co-attention matching.  ... 
doi:10.1609/aaai.v33i01.33019235 fatcat:m63cwgofdnghhf3v7gnkkw56wu

Pose-Guided Multi-Granularity Attention Network for Text-Based Person Search [article]

Ya Jing, Chenyang Si, Junbo Wang, Wei Wang, Liang Wang, Tieniu Tan
2019 arXiv   pre-print
Extracting visual contents corresponding to the human description is the key to this cross-modal matching problem.  ...  Firstly, we propose a coarse alignment network (CA) to select the related image regions to the global description by a similarity-based attention.  ...  Acknowledgments This work is jointly supported by National Key Research and Development Program of China (2016YFB1001000), National Natural Science Foundation of China (61525306, 61633021, 61721004, 61420106015  ... 
arXiv:1809.08440v3 fatcat:rb33zfv645at3nfh7qu7vcjqvi

Pose-Guided Multi-Granularity Attention Network for Text-Based Person Search

Ya Jing, Chenyang Si, Junbo Wang, Wei Wang, Liang Wang, Tieniu Tan
2020 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
Extracting visual contents corresponding to the human description is the key to this cross-modal matching problem.  ...  Firstly, we propose a coarse alignment network (CA) to select the related image regions to the global description by a similarity-based attention.  ...  Acknowledgments This work is jointly supported by National Key Research and Development Program of China (2016YFB1001000), National Natural Science Foundation of China (61525306, 61633021, 61721004, 61420106015  ... 
doi:10.1609/aaai.v34i07.6777 fatcat:e7o4adxewrb5ned33qnmjdisuq

DSSL: Deep Surroundings-person Separation Learning for Text-based Person Retrieval [article]

Aichun Zhu, Zijie Wang, Yifeng Li, Xili Wan, Jing Jin, Tian Wang, Fangqiang Hu, Gang Hua
2021 arXiv   pre-print
A surroundings-person separation and fusion mechanism plays the key role to realize an accurate and effective surroundings-person separation under a mutually exclusion constraint.  ...  To this end, we propose a novel Deep Surroundings-person Separation Learning (DSSL) model in this paper to effectively extract and match person information, and hence achieve a superior retrieval accuracy  ...  This proposed alignment can be regarded as describing the person in the gallery image with a text and then matching the text with the given query sentence in the textual modality space.  ... 
arXiv:2109.05534v1 fatcat:ffagkml6gfge5fzrzqxydwc3ua

Cross-media Multi-level Alignment with Relation Attention Network [article]

Jinwei Qi, Yuxin Peng, Yuxin Yuan
2018 arXiv   pre-print
Naturally, when correlating an image with textual description, people focus on not only the alignment between discriminative image regions and key words, but also the relations lying in the visual and  ...  To address the above issue, we propose Cross-media Relation Attention Network (CRAN) with multi-level alignment.  ...  Acknowledgments This work was supported by National Natural Science Foundation of China under Grant 61771025 and Grant 61532005.  ... 
arXiv:1804.09539v1 fatcat:7bpfoixw2rbfji3b4h7jytyvly