Probabilistic Analogical Mapping with Semantic Relation Networks [article]

Hongjing Lu, Nicholas Ichien, Keith J. Holyoak
2021 arXiv   pre-print
The human ability to flexibly reason using analogies with domain-general content depends on mechanisms for identifying relations between concepts, and for mapping concepts and their relations across analogs  ...  We also show the potential for extending the model to deal with analog retrieval.  ...  We followed up with six planned pairwise comparisons between word positions within each triplet type.  ... 
arXiv:2103.16704v3 fatcat:aausgrtq4jh5xiubgwstevlinq

Relationship-Embedded Representation Learning for Grounding Referring Expressions [article]

Sibei Yang, Guanbin Li, Yizhou Yu
2020 arXiv   pre-print
It involves a joint understanding of natural language and image content, and is essential for a range of visual tasks related to human-computer interaction.  ...  attention mechanism, and represent the extracted information as a language-guided visual relation graph.  ...  The triplet loss with online hard negative mining is adopted during training and the proposal with the highest matching score is chosen.  ... 
arXiv:1906.04464v3 fatcat:n2d3abjxgbegnaoyyfyv6e7cwq

CIDER: Commonsense Inference for Dialogue Explanation and Reasoning [article]

Deepanway Ghosal, Pengfei Hong, Siqi Shen, Navonil Majumder, Rada Mihalcea, Soujanya Poria
2021 arXiv   pre-print
Commonsense inference to understand and explain human language is a fundamental research problem in natural language processing.  ...  Baseline results obtained with transformer-based models reveal that the tasks are difficult, paving the way for promising future research.  ...  ConceptNet is a semantic network with nodes composed of common words or phrases in their natural language form.  ... 
arXiv:2106.00510v2 fatcat:zjjt7g22gvdcdeyjd7b53wmrna

Learning to Generate Scene Graph from Natural Language Supervision [article]

Yiwu Zhong, Jing Shi, Jianwei Yang, Chenliang Xu, Yin Li
2021 arXiv   pre-print
Learning from only image-sentence pairs, our model achieves 30% relative gain over a latest method trained with human-annotated unlocalized scene graphs.  ...  Learning from image-text data has demonstrated recent success for many recognition tasks, yet is currently limited to visual features or individual visual concepts such as objects.  ...  Acknowledgement: YZ and YL acknowledge the support provided by the UW-Madison OVCRGE with funding from WARF. JS and CX were supported by the National Science Foundation (NSF) under Grant RI:1813709.  ... 
arXiv:2109.02227v1 fatcat:s5nn6gezmjdethihx5traqhy6y

Deep Fragment Embeddings for Bidirectional Image Sentence Mapping [article]

Andrej Karpathy, Armand Joulin, Li Fei-Fei
2014 arXiv   pre-print
We introduce a model for bidirectional retrieval of images and sentences through a multi-modal embedding of visual and natural language data.  ...  Additionally, our model provides interpretable predictions since the inferred inter-modal fragment alignment is explicit.  ...  Introduction There is significant value in the ability to associate natural language descriptions with images.  ... 
arXiv:1406.5679v1 fatcat:6geopbzasrb2rbhycbr767cwvq

Structured Knowledge Discovery from Massive Text Corpus [article]

Chenwei Zhang
2019 arXiv   pre-print
In particular, four problems are studied in this dissertation: Structured Intent Detection for Natural Language Understanding, Structure-aware Natural Language Modeling, Generative Structured Knowledge  ...  Nowadays, with the booming development of the Internet, people benefit from its convenience due to its open and sharing nature.  ...  The lexical-syntax joint representation, consisting of words along with POS tags, is shown to be effective in modeling both lexical (words) and syntax (POS tags) from the natural language text corpus in various  ... 
arXiv:1908.01837v1 fatcat:j46srlxblfd35cd4z6jkl43iiu

A Multi-task Learning Framework for Opinion Triplet Extraction [article]

Chen Zhang, Qiuchi Li, Dawei Song, Benyou Wang
2020 arXiv   pre-print
At the inference phase, the extraction of triplets is facilitated by a triplet decoding method based on the above outputs. We evaluate the proposed framework on four SemEval benchmarks for ABSA.  ...  parses sentiment dependencies between them with a biaffine scorer.  ...  Triplet Extraction-based Task Other than ABSA, a majority of triplet extraction-based tasks lie in the area of natural language processing.  ... 
arXiv:2010.01512v2 fatcat:opdz6ysgrvfjvptlk2bfz5f5nu

Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition [article]

Peipei Zhu, Xiao Wang, Yong Luo, Zhenglong Sun, Wei-Shi Zheng, Yaowei Wang, Changwen Chen
2022 arXiv   pre-print
Specifically, we adopt image-level labels for the optimization of the UIC model in a weakly-supervised manner.  ...  Furthermore, we design an unrecognized object (UnO) loss combined with a visual concept reward to improve the alignment of the inferred object and relationship information with the images.  ...  Then, the Recurrent Neural Network (RNN) is adopted to decode the features into multiple words one by one. This way, the natural description for the input image can be obtained.  ... 
arXiv:2203.03195v1 fatcat:ata2rtnxhnau5e4yk72r332ija

Seeking Common but Distinguishing Difference, A Joint Aspect-based Sentiment Analysis Model [article]

Hongjiang Jing, Zuchao Li, Hai Zhao, Shu Jiang
2021 arXiv   pre-print
In detail, we introduce a dual-encoder design, in which a pair encoder especially focuses on candidate aspect-opinion pair classification, and the original encoder keeps attention on sequence labeling.  ...  Therefore, we propose a joint ABSA model, which not only enjoys the benefits of encoder sharing but also focuses on the difference to improve the effectiveness of the model.  ...  Zhang et al. (2020a) proposed a multi-task learning approach with the aid of dependency parsing on tail word pair of corresponding aspect-opinion pair.  ... 
arXiv:2111.09634v1 fatcat:atyfflqpqrfgzayl7wzyiyitta

Paradigm Shift in Natural Language Processing [article]

Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang
2021 arXiv   pre-print
For example, we usually adopt the sequence labeling paradigm to solve a bundle of tasks such as POS-tagging, NER, Chunking, and adopt the classification paradigm to solve tasks like sentiment analysis.  ...  With the rapid progress of pre-trained language models, recent years have observed a rising trend of Paradigm Shift, which is solving one NLP task by reformulating it as another one.  ...  Enhanced LSTM for natural language inference.  ... 
arXiv:2109.12575v1 fatcat:vckeva3u3va3vjr6okhuztox4y

Object Relation Detection Based on One-shot Learning [article]

Li Zhou, Jian Zhao, Jianshu Li, Li Yuan, Jiashi Feng
2018 arXiv   pre-print
Detecting the relations among objects, such as "cat on sofa" and "person ride horse", is a crucial task in image understanding, and beneficial to bridging the semantic gap between images and natural language  ...  Despite the remarkable progress of deep learning in detection and recognition of individual objects, it is still a challenging task to localize and recognize the relations between objects due to the complex  ...  Acknowledgement The work of Jiashi Feng was partially supported by NUS startup R-263-000-C08-133, MOE Tier-I R-263-000-C21-112, NUS IDS R-263-000-C67-646 and ECRA R-263-000-C87-133.  ... 
arXiv:1807.05857v1 fatcat:btklk3yvafb33pkfzmwlzge42e

Detecting Visual Relationships with Deep Relational Networks [article]

Bo Dai, Yuqi Zhang, Dahua Lin
2017 arXiv   pre-print
At the heart of this framework is the Deep Relational Network, a novel formulation designed specifically for exploiting the statistical dependencies between objects and their relationships.  ...  Such approaches are faced with significant difficulties caused by the high diversity of visual appearance for each kind of relationships or the large number of distinct visual phrases.  ...  This compressed pair feature, together with the appearance features of individual objects will be fed to the DR-Net for joint inference.  ... 
arXiv:1704.03114v2 fatcat:k6nazub225b43ks3i6nzfkgpni

Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training [article]

Yehao Li, Jiahao Fan, Yingwei Pan, Ting Yao, Weiyao Lin, Tao Mei
2022 arXiv   pre-print
In this way, Uni-EDEN is endowed with the power of both multi-modal representation extraction and language modeling.  ...  Considering that the linguistic representations of each image can span different granularities in this hierarchy including, from simple to comprehensive, individual label, a phrase, and a natural sentence  ...  One of the early successes for natural language pre-training is GPT [33] that pre-trains a Transformer based language model to extract general language representations depending on unidirectional word  ... 
arXiv:2201.04026v1 fatcat:ntihxbz6yrcivj2aib4rrbazhu

Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning [article]

Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, In So Kweon
2019 arXiv   pre-print
Part-of-speech (POS, i.e. subject-object-predicate categories) tags can be assigned to every English word. We leverage the POS as a prior to guide the correct sequence of words in a caption.  ...  To this end, we propose a multi-task triple-stream network (MTTSNet) which consists of three recurrent units for the respective POS and jointly performs POS prediction and captioning.  ...  We tokenize the relational expressions to form natural language expressions, and for each word, we assign the POS class from the triplet association.  ... 
arXiv:1903.05942v4 fatcat:xhmhzxjwkngndpxwl5xuppim2i

Jointly Embedding Relations and Mentions for Knowledge Population [article]

Miao Fan, Kai Cao, Yifan He, Ralph Grishman
2015 arXiv   pre-print
This paper contributes a joint embedding model for predicting relations between a pair of entities in the scenario of relation inference.  ...  The proposed model simultaneously learns low-dimensional vector representations for both triplets in knowledge repositories and the mentions of relations in free texts, so that we can leverage the evidence  ...  Acknowledgments The first author conducted this research while he was a joint-supervision Ph.D. student in New York University. This paper is dedicated to all the members of the Proteus Project.  ... 
arXiv:1504.01683v4 fatcat:jn5q52rj6fdxrkbmn6npt5jf7q