Fine-grained Iterative Attention Network for Temporal Language Localization in Videos [article]

Xiaoye Qu, Pengwei Tang, Zhikang Zhou, Yu Cheng, Jianfeng Dong, Pan Zhou
2020 arXiv   pre-print
Finally, both video and query information is utilized to provide robust cross-modal representation for further moment localization.  ...  Temporal language localization in videos aims to ground one video segment in an untrimmed video based on a given sentence query.  ...  Natural Science Foundation (No.  ... 
arXiv:2008.02448v1 fatcat:lohkhfoiufd2ngc2ak737ha75y

MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment [article]

Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang, Larry S. Davis
2019 arXiv   pre-print
This research strives for natural language moment retrieval in long, untrimmed video streams.  ...  MAN naturally assigns candidate moment representations aligned with language semantics over different temporal locations and scales.  ...  Our contributions are as follows: • We propose a novel single-shot model for the natural language moment retrieval task, where language description is naturally integrated as dynamic filters into an end-to-end  ... 
arXiv:1812.00087v2 fatcat:cbxtybz4cnf3xbqiudlyj3rlm4

LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval [article]

Reuben Tan, Huijuan Xu, Kate Saenko, Bryan A. Plummer
2020 arXiv   pre-print
The goal of weakly-supervised video moment retrieval is to localize the video segment most relevant to the given natural language query without access to temporal annotations during training.  ...  Prior strongly- and weakly-supervised approaches often leverage co-attention mechanisms to learn visual-semantic representations for localization.  ...  Experiments We evaluate the capability of LoGAN to accurately localize video moments based on natural language queries without temporal annotations on two datasets: DiDeMo and Charades-STA.  ... 
arXiv:1909.13784v2 fatcat:btgosisk6bb4pklnwgpkojk53m

Language Guided Networks for Cross-modal Moment Retrieval [article]

Kun Liu, Huadong Ma, Chuang Gan
2020 arXiv   pre-print
We address the challenging task of cross-modal moment retrieval, which aims to localize a temporal segment from an untrimmed video described by a natural language query.  ...  In this paper, we present Language Guided Networks (LGN), a new framework that leverages the sentence embedding to guide the whole process of moment retrieval.  ...  Conclusion In this paper, we study the problem of moment localization with natural language and propose Language Guided Networks (LGN) to leverage the sentence embedding to guide the whole process of moment  ... 
arXiv:2006.10457v2 fatcat:eqnjjsvpvfc2darsrkayd3qvxm

Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language [article]

Songyang Zhang, Houwen Peng, Jianlong Fu, Jiebo Luo
2020 arXiv   pre-print
Based on the 2D map, we propose a Temporal Adjacent Network (2D-TAN), a single-shot framework for moment localization.  ...  It is capable of encoding the adjacent temporal relation, while learning discriminative features for matching video moments with referring expressions.  ...  a 2D Temporal Adjacent Network, i.e., 2D-TAN, for moment localization with natural language.  ... 
arXiv:1912.03590v3 fatcat:owznntu63bfzbatngsewcwxuuu

Control of Smart Home Operations Using Natural Language Processing, Voice Recognition and IoT Technologies in a Multi-Tier Architecture

George Alexakis, Spyros Panagiotakis, Alexander Fragkakis, Evangelos Markakis, Kostas Vassilakis
2019 Designs  
The IoT Agent integrates a chat bot that can understand text or voice commands using natural language processing (NLP).  ...  Our solution exploits several available Application Programming Interfaces (APIs), namely: the Dialogflow API for the efficient integration of NLP to our IoT system, the Web Speech API for enriching user  ...  It also integrated natural language processing with the help of a custom service called Engagement, which "interacted in the language of humans, and answered with context" [15].  ... 
doi:10.3390/designs3030032 fatcat:qbjdp4ygsnfgfinqkvvykwowny

Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language

Songyang Zhang, Houwen Peng, Jianlong Fu, Jiebo Luo
2020 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
Based on the 2D map, we propose a Temporal Adjacent Network (2D-TAN), a single-shot framework for moment localization.  ...  It is capable of encoding the adjacent temporal relation, while learning discriminative features for matching video moments with referring expressions.  ...  a 2D Temporal Adjacent Network, i.e., 2D-TAN, for moment localization with natural language.  ... 
doi:10.1609/aaai.v34i07.6984 fatcat:6imibis2ifb55gshcmsj5saq44

A Survey on Temporal Sentence Grounding in Videos [article]

Xiaohan Lan, Yitian Yuan, Xin Wang, Zhi Wang, Wenwu Zhu
2021 arXiv   pre-print
Different from the task of temporal action localization, TSGV is more flexible since it can locate complicated activities via natural languages, without restrictions from predefined action categories.  ...  Temporal sentence grounding in videos (TSGV), which aims to localize one target segment from an untrimmed video with respect to a given sentence query, has drawn increasing attention in the research community  ...  adjacent network for moment localization, figure from [83].  ... 
arXiv:2109.08039v2 fatcat:6ja4csssjzflhj426eggaf77tu

Multimodal Language Analysis with Recurrent Multistage Fusion

Paul Pu Liang, Ziyin Liu, AmirAli Bagher Zadeh, Louis-Philippe Morency
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
Temporal and intra-modal interactions are modeled by integrating our proposed fusion approach with a system of recurrent neural networks.  ...  Computational modeling of human multimodal language is an emerging research area in natural language processing spanning the language, visual and acoustic modalities.  ...  The authors thank Yao Chong Lim, Venkata Ramana Murthy Oruganti, Zhun Liu, Ying Shen, Volkan Cirik, and the anonymous reviewers for their constructive comments on this paper.  ... 
doi:10.18653/v1/d18-1014 dblp:conf/emnlp/LiangLZM18 fatcat:itt5akqzpjg4bluwwm42egd4h4

Ontology Based Strategies for Supporting Communication within Social Networks [chapter]

Ivan Kopeček, Radek Ošlejšek, Jaromír Plhák
2014 Lecture Notes in Computer Science  
dialogue communication in natural language.  ...  Exploiting a formal ontology approach facilitates the process of deriving information from relevant texts that can be found in the social network and it simultaneously forms a suitable framework for supporting  ...  Although the current functionality is still very limited, there is no social network integration nor any generation of ontologies from texts at the moment.  ... 
doi:10.1007/978-3-319-10816-2_69 fatcat:zm2rpgejmze4vlavggjhmmq72y

A Survey on Natural Language Video Localization [article]

Xinfang Liu, Xiushan Nie, Zhifang Tan, Jie Guo, Yilong Yin
2021 arXiv   pre-print
Natural language video localization (NLVL), which aims to locate a target moment from a video that semantically corresponds to a text query, is a novel and challenging task.  ...  Therefore, they proposed an Interaction-Integrated Network, where the network is able to capture long-range video structure information by overlaying Interaction-Integrated Cells, which is a module that  ...  INTRODUCTION Given a video and a query sentence described in natural language form, natural language video localization (NLVL) aims at finding the segment from the video that is relevant to the query description  ... 
arXiv:2104.00234v1 fatcat:zuqg6fn6mjafbf3zwqyslmauhy

Relation-aware Video Reading Comprehension for Temporal Language Grounding [article]

Jialin Gao, Xin Sun, Mengmeng Xu, Xi Zhou, Bernard Ghanem
2021 arXiv   pre-print
Temporal language grounding in videos aims to localize the temporal span relevant to the given query sentence.  ...  This paper will formulate temporal language grounding into video reading comprehension and propose a Relation-aware Network (RaNet) to address it.  ...  For each query-video pair, we have one natural language sentence and an associated groundtruth video moment with the start g s and end g e boundary.  ... 
arXiv:2110.05717v3 fatcat:td2rh26u4jhlhj5n4jrim4zl6q

BUILDING SOCIAL CAPITAL: A GRASSROOTS LANGUAGE PROGRAM FOR REFUGEES AND POLITICS OF INTEGRATION

Lika Rodin, Andre Rodin
2016 International Journal of Social Sciences  
This study looks at the format and effects of an informal education programme for refugees titled 'Capture the Moment and Learn Something New', organised by a person with a foreign background in a small-sized  ...  for hosting societies.  ...  Being limited in language fluency, the woman had to invent alternative means for a desirable interaction.  ... 
doi:10.20472/ss2016.5.3.003 fatcat:shilvmbk4zaxzjqljib2bh5udi

Language, Media and Community in the Information Age

Gábor Szécsi
2013 Santalka: Filosofija, Komunikacija  
On the other hand, in this essay I consider the assumption that the medium of the mediatization and new conceptualization of community is a specific, pictorial language of electronically mediated communication  ...  , for example, the following definition: community is a network of interactions between individuals who uniformly accept and apply some rules for the communicative acts aiming at the effective exchange  ...  By using the electronic communication technologies, a networked individual becomes a part of a network of interactions between humans who uniformly accept and apply some rules for the communicative acts  ... 
doi:10.3846/cpc.2013.12 fatcat:muwmo5hcurdm5gvoutjkc56m7m

IDEAIS: Smart Voice Assistants to Improve Interaction with SDIs [article]

Miguel Ángel Bernabé, Jacinto Estima, María Ester González, Carlos Granell, Carlos López-Vázquez, Miguel R. Luaces, Bruno Martins, Daniela Moctezuma, Diego Seco
2019 arXiv   pre-print
A critical goal is that organizations and citizens can easily access the geographic information required for good governance.  ...  In this position paper, we present IDEAIS, a research network composed of multiple Ibero-American partners to address this usability issue through the use of Intelligent Systems, in particular Smart Voice  ...  In particular, the use of natural language processing methods to user communication or assistant, in natural language, to obtain information and interact with the SDI is a novel, current, and relevant  ... 
arXiv:1910.04696v1 fatcat:2a7jrflfv5hr5aycyfoww7dh44