Wide-area motion imagery (WAMI) exploitation tools for enhanced situation awareness
2012
2012 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)
For situation awareness, multi-source data fusion using exploitation context from the video needs to be linked to semantically extracted elements that aid an operator in rapid image understanding. ...
To aid in the process of WAMI exploitation, we explore computer vision and pattern recognition methods to cue the user to salient information. ...
Explosive multimedia growth has closed the gap between the richness of linguistic communication, once limited to text and print, and video. ...
doi:10.1109/aipr.2012.6528198
dblp:conf/aipr/BlaschSPLC12
fatcat:iv3ek6pdpfdkxkgob5ogynr56e
Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the Baselines
[article]
2019
arXiv
pre-print
This framework illustrates the different facets of analysis to be considered while performing multimodal sentiment analysis and, hence, serves as a new benchmark for future research in this emerging field ...
Such videos often contain comparisons of products from competing brands, pros and cons of product specifications, and other information that can aid prospective buyers to make informed decisions. ...
[15] fused information from audio, visual and text modalities to extract emotion and sentiment. Metallinou et al. [9] fused audio and text modalities for emotion recognition. ...
arXiv:1803.07427v2
fatcat:jytchjl3gnbpjkyvp4kb3ih5tu
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
2016
arXiv
pre-print
This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos. ...
Specifically, we integrate both a neural language model and distributional semantics trained on large text corpora into a recent LSTM-based architecture for video description. ...
... of our late and deep fusion approaches to integrate an independently trained LM to aid video captioning. ...
arXiv:1604.01729v2
fatcat:gr6mmqbvkbfz7omrmthaqfbjn4
Gated Mechanism for Attention Based Multimodal Sentiment Analysis
[article]
2020
arXiv
pre-print
Multimodal sentiment analysis has recently gained popularity because of its relevance to social media posts, customer service calls and video blogs. ...
Fusion of unimodal and cross-modal cues. Out of these three, we find that learning cross-modal interactions is beneficial for this problem. ...
Since we have sequences of up to 100 utterances in a video, self-attention allows us to capture the long context. ... A minimal sketch of such a layer follows this entry.
arXiv:2003.01043v1
fatcat:t73swfhrx5e6ngtkysf65ebnri
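The long-context claim in this entry's snippet is easy to make concrete: a single self-attention layer lets every utterance representation attend to every other one in the video. A minimal PyTorch sketch, assuming pre-extracted utterance embeddings; the dimensions, head count, and single-layer layout are illustrative assumptions, not the paper's reported configuration:

```python
import torch
import torch.nn as nn

# Illustrative sizes; the paper's actual dimensions may differ.
NUM_UTTERANCES, EMBED_DIM = 100, 128

class UtteranceSelfAttention(nn.Module):
    """Self-attention over a sequence of utterance embeddings, so each
    utterance can draw context from any other utterance in the video."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, utterances: torch.Tensor) -> torch.Tensor:
        # utterances: (batch, seq_len, dim); attention spans the full
        # sequence, which is what captures 100-utterance-long context.
        out, _ = self.attn(utterances, utterances, utterances)
        return out

# Usage: one video with 100 utterance embeddings.
x = torch.randn(1, NUM_UTTERANCES, EMBED_DIM)
print(UtteranceSelfAttention(EMBED_DIM)(x).shape)  # torch.Size([1, 100, 128])
```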
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
2016
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos. ...
Specifically, we integrate both a neural language model and distributional semantics trained on large text corpora into a recent LSTM-based architecture for video description. ...
The deep fusion model learns jointly from the hidden representations of the LM and S2VT video-to-text model (Vid-LSTM), whereas the late fusion re-scores the softmax output of the video-to-text model. ... A schematic of both variants follows this entry.
doi:10.18653/v1/d16-1204
dblp:conf/emnlp/VenugopalanHMS16
fatcat:rkxhj2pojvdcpdovdwtfi2qqzm
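The late/deep distinction described in the snippet can be condensed to a few lines. A minimal numpy sketch, assuming a video-to-text model and a language model that emit distributions over a shared vocabulary at each decoding step; the log-linear weighting and the single output layer are common choices assumed here, not necessarily the paper's exact formulation:

```python
import numpy as np

VOCAB = 5  # toy vocabulary size

def late_fusion(p_video: np.ndarray, p_lm: np.ndarray, alpha: float = 0.7) -> np.ndarray:
    """Late fusion: re-score the video model's softmax output with the LM's.
    p_video, p_lm: per-step word distributions of shape (vocab,)."""
    scores = alpha * np.log(p_video) + (1.0 - alpha) * np.log(p_lm)
    probs = np.exp(scores - scores.max())
    return probs / probs.sum()

def deep_fusion(h_video: np.ndarray, h_lm: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Deep fusion: predict from the concatenated hidden states of the
    video model and the LM through a (learned) output layer W."""
    h = np.concatenate([h_video, h_lm])
    logits = W @ h
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

rng = np.random.default_rng(0)
p_v = rng.dirichlet(np.ones(VOCAB))
p_l = rng.dirichlet(np.ones(VOCAB))
print(late_fusion(p_v, p_l))

h_v, h_l = rng.standard_normal(8), rng.standard_normal(8)
W = rng.standard_normal((VOCAB, 16))
print(deep_fusion(h_v, h_l, W))
```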
Overview of contextual tracking approaches in information fusion
2013
Geospatial InfoFusion III
We describe five contextual information categories that support target tracking: (1) domain knowledge from a user to aid the information fusion process through selection, cueing, and analysis, (2) environment-to-hardware ...
Many information fusion solutions work well in the intended scenarios, but the applications, supporting data, and capabilities change over varying contexts. ...
Fig. 2. Data Fusion Information Group Model (L = Level). Fig. 3. Context modeling to support target tracking.
Fig. 4. Context in support of information fusion [8]. ...
doi:10.1117/12.2016312
fatcat:l24tj2zkqzgc7cvdsssuupipw4
Context-Aware Personal Navigation Using Embedded Sensor Fusion in Smartphones
2014
Sensors
The approach proposed in this paper uses low-cost sensors in a multi-level fusion scheme to improve the accuracy and robustness of a context-aware navigation system. ...
The basic idea is that mobile navigation services can provide different services based on different contexts, where contexts are related to the user's activity and the device placement. ... A rough two-level sketch follows this entry.
Acknowledgments This research was supported in part by research funds to Naser El-Sheimy from TECTERRA Commercialization and Research Centre, the Canada Research Chairs Program, and the Natural Science ...
doi:10.3390/s140405742
pmid:24670715
pmcid:PMC4029676
fatcat:ez4ktd6cjvanvp5qblmbs6erwa
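As a rough illustration of the multi-level idea, a first level classifies the context (user activity, device placement) and a second level adapts the sensor set feeding the navigation filter. A hypothetical Python sketch; the accelerometer-variance cue, thresholds, and sensor mappings are invented for illustration:

```python
import numpy as np

def classify_context(accel_window: np.ndarray) -> str:
    """Hypothetical first-level fusion: infer user activity from the
    variance of an accelerometer magnitude window (thresholds illustrative)."""
    var = accel_window.var()
    if var < 0.05:
        return "static"
    elif var < 1.0:
        return "walking"
    return "driving"

def select_sensors(context: str) -> list[str]:
    """Second level: adapt the sensor set to the detected context."""
    return {
        "static": ["gps"],                            # no motion model needed
        "walking": ["gps", "accel", "magnetometer"],  # step detection + heading
        "driving": ["gps", "gyro", "accel"],          # vehicle dead reckoning
    }[context]

window = np.random.default_rng(1).normal(0.0, 0.5, size=200)
ctx = classify_context(window)
print(ctx, "->", select_sensors(ctx))
```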
Multilogue-Net: A Context Aware RNN for Multi-modal Emotion Detection and Sentiment Analysis in Conversation
[article]
2020
arXiv
pre-print
Current systems dealing with multi-modal functionality fail to leverage and capture: the context of the conversation through all modalities, the dependency between the listener(s) and speaker emotional ...
In this paper, we propose an end-to-end RNN architecture that attempts to take into account all the mentioned drawbacks. ...
The usage of only the text representation as input to the context GRUs has been observed to be key to the results, as the context of the conversation would be better captured by textual information than ... A minimal sketch of such a context GRU follows this entry.
arXiv:2002.08267v3
fatcat:bnvmzp2kuzcxfhtuhxaocbcqd4
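The design choice highlighted above, feeding only the text representation to the context GRUs, is simple to sketch. A minimal PyTorch version, assuming per-utterance text embeddings; the sizes and the single-GRU layout are assumptions, not the full Multilogue-Net architecture:

```python
import torch
import torch.nn as nn

TEXT_DIM, CTX_DIM = 100, 64  # illustrative dimensions

class TextContextGRU(nn.Module):
    """Tracks conversational context from text alone: each utterance's text
    embedding updates a GRU state that summarises the dialogue so far."""
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(TEXT_DIM, CTX_DIM, batch_first=True)

    def forward(self, text_seq: torch.Tensor) -> torch.Tensor:
        # text_seq: (batch, utterances, TEXT_DIM)
        context, _ = self.gru(text_seq)
        return context  # per-utterance conversational context states

utts = torch.randn(2, 10, TEXT_DIM)  # 2 conversations, 10 utterances each
print(TextContextGRU()(utts).shape)  # torch.Size([2, 10, 64])
```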
A reranking approach for context-based concept fusion in video indexing and retrieval
2007
Proceedings of the 6th ACM international conference on Image and video retrieval - CIVR '07
We propose to incorporate hundreds of pre-trained concept detectors to provide contextual information for improving the performance of multimodal video search. ...
The approach takes initial search results from established video search methods (which typically are conservative in their usage of concept detectors) and mines these results to discover and leverage co-occurrence ... A condensed sketch of this reranking pattern follows this entry.
An extension of PRFB from the text domain to video search is to simply apply the method to text searches over text modalities from video sources (such as the speech recognition transcripts) [7]. ...
doi:10.1145/1282280.1282331
dblp:conf/civr/KennedyC07
fatcat:7gwjpfya7reetji5ob5jfofffu
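The reranking pattern described in this snippet can be condensed as follows: treat the top of the initial ranking as pseudo-relevant, find which pre-trained concept detectors fire unusually often there, and boost shots that score high on those concepts. A hypothetical numpy sketch of that pattern, not the authors' exact estimator:

```python
import numpy as np

def rerank(initial_scores: np.ndarray, concept_scores: np.ndarray,
           top_k: int = 50, weight: float = 0.3) -> np.ndarray:
    """initial_scores: (n_shots,) scores from the baseline search method.
    concept_scores: (n_shots, n_concepts) pre-trained detector outputs.
    Mines concepts that co-occur with top-ranked shots, then blends them in."""
    top = np.argsort(initial_scores)[::-1][:top_k]
    # How much more often does each concept fire in the pseudo-relevant set
    # than in the collection overall?
    lift = concept_scores[top].mean(axis=0) - concept_scores.mean(axis=0)
    context = concept_scores @ np.clip(lift, 0.0, None)
    return initial_scores + weight * context

rng = np.random.default_rng(2)
base = rng.random(200)
concepts = rng.random((200, 30))
print(np.argsort(rerank(base, concepts))[::-1][:5])  # reranked top-5 shot ids
```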
Front Matter: Volume 8050
2011
Signal Processing, Sensor Fusion, and Target Recognition XX
The publisher is not responsible for the validity of the information or for any outcomes resulting from reliance thereon. ...
Utilization of CIDs allows articles to be fully citable as soon as they are published online, and connects the same identifier to all online, print, and electronic versions of the publication. ...
[Figure: fusion of data from video sensors and human observer reports, in text with possible reference to a map, into tracks with kinematic state and activity estimates.]
doi:10.1117/12.899094
fatcat:mn5fo3c7ijfadnlmjcjcclpal4
Dynamic Data-driven Application System (DDDAS) for Video Surveillance User Support
2015
Procedia Computer Science
Information access includes multimedia fusion of query-based text, images, and exploited tracks which can be utilized for context assessment, content-based information retrieval (CBIR), and situation awareness ...
Inspired by Level 5 Information Fusion 'user refinement', a live-video computing (LVC) structure is presented for user-based query access to database-managed information. ...
To access the information, information fusion modeling is needed to provide context (Nguyen et al., 2013). Information fusion has been applied to many applications. ...
doi:10.1016/j.procs.2015.05.359
fatcat:cc65hagfw5fb3bwjrq3yelv2x4
Semantic Video Search
2007
14th International Conference of Image Analysis and Processing - Workshops (ICIAPW 2007)
The MediaMill Challenge divides the generic video indexing problem into a visual-only, textual-only, early fusion, late fusion, and combined analysis experiment. ... A toy early-versus-late fusion comparison follows this entry.
In this paper we describe the current performance of our MediaMill system as presented in the TRECVID 2006 benchmark for video search engines. ...
Acknowledgments This research is sponsored by the BSIK MultimediaN project, the NWO MuNCH project, and the EU 6th Framework project VIDI-Video. ...
doi:10.1109/iciapw.2007.39
fatcat:lsb6nuqx4nefjb2higadnf47ze
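The early/late split used by the MediaMill Challenge is simple to state in code: early fusion trains one classifier on concatenated features, late fusion averages the outputs of per-modality classifiers. A schematic scikit-learn sketch on synthetic data; the classifier choice and equal weighting are assumptions, not the MediaMill implementation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 200
visual = rng.standard_normal((n, 20))   # stand-in visual features
textual = rng.standard_normal((n, 10))  # stand-in textual features
y = (visual[:, 0] + textual[:, 0] > 0).astype(int)  # toy concept labels

# Early fusion: one classifier over the concatenated feature vector.
early = LogisticRegression().fit(np.hstack([visual, textual]), y)

# Late fusion: per-modality classifiers, probabilities averaged afterwards.
vis_clf = LogisticRegression().fit(visual, y)
txt_clf = LogisticRegression().fit(textual, y)
late_probs = 0.5 * vis_clf.predict_proba(visual)[:, 1] \
           + 0.5 * txt_clf.predict_proba(textual)[:, 1]

print(early.score(np.hstack([visual, textual]), y))  # early-fusion accuracy
print(((late_probs > 0.5) == y).mean())              # late-fusion accuracy
```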
Online Reranking via Ordinal Informative Concepts for Context Fusion in Concept Detection and Video Search
2009
IEEE transactions on circuits and systems for video technology (Print)
Being largely unsupervised, the reranking approach to context fusion can be applied equally well to concept detection and video search. ...
To exploit the co-occurrence patterns of semantic concepts while keeping the simplicity of context fusion, a novel reranking approach is proposed in this paper. ... A rank-based sketch of this flavour of reranking follows this entry.
... videos, photo collections, broadcast news videos, and media sharing in the emerging social networks ...
doi:10.1109/tcsvt.2009.2026978
fatcat:vnkonvp4mvairpjeaoknam5s24
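The 'ordinal' qualifier suggests operating on rank order rather than raw detector scores. A hypothetical numpy sketch in that spirit, not the published algorithm: convert each informative concept's detector output to normalized ranks and blend them with the initial ranking.

```python
import numpy as np

def ordinal_rerank(initial_scores, concept_scores, informative, weight=0.3):
    """Rank-based context fusion: replace raw detector scores with their
    ranks (ordinal information), then blend with the initial ranking.
    initial_scores: (n,); concept_scores: (n, k); informative: concept ids."""
    def to_ranks(x):
        # 0 = lowest-scoring item, 1 = highest-scoring item
        return np.argsort(np.argsort(x)) / (len(x) - 1)
    base = to_ranks(initial_scores)
    context = np.mean([to_ranks(concept_scores[:, c]) for c in informative], axis=0)
    return base + weight * context

rng = np.random.default_rng(4)
scores = rng.random(100)
concepts = rng.random((100, 20))
print(np.argsort(ordinal_rerank(scores, concepts, [1, 5, 7]))[::-1][:5])
```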
Vision-Aided Context-Aware Framework for Personal Navigation Services
2012
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
CONTEXT INFORMATION IN PNS: In order to achieve a context-aware "vision-aided pedestrian navigation" system, two important questions must be answered: what type of context is important for such a system ...
For example, when the context information shows that the device is in "texting" or "talking" mode, observations from the camera can be integrated with the GPS sensor to improve and validate the pedestrian dead-reckoning ...
doi:10.5194/isprsarchives-xxxix-b4-231-2012
fatcat:lwe5plsb7rf27b3ap3xog36nza
When did you become so smart, oh wise one?! Sarcasm Explanation in Multi-modal Multi-party Dialogues
[article]
2022
arXiv
pre-print
We propose MAF (Modality Aware Fusion), a multimodal context-aware attention and global information fusion module to capture multimodality, and use it to benchmark WITS. ... A gated cross-attention sketch in this spirit follows this entry.
The proposed attention module surpasses the traditional multimodal fusion baselines and reports the best performance on almost all metrics. ...
Acknowledgement The authors would like to acknowledge the support of the Ramanujan Fellowship (SERB, India), Infosys Centre for AI (CAI) at IIIT-Delhi, and ihub-Anubhuti-iiitd Foundation set up under the ...
arXiv:2203.06419v1
fatcat:x6ue2y6a65g4fbsoir2a2uifou
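To make 'context-aware attention plus global information fusion' concrete: text attends over a non-text modality, and a learned gate controls how much of the attended signal is mixed back in. A minimal PyTorch sketch in the spirit of such a module; the dimensions, gating form, and residual layout are assumptions, not the published MAF design:

```python
import torch
import torch.nn as nn

DIM = 64  # illustrative shared feature dimension

class GatedCrossModalFusion(nn.Module):
    """Text attends over another modality; a sigmoid gate controls how much
    of the attended signal is fused into the text representation."""
    def __init__(self, dim: int = DIM, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, text: torch.Tensor, other: torch.Tensor) -> torch.Tensor:
        # Text queries the other modality (e.g., audio or video frames).
        attended, _ = self.attn(text, other, other)
        g = torch.sigmoid(self.gate(torch.cat([text, attended], dim=-1)))
        return text + g * attended  # gated residual fusion

text = torch.randn(2, 12, DIM)   # 12 text tokens per dialogue
audio = torch.randn(2, 30, DIM)  # 30 audio frames per dialogue
print(GatedCrossModalFusion()(text, audio).shape)  # torch.Size([2, 12, 64])
```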
Showing results 1 — 15 out of 11,505 results