
Wide-area motion imagery (WAMI) exploitation tools for enhanced situation awareness

Erik Blasch, Guna Seetharaman, Kannappan Palaniappan, Haibin Ling, Genshe Chen
2012 2012 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)  
Multi-source data fusion using exploitation context from the video needs to be linked to semantically extracted elements for situation awareness to aid an operator in rapid image understanding.  ...  To aid in the process of WAMI exploitation, we explore computer vision and pattern recognition methods to cue the user to salient information.  ...  The explosive growth of multimedia has closed the gap between the richness of linguistic communication, once limited to text and print, and video.  ...
doi:10.1109/aipr.2012.6528198 dblp:conf/aipr/BlaschSPLC12 fatcat:iv3ek6pdpfdkxkgob5ogynr56e

Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the Baselines [article]

Soujanya Poria, Navonil Majumder, Devamanyu Hazarika, Erik Cambria, Alexander Gelbukh, Amir Hussain
2019 arXiv   pre-print
This framework illustrates the different facets of analysis to be considered while performing multimodal sentiment analysis and, hence, serves as a new benchmark for future research in this emerging field  ...  Such videos often contain comparisons of products from competing brands, pros and cons of product specifications, and other information that can aid prospective buyers to make informed decisions.  ...  [15] fused information from audio, visual and text modalities to extract emotion and sentiment. Metallinou et al. [9] fused audio and text modalities for emotion recognition.  ... 
arXiv:1803.07427v2 fatcat:jytchjl3gnbpjkyvp4kb3ih5tu
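The audio–visual–text fusion the snippet alludes to (e.g., [15]) can be illustrated with a minimal feature-level fusion classifier. This is a sketch only: the feature dimensions, layer sizes, and class count below are illustrative assumptions, not the architecture from the paper.

```python
import torch
import torch.nn as nn

class EarlyFusionSentiment(nn.Module):
    """Minimal feature-level (early) fusion: concatenate per-utterance
    audio, visual, and text features, then classify sentiment.
    All dimensions are illustrative assumptions."""
    def __init__(self, d_audio=74, d_visual=35, d_text=300, n_classes=2):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(d_audio + d_visual + d_text, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, audio, visual, text):
        fused = torch.cat([audio, visual, text], dim=-1)  # (batch, d_total)
        return self.classifier(fused)

# toy usage with random features standing in for real extractors
model = EarlyFusionSentiment()
logits = model(torch.randn(4, 74), torch.randn(4, 35), torch.randn(4, 300))
print(logits.shape)  # torch.Size([4, 2])
```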

Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text

Subhashini Venugopalan, Lisa Anne Hendricks, Raymond Mooney, Kate Saenko
2016 arXiv   pre-print
This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos.  ...  Specifically, we integrate both a neural language model and distributional semantics trained on large text corpora into a recent LSTM-based architecture for video description.  ...  of our late and deep fusion approaches to integrate an independently trained LM to aid video captioning.  ... 
arXiv:1604.01729v2 fatcat:gr6mmqbvkbfz7omrmthaqfbjn4

Gated Mechanism for Attention Based Multimodal Sentiment Analysis [article]

Ayush Kumar, Jithendra Vepa
2020 arXiv   pre-print
Multimodal sentiment analysis has recently gained popularity because of its relevance to social media posts, customer service calls and video blogs.  ...  Fusion of unimodal and cross-modal cues. Out of these three, we find that learning cross-modal interactions is beneficial for this problem.  ...  Since we have sequences of up to 100 utterances in a video, self-attention allows us to capture the long context.  ...
arXiv:2003.01043v1 fatcat:t73swfhrx5e6ngtkysf65ebnri
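A minimal sketch of the two ingredients named in the snippet: a gate that blends unimodal and cross-modal cues, and self-attention supplying long-range context over a sequence of up to 100 utterances. The sigmoid gate and all dimensions are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class GatedCrossModalFusion(nn.Module):
    """Sketch of a gated fusion unit: a sigmoid gate decides, per
    dimension, how much cross-modal signal to mix into the unimodal
    representation. The paper's gating may differ in detail."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, unimodal, cross_modal):
        g = torch.sigmoid(self.gate(torch.cat([unimodal, cross_modal], dim=-1)))
        return g * unimodal + (1 - g) * cross_modal

# self-attention over a sequence of up to 100 utterances
attn = nn.MultiheadAttention(embed_dim=128, num_heads=4, batch_first=True)
utts = torch.randn(2, 100, 128)        # (batch, utterances, dim)
context, _ = attn(utts, utts, utts)    # long-range utterance context
fused = GatedCrossModalFusion(128)(utts, context)
print(fused.shape)  # torch.Size([2, 100, 128])
```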

Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text

Subhashini Venugopalan, Lisa Anne Hendricks, Raymond Mooney, Kate Saenko
2016 Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing  
This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos.  ...  Specifically, we integrate both a neural language model and distributional semantics trained on large text corpora into a recent LSTM-based architecture for video description.  ...  The deep fusion model learns jointly from the hidden representations of the LM and S2VT video-to-text model (Vid-LSTM), whereas the late fusion re-scores the softmax output of the video-to-text model.  ... 
doi:10.18653/v1/d16-1204 dblp:conf/emnlp/VenugopalanHMS16 fatcat:rkxhj2pojvdcpdovdwtfi2qqzm
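The two integration schemes named in the snippet can be sketched directly: late fusion re-scores the video-to-text model's softmax output with the LM's, while deep fusion learns jointly from the two hidden representations. Vocabulary size, hidden size, and the interpolation weight below are placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, d = 1000, 512
video_logits = torch.randn(1, vocab)  # next-word logits from the video-to-text LSTM
lm_logits = torch.randn(1, vocab)     # next-word logits from the language model

# Late fusion: re-score the video model's softmax with the LM's softmax.
alpha = 0.3  # interpolation weight (assumed hyperparameter)
late = (1 - alpha) * F.softmax(video_logits, dim=-1) \
     + alpha * F.softmax(lm_logits, dim=-1)

# Deep fusion: predict jointly from the hidden states of both models.
h_video, h_lm = torch.randn(1, d), torch.randn(1, d)
proj = nn.Linear(2 * d, vocab)
deep = F.softmax(proj(torch.cat([h_video, h_lm], dim=-1)), dim=-1)

print(late.shape, deep.shape)  # both torch.Size([1, 1000])
```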

Overview of contextual tracking approaches in information fusion

Erik Blasch, Jesus Garcia Herrero, Lauro Snidaro, James Llinas, Guna Seetharaman, Kannappan Palaniappan, Matthew F. Pellechia, Richard J. Sorensen
2013 Geospatial InfoFusion III  
We describe five contextual information categories that support target tracking: (1) domain knowledge from a user to aid the information fusion process through selection, cueing, and analysis, (2) environment-to-hardware  ...  Many information fusion solutions work well in the intended scenarios; but the applications, supporting data, and capabilities change over varying contexts.  ...  Fig. 2. Data Fusion Information Group Model (L = Level). Fig. 3. Context Modeling to support target tracking. Fig. 4. Context in Support of Information Fusion [8].  ...
doi:10.1117/12.2016312 fatcat:l24tj2zkqzgc7cvdsssuupipw4

Context-Aware Personal Navigation Using Embedded Sensor Fusion in Smartphones

Sara Saeedi, Adel Moussa, Naser El-Sheimy
2014 Sensors  
The approach proposed in this paper uses low-cost sensors in a multi-level fusion scheme to improve the accuracy and robustness of a context-aware navigation system.  ...  The basic idea is that mobile navigation services can provide different services based on different contexts, where contexts are related to the user's activity and the device placement.  ...  Acknowledgments This research was supported in part by research funds to Naser El-Sheimy from TECTERRA Commercialization and Research Centre, the Canada Research Chairs Program, and the Natural Science  ...
doi:10.3390/s140405742 pmid:24670715 pmcid:PMC4029676 fatcat:ez4ktd6cjvanvp5qblmbs6erwa
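A toy illustration of the multi-level idea: low-level accelerometer features feed a context (activity) decision that a navigation service could then act on. The features and thresholds are invented for illustration; the paper's actual fusion scheme is more elaborate.

```python
import numpy as np

def accel_features(window):
    """Per-window features from a 3-axis accelerometer (rows = samples):
    mean magnitude, magnitude variability, and mean jerk."""
    mag = np.linalg.norm(window, axis=1)
    return np.array([mag.mean(), mag.std(), np.abs(np.diff(mag)).mean()])

def classify_activity(feats, still_thresh=0.05, walk_thresh=0.5):
    """Toy threshold classifier; a real system would train a model."""
    if feats[1] < still_thresh:
        return "static"
    return "walking" if feats[1] < walk_thresh else "running"

window = np.random.normal(0.0, 0.3, size=(128, 3))  # 128 samples, 3 axes
print(classify_activity(accel_features(window)))
```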

Multilogue-Net: A Context Aware RNN for Multi-modal Emotion Detection and Sentiment Analysis in Conversation [article]

Aman Shenoy, Ashish Sardana
2020 arXiv   pre-print
Current systems dealing with multi-modal functionality fail to leverage and capture the context of the conversation through all modalities, the dependency between the listener(s) and speaker emotional  ...  In this paper, we propose an end-to-end RNN architecture that attempts to take into account all the mentioned drawbacks.  ...  The usage of only the text representation as input to the context GRUs has been observed to be key to the results, as the context of the conversation would be better captured by textual information than  ...
arXiv:2002.08267v3 fatcat:bnvmzp2kuzcxfhtuhxaocbcqd4
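A minimal sketch of the observation quoted above, where only per-utterance text embeddings drive the context GRUs; dimensions and label count are assumptions, not Multilogue-Net's actual configuration.

```python
import torch
import torch.nn as nn

class TextContextGRU(nn.Module):
    """Sketch: a GRU over per-utterance *text* embeddings supplies the
    conversational context used for per-utterance emotion prediction."""
    def __init__(self, d_text=100, d_ctx=150, n_labels=6):
        super().__init__()
        self.context_gru = nn.GRU(d_text, d_ctx, batch_first=True)
        self.out = nn.Linear(d_ctx, n_labels)

    def forward(self, text_seq):             # (batch, utterances, d_text)
        ctx, _ = self.context_gru(text_seq)  # context state at each utterance
        return self.out(ctx)                 # per-utterance emotion logits

logits = TextContextGRU()(torch.randn(2, 12, 100))
print(logits.shape)  # torch.Size([2, 12, 6])
```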

A reranking approach for context-based concept fusion in video indexing and retrieval

Lyndon S. Kennedy, Shih-Fu Chang
2007 Proceedings of the 6th ACM international conference on Image and video retrieval - CIVR '07  
We propose to incorporate hundreds of pre-trained concept detectors to provide contextual information for improving the performance of multimodal video search.  ...  The approach takes initial search results from established video search methods (which typically are conservative in their usage of concept detectors) and mines these results to discover and leverage co-occurrence  ...  An extension of PRFB from the text domain to video search is to simply apply the method to text searches over text modalities from video sources (such as the speech recognition transcripts) [7].  ...
doi:10.1145/1282280.1282331 dblp:conf/civr/KennedyC07 fatcat:7gwjpfya7reetji5ob5jfofffu
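A hedged sketch of the reranking idea: treat the top of the initial result list as pseudo-relevant, mine which concept detectors respond more strongly there, and re-score all shots by that contextual evidence. The weighting and blend factor are assumptions, not the authors' exact method.

```python
import numpy as np

def rerank_with_concepts(initial_scores, concept_scores, top_k=50):
    """initial_scores: (n_shots,) scores from the baseline search.
    concept_scores: (n_shots, n_concepts) pre-trained detector outputs."""
    top = np.argsort(initial_scores)[::-1][:top_k]  # pseudo-relevant shots
    # concepts over-represented at the top of the initial ranking
    weights = concept_scores[top].mean(axis=0) - concept_scores.mean(axis=0)
    context = concept_scores @ weights              # contextual evidence
    context = (context - context.min()) / (np.ptp(context) + 1e-9)
    return 0.5 * initial_scores + 0.5 * context     # fused ranking score

scores = np.random.rand(200)
concepts = np.random.rand(200, 374)  # e.g. hundreds of concept detectors
print(np.argsort(rerank_with_concepts(scores, concepts))[::-1][:5])
```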

Front Matter: Volume 8050

Proceedings of SPIE, Ivan Kadar
2011 Signal Processing, Sensor Fusion, and Target Recognition XX  
The publisher is not responsible for the validity of the information or for any outcomes resulting from reliance thereon.  ...  Utilization of CIDs allows articles to be fully citable as soon as they are published online, and connects the same identifier to all online, print, and electronic versions of the publication.  ...
doi:10.1117/12.899094 fatcat:mn5fo3c7ijfadnlmjcjcclpal4

Dynamic Data-driven Application System (DDDAS) for Video Surveillance User Support

Erik P. Blasch, Alex J. Aved
2015 Procedia Computer Science  
Information access includes multimedia fusion of query-based text, images, and exploited tracks, which can be utilized for context assessment, content-based information retrieval (CBIR), and situation awareness  ...  Inspired by Level 5 Information Fusion 'user refinement', a live-video computing (LVC) structure is presented for user-based query access to database-managed information.  ...  To access the information, information fusion modeling is needed to provide context (Nguyen, et al., 2013). Information fusion has been applied to many applications.  ...
doi:10.1016/j.procs.2015.05.359 fatcat:cc65hagfw5fb3bwjrq3yelv2x4

Semantic Video Search

A.W.M. Smeulders, J.C. van Gemert, B. Huurnink, D.C. Koelma, O. de Rooij, K.E.A. van de Sande, C.G.M. Snoek, C.J. Veenman, M. Worring
2007 14th International Conference of Image Analysis and Processing - Workshops (ICIAPW 2007)  
The MediaMill Challenge divides the generic video indexing problem into visual-only, textual-only, early fusion, late fusion, and combined analysis experiments.  ...  In this paper we describe the current performance of our MediaMill system as presented in the TRECVID 2006 benchmark for video search engines.  ...  Acknowledgments This research is sponsored by the BSIK MultimediaN project, the NWO MuNCH project, and the EU 6th Framework project VIDI-Video.  ...
doi:10.1109/iciapw.2007.39 fatcat:lsb6nuqx4nefjb2higadnf47ze

Online Reranking via Ordinal Informative Concepts for Context Fusion in Concept Detection and Video Search

Y.-H. Yang, W.H. Hsu, H.H. Chen
2009 IEEE Transactions on Circuits and Systems for Video Technology (Print)  
Being largely unsupervised, the reranking approach to context fusion can be applied equally well to concept detection and video search.  ...  To exploit the co-occurrence patterns of semantic concepts while keeping the simplicity of context fusion, a novel reranking approach is proposed in this paper.  ...  videos, photo collections, broadcast news videos, and media sharing in the emerging social networks  ...
doi:10.1109/tcsvt.2009.2026978 fatcat:vnkonvp4mvairpjeaoknam5s24
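One way to read the "ordinal" aspect is to work with ranks of concept detector outputs rather than their raw scores; the sketch below picks the concepts whose rank order best correlates with the initial result list and blends them in. Every weighting detail here is an assumption, not the paper's exact method.

```python
import numpy as np

def ranks(x):
    """Ordinal ranks, 1 = lowest value."""
    r = np.empty(len(x))
    r[np.argsort(x)] = np.arange(1, len(x) + 1)
    return r

def ordinal_rerank(initial_scores, concept_scores, n_informative=10):
    """Rerank using rank-correlated ('informative') concepts."""
    init_rank = ranks(initial_scores)
    concept_ranks = np.apply_along_axis(ranks, 0, concept_scores)
    # rank correlation between each concept and the initial list
    corr = np.array([np.corrcoef(init_rank, concept_ranks[:, j])[0, 1]
                     for j in range(concept_scores.shape[1])])
    top = np.argsort(-np.abs(corr))[:n_informative]  # informative concepts
    context = (concept_ranks[:, top] * np.sign(corr[top])).mean(axis=1)
    return init_rank + context                       # higher = better

scores, concepts = np.random.rand(100), np.random.rand(100, 50)
print(np.argsort(-ordinal_rerank(scores, concepts))[:5])
```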

VISION-AIDED CONTEXT-AWARE FRAMEWORK FOR PERSONAL NAVIGATION SERVICES

S. Saeedi, A. Moussa, N. El-Sheimy
2012 The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences  
In order to achieve a context-aware "vision-aided pedestrian navigation" system, two important questions must be answered: what type of context is important for such a system  ...  For example, when the context information shows that the device is in "texting" or "talking" mode, the observation from the camera can be integrated with the GPS sensor to improve and validate the pedestrian dead-reckoning  ...
doi:10.5194/isprsarchives-xxxix-b4-231-2012 fatcat:lwe5plsb7rf27b3ap3xog36nza
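The example in the snippet reduces to a small decision rule: the detected device context gates which observations enter the fusion filter. A toy version, with the mode names taken from the quote and everything else assumed:

```python
def select_observations(context):
    """When the device context is 'texting' or 'talking', the camera view
    is usable, so vision updates can be fused with GPS to aid
    dead-reckoning; otherwise fall back to IMU + GPS only."""
    sensors = ["imu", "gps"]
    if context in ("texting", "talking"):
        sensors.append("camera")  # vision-aided update is meaningful
    return sensors

for ctx in ("texting", "pocket", "talking"):
    print(ctx, "->", select_observations(ctx))
```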

When did you become so smart, oh wise one?! Sarcasm Explanation in Multi-modal Multi-party Dialogues [article]

Shivani Kumar, Atharva Kulkarni, Md Shad Akhtar, Tanmoy Chakraborty
2022 arXiv   pre-print
We propose MAF (Modality Aware Fusion), a multimodal context-aware attention and global information fusion module to capture multimodality and use it to benchmark WITS.  ...  The proposed attention module surpasses the traditional multimodal fusion baselines and reports the best performance on almost all metrics.  ...  Acknowledgement The authors would like to acknowledge the support of the Ramanujan Fellowship (SERB, India), Infosys Centre for AI (CAI) at IIIT-Delhi, and ihub-Anubhuti-iiitd Foundation set up under the  ... 
arXiv:2203.06419v1 fatcat:x6ue2y6a65g4fbsoir2a2uifou