A multi-modal system for the retrieval of semantic video events

Arnon Amir, Sankar Basu, Giridharan Iyengar, Ching-Yung Lin, Milind Naphade, John R. Smith, Savitha Srinivasan, Belle Tseng
2004 Computer Vision and Image Understanding  
A framework for event detection is proposed where events, objects, and other semantic concepts are detected from video using trained classifiers.  ...  These classifiers are used to automatically annotate video with semantic labels, which in turn are used to search for new, untrained types of events and semantic concepts.  ...  Acknowledgments We are very grateful to Paul Over and Ramazan Taban, NIST, for organizing the video track.  ... 
doi:10.1016/j.cviu.2004.02.006 fatcat:enuxpfgaxbggphy6g2fwsgdmti

Content-based video indexing for sports applications using integrated multi-modal approach

Dian Tjondronegoro, Yi-Ping Phoebe Chen, Binh Pham
2005 Proceedings of the 13th annual ACM international conference on Multimedia - MULTIMEDIA '05  
This doctoral consists of a research work based on an integrated multi-modal approach for sports video indexing and retrieval.  ...  To sustain an ongoing rapid growth of video information, there is an emerging demand for a sophisticated content-based video indexing system.  ... 
doi:10.1145/1101149.1101362 dblp:conf/mm/TjondronegoroCP05 fatcat:r7flpxbh4fbfrbtnepauzo26du

Exploiting Multimedia Content: A Machine Learning Based Approach

Ehtesham Hassan
2014 ELCVIA Electronic Letters on Computer Vision and Image Analysis  
An experimental evaluation of the framework is shown for semantic event detection in sport videos, and semantic labelling of components of multi-modal document images.  ...  The combination is also explored for semantic concept recognition using multi-modal components of the same document, and different documents over a collection.  ... 
doi:10.5565/rev/elcvia.598 fatcat:mmeqgte4pzedzbtuukeepqxlgm

Semantic-Based Video Retrieval Survey

Shaimaa Toriah Mohamed Toriah, Atef Zaki Ghalwash, Aliaa A. A. Youssif
2018 Journal of Computer and Communications  
Moreover, the different methods that bridge the semantic gap in video retrieval are discussed in more details.  ...  Video represents a rich source of information. Thus, there is an urgent need to retrieve, organize, and automate videos.  ...  Conflicts of Interest The authors declare no conflicts of interest regarding the publication of this paper.  ... 
doi:10.4236/jcc.2018.68003 fatcat:qfep2py7ufhwxea7vazpujltja

Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues

W. H. Adams, Giridharan Iyengar, Ching-Yung Lin, Milind Ramesh Naphade, Chalapathy Neti, Harriet J. Nock, John R. Smith
2003 EURASIP Journal on Advances in Signal Processing  
We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of concepts in the lexicon.  ...  In this paper we present a learning-based approach to semantic indexing of multimedia content using cues derived from audio, visual and text features.  ...  Figure 1: Diagram of Semantic Concept Analysis System. Figure 2: A Multi-modal Annotation Interface.  ... 
doi:10.1155/s1110865703211173 fatcat:rwkygctgzjfx3fey7e722djxgq

Guest editorial: web multimedia semantic inference using multi-cues

Yahong Han, Yi Yang, Xiaofang Zhou
2015 World wide web (Bussum)  
A multi-modal semantic graph is constructed to find the embedded manifold cross-media correlations. The proposed method shows good performance in cross-media retrieval for image-audio dataset.  ...  Multi-modality and cross-media analysis are typical ways of multi-cue analysis in Web multimedia semantic inference.  ... 
doi:10.1007/s11280-015-0360-2 fatcat:vc4plge5qvg7hfmza3dffmawki

A Survey of Data Representation for Multi-Modality Event Detection and Evolution

Kejing Xiao, Zhaopeng Qian, Biao Qin
2022 Applied Sciences  
The goal of multi-modality event detection is to discover events from a huge amount of online data with different data structures, such as texts, images and videos.  ...  Next, we discuss the techniques of data representation for event detection, including textual, visual, and multi-modality content. Finally, we review event evolution under multi-modality data.  ...  [110] proposed a system for event extraction and retrieval called EventSearch.  ... 
doi:10.3390/app12042204 fatcat:5gpezz6yhjejlmdzr5fhpgka6m

Informedia @ TRECVID 2018: Ad-hoc Video Search, Video to Text Description, Activities in Extended video

Jia Chen, Shizhe Chen, Qin Jin, Alexander G. Hauptmann, Po-Yao Huang, Junwei Liang, Vaibhav, Xiaojun Chang, Jiang Liu, Ting-Yao Hu, Wenhe Liu, Wei Ke (+7 others)
2018 TREC Video Retrieval Evaluation  
In this section of the notebook, we present our system in the TRECVID Video to Text description generation task.  ...  The optimization target is critical to train the encoder-decoder based video captioning models.  ...  Parallel to visual semantics, learning continuous textual-visual representations have been proven useful to encode and retrieve multi-modal instances [13; 7; 3].  ... 
dblp:conf/trecvid/ChenCJH00VCLHLK18 fatcat:4hie3xjj65gwdeoe7odbddrmrq

Using High-Level Semantic Features in Video Retrieval [chapter]

Wujie Zheng, Jianmin Li, Zhangzhang Si, Fuzong Lin, Bo Zhang
2006 Lecture Notes in Computer Science  
The method can also be extended for the fusion of multi-modalities. Experiment results based on TRECVID2005 corpus demonstrate the effectiveness of the method.  ...  Extraction and utilization of high-level semantic features are critical for more effective video retrieval.  ...  Fig. 1. Overview of the video retrieval system. The magnitude of I(y, x_i) indicates the power of influence of event {X = x_i} to event {Y = y}, while the sign of I(y, x_i) indicates the direction  ... 
doi:10.1007/11788034_38 fatcat:jdx7qjho6vgyze326a2upy5r6e

Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning

Shizhe Chen, Yida Zhao, Qin Jin, Qi Wu
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Cross-modal retrieval between videos and texts has attracted growing attentions due to the rapid emergence of videos on the web.  ...  Different levels of texts can guide the learning of diverse and hierarchical video representations for cross-modal matching to capture both global and local details.  ...  Conclusion Most successful cross-modal video-text retrieval systems are based on joint embedding approaches.  ... 
doi:10.1109/cvpr42600.2020.01065 dblp:conf/cvpr/ChenZJW20 fatcat:brlrtsp7lre7bne7cm37esr5wu

Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning [article]

Shizhe Chen, Yida Zhao, Qin Jin, Qi Wu
2020 arXiv   pre-print
Cross-modal retrieval between videos and texts has attracted growing attentions due to the rapid emergence of videos on the web.  ...  The current dominant approach for this problem is to learn a joint embedding space to measure cross-modal similarities.  ...  Conclusion Most successful cross-modal video-text retrieval systems are based on joint embedding approaches.  ... 
arXiv:2003.00392v1 fatcat:4yrqi2cluvhthbd5ipb4bw5zna

Semantic Based Video Retrieval System: Survey

2018 Iraqi Journal of Science  
The video retrieval system is used for finding the users' desired video among a huge number of available videos on the Internet or database.  ...  This paper gives a general discussion on the overall process of the semantic video retrieval phases.  ...  to obtain multi-modality and multi-concept learning leads to exploitation their respective strengths and upgrades the performance of retrieval system.  ... 
doi:10.24996/ijs.2018.59.2a.12 fatcat:6fvq6pygqzglbptl4czxpzbjbm

Semantics for Large-Scale Multimedia: New Challenges for NLP

Florian Metze, Koichi Shinoda
2014 Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: Tutorials  
Semantic Indexing • State-of-the art frameworks • Extension of Bag-of-Word model • Multi-modality 3.  ...  We liken "Semantic Indexing" (SIN) task, in which a system must identify occurrences of concepts such as "desk", or "dancing" in a video to the word spotting approach.  ... 
doi:10.3115/v1/p14-6003 dblp:conf/acl/MetzeS14 fatcat:uddxcz47sbdgbjlld5eblhvfmq

Extracting Semantics from Multimedia Content: Challenges and Solutions [chapter]

Lexing Xie, Rong Yan
2008 Signals and Communication Technology  
amounts of training data, and finally leveraging media semantics in retrieval systems.  ...  The lack of effective indexes to describe the content of multimedia data is a main hurdle to multimedia search, and extracting semantics from multimedia content is the bottleneck for multimedia indexing  ...  [42] described a joint text/image modeling approach for video retrieval that allows the full interaction between multi-modalities to result in a considerable performance improvement in TRECVID datasets  ... 
doi:10.1007/978-0-387-76569-3_2 fatcat:jul6fw7esfaurct6erjnvpcq6q

Unsupervised scene detection in Olympic video using multi-modal chains

Gert-Jan Poulisse, Marie-Francine Moens
2011 2011 9th International Workshop on Content-Based Multimedia Indexing (CBMI)  
This paper presents a novel unsupervised method for identifying the semantic structure in long semistructured video streams.  ...  Each chain serves as an indicator that the temporal interval it demarcates is part of the same semantic event.  ...  The idea is to extract features from several modalities and to unify them in a universal system of multi-modal chains, each representing a particular type of similar feature.  ... 
doi:10.1109/cbmi.2011.5972529 dblp:conf/cbmi/PoulisseM11 fatcat:2rbj7ejojre6hadp3ullr727gu