Semantic context detection based on hierarchical audio models

Wen-Huang Cheng, Wei-Ta Chu, Ja-Ling Wu
2003 Proceedings of the 5th ACM SIGMM international workshop on Multimedia information retrieval - MIR '03  
Then semantic context detection is achieved based on Gaussian mixture models, which model the correlations among several audio events temporally.  ...  In this paper, we propose a novel hierarchical approach that models the statistical characteristics of several audio events, over a time series, to accomplish semantic context detection.  ...  In this paper, a hierarchical framework is proposed to detect semantic contexts in action movies.  ... 
doi:10.1145/973264.973282 dblp:conf/mir/ChengCW03 fatcat:ped7mufkxjhvze7iz5ic36hg5i

Toward semantic indexing and retrieval using hierarchical audio models

Wei-Ta Chu, Wen-Huang Cheng, Jane Yung-Jen Hsu, Ja-Ling Wu
2005 Multimedia Systems  
We propose a hierarchical approach that models the statistical characteristics of audio events over a time series to accomplish semantic context detection.  ...  They provide cues for detecting gunplay and car-chasing scenes, two semantic contexts we focus on in this work.  ...  Performance of semantic context detection In semantic context detection, the models based on GMM and HMM are evaluated.  ... 
doi:10.1007/s00530-005-0183-6 fatcat:qpvnlpegirfnra63g74ufd6n3q

Semantic Context Detection Using Audio Event Fusion

Wei-Ta Chu, Wen-Huang Cheng, Ja-Ling Wu
2006 EURASIP Journal on Advances in Signal Processing  
We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection.  ...  Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts.  ...  Evaluation of semantic context detection In semantic context detection, the models based on HMM and SVM are evaluated, respectively.  ... 
doi:10.1155/asp/2006/27390 fatcat:kh6iuqudrfhpfg4m2zidihra7y

Context analysis in intelligent meeting scene

Xiang Zhang, Lin-Mi Tao, Guang-You Xu, Xiong Luo
2007 2007 International Conference on Wavelet Analysis and Pattern Recognition  
Hierarchical dynamic Bayesian networks, which model group events and situation context, are constructed to bridge the gap between physical audio features and semantic concepts.  ...  Context analysis is a crucial issue in dynamic meeting scenarios for online information service and offline semantic information retrieval.  ...  Studies on semantic analysis and indexing can be separated into two levels: isolated audio/video event detection and semantics identification.  ... 
doi:10.1109/icwapr.2007.4421571 fatcat:pxee723d6zdzdn7lbpvuj4ylpy

Unsupervised Structure Discovery for Semantic Analysis of Audio

Sourish Chaudhuri, Bhiksha Raj
2012 Neural Information Processing Systems  
Approaches to audio classification and retrieval tasks largely rely on detection-based discriminative models.  ...  We present a generative model that maps acoustics in a hierarchical manner to increasingly higher-level semantics.  ...  Then, we present results using the 2-level hierarchical model on the event kit of the 2011 TRECVID Multimedia Event Detection (MED) task [22].  ... 
dblp:conf/nips/ChaudhuriR12 fatcat:yax3i6nzpfbqzluthhnsnyj6ke

A flexible framework for key audio effects detection and auditory context inference

R. Cai, Lie Lu, A. Hanjalic, Hong-Jiang Zhang, Lian-Hong Cai
2006 IEEE Transactions on Audio, Speech, and Language Processing  
Evaluations on 12 h of audio data indicate that the proposed framework can achieve satisfying results, both on key audio effect detection and auditory context inference.  ...  In this paper, a flexible framework is proposed for key audio effect detection in a continuous audio stream, as well as for the semantic inference of an auditory context.  ...  AUDITORY CONTEXT INFERENCE Based on the obtained key effect sequence, we further extend the framework to detect high-level semantics in an audio stream.  ... 
doi:10.1109/tsa.2005.857575 fatcat:c7cnzzxfwrfs5bhm3wrcb73ebe

A fusion scheme of visual and auditory modalities for event detection in sports video

Min Xu, Ling-Yu Duan, Chang-Sheng Xu, Qi Tian
2003 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)  
Since we have developed a unified framework for semantic shot classification in sports videos and a set of audio mid-level representation with supervised learning methods, the proposed fusion scheme can  ...  Among major shot classes we perform classification of the different auditory signal segments (i.e. silence, hitting ball, applause, commentator speech) with the goal of detecting events with strong semantic  ...  By using MFCC, we obtained satisfactory A Hierarchical SVM Classifier Based on the above analysis, we propose a two-layer hierarchical SVM classifier as shown in Figure 5.  ... 
doi:10.1109/icme.2003.1220922 dblp:conf/icmcs/XuDXT03 fatcat:6gxfz64ayzb7nfg65445zjtwbm

Information Mining from Multimedia Databases

Ling Guan, Horace HS Ip, Paul H Lewis, Hau San Wong, Paisarn Muneesawang
2006 EURASIP Journal on Advances in Signal Processing  
The next two papers describe new information mining techniques based on the extraction and characterization of audio features.  ...  The technique is based on the extraction and analysis of multimodal features which include visual, motion, and audio information.  ...  In the next paper, Chu et al. introduce a hierarchical approach for modeling the statistical characteristics of audio events over a time series to achieve semantic context detection.  ... 
doi:10.1155/asp/2006/49073 fatcat:louxnv5c5bggrdeyvjw6qavi4m

Hierarchical RNN with Static Sentence-Level Attention for Text-Based Speaker Change Detection [article]

Zhao Meng, Lili Mou, Zhi Jin
2018 arXiv   pre-print
Speaker change detection (SCD) is an important task in dialog modeling.  ...  Our paper addresses the problem of text-based SCD, which differs from existing audio-based studies and is useful in various scenarios, for example, processing dialog transcripts where speaker identities  ...  Because our task is based on text (and is not online speaker change detection), we actually have access to the "future context" after the decision point.  ... 
arXiv:1703.07713v2 fatcat:c4pzvgqo55evzjjcajmafkkejm

A probabilistic layered framework for integrating multimedia content and context information

R. S. Jasinschi, N. Dimitrova, T. McGee, L. Agnihotri, J. Zimmerman, D. Li, J. Louie
2002 IEEE International Conference on Acoustics Speech and Signal Processing  
We discuss experimental results on segment classification on six and a half hours of broadcast video. In our experiments we used audio context information.  ...  We introduce a probabilistic framework that combines (a) Bayesian networks that describe both content and context and (b) hierarchical priors that describe the integration of content and context.  ...  Related work on multimodal integration includes the one by Vasconcelos and Lippman [5] that proposes a statistical model for video shot detection and semantic characterization.  ... 
doi:10.1109/icassp.2002.5745038 dblp:conf/icassp/JasinschiDMAZLL02 fatcat:g6p6auy2mrbdveqxuhb4gn7dqi

A probabilistic layered framework for integrating multimedia content and context information

Jasinschi, Dimitrova, McGee, Agnihotri, Zimmerman, Li, Louie
2002 IEEE International Conference on Acoustics Speech and Signal Processing  
We discuss experimental results on segment classification on six and a half hours of broadcast video. In our experiments we used audio context information.  ...  We introduce a probabilistic framework that combines (a) Bayesian networks that describe both content and context and (b) hierarchical priors that describe the integration of content and context.  ...  Related work on multimodal integration includes the one by Vasconcelos and Lippman [5] that proposes a statistical model for video shot detection and semantic characterization.  ... 
doi:10.1109/icassp.2002.1006178 fatcat:fqsw2vlesnbixe6bs5oqpxnen4

Audio Concept Classification with Hierarchical Deep Neural Networks [article]

Mirco Ravanelli, Benjamin Elizalde, Karl Ni, Gerald Friedland
2017 arXiv   pre-print
Audio-based multimedia retrieval tasks may identify semantic information in audio streams, i.e., audio concepts (such as music, laughter, or a revving engine).  ...  Conventional Gaussian-Mixture-Models have had some success in classifying a reduced set of audio concepts.  ...  audio-based video event detection.  ... 
arXiv:1710.04288v1 fatcat:bdsq3cdpnndfbb2ppadiw4zvbm

Modeling sports highlights using a time-series clustering framework and model interpretation

Regunathan Radhakrishnan, Isao Otsuka, Ziyou Xiong, Ajay Divakaran, Rainer W. Lienhart, Noboru Babaguchi, Edward Y. Chang
2005 Storage and Retrieval Methods and Applications for Multimedia 2005  
The audio classes in the framework were chosen based on intuition.  ...  In our past work on sports highlights extraction, we have shown the utility of detecting audience reaction using an audio classification framework.  ... 
doi:10.1117/12.588059 dblp:conf/spieSR/RadhakrishnanOXD05 fatcat:rz6zfskp5nekfpjni5py2dgm4a

A study on video data mining

V. Vijayakumar, R. Nedunchezhian
2012 International Journal of Multimedia Information Retrieval  
Data mining is a process of extracting previously unknown knowledge and detecting the interesting patterns from a massive set of data.  ...  Video is an example of multimedia data as it contains several kinds of data such as text, image, meta-data, visual and audio.  ...  VCube algorithm uses the video bases of genre classes to classify a video clip and the audio bases to classify the clips based on their audio information.  ... 
doi:10.1007/s13735-012-0016-2 fatcat:xuuf3w3b2rfcxlyevzndz6v62e

Audio contributions to semantic video search

I. Trancoso, T. Pellegrini, J. Portelo, H. Meinedo, M. Bugalho, A. Abad, J. Neto
2009 2009 IEEE International Conference on Multimedia and Expo  
The paper thus covers what is generally known as audio segmentation, as well as audio event detection. Using machine learning approaches, we have built detectors for over 50 semantic audio concepts.  ...  This paper summarizes the contributions to semantic video search that can be derived from the audio signal. Because of space restrictions, the emphasis will be on non-linguistic cues.  ...  Optionally, there may be an intermediate stage of key audio effect detection, typically based on Hidden Markov Models, that explores the time structure of the events and/or models interconnections between  ... 
doi:10.1109/icme.2009.5202575 dblp:conf/icmcs/TrancosoPPMBAN09 fatcat:gtl7evg7xzbbtlmybabegywvmq