IBM Research and Columbia University TRECVID-2011 Multimedia Event Detection (MED) System

Liangliang Cao, Shih-Fu Chang, Noel Codella, Courtenay V. Cotton, Dan Ellis, Leiguang Gong, Matthew L. Hill, Gang Hua, John R. Kender, Michele Merler, Yadong Mu, Apostol Natsev (+1 others)
2011 TREC Video Retrieval Evaluation  
The IBM Research/Columbia team investigated a novel range of low-level and high-level features and their combination for the TRECVID Multimedia Event Detection (MED) task. We submitted four runs exploring various methods of extracting, modeling, and fusing low-level features and hundreds of high-level semantic concepts. Run 1 developed event detection models using Support Vector Machines (SVMs) trained on a large number of low-level features, establishing the baseline performance for visual features from static video frames. Run 2 trained SVMs on classification scores generated by 780 visual, 113 action, and 56 audio high-level semantic classifiers and explored various temporal aggregation techniques, assessing performance based on different kinds of high-level semantic information. Run 3 fused the low- and high-level feature information, providing insight into the complementarity of this information for detecting events. Run 4 fused all of these methods and explored a novel Scene Alignment Model (SAM) algorithm that utilized temporal information discretized by scene changes in the video.
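The abstract does not say which temporal aggregation or fusion rules were used, so the following is only a minimal sketch under stated assumptions: per-frame concept scores are pooled into one video-level vector by mean or max pooling, and detection scores from two models are combined by a weighted average. The function names `aggregate_scores` and `late_fusion`, the pooling choices, and the toy numbers are all hypothetical and are not taken from the system described.

```python
import numpy as np


def aggregate_scores(frame_scores, method="mean"):
    """Pool a (num_frames, num_concepts) score matrix into one video-level vector.

    Mean and max pooling are illustrative choices only; the abstract does not
    name the aggregation techniques that were actually explored.
    """
    frame_scores = np.asarray(frame_scores, dtype=float)
    if method == "mean":
        return frame_scores.mean(axis=0)
    if method == "max":
        return frame_scores.max(axis=0)
    raise ValueError(f"unknown aggregation method: {method}")


def late_fusion(score_lists, weights=None):
    """Weighted average of per-video detection scores from several models.

    score_lists has shape (num_models, num_videos). Equal weights are assumed
    when none are given.
    """
    scores = np.asarray(score_lists, dtype=float)
    if weights is None:
        weights = np.ones(scores.shape[0])
    weights = np.asarray(weights, dtype=float)
    return weights @ scores / weights.sum()


# Toy example: 3 frames scored by 2 hypothetical concept classifiers.
frame_scores = np.array([[0.1, 0.8],
                         [0.3, 0.6],
                         [0.5, 0.1]])
video_mean = aggregate_scores(frame_scores, "mean")
video_max = aggregate_scores(frame_scores, "max")

# Toy detection scores for 2 videos from a low-level and a high-level model.
low_level = [0.2, 0.9]
high_level = [0.4, 0.7]
fused = late_fusion([low_level, high_level])
```

In a pipeline like the one outlined above, the pooled vectors would serve as inputs to a classifier such as an SVM, and the fused scores would be thresholded to decide whether an event is present. Both steps are left out here to keep the sketch free of external dependencies beyond NumPy.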
dblp:conf/trecvid/CaoCCCEGH0KMMNS11 fatcat:gg5hdmdh65bwlcksf4q467lt7m