1,955 Hits in 5.0 sec

Local Expert Forest of Score Fusion for Video Event Classification [chapter]

Jingchen Liu, Scott McCloskey, Yanxi Liu
2012 Lecture Notes in Computer Science  
The core contribution of this paper is a local expert forest model for meta-level score fusion for event detection under heavily imbalanced class distributions.  ...  Multiple pairs of experts based on different partitions ('trees') form a 'forest', balancing local adaptivity against over-fitting of the model.  ...  Acknowledgements We thank our teammates for providing base classifier scores: Byungki Byun, Ilseo Kim, Ben Miller, Greg Mori, Sangmin Oh, Amitha Perera, and Arash Vahdat.  ... 
doi:10.1007/978-3-642-33715-4_29 fatcat:3ihg7ijwnzcetieayhjmmveiba

Metadata-Weighted Score Fusion for Multimedia Event Detection

Scott McCloskey, Jingchen Liu
2014 2014 Canadian Conference on Computer and Robot Vision  
We employ score fusion, also known as late fusion, and propose a method that learns local weightings of the various base classifier scores which respect the performance differences arising from the video  ...  We address the problem of multimedia event detection from videos captured 'in the wild,' in particular the fusion of cues from multiple aspects of the video's content: detected objects, observed motion  ...  ACKNOWLEDGEMENTS The work described in this paper was done as part of a collaboration with Kitware Inc., Simon Fraser University, and Georgia Tech, who provided base classifier scores.  ... 
doi:10.1109/crv.2014.47 dblp:conf/crv/McCloskeyL14 fatcat:qriolg7lzfhudpp5ds37qkbqpm
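The late-fusion scheme described in the two entries above, a weighted combination of base classifier scores, can be sketched minimally. The classifier names and weight values below are illustrative assumptions, not taken from either paper:

```python
# Minimal late-fusion sketch: combine base classifier scores for one
# video with per-classifier weights. Names and weights are illustrative.

def late_fusion(scores, weights):
    """Weighted sum of base classifier scores.

    scores  -- dict mapping classifier name to its score for this video
    weights -- dict mapping classifier name to its fusion weight
    """
    return sum(weights[name] * s for name, s in scores.items())

scores = {"objects": 0.8, "motion": 0.4, "audio": 0.6}
weights = {"objects": 0.5, "motion": 0.3, "audio": 0.2}
fused = late_fusion(scores, weights)  # 0.8*0.5 + 0.4*0.3 + 0.6*0.2 = 0.64
```

Metadata-weighted variants would select a different `weights` dict per metadata group (e.g. indoor vs. outdoor footage) rather than a single global one.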

Multimedia event detection with multimodal feature fusion and temporal concept localization

Sangmin Oh, Scott McCloskey, Ilseo Kim, Arash Vahdat, Kevin J. Cannons, Hossein Hajimirsadeghi, Greg Mori, A. G. Amitha Perera, Megha Pandey, Jason J. Corso
2013 Machine Vision and Applications  
We present a system for multimedia event detection.  ...  The developed system characterizes complex multimedia events based on a large array of multimodal features, and classifies unseen videos by effectively fusing diverse responses.  ... 
doi:10.1007/s00138-013-0525-x fatcat:m5grko5ls5denhtst2btnwdmmy

Informedia@TRECVID 2011: Surveillance Event Detection

Lei Bao, Longfei Zhang, Shoou-I Yu, Zhen-zhong Lan, Lu Jiang, Arnold Overwijk, Qin Jin, Shohei Takahashi, Brian Langner, Yuanpeng Li, Michael Garbus, Susanne Burger (+2 others)
2011 TREC Video Retrieval Evaluation  
For Multimedia Event Detection and Semantic Indexing of concepts, generally, both of these tasks consist of three main steps: extracting features, training detectors and fusion.  ...  This approach is based on local spatio-temporal descriptors, called MoSIFT, and generated from pair-wise video frames.  ... 
dblp:conf/trecvid/BaoZYL0OJTLLGBM11 fatcat:a3eteiiit5cy7el5epiqvphqsi

The MediaMill TRECVID 2012 Semantic Video Search Engine

Cees G. M. Snoek, Koen E. A. van de Sande, AmirHossein Habibian, Svetlana Kordumova, Zhenyang Li, Masoud Mazloom, Silvia L. Pintea, Ran Tao, Dennis C. Koelma, Arnold W. M. Smeulders
2012 TREC Video Retrieval Evaluation  
Our event detection and recounting experiments focus on representations using concept detectors. For instance search we study the influence of spatial verification and color invariance.  ...  In this paper we describe our TRECVID 2012 video retrieval experiments.  ... 
dblp:conf/trecvid/SnoekSHKLMP0KS12 fatcat:tkgsy56yiremddcgodbajxppde

High-level event recognition in unconstrained videos

Yu-Gang Jiang, Subhabrata Bhattacharya, Shih-Fu Chang, Mubarak Shah
2012 International Journal of Multimedia Information Retrieval  
across different modalities, classification strategies, fusion techniques, etc.  ...  The goal of high-level event recognition is to automatically detect complex high-level events in a given video sequence.  ... 
doi:10.1007/s13735-012-0024-2 fatcat:mfzttic3svb4tho2xb6aczgp4y

Sample-Specific Late Fusion for Visual Category Recognition

Dong Liu, Kuan-Ting Lai, Guangnan Ye, Ming-Syan Chen, Shih-Fu Chang
2013 2013 IEEE Conference on Computer Vision and Pattern Recognition  
However, the existing methods generally use a fixed fusion weight for all the scores of a classifier, and thus fail to optimally determine the fusion weight for the individual samples.  ...  Late fusion addresses the problem of combining the prediction scores of multiple classifiers, in which each score is predicted by a classifier trained with a specific feature.  ...  [14] proposed a local expert forest model for late fusion, which partitioned the score space into local regions and learned the local fusion weights in each region.  ... 
doi:10.1109/cvpr.2013.109 dblp:conf/cvpr/LiuLYCC13 fatcat:qpe3i62pavc5tfbxx25dc6kah4

Learning Sample Specific Weights for Late Fusion

Kuan-Ting Lai, Dong Liu, Shih-Fu Chang, Ming-Syan Chen
2015 IEEE Transactions on Image Processing  
To the best of our knowledge, this is the first method that supports learning of sample specific fusion weights for late fusion.  ...  fusion scores than negative samples.  ...  [9] recently proposed a local expert forest model for late fusion, which partitions the score space into local regions and learns the local fusion weights in each region.  ... 
doi:10.1109/tip.2015.2423560 pmid:25879948 fatcat:pmbu7epunvbtvh24iqtexlcbxq
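The contrast these two entries draw, a fixed weight per classifier versus weights chosen per sample, can be sketched as follows. The softmax-over-scores heuristic is an assumption for illustration only and is not the learning procedure of either paper:

```python
import math

def fuse_fixed(scores, weights):
    # One weight per classifier, shared across all samples.
    return sum(w * s for w, s in zip(weights, scores))

def fuse_sample_specific(scores, temperature=1.0):
    # Illustrative per-sample weighting: softmax over the sample's own
    # scores, so more confident classifiers get larger weights.
    # NOTE: a stand-in heuristic, not the papers' learned weights.
    exps = [math.exp(s / temperature) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return sum(w * s for w, s in zip(weights, scores))
```

With scores `[1.0, 0.0]`, fixed equal weights give 0.5, while the sample-specific variant leans toward the confident classifier and returns a larger fused score.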

Efficient disease detection in gastrointestinal videos – global features versus neural networks

Konstantin Pogorelov, Michael Riegler, Sigrun Losada Eskeland, Thomas de Lange, Dag Johansen, Carsten Griwodz, Peter Thelin Schmidt, Pål Halvorsen
2017 Multimedia tools and applications  
Analysis of medical videos from the human gastrointestinal (GI) tract for detection and localization of abnormalities like lesions and diseases requires both high precision and recall.  ...  The system combines deep learning neural networks, information retrieval, and analysis of global and local image features in order to implement multi-class classification, detection and localization.  ... 
doi:10.1007/s11042-017-4989-y fatcat:w5adpg2k6jc3hoc6rai5mcxwj4

AVEC 2016

Michel Valstar, Maja Pantic, Jonathan Gratch, Björn Schuller, Fabien Ringeval, Dennis Lalanne, Mercedes Torres Torres, Stefan Scherer, Giota Stratou, Roddy Cowie
2016 Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge - AVEC '16  
Fusion of audio and video modalities was performed by averaging the regression outputs of the unimodal random forest regressors.  ...  Classification and training was performed on a frame-wise basis (i.e., at 100Hz for audio and 30Hz for video); temporal fusion was conducted through simple majority voting of all the frames within an  ... 
doi:10.1145/2988257.2988258 dblp:conf/mm/ValstarGSRLTSSC16 fatcat:5slb4a7xvbf5bgqnkek244jhme
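The two fusion steps named in this entry, averaging unimodal regression outputs and majority voting over frames, can be sketched directly; the numeric values are illustrative:

```python
from collections import Counter

def fuse_modalities(audio_pred, video_pred):
    # Multimodal fusion as described: average the regression
    # outputs of the unimodal predictors.
    return (audio_pred + video_pred) / 2.0

def temporal_fusion(frame_labels):
    # Temporal fusion by simple majority voting over per-frame labels.
    return Counter(frame_labels).most_common(1)[0][0]

fuse_modalities(0.2, 0.6)          # averages to 0.4
temporal_fusion([1, 0, 1, 1, 0])   # -> 1
```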

Semantic Mapping in Video Retrieval

Maaike H.T. de Boer
2018 SIGIR Forum  
This conclusion is also drawn by Oh et al. (2014) with their Local Expert Forest learning algorithm, evaluated on the TRECVID MED 2011 test set.  ...  M is the set of methods, W_{em} is the weight for expansion method em, and S_{e,v,em} is the score for video v and event e under expansion method em. The fusion score of each combination of two, three and four  ... 
doi:10.1145/3190580.3190606 fatcat:a7agjytxhng4na47sfv7xsoy2a
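Reassembling the variable definitions in the snippet above, the fusion score appears to be a weighted sum over expansion methods. This is a reconstruction from the fragment; any normalization used in the thesis is not visible here:

```latex
S^{\mathrm{fused}}_{e,v} = \sum_{em \in M} W_{em}\, S_{e,v,em}
```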

AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge [article]

Michel Valstar, Jonathan Gratch, Bjorn Schuller, Fabien Ringeval, Denis Lalanne, Mercedes Torres Torres, Stefan Scherer, Guiota Stratou, Roddy Cowie, Maja Pantic
2016 arXiv   pre-print
establish to what extent fusion of the approaches is possible and beneficial.  ...  The Audio/Visual Emotion Challenge and Workshop (AVEC 2016) "Depression, Mood and Emotion" will be the sixth competition event aimed at comparison of multimedia processing and machine learning methods  ...  Fusion of audio and video modalities was performed by averaging the regression outputs of the unimodal random forest regressors.  ... 
arXiv:1605.01600v4 fatcat:j5bbsbjijzbgxh5zpclfksr4vu

Real World Anomalous Scene Detection and Classification using Multilayer Deep Neural Networks

Atif Jan, Gul Muhammad Khan
2021 International Journal of Interactive Multimedia and Artificial Intelligence  
Surveillance videos record malicious events in a locality; various machine learning algorithms are utilized for their detection.  ...  Lastly, a comparison between the state-of-the-art networks has been presented for malicious event recognition in videos.  ...  But thanks to my team at the National Center of Artificial Intelligence, who helped, supported, and motivated me at every step.  ... 
doi:10.9781/ijimai.2021.10.010 fatcat:hfircfu6g5f2lpvv25z3esnqc4

Data Fusion in Earth Observation and the Role of Citizen as a Sensor: A Scoping Review of Applications, Methods and Future Trends

Aikaterini Karagiannopoulou, Athanasia Tsertou, Georgios Tsimiklis, Angelos Amditis
2022 Remote Sensing  
An exception is revealed in the smaller-scale studies, which showed a preference for deep learning models.  ...  Subsequent reference is given on EO-data, their corresponding conversions, the citizens' participation digital tools, and Data Fusion (DF) models that are predominately exploited.  ...  Under these conditions, cities are exposed to high concentrations of GHGs, and local events, such as urban flash floods, intense droughts on land, long-standing forest fires [5] , and extreme heatwaves  ... 
doi:10.3390/rs14051263 fatcat:ffg4ulntnrdrxaknuydepyvelm

Recognition and localization of relevant human behavior in videos

Henri Bouma, Gertjan Burghouts, Leo de Penning, Patrick Hanckmann, Johan-Martijn ten Hove, Sanne Korzec, Maarten Kruithof, Sander Landsmeer, Coen van Leeuwen, Sebastiaan van den Broek, Arvid Halma, Richard den Hollander (+2 others)
2013 Sensors, and Command, Control, Communications, and Intelligence (C3I) Technologies for Homeland Security and Homeland Defense XII  
The system is trained on thousands of videos and evaluated on realistic persistent surveillance data in the DARPA Mind's Eye program, with hours of videos of challenging scenes.  ...  The results show that our system is able to track the people, detect and localize events, and discriminate between different behaviors, and it performs 3.4 times better than our previous system.  ...  Maximum likelihood (ML), the standard approach to using HMMs for classification, selects the event model that maximizes the likelihood of an observed event.  ... 
doi:10.1117/12.2015877 fatcat:qolalgnjgjf2zadbkyseufbnsy
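The maximum-likelihood classification step named in the last snippet, scoring an observation under each event model and selecting the argmax, can be sketched as follows. The log-likelihood functions here are stand-in toy models, not the paper's HMMs:

```python
# ML classification sketch: evaluate the observation's log-likelihood
# under each per-event model and pick the best-scoring event.

def classify_ml(observation, models):
    """models: dict mapping event name -> log-likelihood function."""
    return max(models, key=lambda name: models[name](observation))

# Toy stand-ins for trained event models (negative squared error
# around a characteristic value plays the role of log-likelihood).
models = {
    "walk": lambda obs: -sum((x - 1.0) ** 2 for x in obs),
    "run":  lambda obs: -sum((x - 3.0) ** 2 for x in obs),
}
classify_ml([0.9, 1.2, 1.1], models)  # -> "walk"
```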
Showing results 1 — 15 out of 1,955 results