Recognizing human-vehicle interactions from aerial video without training

Jong Taek Lee, Chia-Chih Chen, J. K. Aggarwal
2011 CVPR 2011 WORKSHOPS  
We propose a novel framework to recognize human-vehicle interactions from aerial video. In this scenario, the object resolution is low, the visual cues are vague, and the detection and tracking of objects are less reliable as a consequence. Methods that require accurate tracking of objects or exact matching of event definitions are best avoided. To address these issues, we present a temporal-logic-based approach that does not require training from event examples. At the low level, we employ dynamic programming to perform fast model fitting between the tracked vehicle and the rendered 3-D vehicle models. At the semantic level, given the localized event region of interest (ROI), we verify the time series of human-vehicle relationships against the pre-specified event definitions in a piecewise fashion. With special interest in recognizing a person getting into and out of a vehicle, we have tested our method on a subset of the VIRAT Aerial Video dataset [11] and achieved superior results. Our framework can be easily extended to recognize other types of human-vehicle interactions.
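The semantic-level step, checking a noisy time series of human-vehicle relationships against a pre-specified event definition piece by piece, can be illustrated with a minimal sketch. This is not the authors' temporal logic. The relationship labels ("far", "near", "absent"), the two event definitions, the majority-vote smoothing, and all function names are assumptions made for illustration only.

```python
from itertools import groupby

# Hypothetical event definitions: each event is an ordered list of
# person-vehicle relationship states (labels are assumptions, not the paper's).
EVENT_DEFINITIONS = {
    "getting_into_vehicle": ["far", "near", "absent"],
    "getting_out_of_vehicle": ["absent", "near", "far"],
}


def smooth(labels, window=3):
    """Majority-vote filter that suppresses isolated per-frame label noise."""
    half = window // 2
    out = []
    for i in range(len(labels)):
        chunk = labels[max(0, i - half): i + half + 1]
        # On ties, the label that appears first in the chunk wins.
        out.append(max(chunk, key=chunk.count))
    return out


def segments(labels, min_len=2):
    """Collapse labels into runs and keep only runs of at least min_len frames."""
    runs = [(label, len(list(group))) for label, group in groupby(labels)]
    return [label for label, length in runs if length >= min_len]


def matches(labels, definition):
    """True if the definition's states occur, in order, among the segments."""
    remaining = iter(segments(smooth(labels)))
    # `state in remaining` consumes the iterator, so order is enforced.
    return all(state in remaining for state in definition)


def recognize(labels):
    """Return the names of all event definitions satisfied by the series."""
    return [name for name, d in EVENT_DEFINITIONS.items() if matches(labels, d)]


# Per-frame relationship labels with two noisy frames (indices 3 and 4).
series = ["far", "far", "far", "near", "far",
          "near", "near", "near", "absent", "absent", "absent"]
print(recognize(series))
```

Because the check only requires the defined states to appear in order among sufficiently long segments, it tolerates unreliable per-frame detections instead of demanding an exact frame-by-frame match.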
doi:10.1109/cvprw.2011.5981794 dblp:conf/cvpr/LeeCA11 fatcat:2sebwvikvbaglmgkrw53nke2oq