Dominant spatio-temporal modulations and energy tracking in videos: Application to interest point detection for action recognition

Christos Georgakis, Petros Maragos, Georgios Evangelopoulos, Dimitrios Dimitriadis
2012 19th IEEE International Conference on Image Processing
The presence of multiband amplitude and frequency modulations (AM-FM) in wideband signals, such as textured images or speech, has led to the development of efficient multicomponent modulation models for low-level image and sound analysis. Moreover, compact yet descriptive representations have emerged by tracking, through non-linear energy operators, the dominant model components across time, space or frequency. In this paper, we propose a generalization of such approaches in the 3D ... al domain and explore the potential of incorporating the Dominant Component Analysis scheme for interest point detection and human action recognition in videos. Within this framework, actions are implicitly considered as manifestations of spatio-temporal oscillations in the dynamic visual stream. Multiband filtering and energy operators are applied to track the source energy in both spatial and temporal frequency bands. A new measure for extracting keypoint locations is formulated as the temporal dominant energy computed over the spatial dominant components, in terms of their modulation energy, of the input video frames. The theoretical formulation is supported by evaluation and comparisons in human action classification, which demonstrate the potential of the proposed spatio-temporal detector.

Index Terms: Human action recognition in videos, spatio-temporal interest point detectors, multiband filtering, multicomponent AM-FM models, dominant component analysis
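To illustrate the idea of tracking a dominant component through an energy operator, the following is a minimal 1D sketch, not the paper's 3D detector. It assumes the standard discrete Teager-Kaiser operator, Psi[x](n) = x(n)^2 - x(n-1) x(n+1), as the non-linear energy operator, and it stands in for a real bandpass filterbank with two synthetic sinusoidal "band outputs". The function names `teager_kaiser` and `dominant_component` are hypothetical and chosen only for this example.

```python
import numpy as np

def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The two boundary samples are left at zero because they lack a neighbor.
    """
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

def dominant_component(bands):
    """Pick, per sample, the band with the largest Teager-Kaiser energy.

    bands: array of shape (K, N) holding K bandpass outputs (assumed to be
    precomputed by some filterbank, which is not modeled here).
    Returns (index of the dominant band, its energy), each of shape (N,).
    """
    energies = np.stack([teager_kaiser(b) for b in bands])
    return energies.argmax(axis=0), energies.max(axis=0)

# Two synthetic band outputs: a weak low-frequency and a strong
# high-frequency sinusoid. For A*cos(W*n) the operator yields A^2 * sin(W)^2.
n = np.arange(256)
bands = np.stack([0.5 * np.cos(0.3 * n), 2.0 * np.cos(0.9 * n)])
idx, dom = dominant_component(bands)
```

Because the operator output grows with both amplitude and frequency, the second band dominates at every interior sample here. In the setting described above, the same selection would be applied to the outputs of spatial and temporal filterbanks rather than to hand-made sinusoids.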
