Discriminative Video Pattern Search for Efficient Action Detection

Junsong Yuan, Zicheng Liu, Ying Wu
2011 IEEE Transactions on Pattern Analysis and Machine Intelligence  
Actions are spatio-temporal patterns. Similar to the sliding window-based object detection, action detection finds the reoccurrences of such spatio-temporal patterns through pattern matching, by handling cluttered and dynamic backgrounds and other types of action variations. We address two critical issues in pattern matching-based action detection: (1) the intra-pattern variations in actions, and (2) the computational efficiency in performing action pattern search in cluttered scenes. First, we propose a discriminative pattern matching criterion for action classification, called naive-Bayes mutual information maximization (NBMIM). Each action is characterized by a collection of spatio-temporal invariant features and we match it with an action class by measuring the mutual information between them. Based on this matching criterion, action detection is to localize a subvolume in the volumetric video space that has the maximum mutual information toward a specific action class. A novel spatio-temporal branch-and-bound (STBB) search algorithm is designed to efficiently find the optimal solution. Our proposed action detection method does not rely on the results of human detection, tracking or background subtraction. It can well handle action variations such as performing speed and style variations, as well as scale changes. It is also insensitive to dynamic and cluttered backgrounds and even to partial occlusions. The cross-dataset experiments on action detection, including KTH, CMU action datasets, and another new MSR action dataset, demonstrate the effectiveness and efficiency of the proposed multi-class multiple-instance action detection method.
Index Terms: video pattern search, action detection, spatio-temporal branch-and-bound search
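The abstract frames detection as finding the subvolume of the video volume with the maximum score toward an action class. The sketch below illustrates only that search objective, under assumptions that are not from the paper: the per-voxel scores are taken as a dense NumPy array of shape (T, H, W) with positive and negative values, and the search is a plain exhaustive scan over spatial windows combined with Kadane's maximum-subarray algorithm along time. It is a brute-force baseline for small volumes, not the STBB algorithm, and the name `max_subvolume` is hypothetical.

```python
import numpy as np


def max_subvolume(scores):
    """Exhaustive max-sum subvolume search over a (T, H, W) score volume.

    Assumption: `scores` holds per-voxel votes (positive = supports the
    action class, negative = opposes it). Returns the best sum and the
    inclusive bounds (t0, t1, y0, y1, x0, x1) of the best subvolume.
    """
    T, H, W = scores.shape
    best_sum = -np.inf
    best_box = None
    for y0 in range(H):
        rows = np.zeros((T, W))
        for y1 in range(y0, H):
            rows += scores[:, y1, :]        # sum over rows y0..y1
            for x0 in range(W):
                line = np.zeros(T)
                for x1 in range(x0, W):
                    line += rows[:, x1]     # per-frame sum over the window
                    # Kadane's algorithm: best contiguous run of frames.
                    run, start = 0.0, 0
                    for t in range(T):
                        if run <= 0:
                            run, start = line[t], t
                        else:
                            run += line[t]
                        if run > best_sum:
                            best_sum = run
                            best_box = (start, t, y0, y1, x0, x1)
    return best_sum, best_box
```

The scan visits every spatial window, so its cost grows roughly as H² · W² · T. That cost is what makes a bounded search such as the branch-and-bound approach named in the abstract attractive for real video volumes.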
doi:10.1109/tpami.2011.38 pmid:21339530 fatcat:fl2mmgdg4vawlawcvx5zwk4rnm