Automatic collection of Web video shots corresponding to specific actions using Web images

Do Hang Nga, Keiji Yanai
2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
In this paper, we apply Web images to the problem of automatically extracting video shots corresponding to specific actions from Web videos. Our framework extends the unsupervised method for automatically collecting Web video shots corresponding to given actions that we proposed last year [9]. For each action, following that work, we first exploit tag relevance to gather the 200 most relevant videos for the given action and segment each video into several shots. Shots are then converted into bags of spatio-temporal features and ranked by the VisualRank method. We refine the approach by introducing Web action images into the shot-ranking step. In the case of human actions, we select images by applying Poselets [2] to detect humans. We test our framework on 28 human action categories whose precision was 20% or below and 8 non-human action categories whose precision was less than 15% in [9]. The results show that our model improves precision by approximately 6% over the 28 human action categories and 16% over the 8 non-human action categories.
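The shot-ranking step described above can be sketched as a biased PageRank over a graph of shot-to-shot visual similarities, with the restart vector skewed toward shots that resemble the selected Web action images. The Python sketch below assumes histogram-intersection similarity between L1-normalized bag-of-features histograms and a damping factor of 0.85; these are illustrative choices, not details taken from the paper.

```python
# Hedged sketch of VisualRank shot ranking with an image-biased restart vector.
# Feature extraction, the similarity measure, and the damping factor are assumptions.
import numpy as np

def histogram_intersection(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two L1-normalized bag-of-features histograms."""
    return float(np.minimum(a, b).sum())

def visualrank(shot_hists: np.ndarray,
               image_hists: np.ndarray | None = None,
               damping: float = 0.85,
               iters: int = 100) -> np.ndarray:
    """Rank video shots by PageRank scores on the shot-similarity graph.
    If Web-image histograms are given, the restart (teleport) vector is
    biased toward shots similar to them."""
    n = len(shot_hists)

    # Shot-to-shot similarity matrix, column-normalized into a transition matrix.
    S = np.array([[histogram_intersection(x, y) for x in shot_hists]
                  for y in shot_hists])
    np.fill_diagonal(S, 0.0)
    col_sums = S.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    P = S / col_sums

    # Restart vector: uniform, or biased by similarity to the Web action images.
    if image_hists is None:
        v = np.full(n, 1.0 / n)
    else:
        bias = np.array([np.mean([histogram_intersection(s, img)
                                  for img in image_hists])
                         for s in shot_hists])
        v = bias / bias.sum() if bias.sum() > 0 else np.full(n, 1.0 / n)

    # Power iteration: r = d * P r + (1 - d) * v
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = damping * P @ r + (1.0 - damping) * v
    return r  # higher score = shot is more likely to show the target action
```

A usage note: ranking the shots then amounts to sorting them by the returned scores and keeping the top-ranked ones as the collected shots for the action.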
doi:10.1109/cvprw.2012.6239255 dblp:conf/cvpr/NgaY12