SEMANTIC MOTION CONCEPT RETRIEVAL IN NON-STATIC BACKGROUND UTILIZING SPATIAL-TEMPORAL VISUAL INFORMATION
International Journal of Semantic Computing (IJSC)
Motion concepts are semantic concepts that contain motion information, such as a racing car or dancing. To achieve high retrieval accuracy for such concepts, compared with static concepts such as car or person, temporal information must be taken into account. Moreover, when a video sequence is captured by an amateur with a hand-held camera and thus contains significant camera motion, the complexity of the uncontrolled background aggravates the difficulty of motion concept retrieval. The retrieval of semantic concepts containing motion against non-static backgrounds is therefore regarded as one of the most challenging tasks in multimedia semantic analysis and video retrieval.

To address this challenge, this paper proposes a motion concept retrieval framework consisting of a motion region detection model and a concept retrieval model that integrates the spatial and temporal information in video sequences. The motion region detection model uses a new integral density method (adapted from the idea of integral images) to quickly identify motion regions in an unsupervised way. Specifically, key information locations on the video frames are first obtained as the maxima and minima of a Difference of Gaussian (DoG) function. A motion map of adjacent frames is then generated from the differences between the outputs of the Simultaneous Partition and Class Parameter Estimation (SPCPE) framework. The motion map filters the key information locations into key motion locations (KMLs), which indicate the regions containing motion. The motion map also indicates the motion direction, which guides the proposed "integral density" approach to locate the motion regions quickly and accurately. Based on the motion region detection model, moving object-level information is extracted for semantic retrieval. In the proposed concept retrieval model, the temporal semantic consistency among consecutive shots is analyzed and captured in a conditional probability model, which is then used to re-rank the similarity scores and improve the final retrieval results. The results of the proposed motion concept retrieval framework are illustrated visually, demonstrating its robustness against non-static backgrounds, and are verified by promising experimental results showing that concept retrieval performance can be improved by integrating spatial and temporal visual information.
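The "integral density" method builds on the classic integral-image trick: after one pass of cumulative summation, the number of key motion locations falling inside any rectangular candidate region can be counted in constant time. The following is a minimal sketch of that underlying idea only; the variable names and the toy KML map are illustrative and not taken from the paper.

```python
import numpy as np

def integral_image(mask):
    """Cumulative 2-D sum: I[y, x] = sum of mask[:y, :x]."""
    # Pad with a leading zero row/column so region queries need no bounds checks.
    I = np.zeros((mask.shape[0] + 1, mask.shape[1] + 1), dtype=np.int64)
    I[1:, 1:] = np.cumsum(np.cumsum(mask, axis=0), axis=1)
    return I

def region_density(I, top, left, bottom, right):
    """Density of marked points inside [top, bottom) x [left, right), in O(1)."""
    count = I[bottom, right] - I[top, right] - I[bottom, left] + I[top, left]
    area = (bottom - top) * (right - left)
    return count / area

# Toy example: a binary map marking key motion locations (KMLs).
kml = np.zeros((6, 6), dtype=np.int64)
kml[1:4, 1:4] = 1                      # a dense 3x3 cluster of motion points
I = integral_image(kml)
print(region_density(I, 1, 1, 4, 4))   # density inside the cluster
print(region_density(I, 0, 0, 6, 6))   # density over the whole frame
```

A search for the densest window over such a map is how an integral-image-style approach can localize motion regions quickly, since each candidate rectangle costs only four lookups.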
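The temporal re-ranking step rests on the observation that a concept present in one shot is likely to persist into the next. A minimal sketch of that idea is shown below, assuming shot similarity scores in [0, 1] treated as probabilities; the transition probabilities `p_cont` and `p_new` and the equal-weight blend are hypothetical placeholders, not the conditional probability model actually fitted in the paper.

```python
def rerank(scores, p_cont=0.6, p_new=0.2):
    """Re-rank shot scores with a first-order temporal-consistency prior.

    p_cont: assumed P(concept in shot i | concept present in shot i-1)
    p_new : assumed P(concept in shot i | concept absent in shot i-1)
    Both values are illustrative, not estimates from the paper.
    """
    out = [scores[0]]
    for i in range(1, len(scores)):
        prev = out[-1]
        # Temporal prior propagated from the (re-ranked) previous shot.
        prior = p_cont * prev + p_new * (1.0 - prev)
        # Blend the detector's score with the prior (equal weights for the sketch).
        out.append(0.5 * scores[i] + 0.5 * prior)
    return out

# An isolated zero between two confident shots is pulled upward,
# reflecting the assumed shot-to-shot semantic consistency.
print(rerank([1.0, 0.0, 1.0]))
```

Smoothing of this kind raises scores of shots whose neighbors score highly, which is the effect the conditional probability model exploits when re-ranking the similarity scores.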