
Guiding Visual Surveillance by Tracking Human Attention

Ben Benfold, Ian Reid
2009 Proceedings of the British Machine Vision Conference 2009
We describe a novel method for directing the attention of an automated surveillance system. Our starting premise is that the attention of people in a scene can be used as an indicator of interesting areas and events. To determine people's attention from passive visual observations we have developed a system which automatically locates and tracks pedestrians in surveillance-style video before measuring their head pose as an estimate of their gaze direction. We then demonstrate how the resulting gaze estimations can be used to identify the subject of interest in three different surveillance scenarios. The first step of processing requires the pedestrians in a scene to be tracked, with the purpose of providing stable head images for the following pose estimation step. In contrast to similar systems, we have developed a robust multi-person tracking system that does not rely on background subtraction, making it capable of tracking the heads of multiple pedestrians through complex environments where occlusions are frequent. We track only the heads of pedestrians rather than their entire bodies for two reasons. The first is that security cameras are generally positioned sufficiently high to allow pedestrians' faces to be seen, so their heads are rarely obscured. The second is that the offset between the centre of a pedestrian's body and their head changes as they walk, so tracking the head directly provides more accurately positioned head images. The head tracking algorithm combines absolute location estimates from a head detector with velocity estimates from feature-based tracking to provide stable head images for the subsequent pose estimation step. A head detector was trained using the Histogram of Oriented Gradients based method of Dalal and Triggs [2] to provide absolute position estimates. The velocity measurements were made by tracking a number of corner features [1, 3] and learning which were representative of the head velocity using a dynamic Bayesian network. The individual feature velocity estimates were then probabilistically combined to give robust velocity estimates for the head. The two types of measurement were combined using a Kalman filter with the process model, which usually predicts the next state based on physics, replaced with the velocity estimations from feature tracking.
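The fusion step described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes a single image axis, scalar variances, and hypothetical names (`HeadState`, `predict`, `update`, `search_window`) and noise values. The predict step uses the velocity measured by feature tracking in place of a physics-based motion model, the update step corrects with the detector's absolute position, and the variance bounds the region in which the detector needs to run.

```python
from dataclasses import dataclass


@dataclass
class HeadState:
    position: float  # estimated head position along one image axis (pixels)
    variance: float  # uncertainty of that estimate (pixels squared)


def predict(state, velocity, velocity_var, dt=1.0):
    """Predict step: advance the position using the velocity from feature tracking."""
    return HeadState(
        position=state.position + velocity * dt,
        variance=state.variance + velocity_var * dt * dt,
    )


def update(state, detection, detection_var):
    """Update step: correct the prediction with an absolute head detection."""
    gain = state.variance / (state.variance + detection_var)
    return HeadState(
        position=state.position + gain * (detection - state.position),
        variance=(1.0 - gain) * state.variance,
    )


def search_window(state, n_sigma=3.0):
    """Interval in which the detector needs to be applied, derived from the variance."""
    half_width = n_sigma * state.variance ** 0.5
    return state.position - half_width, state.position + half_width


# Hypothetical numbers, for illustration only.
state = HeadState(position=100.0, variance=4.0)
state = predict(state, velocity=2.0, velocity_var=1.0)      # position 102.0, variance 5.0
state = update(state, detection=104.0, detection_var=5.0)   # position 103.0, variance 2.5
```

A full tracker would use a two-dimensional state with a covariance matrix; the scalar form is kept here only to make the two steps easy to follow.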
Using a Kalman filter allows the two types of measurement to be combined probabilistically, and the covariance can additionally be used to limit the region in which the detector needs to be applied. The next stage of processing uses the stable head regions provided by the tracking to estimate the direction in which the person is facing. Randomised ferns, a type of randomised tree classifier, were trained using labelled head images and used to estimate the probability that a given head image belonged to each of eight direction classes. The decisions in the ferns were based on two types of comparison, both of which were designed to be robust against contrast and brightness variations. The first decision type was based on the same HOG features that Dalal and Triggs used to train human detectors and the second was based on a comparison of colours sampled at different locations within the head region. The tracking and head pose estimation were combined to make a fully automatic system (figure 1) which could be used to measure the amount of attention received by different areas of a scene. When applied to video sequences, the direction estimates from the randomised ferns were smoothed using a hidden Markov model to enforce temporal constraints. Using a GPU implementation of the HOG head detector, the complete system runs at 15fps on 640×480 video. For three different video sequences, the locations and gaze directions of the pedestrians were projected onto a 2D ground plane and used to build up an attention map representing the amount of attention received by each square metre of the ground. In the first two experiments, static regions receiving attention were identified by accumulating gaze estimates over a long period of time. The third experiment involved locating a transient subject of attention by combining gaze estimates from multiple people, the results of which are shown in figure 2.
The results demonstrate that the system is capable of both automatically tracking a number of pedestrians in the presence of occlusions and estimating the amount of attention that the pedestrians give to different areas of the scene.
Figure 1: A frame showing the gaze direction estimates and the paths along which pedestrians were tracked.
Figure 2: Sequence showing how the attention map can be used to highlight transient areas of interest. The left column shows video frames with annotated gaze directions, the middle column shows the corresponding attention maps and the third column shows the video frame modulated with the projected attention map.
doi:10.5244/c.23.14 dblp:conf/bmvc/BenfoldR09 fatcat:o5ainfv475b4tns642hrretjam
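The attention map in the abstract above could be accumulated along these lines. This is only a sketch under stated assumptions: the abstract does not say how gaze estimates are turned into per-cell attention, so the ray-casting scheme, the function name `accumulate_attention`, and the `max_range` and `step` parameters are all hypothetical. Each observation is a ground-plane position in metres plus a gaze angle in radians, and it votes once for every 1 m × 1 m cell its gaze ray passes through.

```python
import math
from collections import Counter


def accumulate_attention(observations, max_range=10.0, step=0.5):
    """Accumulate gaze votes on a grid of 1 m x 1 m ground-plane cells.

    observations: iterable of (x, y, gaze_angle), with x and y in metres and
    gaze_angle in radians measured on the ground plane.
    """
    attention = Counter()
    for x, y, angle in observations:
        visited = set()
        distance = step
        while distance <= max_range:
            cell = (math.floor(x + distance * math.cos(angle)),
                    math.floor(y + distance * math.sin(angle)))
            visited.add(cell)
            distance += step
        # Each observation votes at most once per cell it looks across.
        for cell in visited:
            attention[cell] += 1
    return attention


# Two hypothetical pedestrians whose gaze rays cross in the cell (5, 0).
attention = accumulate_attention([
    (0.5, 0.5, 0.0),             # looking along +x
    (5.5, -4.5, math.pi / 2.0),  # looking along +y
])
```

Cells where rays from several people intersect collect the most votes, which is the intuition behind locating a transient subject of attention from multiple gaze estimates.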

Stable multi-target tracking in real-time surveillance video

Ben Benfold, Ian Reid
2011 CVPR 2011  
The majority of existing pedestrian trackers concentrate on maintaining the identities of targets; however, systems for remote biometric analysis or activity recognition in surveillance video often require stable bounding-boxes around pedestrians rather than approximate locations. We present a multi-target tracking system that is designed specifically for the provision of stable and accurate head location estimates. By performing data association over a sliding window of frames, we are able to correct many data association errors and fill in gaps where observations are missed. The approach is multi-threaded and combines asynchronous HOG detections with simultaneous KLT tracking and Markov-Chain Monte-Carlo Data Association (MCMCDA) to provide guaranteed real-time tracking in high definition video. Where previous approaches have used ad-hoc models for data association, we use a more principled approach based on MDL which accurately models the affinity between observations. We demonstrate by qualitative and quantitative evaluation that the system is capable of providing precise location estimates for large crowds of pedestrians in real-time. To facilitate future performance comparisons, we will make a new dataset with hand annotated ground truth head locations publicly available.
doi:10.1109/cvpr.2011.5995667 dblp:conf/cvpr/BenfoldR11 fatcat:2ird3mbrrbhzpnkig5ga7cflma

Unsupervised learning of a scene-specific coarse gaze estimator

Ben Benfold, Ian Reid
2011 2011 International Conference on Computer Vision  
We present a method to estimate the coarse gaze directions of people from surveillance data. Unlike previous work we aim to do this without recourse to a large hand-labelled corpus of training data. In contrast, we propose a method for learning a classifier without any hand-labelled data using only the output from an automatic tracking system. A Conditional Random Field is used to model the interactions between the head motion, walking direction, and appearance to recover the gaze directions and simultaneously train randomised decision tree classifiers. Experiments demonstrate performance exceeding that of conventionally trained classifiers on two large surveillance datasets.
doi:10.1109/iccv.2011.6126516 dblp:conf/iccv/BenfoldR11 fatcat:yhbi26kgdngyraclvaplsrnklu

Gaze directed camera control for face image acquisition

Eric Sommerlade, Ben Benfold, Ian Reid
2011 2011 IEEE International Conference on Robotics and Automation  
STATIC CAMERA TRACKING AND COARSE GAZE ESTIMATION. The static camera tracker uses the approach of Benfold and Reid [4], who tracked the heads of pedestrians using a combination of sparse optical flow measurements  ...
doi:10.1109/icra.2011.5979585 dblp:conf/icra/SommerladeBR11 fatcat:swkjasmbqbbbfjkxwmdbehwnye

Cognitive visual tracking and camera control

Nicola Bellotto, Ben Benfold, Hanno Harland, Hans-Hellmut Nagel, Nicola Pirlo, Ian Reid, Eric Sommerlade, Chuan Zhao
2012 Computer Vision and Image Understanding  
Cognitive visual tracking is the process of observing and understanding the behaviour of a moving person. This paper presents an efficient solution to extract, in real-time, high-level information from an observed scene, and generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, will serve to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system using SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario show the effectiveness of our approach and its potential for real applications of cognitive vision.
doi:10.1016/j.cviu.2011.09.011 fatcat:yehx5wf555gdzcvjdk3nn5qxtm

A distributed camera system for multi-resolution surveillance

Nicola Bellotto, Eric Sommerlade, Ben Benfold, Charles Bibby, Ian Reid, Daniel Roth, Carles Fernandez, Luc Van Gool, Jordi Gonzalez
2009 2009 Third ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC)  
We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor. Asynchronous interprocess communications and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database. Visual tracking data from static views are stored dynamically into tables in the database via client calls to the SQL server. A supervisor process running on the SQL server determines if active zoom cameras should be dispatched to observe a particular target, and this message is effected via writing demands into another database table. We show results from a real implementation of the system comprising one static camera overviewing the environment under consideration and a PTZ camera operating under closed-loop velocity control, which uses a fast and robust level-set-based region tracker. Experiments demonstrate the effectiveness of our approach and its feasibility for multi-camera systems for intelligent surveillance.
doi:10.1109/icdsc.2009.5289413 dblp:conf/icdsc/BellottoSBBRRTGS09 fatcat:zvhpco554fd2tc3ld43c3myghu
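The database-as-message-channel design in the abstract above can be illustrated with a small sketch. Everything specific here is an assumption: an in-memory SQLite database stands in for the SQL server, the table names (`observations`, `demands`), column names and function names are hypothetical, and the supervisor policy (follow the most recently seen target) is a placeholder for whatever rule the real system uses.

```python
import sqlite3


def create_repository():
    """Create the shared repository (in-memory SQLite as a stand-in for the SQL server)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE observations (target_id INTEGER, x REAL, y REAL, t REAL)")
    db.execute(
        "CREATE TABLE demands (camera_id INTEGER, target_id INTEGER, handled INTEGER DEFAULT 0)"
    )
    return db


def store_observation(db, target_id, x, y, t):
    """Static-camera tracker: write one tracking observation into the repository."""
    db.execute("INSERT INTO observations VALUES (?, ?, ?, ?)", (target_id, x, y, t))
    db.commit()


def supervise(db, camera_id):
    """Supervisor: ask a PTZ camera to observe the most recently seen target."""
    row = db.execute(
        "SELECT target_id FROM observations ORDER BY t DESC LIMIT 1"
    ).fetchone()
    if row is not None:
        db.execute(
            "INSERT INTO demands (camera_id, target_id) VALUES (?, ?)", (camera_id, row[0])
        )
        db.commit()


def next_demand(db, camera_id):
    """PTZ camera process: fetch the oldest unhandled demand and mark it handled."""
    row = db.execute(
        "SELECT rowid, target_id FROM demands "
        "WHERE camera_id = ? AND handled = 0 ORDER BY rowid LIMIT 1",
        (camera_id,),
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE demands SET handled = 1 WHERE rowid = ?", (row[0],))
    db.commit()
    return row[1]
```

Because the processes only read and write tables, none of them needs to know whether the others are running, which is what makes the communication asynchronous and leaves an archive of the data as a side effect.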

Page 30 of Theatrical Journal Vol. 11, Issue 528 [page]

1850 Theatrical Journal  
Benfold concludes with the present week, but fresh attractions will compensate the visitors. In addition to Mr. Benfold, Signor Correlli and his infant sons have been performing every evening.  ...  Egerton Wilks’s drama of “Ben the Boatswain,” has drawn very good houses. Lerps.—Princess’s.—“Virginius,” the “Mysteries of?  ...

A method for performance diagnosis and evaluation of video trackers

Tahir Nawaz, Anna Ellis, James Ferryman
2017 Signal, Image and Video Processing  
Benfold-Tracker.  ...  (PirsiavashTracker) [17], Yang and Nevatia (YangTracker) [21], Benfold and Reid (Benfold-Tracker) [3], and Poiesi et al. (PoiesiTracker) [18].  ... 
doi:10.1007/s11760-017-1086-7 fatcat:y2x7powt6ngxjj223d4mgpbd2u

Target Tracking In Real Time Surveillance Cameras and Videos [article]

Nayyab Naseem, Mehreen Sirshar
2015 arXiv   pre-print
Research Paper    Precision    Accuracy
Ben Benfold       73.6%        59.9%
Breitenstein      67.0%        78.1%
Anton Milan       87.2%        66.4%
Yi Yang           -            78.5%
Nayyab Naseem     72.5%        78%
 ...
arXiv:1506.06659v1 fatcat:cdrpzcoxxbfwrjxtije4jc3pou

Joint multi-person detection and tracking from overlapping cameras

Martijn C. Liem, Dariu M. Gavrila
2014 Computer Vision and Image Understanding  
This method is extended with an appearance model by Ben Shitrit et al. [23].  ...  Benfold and Reid [3] use a HOG based head detector to detect heads from a bird's-eye-view camera perspective and extrapolate full body detections using a fixed ground plane.  ...
doi:10.1016/j.cviu.2014.06.003 fatcat:vk4ceu54urec7bcrqjpercd6qe

WPSS: watching people security services

Henri Bouma, Jan Baan, Sander Borsboom, Kasper van Zon, Xinghan Luo, Ben Loke, Bram Stoeller, Hans van Kuilenburg, Judith Dijk, Roberto Zamboni, Francois Kajzar, Attila A. Szep (+2 others)
2013 Optics and Photonics for Counterterrorism, Crime Fighting and Defence IX; and Optical Materials and Biomaterials in Security and Defence Systems Technology X  
Inspired by the work of Benfold and Reid [6], our system tracks human attention by making a rough estimate of the gaze direction for each person in the scene.  ...
doi:10.1117/12.2031639 fatcat:fch35biedbab5fcsskvx4lafia


David Jones
2016 International Journal of Business Management & Research (IJBMR)   unpublished
36 Shen, Lu, and Ben Westcott. 2016.  ...  countries may be bluffing, at least in part, on the South China Sea is evidenced by the resumption of joint naval exercises with China's Northern Fleet at Qingdao by the guided missile destroyer USS Benfold  ...

Multimodale Bestimmung des visuellen Aufmerksamkeitsfokus von Personen am Beispiel aufmerksamer Umgebungen

Michael Voit
By contrast, the 2008 system of Benfold and Reid [BR08] builds on [RR06].  ...  A method that infers attention targets directly from head rotation is described, for example, by Murphy-Chutorian and Trivedi in [MCT08a].  ...
doi:10.5445/ir/1000024787 fatcat:65bzdd5hurhx5ldv7vhvsh7oce

The First Workshop on Holistic Human Factors for Adaptive Cooperative Human-Machine Systems COGNITIVE 2015 Committee COGNITIVE Advisory Chairs

Hermann Kaindl, Narayanan Kulathuramaiyer, Malaysia, Jose Alfredo, F Costa, Hakim Lounis, Canada, Om Rishi, Hermann Kaindl, Narayanan Kulathuramaiyer, Malaysia, Jose Alfredo (+91 others)
In this context, Benfold and Reid [15] built upon evidence from the estimated head poses of large crowds to guide a visual surveillance system towards interesting points. C.  ...  Method Proposed by Olfa Ben Ahmed She used the Region of Interest (ROI) to extract the hippocampus and cingulate cortex. For the classification step, she used the Bag of Visual Words (BOVW) method.  ...