Outdoor Human Motion Capture by Simultaneous Optimization of Pose and Camera Parameters

A. Elhayek, C. Stoll, K. I. Kim, C. Theobalt
2014 Computer graphics forum (Print)  
Figure 1: Examples of multi-person tracking with moving cameras. (Left two images) Two actors, with two moving and three static cameras (Soccer1). (Right two images) One actor, with three moving and two static cameras (Walk2).

Abstract

We present a method for capturing the skeletal motions of humans using a sparse set of potentially moving cameras in an uncontrolled environment. Our approach is able to track multiple people even in front of cluttered and non-static backgrounds, and with unsynchronized cameras of varying image quality and frame rate. We rely completely on optical information and do not make use of additional sensor information (e.g. depth images or inertial sensors). Our algorithm simultaneously reconstructs the skeletal pose parameters of multiple performers and the motion of each camera. This is facilitated by a new energy functional that captures the alignment of the model and the camera positions with the input videos in an analytic way. The approach can be adopted in many practical applications to replace complex and expensive motion capture studios with a few consumer-grade cameras, even in uncontrolled outdoor scenes. We demonstrate this on challenging multi-view video sequences captured with unsynchronized and moving (e.g. mobile-phone or GoPro) cameras.

Categories and Subject Descriptors (according to ACM CCS):

Recent years have seen a significant improvement of marker-less skeletal human motion capture algorithms [MHK06, Pop07, SBB10]. Many state-of-the-art methods come close to the performance of marker-based algorithms, but only when recording in highly controlled studio setups, where 1) there are sufficiently many exactly synchronized high-quality cameras; 2) each camera is static and scene motion is due to foreground objects only; 3) the background is not cluttered; 4) lighting is controlled; 5) the main foreground actor is seldom occluded. While this yields a simpler apparatus and a shorter setup time than marker-based systems, the hurdles towards practical application are still large and the costs are still no[...]

Submitted to Computer Graphics Forum (12/2014).
doi:10.1111/cgf.12519 fatcat:edsdqmdvf5ezdbxvdmoltvythu