Event-based Vision for High-Speed Robotics
Cameras are appealing sensors for mobile robots because they are small, passive, inexpensive, and provide rich information about the environment. While cameras have been used successfully on a multitude of robots, such as autonomous cars or drones, serious challenges remain: power consumption, latency, dynamic range, and frame rate, among others. The sequences of images acquired by a camera are highly redundant (both in space and time), and acquiring and processing such an amount of data consumes significant power. This limits the operation time of mobile robots and, moreover, imposes a fundamental power-latency tradeoff. Specialized cameras designed for high-speed or high-dynamic-range scenarios are expensive, heavy, and require additional power, which prevents their use on agile mobile robots.

In this thesis, we investigate event cameras as a biologically inspired alternative that overcomes the limitations of standard cameras. These neuromorphic vision sensors work in a completely different way: instead of providing a sequence of images (i.e., frames) at a constant rate, event cameras transmit only information from those pixels that undergo a significant brightness change. These pixel-level brightness changes, called events, are timestamped with microsecond resolution and transmitted asynchronously at the time they occur. Hence, event cameras are power efficient because they convey only non-redundant information, and they are able to capture very high-speed motions, thus directly addressing the power-latency tradeoff. Additionally, event cameras achieve a dynamic range of more than 140 dB, compared to about 60 dB for standard cameras, because each pixel is autonomous and operates at its own set-point. However, since the output of an event camera is fundamentally different from that of the standard cameras for which computer-vision algorithms have been developed over the past fifty years, new algorithms that can deal with the asynchronous nature of the sensor and exploit its high temporal resolution are required to unlock its potential. This thesis presents algorithms for using event cameras in the context of robotics. Since event cameras are novel sensors that are being intensively prototyped and have been commercially available only recently (ca. 2008), the literature on event-based algorithms is scarce. This poses some operational challenges, as well as countless opportunities for research.
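To make the sensor model concrete, the following is a minimal sketch (not taken from the thesis; all names are illustrative) of the event representation described above: each event carries a pixel location, a microsecond-resolution timestamp, and a polarity, and a common first processing step is to accumulate events over a short time window into a frame-like image.

```python
from dataclasses import dataclass

@dataclass
class Event:
    t: float        # timestamp in seconds (hardware provides microsecond resolution)
    x: int          # pixel column
    y: int          # pixel row
    polarity: int   # +1 for a brightness increase, -1 for a decrease

def accumulate(events, width, height, t0, t1):
    """Sum event polarities per pixel over the time window [t0, t1).

    This is one simple way to turn the asynchronous, sparse event
    stream into a frame-like array for visualization or processing;
    it discards the fine temporal information that event-based
    algorithms aim to exploit.
    """
    img = [[0] * width for _ in range(height)]
    for e in events:
        if t0 <= e.t < t1:
            img[e.y][e.x] += e.polarity
    return img
```

Note that only pixels that observed a brightness change contribute; a static scene produces (ideally) no events at all, which is the source of the sensor's power efficiency.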
This thesis focuses on exploring the possibilities that event cameras bring to fundamental problems in robotics and computer vision, such as localization and actuation. Among other contributions, this thesis addresses the localization problem, i.e., enabling a robot equipped with an event camera to infer its location with respect to a given map of the environment. Classical approaches to robot localization build upon lower-level vision algorithms, and so this thesis also contributes to the detection, extraction, and tracking of salient visual features with an event camera, whose applicability extends far beyond the localization problem. This thesis further contributes to the use of event cameras for actuation and closed-loop control, i.e., endowing the robot with the capability to interact with the environment to fulfill a given task. Additionally, it presents the infrastructure developed to work with event cameras on a de facto standard robotics platform. The contributions are:

* Software infrastructure, consisting of publicly available drivers, calibration tools, sensor-delay characterization, and the first event-camera dataset and simulator tailored for 6-DOF (degrees of freedom) camera pose estimation and SLAM (simultaneous localization and mapping).
* The concept of event "lifetime" and an algorithm to compute it. The lifetime endows events with a finite temporal extent, yielding a proper continuous representation of events in time.
* The first method to extract FAST-like visual features (i.e., interest points or corners) from the output of an event camera. The detector operates an order of magnitude faster than previous corner detectors.
* The first method to extract and track features from the output of a DAVIS camera (an event camera that also outputs standard frames from the same pixel array). Using these feature tracks, we developed the first sparse, feature-based visual-odometry pipeline.
* The first two methods to track the 6-DOF pose of an event camera in a known map. While the first method minimizes the reprojection error of the events and only works on black-and-white scenes consisting of line segments, the second method uses a probabilistic filtering framework that allows tracking at high speeds on natural scenes.
* The first application of a continuous-time framework to estimate the trajectory of an event camera, possibly incorporating inertial measurements, showing superior performance over pose-tracking-only methods.
* An application of event cameras to collision avoidance on a quadrotor, showing how event cameras can be used to control a robot with very low latency.
* An application of an event camera to human-vs-machine slot-car racing, showing that event-driven algorithms are power efficient and can outperform human control.
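The event "lifetime" mentioned among the contributions can be illustrated with a small sketch. This is an assumption-laden simplification, not the thesis algorithm: it takes the intuition that an event remains valid roughly for the time the moving edge that generated it takes to travel one pixel, so the lifetime is the inverse of the local optical-flow speed (estimating that flow is out of scope here).

```python
import math

def event_lifetime(vx, vy):
    """Estimate an event's lifetime from the local optical flow.

    Args:
        vx, vy: local image-plane velocity of the edge, in pixels/second
                (illustrative inputs; how they are estimated is not shown).

    Returns:
        Lifetime in seconds: the time for the edge to move one pixel,
        i.e. 1 / |v|. A zero-velocity edge gets an infinite lifetime,
        since no further event is expected at that pixel.
    """
    speed = math.hypot(vx, vy)
    if speed == 0.0:
        return float("inf")
    return 1.0 / speed
```

For example, an edge moving at 500 pixels/second gives each of its events a lifetime of 2 ms, so a continuous-time representation would keep the event "alive" only for that interval.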