Understanding the 3D layout of a cluttered room from multiple images

Sid Yingze Bao, Axel Furlan, Li Fei-Fei, Silvio Savarese
2014 IEEE Winter Conference on Applications of Computer Vision  
We present a novel framework for robustly understanding the geometric and semantic structure of a cluttered room from a small number of images captured from different viewpoints. We address two tasks: i) estimating the 3D layout of the room, that is, the 3D configuration of the floor, walls, and ceiling; and ii) identifying and localizing all the foreground objects in the room. We jointly use multiview geometry constraints and image appearance to identify the best room layout configuration. Extensive experimental evaluation demonstrates that our method estimates 3D room structure and recognizes objects more completely and accurately than alternative state-of-the-art algorithms. In addition, we show an augmented reality mobile application to demonstrate the accuracy of our method, which may be beneficial to many computer vision applications.
Recently, [1] proposed an approach to jointly estimating the geometric and semantic properties of a scene. Using a small set of images, [1] shows better 3D geometry estimation and object recognition results than geometry estimation methods or semantic reasoning methods that work in isolation. Unfortunately, one of its shortcomings is that it can only produce a very sparse reconstruction of a scene (Fig. 1c), which is not desirable for the aforementioned applications. Another notable line of work concentrates on parsing the room layout from a single image [9, 10, 11, 16, 15, 18,
doi:10.1109/wacv.2014.6836035 dblp:conf/wacv/BaoFLS14 fatcat:35fmuivbuzgrblvbgs7hgyuukq