Proceeding of the 16th ACM international conference on Multimedia - MM '08
The popularity of camera phones enables many exciting multimedia applications. In this paper, we present a novel technology and several applications that allow users to interact with paper documents, books, and magazines. This interaction takes the form of reading and writing electronic information, such as images, web URLs, video, and audio, to the paper medium by pointing a camera phone at a patch of text on a document. Our application does not require any special markings, barcodes, or
... arks on the paper document. Instead, we propose a document recognition algorithm that, given a small document image, automatically determines the location of that patch of text in a large collection of document images. This is very challenging because the majority of phone cameras lack autofocus and macro capabilities and produce low-quality images and video. We developed a novel algorithm, Brick Wall Coding (BWC), that performs image-based document recognition using mobile phone video frames. Given a document patch image, BWC uses the layout, i.e., the relative locations, of word boxes to determine the original file, the page, and the location on the page. BWC runs in real time (4 frames per second) on a Treo 700w smartphone with a 312 MHz processor and 64 MB of RAM. Using our method, we can recognize blurry document patch frames that contain as few as 4-5 lines of text at a video resolution as low as 176×144. We performed experiments by indexing 4397 document pages and querying this database with 533 document patches. Besides describing the basic algorithm, this paper also describes several applications that are enabled by mobile phone-paper interaction, such as inserting electronic annotations into paper, using paper as a tangible interface to collect and communicate multimedia data, and collaborative homework.
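To make the idea of layout-based lookup concrete, the following is a minimal sketch and not the BWC algorithm itself, whose coding scheme is not specified in this abstract. It assumes that word boxes have already been extracted, that each text line is given as a list of word-box widths normalized by text-line height, and that runs of three coarsely quantized widths are distinctive enough to vote for a page and a line offset. The function names (`ngrams`, `build_index`, `locate`), the quantization step, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict

def ngrams(widths, n=3, step=0.5):
    """Quantize word-box widths and return (start_index, n-gram) pairs.

    The bucket size `step` is an assumed value, chosen only for illustration.
    """
    q = [max(1, round(w / step)) for w in widths]
    return [(i, tuple(q[i:i + n])) for i in range(len(q) - n + 1)]

def build_index(pages, n=3):
    """Map each width n-gram to every (page_id, line_index) where it occurs."""
    index = defaultdict(list)
    for page_id, lines in pages.items():
        for line_idx, widths in enumerate(lines):
            for _, gram in ngrams(widths, n):
                index[gram].append((page_id, line_idx))
    return index

def locate(index, patch, n=3):
    """Vote for the (page, line offset) that best explains the patch layout.

    Returns (page_id, first_line_of_patch_on_page, votes), or None if no
    n-gram of the patch appears in the index.
    """
    votes = Counter()
    for patch_line, widths in enumerate(patch):
        for _, gram in ngrams(widths, n):
            for page_id, line_idx in index.get(gram, ()):
                votes[(page_id, line_idx - patch_line)] += 1
    if not votes:
        return None
    (page_id, offset), score = votes.most_common(1)[0]
    return page_id, offset, score

# Hypothetical data: two indexed pages and a slightly noisy two-line patch
# cropped from lines 1-2 of "doc1:p1".
pages = {
    "doc1:p1": [[2.1, 4.0, 1.0, 3.1, 5.0],
                [3.0, 3.0, 2.0, 6.1, 1.0],
                [1.0, 5.0, 2.0, 2.0, 4.0],
                [4.0, 1.0, 1.0, 3.0, 2.0]],
    "doc2:p1": [[5.0, 5.0, 1.0, 2.0, 2.0],
                [2.0, 1.0, 4.0, 4.0, 3.0],
                [3.0, 2.0, 5.0, 1.0, 1.0]],
}
patch = [[3.1, 2.0, 5.9, 1.1],
         [4.9, 2.1, 2.0, 3.9]]

index = build_index(pages)
result = locate(index, patch)
print(result)
```

Voting on the line offset rather than on individual matches lets several weak, noisy matches from adjacent lines reinforce one another, which is one plausible reason a layout-based method could tolerate blurry frames with only a few lines of text.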