Real-Time Document Image Retrieval with LLAH
What's this?
This page describes a new method for real-time document image retrieval. It
takes images captured by a web camera as input and retrieves the corresponding
pages from a large-scale document image database (DB). The core of the
method is LLAH (Locally Likely Arrangement Hashing), an algorithm
invented in our research group.
A short video introducing the system is also available.
What is the task?
The method views a document image as a collection of feature points. The
task of retrieval is therefore to find the page whose feature points have
a similar arrangement. Take a look at the images below. The query image is
converted to a set of feature points and then matched against the feature
points of the pages in the DB.
Although this task is hard for a human, a machine can accomplish it easily
with the help of LLAH. Can you find the correct answer?
[Figure: a query image (left) and pages in the DB (right)]
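To make "similar arrangement of feature points" concrete, here is a minimal sketch of one quantity that stays the same when a set of points is stretched, rotated, or sheared. It uses the ratio of two triangle areas, which is invariant under affine transformations. This is an illustration only: the point coordinates, the transformation, and the function names are hypothetical, and the sketch is not the descriptor used by the system on this page. An affine map also only approximates a camera's perspective distortion locally.

```python
def triangle_area(p, q, r):
    """Signed area of the triangle with vertices p, q, r."""
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))


def area_ratio(a, b, c, d):
    """Ratio of two triangle areas built from four points.

    An affine map scales every area by the same factor, so the ratio
    is unchanged by the map.
    """
    return triangle_area(a, c, d) / triangle_area(a, b, c)


def affine(point, m, t):
    """Apply the affine map x -> m @ x + t to a 2-D point."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + t[0],
            m[1][0] * x + m[1][1] * y + t[1])


# Hypothetical feature points and a hypothetical distortion.
points = [(0.0, 0.0), (4.0, 1.0), (3.0, 5.0), (-1.0, 3.0)]
m = [[1.2, 0.4], [-0.3, 0.9]]
t = (10.0, -5.0)
warped = [affine(p, m, t) for p in points]

before = area_ratio(*points)
after = area_ratio(*warped)
print(before, after)  # the two values agree up to rounding error
```

Because the value depends only on how the points are arranged relative to each other, it can be computed on both the query and the DB side without knowing the camera pose.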
For each point extracted from the query image, the LLAH-based method tries to find the corresponding point on a page in the DB. A query image contains about 400 points and a page in the DB about 600. For a DB of 10,000 pages, brute-force matching would therefore need 2,400,000,000 (= 400 × 600 × 10,000) point comparisons, far too many for real-time operation. Note also that
These make matching much harder.
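The count above, and the general reason hashing helps, can be sketched in a few lines. The example below is a toy under stated assumptions, not the LLAH implementation: each feature point is assumed to already have a discrete descriptor (here a hypothetical tuple of integers), and the names `build_index` and `retrieve` are made up for illustration. Pages are stored in a hash table keyed by descriptor, so each query point costs one lookup instead of a comparison with every point of every page.

```python
from collections import Counter, defaultdict


def brute_force_comparisons(query_points, page_points, pages):
    """Number of point-to-point comparisons for exhaustive matching."""
    return query_points * page_points * pages


def build_index(db):
    """Map each descriptor to the pages containing it.

    db: {page_id: [descriptor, ...]} with hashable descriptors.
    """
    index = defaultdict(list)
    for page_id, descriptors in db.items():
        for descriptor in descriptors:
            index[descriptor].append(page_id)
    return index


def retrieve(index, query_descriptors):
    """Let every query descriptor vote for the pages it occurs on.

    Returns the page with the most votes, or None if nothing matched.
    """
    votes = Counter()
    for descriptor in query_descriptors:
        for page_id in index.get(descriptor, ()):
            votes[page_id] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical descriptors for two tiny pages.
db = {
    "page_a": [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
    "page_b": [(1, 2, 3), (2, 2, 2), (9, 9, 9)],
}
index = build_index(db)

print(brute_force_comparisons(400, 600, 10_000))  # 2400000000
print(retrieve(index, [(4, 5, 6), (7, 8, 9), (0, 0, 0)]))  # page_a
```

With such an index, the work per query grows with the number of query points (about 400 here) rather than with the total number of points in the DB.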
What has been achieved?
The method is characterized by:
Examples of query images correctly recognized by the method are listed
below. The original images used for the retrieval can be obtained by clicking
these images.
See the videos as well.
[Figure: correctly recognized query images showing perspective distortion, partial capture, occlusion, and non-linear deformation]
The system not only finds the corresponding page in the DB but can also, for example, display information on the retrieved page, as shown in the augmented reality video. The following figure illustrates this functionality. With the system, pages can be regarded as media for displaying various kinds of information, such as diagrams, text, still images, and movies (like the newspapers in the Harry Potter films). You can also establish a link from a real page to the Internet.
Can I try it?
Yes! If you have a web camera (a 1.3-megapixel camera is preferable) and a Windows computer (a dual-CPU machine or two computers are preferable), you can try the system, available from the following.
Functions provided by this software are:
The current distribution does not support the augmented reality
function.
The resolution of query and DB images is limited to 1.3 megapixels or
less. Images with a higher resolution are reduced automatically by the software.
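As a rough illustration of what such a reduction involves, the sketch below computes a target size that keeps the aspect ratio while fitting a pixel budget. It is a guess at the arithmetic, not the software's actual procedure: the function name `reduced_size` is hypothetical, and "1.3 M pixels" is assumed to mean 1,300,000 pixels.

```python
import math

MAX_PIXELS = 1_300_000  # assumption: "1.3 M pixels" read as 1.3 million


def reduced_size(width, height, max_pixels=MAX_PIXELS):
    """Return (width, height) scaled down uniformly to fit max_pixels.

    Images already within the budget are returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width * height <= max_pixels:
        return width, height
    # Area scales with the square of the side length, hence the square root.
    scale = math.sqrt(max_pixels / (width * height))
    # Rounding down keeps the result within the budget.
    new_width = max(1, math.floor(width * scale))
    new_height = max(1, math.floor(height * scale))
    return new_width, new_height


print(reduced_size(640, 480))    # unchanged: already below the limit
print(reduced_size(4000, 3000))  # scaled down, aspect ratio roughly kept
```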
If you are interested in a version of the software without this limitation,
please email us at the following address.
Please note that this software is provided ONLY for research purposes. You CANNOT install it in commercial products.
(patent pending; PCT/JP2006/302669, WO2006/092957)
Source code by Tomohiro Nakai (added on May 23, 2022)
Source code by Kazutaka Takeda (added on May 23, 2022)
For further info.
Who invented this technology?
Tomohiro Nakai, Ph.D. Candidate, Intelligent Media Processing Lab.
Prof. Koichi Kise
Dr. Masakazu Iwamura
Copyright (c) 2006,2007, Intelligent Media Processing Lab., Osaka Prefecture University. All rights reserved.