Proceedings Paper

Similarity pyramid: browsing a document database with respect to visual similarity
Author(s): Ildus Ahmadullin; Jan Allebach

Paper Abstract

Managing large document databases has become an important task. Sorting documents with respect to their visual similarity and layout features, and visualizing the whole document database, are desirable applications. A user may wish to search a database for documents that are similar to a query in terms of their stylistic features, or may want to browse the whole database. In these tasks, clustering similar documents and organizing the document database with respect to the clusters is preferable to presenting documents in a random order. In this paper, we propose organizing single-page documents in a 3-D hierarchical structure called a similarity pyramid. The pyramid is constructed from a stack of embeddings of the document database on a 2-D surface, obtained with a nonlinear dimensionality reduction algorithm called Isomap. The mapping algorithm preserves similarity distances between documents: documents that are close to each other in the feature space are mapped to points that are close to each other on the low-dimensional surface. Higher levels of the pyramid consist of document image icons that each represent a large group of roughly similar documents, whereas lower levels contain document image icons representing small groups of very similar documents. A user can browse the database by moving along a certain level of the pyramid or by moving between different levels.
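
The coarse-to-fine level structure described in the abstract can be illustrated with a small sketch. This is not the authors' construction: the abstract gives no implementation details, and the Isomap embedding step is omitted here. The sketch assumes each document is already a numeric feature vector, and it substitutes two hypothetical rules: a group is split around its two most distant members, and the icon of a group is its medoid. The names `split`, `build_pyramid`, and `icons` are illustrative only.

```python
from math import dist


def medoid(group, features):
    """Index in `group` with the smallest summed distance to the other members."""
    return min(group, key=lambda i: sum(dist(features[i], features[j]) for j in group))


def split(group, features):
    """Assumed rule: split a group in two around its two most distant members."""
    if len(group) < 2:
        return [group]
    pairs = ((i, j) for i in group for j in group if i < j)
    a, b = max(pairs, key=lambda p: dist(features[p[0]], features[p[1]]))
    left = [i for i in group if dist(features[i], features[a]) <= dist(features[i], features[b])]
    right = [i for i in group if i not in left]
    return [part for part in (left, right) if part]  # drop an empty half (duplicate points)


def build_pyramid(features, levels):
    """Level 0 is the top (one large group); each lower level holds smaller groups."""
    pyramid = [[list(range(len(features)))]]
    for _ in range(levels - 1):
        pyramid.append([part for group in pyramid[-1] for part in split(group, features)])
    return pyramid


def icons(level, features):
    """One representative document index (the medoid) per group on a level."""
    return [medoid(group, features) for group in level]


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors forming two visually distinct clusters.
    features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    pyramid = build_pyramid(features, levels=3)
    for depth, level in enumerate(pyramid):
        print(depth, level, icons(level, features))
```

Moving down one level corresponds to replacing an icon with the icons of its subgroups, which is the browsing behaviour the abstract describes. In the paper, the placement of icons within a level comes from the Isomap embedding, which this sketch does not attempt.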

Paper Details

Date Published: 21 February 2012
PDF: 8 pages
Proc. SPIE 8302, Imaging and Printing in a Web 2.0 World III, 83020M (21 February 2012); doi: 10.1117/12.915679
Ildus Ahmadullin, Purdue Univ. (United States)
Jan Allebach, Purdue Univ. (United States)


Published in SPIE Proceedings Vol. 8302:
Imaging and Printing in a Web 2.0 World III
Qian Lin; Jan P. Allebach; Zhigang Fan, Editor(s)
