Proceedings Paper

Document image matching using a maximal grid approach
Author(s): Angelina Tzacheva; Yasser El-Sonbaty; Essam A. El-Kwae

Paper Abstract

A new approach for form document representation using the maximal grid of its frameset is presented. Using image processing techniques, a scanned form is transformed into a frameset composed of a number of cells. The maximal grid is the grid that encompasses all the horizontal and vertical lines in the form and can be easily generated from the cell coordinates. The number of cells from the original frameset, included in each of the cells created by the maximal grid, is then calculated. Those numbers are added for each row and column generating an array representation for the frameset. A novel algorithm for similarity matching of document framesets based on their maximal grid representations is introduced. The algorithm is robust to image noise and to line breaks, which makes it applicable to poor quality scanned documents. The matching algorithm renders the similarity between two forms as a value between 0 and 1. Thus, it may be used to rank the forms in a database according to their similarity to a query form. Several experiments were performed in order to demonstrate the accuracy and the efficiency of the proposed approach.
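The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the data layout, the exact counting rule, or the matching formula, so everything below is an assumption made for illustration: cells are hypothetical `(x0, y0, x1, y1)` rectangles, "included in" is read as "overlaps with positive area", and `similarity` is a stand-in min/max ratio (not the paper's matching algorithm) chosen only because it also yields a value between 0 and 1.

```python
from typing import List, Sequence, Tuple

# Hypothetical cell layout: (x0, y0, x1, y1) with x0 < x1 and y0 < y1.
Cell = Tuple[float, float, float, float]


def maximal_grid(cells: Sequence[Cell]) -> Tuple[List[float], List[float]]:
    """Return the sorted x and y coordinates of every line in the frameset.

    Extending each of these lines across the whole form gives the grid.
    """
    xs = sorted({x for x0, _, x1, _ in cells for x in (x0, x1)})
    ys = sorted({y for _, y0, _, y1 in cells for y in (y0, y1)})
    return xs, ys


def grid_signature(cells: Sequence[Cell]) -> List[int]:
    """Row sums followed by column sums of per-grid-cell counts.

    Assumption: a frameset cell is counted for a grid cell when the two
    rectangles overlap with positive area.
    """
    xs, ys = maximal_grid(cells)
    counts = [
        [
            sum(
                1
                for cx0, cy0, cx1, cy1 in cells
                if cx0 < gx1 and gx0 < cx1 and cy0 < gy1 and gy0 < cy1
            )
            for gx0, gx1 in zip(xs, xs[1:])
        ]
        for gy0, gy1 in zip(ys, ys[1:])
    ]
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    return row_sums + col_sums


def similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Stand-in score in [0, 1]: sum of minima over sum of maxima.

    Not the paper's algorithm; shorter arrays are zero-padded.
    """
    n = max(len(a), len(b))
    pa = list(a) + [0] * (n - len(a))
    pb = list(b) + [0] * (n - len(b))
    denom = sum(max(x, y) for x, y in zip(pa, pb))
    if denom == 0:
        return 1.0  # two empty signatures are treated as identical
    return sum(min(x, y) for x, y in zip(pa, pb)) / denom
```

Under these assumptions, ranking a database against a query form amounts to computing `grid_signature` once per form and sorting the forms by `similarity` to the query's signature. This sketch makes no attempt to reproduce the robustness to noise and line breaks that the abstract attributes to the actual algorithm.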

Paper Details

Date Published: 18 December 2001
PDF: 8 pages
Proc. SPIE 4670, Document Recognition and Retrieval IX, (18 December 2001); doi: 10.1117/12.450721
Angelina Tzacheva, Univ. of North Carolina/Charlotte (United States)
Yasser El-Sonbaty, Arab Academy for Science and Technology (Egypt)
Essam A. El-Kwae, Univ. of North Carolina/Charlotte (United States)

Published in SPIE Proceedings Vol. 4670:
Document Recognition and Retrieval IX
Paul B. Kantor; Tapas Kanungo; Jiangying Zhou, Editor(s)
