Proceedings Paper

Document image representation using XML technologies
Author(s): Essam A. El-Kwae; Kusuma Harnath Atmakuri

Paper Abstract

Electronic documents have gained wide acceptance because they make information easy to edit and share. However, paper documents are still widely used in many environments. Moving to a paperless, distributed office has become a major goal of document image research. A new approach to form document representation is presented. This approach allows electronic documents to be shared over the World Wide Web (WWW) using Extensible Markup Language (XML) technologies. Each document is mapped into three different views: an XML view to represent the preprinted and filled-in data, an XSL (Extensible Stylesheet Language) view to represent the structure of the document, and a DTD (Document Type Definition) view to represent the document grammar and field constraints. The XML and XSL views are generated from a document template, either automatically using image processing techniques or semi-automatically with minimal user interaction. The DTD representation may be fixed for general documents or may be generated semi-automatically by mining a number of filled-in document examples. A document template needs to be entered only once to create the proposed representation. Afterwards, documents may be displayed, updated, or shared over the web. The merits of this approach are demonstrated using a number of examples of widely used forms.
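
The three views described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the form name, field ids, labels, and the helper functions read_fields and update_field are all hypothetical, chosen only to show how the data (XML), the layout (XSL), and the grammar (DTD) of one form might be kept separate. Python's standard library neither validates against a DTD nor applies XSLT, so the DTD and XSL views appear here only as text; the sketch reads and updates the XML view.

```python
import xml.etree.ElementTree as ET

# XML view: preprinted labels plus filled-in data.
# The form and its fields are hypothetical examples.
XML_VIEW = """<form name="leave_request">
  <field id="employee_name" label="Employee name">Jane Doe</field>
  <field id="days" label="Days requested">3</field>
</form>"""

# DTD view: document grammar and field constraints (hypothetical).
DTD_VIEW = """<!ELEMENT form (field+)>
<!ATTLIST form name CDATA #REQUIRED>
<!ELEMENT field (#PCDATA)>
<!ATTLIST field id ID #REQUIRED label CDATA #REQUIRED>"""

# XSL view: how the form is laid out for display (hypothetical).
XSL_VIEW = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/form">
    <table>
      <xsl:for-each select="field">
        <tr>
          <td><xsl:value-of select="@label"/></td>
          <td><xsl:value-of select="."/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>"""


def read_fields(xml_text):
    """Return {field id: (preprinted label, filled-in value)} from the XML view."""
    root = ET.fromstring(xml_text)
    return {
        f.get("id"): (f.get("label"), (f.text or "").strip())
        for f in root.findall("field")
    }


def update_field(xml_text, field_id, value):
    """Return a new XML view with one field's filled-in value replaced."""
    root = ET.fromstring(xml_text)
    for f in root.findall("field"):
        if f.get("id") == field_id:
            f.text = value
            return ET.tostring(root, encoding="unicode")
    raise KeyError(field_id)


if __name__ == "__main__":
    print(read_fields(XML_VIEW))
    print(update_field(XML_VIEW, "days", "5"))
```

Because the filled-in values live only in the XML view, updating a form changes that view alone; the XSL and DTD views stay fixed for every instance of the same template. Applying the stylesheet or validating against the DTD would require a third-party XML library.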

Paper Details

Date Published: 18 December 2001
PDF: 12 pages
Proc. SPIE 4670, Document Recognition and Retrieval IX, (18 December 2001); doi: 10.1117/12.450720
Essam A. El-Kwae, Univ. of North Carolina/Charlotte (United States)
Kusuma Harnath Atmakuri, Univ. of North Carolina/Charlotte (United States)


Published in SPIE Proceedings Vol. 4670:
Document Recognition and Retrieval IX
Paul B. Kantor; Tapas Kanungo; Jiangying Zhou, Editor(s)

© SPIE.