
Proceedings Paper

Automatic document processing system with learning capability
Author(s): Xuhong Li; Peter A. Ng

Paper Abstract

This automatic document processing system scans a given paper document into the system, automatically recognizes the document's layout structure, classifies the document as a particular document type, which is characterized in terms of attributes that form a frame template, and extracts the pertinent information from the document to form its corresponding frame instance, an effective digital form of the original document. The key attribute of the system is that it is general-purpose and can be adapted easily to any application domain. A segmentation method based on 'logical closeness' is proposed. A novel and natural representation of document layout structure, the Labeled Directed Weighted Graph (LDWG), and a methodology for transforming a document segmentation into its LDWG representation are described. To classify a given document, we first compare its layout structure with the sample layout structures of various document types prestored in the knowledge base, and then use the logical structure to verify the initial match. A weight is associated with each component of the layout structure. During the learning stage, the system adjusts the weights automatically based on a human operator's corrections, applying a modified Perceptron Learning Algorithm (PLA).
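
The abstract does not specify the matching score or the modification made to the PLA, so the following is only a rough sketch of the general idea of weighted layout matching with correction-driven weight updates. It assumes a plain multiclass perceptron update rather than the authors' modified algorithm, and it flattens each layout structure to a set of component names instead of an LDWG. The names `LayoutTemplate`, `classify`, `learn_from_correction`, and `rate` are hypothetical and do not come from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LayoutTemplate:
    """Hypothetical stand-in for one prestored sample layout structure."""
    doc_type: str
    components: list[str]  # layout component names (assumed representation)
    weights: dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every component starts with the same weight (an assumption).
        for name in self.components:
            self.weights.setdefault(name, 1.0)

    def score(self, present: set[str]) -> float:
        """Sum the weights of template components found in the document."""
        return sum(w for name, w in self.weights.items() if name in present)


def classify(present: set[str], templates: list[LayoutTemplate]) -> str:
    """Return the document type whose template scores highest."""
    return max(templates, key=lambda t: t.score(present)).doc_type


def learn_from_correction(present: set[str], templates: list[LayoutTemplate],
                          correct_type: str, rate: float = 0.5) -> str:
    """Plain multiclass perceptron step driven by a human correction.

    If the prediction is wrong, weights of matched components are raised in
    the correct template and lowered in the wrongly chosen one.
    Returns the prediction made before any update.
    """
    predicted = classify(present, templates)
    if predicted == correct_type:
        return predicted  # no mistake, no update
    for t in templates:
        if t.doc_type == correct_type:
            sign = 1.0
        elif t.doc_type == predicted:
            sign = -1.0
        else:
            continue
        for name in t.components:
            if name in present:
                t.weights[name] += sign * rate
    return predicted
```

In this sketch, a reviewer who sees a wrong document type would call `learn_from_correction` with the right type, and later calls to `classify` would use the adjusted weights. The verification step against the logical structure that the abstract mentions is not modeled here.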

Paper Details

Date Published: 17 August 2000
PDF: 9 pages
Proc. SPIE 4050, Automatic Target Recognition X, (17 August 2000); doi: 10.1117/12.395571
Xuhong Li, CUNY/City College (United States)
Peter A. Ng, Univ. of Nebraska/Omaha (United States)


Published in SPIE Proceedings Vol. 4050:
Automatic Target Recognition X
Firooz A. Sadjadi, Editor

© SPIE.