Proceedings Paper

Pattern matcher for OCR-corrupted documents and its evaluation
Author(s): Stefan Agne; Hans-Guenther Hein

Paper Abstract

Document classification is a fundamental technology that precedes document routing, document understanding, and information extraction. Pattern matchers with rule-based components are in use at news agencies, where the input is electronic text. Classification of OCR documents, however, must deal with the ambiguities of the underlying OCR engine. Ambiguities in character segmentation and classification lead to a directed graph of characters as the result of the OCR process, the so-called character hypothesis lattice (CHL). This paper deals with techniques to enhance the pattern matcher so that it can cope with CHLs.
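
To make the idea of a character hypothesis lattice concrete, here is a minimal Python sketch. It is not the authors' pattern matcher, which the abstract does not describe in detail. It only illustrates one plausible reading: the lattice is a directed graph whose edges carry character hypotheses, and a keyword matches if some path through the graph spells it. The names `Lattice`, `lattice_contains`, and `toy_lattice`, as well as the example data, are assumptions made up for this illustration.

```python
from typing import Dict, List, Tuple

# Assumed representation: node -> list of (next_node, character hypothesis).
Lattice = Dict[int, List[Tuple[int, str]]]


def lattice_contains(lattice: Lattice, start: int, end: int, word: str) -> bool:
    """Return True if some path from `start` to `end` spells exactly `word`."""
    stack = [(start, 0)]  # (current node, number of characters matched)
    seen = set()
    while stack:
        node, pos = stack.pop()
        if (node, pos) in seen:
            continue
        seen.add((node, pos))
        if node == end and pos == len(word):
            return True
        for nxt, ch in lattice.get(node, []):
            # Follow an edge only if its hypothesis continues the word.
            if word.startswith(ch, pos):
                stack.append((nxt, pos + len(ch)))
    return False


# Hypothetical lattice for a scanned word that could read "corn" or "com".
toy_lattice: Lattice = {
    0: [(1, "c"), (1, "e")],   # classification ambiguity: "c" vs. "e"
    1: [(2, "o"), (2, "0")],   # classification ambiguity: "o" vs. "0"
    2: [(3, "r"), (4, "m")],   # segmentation ambiguity: "r" + "n" vs. "m"
    3: [(4, "n")],
}

if __name__ == "__main__":
    for keyword in ("corn", "com", "cat"):
        print(keyword, lattice_contains(toy_lattice, 0, 4, keyword))
```

In this toy example both "corn" and "com" are found, because the segmentation ambiguity keeps two alternative paths alive, while "cat" is not. A real matcher would presumably also weigh hypothesis confidences, which this sketch leaves out.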

Paper Details

Date Published: 1 April 1998
PDF: 9 pages
Proc. SPIE 3305, Document Recognition V, (1 April 1998); doi: 10.1117/12.304629
Author Affiliations
Stefan Agne, German Research Ctr. for Artificial Intelligence GmbH (Germany)
Hans-Guenther Hein, German Research Ctr. for Artificial Intelligence GmbH (Germany)

Published in SPIE Proceedings Vol. 3305:
Document Recognition V
Daniel P. Lopresti; Jiangying Zhou, Editor(s)
