
Proceedings Paper

Synchronous tracking of outputs from multiple OCR systems
Author(s): Vicente P. Concepcion; Donald P. D'Amato

Paper Abstract

The accuracy of OCR systems has increased in recent years due to improvements in preprocessing methods and recognition algorithms. It has been suggested that even higher accuracy can be attained by integrating the results of two or more OCR systems with uncorrelated errors. There are several methods for integrating outputs, some of which have already been published. However, prior to integration, the individual characters or symbols from the various OCR outputs have to be synchronously tracked in order to compare them. Whereas it is simple to determine character correspondence among strings containing only substitution errors, matching strings of unequal length, which result when an OCR system generates insertion and deletion errors, is more complicated. Detecting loss of synchronization is made more difficult when consecutive errors occur: the length of the error burst must be determined, or bounded from above, before the error can be classified or synchronization restored. This paper focuses on the tracking problem and uses a dynamic programming search in n dimensions, where n is the number of OCR systems. The algorithm models the error generation process at each of the OCR systems and looks for the most probable combination of synchronized OCR outputs from the beginning to the end of all strings. The final output of the process (which can be the input to an integrator) is a series of n-tuples, each containing exactly one output symbol from each OCR system, where a symbol may be a null marking a deletion.
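The tracking step described in the abstract can be illustrated with a minimal sketch of an n-dimensional dynamic programming alignment. The abstract does not give the paper's error-generation model, so this sketch substitutes a simple stand-in: each n-tuple is charged one unit for every pair of systems that disagree (a sum-of-pairs cost), and the lowest-cost path is taken in place of the most probable one. The names `track`, `column_cost`, and `GAP` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product

GAP = "-"  # hypothetical null symbol; assumes "-" does not occur in the OCR text


def column_cost(column):
    """Stand-in cost (an assumption, not the paper's probability model):
    the number of pairs of systems that disagree within one n-tuple."""
    return sum(a != b for a, b in combinations(column, 2))


def track(outputs):
    """Align n OCR output strings and return a list of n-tuples,
    each holding one symbol (or GAP) per system."""
    n = len(outputs)
    lengths = [len(s) for s in outputs]
    # A move consumes one character from every system flagged with 1.
    moves = [m for m in product((0, 1), repeat=n) if any(m)]
    origin = (0,) * n
    best = {origin: (0, None)}  # cell -> (cost so far, move that reached it)

    # Lexicographic order visits every predecessor before its successor.
    for cell in product(*(range(k + 1) for k in lengths)):
        if cell == origin:
            continue
        candidates = []
        for move in moves:
            prev = tuple(c - m for c, m in zip(cell, move))
            if min(prev) < 0:
                continue
            column = tuple(
                outputs[i][prev[i]] if move[i] else GAP for i in range(n)
            )
            candidates.append((best[prev][0] + column_cost(column), move))
        best[cell] = min(candidates)  # ties broken by move, deterministically

    # Trace back from the end of all strings to the origin.
    cell, columns = tuple(lengths), []
    while cell != origin:
        move = best[cell][1]
        prev = tuple(c - m for c, m in zip(cell, move))
        columns.append(
            tuple(outputs[i][prev[i]] if move[i] else GAP for i in range(n))
        )
        cell = prev
    columns.reverse()
    return columns


if __name__ == "__main__":
    for column in track(["recognition", "recogniton", "reeognition"]):
        print(column)
```

The table has one cell per combination of string positions, so its size is the product of the string lengths plus one, and each cell examines 2^n - 1 moves. This makes the sketch practical only for short strings and a small number of systems.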

Paper Details

Date Published: 14 April 1993
PDF: 11 pages
Proc. SPIE 1906, Character Recognition Technologies (14 April 1993); doi: 10.1117/12.143623
Vicente P. Concepcion, MITRE Corp. (United States)
Donald P. D'Amato, MITRE Corp. (United States)

Published in SPIE Proceedings Vol. 1906:
Character Recognition Technologies
Donald P. D'Amato, Editor(s)

© SPIE. Terms of Use