Optical Engineering

Automated analysis of mixed documents consisting of printed Korean/alphanumeric texts and graphic images
Author(s): Young Kug Ham; Hong Kyu Chung; In Kwon Kim; Rae-Hong Park

Paper Abstract

An efficient algorithm is proposed that recognizes a mixed document consisting of printed Korean/alphanumeric text and graphic images. In the preprocessing step, the input document is skew-normalized, if necessary, by rotating it by the angle detected with the Hough transform. The graphic image parts are then separated from the text parts by considering chain codes of connected components, and individual characters are separated using vertical and horizontal projections. In the recognition step, a mixed text consisting of two different character sets, e.g., Korean and alphanumeric characters, is recognized. Korean and alphanumeric characters are classified, and each is recognized hierarchically using several effective features. The output is obtained by combining the recognized characters and the separated graphic parts. The performance of the proposed algorithm is demonstrated via computer simulation.
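
The character-separation step can be illustrated with a minimal sketch of projection-profile segmentation. This is not the authors' implementation: the function names `runs` and `segment_characters`, the binary list-of-lists image format, and the rule of splitting at any completely empty row or column are all assumptions made for illustration. The sketch also assumes the page has already been deskewed and that graphic regions have been removed.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed format: 1 = ink pixel, 0 = background


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open [start, end) index ranges where the profile is nonzero."""
    spans = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ends at an empty position
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_characters(img: Image) -> List[Tuple[int, int, int, int]]:
    """Return (top, bottom, left, right) boxes, half-open, in reading order."""
    boxes = []
    # Horizontal projection: ink count per row separates text lines.
    row_profile = [sum(row) for row in img]
    for top, bottom in runs(row_profile):
        line = img[top:bottom]
        # Vertical projection within the line: ink count per column
        # separates candidate characters.
        col_profile = [sum(col) for col in zip(*line)]
        for left, right in runs(col_profile):
            boxes.append((top, bottom, left, right))
    return boxes


if __name__ == "__main__":
    page = [
        [1, 1, 0, 1],
        [1, 0, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0],
    ]
    for box in segment_characters(page):
        print(box)
```

Splitting at every empty column is the simplest possible rule. A printed Korean syllable can contain components separated by a blank column, so a practical system would likely need additional merging logic that this sketch does not attempt.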

Paper Details

Date Published: 1 June 1994
PDF: 9 pages
Opt. Eng. 33(6) doi: 10.1117/12.171323
Published in: Optical Engineering Volume 33, Issue 6
Author Affiliations
Young Kug Ham, Sogang Univ. (South Korea)
Hong Kyu Chung, Sogang Univ. (South Korea)
In Kwon Kim, Sogang Univ. (South Korea)
Rae-Hong Park, Sogang Univ. (South Korea)

© SPIE