
Proceedings Paper

Segmentation Of Binary Images Into Text Strings And Graphics
Author(s): Lloyd Alan Fletcher; Rangachar Kasturi

Paper Abstract

An automated system for document analysis is highly desirable. A digitized image containing a mixture of text and graphics should be segmented so that the text regions and the graphics regions can each be represented more efficiently. This paper describes the development and implementation of a new algorithm for automated text string separation that is relatively independent of changes in text font style and size, and of string orientation. The algorithm does not explicitly recognize individual characters. Its principal components are the generation of connected components and the application of the Hough transform to group components logically into character strings, which may then be separated from the graphics. The algorithm outputs two images, one containing text strings and the other containing graphics. These images may then be processed by suitable character recognition and graphics recognition systems. The performance of the algorithm, in terms of both effectiveness and computational efficiency, was evaluated using several test images, and the results are described. The evaluations show that the algorithm performs better than other techniques.
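The two building blocks named in the abstract, connected component generation and a Hough transform over those components, can be illustrated with a small sketch. This is not the authors' implementation. It assumes 8-connectivity, uses each component's centroid as its representative point, and picks an arbitrary angular step of 5 degrees and a distance step of 1 pixel. The function names `connected_components`, `centroid`, and `hough_collinear` are hypothetical, and the size-based filtering a full system would need is omitted.

```python
import math
from collections import deque, defaultdict

def connected_components(image):
    """Group 8-connected foreground (1) pixels; return one pixel list per component."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components

def centroid(pixels):
    """Mean (y, x) position of a component's pixels."""
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)

def hough_collinear(points, theta_step_deg=5, rho_step=1.0):
    """Vote in (theta, rho) space; return indices of the points in the fullest bin."""
    votes = defaultdict(list)
    for idx, (y, x) in enumerate(points):
        for deg in range(0, 180, theta_step_deg):
            t = math.radians(deg)
            rho = x * math.cos(t) + y * math.sin(t)  # normal form of a line
            votes[(deg, round(rho / rho_step))].append(idx)
    return max(votes.values(), key=len)

# Toy image (an assumption for illustration): three one-pixel "characters"
# on a single row and one larger 3x3 "graphics" blob elsewhere.
image = [[0] * 13 for _ in range(9)]
for x in (1, 4, 7):
    image[1][x] = 1
for y in range(5, 8):
    for x in range(9, 12):
        image[y][x] = 1

components = connected_components(image)
centroids = [centroid(p) for p in components]
string_members = hough_collinear(centroids)  # indices of collinear components
```

In this toy case the three small components share one accumulator cell and are returned as a candidate string, while the larger blob is left out. A practical system would also need to use component size and spacing before accepting a group as text.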

Paper Details

Date Published: 11 May 1987
PDF: 8 pages
Proc. SPIE 0786, Applications of Artificial Intelligence V, (11 May 1987); doi: 10.1117/12.940666
Lloyd Alan Fletcher, Bell Communications Research (United States)
Rangachar Kasturi, The Pennsylvania State University (United States)

Published in SPIE Proceedings Vol. 0786:
Applications of Artificial Intelligence V
John F. Gilmore, Editor
