Proceedings Paper

Text/graphics separation for technical papers
Author(s): Oleg G. Okun; Sergey V. Ablameyko

Paper Abstract

One of the important operations in the automatic analysis of technical papers is the separation of text from graphics. In practice, skew often occurs both in the original document and in its image after scanning. In addition, text and graphics blocks may be non-rectangular. In these cases, standard text/graphics separation methods such as projection profiles or run-length smoothing are not always suitable. In this paper, we propose a text/graphics separation algorithm based on two simple and standard properties of technical paper pages, which we call the area property and the text compactness property. The area property takes into account the geometrical relationships between text and graphics. The text compactness property reflects the spatial relationships between text components within a block and between text and graphics. Applying both properties allows us to perform the separation accurately in the cases above: no skew correction is required before separation, and text and graphics blocks can have arbitrary shape.
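The abstract's remark that projection profiles are not always suitable under skew can be illustrated with a minimal sketch. This is not the authors' algorithm; it only demonstrates the standard horizontal projection profile on a synthetic binary page. The function names, the page size, and the `slope` parameter are assumptions made for this illustration.

```python
def horizontal_projection_profile(image):
    """Count foreground (1) pixels in each row of a binary image."""
    return [sum(row) for row in image]


def blank_rows(profile):
    """Indices of rows with no foreground pixels: candidate cuts between lines."""
    return [y for y, count in enumerate(profile) if count == 0]


def make_page(height, width, line_rows, slope=0.0):
    """Build a synthetic page (hypothetical helper for this sketch).

    Each entry in line_rows is drawn as a one-pixel-thick 'text line'.
    At column x the line is shifted down by int(slope * x), imitating skew.
    """
    image = [[0] * width for _ in range(height)]
    for base in line_rows:
        for x in range(width):
            y = base + int(slope * x)
            if 0 <= y < height:
                image[y][x] = 1
    return image


# Two text lines at rows 2 and 6, first without skew, then with skew.
straight = make_page(height=12, width=20, line_rows=[2, 6], slope=0.0)
skewed = make_page(height=12, width=20, line_rows=[2, 6], slope=0.25)

straight_gaps = blank_rows(horizontal_projection_profile(straight))
skewed_gaps = blank_rows(horizontal_projection_profile(skewed))

print("blank rows, no skew:", straight_gaps)
print("blank rows, skewed: ", skewed_gaps)
```

Without skew, the profile contains blank rows between the two lines, so they can be cut apart. With skew, the lines smear across neighbouring rows, the gap between them disappears from the profile, and a profile-based cut can no longer separate them.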

Paper Details

Date Published: 30 March 1995
PDF: 8 pages
Proc. SPIE 2422, Document Recognition II (30 March 1995); doi: 10.1117/12.205819
Author Affiliations
Oleg G. Okun, Institute of Engineering Cybernetics (Belarus)
Sergey V. Ablameyko, Institute of Engineering Cybernetics (Belarus)

Published in SPIE Proceedings Vol. 2422:
Document Recognition II
Luc M. Vincent; Henry S. Baird, Editor(s)

© SPIE