Proceedings Paper

Measuring document image skew and orientation
Author(s): Dan S. Bloomberg; Gary E. Kopec; Lakshmi Dasari
Paper Abstract

Several approaches have previously been taken for identifying document image skew. At issue are efficiency, accuracy, and robustness. We work directly with the image, maximizing a function of the number of ON pixels in a scanline. Image rotation is simulated by either vertical shear or accumulation of pixel counts along sloped lines. Pixel sum differences on adjacent scanlines reduce isotropic background noise from non-text regions. To find the skew angle, this function is evaluated at a succession of angles. Angles are chosen hierarchically, typically with both a coarse sweep and a fine angular bifurcation. To increase efficiency, measurements are made on subsampled images that have been pre-filtered to maximize sensitivity to image skew. Results are given for a large set of images, including multiple and unaligned text columns, graphics, and large-area halftones. The measured intrinsic angular error is inversely proportional to the number of sampling points on a scanline. This method does not indicate when text is upside down, and it also requires sampling the function at 90 degrees of rotation to measure text skew in landscape mode. However, such text orientation can be determined (as one of four directions) by noting that Roman characters in all languages have many more ascenders than descenders, and using morphological operations to identify such pixels. Only a small amount of text is required for accurate statistical determination of orientation, and images without text are identified as such.
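
The skew search outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (skew_score, find_skew, synthetic_lines), the use of a sum of squared differences as the score, and the sweep and tolerance values are all assumptions made for illustration; the abstract says only that a function of scanline ON-pixel counts is maximized and that differences between adjacent scanlines are used. The image is represented as a list of (x, y) coordinates of ON pixels, and pre-filtering and subsampling are omitted.

```python
import math
from collections import Counter

def skew_score(pixels, angle_deg):
    """Score one candidate angle (hypothetical scoring function).

    ON pixels are accumulated along lines of slope tan(angle), which
    stands in for rotating the image. The score is the sum of squared
    differences between counts on adjacent lines, so it is large when
    text rows line up with the accumulation direction.
    """
    t = math.tan(math.radians(angle_deg))
    rows = Counter(round(y - x * t) for x, y in pixels)
    if not rows:
        return 0
    lo, hi = min(rows), max(rows)
    # Counter returns 0 for rows that received no pixels.
    return sum((rows[r + 1] - rows[r]) ** 2 for r in range(lo - 1, hi + 1))

def find_skew(pixels, sweep=5.0, step=1.0, tol=0.01):
    """Coarse sweep over [-sweep, +sweep], then interval halving.

    sweep, step and tol are in degrees and are assumed values.
    """
    n = int(round(2 * sweep / step))
    angles = [-sweep + i * step for i in range(n + 1)]
    best = max(angles, key=lambda a: skew_score(pixels, a))
    delta = step
    while delta > tol:
        delta /= 2
        candidates = (best - delta, best, best + delta)
        best = max(candidates, key=lambda a: skew_score(pixels, a))
    return best

def synthetic_lines(angle_deg, width=400, baselines=range(20, 200, 20)):
    """Build a toy 'page': one-pixel-thick lines skewed by angle_deg."""
    t = math.tan(math.radians(angle_deg))
    return [(x, round(y0 + x * t)) for y0 in baselines for x in range(width)]

if __name__ == "__main__":
    page = synthetic_lines(2.0)
    print("estimated skew (degrees):", find_skew(page))
```

Because each candidate angle costs one pass over the ON pixels, the hierarchical search evaluates the score only about a dozen times in the coarse sweep and three times per halving step. The width of the synthetic page (400 sampling points per line) limits how finely neighbouring angles can be told apart, in line with the abstract's remark that angular error is inversely proportional to the number of sampling points on a scanline.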

Paper Details

Date Published: 30 March 1995
PDF: 15 pages
Proc. SPIE 2422, Document Recognition II, (30 March 1995); doi: 10.1117/12.205832
Dan S. Bloomberg, Xerox Palo Alto Research Ctr. (United States)
Gary E. Kopec, Xerox Palo Alto Research Ctr. (United States)
Lakshmi Dasari, Xerox Palo Alto Research Ctr. (United States)


Published in SPIE Proceedings Vol. 2422:
Document Recognition II
Luc M. Vincent; Henry S. Baird, Editor(s)
