Proceedings Paper

Fast and accurate skew detection algorithm for a text document or a document with straight lines
Author(s): Goroh Bessho; Koichi Ejiri; John F. Cullen

Paper Abstract

Bit-mapped images are becoming more common in offices, but skew remains a major problem for many otherwise promising applications. To remove skew, we propose a new algorithm that makes use of both printed characters and straight lines. Lines on a document are decomposed into small segments of black runs. By checking the connectivity of these runs, we can easily tell whether they belong to the same line. To suppress the effect of variation in line width, we sample a number of x-y coordinates along the black runs, at pixels adjacent to white pixels. These coordinates determine a correlation function, which yields a correlation value. If the value is close to 1.0, we compute the higher-probability regression coefficient using the same parameters. The algorithm is effective for both horizontal and vertical lines. The coefficients can also be used to align character lines. The rectangles formed by connected black pixels are extracted using two or three different compression ratios. By checking the coordinates of these rectangles in the multiple compressed images, we can tell whether characters belong to the same character line.
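
The correlation-then-regression step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `estimate_skew`, the 0.95 threshold standing in for "close to 1.0", the ordinary least-squares fit, and the rule of regressing along whichever axis has the larger spread are all assumptions made for illustration. The input is assumed to be a list of (x, y) edge coordinates already sampled from one line.

```python
import math


def estimate_skew(points, min_abs_r=0.95):
    """Estimate skew from sampled (x, y) edge coordinates of one line.

    Returns (orientation, angle_degrees, r), or None when the samples
    are not collinear enough. The threshold and the return layout are
    illustrative assumptions, not values taken from the paper.
    """
    n = len(points)
    if n < 2:
        return None

    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)

    if sxx == 0 and syy == 0:
        return None  # all samples coincide, no direction to fit

    if sxx == 0 or syy == 0:
        # Exactly axis-aligned samples: the correlation is undefined,
        # so report zero skew and r = 1.0 by convention (assumption).
        orientation = "horizontal" if syy == 0 else "vertical"
        return orientation, 0.0, 1.0

    # Pearson correlation of the sampled coordinates.
    r = sxy / math.sqrt(sxx * syy)
    if abs(r) < min_abs_r:
        return None  # not close enough to 1.0, reject these samples

    if sxx >= syy:
        # Near-horizontal line: regress y on x, angle from the x axis.
        orientation = "horizontal"
        slope = sxy / sxx
    else:
        # Near-vertical line: regress x on y, angle from the y axis.
        orientation = "vertical"
        slope = sxy / syy

    return orientation, math.degrees(math.atan(slope)), r
```

Swapping the regression axis is one simple way to treat horizontal and vertical lines alike. Note that a nearly level line with pixel jitter gives a low correlation under this plain formulation, so a practical version would need more care than this sketch shows.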

Paper Details

Date Published: 23 March 1994
PDF: 8 pages
Proc. SPIE 2181, Document Recognition, (23 March 1994); doi: 10.1117/12.171101
Goroh Bessho, Ricoh Co., Ltd. (Japan)
Koichi Ejiri, Ricoh Co., Ltd. (Japan)
John F. Cullen, Ricoh California Research Ctr. (United States)

Published in SPIE Proceedings Vol. 2181:
Document Recognition
Luc M. Vincent; Theo Pavlidis, Editor(s)
