
Proceedings Paper

Historical document image segmentation using background light intensity normalization

Paper Abstract

This paper presents a new binarization algorithm for camera images of historical documents, particularly those held by the Library of Congress of the United States. The algorithm normalizes the background light intensity to enhance an image before a local adaptive binarization algorithm is applied. The normalization step uses an adaptive linear or non-linear function to approximate the uneven background of the image, which results from the uneven surface of the document paper, aged color, or an uneven light source during image capture. The algorithm adaptively captures the background of a document image with a "best fit" approximation. The document image is then normalized with respect to this approximation before a thresholding algorithm is applied. The technique works for both grayscale and color images of historical handwritten documents and significantly improves readability for both human readers and OCR.
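
The pipeline described in the abstract (approximate the background, normalize against it, then threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the fitting procedure, so the sketch assumes the simplest linear case, a least-squares plane `a*x + b*y + c`, refit a few times while ignoring pixels that are clearly darker than the current fit (likely ink). A single global threshold stands in for the local adaptive binarization step. All function names and the `rounds`, `margin`, `white`, and `threshold` parameters are hypothetical.

```python
from typing import List

Image = List[List[float]]  # row-major grayscale image, img[y][x]


def solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / m[r][r]
    return x


def fit_plane(img: Image, mask: List[List[bool]]) -> List[float]:
    """Least-squares fit of intensity ~ a*x + b*y + c over the masked pixels."""
    sxx = sxy = syy = sx = sy = n = 0.0
    sxv = syv = sv = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if not mask[y][x]:
                continue
            sxx += x * x
            sxy += x * y
            syy += y * y
            sx += x
            sy += y
            n += 1.0
            sxv += x * v
            syv += y * v
            sv += v
    # Normal equations of the least-squares problem.
    return solve3([[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]], [sxv, syv, sv])


def estimate_background(img: Image, rounds: int = 3, margin: float = 10.0) -> Image:
    """Assumed 'best fit' stand-in: refit the plane while excluding dark pixels."""
    h, w = len(img), len(img[0])
    mask = [[True] * w for _ in range(h)]
    for _ in range(rounds):
        a, b, c = fit_plane(img, mask)
        # Keep only pixels that are not clearly darker than the current fit.
        mask = [[img[y][x] >= a * x + b * y + c - margin for x in range(w)]
                for y in range(h)]
    a, b, c = fit_plane(img, mask)
    return [[a * x + b * y + c for x in range(w)] for y in range(h)]


def normalize(img: Image, background: Image, white: float = 255.0) -> Image:
    """Divide each pixel by its background estimate so the paper maps to white."""
    return [[min(white, v / max(bg, 1e-6) * white) for v, bg in zip(row, bg_row)]
            for row, bg_row in zip(img, background)]


def binarize(img: Image, threshold: float = 128.0) -> List[List[bool]]:
    """Global threshold (stand-in for local adaptive binarization); True = ink."""
    return [[v < threshold for v in row] for row in img]
```

Calling `binarize(normalize(page, estimate_background(page)))` on a grayscale `page` returns a boolean ink mask. Dividing by the background estimate, rather than subtracting it, is a design choice of this sketch: it keeps ink contrast proportional to the local paper brightness. A non-linear background, which the abstract also mentions, would need a higher-order surface in place of the plane.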

Paper Details

Date Published: 17 January 2005
PDF: 8 pages
Proc. SPIE 5676, Document Recognition and Retrieval XII, (17 January 2005); doi: 10.1117/12.585545
Zhixin Shi, Univ. at Buffalo (United States)
Venu Govindaraju, Univ. at Buffalo (United States)

Published in SPIE Proceedings Vol. 5676:
Document Recognition and Retrieval XII
Elisa H. Barney Smith; Kazem Taghva, Editors
