Proceedings Paper

Robust binarization of degraded document images using heuristics
Author(s): Jon Parker; Ophir Frieder; Gideon Frieder

Paper Abstract

Historically significant documents are often discovered with defects that make them difficult to read and analyze. This is particularly troublesome when the defects prevent software from performing an automated analysis. Image enhancement methods are used to remove or minimize document defects, improve software performance, and generally make images more legible. We describe an automated image enhancement method that is independent of the input page and requires no training data. The approach applies to color or greyscale images containing handwritten script, typewritten text, images, and mixtures thereof. We evaluated the image enhancement method against the test images provided by the 2011 Document Image Binarization Contest (DIBCO). Our method outperforms all 2011 DIBCO entrants in terms of average F1 measure, and does so with a significantly lower variance than the top contest entrants. The capability of the proposed method is also illustrated using select images from a collection of historic documents stored at the Yad Vashem Holocaust Memorial in Israel.
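
To make the evaluation criterion concrete, the sketch below shows one common way to compute a pixel-level F1 measure for a binarized image against a ground-truth image, and to summarize the scores over a test set by mean and variance. It is an illustration only, not the authors' code and not the official DIBCO evaluation tool. The function names `f1_measure` and `summarize`, the 0/1 list-of-lists image encoding (1 = text pixel), and the use of population variance are all assumptions made for this example.

```python
from statistics import mean, pvariance


def f1_measure(predicted, truth):
    """Pixel-level F1 for text (foreground) pixels.

    Both arguments are equal-sized 2D lists of 0/1 values, where 1 marks a
    text pixel. This encoding is an assumption made for the example.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1          # text pixel correctly kept
            elif p and not t:
                fp += 1          # background wrongly marked as text
            elif t and not p:
                fn += 1          # text pixel that was lost
    if tp == 0:
        return 0.0               # no correct text pixels: define F1 as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def summarize(scores):
    """Return (mean, population variance) of per-image F1 scores."""
    return mean(scores), pvariance(scores)
```

A method that scores well on average but fails badly on a few pages shows a high variance under `summarize`, which is why the abstract reports variance alongside the average F1 measure.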

Paper Details

Date Published: 24 March 2014
PDF: 12 pages
Proc. SPIE 9021, Document Recognition and Retrieval XXI, 90210U (24 March 2014); doi: 10.1117/12.2042581
Author Affiliations
Jon Parker, Georgetown Univ. (United States) and Johns Hopkins Univ. (United States)
Ophir Frieder, Georgetown Univ. (United States)
Gideon Frieder, Georgetown Univ. (United States)

Published in SPIE Proceedings Vol. 9021:
Document Recognition and Retrieval XXI
Bertrand Coüasnon; Eric K. Ringger, Editor(s)

© SPIE.