Share Email Print

Proceedings Paper

Document image binarization using ”multi-scale” predefined filters
Author(s): Raid M. Saabni
Format Member Price Non-Member Price
PDF $17.00 $21.00
cover GOOD NEWS! Your organization subscribes to the SPIE Digital Library. You may be able to download this paper for free. Check Access

Paper Abstract

Reading text or searching for key words within a historical document is a very challenging task. one of the first steps of the complete task is binarization, where we separate foreground such as text, figures and drawings from the background. Successful results of this important step in many cases can determine next steps to success or failure, therefore it is very vital to the success of the complete task of reading and analyzing the content of a document image. Generally, historical documents images are of poor quality due to their storage condition and degradation over time, which mostly cause to varying contrasts, stains, dirt and seeping ink from reverse side. In this paper, we use banks of anisotropic predefined filters in different scales and orientations to develop a binarization method for degraded documents and manuscripts. Using the fact, that handwritten strokes may follow different scales and orientations, we use predefined sets of filter banks having various scales, weights, and orientations to seek a compact set of filters and weights in order to generate different layers of foregrounds and background. Results of convolving these filters on the gray level image locally, weighted and accumulated to enhance the original image. Based on the different layers, seeds of components in the gray level image and a learning process, we present an improved binarization algorithm to separate the background from layers of foreground. Different layers of foreground which may be caused by seeping ink, degradation or other factors are also separated from the real foreground in a second phase. Promising experimental results were obtained on the DIBCO2011 , DIBCO2013 and H-DIBCO2016 data sets and a collection of images taken from real historical documents.

Paper Details

Date Published: 10 April 2018
PDF: 10 pages
Proc. SPIE 10615, Ninth International Conference on Graphic and Image Processing (ICGIP 2017), 106151L (10 April 2018); doi: 10.1117/12.2303604
Show Author Affiliations
Raid M. Saabni, Tel-Aviv Yaffo Academic College (Israel)
Triangle Research and Development Ctr. (Israel)

Published in SPIE Proceedings Vol. 10615:
Ninth International Conference on Graphic and Image Processing (ICGIP 2017)
Hui Yu; Junyu Dong, Editor(s)

© SPIE. Terms of Use
Back to Top