
Proceedings Paper

Adaptive color document image binarization for text retrieval
Author(s): Yi Li; Zhiyan Wang; Haizan Zeng

Paper Abstract

This paper presents a decision-tree-based adaptive binarization method for text retrieval in color document images. The method extends Niblack's windowed thresholding technique by employing hue (H), saturation (S), and value (V). First, an observation window is extracted, and a predefined decision tree uses the standard deviations of H, S, and V within the window to select which variables to employ. Second, the Karhunen-Loève Transform (KLT) is applied to eliminate correlation and reduce dimensionality. Finally, the center point of the window is classified based on a 2-D standard normal distribution. The results show that our binarization method performs better on color document images than Niblack's method and global thresholding methods such as Otsu's. A comparison using a commercial OCR system shows that our method can be used in a variety of situations for high-quality text retrieval.
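For orientation, the baseline that the abstract says is being extended, Niblack's windowed thresholding, can be sketched as follows. This is only the classic grayscale technique (threshold T = m + k·s from the local mean m and standard deviation s), not the authors' HSV, decision-tree, and KLT method, whose details the abstract does not give. The function name `niblack_binarize`, the window size, and the value k = -0.2 are illustrative assumptions.

```python
from statistics import mean, pstdev


def niblack_binarize(gray, window=3, k=-0.2):
    """Classic Niblack thresholding on a 2-D list of grayscale values.

    Each pixel is compared with T = m + k * s, where m and s are the mean
    and standard deviation of the window centred on that pixel (the window
    is clipped at the image borders). Returns 1 for pixels darker than T
    (assumed to be text) and 0 otherwise.

    `window` and `k` are illustrative defaults, not values from the paper.
    """
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the observation window around (x, y).
            vals = [gray[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            threshold = mean(vals) + k * pstdev(vals)
            out[y][x] = 1 if gray[y][x] < threshold else 0
    return out


# Usage: a single dark pixel on a bright 5x5 background.
image = [[200] * 5 for _ in range(5)]
image[2][2] = 20
mask = niblack_binarize(image)
```

Because the threshold adapts to each window, the method handles uneven backgrounds better than a single global threshold, which is the property the paper's color extension builds on.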

Paper Details

Date Published: 15 December 2003
PDF: 10 pages
Proc. SPIE 5296, Document Recognition and Retrieval XI, (15 December 2003); doi: 10.1117/12.522033
Yi Li, South China Univ. of Technology (China)
Zhiyan Wang, South China Univ. of Technology (China)
Haizan Zeng, South China Univ. of Technology (China)

Published in SPIE Proceedings Vol. 5296:
Document Recognition and Retrieval XI
Elisa H. Barney Smith; Jianying Hu; James Allan, Editor(s)

© SPIE.