Proceedings Paper

Page layout analysis and classification for complex scanned documents
Author(s): M. Sezer Erkilinc; Mustafa Jaber; Eli Saber; Peter Bauer; Dejan Depalov

Paper Abstract

This paper proposes a framework for region/zone classification in color and gray-scale scanned documents. The algorithm comprises three modules, which extract text, photo, and strong edge/line regions. The first module detects text using wavelet analysis and run-length encoding (RLE). Local and global energy maps are generated in the high-frequency bands of the wavelet domain and used as initial text maps; further analysis using RLE yields the final text map. The second module detects image/photo and pictorial regions in the input document. A block-based classifier using basis vector projections identifies candidate photo regions, and a final photo map is then obtained by applying a probabilistic model: maximum a posteriori (MAP) estimation over a Markov random field (MRF), optimized with iterated conditional modes (ICM). The final module detects lines using the Hough transform and strong edges using edge-linkage analysis. The text, photo, and strong edge/line maps are combined to generate a page layout classification of the scanned target document. Experimental results and objective evaluation show that the proposed technique performs effectively on a variety of simple and complex scanned document types obtained from the MediaTeam Oulu document database. The proposed page layout classifier can be used in systems for efficient document storage, content-based document retrieval, optical character recognition, mobile phone imagery, and augmented reality.
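
To illustrate the MRF-based MAP refinement step mentioned in the abstract, the following is a minimal sketch of ICM on a binary label grid. It is not the authors' implementation: the abstract does not specify the energy model, so the Potts smoothness prior, the 4-neighbourhood, the log-likelihood data term, and the names `icm_binary`, `prob`, and `beta` are all assumptions made for illustration.

```python
import math


def icm_binary(prob, beta=1.0, max_iters=10):
    """Refine a block-level photo/non-photo map with ICM (illustrative only).

    prob: 2-D list of floats in [0, 1]; hypothetical per-block scores from
          some candidate classifier (assumed, not taken from the paper).
    beta: weight of the assumed Potts smoothness prior.
    Returns a 2-D list of 0/1 labels (1 = photo).
    """
    rows, cols = len(prob), len(prob[0])
    eps = 1e-6
    # Start from the per-block maximum-likelihood label.
    labels = [[1 if p >= 0.5 else 0 for p in row] for row in prob]

    def local_energy(r, c, label):
        # Data term: negative log-likelihood of the block under this label.
        p = min(max(prob[r][c], eps), 1.0 - eps)
        data = -math.log(p if label == 1 else 1.0 - p)
        # Potts prior: pay beta for every 4-neighbour that disagrees.
        disagree = 0
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] != label:
                disagree += 1
        return data + beta * disagree

    for _ in range(max_iters):
        changed = False
        for r in range(rows):
            for c in range(cols):
                current = labels[r][c]
                other = 1 - current
                # Flip only when the other label has strictly lower energy.
                if local_energy(r, c, other) < local_energy(r, c, current):
                    labels[r][c] = other
                    changed = True
        if not changed:
            break  # Converged to a local minimum of the energy.
    return labels
```

With `beta = 0` the result is just the thresholded input. Larger values of `beta` increasingly favour spatially coherent regions, which is the general motivation for applying an MRF prior to noisy block-level candidates. ICM is greedy and only reaches a local minimum, so the outcome depends on the initial labelling.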

Paper Details

Date Published: 22 September 2011
PDF: 12 pages
Proc. SPIE 8135, Applications of Digital Image Processing XXXIV, 813507 (22 September 2011); doi: 10.1117/12.893909
Author Affiliations
M. Sezer Erkilinc, Rochester Institute of Technology (United States)
Mustafa Jaber, Rochester Institute of Technology (United States)
Eli Saber, Rochester Institute of Technology (United States)
Peter Bauer, Hewlett-Packard Co. (United States)
Dejan Depalov, Hewlett-Packard Co. (United States)


Published in SPIE Proceedings Vol. 8135:
Applications of Digital Image Processing XXXIV
Andrew G. Tescher, Editor
