Optical Engineering

Page segmentation for document image analysis using a neural network
Author(s): Devesh Patel

Paper Abstract

In this paper we present a method for segmenting document page images into text and nontext regions. The underlying assumption of this approach is that the two regions can be viewed as different textures. We do not use any a priori knowledge of the document format. A convolution-based method is used to generate the texture feature images. The coefficients of the convolution masks are obtained using a single-layer artificial neural network that generates eigenvectors of the correlation matrix of the input data. The coefficients of these masks have been "learned" from examples of the document images and have the potential to be considerably more powerful than masks with preset coefficients. A thresholding scheme based on a measure of entropy is used to segment the feature images into homogeneous regions.
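
The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not name the learning rule or the entropy measure, so the choices below are assumptions: Sanger's generalized Hebbian algorithm stands in for the single-layer network (its weight rows converge toward the leading eigenvectors of the input correlation matrix), and a Kapur-style maximum-entropy threshold stands in for the entropy-based scheme. All function names, the mask size, and the learning rate are hypothetical and are not taken from the paper.

```python
import numpy as np


def sample_patches(image, k, count, rng):
    """Draw `count` random k-by-k patches and flatten each into a row vector."""
    h, w = image.shape
    rows = rng.integers(0, h - k + 1, count)
    cols = rng.integers(0, w - k + 1, count)
    return np.stack([image[r:r + k, c:c + k].ravel() for r, c in zip(rows, cols)])


def learn_masks(patches, n_masks, epochs=10, lr=0.01, seed=0):
    """Single-layer linear network trained with Sanger's rule (an assumption).

    Each row of the returned matrix approximates one leading eigenvector of
    the (uncentered) correlation matrix of the patches.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(n_masks, patches.shape[1]))
    for _ in range(epochs):
        for x in patches[rng.permutation(len(patches))]:
            y = W @ x
            # Hebbian term minus a lower-triangular decorrelation term.
            W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W


def texture_features(image, masks):
    """Apply each learned mask over the image (valid region, no kernel flip)."""
    n, dim = masks.shape
    k = int(round(np.sqrt(dim)))
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    return np.einsum("ijkl,mkl->mij", windows, masks.reshape(n, k, k))


def entropy_threshold(values, bins=64):
    """Kapur-style threshold (an assumption): maximize the summed entropies
    of the two classes produced by splitting the histogram."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    best_t, best_score = 1, -np.inf
    for t in range(1, bins):
        p0, p1 = p[:t].sum(), p[t:].sum()
        if p0 == 0 or p1 == 0:
            continue
        q0 = p[:t][p[:t] > 0] / p0
        q1 = p[t:][p[t:] > 0] / p1
        score = -(q0 * np.log(q0)).sum() - (q1 * np.log(q1)).sum()
        if score > best_score:
            best_t, best_score = t, score
    return edges[best_t]
```

A possible use would be to call `sample_patches` on training page images, pass the result to `learn_masks`, compute `texture_features` on a new page, and then apply `entropy_threshold` to a feature image to separate the two texture classes. Any smoothing or energy computation between the filtering and thresholding steps is left out here because the abstract does not describe it.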

Paper Details

Date Published: 1 July 1996
PDF: 8 pages
Opt. Eng. 35(7) doi: 10.1117/1.600618
Published in: Optical Engineering Volume 35, Issue 7
Devesh Patel, Univ. of London (United Kingdom)
