
Proceedings Paper

Page segmentation and text extraction from gray-scale images in microfilm format
Author(s): Qing Yuan; Chew Lim Tan

Paper Abstract

This paper describes a system that separates textual regions from graphics regions and locates text against textured backgrounds. We present an edge-detection-based method for automatically locating text in noisy grayscale newspaper images in microfilm format. The algorithm first finds the edges of textual regions with the Canny edge detector. It then merges edges and uses edge features to segment and classify blocks. Finally, feature-aided connected component analysis groups homogeneous textual regions within the scope of their bounding boxes. Introducing the TLC yields an efficient block segmentation with reduced memory size. The method has been used to locate text in a group of newspaper images with multiple page layouts. Initial results are encouraging. We plan to expand the experimental data to over 300 microfilm images with different layout structures, and we anticipate promising results once the prototype algorithm is modified to make it more robust and suitable for different cases.
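The grouping stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the edge features, the merging rule, or the TLC, so the code below assumes a binary edge map is already available (for example, from a Canny detector) and uses a generic 8-connected component search followed by a simple horizontal-gap merge. The function names `component_boxes` and `merge_boxes` and the parameter `max_gap` are hypothetical.

```python
from collections import deque


def component_boxes(edge_map):
    """Return bounding boxes (top, left, bottom, right) of the
    8-connected components of nonzero pixels in a binary edge map."""
    rows = len(edge_map)
    cols = len(edge_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not edge_map[r][c] or seen[r][c]:
                continue
            # Breadth-first search over one component, tracking its extent.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and edge_map[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def merge_boxes(boxes, max_gap):
    """Merge boxes that share at least one row and whose horizontal gap
    is at most max_gap pixels (an assumed, simplified merging rule)."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                rows_overlap = a[0] <= b[2] and b[0] <= a[2]
                # Number of empty columns between the boxes (negative if they overlap).
                gap = max(a[1], b[1]) - min(a[3], b[3]) - 1
                if rows_overlap and gap <= max_gap:
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes


# Toy edge map: two nearby "characters" and one distant mark.
edge_map = [
    [1, 1, 0, 1, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 0, 0, 0, 0, 1],
]
char_boxes = component_boxes(edge_map)
text_blocks = merge_boxes(char_boxes, max_gap=1)
```

In this toy example the two adjacent components merge into a single block while the distant mark stays separate. A real system would also need the classification step that distinguishes text blocks from graphics, which this sketch omits.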

Paper Details

Date Published: 21 December 2000
PDF: 10 pages
Proc. SPIE 4307, Document Recognition and Retrieval VIII (21 December 2000); doi: 10.1117/12.410852
Qing Yuan, National Univ. of Singapore (Singapore)
Chew Lim Tan, National Univ. of Singapore (Singapore)


Published in SPIE Proceedings Vol. 4307:
Document Recognition and Retrieval VIII
Paul B. Kantor; Daniel P. Lopresti; Jiangying Zhou, Editor(s)

© SPIE