Share Email Print
cover

Proceedings Paper

Word extraction using irregular pyramid
Author(s): PohKok Loo; Chew Lim Tan
Format Member Price Non-Member Price
PDF $14.40 $18.00
cover GOOD NEWS! Your organization subscribes to the SPIE Digital Library. You may be able to download this paper for free. Check Access

Paper Abstract

This paper proposed a new algorithm to perform text extraction from imaged documents. The paper focused in the extraction of word group. Irregular pyramid structure is used as the basis of the algorithm. The uniqueness of this algorithm is its inclusion of strategic background information in the analysis where most techniques have discarded. Both foreground (i.e. text area) and portion of background (i.e. white area) regions are examined. The fundamental of the algorithm is based on the concept of 'closeness' where text information within a group is closed to each other, in terms of spatial distance, as compared to other text area. The result produced by the algorithm is encouraging with the ability to correctly group words of different size, font, arrangement and orientation.

Paper Details

Date Published: 21 December 2000
PDF: 9 pages
Proc. SPIE 4307, Document Recognition and Retrieval VIII, (21 December 2000); doi: 10.1117/12.410857
Show Author Affiliations
PohKok Loo, Singapore Polytechnic (Singapore)
Chew Lim Tan, National Univ. of Singapore (Singapore)


Published in SPIE Proceedings Vol. 4307:
Document Recognition and Retrieval VIII
Paul B. Kantor; Daniel P. Lopresti; Jiangying Zhou, Editor(s)

© SPIE. Terms of Use
Back to Top