Proceedings Paper

Text line extraction in free style document
Author(s): Xiaolu Shen; Changsong Liu; Xiaoqing Ding; Yanming Zou

Paper Abstract

This paper addresses text line extraction in free-style documents, such as business cards, envelopes, and posters. In a free-style document, global properties such as character size and line direction can hardly be determined, which reveals a serious limitation of traditional layout analysis. The 'line' is the most prominent and the highest-level structure in our bottom-up method. First, we apply a novel intensity function based on gradient information to locate text areas, where the gradients within a window have large magnitudes and varied directions, and split such areas into text pieces. We build a probability model of lines consisting of text pieces via statistics on training data. For an input image, we group text pieces into lines using a simulated annealing algorithm with a cost function based on the probability model.
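
The grouping step can be illustrated with a minimal sketch. The abstract does not give the probability model or the annealing schedule, so everything specific here is an assumption made for illustration: text pieces are reduced to hypothetical (x, y) centre points, `line_cost` is a stand-in cost (vertical spread within each line plus a fixed per-line penalty) rather than the paper's trained model, and the geometric cooling schedule and the names `anneal_lines`, `line_penalty`, `t_start` and `t_end` are likewise invented.

```python
import math
import random


def line_cost(pieces, labels, line_penalty=1.0):
    """Stand-in cost (an assumption, not the paper's probability model):
    squared vertical spread inside each line plus a penalty per line."""
    groups = {}
    for (_x, y), label in zip(pieces, labels):
        groups.setdefault(label, []).append(y)
    cost = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        cost += sum((y - mean) ** 2 for y in ys) + line_penalty
    return cost


def anneal_lines(pieces, steps=5000, t_start=10.0, t_end=0.01, seed=0):
    """Assign a line label to each text piece by simulated annealing."""
    rng = random.Random(seed)
    n = len(pieces)
    labels = list(range(n))  # start with every piece in its own line
    cost = line_cost(pieces, labels)
    best_labels, best_cost = labels[:], cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(n)
        old_label = labels[i]
        labels[i] = rng.randrange(n)  # propose moving one piece to another line
        new_cost = line_cost(pieces, labels)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cost = new_cost
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        else:
            labels[i] = old_label  # reject the move
    return best_labels, best_cost


if __name__ == "__main__":
    # Two well-separated rows of hypothetical text pieces.
    pieces = [(x, 0.0) for x in range(4)] + [(x, 100.0) for x in range(4)]
    labels, cost = anneal_lines(pieces)
    print(labels, cost)
```

In this sketch the per-line penalty plays the role that a learned prior on line structure might play: without it, leaving every piece in its own line would already have zero spread and nothing would be merged.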

Paper Details

Date Published: 19 January 2009
PDF: 12 pages
Proc. SPIE 7247, Document Recognition and Retrieval XVI, 72470L (19 January 2009); doi: 10.1117/12.805695
Xiaolu Shen, Tsinghua Univ. (China)
Changsong Liu, Tsinghua Univ. (China)
Xiaoqing Ding, Tsinghua Univ. (China)
Yanming Zou, Nokia Research Ctr. (China)

Published in SPIE Proceedings Vol. 7247:
Document Recognition and Retrieval XVI
Kathrin Berkner; Laurence Likforman-Sulem, Editor(s)
