Proceedings Paper

Document structure analysis algorithms: a literature survey
Author(s): Song Mao; Azriel Rosenfeld; Tapas Kanungo

Paper Abstract

Document structure analysis can be regarded as a syntactic analysis problem. The order and containment relations among the physical or logical components of a document page can be described by an ordered tree structure and can be modeled by a tree grammar which describes the page at the component level in terms of regions or blocks. This paper provides a detailed survey of past work on document structure analysis algorithms and summarizes the limitations of past approaches. In particular, we survey past work on document physical layout representations and algorithms, document logical structure representations and algorithms, and performance evaluation of document structure analysis algorithms. In the last section, we summarize this work and point out its limitations.
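
To make the ordered-tree idea in the abstract concrete, here is a minimal sketch of how containment and order among page regions might be represented. It is an illustration only, not a method from the paper: the `Region` class, its `bbox` field, the `reading_order` helper, and the sample coordinates are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Region:
    """A node in an ordered layout tree (hypothetical, for illustration)."""
    label: str
    bbox: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in page coordinates
    children: List["Region"] = field(default_factory=list)

    def contains(self, other: "Region") -> bool:
        """Containment relation: other's box lies inside this box."""
        x0, y0, x1, y1 = self.bbox
        ox0, oy0, ox1, oy1 = other.bbox
        return x0 <= ox0 and y0 <= oy0 and ox1 <= x1 and oy1 <= y1

    def add(self, child: "Region") -> "Region":
        """Append a child; the list position encodes the order relation."""
        if not self.contains(child):
            raise ValueError(f"{child.label} is not inside {self.label}")
        self.children.append(child)
        return child


def reading_order(node: Region) -> List[str]:
    """Pre-order traversal: a parent first, then its children left to right."""
    labels = [node.label]
    for child in node.children:
        labels.extend(reading_order(child))
    return labels


# A made-up two-column page: the tree captures both containment and order.
page = Region("page", (0, 0, 600, 800))
title = page.add(Region("title", (50, 40, 550, 90)))
body = page.add(Region("body", (50, 110, 550, 760)))
body.add(Region("column-1", (50, 110, 290, 760)))
body.add(Region("column-2", (310, 110, 550, 760)))

print(reading_order(page))
```

In this sketch the parent-child links carry the containment relation and the position within each `children` list carries the order relation. A tree grammar, as mentioned in the abstract, would additionally constrain which labels may appear under which parents; that part is not modeled here.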

Paper Details

Date Published: 13 January 2003
PDF: 11 pages
Proc. SPIE 5010, Document Recognition and Retrieval X, (13 January 2003); doi: 10.1117/12.476326
Song Mao, Univ. of Maryland/College Park (United States)
Azriel Rosenfeld, Univ. of Maryland/College Park (United States)
Tapas Kanungo, IBM Almaden Research Ctr. (United States)

Published in SPIE Proceedings Vol. 5010:
Document Recognition and Retrieval X
Tapas Kanungo; Elisa H. Barney Smith; Jianying Hu; Paul B. Kantor, Editor(s)

© SPIE.