Proceedings Paper

Hierarchical logical structure extraction of book documents by analyzing tables of contents

Paper Abstract

Extracting the logical structure of book documents is important for the automatic construction of electronic document databases. A book's table of contents plays a key role in representing both the overall logical structure of the book and its reference information. This paper proposes a new method that extracts the hierarchical logical structure of book documents, together with the reference information, by combining the spatial and semantic information in the table of contents. Experimental results on a variety of book documents demonstrate the effectiveness and robustness of the proposed approach.

Paper Details

Date Published: 15 December 2003
PDF: 8 pages
Proc. SPIE 5296, Document Recognition and Retrieval XI, (15 December 2003); doi: 10.1117/12.528808
Feng He, Tsinghua Univ. (China)
Xiaoqing Ding, Tsinghua Univ. (China)
Liangrui Peng, Tsinghua Univ. (China)

Published in SPIE Proceedings Vol. 5296:
Document Recognition and Retrieval XI
Elisa H. Barney Smith; Jianying Hu; James Allan, Editor(s)
