Proceedings Paper

Document image compression using document analysis and block-class-specific data compression methods
Author(s): Rene Sennhauser; Krystyna W. Ohnesorge

Paper Abstract

The huge amount of storage needed for document images is a major hindrance to widespread use of document image processing (DIP) systems. Although current DIP systems store document images in a compressed form, there is much room for improvement. In this paper, a nearly-lossless document image compression method is investigated which preserves the relevant information of a document. The proposed approach is based on the segmentation of a document image into different blocks that are classified into one of several block classes and compressed by a block-class-specific data compression method. Whereas image and graphics blocks are compressed using standard image compression methods, text blocks are fed into a text and font recognition module and converted into their textual representation. Finally, text blocks are compressed by encoding their textual representation and enough formatting information to be able to render them as faithfully as possible to the original document. Preliminary results show that (1) the achievable compression ratios compare favorably with standard document image compression methods for all document images tested and (2) the quality of the decompressed image depends on the recognition accuracy of the text recognition module.
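
The block-class-specific dispatch described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Block` fields, the JSON record layout, and the use of `zlib` as a stand-in for both the text encoder and the "standard image compression methods" are assumptions made for illustration only. Segmentation, block classification, and text and font recognition are assumed to have already run and filled in the fields.

```python
import json
import zlib
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Block:
    """One segmented region of a page (hypothetical structure)."""
    kind: str                         # "text", "image", or "graphics"
    bbox: Tuple[int, int, int, int]   # x, y, width, height on the page
    pixels: bytes = b""               # raw raster data of the block
    text: str = ""                    # recognized text (text blocks only)
    font: str = ""                    # recognized font name (text blocks only)
    size_pt: int = 0                  # recognized font size (text blocks only)


def compress_block(block: Block) -> Tuple[str, bytes]:
    """Pick the encoding by block class and return (class, compressed bytes)."""
    if block.kind == "text":
        # Text blocks: store the recognized characters plus enough
        # formatting information to re-render them; the raster is dropped.
        record = {
            "text": block.text,
            "font": block.font,
            "size_pt": block.size_pt,
            "bbox": list(block.bbox),
        }
        payload = json.dumps(record).encode("utf-8")
    else:
        # Image and graphics blocks: keep the raster. zlib is only a
        # placeholder here for a real image codec (assumption).
        payload = block.pixels
    return block.kind, zlib.compress(payload)


def decompress_block(kind: str, data: bytes):
    """Invert compress_block: a formatting record for text, raw bytes otherwise."""
    payload = zlib.decompress(data)
    if kind == "text":
        return json.loads(payload.decode("utf-8"))
    return payload
```

Because the raster of a text block is discarded and only the recognized text and formatting are kept, any recognition error carries through to the re-rendered page, which matches the abstract's observation that decompressed quality depends on recognition accuracy.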

Paper Details

Date Published: 1 May 1994
PDF: 10 pages
Proc. SPIE 2186, Image and Video Compression, (1 May 1994); doi: 10.1117/12.173914
Rene Sennhauser, Univ. of Zurich (Switzerland)
Krystyna W. Ohnesorge, Univ. of Zurich (Switzerland)

Published in SPIE Proceedings Vol. 2186:
Image and Video Compression
Majid Rabbani; Robert J. Safranek, Editor(s)

© SPIE