
Journal of Electronic Imaging

Binary document image compression using a three-symbol grouped code dictionary
Author(s): Hermilo Sanchez-Cruz; Mario A. Rodríguez-Díaz

Paper Abstract

A novel method of lossy compression for images of text documents is proposed. The method is based on classifying the objects, characters, and pictures that appear in the images. We used the Tanimoto distance to group the objects into classes and so build an object dictionary; we then encoded the instances of each class with a three-symbol code called the three orthogonal symbol chain code (3OT). We applied an entropy coder, which groups the 3OT symbols, to the resulting chain; finally, we compressed the output chain with the Paq8l archiver, which is based on a context-mixing algorithm divided into a predictor and an arithmetic coder. The method achieves high storage efficiency: on average, its compression levels are 2.17 times better than those of the international standard Joint Bi-level Image Experts Group 2 (JBIG2) in its lossy mode.
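
The dictionary-building step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common set form of the Tanimoto distance, 1 - |A ∩ B| / |A ∪ B|, represents each object as a set of foreground pixel coordinates that are already aligned to a common origin, and uses a simple greedy grouping rule. The function names and the `threshold` value are hypothetical and chosen only for illustration.

```python
def tanimoto_distance(a, b):
    """Distance between two binary objects given as sets of (row, col)
    foreground pixels. Assumed form: 1 - |A & B| / |A | B|."""
    union = len(a | b)
    if union == 0:
        # Two empty objects are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / union


def build_dictionary(objects, threshold=0.2):
    """Greedy grouping (an assumption, not the paper's procedure):
    an object joins the first class whose representative lies within
    `threshold`; otherwise it becomes the representative of a new class.
    Returns (representatives, labels)."""
    representatives = []
    labels = []
    for obj in objects:
        for idx, rep in enumerate(representatives):
            if tanimoto_distance(obj, rep) <= threshold:
                labels.append(idx)
                break
        else:
            # No existing class is close enough: start a new one.
            representatives.append(obj)
            labels.append(len(representatives) - 1)
    return representatives, labels


if __name__ == "__main__":
    square = {(0, 0), (0, 1), (1, 0), (1, 1)}
    chipped = {(0, 0), (0, 1), (1, 0)}   # same shape, one pixel missing
    dot = {(5, 5)}
    reps, labels = build_dictionary([square, chipped, dot], threshold=0.3)
    print(len(reps), labels)
```

With a lossy scheme of this kind, only one representative per class needs to be stored, and every instance is recorded as a reference to its class, which is where the loss (and the saving) comes from.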

Paper Details

Date Published: 21 May 2012
PDF: 13 pages
J. Electron. Imag. 21(2) 023013 doi: 10.1117/1.JEI.21.2.023013
Published in: Journal of Electronic Imaging Volume 21, Issue 2
Hermilo Sanchez-Cruz, Univ. Autonoma de Aguascalientes (Mexico)
Mario A. Rodríguez-Díaz, Univ. Autonoma de Aguascalientes (Mexico)

© SPIE.