
Proceedings Paper

Parsed and fixed block representations of visual information for image retrieval
Author(s): Soo Hyun Bae; Biing-Hwang Juang

Paper Abstract

The theory of linguistics teaches us the existence of a hierarchical structure in linguistic expressions, from letters to word roots, and on to words and sentences. By applying syntax and semantics beyond words, one can further recognize the grammatical relationships among words and the meaning of a sequence of words. This layered view of a spoken language is useful for effective analysis and automated processing. Thus, it is interesting to ask whether a similar hierarchy of representation exists for visual information. A class of techniques similar in nature to linguistic parsing is found in the Lempel-Ziv incremental parsing scheme. Based on a new class of multidimensional incremental parsing algorithms extended from Lempel-Ziv incremental parsing, a new framework for image retrieval, which takes advantage of the source characterization property of the incremental parsing algorithm, was proposed recently. With the incremental parsing technique, a given image is decomposed into a number of patches, collectively called a parsed representation. This representation can be thought of as a morphological interface between elementary pixels and a higher-level representation. In this work, we examine the properties of the two-dimensional parsed representation in the context of imagery information retrieval and in contrast to vector quantization, i.e., fixed square-block representations and minimum average distortion criteria. We implemented four image retrieval systems for the comparative study: three, called IPSILON image retrieval systems, use the parsed representation with different perceptual distortion thresholds, and one uses conventional vector quantization for visual pattern analysis. We observe that varying the perceptual distortion threshold in visual pattern matching has no serious effect on retrieval precision, although allowing looser perceptual thresholds in image compression results in poor reconstruction fidelity. We compare the effectiveness of the parsed representations, as constructed under the latent semantic analysis (LSA) paradigm, so as to investigate their varying capabilities in capturing semantic concepts. The results clearly demonstrate the superiority of the parsed representation.
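
For readers unfamiliar with the Lempel-Ziv incremental parsing mentioned in the abstract, the following is a minimal sketch of the classical one-dimensional scheme (often called LZ78). It is not the authors' multidimensional algorithm, and it uses exact matching rather than a perceptual distortion threshold; the function name `incremental_parse` is a hypothetical name chosen for illustration. Each new phrase is the shortest prefix of the remaining input that is not yet in the dictionary.

```python
def incremental_parse(sequence):
    """Classical 1-D Lempel-Ziv incremental parsing (illustrative sketch).

    Splits `sequence` into phrases, where each phrase is the shortest
    prefix of the remaining input not already seen as a phrase.
    This is NOT the multidimensional algorithm from the paper.
    """
    dictionary = set()   # phrases seen so far, stored as tuples of symbols
    phrases = []         # parsed output, in order of appearance
    current = ()         # phrase being grown symbol by symbol

    for symbol in sequence:
        candidate = current + (symbol,)
        if candidate in dictionary:
            # Still matches a known phrase: keep extending it.
            current = candidate
        else:
            # First unseen extension: record it as a new phrase and restart.
            dictionary.add(candidate)
            phrases.append(candidate)
            current = ()

    if current:
        # Leftover input that matches an existing phrase exactly.
        phrases.append(current)
    return phrases


if __name__ == "__main__":
    for phrase in incremental_parse("aababcabcd"):
        print("".join(phrase))
```

In this one-dimensional setting the phrases play the role that the patches play in the parsed representation of an image: variable-sized units learned from the source itself, in contrast to the fixed-size blocks of vector quantization.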

Paper Details

Date Published: 10 February 2009
PDF: 12 pages
Proc. SPIE 7240, Human Vision and Electronic Imaging XIV, 724017 (10 February 2009); doi: 10.1117/12.811572
Soo Hyun Bae, Georgia Institute of Technology (United States)
Biing-Hwang Juang, Georgia Institute of Technology (United States)

Published in SPIE Proceedings Vol. 7240:
Human Vision and Electronic Imaging XIV
Bernice E. Rogowitz; Thrasyvoulos N. Pappas, Editor(s)
