
Proceedings Paper

Use of multimedia input in automated image annotation and content-based retrieval
Author(s): Rohini K. Srihari

Paper Abstract

This research explores the interaction of linguistic and photographic information in an integrated text/image database. Using linguistic descriptions of a picture (speech and text input) coordinated with pointing references to the picture, we extract information that serves two purposes: image interpretation and image retrieval. In the image interpretation phase, objects and regions mentioned in the text are identified; the annotated image is stored in a database for future use. We incorporate techniques from our previous research on photo understanding using accompanying text: PICTION, a system that identifies human faces in a newspaper photograph based on the caption. In the image retrieval phase, images matching natural language queries are presented to the user in ranked order. This phase combines the output of (1) the image interpretation/annotation phase, (2) statistical text retrieval methods, and (3) image retrieval methods (e.g., color indexing). The system allows both point-and-click querying on a given image and intelligent querying across the entire text/image database.
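
The abstract does not say how the three retrieval outputs are combined, so the following is only an illustrative sketch under stated assumptions, not the paper's method. It assumes each image already has three scores in [0, 1] (annotation match, text retrieval, color similarity), fuses them with a weighted sum, and computes the color score with a normalized histogram intersection, one common form of color indexing. The function names, the weights, and the score layout are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def histogram_intersection(query_hist: Sequence[float],
                           image_hist: Sequence[float]) -> float:
    """Color similarity as a normalized histogram intersection.

    Assumption: both histograms use the same bins; the result is
    normalized by the total count of the stored image's histogram.
    """
    total = sum(image_hist)
    if total == 0:
        return 0.0
    overlap = sum(min(q, m) for q, m in zip(query_hist, image_hist))
    return overlap / total


def rank_images(scores: Dict[str, Tuple[float, float, float]],
                weights: Tuple[float, float, float] = (0.5, 0.3, 0.2)) -> List[str]:
    """Rank image ids by a weighted sum of three per-image scores.

    Each tuple holds (annotation_score, text_score, color_score).
    The weights are hypothetical, chosen only for illustration.
    """
    combined = {
        image_id: sum(w * s for w, s in zip(weights, triple))
        for image_id, triple in scores.items()
    }
    # Highest combined score first.
    return sorted(combined, key=combined.get, reverse=True)


if __name__ == "__main__":
    color = histogram_intersection([2, 0, 2], [1, 1, 2])
    ranking = rank_images({
        "a": (0.9, 0.1, 0.2),
        "b": (0.2, 0.8, color),
        "c": (0.1, 0.1, 0.1),
    })
    print(color, ranking)
```

A weighted sum is used here only because it is the simplest fusion rule; a rank-based combination would fit the same interface.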

Paper Details

Date Published: 23 March 1995
PDF: 12 pages
Proc. SPIE 2420, Storage and Retrieval for Image and Video Databases III, (23 March 1995); doi: 10.1117/12.205290
Rohini K. Srihari, SUNY/Buffalo (United States)

Published in SPIE Proceedings Vol. 2420:
Storage and Retrieval for Image and Video Databases III
Wayne Niblack; Ramesh C. Jain, Editors

© SPIE