Proceedings Paper

Use of multimedia input in automated image annotation and content-based retrieval
Author(s): Rohini K. Srihari

Paper Abstract

This research explores the interaction of linguistic and photographic information in an integrated text/image database. Using linguistic descriptions of a picture (speech and text input) coordinated with pointing references to the picture, we extract information useful for two purposes: image interpretation and image retrieval. In the image interpretation phase, objects and regions mentioned in the text are identified; the annotated image is stored in a database for future use. We incorporate techniques from our previous research on photo understanding using accompanying text: a system, PICTION, which identifies human faces in a newspaper photograph based on the caption. In the image retrieval phase, images matching natural language queries are presented to the user in ranked order. This phase combines the output of (1) the image interpretation/annotation phase, (2) statistical text retrieval methods, and (3) image retrieval methods (e.g., color indexing). The system supports both point-and-click querying on a given image and intelligent querying across the entire text/image database.
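
The abstract does not say how the three retrieval signals are combined, so the following is only an illustrative sketch under stated assumptions, not the paper's method. It assumes a weighted sum of an annotation-match score, a cosine similarity over term counts (standing in for statistical text retrieval), and histogram intersection (a common color-indexing measure). All function names, dictionary fields, and weights are hypothetical.

```python
import math
from collections import Counter


def histogram_intersection(query_hist, image_hist):
    """Histogram intersection, normalized by the query histogram's mass."""
    total = sum(query_hist)
    if total == 0:
        return 0.0
    return sum(min(q, h) for q, h in zip(query_hist, image_hist)) / total


def text_score(query_terms, caption_terms):
    """Cosine similarity over raw term counts (assumed stand-in for text retrieval)."""
    q, c = Counter(query_terms), Counter(caption_terms)
    dot = sum(q[t] * c[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in c.values())
    )
    return dot / norm if norm else 0.0


def annotation_score(query_terms, labels):
    """Fraction of query terms that match an annotated object label."""
    if not query_terms:
        return 0.0
    label_set = set(labels)
    return sum(1 for t in query_terms if t in label_set) / len(query_terms)


def rank_images(query_terms, query_hist, images, weights=(0.4, 0.4, 0.2)):
    """Return (image id, score) pairs sorted by descending combined score.

    The weights are arbitrary illustrative values, not taken from the paper.
    """
    w_ann, w_txt, w_col = weights
    scored = []
    for img in images:
        score = (
            w_ann * annotation_score(query_terms, img["labels"])
            + w_txt * text_score(query_terms, img["caption"])
            + w_col * histogram_intersection(query_hist, img["hist"])
        )
        scored.append((img["id"], score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Hypothetical two-image database for illustration only.
images = [
    {"id": "a", "labels": ["mayor"], "caption": ["mayor", "speaks"], "hist": [1, 0, 0]},
    {"id": "b", "labels": [], "caption": ["dog"], "hist": [0, 1, 0]},
]
ranked = rank_images(["mayor"], [1, 0, 0], images)
```

A weighted sum is the simplest fusion choice; a real system might instead normalize each score across the database or learn the weights from relevance judgments.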

Paper Details

Date Published: 23 March 1995
PDF: 12 pages
Proc. SPIE 2420, Storage and Retrieval for Image and Video Databases III, (23 March 1995); doi: 10.1117/12.205290
Rohini K. Srihari, SUNY/Buffalo (United States)

Published in SPIE Proceedings Vol. 2420:
Storage and Retrieval for Image and Video Databases III
Wayne Niblack; Ramesh C. Jain, Editor(s)

© SPIE.