- Content-Based Imagery Database Retrieval
- Content/Context-Based Imagery Database Retrieval
- Extraction of Information from Images
- Image Database Storage and Transmission
- Special Session: Image Information Processing On High-Performance Computers
Content-Based Imagery Database Retrieval
Image retrieval using image context vectors
Steve Gallant,
David M. Fram
Searching image databases using image queries is a challenging problem. For the analogous problem with text, document retrieval methods that use 'superficial' information, such as word count statistics, generally outperform natural language understanding approaches. This motivates an exploration of 'superficial' feature-based methods for image retrieval. The main strategy is to avoid full image understanding, or even segmentation. The key question for any image retrieval approach is how to represent the images. We are exploring a new image context vector representation. A context vector is a high-dimensional (approximately 300-dimensional) vector that can represent images, sub-images, or image queries. The image is first represented as a collection of feature pairs, each with a relative orientation defined by the pair. Each feature pair is transformed into a context vector, and all of the pair vectors are then added together to form the 300-dimensional context vector for the entire image. This paper examines the image context vector approach and its expected strengths and weaknesses.
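The abstract outlines the core computation (encode feature pairs, sum their vectors, compare by similarity) but not its details, so the following Python sketch is only illustrative: the random unit-vector encoding of a (feature, feature, orientation) pair and the cosine-similarity ranking are assumptions, not the authors' actual method.

```python
import numpy as np

DIM = 300  # approximate dimensionality cited in the abstract

# Hypothetical encoder: each (feature, feature, orientation-bin) pair is mapped
# to a fixed pseudo-random unit vector.  The paper does not specify the actual
# encoding, so this is only a stand-in for it.
def pair_vector(feature_a, feature_b, orientation_bin):
    seed = hash((feature_a, feature_b, orientation_bin)) % (2**32)
    v = np.random.default_rng(seed).standard_normal(DIM)
    return v / np.linalg.norm(v)

def image_context_vector(feature_pairs):
    """Sum the vectors of all feature pairs to get one vector for the image."""
    total = np.zeros(DIM)
    for a, b, ori in feature_pairs:
        total += pair_vector(a, b, ori)
    norm = np.linalg.norm(total)
    return total / norm if norm > 0 else total

def rank_by_similarity(query_pairs, database):
    """Rank stored images by cosine similarity to the query's context vector."""
    q = image_context_vector(query_pairs)
    scored = [(name, float(np.dot(q, vec))) for name, vec in database.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)

# Toy example: two "images" described by (feature, feature, orientation-bin) pairs.
db = {
    "img1": image_context_vector([("edge", "corner", 0), ("corner", "blob", 2)]),
    "img2": image_context_vector([("blob", "blob", 1)]),
}
print(rank_by_similarity([("edge", "corner", 0)], db))
```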
Image retrieval by content: a machine learning approach
Usama M. Fayyad,
Padhraic Smyth
In areas as diverse as Earth remote sensing, astronomy, and medical imaging, there has been an explosive growth in the amount of image data available for creating digital image libraries. However, the lack of automated analysis and useful retrieval methods stands in the way of creating true digital image libraries. In order to perform query-by-content searches, the query formulation problem needs to be addressed: it is often not possible for users to formulate the targets of their searches in terms of queries. We present a natural and powerful approach to this problem to assist scientists in exploring large digital image libraries. We target a system that the user trains to find certain patterns by providing it with examples. The learning algorithms use the training data to produce classifiers that detect and identify other targets in the large image collection. This forms the basis for query-by-content capabilities and for library indexing purposes. We ground the discussion by presenting two such applications at JPL: the SKICAT system, used for the reduction and analysis of a 3-terabyte astronomical data set, and the JARtool system, to be used in automatically analyzing the Magellan data set consisting of over 30,000 images of the surface of Venus. General issues that impact the application of learning algorithms to image analysis are discussed.
Indexing multispectral images for content-based retrieval
Julio E. Barros,
James C. French,
Worthy N. Martin,
et al.
This paper discusses our view of image databases, content-based retrieval, and our experiences with an experimental system. We present a methodology in which efficient representation and indexing are the basis for retrieval of images by content as well as by associated external information. In the example system, images are indexed and accessed based on properties of the individual regions in the image. Regions in each image are indexed by their spectral characteristics, as well as by their shape descriptors and position information. The goal of the system is to reduce the number of images that must be inspected by a user by quickly excluding substantial parts of the database; the system thus avoids exhaustive searching through the image database when a query is submitted.
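As a rough illustration of the region-index filtering idea described above, the sketch below indexes regions by a spectral mean, an area descriptor, and a centroid, and answers a query by returning only images that contain at least one compatible region; the attribute names and tolerance test are assumptions, not the system's actual descriptors.

```python
from dataclasses import dataclass

@dataclass
class Region:
    image_id: str
    mean_spectrum: tuple   # e.g. per-band mean reflectance
    area: float            # a simple shape/size descriptor
    centroid: tuple        # position information, stored but unused in this toy match

def matches(region, query, spectral_tol=0.1, area_tol=0.5):
    spec_ok = all(abs(r - q) <= spectral_tol
                  for r, q in zip(region.mean_spectrum, query.mean_spectrum))
    area_ok = abs(region.area - query.area) <= area_tol * query.area
    return spec_ok and area_ok

def candidate_images(index, query):
    """Return only image ids containing at least one matching region, so the
    bulk of the database is excluded without inspecting any pixels."""
    return {r.image_id for r in index if matches(r, query)}

index = [
    Region("scene_a", (0.30, 0.55), 120.0, (40, 60)),
    Region("scene_b", (0.80, 0.10), 300.0, (10, 10)),
]
probe = Region("query", (0.32, 0.52), 110.0, (0, 0))
print(candidate_images(index, probe))   # -> {'scene_a'}
```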
Photobook: tools for content-based manipulation of image databases
We describe the Photobook system, which is a set of interactive tools for browsing and searching images and image sequences. These tools differ from those used in standard image databases in that they make direct use of the image content rather than relying on annotations. Direct search on image content is made possible by use of semantics-preserving image compression, which reduces images to a small set of perceptually significant coefficients. We describe three Photobook tools in particular: one that allows search based on gray-level appearance, one that uses 2-D shape, and a third that allows search based on textural properties.
S-MODALS neural network query of medical and forensic imagery databases
Timothy G. Rainey,
Dean W. Brettle,
Andrew Lavin,
et al.
A dual-use neural network technology, called the statistical-multiple object detection and location system (S-MODALS), has been developed by Booz·Allen & Hamilton, Inc. over a five-year period, funded by various U.S. Air Force organizations for automatic target recognition (ATR). S-MODALS performs multi-sensor fusion (visible (EO), IR, ASARS) and multi-look evidence accrual for tactical and strategic reconnaissance. This paper presents the promising findings of applying S-MODALS to the medical field of lung cancer and the S-MODALS investigation into intelligent database query of the FBI's ballistic forensic imagery. Since S-MODALS is a learning system, it is readily adaptable to object recognition problems other than ATR, as evidenced by this joint government-academia-industry investigation into S-MODALS automated lung nodule detection and characterization in CT imagery. This paper also presents the full results of an FBI test of the S-MODALS neural network's ability to perform an intelligent query of the FBI's ballistic forensic imagery.
Content/Context-Based Imagery Database Retrieval
Experience with CANDID: comparison algorithm for navigating digital image databases
Patrick M. Kelly,
T. Michael Cannon
This paper presents results from our experience with CANDID (comparison algorithm for navigating digital image databases), which was designed to facilitate image retrieval by content using a query-by-example methodology. A global signature describing the texture, shape, or color content is first computed for every image stored in a database, and a normalized similarity measure between probability density functions of feature vectors is used to match signatures. This method can be used to retrieve images from a database that are similar to a user-provided example image. Results for three test applications are included.
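To make the signature-matching step above concrete, here is a minimal Python sketch in which a signature is a normalized histogram of per-pixel feature vectors and the match score is a normalized inner product; CANDID's actual density model and similarity measure are richer, so the bin layout and score here are illustrative assumptions.

```python
import numpy as np

def signature(feature_vectors, bins=8, value_range=(0.0, 1.0)):
    """Discretized stand-in for a probability density of feature vectors."""
    hist, _ = np.histogramdd(feature_vectors, bins=bins,
                             range=[value_range] * feature_vectors.shape[1])
    hist = hist.ravel().astype(float)
    return hist / hist.sum()

def similarity(sig_a, sig_b):
    """Normalized inner product between two signatures."""
    return float(np.dot(sig_a, sig_b) /
                 (np.linalg.norm(sig_a) * np.linalg.norm(sig_b)))

rng = np.random.default_rng(1)
example = signature(rng.random((500, 2)))                              # query image
database = {f"img{i}": signature(rng.random((500, 2))) for i in range(3)}
ranked = sorted(database, key=lambda k: similarity(example, database[k]),
                reverse=True)
print(ranked)   # images ordered by similarity to the example
```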
Hybrid knowledge bases for integrating symbolic, numeric, and image data
V. S. Subrahmanian
A hybrid knowledge base (HKB), due to Nerode and Subrahmanian, is a formalism that provides a uniform theoretical framework within which heterogeneous data representation paradigms may be integrated. The HKB framework is broad enough to support the integration of a wide array of databases including, but not restricted to: relational data (with multiple schemas), spatial data structures (including different kinds of quadtrees), pictorial data (including GIF files), numeric data and computations (e.g., linear and integer programming), and terrain data. In this paper, we focus on how the HKB paradigm can be used as a unifying framework to reason about terrain data in the context of background data that may be contained in relational and spatial data structures. We show how the current implementation of the HKB compiler can support such an integration scheme.
Predictor of requested imagery and migration engine (PRIME)
Keith Shaffer,
Tony Baraghimian
Although emerging mass storage devices, including robotic tape libraries, optical jukeboxes, and redundant arrays of inexpensive disks (RAID), enable large softcopy image archives, timely dissemination of any image from the archive is not possible with popular hierarchical storage techniques. Significant delays occur for images located on slower storage devices such as robotic tape libraries. We developed a prototype, the predictor of requested imagery and migration engine (PRIME), to provide more timely dissemination. PRIME reduces analysts' wait time by predicting image requests and migrating the most likely of these images from slower to faster archive devices before analysts make the requests. PRIME uses a fuzzy logic expert system both to make the predictions and to perform the migration. We describe the PRIME environment, the prediction and migration issues, and the prototype itself.
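A toy sketch of the predict-then-migrate loop described above follows; the fuzzy-style membership functions (recency and topical relevance combined with a min), the scoring rule, and the catalog fields are invented for illustration and are not PRIME's actual rule base.

```python
def recency_score(hours_since_related_request):
    # Membership in "recently of interest": 1 now, fading to 0 over two days.
    return max(0.0, 1.0 - hours_since_related_request / 48.0)

def relevance_score(shared_keywords, total_keywords):
    # Membership in "topically related to current requests".
    return shared_keywords / total_keywords if total_keywords else 0.0

def migration_candidates(catalog, budget=2):
    """Score images on slow storage and return the ones worth staging to fast storage."""
    scored = []
    for image in catalog:
        if image["tier"] != "slow":
            continue
        score = min(recency_score(image["hours_since_related"]),   # fuzzy AND
                    relevance_score(image["shared_kw"], image["total_kw"]))
        scored.append((score, image["id"]))
    return [img for _, img in sorted(scored, reverse=True)[:budget]]

catalog = [
    {"id": "img-001", "tier": "slow", "hours_since_related": 4,  "shared_kw": 3, "total_kw": 4},
    {"id": "img-002", "tier": "slow", "hours_since_related": 40, "shared_kw": 1, "total_kw": 4},
    {"id": "img-003", "tier": "fast", "hours_since_related": 1,  "shared_kw": 4, "total_kw": 4},
]
print(migration_candidates(catalog))   # images to move before they are requested
```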
Assessment of scientific image quality using wavelet packet browse images
Kathleen G. Perez-Lopez,
Arun K. Sood
An approach to assessing the quality of large scientific images prior to full retrieval is presented. In one mode, the assessment depends upon a visual perusal of wavelet packet subbands, especially those of low frequency content. In the other mode, appropriate for very large numbers of images, the approach involves an automatic algebraic screening of images for a site of interest. The effectiveness of the screening depends upon the extraction of identifying features from subregions of images of the site. An indexing method is presented which shows promise in providing these identifying features.
Extraction of Information from Images
Fast correlation matching in large (edge) image databases
Dariu M. Gavrila,
Larry S. Davis
Correlation-based matching methods are known to be very expensive when used on large image databases. In this paper, we examine ways of speeding up correlation matching by phase-coded filtering. Phase-coded filtering is a technique for combining multiple patterns in one filter by assigning complex weights of unit magnitude to the individual patterns and summing them into a composite filter. Several of the proposed composite filters are based on this idea, such as the circular harmonic component (CHC) filters and the linear phase coefficient composite (LPCC) filters. We consider the LPCC(1) filter in isolation and examine ways to improve its performance by assigning the complex weights to the individual patterns in a non-random manner so as to maximize the SNR of the filter with respect to the individual patterns. In experiments on a database of 100 to 1000 edge images from the aerial domain, we examine the trade-off between the speed-up (the number of patterns combined in a filter) and the unreliability (the number of resulting false matches) of the composite filter. Results indicate that for binary patterns with point densities of about 0.05 we can safely combine more than 20 patterns in the optimized LPCC(1) filter, a speed-up of an order of magnitude over the brute-force approach of matching the individual patterns.
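The following sketch illustrates the phase-coded composite filtering idea in the abstract: several binary patterns are combined with unit-magnitude complex weights (here simply the K-th roots of unity, in the spirit of LPCC(1)), the composite is correlated with the image once, and strong responses would then be verified against individual patterns. The naive weight ordering is an assumption here; the paper's contribution is choosing those weights non-randomly to maximize SNR.

```python
import numpy as np

def composite_filter(patterns):
    """Sum K patterns with unit-magnitude complex weights (K-th roots of unity)."""
    k = len(patterns)
    weights = np.exp(2j * np.pi * np.arange(k) / k)
    return sum(w * p for w, p in zip(weights, patterns))

def correlate(image, kernel):
    """Cross-correlation via FFT, with both arrays zero-padded to a common size."""
    shape = tuple(si + sk - 1 for si, sk in zip(image.shape, kernel.shape))
    f_img = np.fft.fft2(image, shape)
    f_ker = np.fft.fft2(np.conj(kernel[::-1, ::-1]), shape)
    return np.fft.ifft2(f_img * f_ker)

rng = np.random.default_rng(2)
# Sparse binary patterns at roughly the 0.05 point density mentioned above.
patterns = [(rng.random((16, 16)) < 0.05).astype(float) for _ in range(8)]
image = np.zeros((64, 64))
image[20:36, 20:36] = patterns[3]          # embed one pattern in the scene

response = np.abs(correlate(image, composite_filter(patterns)))
print("peak response:", response.max())    # candidate locations to verify individually
```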
Bayes nets for selective perception and data fusion
Selective perception sequentially collects evidence to support a specified hypothesis about a scene, as long as the additional evidence is worth the effort of obtaining it. Efficiency comes from selecting the best scene locations, resolution, and vision operators, where 'best' is defined as some function of benefit and cost (typically, their ratio or difference). Selective vision implies knowledge about the scene domain and the imaging operators. We use Bayes nets for representation and benefit-cost analysis in a selective vision system with both visual and non-visual actions in real and simulated static and dynamic environments. We describe sensor fusion, dynamic scene, and multi-task applications.
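A stripped-down version of the benefit-cost selection loop described above is sketched here; the operator names, benefit estimates, and costs are invented numbers standing in for quantities that, in the paper, come from a Bayes net over the scene hypothesis.

```python
def select_actions(candidates, min_ratio=1.0):
    """Greedily apply the operator with the best benefit/cost ratio until no
    remaining operator is worth its cost."""
    applied = []
    remaining = dict(candidates)
    while remaining:
        name, (benefit, cost) = max(remaining.items(),
                                    key=lambda kv: kv[1][0] / kv[1][1])
        if benefit / cost < min_ratio:
            break           # additional evidence is no longer worth the effort
        applied.append(name)
        del remaining[name]
    return applied

operators = {
    "coarse_color_check": (0.40, 0.10),   # (expected benefit, cost)
    "fine_edge_detector": (0.30, 0.50),
    "move_camera":        (0.45, 0.30),
}
print(select_actions(operators))   # -> ['coarse_color_check', 'move_camera']
```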
Detecting periodic patterns with the Modulo matcher
Adrienne Othon
The Modulo matcher was developed to detect the periodic presence of features in imagery; it was used for detecting and counting trains within a rail corridor. The matcher is a variant of the Hough transform and uses modulo, phase, and along-corridor position to parameterize the transform space. It was optimized to prevent searching through empty transform space while also reducing memory requirements. Construction of the transform space is incremental and data-driven.
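The sketch below illustrates the data-driven (modulo, phase) voting that the abstract describes: pairs of detected feature positions along the corridor vote only for the spacing they themselves imply, so empty cells of the transform space are never generated. The spacing limits and the use of simple pair voting are illustrative assumptions.

```python
from collections import Counter

def modulo_votes(positions, min_spacing=2, max_spacing=50):
    """Accumulate (spacing, phase) votes from pairs of feature positions."""
    votes = Counter()
    for i, a in enumerate(positions):
        for b in positions[i + 1:]:
            spacing = abs(b - a)
            if min_spacing <= spacing <= max_spacing:
                phase = a % spacing
                votes[(spacing, phase)] += 1   # only data-supported cells exist
    return votes

# Detections spaced 10 units apart along the corridor, plus one spurious hit.
detections = [3, 13, 23, 33, 43, 27]
best, count = modulo_votes(detections).most_common(1)[0]
print("spacing, phase:", best, "votes:", count)   # the periodic pattern wins
```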
Locally excitatory, globally inhibitory oscillator networks: theory and application to scene segmentation
DeLiang Wang,
David Terman
A novel class of locally excitatory, globally inhibitory oscillator networks (LEGION) is proposed and investigated analytically and by computer simulation. The model of each oscillator corresponds to a standard relaxation oscillator with two time scales. The network exhibits a mechanism of selective gating, whereby an oscillator jumping up to its active phase rapidly recruits the oscillators stimulated by the same pattern, while preventing other oscillators from jumping up. We show analytically that with the selective gating mechanism the network rapidly achieves both synchronization within blocks of oscillators that are stimulated by connected regions and desynchronization between different blocks. Computer simulations demonstrate LEGION's promising ability for segmenting multiple input patterns in real time. This model lays a physical foundation for the oscillatory correlation theory of feature binding, and may provide an effective computational framework for scene segmentation and figure/ground segregation.
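For readers who want to experiment, here is a compact numerical sketch of a two-time-scale relaxation oscillator pair with excitatory coupling and a crude global inhibitor, in the spirit of the LEGION model described above; the constants, the simple Euler integration, and the inhibitor rule are illustrative choices rather than the paper's exact formulation.

```python
import numpy as np

def simulate(steps=20000, dt=0.01, eps=0.02, beta=0.1, gamma=6.0,
             stim=(0.8, 0.8), w_excit=1.0, w_inhib=0.5):
    """Euler-integrate two coupled relaxation oscillators (fast x, slow y)."""
    n = len(stim)
    x = np.random.default_rng(3).uniform(-2, 2, n)   # fast variables
    y = np.zeros(n)                                  # slow variables
    trace = []
    for _ in range(steps):
        z = float(np.any(x > 0))                     # crude global inhibitor
        for i in range(n):
            coupling = w_excit * sum(max(x[j], 0.0) for j in range(n) if j != i)
            dx = 3 * x[i] - x[i] ** 3 + 2 - y[i] + stim[i] + coupling - w_inhib * z
            dy = eps * (gamma * (1 + np.tanh(x[i] / beta)) - y[i])
            x[i] += dt * dx
            y[i] += dt * dy
        trace.append(x.copy())
    return np.array(trace)

activity = simulate()
# Each stimulated unit should visit the active phase (x well above 0) at least once.
print("units reached the active phase:", (activity > 1.0).any(axis=0))
```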
Multimodality radiological image processing system (MRIPS)/MEDx: a system for medical image processing
A new image processing system, MRIPS/MEDx, is being deployed at NIH to facilitate the visualization and analysis of multidimensional images and spectra obtained from different radiological imaging modalities.
Heuristics for text recognition using contextual information
Tony Baraghimian
Competitive electronic imaging systems are emerging due to rapidly declining processing and storage costs. Imaging converts information on paper to electronic pictures. For applications involving large quantities of paper documents, the resulting pictures are further processed by automated character recognition systems, producing a text representation of the original document. Current character recognition accuracy varies from one implementation to the next and depends greatly on the particular application. We define a set of information fusion rules for combining the outputs of character recognition systems. The combined result has higher character recognition accuracy and a lower error rate than either of the individual recognizer outputs taken separately. This new set of fusion heuristics takes advantage of the following information from multiple text string recognition systems simultaneously: (1) multiple hypotheses and associated confidences for each character in a text string; (2) multiple text string segmentation hypotheses; (3) separate or combined hypotheses for both uppercase and lowercase alphabetic characters; and (4) overall text string hypotheses and associated confidences. Traditionally, only the last of these four information groups is used for fusion of multiple classifications within character recognition systems. We report on a nationally sponsored character recognition benchmark, with results indicating increased accuracy using the heuristic rules described.
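The sketch below shows one plausible instance of items (1) and (3) from the list above: per-character hypotheses from two recognizers are pooled, case-folded, and their confidences summed before a winner is chosen. The confidence scale and the sum-then-max rule are assumptions for illustration, not the benchmarked rule set.

```python
from collections import defaultdict

def fuse_character(hypothesis_lists):
    """hypothesis_lists: one list per recognizer of (char, confidence) pairs."""
    pooled = defaultdict(float)
    for hypotheses in hypothesis_lists:
        for char, conf in hypotheses:
            pooled[char.lower()] += conf      # combine upper/lowercase evidence
    return max(pooled.items(), key=lambda kv: kv[1])   # (character, combined confidence)

recognizer_a = [("O", 0.55), ("0", 0.40)]
recognizer_b = [("o", 0.60), ("Q", 0.25)]
# The shared 'O'/'o' hypothesis wins with the highest combined confidence.
print(fuse_character([recognizer_a, recognizer_b]))
```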
Image Database Storage and Transmission
High-quality lossy compression: current and future trends
Steven W. McLaughlin
This paper is concerned with current and future trends in the lossy compression of real sources such as imagery, video, speech, and music. We put all lossy compression schemes into a common framework in which each can be characterized in terms of three well-defined advantages: cell-shape, region-shape, and memory advantages. We concentrate on image compression and discuss how new entropy-constrained trellis-based compressors achieve cell-shape, region-shape, and memory gains, resulting in high fidelity and high compression.
Transmission of compressed tactical imagery by means of an RF link
Gary H. Conners,
C. S. Joe Liou,
Joe Muczynski
The joint University of Rochester/Rochester Institute of Technology 'Center for Electronic Imaging Systems' (CEIS) is designed to focus on research problems of interest to industrial sponsors. A particular feature of the research is that it is organized in the 'triplet' mode: each project includes a faculty researcher, an industrial partner, and a doctoral or postdoctoral fellow. Compression of tactical images for transmission over an RF link is an example of this type of research project; it is being carried out in collaboration with one of the CEIS sponsors, Harris Corporation/RF Communications. The Harris Digital Video Imagery Transmission System (DVITS) is designed to fulfill the need to transmit secure imagery between unwired locations at real-time rates. DVITS specializes in transmission systems for users who rely on HF equipment operating at the low end of the frequency spectrum. However, the inherently low bandwidth of HF, combined with transmission characteristics such as fading and dropout, severely restricts the effective throughput. The problem is posed as one of maximizing the probability of reception of the most significant information in an m x n pixel image in the shortest possible time. Various design strategies combining image segmentation, compression, and error correction are evaluated using a realistic model of the communication channel. A recommended strategy is developed and a test method using a variety of test images is described. The methodology established here can be employed for other image transmission designs.
Demonstration of dissemination, storage, and retrieval of Defense Mapping Agency digital products over a distributed enterprise network
James W. Mehring
As the Defense Mapping Agency (DMA) moves from being a producer of hardcopy products to being a data warehouse of geospatial products that provides users with the most current information accessible on-line, its architecture will migrate to a distributed set of massive databases connected by high-speed local and wide area networks and accessible by remote users, who can efficiently query, locate, and move the data of interest to them. A demonstration of a prototype system incorporating some of the technologies that will be key to the development of the DMA's future architecture was run in July 1994. A remote client with a one-meter very small aperture terminal (VSAT) was used to access, via commercial satellite link, a data warehouse consisting of a nationwide set of distributed servers connected by asynchronous transfer mode (ATM) commercial communications links. The demonstration scenario simulated a 'take and update' situation in which a user deployed with geospatial data on CD-ROM is able to access and download updates for the region of interest via satellite link. The user is also able to upload update information to the central location and to collaborate with operators there on the details of the input from the remote site.
Challenges in providing general access to digitized x rays over the Internet
As part of a collaborative project with other government agencies, the National Library of Medicine (NLM) is engaged in the development of an electronic archive of digitized cervical and lumbar spine x-rays taken in the course of nationwide health and nutrition examination surveys. One goal of the project is to provide access to the images via a client/server system specifically designed to enable radiologists located anywhere on the Internet to read them and enter their readings into a database at the server located at NLM. Another key goal is to provide general (public) access to these images, the radiologists' readings, and other collateral data taken during the survey. The system developed for such general access is based on a public-domain server, the World Wide Web (WWW), and NCSA Mosaic, a distributed hypermedia client system designed for information retrieval over the Internet. This paper describes the design of the client/server software, the storage environment for the x-ray archive, the user interface, the communications software, and the public access archive. Design issues include file format, image resolution (both spatial and contrast), compression alternatives, linking collateral data with images, and the role of staging and prefetching.
Testing scanners for the quality of output images
Vicente P. Concepcion,
Lawrence D. Nadel,
Donald P. D'Amato
Document scanning is the means through which documents are converted to their digital image representation for electronic storage or distribution. Among the types of documents being scanned by government agencies are tax forms, patent documents, office correspondence, mail pieces, engineering drawings, microfilm, archived historical papers, and fingerprint cards. Increasingly, the resulting digital images are used as the input for further automated processing, including conversion to a full-text-searchable representation via machine-printed or handwritten (optical) character recognition (OCR), postal zone identification, raster-to-vector conversion, and fingerprint matching. These diverse document images may be bi-tonal, gray scale, or color. Spatial sampling frequencies range from about 200 pixels per inch to over 1,000. The quality of the digital images can have a major effect on the accuracy and speed of any subsequent automated processing, as well as on any human-based processing that may be required. During imaging system design there is, therefore, a need to specify the criteria by which image quality will be judged and, prior to system acceptance, to measure the quality of the images produced. Unfortunately, there are few, if any, agreed-upon techniques for measuring document image quality objectively. In the output images, it is difficult to distinguish image degradation caused by the poor quality of the input paper or microfilm from that caused by the scanning system. We propose several document image quality criteria and have developed techniques for their measurement. These criteria include spatial resolution, geometric image accuracy (distortion), gray-scale resolution and linearity, and temporal and spatial uniformity. Measuring these criteria requires scanning one or more test targets, along with computer-based analyses of the test target images.
Special Session: Image Information Processing On High-Performance Computers
Are higher performance image processors like GAPP really needed?
We are building automatic target recognizer (ATR) systems. These systems are being applied to many different target detection scenarios. Our work has been in the military application field, but the problems are the same for most commercial applications as well, and the measures of performance are the same. How well can a human perform the same target detection task? What is the probability of detection (Pd) for the target versus the false alarm rate (FAR)? The community has evolved comparative performance techniques that present the merits of alternative system approaches. In this paper, we present the results of a comparative study of alternative algorithms for detecting and classifying buried and surface land mines from an airborne platform in infrared imagery. The results show that for low signal-to-clutter ratios, more complex algorithms produce a higher Pd for a given FAR. More complex algorithms imply the need for a high-performance, high-throughput processor to meet typical timelines. An update on the geometric arithmetic parallel processor (GAPP™), a high-performance, high-throughput machine, is therefore provided.
MaxVideo 200: a pipeline image processing architecture for performance-demanding applications
Glen L. Ahearn
A variety of hardware architectures have been used to address image processing needs, including general-purpose processors (CPUs, array processors, and DSPs), parallel processors, and pipeline processors. In performance-demanding imaging applications a pipeline processing architecture, such as MaxVideo 200, has some distinct advantages. MaxVideo 200 is a pipeline processing architecture designed by Datacube for a variety of image processing applications. It consists of many different 'algorithmically specific' processing sections that may be used to process the image in stages. These processing sections all operate at a synchronous 20 MHz rate. Each processing section performs its operation (convolution, arithmetic operation, look-up table, etc.) in a fixed, integer number of clock cycles. The time it takes for each processing section to perform its function is known as the pipeline delay. Since the pipeline delay of each section is fixed, the cumulative delay of passing the image data through multiple processing sections can be calculated, and this cumulative delay can then be compensated for within the memory architecture of the MaxVideo 200. The memory architecture also operates on the same synchronous 20 MHz clock as the processing sections. By controlling when the memory acquires the image data after it has been processed, the image may be 'realigned' in memory for display or further processing.
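Since each section's delay is a fixed number of clock cycles, the compensation described above reduces to simple arithmetic; the toy calculation below sums assumed per-section cycle counts (not Datacube's actual figures) and converts the total to time at the 20 MHz pipeline clock.

```python
CLOCK_HZ = 20_000_000   # synchronous pipeline clock rate cited in the abstract

def cumulative_delay(sections):
    """Total pipeline delay of a chain of sections, in cycles and seconds."""
    cycles = sum(delay for _, delay in sections)
    return cycles, cycles / CLOCK_HZ

pipeline = [
    ("input_lut", 3),        # (section name, assumed fixed delay in clock cycles)
    ("8x8_convolver", 12),
    ("alu", 2),
    ("output_lut", 3),
]
cycles, seconds = cumulative_delay(pipeline)
print(f"{cycles} cycles = {seconds * 1e6:.2f} microseconds of pipeline delay")
```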
ATCURE: heterogeneous computer architecture for real-time image information analysis
Jeremy A. Salinger,
R. Michael Hord
ATCURE is a real-time, open, high-performance computer architecture optimized for automatically analyzing imagery for applications such as medical diagnosis, character recognition, and target cueing. ATCURE's tightly coupled heterogeneous architecture includes specialized subsystems for input/output, image processing, numeric processing, and symbolic processing. Each subsystem is specialized differently to exploit distinctive demands for data storage, data representation, mixes of operations, and program control structures. This paper discusses ATCURE in the context of the evolution of computer architectures and shows that heterogeneous high-performance architecture (HHPA) computers, an emerging category of parallel processors characterized by superior cost-performance and of which ATCURE is an example, are well suited for a wide range of image information processing applications.
Image understanding architecture: a status report
Charles C. Weems
The image understanding architecture (IUA) effort is now entering a new phase. The second generation IUA prototypes are nearing completion and our experience with the hardware, extensive software simulations, and additional research are guiding the development of a new generation of the IUA. Furthermore, the primary contractors have been selected for a technology reinvestment project (TRP) award to develop a commercial, off-the-shelf implementation of the new IUA for dual-use embedded applications. Thus, the IUA effort is in the process of making the transition from a research and development project to being a commercially available vision accelerator. IUA development is currently taking place at three sites (Hughes Research Laboratories in Malibu, Calif., Amerinex Artificial Intelligence Inc., and the University of Massachusetts at Amherst). This TRP consortium plans to form a new company to take over all aspects of IUA development and production. This article summarizes the previous efforts, describes the current status of the effort, expands briefly upon some of the basic research that is supporting the next generation IUA, and concludes with a section describing the efforts that will be undertaken in developing the next generation.
Goals of and open problems in high-performance heterogeneous computing
Howard Jay Siegel,
John K. Antonio,
Richard C. Metzger,
et al.
Ideally, a heterogeneous computing (HC) environment is a well-orchestrated and coordinated suite of high-performance machines that provides support for computationally intensive applications with diverse computing requirements. Such an HC system includes a heterogeneous suite of machines, high-speed interconnections, interfaces, operating systems, communication protocols, and programming environments. HC is the effective use of these diverse hardware and software components to meet the distinct and varied computational requirements of a given application. Implicit in this concept of HC is the idea that subtasks with different machine architectural requirements are embedded in the applications executed by the HC system. Two types of HC systems, mixed-mode machines and mixed-machine systems, are discussed. The goals of and open problems in HC are surveyed.