Proceedings Volume 1827

Model-Based Vision

Hatem N. Nasr, Rodney M. Larson
View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 20 April 1993
Contents: 3 Sessions, 22 Papers, 0 Presentations
Conference: Applications in Optical Science and Engineering 1992
Volume Number: 1827

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
Modeling and Representation
Infrared thermal prediction modeling
Walter John Winterberger, Alfred M. Baird, William E. Taylor
An enhanced thermal prediction model has been developed for weapon system performance analyses. Model development and validation were conducted as part of an effort to quantify the adverse-weather capability of autonomous precision guided munitions. The emphasis was on modeling the thermal signatures of fixed, high-value targets such as bridges and power plants. The techniques developed are directly applicable to modeling target and background features in the IR wavebands used by model-based target recognition algorithms.
Visual transformation of a 3D object in attribute hypergraph representation
Andrew K. C. Wong, C. W. Li
This paper proposes a recognition method which enables a model-driven vision system to recognize a 3-D object from a single view using visual transformation. The term visual transformation here means geometrical transformation (rotation, translation, and magnification) and projection. When a 3-D object is represented by an attributed hypergraph (AHG) G, any single view of the object can also be expressed by an AHG, say g. The recognition process involves establishing a matching between G and g. If the attributes are not considered, g is a subgraph of G. The attributes of g can be converted from those of G by a visual transformation. The objective of this research is to study the transformations of these attributes and to determine whether a visual transformation exists between the given attributes. As an application, we can determine a monomorphism between the AHG of the model of an object and the AHG of a single view of the object.
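The monomorphism test mentioned above can be illustrated, attributes aside, with a brute-force sketch over plain adjacency sets; the graphs below are invented for illustration and are far simpler than attributed hypergraphs:

```python
from itertools import permutations

# Hypothetical adjacency sets: G is the full object's graph, g a single view.
G = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}
g = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

def monomorphism(g, G):
    """Brute-force search for an injective map taking every edge of g
    to an edge of G (i.e., g as a subgraph of G, attributes ignored)."""
    view, nodes = list(g), list(G)
    for image in permutations(nodes, len(view)):
        f = dict(zip(view, image))
        if all(f[v] in G[f[u]] for u in g for v in g[u]):
            return f
    return None  # the view cannot be embedded in the model

f = monomorphism(g, G)
```

The attributed version would additionally check, for each candidate map, whether a single visual transformation converts the model attributes into the view attributes.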
Model-based object classification using unification grammars and abstract representations
Kathleen A. Liburdy, Robert J. Schalkoff
The design and implementation of a high level computer vision system which performs object classification is described. General object labelling and functional analysis require models of classes which display a wide range of geometric variations. A large representational gap exists between abstract criteria such as `graspable' and current geometric image descriptions. The vision system developed and described in this work addresses this problem and implements solutions based on a fusion of semantics, unification, and formal language theory. Object models are represented using unification grammars, which provide a framework for the integration of structure and semantics. A methodology for the derivation of symbolic image descriptions capable of interacting with the grammar-based models is described and implemented. A unification-based parser developed for this system achieves object classification by determining if the symbolic image description can be unified with the abstract criteria of an object model. Future research directions are indicated.
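The unification step at the heart of such a classifier can be sketched with flat feature structures represented as dicts; the `graspable` model below is a made-up stand-in for the paper's grammar-based models:

```python
def unify(a, b):
    """Unify two flat feature structures (dicts): return the merged
    structure, or None if any shared feature has conflicting values."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None  # clash: unification fails
        merged[key] = value
    return merged

# Hypothetical object model and symbolic image description.
model = {"class": "cup", "graspable": True, "shape": "cylindrical"}
observation = {"shape": "cylindrical", "height_cm": 9}

result = unify(model, observation)  # succeeds: the observation is classifiable
```

Classification succeeds exactly when the symbolic image description unifies with the abstract criteria of some object model, as the abstract describes.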
Matching and Recognition
Model-based matching using a minimum representation size criterion and a hybrid genetic algorithm
Ravi B. Ravichandran, Arthur C. Sanderson
Matching of observed scene features to stored models of known objects, in 2-D and 3-D, is a fundamental step toward general scene interpretation. For these problems, the match is defined in terms of the correspondence between the model features and the scene features, and the transformation that maps the model onto the scene. The technique presented in this paper uses a hybrid genetic algorithm to search for the correspondence, and each correspondence is evaluated by finding the associated transformation. The genetic algorithm is composed of a position-based crossover operator and two mutation operators: a random mutation operator and an assignment-based mutation operator. The measure of the match, as defined by a correspondence and a transformation, is made in accordance with the principles of the minimum representation size criterion. Results for models and scenes in large, occluded, and cluttered environments are described. The results are presented for two distinct cases: in the first case the model and scene are specified in 2-D, and in the second case the model is specified in 3-D and the scene in 2-D. The results show the genetic algorithm based search technique to be very efficient and the overall matching technique to be robust in noisy environments.
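A much-reduced sketch of the idea, assuming a translation-only transform, one-point crossover, and plain random mutation in place of the authors' position-based and assignment-based operators (the point sets are synthetic):

```python
import random
import numpy as np

rng = np.random.default_rng(0)
random.seed(0)

# Hypothetical 2-D point features: the scene is the translated model plus clutter.
model = rng.uniform(0, 10, (5, 2))
true_t = np.array([3.0, -2.0])
scene = np.vstack([model + true_t, rng.uniform(-5, 15, (4, 2))])

def fitness(assign):
    """Score a correspondence: fit the translation it implies, then
    return the negative residual so that better matches score higher."""
    t = (scene[assign] - model).mean(axis=0)          # least-squares translation
    residual = ((scene[assign] - (model + t)) ** 2).sum()
    return -residual

def mutate(assign):
    child = assign.copy()
    child[random.randrange(len(child))] = random.randrange(len(scene))
    return child

def crossover(p, q):
    cut = random.randrange(1, len(p))
    return p[:cut] + q[cut:]

# Evolve a population of candidate correspondences with elitist selection.
pop = [[random.randrange(len(scene)) for _ in model] for _ in range(40)]
for _ in range(150):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:10]
    pop = elite + [mutate(crossover(*random.sample(elite, 2))) for _ in range(30)]

best = max(pop, key=fitness)
```

The minimum-representation-size criterion of the paper would additionally trade residual against the cost of encoding the correspondence itself; here the fitness is residual-only for brevity.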
Image-data-based matching for affine-transformed pictures
Yoshihiko Nomura, Yujiro Harada, Seizo Fujii
This paper presents a simple and robust pattern matching algorithm working at the image-data level and requiring no feature extraction. A model picture is transformed into an estimated picture, and the estimated picture is matched to an actually input picture. Both the geometrical affine transformation and a linear gray-level transformation are examined, and the transformation parameters relating to rotation, translation, expansion, and brightness are estimated using a statistical optimization technique, i.e., an iterative non-linear least squares method in which the residual sum of squares between the actually input picture and the estimated picture serves as the evaluation function. The characteristic of the proposed method is that the parameters are estimated by linear matrix calculations, so that the computation is markedly simplified and can be carried out in parallel for all the pixels. The matrices are easily calculated from the gray level and its spatial derivatives in the horizontal and vertical directions in the model picture, and from the gray level in the actually input picture. Experiments on both a simple pattern and a complicated one confirm that the translation parameters are estimated to within approximately 0.1 pixel. The dynamics of parameter estimation are also examined.
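The linearized least-squares estimation can be sketched for the translation parameters alone (omitting rotation, expansion, and the gray-level transform); the synthetic Gaussian picture below stands in for a real model picture:

```python
import numpy as np

def make_picture(tx, ty, size=64):
    """Smooth synthetic gray-level picture, shifted by (tx, ty)."""
    y, x = np.mgrid[0:size, 0:size]
    return np.exp(-(((x - 30 - tx) ** 2 + (y - 28 - ty) ** 2) / 150.0))

model = make_picture(0.0, 0.0)
observed = make_picture(1.3, -0.8)   # unknown translation to be recovered

p = np.zeros(2)                       # current parameter estimate (tx, ty)
for _ in range(10):
    est = make_picture(*p)            # estimated picture under current parameters
    gy, gx = np.gradient(est)         # spatial gray-level derivatives
    # Linearize: observed - est ~ gx*(p_x - tx*) + gy*(p_y - ty*),
    # and solve the resulting linear least-squares problem each iteration.
    J = np.stack([gx.ravel(), gy.ravel()], axis=1)
    r = (observed - est).ravel()
    dp, *_ = np.linalg.lstsq(J, r, rcond=None)
    p -= dp
```

After a few iterations `p` recovers the shift to well under the 0.1-pixel accuracy the abstract reports, on this idealized noise-free picture.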
Recognition of partially occluded objects using macro features
Peter Y. Hsu, Anthony P. Reeves
Objects that are partially visible in an image may be recognized by detecting a number of salient local object features that conform to a set of relative location constraints. Such vision systems contain two main computational components: a feature extraction mechanism and a matching strategy that relates sets of features and their locations to object classes. The macro feature approach represents an object by a small number of complex features; the objectives are to provide very robust feature extraction and to simplify the feature matching stage by minimizing the number of detected features. Described here is a macro feature vision system that uses the Generalized Hough Transform on significant regions of the object surface for local feature detection. The results of using this system to detect objects in multi-object images with partial occlusion are presented.
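A minimal sketch of Generalized Hough Transform voting, assuming idealized edge points with quantized orientations and, unlike the paper's experiments, no occlusion or clutter (the square model is invented for illustration):

```python
from collections import defaultdict

# Hypothetical model: edge points of a 10x10 square, with edge orientation
# quantized to 0 (horizontal) or 1 (vertical).
model_edges = ([((x, 0), 0) for x in range(10)]
               + [((x, 9), 0) for x in range(10)]
               + [((0, y), 1) for y in range(10)]
               + [((9, y), 1) for y in range(10)])
ref = (5, 5)  # reference point of the model

# R-table: orientation -> offsets from an edge point to the reference point.
r_table = defaultdict(list)
for (x, y), ori in model_edges:
    r_table[ori].append((ref[0] - x, ref[1] - y))

# Scene: the same square translated by (12, 7).
scene_edges = [((x + 12, y + 7), ori) for (x, y), ori in model_edges]

# Voting: each scene edge point votes for candidate reference locations.
acc = defaultdict(int)
for (x, y), ori in scene_edges:
    for dx, dy in r_table[ori]:
        acc[(x + dx, y + dy)] += 1

best = max(acc, key=acc.get)  # accumulator peak locates the object
```

Because every true edge point casts one vote at the shifted reference point, the accumulator peak survives even when many edge points are missing, which is what makes the transform attractive under partial occlusion.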
Information theoretic approach to model-based 3D object recognition using orthogonal transforms
Inderpreet S. Khurana, Richard F. Vaz, David Cyganski, et al.
A previously reported technique for object recognition and orientation determination is used to develop an information theoretic viewpoint of the model-based vision problem. The technique employs an analytic object model developed from transform coefficients of object views, and develops pose parameter estimates by minimizing an error measure developed from the model and transform coefficients of an acquired image. Parallels between this technique and vector quantization-based coding are developed, which motivate an analysis of the machine vision system performance in terms of the distortion/rate performance achievable. This paper presents the framework for such a viewpoint, along with preliminary performance results of the recognition technique.
On-line perceptual recognition system
Panos A. Ligomenides
In this paper we are concerned with automating the inferential derivation of meaningful assertions about resemblance (predicates of probability, possibility, or belief) from often underconstrained, indeterminate, or even incomplete sensory data. Our approach to the problem of modeling perceptual recognition (within the confines of one-dimensional variational profiles) is based on interactive and paradigmatic 'training' of 'formal description schema' (fds) models, which we briefly review here. Thereafter, we present the architecture and operations of an on-line perceptual recognition system, the E*KB system, which serves as a prosthetic extension of a rational human or a robotic decision maker.
Maximum entropy and minimum cross-entropy methods in image processing
Cristian E. Toma, Mihai P. Datcu
The maximum entropy (ME) and minimum cross-entropy (MCE) formalisms provide a coherent tool for incorporating new information (in terms of constraints) into initial models, and also an alternative tool for solving inverse problems. Our paper discusses some particularities of applying the ME and MCE formalisms to image processing problems; given the ME-MCE framework, one has to identify the proper constraint system for the concrete problem at hand. The relation between Bayesian maximum a posteriori (MAP) methods and ME-MCE methods is also discussed. Examples are given in the field of the restoration of synthetic aperture radar images, whose resolution is affected by the well-known speckle noise, a side effect of the coherency of the image formation system.
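The MCE constraint mechanism can be illustrated with the classic die example rather than the paper's SAR restoration: given a uniform prior and the new information that the mean face value is 4.5, the minimum cross-entropy posterior is an exponential tilt of the prior, with the multiplier found here by bisection:

```python
import numpy as np

def min_cross_entropy(prior, x, target_mean, iters=60):
    """Posterior q minimizing KL(q || prior) subject to sum(q*x) = target_mean.
    The solution has the exponential-tilt form q_i ~ prior_i * exp(lam * x_i);
    lam is found by bisection, since the constrained mean is monotone in lam."""
    lo, hi = -50.0, 50.0
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        q = prior * np.exp(lam * x)
        q /= q.sum()
        if q @ x < target_mean:
            lo = lam
        else:
            hi = lam
    return q

x = np.arange(1, 7)                   # faces of a die
prior = np.full(6, 1 / 6)             # uniform initial model
q = min_cross_entropy(prior, x, 4.5)  # incorporate the mean constraint
```

With a uniform prior this reduces to the ME solution; a non-uniform prior (e.g., an initial speckle model) would be tilted in exactly the same way.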
Architectures
Fusion of symbolic and feature information for high-level object recognition
Neelima Shrikhande, Jim Getzinger
In this paper, we present an algorithm which uses symbolic as well as physical labels on the edges and surfaces to constrain the scene-model matching process. Symbolic labels are used to distinguish between curved and planar objects, occluding edges, background surface, etc. These are used along with physical labels such as distances and angles to prune the matching graph. This paper describes a real time object recognition environment that integrates the pruning method described above with low level image processing and high level object recognition algorithms. Results are reported for synthetic and real range images. Our results show that inclusion of symbolic labels improves the accuracy and efficiency of matching.
Distributed image event database: support for symbolic image processing on the image understanding architecture
Richard Allen Lerner
This paper describes software to support symbolic processing of image events on the image understanding architecture (IUA). The IUA is a massively parallel computer for real-time image understanding tasks. Although well suited for image understanding processing, it presents a challenge to researchers who need to implement algorithms quickly and effectively. We address this challenge by providing a programming model whose basis is a distributed version of the intermediate symbolic representation database, familiar in programming models for symbolic image understanding processing on uniprocessors. In distributing the database, we allow multiple tasks to efficiently manipulate the database symbols in parallel, while hiding most of the additional complexity caused by the data distribution. We illustrate our database by presenting a non-trivial grouping algorithm.
Interaction between different types of domain-constraint knowledge in image segmentation
Stuart W. Wells, Matthew O. Ward
Image segmentation systems which employ domain-specific knowledge produce, in general, results superior to those of context-free systems. A significant drawback of using domain-specific knowledge is its lack of portability and its excessive development and computation time. In this paper, we investigate using a general rule-based segmentation system and augmenting it with domain-constraint knowledge to improve the performance of the system. We also investigate interactions between multiple types of domain-constraint knowledge and their effect on parameter selection and rule ordering.
Symbolic representation, modeling, and manipulation of arcs and lines
Kamran Reihani
From a computational perspective, the general vision problem is considered by many to be ill-posed. Currently, the dominant approach to characterizing vision computations is to categorize the required processing into low, intermediate, and high levels. This paper adopts a structural approach to vision that couples symbolic and numerical methods, and discusses the semantics, modeling, and manipulation of symbolic descriptions of arc and line image structures for vision applications. Collectively, this approach demonstrates a logical perspective on the processes that occur in the low, intermediate, and high levels of vision computations.
Knowledge-based pattern recognition using an associative processor
Arun D. Kulkarni, Vijay B. Nagpurkar
An image understanding system often consists of preprocessing, feature extraction, and classification stages. In this paper we consider a descriptive approach to classification. As an illustration, we consider the problem of identifying sailing craft. Properties such as the position of the mast(s), the height of the mast, and the type of sails are used as features. The classification scheme is described by a hierarchical tree structure. We have created the knowledge base for the classifier by encoding the classification rules, using an associative processor. A number of operations can be performed with the associative processor, including upward closure, downward closure, union, and intersection. To use the processor as a classifier, the intersection operation is employed; it is achieved by performing a downward closure followed by thresholding. We have used a two-layer nonlinear feedback network as the associative processor, and have also developed a menu-driven input/output interface for the classifier.
Matching and Recognition
Fuzzy logic approach to model-based image analysis
Koji Miyajima, Anca L. Ralescu
In this paper, we propose a model representation of objects that includes fuzziness, and a method for matching the model against features extracted from image data. In the model representation, objects are described hierarchically, and the attributes of each component, such as color, shape, and size, are described by fuzzy sets. The ambiguous correlations between attributes and components are described using a fuzzy measure. Taking these correlations into account, objects are recognized by integrating the results of matching the image-processing output against the attributes in the model. Finally, the method is applied to the recognition of a real photograph, and the experimental results and their implications are discussed.
Architectures
Towards a multimedia model-based image retrieval system using fuzzy logic
Hiroshi Iwamoto, Anca L. Ralescu
This paper describes our current research on the subject of image retrieval based on a model of the image data. The work focuses on retrieving the image of a natural object from a collection of like images; the application considered is that of retrieval from a data base of face photographs. The difficulty arises mainly from the fact that precise, geometric models of natural images are not necessarily available. The work described builds on previous work on an image retrieval system based on linguistic modeling of image data using fuzzy sets: such a model is a collection of statements `X is F,' where X refers to a region/component of the object and F refers to a fuzzy set, such as `big,' `small,' `long,' which describes the size of X. In this paper we consider the situation when the model of the image is expressed in a graphic form (sketch). The motivation for considering this subject arises, among others, from: (1) the fact that not all image characteristics (such as a wrinkle for example) can be described linguistically, (2) linguistic models tend to be subjective, and (3) a linguistic model may not be available. Retrieval based on mixed queries, linguistic and graphic, is considered. Fuzzy logic is used to express the linguistic model of the image data, and for reasoning.
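The linguistic half of such a model can be sketched with trapezoidal membership functions; the vocabulary, face regions, and database below are invented for illustration and stand in for the paper's face-photograph data:

```python
def trapezoid(a, b, c, d):
    """Membership function of a fuzzy set with support [a, d] and core [b, c]."""
    def mu(v):
        if v <= a or v >= d:
            return 0.0
        if b <= v <= c:
            return 1.0
        return (v - a) / (b - a) if v < b else (d - v) / (d - c)
    return mu

# Hypothetical fuzzy vocabulary for face-region sizes (normalized units).
terms = {"small": trapezoid(-1.0, 0.0, 0.2, 0.4),
         "big": trapezoid(0.6, 0.8, 1.0, 2.0)}

# Linguistic model of the sought face: a set of statements "X is F".
query = {"nose": "big", "mouth": "small"}

# Measured region sizes for each photograph in the database.
database = {"face1": {"nose": 0.9, "mouth": 0.1},
            "face2": {"nose": 0.3, "mouth": 0.3}}

def match(query, measurements):
    """Overall degree of match: conjunction (min) over all statements."""
    return min(terms[f](measurements[x]) for x, f in query.items())

ranked = sorted(database, key=lambda k: match(query, database[k]), reverse=True)
```

A mixed query of the kind the paper considers would combine such linguistic scores with a similarity score against a user-supplied sketch before ranking.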
Matching and Recognition
Automating measurement from standard radiographs
Adam I. Harris M.D., Dov Dori, Mitchell Sheinkop, et al.
An obligatory portion of an orthopaedic examination is a radiographic examination. Techniques such as computed tomography lend themselves easily to computerized analysis, but both expense and radiation hazards prohibit their routine use in orthopaedic practice. Standard radiographs provide a significant amount of information for the orthopaedic surgeon, who makes many measurements and assessments from them. A major problem is that measurements are performed by hand, most often by the operating surgeon, who may not be completely objective. To overcome this, as well as to alleviate the burden of manual measurements that must be made by trained professionals, we have initiated a program to automate certain radiographic measurements. The technique involves digitizing standard radiographs, from which features are extracted and identified. This poses a challenge: structures such as soft tissues (muscle and bowel) markedly decrease the signal-to-noise ratio of the image. The work discusses modeling of the soft-tissue structures in order to enhance detection and identification of bone landmarks, which are the anchors for standard measurements that in turn have clinical utility.
Modeling and Representation
Model-based 3D object pose estimation from linear image decomposition and direction of arrival analysis
David Cyganski, Richard F. Vaz, Charles R. Wright
This paper presents enhancements and new results related to a method for model-based object recognition which uses a single, comprehensive analytic object model representing the entirety of a suite of gray-scale views of the object. Object orientation and identity are directly established by this method from arbitrary views, even though these views are not related by any geometric image transformation. The approach is also applicable to multi-sensor real and complex data, such as radar and thermal signatures. The object model is comprised of a reduced reciprocal image set generated from a Fourier representation of an object image suite. The projection of an acquired image onto the reciprocal basis yields samples of a complex exponential, the phase of which reveals the pose parameters. Estimation of this phase for several degrees of freedom corresponds to the plane wave direction of arrival (DOA) problem; thus the pose parameters can be found using DOA solution techniques. Results are given which illustrate the performance of an implementation of this method using camera acquired images.
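The phase-based pose recovery can be illustrated in its simplest one-degree-of-freedom form: an in-plane rotation of an angular signature is a circular shift, and projecting onto a single complex exponential basis vector turns that shift into a measurable phase. The toy signature below is invented and is not the paper's reduced reciprocal image set:

```python
import numpy as np

N = 360
theta = np.arange(N) * 2 * np.pi / N
signature = np.cos(theta) + 0.5 * np.sin(3 * theta)   # model's angular signature

shift = 25                                            # unknown rotation, in samples
observed = np.roll(signature, shift)                  # the acquired, rotated view

# Projection onto a complex exponential basis vector yields a coefficient
# whose phase varies linearly with rotation; the phase difference between
# model and observation therefore reveals the pose parameter.
basis = np.exp(-2j * np.pi * np.arange(N) / N)
c_model = signature @ basis
c_obs = observed @ basis
angle = np.angle(c_model / c_obs)                     # equals 2*pi*shift/N here
est_shift = round(angle * N / (2 * np.pi)) % N
```

The paper's method estimates several such phases jointly, which is where the analogy to plane-wave direction-of-arrival estimation comes in.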