Proceedings Volume 1708

Applications of Artificial Intelligence X: Machine Vision and Robotics

Kevin W. Bowyer
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 1 March 1992
Contents: 19 Sessions, 69 Papers, 0 Presentations
Conference: Aerospace Sensing 1992
Volume Number: 1708

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • How To Design a Robot "Head" I
  • How To Design a Robot "Head" II
  • Machine Vision Inspection Techniques I
  • Segmentation of Fused Range and Intensity Imagery
  • Parallel and VLSI Architectures for Machine Vision
  • Comparison of Range Image Segmentation Algorithms
  • State of the Art in Post-Canny Edge Detection I
  • Simulation and Visualization Environments for Autonomous Robots
  • State of the Art in Post-Canny Edge Detection II
  • Planning of Robot Reach/Grasp Operations
  • Fuzzy Morphological Neural Networks
  • Machine Vision Inspection Techniques II
  • Reasoning Techniques for Vision Systems
  • Exploration of "Recognition by Components" I
  • Exploration of "Recognition by Components" II
  • Representation and Matching
  • Reactive Robotic Control Strategies
  • Image Processing Techniques
  • Machine Vision Inspection Techniques III
How To Design a Robot "Head" I
Harvard binocular head
Nicola J. Ferrier
This paper presents the design and control aspects of the Harvard Stereo Robotic Head, a camera-head system. Active vision research suggests the use of sensor movement. The emphasis of this paper is on the physical aspects of controlling camera movements and lens parameters, not on the selection of visual targets. The system provides a testbed for active stereo-vision and motion-vision experiments.
How To Design a Robot "Head" II
Layered control of a binocular camera head
James L. Crowley, Philippe Bobet, Mouafak Mesrabi
This paper describes a layered control system for a binocular stereo head. It begins with a discussion of the principles of layered control. It then describes the mechanical configuration for a binocular camera head with six degrees of freedom. A device level controller is presented which permits an active vision system to command the position of a binocular gaze point in the scene. The final section describes the design of perceptual actions which exploit this device level controller.
How To Design a Robot "Head" I
Head, eyes, and head-eye systems
Kourosh Pahlavan, Jan-Olof Eklundh
Active vision systems can be considered as systems that integrate visual sensing and action. Sensing includes detection of actions/events and results also in specific actions/manipulations. This paper mainly addresses the basic issues in the design of a head-eye system for the study of active-purposive vision. The design complexity of such a head is defined by the activeness of the visual system. Although we have not had the ambition to exactly reproduce the biological solutions in a robot, we claim that the designer should carefully consider the solutions offered by evolution. The flexibility of the behavioral pattern of the system is constrained by the mechanical structure and the computational architecture used in the control system of the head. The purpose of the paper is to describe the mechanical structure as well as the computational architecture of the KTH-head from this perspective.
AUC robot camera head
Active vision is an area that has received increased attention over the past few years. At LIA/AUC, one active research area is the use of active vision for geometric scene modeling and interpretation. In order to pursue this research, a binocular robot camera head has been constructed. In this manuscript, the basic design of the head is outlined and the prototype that has been constructed is described in some detail. Detailed specifications of the components are provided, together with a section on lessons learned from the construction of the prototype.
How To Design a Robot "Head" II
TRISH: the Toronto-IRIS Stereo Head
Michael R. M. Jenkin, Evangelos E. Milios, John K. Tsotsos
This paper introduces and motivates the design of a controllable stereo vision head. The Toronto IRIS stereo head (TRISH) is a binocular camera mount consisting of two AGC, fixed focal length color cameras forming a verging stereo pair. TRISH is capable of version (rotation of the eyes about the vertical axis so as to maintain a constant disparity), vergence (rotation of the eyes about the vertical axis so as to change the disparity), pan (rotation of the entire head about the vertical axis), and tilt (rotation of the eyes about the horizontal axis). One novel characteristic of the design is that the two cameras can rotate about their own optical axes (torsion). Torsion movement makes it possible to minimize the vertical component of the two-dimensional search which is associated with stereo processing in verging stereo systems.
Machine Vision Inspection Techniques I
Fast confocal image processing for inspection
A. Ravishankar Rao, Frederick Y. Wu, Jon R. Mandeville, et al.
The measurement of surface topography is an important inspection task as it provides useful information for process and quality control. A candidate technique for such an application is confocal imaging. The advantages of confocal imaging are that it is a noncontact measurement, can be operated at high speed (greater than 10 megapixels/sec) and submicron resolution, and provides height information in multilayered semitransparent materials. In this paper, we present a scheme for the fast processing of confocal images. The scheme consists of measuring the response function of the confocal system and deriving a deconvolution filter based on this response. The input signal is deconvolved in order to improve the depth resolution and then processed to identify significant peaks. These peaks represent the position of different surfaces in the object being inspected. For semitransparent materials, our scheme is capable of detecting up to two surfaces at a given location.
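The deconvolve-then-pick-peaks scheme this abstract outlines can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the regularized (Wiener-style) inverse filter, the thresholds, and the function names are all invented for the example, and `response` stands in for the paper's measured confocal response function.

```python
import numpy as np

def deconvolve_and_find_peaks(signal, response, noise_floor=0.1, min_height=0.5):
    """Regularized deconvolution of an axial confocal signal, then peak picking.

    `response` is the (measured) axial response of the confocal system;
    the noise_floor and min_height constants are illustrative only.
    """
    n = len(signal)
    H = np.fft.rfft(response, n)
    S = np.fft.rfft(signal, n)
    # Regularized inverse filter: suppresses frequencies where |H| is small.
    G = np.conj(H) / (np.abs(H) ** 2 + noise_floor ** 2)
    restored = np.fft.irfft(S * G, n)
    # Keep local maxima above a significance threshold.
    peaks = [i for i in range(1, n - 1)
             if restored[i] > min_height
             and restored[i] >= restored[i - 1]
             and restored[i] > restored[i + 1]]
    # Report at most two surfaces per location, as in the abstract.
    peaks.sort(key=lambda i: restored[i], reverse=True)
    return restored, sorted(peaks[:2])
```

A usage sketch: blur two impulses (two surface heights) with a short response kernel, deconvolve, and the two restored peaks mark the surface positions.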
Automated visual quality evaluation of CVD film
W. Philip Kegelmeyer Jr., Wilfred J. Hansen
This study describes the implementation and performance of a new image analysis algorithm, ALOET, as applied to automatically distinguishing between acceptable and unacceptable chemical vapor deposition (CVD) diamond film. ALOET is a texture feature which measures the homogeneity of the edge orientations in a local window, and which is useful in differentially characterizing large and small polycrystalline structures. The analysis in this study includes a performance metric for film quality evaluation, numerical and visual performance results, and an example analytical model for determining the algorithm's control parameters.
Texture defect detection: a review
Machine vision and automatic inspection have been an active field of research during the past few years. In this paper, we review the texture defect detection methods currently in use. We classify them into two major categories, global and local, and briefly discuss the major approaches that have been proposed.
Segmentation of Fused Range and Intensity Imagery
Edge detection and labeling by fusion of intensity and range images
Sateesha G. Nadabar, Anil K. Jain
We have developed an energy minimization approach for detection and labeling of edges by fusing information from registered intensity and range images. The chosen set of labels for the edges (jump, crease, intensity, no edge) is a classification based on the physical properties of the surfaces in the scene. Bayesian formulation is used to combine the different information sources and the a priori knowledge about the labels is modeled by a Markov random field. Results of applying the labeling algorithm on several real image pairs indicate that it is possible to detect as well as label the jump, crease, and intensity edges accurately using this approach.
Detecting wings in quadric surface scenes
Greg C. Lee, George C. Stockman
A method is given for detecting primitive parts of rigid objects whose surfaces are approximated well by quadric patches. From fused range and intensity images, primitives are detected by simultaneously fitting a 2-D object contour and a set of adjacent 3-D surface points. Simulation results show that combined fitting is superior to fitting either range or intensity alone. Experiments with 10 real fused images indicate that a recognition system could be built upon the outlined primitive detection subsystem.
Relaxation method for segmenting quadratic surfaces from depth and intensity images
Recovery of shape and structure of objects present in a scene from its image is a significant problem in vision. The purpose of this paper is to develop and demonstrate an iterative relaxation algorithm that combines (by a fusion process) surface interpolation using a nonuniformly sampled range image and the recovery of shape from shading of a uniformly sampled intensity image. The recovery of shape from shading requires uniform lighting, as well as uniform viewing directions across the scene, which is difficult to achieve. The objective is to extract several piecewise planar surfaces whose orientation parameters are extracted iteratively. With a few exceptions, most range sensors provide nonuniformly sampled depth images. It is desirable to extract the intrinsic surfaces and resample the image over a uniformly spaced grid. Then, it is expected that the structure of each image is isomorphic to that of the other image. Once the structure based region/volume correspondence is established, it becomes possible to adapt the consistency constraint for each surface and the smoothness criterion at the boundaries between two surfaces, and to activate resegmentation (incremental) if necessary.
Parallel and VLSI Architectures for Machine Vision
Design of a systolic VLSI chip for computing scale space
Sanjay Nichani, N. Ranganathan
In this paper we describe the design and implementation of a systolic VLSI chip for computing scale space. The hardware can also be used for Gaussian filtering and Laplacian-of-Gaussian edge detection. The chip is based on an architecture proposed earlier. The algorithm and the architecture exploit a high degree of pipelining and parallelism in order to obtain high speed, efficiency, and throughput. The hardware organization of a processor cell is simple enough that the entire systolic array can be realized as a single-chip system. A prototype CMOS VLSI chip implementing a single processor cell was designed, fabricated, and tested. Based on the estimates obtained from the prototype chip, a full-scale chip is expected to operate at a rate of 40 MHz. The chip can process a 512 X 512 gray-level image in about 0.006 seconds and a 1000 X 1000 gray-level image in 0.012 seconds, which is much faster than other systems reported in the literature.
Systolic architecture for discrete wavelet transforms with orthonormal bases
Henry Y.H. Chuang, HyungJun Kim, Ching-Chung Li
The wavelet transform provides a new method for signal/image analysis where high frequency components are studied with finer time resolution and low frequency components with coarser time resolution. It decomposes a scanned signal into localized contributions for multiscale analysis. This paper presents a general systolic architecture for efficient computations of both signal decomposition and signal reconstruction with orthonormal wavelet bases. When the number of data points windowed in the input is W equals 2m, our discrete wavelet transform (DWT) systolic architecture is composed of m layers of 1-dimensional arrays, which compute the high-pass and the low-pass filtered components simultaneously. Input data string can enter and be processed on-the-fly continuously at the rate of one data point per clock period T. For an input signal of length N (multiple of W), the computation time is NT when N is large. Multiple DWT problems can be pipelined through the array. The computation time for a large number of successive DWT problems is also NT per DWT.
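The layered decomposition the abstract describes (each layer producing high-pass and low-pass components of the running low-pass stream) can be sketched in software with the simplest orthonormal basis, the Haar wavelet. This is a minimal software analogue of the filter-bank structure, not the systolic design itself; the function names are invented for the example.

```python
import numpy as np

def haar_dwt(x, levels):
    """Multilevel DWT with the orthonormal Haar basis.

    Each 'layer' splits the current approximation into a high-pass (detail)
    and a low-pass (approximation) half, mirroring the array's m layers.
    """
    approx, details = np.asarray(x, dtype=float), []
    s = np.sqrt(2.0)
    for _ in range(levels):
        lo = (approx[0::2] + approx[1::2]) / s   # low-pass, downsampled by 2
        hi = (approx[0::2] - approx[1::2]) / s   # high-pass, downsampled by 2
        details.append(hi)
        approx = lo
    return approx, details

def haar_idwt(approx, details):
    """Inverse transform: perfect reconstruction from the orthonormal pieces."""
    s = np.sqrt(2.0)
    for hi in reversed(details):
        up = np.empty(2 * len(approx))
        up[0::2] = (approx + hi) / s
        up[1::2] = (approx - hi) / s
        approx = up
    return approx
```

Because the basis is orthonormal, reconstruction is exact and the transform preserves signal energy, which is what makes simultaneous high-pass/low-pass computation per layer sufficient.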
Parallel VLSI-based architecture for multimotion estimation
Jean-Didier Legat, J. P. Cornil, Damien Macq, et al.
This paper describes a new parallel architecture dedicated to multimotion estimation. The input image is scanned by a standard video camera with 256 grey levels. Motion computing is based on the optical flow determination. Some constraints are proposed to allow multimotion evaluation. The algorithm is presented and the main features of a 1-D systolic architecture which is based on a custom VLSI chip is given. This architecture allows a real-time implementation of the multimotion estimation algorithm.
Tracking image features using a parallel computational model
Timothy J. Ellis, Majid Mirmehdi, Geoff R. Dowling
This paper describes a parallel implementation of an image feature tracking system. The system is designed to operate as the front-end of a vision system for controlling autonomous guided vehicles (AGV). Image features or tokens (edge-based line segments in the example given here) are extracted from the image and allocated to individual tracking processes. Both the extraction and the tracking stages are performed by concurrent processes. Arbitrary tracking algorithms may be associated with each process. In the current implementation, a Kalman filter is used to track and predict tokens in subsequent image frames.
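The per-token tracking process the abstract mentions can be sketched with a standard constant-velocity Kalman filter. The state model, noise levels, and function names below are assumptions for illustration, not the AGV front end described in the paper; each tracked token would run one such predict/update loop concurrently.

```python
import numpy as np

def make_cv_kalman(dt=1.0, q=1e-3, r=1.0):
    """Constant-velocity model for a 2-D image token: state [x, y, vx, vy].

    Only position is measured; q and r are illustrative noise levels.
    """
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    return F, H, q * np.eye(4), r * np.eye(2)

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle: predict the token into the next frame,
    then correct the prediction with the measured token position z."""
    x = F @ x                        # state prediction
    P = F @ P @ F.T + Q              # covariance prediction
    S = H @ P @ H.T + R              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    x = x + K @ (z - H @ x)          # measurement update
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```

Feeding the filter a token moving at a constant image velocity, the velocity estimate converges after a few frames, which is what allows the tracker to predict where to search in the next frame.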
Solutions for real-time visual tracking
Olli Silven, Tapio Repo
The purpose of this work has been to find solutions for reliable real-time monocular visual tracking. The goal is to estimate the relative motion of a camera with respect to a rigid 3-D scene by tracking features. In the beginning, the 3-D locations of the features are not known accurately, but during the tracking process these uncertainties are reduced through the integration of new observations. Most attention has been given to modeling measurement uncertainties and selecting the features to be extracted from image frames. The experimental system under implementation employs a bank of extended-Kalman-filter-based trackers, each of which calculates estimates for location and motion using measurements of a few feature points at a time. The small number of points makes the trackers sensitive to various measurement errors, simplifying the detection of tracking failures and thereby giving potential for improving reliability. Preliminary experiments on image sequences at rates of 22 to 35 frames per second have produced satisfactory results.
Comparison of Range Image Segmentation Algorithms
Performance evaluation of a class of M-estimators for surface parameter estimation in noisy range data
Muhammad Javed Mirza, Kim L. Boyer
Depth maps are frequently analyzed as if, to an adequate approximation, the errors are normally, identically, and independently distributed. This noise model does not consider at least two types of anomalies encountered in sampling: A few large deviations in the data, often thought of as outliers; and a uniformly distributed error component arising from rounding and quantization. The theory of robust statistics formally addresses these problems and is efficiently used in a robust sequential estimator (RSE) of our own design. The specific implementation was based on a t-distribution error model, and this work extends this concept to several well known M-estimators. We evaluate the performance of these estimators under different noise conditions and highlight the effects of tuning constants and the necessity of simultaneous scale and parameter estimation.
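One of the well known M-estimators the abstract refers to, the Huber estimator, can be sketched for a planar range patch via iteratively reweighted least squares. This is a generic illustration of M-estimation with simultaneous scale and parameter estimation, not the paper's RSE or its t-distribution model; the tuning constant 1.345 is the conventional Huber value.

```python
import numpy as np

def huber_plane_fit(X, z, c=1.345, iters=20):
    """Robust fit of z = a*x + b*y + d to range samples via IRLS.

    Scale is re-estimated each iteration (MAD), jointly with the parameters,
    so a few gross outliers in the depth map are strongly downweighted.
    """
    A = np.column_stack([X[:, 0], X[:, 1], np.ones(len(z))])
    theta = np.linalg.lstsq(A, z, rcond=None)[0]      # ordinary LS start
    for _ in range(iters):
        r = z - A @ theta
        # Robust scale estimate (median absolute deviation).
        s = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12
        u = np.abs(r) / s
        w = np.where(u <= c, 1.0, c / u)              # Huber weights
        sw = np.sqrt(w)
        theta = np.linalg.lstsq(A * sw[:, None], z * sw, rcond=None)[0]
    return theta
```

With 10% of depth samples displaced by a large offset, the fit below still recovers the true plane parameters, whereas ordinary least squares would be pulled toward the outliers.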
Integrated approach for surface and volumetric segmentation of range images using biquadrics and superquadrics
Alok Gupta, Ruzena K. Bajcsy
The problem of part definition, description, and decomposition is central to shape recognition systems. We present an integrated framework for segmenting dense range data of complex 3-D scenes into their constituent parts in terms of surface (bi-quadric) and volumetric (superquadric) primitives, without a priori domain knowledge or stored models. Surface segmentation is performed by a novel local-to-global iterative regression approach of searching for the best piecewise description of the data in terms of bi-quadric models. Region adjacency information, surface discontinuities, and global shape properties are extracted and used to guide the volumetric segmentation. Superquadric models are recovered by a global-to-local residual-driven procedure, which recursively segments the scene to derive the part-structure. A set of acceptance criteria provide the objective evaluation of intermediate descriptions, and decide whether to terminate the procedure, selectively refine the segmentation, or generate a negative volume description. Superquadric and bi-quadric models are recovered in parallel to incorporate the best of the coarse-to-fine and fine-to-coarse segmentation strategies. The control module generates hypotheses about superquadric models at clusters of underestimated data and performs controlled extrapolation of part-models by shrinking the global model. We present results on real range images of scenes of varying complexity, including objects with occluding parts, and scenes where surface segmentation is not sufficient to guide the volumetric segmentation. We conclude by discussing the applications of our approach in data reduction, 3-D object recognition, geometric modeling, automatic model generation, object manipulation, qualitative vision, and active vision.
Contribution of edges and regions to range image segmentation
Andre Davignon
Most segmentation algorithms for range images are based upon either a region approach or an edge approach. While the region growing methods are poor in delimiting the region boundaries, the edges do not give information about the different surfaces and are not connected. But these two approaches can collaborate because they give complementary information about the scene. In the proposed method we first extract edges from the image. Two kinds of edges are considered: occlusion edges (signal discontinuities) and roof edges (orientation discontinuities). The edges are completed by a concurrent step-edge and roof-edge closing method in order to form initial closed regions by connected-component labeling. Then begins an iterative region correcting process. At each iteration we fit a least squares bivariate polynomial to every region. Then each boundary point is examined to see if it is better approximated by its region or by a neighboring region. But the regions are not allowed to cross initial edge points, which are considered confident surface boundaries. This process converges after a few iterations and produces a better correspondence between the shapes of the regions and the surfaces of the objects. Results are shown for real range images.
Curvature, scale, and segmentation
Frank P. Ferrie, A. Lejeune, D. Baird
The central idea behind this paper is that many of the difficulties associated with parts decomposition (segmentation) using so-called edge-based and region-based approaches can be alleviated by pursuing both methodologies simultaneously. Our approach is based on generating a scale space of surface coverings using boundary features to determine which cover elements correspond to each part of an object. We present an algorithm that accomplishes this and show that it can successfully produce reliable parts decompositions on real range data.
State of the Art in Post-Canny Edge Detection I
Performance characterization of edge detectors
Visvanathan Ramesh, Robert M. Haralick
Edge detection is the most fundamental step in vision algorithms. A number of edge detectors have been discussed in the computer vision literature. Examples of classic edge detectors include the Marr-Hildreth edge operator, the facet edge operator, and the Canny edge operator. Edge detection using morphological techniques is attractive because it can be efficiently implemented in near real-time machine vision systems that have special hardware support. However, little performance characterization of edge detectors has been done; in general, it has been carried out mainly by plotting empirical curves of performance. Quantitative performance evaluation of edge detectors was first performed by Abdou and Pratt. It is the goal of this paper to perform a theoretical comparison of gradient-based edge detectors and morphological edge detectors. By assuming that an ideal edge is corrupted with additive noise, we derive theoretical expressions for the probability of misdetection (the probability of labeling a true edge pixel as a nonedge pixel in the output). Further, we derive theoretical expressions for the probability of false alarm (the probability of labeling a nonedge pixel as an output edge pixel) by assuming that the input to the operator is a region of flat graytone intensity corrupted with additive Gaussian noise of zero mean and variance σ². Even though the blurring step in the morphological operator introduces correlation in the additive noise, we make the approximation that the output samples after blurring are i.i.d. Gaussian random variables with zero mean and variance σ²/M, where M is the window size of the blurring kernel. The false alarm probabilities obtained by using this approximation can be shown to be upper bounds of the false alarm probabilities computed without the approximation. The theory indicates that the blur-min operator is clearly superior when a 3 X 3 window size is used. Since we only have an upper bound for the false alarm probability, the theory is inadequate to confirm the superiority of the blur-min operator. Empirical evaluation of the performance indicates that the blur-min operator is superior to the gradient-based operator. Evaluation of the edge detectors on real images also indicates superiority of the blur-min operator. Application of hysteresis linking after edge detection significantly reduces the misdetection rate, but increases the false alarm rate.
Robust method of edge detection
We present here the theory of developing robust test statistics for edge shape matching in one dimensional signals. We show that an unbiased test can be developed under the assumption of uncorrelated noise and this test can be made optimal and robust to perturbations of the assumed noise distribution under the extra assumption of symmetric noise. This approach to edge detection is believed to overcome the shortcomings of the uncertainty principle in image processing and is appropriate for use when edges of a certain type have to be identified with great accuracy in their location.
Toboggan contrast enhancement
John Fairfield
Toboggan contrast enhancement is a noniterative, parameter-free method with linear execution time for selectively augmenting the contrast of multispectral images of arbitrary dimensionality. It can be adapted to enhance images given any image function embodying one's notion of what constitutes an edge, provided the function monotonically decreases with distance from edges.
Simulation and Visualization Environments for Autonomous Robots
Integrating control, simulation, and planning in MOSIM
Yuval Roth-Tabak, Ramesh C. Jain
Simulations have traditionally been used as off-line tools, for examining process models and experimenting with system models for which it would have been either impossible, too dangerous, expensive, or time-consuming to perform with the physical systems. We propose a novel way of regarding simulations as part of both the development and the working phases of systems. In our approach simulation is used within the processing and control loop of the system to provide sensor and state expectations. This minimizes the inverse sensory data analysis and model maintenance problems. We refer to this mode of operation as the verification mode, in contrast to the traditional discovery mode. This paper describes the integration of control, simulation, and planning within the mobile platform control and simulation interface program (MOSIM). MOSIM is a program which supports the combination of control and simulation of disparate platforms and environments. The main feature of MOSIM is the sensor simulations and the provision for capturing real sensory data and registering the simulated data with it. In order to provide simulations and planning that are intertwined with the control of a physical system, temporal issues have to be considered. By limiting the focus of the system to small portions of complex models which are temporarily relevant to the system's operation, the system is able to maintain its models and respond faster. For this we employ the context-based caching (CbC) mechanism within MOSIM. CbC is a novel knowledge management technique which maintains large knowledge bases by making the necessary information available at the right time.
Computer-graphics-based approach to range sensor simulation
A technique for simulating ultrasonic and laser range transducers is described in this paper. The purpose of this work is to develop a means whereby sensor data can be artificially replaced by simulated data so that navigation and path planning algorithms can be tested off-line. It is important that the simulated range data be generated in real time. Further, the simulation must display actual sensor characteristics such as reflectance. The simulation was implemented on a Silicon Graphics workstation by utilizing the Z-buffer hardware incorporated in the system. The technique is explained in the paper together with a brief discussion of results.
Virtual- and real-world operation of mobile robotic manipulators: integrated simulation, visualization, and control environment
ChuXin Chen, Mohan M. Trivedi
This research is focused on enhancing the overall productivity of an integrated human-robot system. A simulation, animation, visualization, and interactive control (SAVIC) environment has been developed for the design and operation of an integrated robotic manipulator system. This unique system possesses the abilities for multisensor simulation, kinematics and locomotion animation, dynamic motion and manipulation animation, transformation between real and virtual modes within the same graphics system, ease in exchanging software modules and hardware devices between real and virtual world operations, and interfacing with a real robotic system. This paper describes a working system and illustrates the concepts by presenting the simulation, animation, and control methodologies for a unique mobile robot with articulated tracks, a manipulator, and sensory modules.
State of the Art in Post-Canny Edge Detection II
Boundary detection using quadratic filters: performance criteria and experimental assessment
Pietro Perona, Jitendra Malik
It is well known that the projection of depth or orientation discontinuities in a physical scene results in image intensity edges which are not ideal step edges but are more typically a combination of step, peak, and roof profiles. However, most edge detection schemes ignore the composite nature of intensity edges, resulting in systematic errors in detection and localization. We have addressed the problem of detecting and localizing these edges, while at the same time solving the problem of false responses in smoothly shaded regions with constant gradient of the image brightness. We have shown that a class of nonlinear filters, known as quadratic filters, is appropriate for this task, while linear filters are not. In this paper a series of performance criteria are derived for characterizing the SNR, localization, and multiple responses of these quadratic filters in a manner analogous to Canny's criteria for linear filters. Additionally, we show experiments on a series of images varying systematically the parameters of the edge detector.
Generalized adaptive smoothing for multiscale edge detection
Jer-Sen Chen
Discontinuity-preserving smoothing for edge detection is receiving more attention because it does not suffer the disadvantage of linear filtering, which smooths edges as well as noise. Adaptive smoothing is basically a discontinuity-preserving smoothing scheme which preserves edges with gradient magnitude greater than a preset threshold value. However, it suffers from a leaking effect in its long-term iterative behavior: the propagation of smoothing in low contrast areas will sometimes affect the preservation of high contrast edges. This paper proposes a remedy for adaptive smoothing which includes the original image in the iteration process. It provides not only a more stable iterative behavior, but also introduces a new scaling parameter which facilitates multiple scale processing. The study of the iterative behavior of the proposed generalized adaptive smoothing, as well as automatic selection of the number of iterations, is presented.
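The idea of blending the original image back into each smoothing iteration can be sketched in one dimension as follows. This is a minimal illustration of the general scheme, not the paper's exact formulation: the Gaussian conductance weights, the blend weight `lam` (an assumed form of the new scaling parameter), and the function name are all invented for the example.

```python
import numpy as np

def adaptive_smooth(f0, iters=50, k=10.0, lam=0.1):
    """Edge-preserving adaptive smoothing, 1-D sketch.

    Neighbor contributions are weighted by exp(-(gradient/k)^2), so
    high-contrast edges are preserved; mixing the original signal f0 back
    in each iteration (weight lam) stabilizes long-term behavior against
    'leaking' across low-contrast regions.
    """
    f0 = np.asarray(f0, dtype=float)
    f = f0.copy()
    for _ in range(iters):
        left = np.concatenate(([f[0]], f[:-1]))    # replicated boundaries
        right = np.concatenate((f[1:], [f[-1]]))
        w_l = np.exp(-((f - left) / k) ** 2)       # conductance to left neighbor
        w_r = np.exp(-((right - f) / k) ** 2)      # conductance to right neighbor
        smoothed = (f + w_l * left + w_r * right) / (1.0 + w_l + w_r)
        f = lam * f0 + (1.0 - lam) * smoothed      # blend the original back in
    return f
```

On a noisy step signal, many iterations flatten the low-amplitude ripple while the large step survives intact, which is the behavior the abstract describes.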
Assessing the state of the art in edge detection: 1992
Kim L. Boyer, S. Sarkar
Hoping the reader will not find the title overly pompous, we offer a brief and decidedly informal view of the state of the edge detection art, as we see it, in early 1992. We make no claim to clairvoyance, nor even to being especially insightful. But we have looked over the recent literature and made some attempt to evaluate where we are as a community with respect to this most ubiquitous problem and where we should be headed. We also briefly summarize the work of this session and our own recent contributions to compare the spectrum of philosophies represented to the community at large. This paper should be taken in the spirit in which it was written, which is to say not too seriously. Our aim is by no means frivolous, but we did try to have a little fun while dabbling as futurists. The ultimate goal of this paper is to stimulate some interesting interchange not so much on the `how to' of edge detection as on the `what next.'
Planning of Robot Reach/Grasp Operations
Parallel algorithm for computing 3D reachable workspaces
Tarek Khaled Alameldin, Tarek M. Sobh
The problem of computing the 3-D workspace for redundant articulated chains has applications in a variety of fields such as robotics, computer aided design, and computer graphics. The computational complexity of the workspace problem is at least NP-hard. The recent advent of parallel computers has made practical solutions for the workspace problem possible. Parallel algorithms for computing the 3-D workspace for redundant articulated chains with joint limits are presented. The first phase of these algorithms computes workspace points in parallel. The second phase uses workspace points that are computed in the first phase and fits a 3-D surface around the volume that encompasses the workspace points. The second phase also maps the 3-D points into slices, uses region filling to detect the holes and voids in the workspace, extracts the workspace boundary points by testing the neighboring cells, and tiles the consecutive contours with triangles. The proposed algorithms are efficient for computing the 3-D reachable workspace for articulated linkages, not only those with redundant degrees of freedom but also those with joint limits.
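The first phase (computing workspace points in parallel) can be sketched for a planar chain with joint limits: every joint configuration is evaluated independently by forward kinematics, which is what makes the phase embarrassingly parallel. This is an illustrative 2-D stand-in for the paper's 3-D algorithm, vectorized rather than truly parallel, with invented function and parameter names.

```python
import numpy as np

def workspace_points(link_lengths, joint_limits, samples=25):
    """Sample joint angles within their limits and compute reachable
    end-effector points of a planar serial chain (phase-one sketch)."""
    link_lengths = np.asarray(link_lengths, dtype=float)
    grids = [np.linspace(lo, hi, samples) for lo, hi in joint_limits]
    # All joint-angle combinations; each row is one independent evaluation.
    mesh = np.meshgrid(*grids, indexing="ij")
    angles = np.stack(mesh, axis=-1).reshape(-1, len(link_lengths))
    cum = np.cumsum(angles, axis=1)            # absolute orientation per link
    x = (link_lengths * np.cos(cum)).sum(axis=1)
    y = (link_lengths * np.sin(cum)).sum(axis=1)
    return np.column_stack([x, y])
```

For a two-link arm with unit links and full joint ranges, the sampled points fill the disk of radius 2, the expected reachable workspace; a second phase would then fit a boundary around these points.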
Connectionist and neural net implementations of a robotic grasp generator
Sharon A. Stansfield
This paper presents two parallel implementations of a knowledge-based robotic grasp generator. The grasp generator, originally developed as a rule-based system, embodies a knowledge of the associations between the features of an object and the set of valid hand shapes/arm configurations which may be used to grasp it. Objects are assumed to be unknown, with no a priori models available. The first part of this paper presents a `parallelization' of this rule base using the connectionist paradigm. Rules are mapped into a set of nodes and connections which represent knowledge about object features, grasps, and the required conditions for a given grasp to be valid for a given set of features. Having shown that the object and knowledge representations lend themselves to this parallel recasting, the second part of the paper presents a back propagation neural net implementation of the system that allows the robot to learn the associations between object features and appropriate grasps.
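The second implementation described above can be sketched with a small back-propagation network trained on a toy association table. The feature encoding and grasp classes below are invented for illustration, not taken from the paper:

```python
import numpy as np

# Hypothetical association table: object feature vectors -> grasp class.
# Features might encode e.g. [is_cylindrical, is_small, has_flat_face].
X = np.array([[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]], dtype=float)
y = np.array([0, 0, 1, 1])            # 0 = wrap grasp, 1 = pinch grasp
T = np.eye(2)[y]                      # one-hot targets

rng = np.random.default_rng(1)
W1 = rng.normal(0, 0.5, (3, 4))       # input -> hidden weights
W2 = rng.normal(0, 0.5, (4, 2))       # hidden -> output weights
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(3000):                 # plain back propagation
    H = sig(X @ W1)                   # hidden activations
    O = sig(H @ W2)                   # output activations
    dO = (O - T) * O * (1 - O)        # output-layer error signal
    dH = (dO @ W2.T) * H * (1 - H)    # error back-propagated to hidden layer
    W2 -= 0.5 * H.T @ dO
    W1 -= 0.5 * X.T @ dH

pred = np.argmax(sig(sig(X @ W1) @ W2), axis=1)
```

After training, `pred` reproduces the feature-to-grasp associations in the toy table.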
Graphics-based operator control station for local/remote telerobotics
Bruce Bon, John Beahan
A prototype operator control station (OCS) for controlling a remote telerobotic system is being developed and tested. The operational scenario is for an OCS which may be on the ground or in space, controlling a telerobot during the performance of assembly and maintenance tasks in space. The principal goal of this effort is to demonstrate effective remote control of telerobotic operations on realistic tasks, with communications between local and remote sites constrained in both latency and throughput. The prototype OCS will be used to control a telerobot located at the Johnson Space Center (JSC) in Houston, Texas, as well as one co-located with the OCS prototype at JPL. This paper provides an overview of the OCS architecture and operator interface and describes implementation status.
Design and simulation of an articulated surgical arm for guiding stereotactic neurosurgery
A. Majeed Kadi, Lucia J. Zamorano, Matthew P. Frazer, et al.
In stereotactic surgery, the need exists for a means of relating, intraoperatively, the position and orientation of the surgical instrument used by the neurosurgeon to a known frame of reference. An articulated arm is proposed which would provide the neurosurgeon with on-line information on the position and orientation of the surgical tools being moved. The articulated arm has six degrees of freedom, with five revolute joints and one prismatic joint. The design features include an unobstructed field of view, light weight, good balance against gravity, an accuracy of 1 mm spherical error probability (SEP), and a solvable kinematic structure suited to the operating-room environment. The arm can be mounted on either the surgical table or the stereotactic frame. A graphical simulation of the arm was created using the IGRIP simulation package from Deneb Robotics. The simulation demonstrates the use of the arm, mounted at several positions on the ring, reaching various target points within the cranium.
Fuzzy Morphological Neural Networks
icon_mobile_dropdown
Learning and adaptation in fuzzy neural systems
Madan M. Gupta
In recent years, an increasing number of researchers have become involved in the subject of fuzzy neural networks in the hope of combining the reasoning strength of fuzzy logic and the learning and adaptation power of neural networks. This provides a more powerful tool for fuzzy information processing and for exploring the functioning of human brains. In this paper, an attempt has been made to establish some basic models for fuzzy neurons. First, several possible fuzzy neuron models are proposed. Second, synaptic and somatic learning and adaptation mechanisms are proposed. Finally, the possibility of applying nonfuzzy neural networks approaches to fuzzy systems is also described.
Calculus of fuzzy if-then rules and its applications
Lotfi A. Zadeh
In contrast to classical logical systems, fuzzy logic is aimed at a formalization of modes of reasoning which are approximate rather than exact. Basically, a fuzzy logical system may be viewed as a result of fuzzifying a standard logical system. Thus, one may speak of fuzzy predicate logic, fuzzy modal logic, fuzzy default logic, fuzzy multivalued logic, fuzzy epistemic logic, etc. In this perspective, fuzzy logic is essentially a union of fuzzified logical systems, and precise reasoning may be viewed as a special case of approximate reasoning.
Need for fuzzy morphology: erosion as a fuzzy marker
Edward R. Dougherty, Divyendu Sinha
The need for fuzzy mathematical morphology is explained in terms of the need for fuzzy erosion in certain types of applications, especially where erosion is serving as a marker, as with hit-or-miss shape recognition. Since erosion is defined by fitting, there at once arises a need for relating fuzzified set inclusion and mathematical morphology. The result is a very general class of Minkowski algebras based upon an axiomatic description of indicator functions that yield acceptable set-inclusion fuzzifications and a subclass of richer Minkowski algebras resulting from an analytic formulation for indicators that is constrained by the axioms.
Fuzzification of set inclusion
Divyendu Sinha, Edward R. Dougherty
Fuzzification of set inclusion for fuzzy sets is developed in terms of an indicator for set inclusion, the indicator giving the degree to which one fuzzy set is a subset of another. In contrast to most existing indicators, it is proposed in the present paper that the indicator must be two-valued for crisp sets. The approach is to postulate axioms for the indicators, assume a specific mathematical form for such indicators, and then give necessary and sufficient conditions under which the specified formula gives rise to a suitable indicator, in effect providing a characterization of the indicator.
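One simple indicator consistent with the two-valued-on-crisp-sets requirement is the infimum of the Lukasiewicz implication over the universe. This is an illustrative formula, not necessarily the paper's exact construction:

```python
import numpy as np

def inclusion_indicator(a, b):
    """Degree to which fuzzy set A is a subset of fuzzy set B, given
    membership vectors a, b over a common finite universe.  Uses
    inf_x min(1, 1 - a(x) + b(x)); on crisp {0,1}-valued sets this
    reduces to the classical two-valued subset indicator."""
    return float(np.min(np.minimum(1.0, 1.0 - a + b)))

crisp_sub = inclusion_indicator(np.array([1.0, 0.0]), np.array([1.0, 1.0]))
crisp_not = inclusion_indicator(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
fuzzy_deg = inclusion_indicator(np.array([0.8, 0.3]), np.array([0.6, 0.5]))
```

Here `crisp_sub` is 1.0, `crisp_not` is 0.0, and `fuzzy_deg` is about 0.8, a graded degree of inclusion.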
Machine Vision Inspection Techniques II
icon_mobile_dropdown
New techniques for patterned wafer inspection based on a model of human preattentive vision
Virginia H. Brecher
In recent years automated patterned wafer inspection systems have been replacing manual inspection in semiconductor manufacturing facilities. Smaller features and larger devices have made manual inspection inefficient if not impossible, and several commercial systems have been developed to meet the needs of the semiconductor industry. The defect detection techniques used by these systems can be divided into reference comparison and spatial filtering. Reference comparison is the most common; the reference can be another chip on the same wafer or a golden part [4]. Spatial filtering has been successfully applied to the inspection of repetitive patterns [7]. A third inspection technique based on design rule checks is common in printed circuit board inspection but has yet to be successfully applied to integrated circuits, which present a much more complex and variable pattern due to the small feature size and multiple levels. The reference and spatial-filter based automatic systems far exceed the performance of human inspectors. Current systems inspect a single chip in minutes, detecting defects as small as one-half micron, and in the near future machines will take only seconds to inspect a chip for quarter-micron defects. In only one respect does the human inspector's performance exceed that of the machine: to date, no automatic patterned wafer inspection system is capable of distinguishing real defects from nuisance defects such as granularity in metal or polysilicon. Humans, however, are able to detect defects in grainy, textured fields with little difficulty. Furthermore, humans need no knowledge of the pattern to detect obvious anomalies such as breaks or neckdowns in lines. Conversations with people who inspect patterned wafers indicate that they do not serially scan the pattern for defects. Instead they allow their gaze to relax as they move the pattern under the microscope objective.
When a defect comes into the field of view, they report that it "pops out," grabbing their attention. This phenomenon is characteristic of preattentive or effortless vision, a subject that has been extensively studied in psychophysics. Preattentive vision is generally considered to be based on the decomposition of the visual input into a limited set of features which are detected in parallel. Since these operations extend well beyond the foveal or high resolution area of the visual field, one may assume that they are based on lower resolution features. Such parallelism and data reduction imply computationally efficient processing that could be emulated for machine vision pattern recognition purposes. It was with this in mind that we launched a project aimed at developing new pattern inspection techniques based on models of human preattentive vision. This paper will first briefly describe current theories of preattentive vision. It will then outline the model used as a basis on which to develop the pattern inspection techniques. Lastly it will discuss two defect detection techniques, which will be illustrated with examples.
VCM automated 3D measurement system: theory, application, and performance evaluation
Sabry F. El-Hakim, Nicolino J. Pizzi, David B. Westmore
The vision-based coordinate measurement (VCM) automated measurement system has been under development at the National Research Council Canada for several years. The system, which is a multicamera passive system, combines the principles of stereo vision, photogrammetry, knowledge-based techniques, and object-oriented design to provide precise coordinate and dimension measurements of parts for applications such as those found in the aerospace and automobile industries. The system may also be used for tracking or positioning of parts and digitization of targeted objects. Description of the system, the techniques employed for calibration, CAD-based feature extraction and measurement, and performance evaluation are presented.
AGUILA: an automatic tube detection system
Omid Mohtadi, Felix Safar, Jorge L. C. Sanz
This paper discusses a system which uses machine vision algorithms for the detection of tubes in an extremely `noisy' industrial environment. The heart of the algorithm consists of sampling the image in predefined sparse bands perpendicular to the likely orientation of the tubes, followed by the application of a normalized correlation as a detection filter over these bands. After assigning a reliability factor to each local maximum of the correlation function, these points are then mapped to the Hough space to determine the equations of the midlines of the tubes. Given these equations, the number of tubes and any positional anomalies are reported.
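The detection filter can be sketched as a 1-D zero-mean normalized correlation run along one scan band; the tube intensity profile and scanline below are invented for illustration:

```python
import numpy as np

def normalized_correlation(signal, template):
    """Zero-mean normalized correlation of a template at every position
    of a 1-D scan band (as used to detect tube cross-sections along
    bands perpendicular to the expected tube orientation)."""
    t = template - template.mean()
    tn = np.linalg.norm(t)
    n, m = len(signal), len(template)
    out = np.zeros(n - m + 1)
    for i in range(n - m + 1):
        w = signal[i:i + m] - signal[i:i + m].mean()
        d = np.linalg.norm(w) * tn
        out[i] = (w @ t) / d if d > 0 else 0.0
    return out

# Hypothetical scanline: flat background with a tube profile at x = 40.
profile = np.array([0.2, 0.8, 1.0, 0.8, 0.2])
line = np.full(100, 0.1)
line[40:45] += profile
cc = normalized_correlation(line, profile)
peak = int(np.argmax(cc))
```

The correlation peaks (value 1.0) exactly at the tube center, and is invariant to the additive background level because both signals are made zero-mean.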
Reasoning Techniques for Vision Systems
icon_mobile_dropdown
Rule-based automatic segmentation for 3D coronary arteriography
Alok Sarwal, Paul Truitt, Fusun Ozguner, et al.
Coronary arteriography is a technique used for evaluating the state of coronary arteries and assessing the need for bypass surgery and angioplasty. The present clinical application of this technology is based on the use of a contrast medium for manual radiographic visualization. This method is inaccurate due to varying interpretation of the visual results. Quantitation based on coronary arteriography is impractical in a clinical setting without automatic techniques applied to the 3-D reconstruction of the arterial tree. Such a system will provide an easily reproducible method for following temporal changes in coronary morphology. Labeling the arteries and establishing the correspondence between multiple views is necessary for all subsequent processing required for 3-D reconstruction. This work presents a rule-based expert system used for automatic labeling and segmentation of the arterial branches across multiple views. X-ray data of two and three views of human subjects and a pig arterial cast have been used for this research.
Introduction to project ALIAS: adaptive-learning image analysis system
Peter Bock
As an alternative to preprogrammed rule-based artificial intelligence, collective learning systems theory postulates a hierarchical network of cellular automata which acquire their knowledge through learning based on a series of trial-and-error interactions with an evaluating environment, much as humans do. The input to the hierarchical network is provided by a set of sensors which perceive the external world. Using both this perceived information and past experience (memory), the learning automata synthesize collections of trial responses, periodically modifying their memories based on internal evaluations or external evaluations from the environment. Based on collective learning systems theory, an adaptive transputer-based image-processing engine comprising a three-layer hierarchical network of 32 learning cells and 33 nonlearning cells has been applied to a difficult image processing task: the scale, phase, and translation-invariant detection of anomalous features in otherwise `normal' images. Known as the adaptive learning image analysis system (ALIAS), this parallel-processing engine has been constructed and tested at the Research Institute for Applied Knowledge Processing (FAW) in Ulm, Germany under the sponsorship of Robert Bosch GmbH. Results demonstrate excellent detection, discrimination, and localization of anomalies in binary images. Recent enhancements include the ability to process gray-scale images and the automatic supervised segmentation and classification of images. Current research is directed toward the processing of time-series data and the hierarchical extension of ALIAS from the sub-symbolic level to the higher levels of symbolic association.
Bayesian methods for interpretation and control in multiagent vision systems
Finn Verner Jensen, Henrik I. Christensen, Jan Nielsen
Interpretation of images is a context dependent activity and, therefore, saturated with uncertainty. It is outlined how causal probabilistic networks (CPNs) together with strict and efficient Bayesian methods can be used for modeling contexts and for interpretation of findings. For illustration purposes a 2-agent system consisting of an interpreter using a CPN and a findings catcher using an image processor is designed. It is argued that the architecture should be a system of agents with instincts, each of them acting to improve their own situation. Going through an interpretation session, it is shown how the Bayesian paradigm very neatly supports the agents-with-instincts control paradigm such that the system through private benefit maximizing in an efficient way reaches its goal.
Automatic building and supervised discrimination learning of appearance models of 3D objects
Richard L. Delanoy, Jacques G. Verly, Dan E. Dudgeon
Mechanisms for automatically building and refining appearance models (AMs) of 3-D objects are presented. AMs encode allowed ranges of values of target characteristics called attributes. Allowed values for each attribute of arbitrarily defined parts of a modeled object are determined by statistical analysis of an example set of known targets. Once models are built, the system learns which attributes are discriminating (important to making a correct identification) from mistakes made on a set of training data. In discrimination learning, a weight associated with an attribute is increased or decreased whenever a test for an attribute denies or supports an incorrect object identification, respectively. A consistently decreasing weight eventually results in the essential elimination of the associated attribute from the AM. We illustrate and evaluate this approach in the context of our work in automatic target recognition (ATR).
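The discrimination-learning rule described above might be sketched as follows; the attribute names, learning rate, and exact update form are assumptions for illustration, not the paper's precise scheme:

```python
def discrimination_update(weights, test_results, lr=0.1):
    """One learning step after an incorrect object identification: an
    attribute whose test denied the wrong hypothesis gains weight, and
    one whose test supported it loses weight.  A weight driven to zero
    effectively removes the attribute from the appearance model."""
    for attr, supported_wrong_id in test_results.items():
        weights[attr] += -lr if supported_wrong_id else lr
        weights[attr] = max(0.0, weights[attr])   # keep weights nonnegative
    return weights

# Hypothetical attributes of an appearance model after one mistake:
w = discrimination_update({"length": 0.5, "aspect": 0.5},
                          {"length": True, "aspect": False})
```

Here `length` supported the incorrect identification and is down-weighted to 0.4, while `aspect` denied it and rises to 0.6.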
"Functant" in a functional model: a theoretical consideration of reasoning about shape, structure, and function
Tadahiro Kitahashi, Masaya Ashida, Sei'ichiro Dan, et al.
Object models in image understanding systems are conventionally represented by geometric features of objects, based on shape or on a configuration of parts with each part defined by its shape. Shape-based models such as these are useful for matching to the results of image processing for recognition purposes. Shape-based models, however, are so specific to individual objects that a large number of object models would be required to ensure robust performance in a vision system. By contrast, many instances of an object can often share a single model if they are represented by their function. This is the great advantage of the functional approach to representation. An essential task of a mobile robot's vision system is to find free space to move through; free space is a functional expression of a broadly defined road, in much the same way that a space of suitable size and configuration for a person to sit down is a functional expression of a chair. The disadvantage of using only the function-based representation of objects is that the results of processing an image are usually described by geometric features, and it is not necessarily easy to match these features to the corresponding functional representations. What is needed is an inference scheme which can deduce shape from a functional description. In this way the functional representation will provide a generic framework for describing object models. However, there has been essentially no investigation of the deduction of shapes and structures from functions, that is, of how functions can be related to the shape, structure, and size of objects. This is the essence of the research presented here.
Exploration of "Recognition by Components" I
icon_mobile_dropdown
From image edges to geons to viewpoint-invariant object models: a neural net implementation
Irving Biederman, John E. Hummel, Peter C. Gerhardstein, et al.
Three striking and fundamental characteristics of human shape recognition are its invariance with viewpoint in depth (including scale), its tolerance of unfamiliarity, and its robustness with respect to the actual contours present in an image (as long as the same convex parts [geons] can be activated). These characteristics are expressed in an implemented neural network model (Hummel & Biederman, 1992) that takes a line drawing of an object as input and generates a structural description of geons and their relations which is then used for object classification. The model's capacity for structural description derives from its solution to the dynamic binding problem of neural networks: independent units representing an object's parts (in terms of their shape attributes and interrelations) are bound temporarily when those attributes occur in conjunction in the system's input. Temporary conjunctions of attributes are represented by synchronized activity among the units representing those attributes. Specifically, the model induces temporal correlation in the firing of activated units to: (1) parse images into their constituent parts; (2) bind together the attributes of a part; and (3) determine the relations among the parts and bind them to the parts to which they apply. Because it conjoins independent units temporarily, dynamic binding allows tremendous economy of representation, and permits the representation to reflect an object's attribute structure. The model's recognition performance conforms well to recent results from shape priming experiments. Moreover, the manner in which the model's performance degrades due to accidental synchrony produced by an excess of phase sets suggests a basis for a theory of visual attention.
Probabilistic approach to 3D inference of geons from a 2D view
Alain Jacot-Descombes, Thierry Pun
A new, probabilistic approach for inferring 3-D volumetric primitives from a single 2-D view is presented. This recognition relies on the assumption that every object can be decomposed into component parts that belong to a finite set or alphabet of volumetric primitives (geons). For each possible primitive from the permissible set, a conditional probability function is computed. This law specifies the probability of obtaining the primitive given an observable 2-D measure or feature. The distribution functions are determined by simulation, on the basis of a representative number of random projections of the primitives. The measures themselves are chosen in such a way that they can easily be extracted from real images and their discriminative power for the volumetric primitive inference is high. Examples illustrate the proposed approach.
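The simulate-then-invert scheme can be sketched with Bayes' rule on a discrete feature. The two-geon alphabet, the feature, and the view models below are all hypothetical stand-ins for the paper's primitives and 2-D measures:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_feature(geon, n):
    """Draw a discrete 2-D feature (say, a quantized silhouette-corner
    count in {0, 2, 4}) from an assumed random-view model of each geon."""
    if geon == "cylinder":
        return rng.choice([0, 2, 4], size=n, p=[0.5, 0.4, 0.1])
    return rng.choice([0, 2, 4], size=n, p=[0.05, 0.25, 0.7])  # "brick"

geons = ["cylinder", "brick"]
# Estimate P(feature | geon) by simulation over many random projections.
lik = {g: np.bincount(simulate_feature(g, 10_000), minlength=5)[[0, 2, 4]] / 10_000
       for g in geons}

def posterior(feature_idx, prior=(0.5, 0.5)):
    """P(geon | observed feature) by Bayes' rule."""
    p = np.array([lik[g][feature_idx] * pr for g, pr in zip(geons, prior)])
    return p / p.sum()

post = posterior(0)   # observe the feature value most typical of cylinders
```

Observing the cylinder-typical feature yields a posterior strongly favoring the cylinder primitive.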
Recognition of geometric primitives using logic-program and probabilistic-network reasoning methods
Roger C. Munck-Fairwood
This paper addresses the issue of recognition of 3-D objects from a potentially very large database of categories of objects, assuming the data are provided in the form of the edges available from a single monocular view, which indicate the discontinuities in depth and surface orientation. The work is partly inspired by the `Recognition by Components' approach suggested fairly recently by Irving Biederman using `geons,' chosen for their qualitatively distinguishable nonmetric viewpoint-invariant properties. The work is also inspired by Richard Gregory's model of human visual recognition which involves probabilistic reasoning, and the regarding of perception as hypothesis. Further, the interpretation of some data can influence the expectation of other data. A novel attempt is made here to apply two automatic reasoning tools to a sub-task of the general recognition process, viz., the recognition of isolated geons in an idealized image. The tools are logic programming and `belief networks' (causal probabilistic networks). Both the tools have the important property of allowing propagation of information in both directions, i.e., data to hypotheses, and vice-versa. The results to date show good patterns of reasoning consistent with one's intuition and point to the possibility of appropriately `tuning' some feature detectors according to other data received. Future goals include the recognition of geons from real gray-level image data, the extension of the belief network to composite objects, and the use of a reverse-driven image analysis logic program to generate graphics and thereby identify appropriate model constraints.
Exploration of "Recognition by Components" II
icon_mobile_dropdown
Obtaining generic parts from range data using a multiview representation
Narayan S. Raja, Anil K. Jain
A method is proposed for obtaining a generic-part based 3-D object representation from range images. A small number of 3-D shape types based on geons are the basic part primitives. Surface segmentation of the range image is performed based on the method of Hoffman and Jain. The surfaces are classified into five types based on their principal curvatures, and adjacent surfaces belonging to different parts are separated by examining the surface types and the angle between them, and referring to the characteristic views stored in a small catalog of views of the part primitives. Finally, each part is identified by drawing inferences from surface types, as well as from the distribution of the angles between the part's principal axis and the normals on the side surfaces of that part.
Unified approach to the recognition of expected and unexpected geon-based objects
Sven J. Dickinson, Alexander P. Pentland
We present an approach to two problems in 3-D object recognition from a single 2-D image: the problem of recognizing an unexpected object from a large database and the problem of searching the image for a particular object (expected object recognition). Most work in 3-D object recognition has focused on the latter problem, with few expected object recognition systems able to scale to larger databases. Avoiding the large indexing ambiguity requires the use of more discriminating image primitives than are typically employed. In previous work, we describe the representation and recovery of high-level indexing structures composed of volumetric primitives. In this paper, we describe a recognition strategy that, integrated with our shape recovery strategy, supports the recognition of both unexpected and expected objects. Unexpected object recognition is formulated as a matching of recovered 3-D interpretations of the image to object models, while expected object recognition uses knowledge of the target object to constrain both the matching and shape recovery processes.
Representation and Matching
icon_mobile_dropdown
Object segmentation for helicopter guidance
Banavar Sridhar, Gano B. Chatterji
Electro-optical sensors can be used to compute range to objects in the flight path of a helicopter. The range depends on the computation of optical flow/motion at different points in the image. The motion algorithms provide a sparse set of ranges as a function of azimuth and elevation. For guidance and display purposes, this discrete set of ranges needs to be grouped into sets which correspond to objects in the real world. The requirements of the guidance problem are different from the range segmentation problem encountered in robotics, where the primary concern is to determine the shape of a surface to identify an object. The emphasis in this paper is on using clustering techniques on the range information provided by motion algorithms, together with the spatial relation provided by the static image, to perform object segmentation. The range values are initially grouped into clusters based on depth. Subsequently, the clusters are modified by using the k-means algorithm in the inertial horizontal plane and the minimum-spanning-tree algorithm in the image plane.
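The two clustering stages can be sketched as a depth-gap grouping followed by a plain Lloyd's k-means refinement; the gap threshold and sample ranges are hypothetical:

```python
import numpy as np

def depth_clusters(ranges, gap=2.0):
    """Initial grouping: sort the sparse range values and start a new
    cluster wherever the jump in depth exceeds `gap` (a hypothetical
    threshold)."""
    order = np.argsort(ranges)
    labels = np.empty(len(ranges), dtype=int)
    labels[order[0]] = 0
    c = 0
    for prev, cur in zip(order, order[1:]):
        if ranges[cur] - ranges[prev] > gap:
            c += 1
        labels[cur] = c
    return labels

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means, standing in for the refinement step in the
    horizontal plane."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)].astype(float)
    for _ in range(iters):
        d = np.linalg.norm(points[:, None] - centers[None], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = points[assign == j].mean(axis=0)
    return assign, centers

# Hypothetical sparse ranges: a near obstacle and a far one.
ranges = np.array([5.1, 5.3, 5.2, 20.0, 20.4, 19.8])
labels = depth_clusters(ranges)
```

`labels` separates the near and far returns; `kmeans` would then refine each group using the points' horizontal positions.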
DIAC object recognition system
Johannes Buurman
This paper describes the object recognition system used in an intelligent robot cell. It recognizes parts as they enter the cell and estimates their position and orientation. The parts are mostly metal and consist of polyhedral and cylindrical shapes. The system uses feature-based stereo vision to acquire a wireframe of the observed part. Features are defined as straight lines and ellipses, which lead to a wireframe of straight lines and circular arcs (the latter using a new algorithm). This wireframe is compared to a number of wireframe models obtained from the CAD database. Experimental results show that image processing hardware and parallelization may add considerably to the speed of the system.
Correspondence using property coherence
This paper presents a generalization of the correspondence approach of Sethi and Jain by extending the path coherence criterion to high dimensional vector space. This allows the same correspondence procedure to be used for a variety of tokens including points, lines, planes and regions. To demonstrate the generalized approach, we apply it to track lines and present experimental results.
Building hierarchical vision model of objects with multiple representations
Bir Bhanu, Chih-Cheng Ho
Building a hierarchical vision model of an object with multiple representations requires two steps: (1) decomposing the object into parts/subparts and obtaining appropriate representations, and (2) constructing relational links between decomposed parts/subparts obtained in step 1. In this paper, we describe volume-based decomposition and surface-based decomposition of 3-D objects into parts, where the objects are designed by a B-spline based geometric modeler called Alpha-1. Multiple-representation descriptions can be derived for each of these subparts using various techniques such as polygonal approximation, concave/convex edge detection, curvature extrema, and surface normals. For example, subparts of a hammer can be described by two generalized cylinders or one generalized cylinder and one polyhedron. Several examples are presented.
Reactive Robotic Control Strategies
icon_mobile_dropdown
Active avoidance: escape and dodging behaviors for reactive control
Ronald C. Arkin, William M. Carter
New methods for producing avoidance behavior among moving obstacles within the context of reactive robotic control are described. These specifically include escape and dodging behaviors. Dodging is concerned with the avoidance of a ballistic projectile while escape is more useful within the context of chase. The motivation and formulation of these new reactive behaviors are presented. Simulation results of a robot in a cluttered and moving world are also provided.
Active vision for target pursuit by a mobile robot
Rajeev Sharma
We discuss and demonstrate the advantages of developing active vision techniques as an integral part of a mobile robot behavior. In particular, different visual motion analysis modules needed for pursuing a moving target are summarized. A detailed solution is then presented for the active detection of independent motion to illustrate the methodology.
Controlling reactive behavior with consistent world modeling and reasoning
Akram A. Bou-Ghannam
Based on the philosophical view of reflexive behaviors and cognitive modules working in a complementary fashion, this paper proposes a hybrid decomposition of the control architecture for an intelligent, fully autonomous mobile robot. This architecture follows a parallel distributed decomposition and supports a hierarchy of control with lower-level reflexive type behaviors working in parallel with higher-level planning and map building modules. The behavior-based component of the system provides the basic instinctive competences for the robot while the cognitive part performs higher machine intelligence functions such as planning. The interface between the two components utilizes motivated behaviors implemented as part of the behavior-based system. A motivated behavior is one whose response is dictated mainly by the internal state (or the motivation state) of the robot. Thus, the cognitive planning activity can execute its plans by merely setting the motivation state of the robot and letting the behavior-based subsystem worry about the details of plan execution. The goal of such a hybrid architecture is to gain the real-time performance of a behavior-based system without losing the effectiveness of a general purpose world model and planner. We view world models as essential to intelligent interaction with the environment, providing a `bigger picture' for the robot when reactive behaviors encounter difficulty. We describe a live experimental run of our robot under hybrid control in an unknown and unstructured lab environment. This experiment demonstrated the validity of the proposed hybrid control architecture and the sensory knowledge integrator (the underlying model for the map-builder module) for the task of mapping the environment. Results of the emergent robot behavior and different map representations of the environment are presented and discussed.
Real-time reactive model for mobile robot architecture
Arcot Sowmya
Conventional robot designs have used a functional decomposition of modules to drive the robot architecture. In recent times, an alternative behavioral decomposition has been advocated by Brooks as a basis for mobile robot design. In this model, the set of desired behaviors of the mobile robot dictate the architecture. We have earlier proposed a formal model of mobile robot architecture, which is totally compatible with the new behavior-based approach, and is transformable to a behavior-based design. This paper describes a scheme for the transformation of a robot specification in our approach into an implementable design.
Image Processing Techniques
icon_mobile_dropdown
Optimal decomposition of arbitrary-shaped structuring elements into neighborhood subsets
Xiaoyi Jiang, Horst Bunke
To efficiently perform morphological operations on neighborhood-processing-based parallel image computers, we need to decompose structuring elements larger than the neighborhood that can be directly handled into neighborhood subsets. In this paper we give an algorithm for the optimal decomposition of arbitrary-shaped structuring elements.
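The payoff of such a decomposition can be illustrated with the simplest case: dilation by a large square structuring element equals a chain of dilations by the 3x3 neighborhood the hardware can apply directly. The tiny dilation routine below is an illustrative stand-in, not the paper's algorithm (which handles arbitrary shapes optimally):

```python
import numpy as np

def dilate(img, se):
    """Binary dilation of img by an odd-sized structuring element se,
    origin at the SE center."""
    H, W = img.shape
    h, w = se.shape
    pad = np.pad(img, ((h // 2, h // 2), (w // 2, w // 2)))
    out = np.zeros_like(img)
    for i in range(h):
        for j in range(w):
            if se[i, j]:
                out |= pad[i:i + H, j:j + W]
    return out

se3 = np.ones((3, 3), dtype=bool)
se5 = np.ones((5, 5), dtype=bool)

img = np.zeros((11, 11), dtype=bool)
img[5, 5] = True
once = dilate(img, se5)                 # the large SE applied directly
twice = dilate(dilate(img, se3), se3)   # decomposed: two 3x3 passes
```

Because the 5x5 square is the Minkowski sum of the 3x3 square with itself, the two results are identical, and only 3x3 neighborhood operations were needed.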
Noise-model-based morphological shape recognition
Edward R. Dougherty, Dongming Zhao
A classical morphological technique for shape recognition is the hit-or-miss transform. In essence, there are two structuring elements for each shape, one to fit inside and one to fit outside. These structuring-element pairs are chosen so that there will be a `hit' and a `miss' if and only if the appropriate shape appears. The problem is to design structuring-element pairs that yield acceptable recognition rates. This can be especially difficult if some shapes are close to one another and the shapes are random (noisy). The present paper analyzes the problem by adopting a shape-noise model that represents both the structures of the individual shapes and edge indeterminacy. For direct application to a given system, the model parameters must be estimated statistically. Optimal shape recognition is characterized in terms of the model. The advantage of the new approach is that it provides an environment for the machine design of optimal structuring elements for shape recognition.
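A minimal sketch of the classical hit-or-miss transform the abstract builds on: erode the image by the `hit' element, erode the complement by the `miss' element, and intersect. Function names are illustrative; pixels whose structuring element extends off the image are conservatively rejected.

```python
import numpy as np

def erode(img, se):
    # Binary erosion: 1 where the structuring element (a list of
    # (dy, dx) offsets) fits entirely inside the foreground.
    H, W = img.shape
    out = np.zeros_like(img)
    for y in range(H):
        for x in range(W):
            ok = True
            for dy, dx in se:
                yy, xx = y + dy, x + dx
                if not (0 <= yy < H and 0 <= xx < W) or img[yy, xx] == 0:
                    ok = False
                    break
            out[y, x] = 1 if ok else 0
    return out

def hit_or_miss(img, hit_se, miss_se):
    # 'hit' element must fit the foreground AND 'miss' element must
    # fit the background (the complement) at the same pixel.
    return erode(img, hit_se) & erode(1 - img, miss_se)

# Example shape: an isolated single pixel. Hit = the pixel itself,
# miss = its 8-neighborhood.
img = np.zeros((5, 5), dtype=int)
img[2, 2] = 1                    # isolated pixel -> should be detected
img[0, 0] = img[0, 1] = 1        # two-pixel blob -> should not
hit = [(0, 0)]
miss = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
result = hit_or_miss(img, hit, miss)
```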
Spatiotemporal edge focusing with multiple-pixel interframe motion
A. S. Young, John L. Barron
Previous work has resulted in a spatio-temporal edge focusing algorithm for computing noiseless, well-localized edge maps by tracking edges in both scale space and time. As such, the algorithm was an extension of spatial edge focusing. In this paper, we show how the 1-pixel motion constraint used in the temporal tracking component of the original spatio-temporal edge focusing algorithm can be removed to allow multiple-pixel motion between adjacent frames. Our new algorithm is based on a simple Hough transform computation. The final result is an edge detection technique that uses three adjacent images from a sequence to produce a well-localized, noiseless edge map for the middle image. A noise model is not explicitly required; rather, we define an authentic edge as an intensity discontinuity that exists over a number of scales and is temporally connected in three adjacent images.
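The Hough-style computation can be sketched as displacement voting (a simplified stand-in for the authors' algorithm, not its actual implementation): each edge pixel in one frame votes for every candidate displacement that lands on an edge pixel in the next frame, and the accumulator peak gives the dominant interframe motion.

```python
from collections import Counter

def dominant_shift(edges_a, edges_b, max_disp=3):
    # Hough-style accumulator over candidate (dy, dx) displacements:
    # every edge pixel in frame A votes for each displacement within
    # +/- max_disp that lands on an edge pixel in frame B.
    b = set(edges_b)
    votes = Counter()
    for (y, x) in edges_a:
        for dy in range(-max_disp, max_disp + 1):
            for dx in range(-max_disp, max_disp + 1):
                if (y + dy, x + dx) in b:
                    votes[(dy, dx)] += 1
    # The accumulator peak is the dominant interframe motion.
    return votes.most_common(1)[0][0]

# A vertical edge that moves 2 pixels to the right between frames:
frame_a = [(y, 5) for y in range(10)]
frame_b = [(y, 7) for y in range(10)]
shift = dominant_shift(frame_a, frame_b)
```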
Sampling and surface reconstruction with adaptive-size meshes
Wen-Chen Huang, Dmitry B. Goldgof
This paper presents a new approach to sampling and surface reconstruction that uses physically based models. We introduce adaptive-size meshes, which automatically update the size of the meshes as the distance between the nodes changes. We have applied the adaptive-size algorithm to three applications: (1) sampling of intensity data; (2) surface reconstruction of range data; and (3) surface reconstruction of 3-D computed tomography (CT) left-ventricle (LV) data. The LV data was acquired with a 3-D CT scanner, was provided by Dr. Eric Hoffman at the University of Pennsylvania Medical School, and consists of 16 volumetric (128 X 128 X 118) images taken through the heart cycle.
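The automatic size update can be illustrated in its simplest 1-D form (a hypothetical sketch, not the paper's physically based mesh model): any edge longer than a threshold is split at its midpoint, and passes repeat until the mesh conforms; merging closely spaced nodes is the symmetric operation.

```python
def refine(nodes, dmax):
    # Repeatedly split every edge longer than dmax at its midpoint
    # until no edge exceeds the threshold. `nodes` is a sorted list
    # of 1-D node positions.
    changed = True
    while changed:
        changed, out = False, [nodes[0]]
        for x in nodes[1:]:
            if x - out[-1] > dmax:
                out.append((out[-1] + x) / 2.0)  # insert a midpoint node
                changed = True
            out.append(x)
        nodes = out
    return nodes

mesh = refine([0.0, 8.0], 2.0)
```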
Machine Vision Inspection Techniques III
Rotation measurement techniques for alignment in PCB automated inspection
Arturo A. Rodriguez, Frederick Y. Wu, Jon R. Mandeville
Part alignment is an essential element of automated inspection of printed circuit boards. The objective of a part alignment procedure is to measure the global rotation of the part under inspection and the location of its center to properly orient and position the part prior to inspection. This paper describes a set of techniques that effectively accomplishes this task. We describe techniques to rapidly detect fiducials and to accurately measure their location. Measurement of rotation is obtained by mapping a quadrilateral in 2-D measurement space onto a rectangle. This solution compensates for minor global variations in part height and provides an analytical equation to measure the rotation of the part that is independent of knowledge of the physical coordinates of the fiducials.
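As a simplified stand-in for the paper's quadrilateral-to-rectangle mapping, global rotation and center can be recovered from measured versus nominal fiducial coordinates by least-squares rigid alignment (the Kabsch/Procrustes method); this sketch ignores the part-height compensation the paper's analytical solution provides.

```python
import numpy as np

def rigid_align(nominal, measured):
    # Least-squares rotation taking nominal fiducial coordinates onto
    # measured ones; returns the rotation angle (radians) and the
    # measured center.
    P = np.asarray(nominal, float)
    Q = np.asarray(measured, float)
    Pc, Qc = P - P.mean(0), Q - Q.mean(0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    theta = np.arctan2(R[1, 0], R[0, 0])
    return theta, Q.mean(0)

# Four nominal fiducials at the corners of a rectangle, measured after
# a 5-degree rotation and a translation:
theta_true = np.deg2rad(5.0)
c, s = np.cos(theta_true), np.sin(theta_true)
Rt = np.array([[c, -s], [s, c]])
nominal = [(0.0, 0.0), (100.0, 0.0), (100.0, 60.0), (0.0, 60.0)]
measured = (np.array(nominal) @ Rt.T) + np.array([12.0, -3.0])
theta, center = rigid_align(nominal, measured)
```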
Comparison of multiresolution morphological and Laplacian techniques for automated inspection
Jeffrey M. Seaton, A. Lynn Abbott
This paper concerns the analysis of images at multiple scales of spatial resolution. We describe and compare two methods of generating hierarchical image representations (called pyramids) which are based on changes in image resolution. The first method uses the tools of mathematical morphology to derive image pyramids, while the second is based on well-known linear filtering techniques to create Gaussian and Laplacian pyramids. Both methods involve the successive removal of high-frequency components from an image so that an ordered sequence of low-pass filtered, subsampled images results. It is also possible in each case to use differences of adjacent levels in a low-pass pyramid to generate a band-pass pyramid of images. The resulting hierarchy of images is well suited to applications such as automated industrial inspection, since components of interest can be expected to appear at particular levels of the hierarchy. A problem with the linear method is that image features are blurred by the smoothing. This drawback is not present in the morphological technique, which depends on nonlinear, set-theoretic transformations. This paper describes these two methods for generating image pyramids and compares the results of each in the inspection of a printed circuit board.
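Both constructions follow the same smooth-then-subsample recipe, differing only in the smoothing operator. A minimal sketch, with a 3x3 box filter standing in for the Gaussian kernel and a 3x3 grayscale opening for the morphological filter (illustrative only, not the paper's exact filters):

```python
import numpy as np

def box_smooth(img):
    # 3x3 mean filter (a stand-in for a Gaussian kernel), edge-padded.
    p = np.pad(img, 1, mode='edge')
    return sum(p[y:y + img.shape[0], x:x + img.shape[1]]
               for y in range(3) for x in range(3)) / 9.0

def morph_open(img):
    # 3x3 grayscale opening: erosion (min filter) then dilation (max
    # filter). Nonlinear, so small bright features are removed rather
    # than blurred.
    def filt(a, fn):
        p = np.pad(a, 1, mode='edge')
        s = np.stack([p[y:y + a.shape[0], x:x + a.shape[1]]
                      for y in range(3) for x in range(3)])
        return fn(s, axis=0)
    return filt(filt(img, np.min), np.max)

def pyramid(img, smooth, levels=3):
    # Low-pass pyramid: smooth, then subsample by 2 at each level.
    # Differences of adjacent levels would give the band-pass pyramid.
    out = [img]
    for _ in range(levels - 1):
        out.append(smooth(out[-1])[::2, ::2])
    return out

img = np.zeros((8, 8))
img[3, 3] = 1.0                       # an isolated bright speck
gauss_pyr = pyramid(img, box_smooth)  # blurs the speck across levels
morph_pyr = pyramid(img, morph_open)  # opening removes it outright
```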
Transputer-based system for the visual inspection of surface mount assemblies
Paul Netherwood, Peter Barnwell, Peter Forte
This paper describes a project, the aim of which is to develop a low-cost machine vision system for the inspection of surface mount electronic assemblies and solder joints. The system uses a simple optical system connected to sophisticated machine vision software running on a transputer network. This allows high-speed inspection at a lower cost than existing x-ray or laser techniques.
Scratch measurement system using machine vision: part II
Aircraft skins and windows must not have scratches, which are unacceptable for cosmetic and structural reasons. Manual methods do not give accurate readings and provide no hardcopy report. A prototype scratch measurement system (SMS) using computer vision and image analysis has been developed. This paper discusses the prototype description, novel ideas, improvements, repeatability, reproducibility, accuracy, and the calibration method. Boeing's Calibration Certification Laboratory has given the prototype a qualified certification. The SMS is portable for use in factories or aircraft hangars anywhere in the world.
Machine vision inspection of component footprints in PCB artwork
A new printed circuit board (PCB) inspection scheme designed to perform automated inspection of component sites is presented. The artwork for a board is digitized via a scanner (300 dots per inch) and stored in a tagged image file format (TIFF) file. At inspection time, after the compressed TIFF file is read and decoded, the PCB is graphically displayed on a PC screen together with a command menu. The user is then taken through a series of menu-driven commands to check the dimensions of the center-to-center distance between holes and pads for electronic components on the PCB. The significance of this research is in the adaptation of the UNION-FIND procedure to develop a robust algorithm that segments an image into component objects and background to facilitate dimensional analysis and inspection of the artwork.
Image Processing Techniques
Pose determination of cylinders, cones, and spheres from perspective projections
Yiu Cheung Shiu, Hanqi Zhuang
This paper addresses the computation of the poses of spheres, right-circular cylinders, and right-circular cones from a single view. Geometric information about these objects is known a priori, as would be the case in model-based vision applications. The 3-D poses of these objects can be computed from the elliptical projection of the sphere or the elliptical projections of the circular surfaces of a cylinder or cone. For the cylinder and cone, the poses can also be computed from their linear extremal contours. This paper employs simple analytic geometric techniques that address the uniqueness of the solutions as well as the geometric interpretations of the non-unique solutions. For the sphere, the pose can be uniquely determined by its elliptical projection. Using a planar circular surface from either the cone or the cylinder, two solutions to the position and orientation of the circle are possible. From the linear extremal contours of a cylinder, its pose can be computed uniquely if the junctions are visible in the image; otherwise the position has one degree of freedom along the cylinder axis. From the linear contours of a cone, its pose can also be solved uniquely unless the junctions are occluded, in which case there are two possible solutions for the orientation and the position has one degree of freedom along the line joining the projection center to the image apex position.
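The two-fold orientation ambiguity for a projected circle is easy to see in the orthographic approximation (a simplification of the paper's perspective analysis): a circle of radius r tilted by angle t projects to an ellipse with semi-axes (r, r cos t), so the measured axis ratio fixes the tilt only up to sign.

```python
import numpy as np

def circle_tilt_from_ellipse(a, b):
    # Under orthographic projection, a circle tilted by angle t projects
    # to an ellipse with semi-axes (r, r*cos t). Given measured semi-axes
    # a >= b, the tilt is arccos(b/a) -- up to the two-fold mirror
    # ambiguity noted in the abstract.
    t = np.arccos(b / a)
    return t, -t   # the two mirror-symmetric orientation solutions

# An ellipse with axis ratio 1:2 corresponds to a 60-degree tilt:
tilt_pos, tilt_neg = circle_tilt_from_ellipse(2.0, 1.0)
```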
How To Design a Robot "Head" II
Lightweight camera head for robotic-based binocular stereo vision: an integrated engineering approach
John R. G. Pretlove, Graham A. Parker
This paper presents the design and development of a real-time eye-in-hand stereo-vision system to aid robot guidance in a manufacturing environment. The stereo vision head comprises a novel camera arrangement with servo-controlled vergence, focus, and aperture that continuously provides high-quality images to a dedicated image processing system and parallel processing array. The stereo head has four degrees of freedom, but it relies on the robot end-effector for all remaining movement. This provides the robot with exploratory sensing abilities, allowing it to undertake a wider variety of less constrained tasks. Unlike other stereo vision research heads, the overriding factor in the Surrey head has been a truly integrated engineering approach to an extremely complex problem. The head is low cost and low weight, employs state-of-the-art motor technology, is highly controllable, and occupies a small size envelope. Its intended applications include high-accuracy metrology, 3-D path following, object recognition and tracking, parts manipulation, and component inspection for the manufacturing industry.