Proceedings Volume 2588

Intelligent Robots and Computer Vision XIV: Algorithms, Techniques, Active Vision, and Materials Handling

David P. Casasent
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 3 October 1995
Contents: 10 Sessions, 70 Papers, 0 Presentations
Conference: Photonics East '95 (1995)
Volume Number: 2588

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Pattern Recognition in Intelligent Robotics
  • Active Vision and Control
  • Color Image Processing
  • Systems and Applications
  • Robotic Navigation, Path Planning, and Use of Motion and Image Sequences
  • 3D Modeling and Data Processing for Robotics and Machine Vision
  • 3D Pattern Recognition for Intelligent Robots
  • Neural Nets for Pattern Recognition and Machine Vision
  • Feature Extraction (2D and 3D)
  • Detection and Segmentation in Machine Vision
Pattern Recognition in Intelligent Robotics
Detection algorithm fusion concepts for computer vision
David P. Casasent, Anqi Ye, Ashit Talukder
We consider detection (locating all objects in a scene) independent of object distortions and contrast differences, and in the presence of clutter. We employ several new detection algorithms and fuse (combine) the outputs from the different algorithms to reduce false alarms. We describe a new peak-sorting detection scoring algorithm and three fusion algorithms for combining the results from different detectors: binary, analog, and hierarchical fusion. Quantitative data are presented for a distortion-invariant six-class object set; the objects have a wide range of contrasts, include obscured objects, and appear in severe clutter.
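As a rough illustration of how per-detector outputs might be combined, here is a minimal sketch of binary (voting) and analog (score-averaging) fusion of detection score maps. The function name and interface are hypothetical, not the authors' implementation, and hierarchical fusion is omitted.

```python
import numpy as np

def fuse_detections(score_maps, mode="binary", threshold=0.5, min_votes=2):
    """Fuse per-pixel score maps from several detectors (hypothetical interface).

    score_maps : list of 2D arrays in [0, 1], one per detection algorithm.
    binary  -- each detector votes above `threshold`; keep pixels with at
               least `min_votes` agreeing detectors.
    analog  -- average the raw scores and threshold the mean.
    """
    stack = np.stack(score_maps)                  # (n_detectors, H, W)
    if mode == "binary":
        votes = (stack > threshold).sum(axis=0)
        return votes >= min_votes
    if mode == "analog":
        return stack.mean(axis=0) > threshold
    raise ValueError(mode)
```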
Target segmentation in IR imagery using a wavelet-based technique
Segmentation of ground-based targets embedded in clutter, obtained by airborne infrared (IR) imaging sensors, is one of the challenging problems in automatic target recognition. In this paper a new texture-based segmentation technique is presented that uses the statistics of the 2D wavelet decomposition components of local sections of the image. A measure of statistical similarity is then used to segment the image and separate the target from the background. The technique has been applied to a set of real sequential IR imagery and has been shown to produce a high degree of segmentation accuracy across varying ranges.
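A hedged sketch of the kind of wavelet statistics such a technique could use, assuming the PyWavelets package is available; the feature set (per-subband mean and standard deviation) and the distance-based similarity test are illustrative choices, not the paper's exact measure.

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def wavelet_texture_features(patch, wavelet="db2"):
    """Statistics of the 2D wavelet decomposition of a local image section."""
    cA, (cH, cV, cD) = pywt.dwt2(patch.astype(float), wavelet)
    return np.array([s for band in (cA, cH, cV, cD)
                     for s in (band.mean(), band.std())])

def similar(f1, f2, tol=1.0):
    """Crude statistical-similarity test between two local sections."""
    return np.linalg.norm(f1 - f2) < tol
```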
Feature extraction of articulated objects in an active vision environment
Patrick S. P. Wang
This paper deals with feature extraction of 3D articulated objects in an active vision environment, in which the object and/or the observer can move or rotate while, at the same time, the articulated object itself can change status. An articulated object is divided into two portions: a main rigid portion and an articulated portion. It is more complicated than a `rigid' object in that the relative positions, shapes, or angles between the main portion and the articulated portion have essentially infinite variations, in addition to the infinite variations of each individual rigid portion due to orientations, rotations, and topological transformations. A new method generalized from linear combination is employed to investigate such problems. It uses very few learning samples, and can describe, understand, and recognize 3D articulated objects while the object's status is being changed in an active vision environment.
From craftwork toward industrial production in development of real-time machine vision software
Juha Roening, Hannu Kauniskangas, Jarmo Kalaoja, et al.
A systematic method is proposed for designing and developing a machine vision algorithm and transferring it from a non-real-time environment to a real-time target system. The systematic design of a real-time system by the Real Time Structured Analysis (RT/SA) method, and the inclusion of image processing tools and algorithm design as a separate phase during system design, are discussed. The concept is first clarified with a hypothetical real-time system design example, after which a more realistic example of a real-time machine vision system provided by a company, a bottle crate returning machine, is analyzed. An RT/SA model was created with the aid of the Prosa/SA CASE tool. Graphical animation of the RT/SA specification to aid understanding of the model, and the automatic generation of C code, are evaluated.
High-speed scanning: an improved algorithm
A. Nachimuthu, Khoi Hoang
In using machine vision for assessing an object's surface quality, many images must be processed in order to separate the good areas from the defective ones. Examples can be found in the leather hide grading process, in the inspection of garments/canvas on the production line, and in the nesting of irregular shapes into a given surface. The most common method, subtracting the sum of defective areas from the total area, does not give an acceptable indication of how much of the `good' area can actually be used, particularly if the findings are to be used for the nesting of irregular shapes. This paper presents an image scanning technique which enables the estimation of useable areas within an inspected surface in terms of the user's definition, not the supplier's claims; that is, how much area the user can actually use, not the total good area as the supplier estimates it. An important application of the developed technique is in the leather industry, where the tanner (the supplier) and the footwear manufacturer (the user) are constantly locked in argument over disputed quality standards of finished leather hide, which disrupts production schedules and wastes money in re-grading and re-sorting. The basic algorithm developed for area scanning of a digital image is presented, and the implementation of an improved scanning algorithm is discussed in detail. The improved features include Boolean OR operations and other functions which aim at optimizing the scanning process in terms of computing time and the accurate estimation of useable areas.
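To make the idea of a user-defined useable area concrete, here is a small sketch (not the paper's algorithm) that counts the placements of a part-sized window that are completely defect-free, using an integral image so each placement is tested in constant time. All names and the window-based definition are assumptions for illustration.

```python
import numpy as np

def usable_positions(defect_map, win_h, win_w):
    """Count defect-free placements of a (win_h, win_w) window: a simple
    user-oriented notion of 'useable area' as positions where a part fits."""
    d = np.asarray(defect_map, dtype=np.int64)   # 1 = defective pixel
    ii = np.pad(d.cumsum(0).cumsum(1), ((1, 0), (1, 0)))  # integral image
    H, W = d.shape
    # Defect count inside every window position, via four integral lookups.
    s = (ii[win_h:H + 1, win_w:W + 1] - ii[:H - win_h + 1, win_w:W + 1]
         - ii[win_h:H + 1, :W - win_w + 1] + ii[:H - win_h + 1, :W - win_w + 1])
    return int((s == 0).sum())
```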
Robotics projects for the undergraduate electrical engineering design lab
David J. Mehrl, Michael E. Parten, Sunanda Mitra
We describe recent robotics projects carried out by electrical engineering undergraduates at Texas Tech University. Design Lab projects employing mobile robots proved to be an excellent choice, as the animation and instant feedback provided by the robots instilled great enthusiasm and taught important principles in areas such as microcontrollers, motor control, sensors, and artificial intelligence. Several autonomous robotic vehicles, employing a variety of sensors, were successfully designed and demonstrated. We also discuss ongoing design projects to build RF telemetry links, an RF navigation system, and other mobile platforms such as the `TT-U-Boat'.
Knowledge-based object recognition technique for robotics assembly application
H. Najjari, Simon J. Steiner
This paper presents a flexible object recognition system which has been developed using a knowledge-based approach for robotic assembly applications. The developed system overcomes some limitations of conventional object recognition techniques: it can robustly recognize any number of objects of any shape, at any random position and orientation, lying on any side or face. The system is flexible in that it can accommodate any number of new objects of any shape without the need to reprogram the system. The recognition method is based on a knowledge-based approach, and the record of the boundary pixels is used to extract a number of potential recognition features. Depending on the shape and size of the objects introduced to the system in its Training Mode, the system automatically selects a minimum number of important recognition features for robustly recognizing all the objects, and it automatically modifies the recognition and feature extraction programs in its Operation Mode based on the selected features.
Determination of major maceral groups in coal by automated image analysis procedures
Jamshid Dehmeshki, Mohammad Farhang Daemi, N. J. Miles, et al.
This paper describes the development of an automated and efficient system for classifying the major maceral groups within polished coal blocks. Coal utilization processes can be significantly affected by the distribution of macerals in the feed coal. In carbonization, for example, maceral group analysis is an important parameter in determining the correct coal blend to produce the required coking properties. In coal liquefaction, liptinites and vitrinites convert more easily to useful products than inertinites. Microscopic images of coal are inherently difficult to interpret by conventional image processing techniques, since certain macerals show similar visual characteristics; it is particularly difficult to distinguish between the liptinite maceral and the supporting setting resin. This requires the use of high-level image processing as well as fluorescence microscopy in conjunction with normal white-light microscopy. This paper is concerned with two main stages of the work, namely segmentation and interpretation. In the segmentation stage, a cooperative, iterative approach to segmentation and model parameter estimation is defined, which is a stochastic variant of the Expectation Maximization algorithm. Because of the high resolution of the images under study, the pixel size is significantly smaller than the size of most of the regions of interest; consequently, adjacent pixels are likely to have similar labels. In our Stochastic Expectation Maximization method, the idea that neighboring pixels are similar to one another is expressed by using a Gibbs distribution as the prior distribution of region labels. We also present a suitable statistical model for the distribution of pixel values within each region. In the interpretation stage, the coal macerals are identified according to the measurement information on the segmented regions and domain knowledge. Studies show that the system is able to distinguish coal macerals, especially fusinite from pyrite or liptinite from minerals, which previous attempts have been unable to resolve.
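The following toy sketch shows the general shape of such a scheme: a stochastic EM loop over a per-region Gaussian model with a Potts (Gibbs) neighborhood term. It is a much-simplified stand-in for the paper's formulation, and every name, parameter, and modeling choice in it is assumed.

```python
import numpy as np

def sem_segment(img, n_classes=3, beta=1.0, n_iter=20, seed=0):
    """Toy stochastic-EM pixel labeling with a Potts (Gibbs) spatial prior.

    Each region's gray values are modeled as Gaussian; `beta` weights the
    preference for 4-neighbors to share a label. A sketch only: it assumes
    every class stays populated and omits the paper's full SEM details.
    """
    rng = np.random.default_rng(seed)
    bins = np.quantile(img, np.linspace(0, 1, n_classes + 1)[1:-1])
    labels = np.digitize(img, bins)              # quantile-based init
    for _ in range(n_iter):
        # M-step: per-class Gaussian parameters.
        mu = np.array([img[labels == k].mean() for k in range(n_classes)])
        sig = np.array([img[labels == k].std() + 1e-6 for k in range(n_classes)])
        loglik = -0.5 * ((img[..., None] - mu) / sig) ** 2 - np.log(sig)
        # Neighbor votes implement the Gibbs/Potts smoothing term.
        onehot = labels[..., None] == np.arange(n_classes)
        votes = sum(np.roll(onehot, s, axis=a) for s in (-1, 1) for a in (0, 1))
        logpost = loglik + beta * votes
        logpost -= logpost.max(-1, keepdims=True)
        post = np.exp(logpost)
        post /= post.sum(-1, keepdims=True)
        # S-step: sample new labels from the posterior.
        cum = post.cumsum(-1)
        labels = (rng.random(img.shape)[..., None] < cum).argmax(-1)
    return labels
```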
Texture-region coding based on Boolean random sets
Region-based coding is applied to images composed of disjoint texture regions, where in each region the image is generated by a discrete random Boolean model. The image is segmented into regions by applying pixelwise granulometric classification and the region boundaries are chain encoded. Maximum-likelihood estimation based upon induced 1D Boolean models is used to estimate the parameters of the governing processes in each region. The regions are coded by these parameters. Decoding is accomplished by generating in each region a realization of the appropriate random set.
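A minimal sketch of one ingredient, pixelwise granulometric sizing: each foreground pixel is tagged with the radius of the smallest disk opening that removes it. This only illustrates the idea, assuming SciPy; the paper's classifier and the Boolean-model parameter estimation are not reproduced here.

```python
import numpy as np
from scipy import ndimage

def granulometric_size(binary_img, max_radius=10):
    """Tag each foreground pixel with the smallest disk-opening radius
    that sieves it out (a crude pixelwise granulometric feature)."""
    fg = binary_img.astype(bool)
    size = np.zeros(fg.shape, dtype=int)
    for r in range(1, max_radius + 1):
        y, x = np.ogrid[-r:r + 1, -r:r + 1]
        opened = ndimage.binary_opening(fg, structure=(x**2 + y**2 <= r**2))
        size[(size == 0) & fg & ~opened] = r   # first scale removing the pixel
        if not opened.any():
            break
    return size
```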
Shape matching using optimization
S. Hossein Cheraghi, Huey S. Lim
In this paper an optimization approach to shape matching and recognition is presented. The approach is applicable to convex polygonal shapes but its modified model can also be applied to non-convex polygonal shapes. The shape matching and recognition problem is formulated as a nonlinear optimization problem with a linear objective function and nonlinear constraints. The approach is invariant to scale, translation and rotation.
Active Vision and Control
Robot control architectures: application requirements, approaches, and technologies
Jorg-Michael Hasemann
This paper identifies attributes of intelligent robotic applications and surveys the different flavors of robot control architectures. Directions in robot control are discussed in more detail, and the attributes and properties of the different approaches are compared. A section on applied technologies lists the application areas covered by existing real-world and test-bed implementations and simulations for the various control architectures.
Novel approach to visual robot control
Naser Prljaca, Hugh McCabe
In this paper a new control scheme for a robot manipulator based on visual information is proposed. The control system determines the position and orientation of the robot gripper in order to achieve a desired grasping relation between the gripper and a 3D object. The proposed control scheme consists of two distinct stages. (1) Learning stage: the robot system reconstructs a 3D geometrical model of a presented unknown object within a class of objects (polyhedra) by integrating information from an image sequence obtained from a camera mounted on the robot manipulator (eye-in-hand). This model is represented by a set of 3D line segments and denoted the reference model. The robot is also taught the desired grasping relation by manual guidance. (2) Execution stage: the robot system reconstructs a 3D model of the arbitrarily placed 3D object, denoted the observed model. The necessary position and orientation of the gripper is then determined from the estimated 3D displacement between the reference and observed models. Further, the basic algorithm is extended to handle multiple-object manipulation and recognition. The performance of the proposed algorithms has been tested on a real robot system, and the experimental results are presented.
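Once segment correspondences between the reference and observed models are fixed, estimating the 3D displacement reduces to a rigid-body fit. A minimal sketch using the standard SVD (Kabsch) solution on matched 3D points, such as segment endpoints, follows; this is a generic technique, not necessarily the paper's estimator.

```python
import numpy as np

def rigid_displacement(ref_pts, obs_pts):
    """Least-squares rotation R and translation t with obs ~ R @ ref + t.

    ref_pts, obs_pts : (N, 3) matched 3D points (e.g., endpoints of the
    reference and observed wireframe segments). Kabsch/SVD method.
    """
    cr, co = ref_pts.mean(0), obs_pts.mean(0)
    H = (ref_pts - cr).T @ (obs_pts - co)        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
    R = Vt.T @ D @ U.T
    return R, co - R @ cr
```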
Object-oriented tool for controlling a multipurpose robot cell
Ismail Fidan, Larry Ruff, Stephen Derby
As the piece cost of completed circuit boards increases and production quantities decrease, reworking of defective boards has become an important part of the production process. The lead-pitch of surface mount components has also been decreasing, which has required increased training and skill level of the rework operators. Fully-automated rework provides a solution to these problems. In order to simplify the control of automated rework of surface mount circuit boards from an operator's viewpoint, an Object-Oriented Tool was developed at Rensselaer for an automated rework cell that uses the Apple Macintosh Graphical Interface. The interface reduces operator skill and training requirements for the operation of the automated cell. This paper includes a description of the fully-automated rework cell created at Rensselaer and describes the Object-Oriented Tool development and its final structure.
New active 3D vision system based on rf-modulation interferometry of incoherent light
Rudolf Schwarte, Horst-Guenther Heinol, Zhanping Xu, et al.
Presently there is still a remarkable gap between the requirements and the capabilities of 3D vision in the field of industrial automation, especially in manufacture-integrated 100% quality control. For these and many other applications, such as security and traffic control, a new, extremely fast, precise, and flexible 3D camera concept is presented in this paper. In order to obtain the geometrical 3D information, the whole 3D object or scene is illuminated simultaneously by means of rf-modulated light. This is realized by using optical modulators such as Pockels cells or FTR optical components (FTR: frustrated total reflection). The back-scattered light carries the depth information within the local delay of the phase front of the rf-modulated light intensity. If the reflected wave front is mixed again within the whole receiving aperture, using the same optical 2D modulation components and the same rf frequency, an rf interference pattern is produced. A CCD camera may be applied to sample these rf-modulation interferograms. In order to reconstruct the 3D image, a minimum of three independent interferograms has to be evaluated; they may be produced either by applying three different rf phases or three different rf frequencies. This procedure will be able to deliver up to some tens of high-resolution 3D images per second with some hundred thousand voxels (volume elements). Such remarkable progress can be achieved by means of three key steps: firstly, by decoupling the opto-electronic receiver from real-time requirements through homodyne mixing of CW-modulated light; secondly, by applying the rf modulation signal as an optical reference signal to the receiving optical mixer; and thirdly, by using a consistently 2D layout of the transmitted illumination, of the optical mixer in the receiving aperture, and of the optoelectronic sensing element, e.g., a CCD chip.
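For the three-interferogram case, the phase evaluation admits the classical three-step phase-shifting formula. The sketch below assumes equally spaced rf phase offsets of 120 degrees and CW intensity modulation at frequency f_mod, and converts the recovered phase to one-way range; it is an illustration of the principle, not the authors' processing chain.

```python
import numpy as np

C = 3e8  # speed of light, m/s

def depth_from_three_interferograms(I0, I1, I2, f_mod):
    """Recover the modulation phase from three interferograms taken at rf
    phase offsets 0, 120, and 240 degrees, then convert phase to range.

    Assumes I_k = A + B*cos(phi + 2*pi*k/3); the unambiguous range
    interval is c / (2 * f_mod).
    """
    phi = np.arctan2(np.sqrt(3.0) * (I0 - I2), 2.0 * I1 - I0 - I2)
    phi = np.mod(phi, 2 * np.pi)
    return phi * C / (4 * np.pi * f_mod)  # round-trip delay -> one-way range
```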
Active vision system for planning and programming of industrial robots in one-of-a-kind manufacturing
Ulrich Berger, Achim Schmidt
The aspects of automation technology in industrial one-of-a-kind manufacturing are discussed. An approach to improve the quality/cost relation is developed, and an overview of a 3D-vision-supported automation system is given. This system is based on an active vision sensor for 3D geometry feedback; its measurement principle, the coded light approach, is explained. The experimental environment for the technical validation of the automation approach is demonstrated, in which robot-based processes (assembly, arc welding, and flame cutting) are graphically simulated and off-line programmed. A typical process sequence for automated one-of-a-kind manufacturing is described. The results of this research are applied to a project on the automated disassembly of car parts for recycling using industrial robots.
Integration of active vision and intelligent robotics for advanced material handling
John R. G. Pretlove, Nongji Chen
This paper presents an intelligent robotic system consisting of a PUMA 562 industrial robot arm, a novel open-architecture robot controller, and an active vision system. The tight integration of these three sub-systems has resulted in a high-performance robotic workcell which is capable of tracking objects following complex 3D trajectories and intercepting them. This work has been carried out within the MSRR group and has been aimed at developing robot systems which can overcome workcell uncertainties. The active stereo vision system is designed on mechatronic principles and is modular, lightweight, and highly controllable. The system is robot mounted and can determine the 3D position of objects in real time; it communicates with the open-architecture robot controller via a transputer link. The robot controller is based on a transputer architecture hosted by an OS-9 system. It provides easy integration of external sensors and offers wide user accessibility to the internal robot control modules. A series of tests demonstrating the functional integration of the active vision system and the robot controller have been undertaken and are reported here. The vision system is capable of tracking moving objects, and this information is used to update the current robot trajectory; these experiments demonstrate the vision system in full control of the robot during a tracking-and-intercept duty cycle. Our practical experience of using the system is also discussed, together with our views of the future.
Mars lander robotics and machine vision capabilities for in-situ planetary science
Paul S. Schenker, D. L. Blaney, D. K. Brown, et al.
We overview our recent progress in lander-based robotics for Mars planetary science. Utilizing a 1:1 scale laboratory replica of the NASA Mars Surveyor '98 mission, JPL engineers and Mars science colleagues have demonstrated approaches to lander science functions such as robotic sample acquisition and deposition, end-effector-based microscopic viewing, hand-carried science instrument data collection, and science instrument emplacement by a robot. Some of the significant technical advances underlying this simulated Mars flight capability include JPL's innovation of a new lightweight, mechanically stiff, gas-deployed telescopic two-meter robot arm, and cooperative engineering work with Michigan Tech colleagues on automated visual positioning control of robotic sampling. The University of Arizona and JPL have further developed complementary advances in lander-based imaging spectroscopy and its robotic enablement. We outline this work, summarizing its key technical features and illustrating experimental progress with photographs and an accompanying conference videotape.
Color Image Processing
Spectral color measurement
Jouni Haanpalo, Timo Jaeaeskelaeinen, Jussi P. S. Parkkinen, et al.
In this paper, we discuss a color measuring system based on the measurement of the wavelength spectrum of an object of interest. The design of the measuring head, the correction of measurement errors, and color recognition and analysis methods for an intelligent colorimeter are studied and discussed. We describe a compact and shock-resistant measuring head for color spectrum measurements. It is shown that the cut-off of a spectrum at 640 nm can be corrected quite accurately by a linear extrapolation function, in the sense of the error in CIELAB coordinate values. For the construction of an adaptive color measurement system, the subspace network or the multi-layer perceptron network can be used as an intelligent method for spectral analysis and recognition.
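A sketch of the linear-extrapolation correction for a spectrum cut off near 640 nm: fit a line to the last measured samples and extend it to the red end. The fitting window, grid spacing, and end wavelength are assumptions for illustration.

```python
import numpy as np

def extend_spectrum(wavelengths, values, fit_from=600.0, out_to=700.0, step=5.0):
    """Linearly extrapolate a spectrum cut off at its red end.

    wavelengths : 1D array in nm, ascending; values : matching samples.
    A line is fitted to samples at >= fit_from and extended to out_to.
    """
    mask = wavelengths >= fit_from
    slope, intercept = np.polyfit(wavelengths[mask], values[mask], deg=1)
    new_wl = np.arange(wavelengths[-1] + step, out_to + step, step)
    return (np.concatenate([wavelengths, new_wl]),
            np.concatenate([values, slope * new_wl + intercept]))
```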
Color characterization for landmark selection by a neural network
Ettore Stella, F. Monte, Laura Caponetti, et al.
Many visual navigation strategies for autonomous mobile robots are landmark based. To determine its position, a vehicle needs to refer to absolute references in the environment, so landmarks are required to be invariant to rotation, translation, scale, and perspective. A straightforward alternative is to characterize invariantly the context where landmarks are placed. In this paper, we show that an appropriately trained neural network is able to recognize the context where landmarks are located in the scene. The early results seem promising.
Color line-scan technology in industrial applications
Guy F. Lemstrom
Color machine vision opens new possibilities for industrial on-line quality control applications. With color machine vision it is possible to detect different colors and shades, to perform color separation and spectroscopic applications, and at the same time to make measurements in the same way as with gray-scale technology, such as geometrical measurements of dimensions, shape, and texture. Combining these capabilities in a color line-scan camera opens new applications and new areas in the machine vision business. Quality and process control requirements in industry grow more demanding every day, and color machine vision can be the solution for many simple tasks that have not been realized with gray-scale technology; the inability to detect or measure colors has been one reason why machine vision has not been used in quality control as much as it could have been. Color machine vision has attracted growing interest in industrial applications. Potential areas include food, wood, mining and minerals, printing, paper, glass, plastics, and recycling, with tasks ranging from simple measurement to total process and quality control. Color machine vision is not only for measuring colors: it can also be used for contrast enhancement, object detection, background removal, and structure detection and measurement. Color or spectral separation enables new ways of working out machine vision applications; it is a question of how to use the benefit of having two or more data values per measured pixel, instead of the single value of traditional gray-scale technology. There are plenty of potential applications that can be realized with color vision today, and it will add performance to many traditional gray-scale applications in the near future. Most importantly, color machine vision offers a new way of working out applications where machine vision has not been applied before.
Enhanced multiprobing recovering algorithm based on color mixed nonlinear modulation and its application in a 3D vision system
Zhanping Xu, Horst-Guenther Heinol, Rudolf Schwarte, et al.
The paper introduces a fast and enhanced recovering algorithm and its application in an active 3D color vision system. The algorithm is based on the processing of several nonlinearly modulated optical test signals of different colors. The processing approach arises from minimizing the errors caused by using nonlinear modulators in an active vision system, i.e., recovering 3D properties from the higher-order terms of a Fourier series expansion of the nonlinear modulation. Two aspects are worth mentioning. Firstly, the modulation depth of nonlinear optical components such as Pockels cells can be exploited far beyond the linear region if assisted by the appropriate recovering algorithm, thus increasing the effective aperture of the optical system. Secondly, the same algorithm can be adapted to a synthetic nonlinear modulation, i.e., the various incoherent signals used as color probes are synchronously modulated, each with a different characterizing rf signal, by means of corresponding optical modulators. These signals are then incoherently superposed in the transmission medium. After having been reflected from, and having interacted with, the object of interest, the selectively attenuated signals are demodulated using a single modulator. In this process, phase, color, and other information are simultaneously demodulated. Therefore a single black-and-white CCD camera may be utilized to sample the 2D rf interferograms, which are rapidly and analytically processed by the proposed algorithm in order to extract 3D ranges, colors, and other properties of the object of interest.
Relationship between brightness, hue, and saturation when the inverted human retina is interpreted as a cellular diffractive 3D chip
The nonlinear relationship between brightness, hue and saturation in human vision becomes clear if, in addition to the pupil as brightness regulator, the inverted retina is interpreted as a cellular multilayer phase grating optical 3D chip, i.e. as a chromaticity and brightness regulator. Both regulators are optical information preprocessors which determine the signal input into the photoreceptors and thus represent the basis for subsequent electrical information processing in retinal neural networks. Data from the interference-optical 3D phase calculation (von Laue equation) are compared with experimental data on phenomena in human vision which are critical to this question, to show the interdependence of brightness, hue and saturation. This gives new insights into the function of the pupil, the Purkinje shift, the Bezold-Brucke phenomenon, the Stiles-Crawford aperture effects I/II and the saturation effects in human vision, all of which can be derived from a single pupil/retina/photopigment equation.
Systems and Applications
Aspects in automatic nesting of irregular shapes
Khoi Hoang
Nesting of shoe parts onto a leather hide is a very complex puzzle. Firstly, both the shoe parts and the leather hide are of irregular shapes. Secondly, leather quality is not uniform throughout the hide: certain zones are only suitable for certain types of shoe parts, and the hide has stretch directions, so some shoe parts may not be nested at certain angles to them. Thirdly, the position of defects in some shoe parts can be acceptable or unacceptable depending on the manufacturer's quality standard. Machine vision researchers have been attempting to capture the hide defect map and use this data file to assist the computerized nesting of the irregular shapes. Many challenging tasks in the visual inspection stage affect the performance of the nesting systems; an essential one is the clustering of leather defects. Two vital parameters in the clustering of defects are the minimum Euclidean distance between defects to be clustered and the re-classification of a newly formed cluster if it contains more than one type of defect or satisfies a more serious defect-type definition. In order to minimize the wastage in yield, this work recommends that during the inspection stage only defects of the same type be clustered, and that the process cease at the point where the newly formed cluster would become a more serious defect type. The question of minimum Euclidean distance should be considered both at the inspection stage, as a preliminary clustering operation, and at the nesting stage, as a fine-tuning process, when the shoe size and part shapes are known.
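A minimal sketch of the recommended same-type clustering rule: defects join one cluster only if they share a type and lie closer than the minimum Euclidean distance. The union-find formulation and all names are illustrative, not the paper's implementation.

```python
import numpy as np

def cluster_same_type_defects(centers, types, min_dist):
    """Single-linkage clustering restricted to defects of the same type.

    centers : list of (x, y) defect centers; types : matching type labels.
    Returns a cluster-root index per defect (union-find representatives).
    """
    n = len(centers)
    parent = list(range(n))

    def find(i):                       # path-compressing find
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            close = np.linalg.norm(np.asarray(centers[i], float)
                                   - np.asarray(centers[j], float)) < min_dist
            if types[i] == types[j] and close:
                parent[find(i)] = find(j)   # merge same-type nearby defects
    return [find(i) for i in range(n)]
```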
Verification and reconciliation of virtual world model for radioactive waste cleanup
Sharon X. Wang, Dean Haddox, Carl D. Crane III, et al.
The main task of sensing for robotics and automation is to provide 3D geometrical environment information to robot control and visualization systems, which is often referred to as facility characterization or environment mapping. Particularly in radioactive waste cleanup operations, such as the Tank Waste Retrieval (TWR) task where the environment is hazardous, automated sensing techniques are necessary for the accurate facility characterization that remote-controlled robots need for a safe and orderly cleanup process. This research proposes a facility characterization system which combines the strengths of computer vision and computer graphics and maximizes the use of a priori information. Using a novel image registration method, this system is able to detect the differences between the pre-modeled virtual world and the sensed real world. Combined with 3D sensing data, this information can be used for verification and reconciliation of the virtual world database. In the proposed system, the environment is pre-modeled as the virtual world, whose database provides the template for virtual/real world registration. Once the virtual and real images are registered, the comparison can be accomplished by image subtraction; as a result, any missing or unanticipated objects are detected. Utilizing the 3D information, the surfaces of these objects can be reconstructed, and this information in turn is used for geometric primitive detection and virtual world updating. Initial testing demonstrates that this system has the potential to accomplish the TWR task.
Robust 3D object recognition and pose estimation using passively sensed 3D contour images
Rainer Otterbach
The paper presents a system for the recognition and pose estimation of 3D objects, with an emphasis on the segmentation and verification modules. The system is tailored to the analysis of 3D contour images, which are obtained from image sequences of a CCD camera by means of Kalman filtering. In order to reduce the search complexity and the noise sensitivity of the recognition process, the method is built on robust contour-based 2D algorithms for the retrieval of model candidates from the database and the generation of pose hypotheses. These techniques apply because of the previous segmentation of the 3D contour image into plane curve segments that make up boundary lines of plane surface patches. Hypotheses for the object's pose are obtained by pairwise matching of model and image boundaries. The subsequent verification computes the best globally consistent assignment of model and image contours by searching for groups of similar pose hypotheses. Both the segmentation and the verification algorithm are formulated in terms of clustering approaches and realized by use of a common technique for the evaluation of transformation space. With regard to industrial applications, most importance has been attached to the modular design of a robust software solution and to experimental performance evaluation. Experimental results obtained from real-world images are presented.
Syntactic recognition of defects on wooden boards
This paper describes a method to classify the various patterns that make up the appearance of wooden surfaces. Such surfaces are characterized by their textural appearance as well as by compact convex objects like knots, holes, resin, cracks, and grain lines. Many approaches to describing such surfaces have been published in the past; the list includes, but is not limited to, Hough transform methods, 2D shape recognition, fuzzy set approaches for segmentation, hierarchical pattern recognition, and associative memories. In the present paper we assume that a local textural representation is computed, permitting the description of the gray-level image in terms of texture elements or symbols. Using the symbolic image, it is shown how segmentation into objects can be achieved, followed by the extraction of the symbolic contour as a list of symbols. Every object is thus described by a list of symbols to be classified using syntactic pattern recognition. Each class of objects is described by a formal language, and by parsing each string, a classification can be obtained from the grammar that produces the fewest parsing errors. We describe details of the system, including how the symbolic descriptions can be obtained, and the implementation of Earley's parser on a parallel computer architecture.
Robotic Navigation, Path Planning, and Use of Motion and Image Sequences
Egomotion parameter computation with a neural network
Antonella Branca, Ettore Stella, Gabriella Convertino, et al.
In this work we consider the application context of planar passive navigation, in which the visual control of locomotion requires only the direction of translation and not the full set of motion parameters. If the temporally changing optic array is represented as a vector field of optical velocities, the vectors form a radial pattern emanating from a center point, called the focus of expansion (FOE), representing the heading direction. The FOE position is independent of the distances of world surfaces and does not require assumptions about surface shape and smoothness. We investigate the performance of an artificial neural network for computing the image position of the FOE of an optical flow field induced by observer translation relative to a static environment. The network has a feed-forward architecture and is trained by a standard supervised back-propagation algorithm; it receives as input the pattern of points onto which the lines generated by the 2D vectors are projected using the Hough transform. We present results obtained on a test set of synthetic noisy optical flows and on optical flows computed from real image sequences.
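For intuition, here is an algebraic baseline for the same quantity: the FOE as the image point closest, in least squares, to all the lines carried by the flow vectors. This is a stand-in for the Hough-plus-network scheme described above, with an assumed interface.

```python
import numpy as np

def estimate_foe(points, flows):
    """Least-squares focus of expansion from a radial flow field.

    points, flows : (N, 2) arrays of image positions and flow vectors.
    Each flow vector defines a line through its point; the FOE minimizes
    the sum of squared perpendicular distances to those lines.
    """
    d = flows / np.linalg.norm(flows, axis=1, keepdims=True)
    n = np.stack([-d[:, 1], d[:, 0]], axis=1)      # unit normals to the lines
    A = np.einsum("ni,nj->ij", n, n)               # sum of n n^T
    b = np.einsum("ni,n->i", n, np.einsum("ni,ni->n", n, points))
    return np.linalg.solve(A, b)
```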
Hopfield neural network for TTC and heading direction estimation for obstacle avoidance systems in planar passive navigation
Gabriella Convertino, Antonella Branca, Ettore Stella, et al.
In this paper a method for the estimation of the heading direction and of the time-to-collision of a moving vehicle is presented. The assumption that the motion can be described as predominantly translational is used to reduce the optic flow equations to a linear version. In this case the 2D motion field assumes a radial shape, with vector directions intersecting in a point called the focus of expansion. In the presented method, a sparse linear optic flow map is derived in the most relevant and reliable areas of the image. These estimates are then used to derive information about the 3D motion of the vehicle. Results on synthetic and real time-varying sequences are presented.
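Under pure translation, the time-to-collision at an image point follows the classic relation: distance from the FOE divided by the radial flow speed. A small sketch of that relation, with assumed names; the paper's estimator is not reproduced.

```python
import numpy as np

def time_to_collision(point, flow, foe, frame_dt):
    """Time to collision under pure translation (seconds).

    point, flow, foe : 2D image position, its flow vector, and the focus
    of expansion; frame_dt : time between frames. TTC = r / (dr/dt).
    """
    r = np.asarray(point, float) - np.asarray(foe, float)
    radial_speed = np.dot(np.asarray(flow, float), r / np.linalg.norm(r))
    return frame_dt * np.linalg.norm(r) / radial_speed
```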
Approach to analyze a deformable moving target by using the shape deformation model and morphological operators
Weiguo Wu, Take Asai, Takao Akatsuka
The measurement of the characteristic parameters of a moving object undergoing deformation is often an important problem. Here, an approach to analyzing the shape change of a ball when it is kicked in soccer is proposed, using a simple shape deformation model to evaluate the shape change from the image sequence. To determine the parameters of the model that apply to the actual ball deformation, the ball must first be detected, for which the pattern spectrum based on morphological operators is used. We assume that the deformation surface of the ball is a circular arc when it is kicked by the foot, and that the arc is always convex when observed from the kicking side. To obtain the parameters of the arc, preprocessing of the ball image, such as local binarization, region filling, and noise smoothing with morphological operators, is performed on the actual image sequence. In order to detect the ball, the pattern spectrum with morphological operators is measured, and the circumscribed circle of the ball is extracted. The center and radius of the ball are thus obtained from the circumscribed circle, along with the arc of the deformation surface of the model. Finally, the characteristic parameters of the moving ball, such as its deformation, are measured using the shape deformation model. To demonstrate the effectiveness of this method, we show an application extracting the deformation of the ball in football for actual sports skill training.
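A sketch of the pattern spectrum itself, assuming SciPy: the area removed by disk openings of increasing radius, whose dominant peak indicates the object's scale. The disk family and radius range are illustrative choices.

```python
import numpy as np
from scipy import ndimage

def pattern_spectrum(binary_img, max_radius=20):
    """Morphological pattern spectrum of a binary image.

    Returns areas[r-1] = foreground area sieved out between disk openings
    of radius r-1 and r; a peak at r suggests a dominant structure size.
    """
    fg = binary_img.astype(bool)
    areas, prev = [], fg
    for r in range(1, max_radius + 1):
        y, x = np.ogrid[-r:r + 1, -r:r + 1]
        opened = ndimage.binary_opening(fg, structure=(x**2 + y**2 <= r**2))
        areas.append(int(prev.sum() - opened.sum()))
        prev = opened
    return np.array(areas)
```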
Finite-element method for nonrigid motion estimation
Vincent Devlaminck
We propose a method for the estimation of nonrigid motions in image sequences without segmentation techniques. This method is based on the use of the Finite Element Method (FEM). We show how a motion process and an adaptive-size mesh procedure can be used to compute a nonrigid motion estimate. The algorithm is based on three steps. In the first step, an initial regular mesh is applied to the first frame. Then a velocity-component estimate is computed using an energy-function minimization: an FEM analysis based on the current mesh gives the velocity components for all the nodes of the mesh between the first and the second frame. Finally, the new positions of the nodes are computed as the sum of the old positions and the estimated displacement or deformation. The algorithm is tested with different types of finite elements.
Accuracy of determination of minor-sized landmarks on the rise of anomalous dimensions
Yuri V. Martishevcky
This causes distortion or a no-signal condition and, as a result, deterioration of the accuracy with which the reference point coordinates are determined. The device performance under isolated CCD defects, as well as under no-signal states in the frame under detection, has been investigated. The device has been implemented as a two-processor circuit: a video-signal processor and a tracking processor implementing a linear Kalman filter. The numeric results have been obtained by full-scale modeling of the device on a PC, with a data acquisition rate of 25 Hz, CCD dimensions of 256 x 256 pixels, and a track including deterministic and random components.
3D Modeling and Data Processing for Robotics and Machine Vision
Active 3D modeling by recursive viewpoint selection based on symmetry
Kazunori Yoshida, Hiromi T. Tanaka, Jun Ohya, et al.
This paper proposes a new method for efficiently creating 3D models of objects from the silhouettes of objects in images acquired by an active camera, whose viewpoints are selected recursively based on symmetry planes of the observed silhouettes. In the proposed method, to obtain the initial viewpoint, we use the assumption that an object takes a stable pose under the influence of gravity, having a symmetry plane to which the direction of gravity is constrained; we choose a point in the direction of gravity as the initial viewpoint. New viewpoints are then determined based on information about the symmetry plane, where the symmetry plane is obtained from the center of gravity and the axis of inertia of the observed silhouette. This process is repeated until no new viewpoint is selected. The 3D shape of the object is then reconstructed by processing voxel data based on the silhouette information acquired at the selected viewpoints. Finally, textures acquired by the observations are mapped onto the reconstructed 3D shape. We present some experimental results that show the effectiveness of the proposed method.
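The voxel-based reconstruction step is essentially shape-from-silhouette carving: keep only the voxels whose projections fall inside every observed silhouette. A minimal sketch follows; the projection functions are assumed to come from the calibrated active camera, and all names are hypothetical.

```python
import numpy as np

def carve(voxels_xyz, silhouettes, projections):
    """Shape-from-silhouette carving over a set of calibrated views.

    voxels_xyz  : (N, 3) candidate voxel centers.
    silhouettes : list of 2D boolean masks, one per view.
    projections : list of functions mapping (N, 3) points to (N, 2)
                  pixel coordinates for the corresponding viewpoint.
    """
    keep = np.ones(len(voxels_xyz), dtype=bool)
    for sil, proj in zip(silhouettes, projections):
        uv = np.round(proj(voxels_xyz)).astype(int)
        inside = ((uv[:, 0] >= 0) & (uv[:, 0] < sil.shape[1]) &
                  (uv[:, 1] >= 0) & (uv[:, 1] < sil.shape[0]))
        hit = sil[uv[:, 1].clip(0, sil.shape[0] - 1),
                  uv[:, 0].clip(0, sil.shape[1] - 1)]
        keep &= inside & hit    # voxel must project into every silhouette
    return voxels_xyz[keep]
```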
3D geometrical shape modeling from an image sequence in an interactive context
Laurent Monchal, Pascal Aubry
We propose in this paper a method to reconstruct geometrical primitives of a scene in the context of an interactive modeling system. For this type of reconstruction we use various pieces of information given by the operator: the name of the primitive and its contours in the first image. We exploit these pieces of information to extract the contours of the object in a sequence of images and to generate a criterion. Minimizing the criterion determines the location and size of the primitive, which can be a cylinder, polyhedron, circle, etc. We further propose to improve the quality of primitives reconstructed separately by adding information about the whole scene: we generate a global criterion whose minimization leads to better reconstructions of the primitives. We obtain good results on real images.
Petri nets for modeling automated target recognition systems
Ernest L. Hall, Frank S. Cheng
The application of Petri nets for modeling automated target recognition is proposed. This theory will permit the reformulation of automatic target recognition algorithms and paradigms into a modern scientific model that incorporates all the details of a realistic, complex scene. Petri nets have recently been studied and applied to a variety of applications. Their use permits the modeling of both information flow and physical material flow in an overall system model which clearly shows constraints, influences of variables on system performance, and results in a system which can be realized in hardware under software control. This proposed theory promises to revolutionize the model of command, communication and control currently used by integrating the three functions into one overall control system model. The theory has been demonstrated on two systems and the automatic target recognition problem is an ideal test bed for further study.
From vision to action: grasping unmodeled objects from a heap
Martin Rutishauser, Frank Ade
We have investigated the problem of removing objects from a heap without having recourse to object models. This capability is useful for `intelligent singulation', i.e., the decomposition of a heap into isolated objects. As we are exclusively relying on geometric information, the use of range data is a natural choice. To ensure that we see opposite patches of the object surfaces, we use up to three range views from different directions. These views are triangulated using the data points as vertices. After merging the views, the resulting surface description is segmented, i.e., partitioned into large sets of contiguous triangles which correspond to objects or object components in the scene. The system then tries to detect grasping opportunities. Two heuristics guide the selection of a `focus of action' which consists of a suitable component. For this component, good grasping point pairs are found by using an intelligent search technique. If no pairs are found due to impending collisions or bad grasping quality another component is checked. Finally the robot performs the grasping. Force sensing allows the correction of inaccuracies of the vision system and the handling of collisions.
Trinocular stereovision by generalized Hough transform
Jun Shen, Philippe Paillou
In this paper, we present the generalized Hough transform for matching edge segments in trinocular stereovision. We show that the corresponding segment triplet candidates can be detected by a generalized Hough transform in the parameter space (theta, phi) which characterizes the 3D segment orientation. These triplets can then be verified, and the position parameters of the 3D segments can be detected by a generalized Hough transform in the parameter space (Y, Z). The matching of geometric primitives in trinocular stereovision images is thus found by a cascade of two generalized Hough transforms in spaces of only two dimensions. Experimental results are also reported. Our method has the following advantages: (1) trinocular stereovision image matching is transformed into Hough transforms in 2D parameter spaces, which greatly reduces the computational complexity; (2) matching can be done completely in parallel; (3) no a priori similarity between images is needed, so very different views can be used, which improves the precision of the 3D reconstruction; (4) it is very effective at resolving false targets; (5) our method gives good results even for partially hidden segments.
Segmentation of dense 3D data using a neural network approach
Reverse engineering is the process of generating accurate 3D CAD models of manufactured parts from measured coordinate data. The 3D coordinate data can be acquired from non-contact laser scanning machines or contact coordinate-measuring machines. Prior to creating the CAD model, it is necessary to segment the dense data into regions that are free of any sharp changes in the surface shapes. These segmented regions are then fitted with parametric surface patches for an economized CAD representation. In this paper, a hybrid basis function neural network is proposed for segmenting dense depth data. The first three layers of the network perform the coarse segmentation task by clustering surface features and classifying them as one of eight primitive surface types. The features correspond to the mean curvature (H) and Gaussian curvature (K) of the measured 3D surface. Each surface type image is further partitioned into isolated regions by a series of competitive feedback networks that perform opening and closing morphological operations. Once segmented, each region is parameterized and the associated depth data is approximated by a Bezier surface patch. The corresponding control points are used to reconstruct the parametric surface patch in a typical CAD system.
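The eight primitive surface types referenced here are conventionally obtained from the signs of the mean curvature H and Gaussian curvature K (the Besl-Jain HK classification). A small sketch with the usual label table; the sign threshold is an illustrative parameter.

```python
import numpy as np

# Conventional HK surface-type labels, indexed by the table below.
HK_LABELS = ["peak", "ridge", "saddle ridge", "flat", "minimal",
             "pit", "valley", "saddle valley"]

def hk_classify(H, K, eps=1e-4):
    """Classify each pixel by the signs of mean (H) and Gaussian (K)
    curvature into one of eight primitive surface types."""
    sH = np.where(H < -eps, -1, np.where(H > eps, 1, 0))
    sK = np.where(K < -eps, -1, np.where(K > eps, 1, 0))
    # Rows: sH = -1, 0, 1; columns: sK = -1, 0, 1.
    table = np.array([[2, 1, 0],    # H<0: saddle ridge, ridge,  peak
                      [4, 3, 3],    # H=0: minimal, flat (K>0 impossible)
                      [7, 6, 5]])   # H>0: saddle valley, valley, pit
    return table[sH + 1, sK + 1]    # integer codes indexing HK_LABELS
```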
Best-next-view algorithm for three-dimensional scene reconstruction using range images
J. E. Banta, Yu Zhien, X. Z. Wang, et al.
The primary focus of the research detailed in this paper is to develop an intelligent sensing module capable of automatically determining the optimal next sensor position and orientation during scene reconstruction. To facilitate a solution to this problem, we have assembled a system for reconstructing a 3D model of an object or scene from a sequence of range images. Candidates for the best-next-view position are determined by detecting and measuring occlusions of the range camera's view in an image. The candidate which will reveal the greatest amount of unknown scene information is selected as the best-next-view position; our algorithm uses ray tracing to determine how much new information a given sensor perspective will reveal. We have tested our algorithm successfully on several synthetic range data streams and found the system's results to be consistent with an intuitive human search. The models recovered by our system from range data compared well with the ideal models. Essentially, we have shown that range information about physical objects can be employed to automatically reconstruct a satisfactory dynamic 3D computer model at minimal computational expense. This has obvious implications in the contexts of robot navigation, manufacturing, and hazardous materials handling. The algorithm requires no a priori information in finding the best-next-view position.
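A hedged sketch of the ray-tracing score such a module could use: march rays from a candidate pose through an occupancy grid and count the unknown voxels seen before each ray is blocked. The grid encoding, step size, and names are assumptions, not the paper's implementation.

```python
import numpy as np

EMPTY, UNKNOWN, OCCUPIED = 0, 1, 2

def information_gain(grid, origin, directions, step=0.5, max_range=100.0):
    """Count distinct UNKNOWN voxels visible from a candidate sensor pose.

    grid : 3D integer occupancy volume; origin : (3,) start point;
    directions : iterable of (3,) ray directions for the sensor's rays.
    """
    seen = set()
    for d in directions:
        d = np.asarray(d, float)
        d /= np.linalg.norm(d)
        for t in np.arange(0.0, max_range, step):
            i, j, k = np.floor(origin + t * d).astype(int)
            if not (0 <= i < grid.shape[0] and 0 <= j < grid.shape[1]
                    and 0 <= k < grid.shape[2]):
                break                      # ray left the volume
            if grid[i, j, k] == OCCUPIED:
                break                      # ray blocked by known surface
            if grid[i, j, k] == UNKNOWN:
                seen.add((i, j, k))
    return len(seen)    # higher = better best-next-view candidate
```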
New method of calculating a cross-section area of an object from images
Yin Fan, Dezong Wang
In computer vision, there are many methods to obtain 3D information about objects from images, but the form of the 3D information differs from method to method. For example, with shape from shading, the orientation of the surface normal can be obtained; with stereo vision, the 3D coordinates of points on the object can be obtained. This paper proposes a new method to calculate the cross-section area of an object from surface normals by curve fitting. The new method is more precise than the old method.
New methods of 3D curved object modeling from image contour sequences
Changsheng Zhao, Roger Mohr
This paper describes three new methods for reconstructing a 3D rigid curve from a sequence of images using a 3D epipolar parameterization. It is shown that such a 3D curve reconstruction is possible when the camera motion is known, except at points where the epipolar lines are tangential to the curve and where the image contours are singular. All three new methods are treated as parameter estimation problems, where the parameters are the positions of the control points of 3D B-spline curves and surface patches. Experimental results are presented for real data.
Visualization and modeling of 3D image data in remote robotic applications
John Tourtellott
Robotic missions in remote or unstructured environments rely on 3D vision to accurately map scenes and construct geometric models. In order to effectively utilize 3D vision data, the Interactive Computer-Enhanced Remote Viewing System (ICERVS) provides integration and modeling of 3D data for robotic systems and operators. ICERVS supports mixed-mode modeling of sensor data and geometric objects (3D computer-aided-design data) to enable interactive construction of geometric objects that match features of interest in the work environment. During ICERVS development, different approaches for mixed-mode modeling have been implemented to compare their utility and effectiveness. Results support the conclusion that multiple approaches should be included in 3D visualization and modeling systems.
Automatic sensor placement
Active sensing is the process of exploring the environment using multiple views of a scene captured by sensors from different points in space under different sensor settings. Applications of active sensing are numerous and can be found in the medical field (limb reconstruction), in archeology (bone mapping), in the movie and advertisement industry (computer simulation and graphics), in manufacturing (quality control), and in the environmental industry (mapping of nuclear dump sites). In this work, the focus is on the use of a single vision sensor (camera) to perform the volumetric modeling of an unknown object in an entirely autonomous fashion. The camera moves to acquire the necessary information in two ways: (a) viewing each local feature of interest closely, using 2D data; and (b) acquiring global information about the environment via 3D sensor locations and orientations. A single object is presented to the camera and an initial arbitrary image is acquired. A 2D optimization process is developed which brings the object into the field of view of the camera, normalizes it by centering the data in the image plane, aligns the principal axis with one of the camera's axes (arbitrarily chosen), and finally maximizes its resolution for better feature extraction. The enhanced image at each step is projected along the corresponding viewing direction, and the new projection is intersected with previously obtained projections for volume reconstruction. During the global exploration of the scene, the current image as well as previous images are used to maximize the information in terms of shape irregularity and contrast variation. The scene on the borders of occlusion (contours) is modeled by an entropy-based objective functional, which is optimized to determine the best next view, recovered by computing the pose of the camera. A criterion based on the minimization of the difference between consecutive volume updates is set for termination of the exploration procedure. These steps are integrated into the design of an off-line Autonomous Model Construction System (AMCS) based on data-driven active sensing. The system operates autonomously, with no human intervention and no prior knowledge about the object. The results of this work are illustrated using computer simulation applied to intensity images rendered by a ray-tracing software package.
3D Pattern Recognition for Intelligent Robots
Omnidirectional vision applications for line following
Bradley O. Matthews, David Perdue, Ernest L. Hall
The purpose of this paper is to describe experimental studies on omnidirectional vision for the recognition and control of a mobile vehicle. The omnidirectional vision control technique described offers the advantage of an extremely wide-angle field of view. In practice, this translates into a machine which will not get lost when following a path, a target locating system which can see both forward and backward, and, generally, a robot that survives as prey rather than as a predator. The wide angle of view permits a mobile robot to follow a curved path even around sharp corners, hairpin turns, or other complicated curves. The disadvantage of the omnidirectional view is geometric distortion; this distortion may be easily corrected after calibration to determine the important parameters. An object recognition method was used that detects the largest target in a selected region of the field of view and computes the centroid of this target. When two target points are detected, the algorithm calculates a projected 3D path for the robot, and the distance and angle from this ideal path are then used to provide steering control for a mobile robot. The current application of this technique is a generic intelligent control device that is transportable from one mobile vehicle to another with changes only in system parameters rather than control architecture. The significance of this research is in showing how the geometric distortion can be compensated to permit an omnidirectional vision navigation control system for a mobile robot.
Ultrasonic perception of mobile robots: a comparison of competition neural network and feedforward neural network techniques for sensor array signal processing
J. Chen, Jan M. Van Campenhout
For truly intelligent behavior of a mobile robot, the important task is to make the robot understand its environment. For this reason a measurement device must be installed on the mobile robot. In our system we make use of a tri-aural sensor array to observe the robot environment. The tri-aural sensor array is composed of three ultrasonic sensors placed on a line: the central sensor is used as transmitter as well as receiver, while the two peripheral ones are used only as receivers. By using this device, the robot obtains environment information represented by the echoes reflected from the objects in the sensor field. From the sampled information, the robot has to determine the number of objects as well as their positions (distance and bearing) in the sensor field. How to process these sensor array signals is our principal problem. We have developed two neural network techniques for solving it: one based on a competition neural network and the other on a multi-layer feedforward neural network. In this paper, we first describe the correspondence problem in realistic circumstances, then briefly introduce both neural network techniques, and finally compare both methods using simulation data.
Toward the recognition of 3D free-form objects
Christian L. Schutz, Heinz Huegli
This paper investigates a new approach to the recognition of 3D objects of arbitrary shape. The proposed solution follows the principle of model-based recognition using geometric 3D models and geometric matching. It is an alternative to the classical segmentation-and-primitive-extraction approach and offers a way around some of that approach's difficulties with free-form shapes. The heart of this new approach is a recently published iterative closest point matching algorithm, which is applied repeatedly from a number of initial configurations. We examine methods to obtain successful matching. Our investigations refer to a recognition system used for the pose estimation of 3D industrial objects in automatic assembly, with objects obtained from range data. The recognition algorithm works directly on the 3D coordinates of the object's surface as measured by a range finder, which makes our system independent of assumptions about the object's geometry. Test and model objects are sets of 3D points to be compared with the iterative closest point matching algorithm. Specifically, we propose a set of rules to choose promising initial configurations for the iterative closest point matching, an appropriate quality measure which permits reliable decisions, and a method to represent the object surface in a way that improves computing time and matching quality. Examples demonstrate the feasibility of this approach to free-form recognition.
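For reference, here is a minimal sketch of the iterative-closest-point step itself, in its generic form with SVD-based rigid fits; it is one of many possible realizations of the matching algorithm the paper builds on, not the authors' code.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, n_iter=30):
    """Minimal ICP: align point set src (N, 3) to dst (M, 3).

    Returns (R, t) such that the aligned points are src @ R.T + t.
    """
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(dst)
    cur = src.copy()
    for _ in range(n_iter):
        _, idx = tree.query(cur)                 # closest-point correspondences
        matched = dst[idx]
        cs, cm = cur.mean(0), matched.mean(0)
        U, _, Vt = np.linalg.svd((cur - cs).T @ (matched - cm))
        D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])
        dR = Vt.T @ D @ U.T                      # incremental rigid fit
        dt = cm - dR @ cs
        cur = cur @ dR.T + dt
        R, t = dR @ R, dR @ t + dt               # accumulate the transform
    return R, t
```

As the abstract notes, ICP converges only locally, which is why the choice of initial configurations and a reliable match-quality measure matter.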
3D model matching for fine pose determination
Thomas Peurach, Margaret J. Whalen, Douglas Haanpaa
3D model matching is the process of matching image features to 3D model features. The feature correspondences are used to determine the transformation between the model and camera and thus define the orientation of the object. This 3D model matching problem is difficult to solve because of the high dimensionality of the solution space. Addressing this problem, we have developed a machine-vision system which incorporates an advanced iterative search procedure and Hough transform feature extraction, and which avoids image segmentation, to determine the fine pose of objects. Inputs to the system include images of the object from various camera views, a coarse pose estimate, and a CAD object model. A windowing grid is applied to the scene image(s) and a single feature hypothesis is extracted from each window. A modified Newton-Raphson iterative search method is used to evaluate mappings between the image features and model features. The search attempts to optimize a cost function based on the perpendicular distance between linear features. Results have shown that the described fine pose determination method can achieve subpixel accuracy when applied to images of various geometric shapes. The system has been applied to a number of applications including space rendezvous and docking, ordnance defusing, and missile tracking.
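A hedged sketch of the kind of cost the search optimizes: residuals are perpendicular distances from projected model-edge endpoints to hypothesized image lines. The `project` callable, the normalized (a, b, c) line parameterization, and the use of SciPy's least_squares in place of the paper's modified Newton-Raphson search are all assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

def perpendicular_residuals(pose, image_lines, model_edges, project):
    """Residuals for fine pose refinement.

    image_lines: hypothesized lines as (a, b, c) with a^2 + b^2 = 1,
                 so a*x + b*y + c is a signed perpendicular distance.
    model_edges: 3D model edges paired with those lines.
    project:     assumed callable mapping (pose, edge) to 2D endpoints.
    """
    res = []
    for (a, b, c), edge in zip(image_lines, model_edges):
        for px, py in project(pose, edge):
            res.append(a * px + b * py + c)
    return np.asarray(res)

# fine_pose = least_squares(perpendicular_residuals, coarse_pose,
#                           args=(image_lines, model_edges, project)).x
```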
Geometric features in images of polyhedra
Raashid Malik, Hyeon-June Kim
In model-based object recognition, the features used to describe a model often represent various geometric properties of the model. A difficulty arises in recognizing a 3D object when the orientation of the object or the view angle of the camera in 3-space is unknown: measurements of the geometric features of an object in a 2D image vary with the view direction. The variations of measured features may be expressed using probability density functions, which completely characterize the observed variations. In this paper we introduce a recognition scheme based on the probabilistic analysis of view variations of geometric features. Our previous work quantified the view variations of a certain pair of features (referred to as quadrature line ratios) for planar surfaces; these features are scale invariant and image-rotation invariant. That work is now extended to a complete 3D convex polyhedral object recognition scheme. We derive the joint density of two pairs of features measured from two non-coplanar faces of an object. Likelihood functions based on this density have been developed for each aspect of a polyhedral object and used in the recognition scheme. Experiments have been conducted to verify the effectiveness of the proposed scheme.
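The decision stage can be pictured as a maximum-likelihood choice over the stored aspect densities; a minimal sketch, assuming the fitted joint densities are available as callables (e.g., kernel density estimates), which is an illustrative simplification of the paper's scheme.

```python
def classify_ml(measured_pair, aspect_likelihoods):
    """Pick the (object, aspect) whose joint feature density scores the
    measured pair of feature pairs highest.

    aspect_likelihoods: dict mapping (object, aspect) -> callable that
    evaluates the fitted joint density at a measurement (assumed given).
    """
    return max(aspect_likelihoods,
               key=lambda k: aspect_likelihoods[k](measured_pair))
```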
Qualitative and quantitative characterization of surface and volumetric properties of objects for recognition
Stoyanka D. Zlateva
Recently advanced computational theories of 3D shape representation for recognition have focused on the choice between viewer-centered and object-centered representations. Both approaches rely on establishing a correspondence between image data and prototypical knowledge of object shape. This paper discusses the mathematical structures needed for organizing prototypical knowledge of object shape in a way that relates naturally to perceptual categories and thus allows for a flexible and efficient recognition process. The representational schema consists of a configuration of boundary-based constituent parts which build the reference frame for qualitative and quantitative shape attributes. The decomposition into constituent parts maximizes the convex regions of the bounding surface and relies on extending the local classification into elliptic, hyperbolic, planar, and parabolic types to globally convex and nonconvex surface regions. The surface type of the parts guides, and is preserved in, a subsequent part approximation through generalized cones as volumetric primitives. This approach allows for a consistent characterization of the surface and volumetric properties of object shape. A secondary segmentation into sub-parts and associated features is defined by the surface type and the type of change in cross-section area along the axis. The two segmentation levels allow for a detailed and elaborate shape description. We show examples of shape description and discuss the representation in relation to the viewer-centered and object-centered approaches to recognition.
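The local classification referred to above follows directly from the signs of the Gaussian curvature K and mean curvature H at each surface point; a minimal sketch (the tolerance `eps` is an assumption):

```python
import numpy as np

def surface_type(K, H, eps=1e-6):
    """Classify surface points from Gaussian (K) and mean (H) curvature:
    elliptic (K > 0), hyperbolic (K < 0), parabolic (K = 0, H != 0),
    planar (K = 0, H = 0)."""
    K, H = np.asarray(K), np.asarray(H)
    out = np.empty(K.shape, dtype=object)
    out[np.abs(K) < eps] = 'parabolic'
    out[(np.abs(K) < eps) & (np.abs(H) < eps)] = 'planar'
    out[K > eps] = 'elliptic'
    out[K < -eps] = 'hyperbolic'
    return out
```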
Competitive stereo correspondence framework
El-Sayed H. El-Konyaly, Sabry Fouad Saraya, Y. Atwa
At the heart of the binocular stereo approach lies the task of stereo matching, i.e., solving for correspondences. Solving the correspondence problem accurately, reliably, and efficiently depends on the type of features used and the computational strategy employed. Similarity is the guiding principle for the solution, with the premise that corresponding features will remain similar in the two images. Yet, because of factors such as noise, shadows, occlusions, and perspective effects, the appearance of corresponding features will differ between the two images. Moreover, deriving a matching primitive that has adequate power to resolve ambiguities and is truly invariant with respect to the viewing geometry is a difficult task. This paper introduces a competitive stereo correspondence (CSC) framework that resolves these ambiguities. It is heuristic, iterative, and feature-based. Extensive experimentation is successfully carried out on real-world scenes of varying complexity to evaluate the performance of the CSC framework.
Statistical framework for stereo
This paper deals with stereo matching, which is reformulated as a statistical pattern recognition problem. In stereo, the computation of correspondences of image points in the right and left image is viewed as a two-class pattern recognition problem. The two matching left-right points are said to constitute class 1 (matching) and the points in the neighborhood of these points form class 2 (non-matching). We have argued before that matching can be drastically improved by using several features rather than just graylevels (usually called area-based matching) or edges (usually called edge-based matching). Based on this formulation of matching as a pattern recognition problem, well-known theories for optimizing feature extraction and feature selection can be applied to stereo as well. In the paper we show the results of experiments that support the statistical framework for stereo and show how the performance of a stereo system can be improved by taking into account the findings of statistical pattern recognition.
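A sketch of the multi-feature view of matching: for each candidate disparity, a feature-difference vector is formed and scored. The weighted L1 score below stands in for whatever classifier statistical pattern recognition would supply; the feature-stack layout and weight vector are assumptions.

```python
import numpy as np

def best_disparity(feats_left, feats_right, x, y, disparities, w):
    """Choose the disparity whose feature-difference vector looks most
    like class 1 (matching). feats_*: arrays of shape (n_features, H, W),
    e.g. graylevel, edge strength, and other per-pixel features; assumes
    x - d stays inside the image."""
    fl = feats_left[:, y, x]
    scores = [w @ np.abs(fl - feats_right[:, y, x - d]) for d in disparities]
    return disparities[int(np.argmin(scores))]
```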
Modelization of fetal cranial contour from ultrasound axial slices
Eric Duquenoy, Abdelmalik Taleb-Ahmed, Serge Reboul, et al.
The problem of the choice of slice angles at the time of diagnosis of fetal brain malformations is linked to the position of the fetus inside the uterus. The 3D reconstruction of internal parts of the brain, and especially the corpus callosum, can help to detect some malformations. This kind of reconstruction passes through several steps, all of which depend on the initial segmentation step. The main difficulties of the segmentation are linked on the one hand to the inherent noise of ultrasound imaging and on the other hand to the matching of views in the 2D sequence to be processed. The 3D reconstruction stage requires the definition of a marker in the sequence to be processed. In agreement with physicians, we have used the cranial contour as the reference, first because it is considered invariable and fixed, and second because its contrast is more pronounced than that of the other structures (owing to its cartilaginous nature). Nevertheless, classic segmentation techniques proved ineffective (open contours, too much noise). We have therefore developed an algorithm that automatically defines the ellipse. This method is based on a parametrically deformable model using elliptic Fourier decomposition.
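A compact way to see why elliptic Fourier decomposition suits this task: truncating the Fourier series of the complex contour z(t) = x(t) + i y(t) to its first harmonic already yields an ellipse, so keeping a few harmonics gives a parametrically deformable, near-elliptic model. A minimal sketch, assuming a uniformly sampled closed contour:

```python
import numpy as np

def fourier_contour_model(xs, ys, n_harmonics=1):
    """Truncated Fourier model of a closed contour. With n_harmonics=1
    the reconstruction is exactly an ellipse (plus the centroid term);
    more harmonics give a deformable near-ellipse."""
    z = np.asarray(xs, float) + 1j * np.asarray(ys, float)
    Z = np.fft.fft(z)
    keep = np.zeros_like(Z)
    keep[0] = Z[0]                              # centroid term
    keep[1:n_harmonics + 1] = Z[1:n_harmonics + 1]
    keep[-n_harmonics:] = Z[-n_harmonics:]      # negative frequencies
    smooth = np.fft.ifft(keep)
    return smooth.real, smooth.imag
```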
Neural Nets for Pattern Recognition and Machine Vision
Feature space trajectory neural net classifier: 8-class distortion-invariant tests
Leonard Neiberg, David P. Casasent, Robert J. Fontana, et al.
A novel neural network for distortion-invariant pattern recognition is described. Image regions of interest are determined using a detection stage; each region is then enhanced (the steps used are detailed), features are extracted (new Gabor wavelet features are used), and these features are used to classify the contents of each input region. A new feature space trajectory neural network (FST NN) classifier is used. A new eight-class database is used; a new multilayer NN that calculates the necessary distance measures is detailed, and its low storage and on-line computational load requirements are noted. The ability of the adaptive FST algorithm to reduce network complexity while achieving excellent performance is demonstrated. The ability of this neural network to reject clutter (false-alarm) inputs is demonstrated, and time-history processing to further reduce false alarms is discussed. Hardware and commercial realizations are noted.
Automatic target identification using neural networks
Mahmoud A. Abdallah, Tayib I. Samu, William A. Grissom
Neural network theories are applied to attain human-like performance in areas such as speech recognition, statistical mapping, and target recognition or identification. In target identification, one of the difficult tasks has been the extraction of features used to train the neural network that subsequently performs the identification. The purpose of this paper is to describe the development of an automatic target identification system using features extracted from a specific class of targets. The extracted features were graphical representations of the targets' silhouettes. Image processing techniques and some fast Fourier transform (FFT) properties were implemented to extract the features. The FFT eliminates variations in the extracted features due to rotation or scaling. A neural network was trained with the extracted features using the Learning Vector Quantization paradigm. An identification system was set up to test the algorithm. The image processing software was interfaced with the MATLAB Neural Network Toolbox via a computer program written in C to automate the target identification process. The system performed well, as it classified the objects used to train it irrespective of rotation, scaling, and translation. This automatic target identification system had a classification success rate of about 95%.
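One way the FFT delivers such invariance, sketched under assumptions (a centroid-distance silhouette signature rather than the paper's exact graphical features): centering removes translation, a rotation of the silhouette appears as a cyclic shift of the signature and so leaves FFT magnitudes unchanged, and dividing by the DC term removes scale.

```python
import numpy as np

def silhouette_features(boundary_xy, n_samples=64, n_coeffs=16):
    """Translation/rotation/scale-tolerant silhouette features from FFT
    magnitudes of a centroid-distance signature. Index-based resampling
    (rather than arc-length) keeps the sketch short."""
    pts = np.asarray(boundary_xy, dtype=float)
    pts -= pts.mean(axis=0)                      # translation invariance
    r = np.hypot(pts[:, 0], pts[:, 1])           # distance-to-centroid signature
    idx = np.linspace(0, len(r) - 1, n_samples).astype(int)
    spec = np.abs(np.fft.fft(r[idx]))            # cyclic-shift invariant
    return spec[1:n_coeffs + 1] / spec[0]        # scale normalization
```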
Optimal path planning for robot navigation by the Hopfield net
G. Castellano, Ettore Stella, Giovanni Attolico, et al.
Navigation in dynamic indoor environments requires a mobile vehicle to follow the planned path while avoiding unexpected obstacles that may be met along it. In this paper an attempt at designing a path planner using a computational model suitable for fast implementation on special-purpose hardware is presented, in which automatic modeling of the scene and its continuous updating are accomplished by means of a recursive ultrasonic-based obstacle avoidance system. From this model a graph representing all the possible paths for the robot in the free space is built using well-known methodologies (configuration space, generalized cones). The task of searching for the shortest path in this graph is solved by means of a neural network based on the Hopfield model, which represents an interesting alternative to classical techniques such as A*. A major advantage of this neural approach is the parallel nature of the resulting network, which allows rapid convergence to a solution when implemented in hardware. Simulation results are shown to illustrate the performance of the Hopfield path planner.
Model of foveal visual preprocessor
Natalia A. Shevtsova, Alain Faure, Arkadi A. Klepatch, et al.
A neural network model of the foveal visual preprocessor (FVP), based on a non-uniform representation of the visual field from the center to the periphery, has been developed. The properties of the foveal neural network and the results of computer simulation and analytical study of the model are considered. The dependence of the sensory tuning of FVP elements on the FVP and moving-stimulus parameters is studied. A possible development of a foveal visual sensor on the basis of the results obtained is discussed.
Predictive neuro-fuzzy controller for multilink robot manipulator
Emre Kaymaz, Sunanda Mitra
A generalized controller based on fuzzy clustering and fuzzy generalized predictive control has been developed for nonlinear systems, including multilink robot manipulators. The proposed controller is particularly useful when the dynamics of the nonlinear system to be controlled are too difficult to solve exactly and the system specification can be obtained in terms of crisp input-output pairs. It inherits the advantages of both fuzzy logic and predictive control. The identification of the nonlinear mapping of the system to be controlled is realized by a three-layer feed-forward neural network model employing input-output data obtained from the system. The speed of convergence of the neural network is improved by the introduction of a fuzzy-logic-controlled backpropagation learning algorithm. The neural network model is then used as a simulation tool to generate the input-output data for developing the predictive fuzzy logic controller for the chosen nonlinear system. The use of fuzzy clustering facilitates automatic generation of membership relations from the input-output data. Unlike a linguistic fuzzy logic controller, which requires approximate knowledge of the shape and number of the membership functions in the input and output universes of discourse, this integrated neuro-fuzzy approach allows one to find the fuzzy relations and the membership functions more accurately. Furthermore, it is not necessary to tune the controller. For a two-link robot manipulator, the performance of this predictive fuzzy controller is shown to be superior to that of a conventional controller employing an ARMA model of the system in terms of accuracy and energy consumption.
Feature Extraction (2D and 3D)
Line sketch extraction using the Hough transform
Raashid Malik, Moon Won Choo
Reliable schemes for line sketch extraction from gray-scale images are investigated. It is often possible to recognize polyhedral objects in an image by capturing the linear features in the image and analyzing corner points. A line sketch is extracted from a gray-scale image using the Hough transform (HT). The gradient directions and magnitudes of pixel points are used to weight the accumulation in Hough space. This leads to significant computational savings and robust results. The research presented here is concerned with utilizing the HT and its derivatives to detect linear features in images and to construct line sketches for higher-level processing such as image representation, description, and coding.
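A minimal sketch of gradient-guided Hough accumulation: each strong edge pixel votes only for line angles consistent with its gradient direction (up to the sign of the gradient), with a magnitude-weighted vote. The bin counts, thresholds, and angular tolerance are illustrative assumptions.

```python
import numpy as np

def gradient_hough(grad_mag, grad_dir, mag_thresh=10.0,
                   n_rho=256, n_theta=180, tol=0.1):
    """Hough line accumulator restricted by gradient direction."""
    h, w = grad_mag.shape
    diag = float(np.hypot(h, w))
    acc = np.zeros((n_rho, n_theta))
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta)
    ys, xs = np.nonzero(grad_mag > mag_thresh)
    for x, y in zip(xs, ys):
        for j, th in enumerate(thetas):
            # angular difference wrapped to [-pi/2, pi/2): a line's normal
            # matches the gradient direction up to sign
            d = (th - grad_dir[y, x] + np.pi / 2) % np.pi - np.pi / 2
            if abs(d) > tol:
                continue                        # vote only near gradient dir
            rho = x * np.cos(th) + y * np.sin(th)
            i = int((rho + diag) * (n_rho - 1) / (2 * diag))
            acc[i, j] += grad_mag[y, x]         # magnitude-weighted vote
    return acc, thetas
```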
Line following for a mobile robot
Bradley O. Matthews, Michael A. Ruthemeyer, David Perdue, et al.
A mobile robot has been designed which incorporates line following, obstacle avoidance, and speed control. The line following algorithm images two windows and locates the centroid of the brightest target in each window. From these two centroids and the knowledge that the points lie in the ground plane, the equation of the line is developed. The angle of the line and its minimum distance from the robot centroid are then calculated and used in the steering control. This robot was designed for the 1995 Automated Unmanned Vehicle Society/Society of Automotive Engineers contest and won first place in the design competition.
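A minimal sketch of the geometry described: given the two ground-plane centroids, compute the line's angle in the robot frame and the signed minimum distance from the robot centroid, then blend the two into a steering correction. The gains and sign conventions are assumptions, not the contest robot's actual tuning.

```python
import numpy as np

def steering_command(c_near, c_far, gains=(1.0, 1.0)):
    """Steering correction from two ground-plane line centroids.

    Robot frame assumed: origin at the robot centroid, x pointing forward.
    """
    p1, p2 = np.asarray(c_near, float), np.asarray(c_far, float)
    d = p2 - p1
    angle = np.arctan2(d[1], d[0])              # line angle in robot frame
    # signed minimum distance from the origin to the line through p1, p2
    dist = (d[0] * (-p1[1]) - d[1] * (-p1[0])) / np.linalg.norm(d)
    k_ang, k_dist = gains
    return k_ang * angle + k_dist * dist        # combined correction
```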
Issues for data reduction of dense three-dimensional data
Joseph H. Nurre, Jennifer J. Whitestone, Dennis B. Burnsides, et al.
Acquiring a large quantity of 3D data has become commonplace with the advent of new technologies. Reducing the number of data points improves processing speed and storage requirements. Astute data reduction requires an understanding of the correlation between data measures and geometric measures. These relationships depend on the data reduction algorithm used. This paper investigates these relationships for a small number of data reduction algorithms. A framework is presented for tracking these changes and for assisting a user in identifying the most appropriate data reduction method for their application.
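As one concrete example of the kind of reduction algorithm under study, a voxel-grid scheme is sketched below: all points in a cell are replaced by their mean, after which geometric error can be measured as each original point's distance to the reduced set. The cell size is an assumption; this is not necessarily one of the paper's algorithms.

```python
import numpy as np

def voxel_grid_reduce(points, cell):
    """Reduce a dense 3D point set by averaging points per voxel of
    side length `cell`."""
    points = np.asarray(points, dtype=float)
    keys = np.floor(points / cell).astype(np.int64)
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    sums = np.zeros((inv.max() + 1, 3))
    counts = np.zeros(inv.max() + 1)
    np.add.at(sums, inv, points)                # accumulate per voxel
    np.add.at(counts, inv, 1)
    return sums / counts[:, None]               # one mean point per voxel
```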
Three-dimensional thinning by neural networks
Jun Shen, Wei Shen
3D thinning is widely used in 3D object representation in computer vision and in trajectory planning in robotics to find the topological structure of the free space. In the present paper, we propose a 3D image thinning method based on neural networks. Each voxel in the 3D image corresponds to a set of neurons, called a 3D Thinron, in the network. Taking the 3D Thinron as the elementary unit, the global structure of the network is a 3D array in which each Thinron is connected with its 26 neighbors in the 3 x 3 x 3 neighborhood. Within the Thinron itself, the neurons are organized in multiple layers. In the first layer, we have neurons for boundary analysis, connectivity analysis, and connectivity verification, taking as input the voxels in the 3 x 3 x 3 neighborhood and the intermediate outputs of neighboring Thinrons. In the second layer, we have neurons for synthetic analysis to give the intermediate output of the Thinron. In the third layer, we have the decision neurons whose state determines the final output. All neurons in the Thinron are Widrow adaline neurons, except the connectivity analysis and verification neurons, which are nonlinear. With the 3D Thinron neural network, the state transitions of the network take place automatically, and the network converges to a final steady state, which gives the resulting medial surface of the 3D objects, preserving the connectivity of the initial image. The method is simulated and tested on 3D images, and experimental results are reported.
Plane surfaces characterization by stereo vision and manipulation: a method to enhance assembly cell efficiency
Roberto Da Forno, Francesco Angrilli
This paper analyzes a strategy for the characterization of plane surfaces using stereo vision and manipulation. Characterization involves three steps: definition of the orientation (the unit normal vector of the plane), definition of the figure's center of mass, and definition of a proper frame of reference fixed to the plane. When all these quantities are defined, the plane can be considered characterized. In this work, plane orientation is obtained using stereo vision and structured light. The second step is solved by manipulation, orienting the plane orthogonally to the camera's focal axis to obtain the figure's center of mass. The frame of reference fixed to the plane can then be placed, for example, at the center of mass. In this work, the kinematic analysis is fully developed for a robot with six degrees of freedom. The proposed method can be applied to enhance the efficiency of robotized assembly cells. The main problem in assembly is in fact the dimensional consistency of the assembled parts: parts with machining errors beyond a fixed limit can cause plant stoppages. The proposed method can be used to avoid this problem, or at least to extend the dimensional limits.
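The geometric core of the first and third steps can be sketched as a least-squares plane fit: the SVD of the centered point cloud gives the unit normal (the direction of least variance) and two in-plane axes that, with the centroid, define a frame fixed to the plane. The point source (stereo with structured light) is taken as given; the rest of the paper's manipulation strategy is not reproduced.

```python
import numpy as np

def plane_frame(points):
    """Fit a plane to 3D points and return a frame fixed to it."""
    P = np.asarray(points, dtype=float)
    origin = P.mean(axis=0)            # candidate frame origin (centroid)
    _, _, Vt = np.linalg.svd(P - origin)
    x_axis, y_axis = Vt[0], Vt[1]      # in-plane axes (unit vectors)
    normal = Vt[2]                     # least-variance direction = normal
    return origin, x_axis, y_axis, normal
```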
Algorithm of invariant pattern recognition using redundant Hough transform
Michael A. Popov, Sergey Ju. Markov
Invariant pattern recognition and the estimation of distortion parameters using a redundant Hough transform (RHT) are considered. The authors propose an RHT that differs from the classical Hough transform (HT) by replacing the 'voting' procedure with a 'double voting' procedure. The RHT array has the following property (in contrast to the HT array): translation and rotation distortions in the image plane reduce to cyclic shifts along the coordinate axes in the HT plane for all values of the distortions. The classical HT satisfies this property only for some distortion values. This property permits a fairly simple construction of an invariant pattern recognition algorithm. The algorithm for recognition and distortion estimation is presented. It is based on the successive application of the RHT and a modified Walsh-Hadamard transform. As a result, a set of features is formed which are invariant to image noise and to the affine distortions of rotation and translation. On the basis of these features, the object is classified and the values of the distortions are determined. The structural scheme of an automatic recognition system that uses the algorithm is presented. Experimental studies of the algorithm on aircraft imagery show good performance even with noisy images.
Detection and Segmentation in Machine Vision
Image segmentation using Gaussian curvature
Neelima Shrikhande, Sripriya Ramaswamy
One of the central problems of computer vision is the segmentation of images into salient features such as edges and surfaces. Different kinds of similarity criteria can be used to group related pixels together. One such criterion is the curvature of surfaces in an image of a multiobject scene that contains several objects with different shapes. In practice, however, curvature is difficult to calculate because small amounts of noise can cause large errors in the computation of first and second derivatives. In this paper, we use a discrete approximation of Gaussian curvature that is efficient to compute. The approximation is used to segment the image into individual surfaces. Both synthetic and real images have been tested, and the results appear quite encouraging.
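For orientation, here is one discrete approximation of Gaussian curvature for a range image z(x, y), using finite differences with Gaussian pre-smoothing to tame the derivative noise mentioned above. This is a generic Monge-patch formula, not necessarily the paper's exact approximation; the smoothing scale is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_curvature(z, smooth_sigma=2.0):
    """Discrete Gaussian curvature of a range image via the Monge-patch
    formula K = (z_xx*z_yy - z_xy^2) / (1 + z_x^2 + z_y^2)^2."""
    z = gaussian_filter(np.asarray(z, float), smooth_sigma)
    zy, zx = np.gradient(z)            # first derivatives (rows, cols)
    zxy, zxx = np.gradient(zx)         # second derivatives of z_x
    zyy, _ = np.gradient(zy)           # second derivative of z_y
    return (zxx * zyy - zxy**2) / (1.0 + zx**2 + zy**2) ** 2
```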
Fusion and optimized Gabor filter design for object detection
David Weber, David P. Casasent
We consider the problem of detecting objects in images using one filter based on different 2D Gabor functions. By detection, we mean locating multiple classes of targets with distortions present and in a clutter background; it is also desirable to minimize false alarms due to clutter. Gabor functions (GFs) are Gaussian functions modulated by complex sinusoids. The imaginary (real) part of a GF has been shown to be a good edge (blob) detector. In this work, we use a single filter which is a linear combination of the real and imaginary parts of several GFs; we refer to this as a macro Gabor filter. It is correlated with an input image and then thresholded to detect targets. The new aspects are: combining the real and imaginary parts of GFs into one filter; separately optimizing the parameters of the GFs by controlling the shape of the correlation outputs for true classes and clutter; and separately optimizing the linear combination coefficients using a new square-law perceptron to detect hot and cold objects. We show multi-class distortion-invariant detection results with better performance than obtained with other methods.
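A minimal sketch of the construction (the combination weights would come from the paper's optimization; here they are plain inputs, and all component filters are assumed to share one support size):

```python
import numpy as np

def gabor(size, freq, theta, sigma):
    """Complex 2D Gabor function: a Gaussian envelope modulated by a
    complex sinusoid of spatial frequency `freq` at orientation `theta`.
    Its imaginary part acts as an edge detector, its real part as a
    blob detector."""
    r = np.arange(size) - size // 2
    X, Y = np.meshgrid(r, r)
    u = X * np.cos(theta) + Y * np.sin(theta)
    envelope = np.exp(-(X**2 + Y**2) / (2.0 * sigma**2))
    return envelope * np.exp(2j * np.pi * freq * u)

def macro_gabor(params, coeffs):
    """Single macro Gabor filter: a linear combination of the real and
    imaginary parts of several GFs. `params` is a list of
    (size, freq, theta, sigma); `coeffs` has two weights per GF."""
    parts = []
    for size, freq, theta, sigma in params:
        g = gabor(size, freq, theta, sigma)
        parts += [g.real, g.imag]
    return sum(c * p for c, p in zip(coeffs, parts))
```

Correlating the resulting single filter with an input image and thresholding the output then performs the detection step described above.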
Localization of significant 3D objects in 2D images for generic vision tasks
Marielle Mokhtari, Robert Bergevin
Computer vision experiments are too seldom linked to practical applications; rather, they deal with typical laboratory setups under controlled conditions. For instance, most object recognition experiments are based on specific models used under limiting constraints. Our work proposes a general framework for rapidly locating significant 3D objects in 2D static images of medium to high complexity, as a prerequisite step to recognition and interpretation when no a priori knowledge of the contents of the scene is assumed. In this paper, a definition of generic objects is proposed, covering the structures implied in the image. Under this framework, it must be possible to locate generic objects and assign a significance figure to each one for any image fed to the system. The most significant structure in a given image becomes the focus of interest of the system, determining subsequent tasks (such as robot moves, image acquisition, and processing). A survey of existing strategies for locating 3D objects in 2D images is first presented, and our approach is defined relative to these strategies. Perceptual grouping paradigms leading to the structural organization of the components of an image are at the core of our approach.
New methods of measuring and calibrating robots
Hartmut Janocha, Bernd Diewald
ISO 9283 and RIA R15.05 define industrial robot parameters which are used to compare the performance of different robots. Hitherto, however, no suitable measurement systems have been available. ICAROS is a system which combines photogrammetric procedures with an inertial navigation system. For the first time, this combination allows high-precision static and dynamic measurement of the position as well as the orientation of the robot end effector. Thus, not only can the measurement data for determining all industrial robot parameters be acquired; by integrating a new overall calibration procedure, ICAROS also allows the absolute pose errors of the robot to be reduced to the range of its repeatability. The integration of both system components as well as measurement and calibration results are presented in this paper, using a six-axis robot as an example. A further approach, also presented here, takes into consideration not only the individual robot errors but also the tolerances of workpieces. This allows the adjustment of off-line robot programs based on inexact or idealized CAD data in any pose, so that the robot position defined relative to the workpiece to be processed is achieved as required. This includes the possibility of transferring taught robot programs to other devices without additional expenditure. The adjustment is based on measuring the robot position using two miniaturized CCD cameras mounted near the end effector, which are carried along by the robot during the correction phase. In the area viewed by both cameras, the robot position is determined in relation to prominent geometric elements, e.g., lines or holes. The nominal data to be compared with these measurements can either be calculated in modern off-line programming systems during robot programming, or determined at a so-called master robot if a transfer of the robot program is desired.
Watershed transformation of time series of medical thermal images
Dietrich W. Paulus, Torsten Greiner, Christian Knuevener
In this paper, we demonstrate how the watershed transform can be applied to series of thermal medical images to compute important features for physiological interpretation. This enables automatic physiological analysis of neural features that was not possible otherwise. The transform as described in the literature has some minor algorithmic errors and inconsistencies which usually cause little trouble. These problems occur on flat plateaus, where no unique watershed can be detected. After a short formal description of the transform, we describe and eliminate these deficiencies and introduce a modified segmentation method which handles these plateaus in the intuitively expected way. In our particular medical applications, visible differences between the new segmentation and the old one can be noticed. We contrast our results with those obtained by the detection of isothermic regions. Features of the segmented regions are evaluated as a function of time and used for medical and physiological interpretation. An outlook describes current research in the fusion of visual and thermal sensor images for medical research.
Morphological slope filters
Ivan R. Terol-Villalobos
Morphological image filters play a fundamental role in image segmentation techniques. In mathematical morphology, the geodesic transformation (used to build reconstruction transformations) and morphological gradients are the basis of the watershed-plus-markers approach, which is the most commonly used technique to segment an image. Here, we propose new filters based on the notion of morphological gradients and the geodesic transformation. These new filters have interesting properties and provide essential contrast in the images, but they are not increasing transformations. Using these filters, we can modify the minima (maxima) of images to obtain a good segmentation when the traditional transformation, the watershed, is applied.
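For reference, the classical morphological gradients from which such slope filters start are sketched below (the Beucher gradient plus the two half-gradients); the paper's new filters themselves are not reproduced here.

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def morphological_gradients(img, size=3):
    """Beucher gradient (dilation - erosion) and the external/internal
    half-gradients, using a flat size x size structuring element."""
    f = np.asarray(img, dtype=float)
    d = grey_dilation(f, size=(size, size))
    e = grey_erosion(f, size=(size, size))
    return d - e, d - f, f - e   # gradient, external, internal
```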
Vector quantization of multiresolution morphological pyramids for efficient coding of images
Zhiyang Zhang, Sunanda Mitra
The morphological pyramid has proven to be a useful tool in image compression due to its low computational complexity, simple implementation, and good compression performance based on minimization of entropy. Several morphology-based pyramid decomposition techniques already exist. These techniques use morphological filters prior to down-sampling of the images. The coding schemes developed commonly omit the first error image of the error pyramid to achieve high compression ratios; however, fine image details may be lost in this process. In order to obtain high-quality lossy images, an estimator involving connectivity-preserving filters for the first error image has been used. By using this estimator, the bits per pixel required to code the first error image can be reduced by 30 to 40 percent to obtain `near lossless' compression. In this paper, we apply variable vector quantization (VQ) to pyramid coding. We compare the performance of the above estimator to that of the VQ scheme. For multi-level pyramids, we discuss accumulative errors and use a modified pyramid generation structure to reduce them. We perform our comparison on two standard images and use peak signal-to-noise ratio to judge compression efficiency and visual quality.
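A minimal sketch of the kind of morphological pyramid such coding schemes build on: each level is morphologically filtered and down-sampled, and the error image is the difference between a level and its up-sampled successor. The open-close filter and factor-2 sampling are assumptions; the paper's VQ and estimator stages would operate on the error images.

```python
import numpy as np
from scipy.ndimage import grey_opening, grey_closing

def morph_pyramid(img, levels=3):
    """Morphological pyramid: returns the level images and the error
    images needed to reconstruct each level from the next."""
    pyr, errs = [np.asarray(img, dtype=float)], []
    for _ in range(levels):
        f = grey_closing(grey_opening(pyr[-1], size=(2, 2)), size=(2, 2))
        down = f[::2, ::2]                       # down-sample by 2
        up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)
        h, w = pyr[-1].shape
        errs.append(pyr[-1] - up[:h, :w])        # per-level error image
        pyr.append(down)
    return pyr, errs
```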
Segmentation of range images using morphological operations: review and examples
Linda Ann Gee, Mongi A. Abidi
Image segmentation involves calculating the position of object boundaries. For scene analysis, the intent is to differentiate objects from clutter by means of preprocessing. The object of this paper is to examine and discuss two morphological techniques for preprocessing and segmenting range images. A Morphological Watershed Algorithm has been studied in detail for segmenting range images. This algorithm uses a unique approach for defining the boundaries of objects from a morphological gradient. Several sets of range images are used as input to the algorithm to demonstrate the flexibility of the watershed technique and the experimental results support this approach as an effective method for segmenting range images. Morphological image operators present another means for segmenting range images. In particular, the results from implementing gray-scale morphological techniques indicate that these operators are useful for segmentation. This is made possible by converting a range image of a scene to a gray-scale image representation. The result represents the umbra of the surface of the objects within the scene. By applying morphological operations to the gray values of the image, the operations are applied to the umbra. Each pixel represents a point of the object's umbra, thereby yielding scene segmentation. The techniques that are discussed are found to be useful for preprocessing and segmenting range images which are direct extensions to object recognition, scene analysis, and image understanding.