Proceedings Volume 2059

Sensor Fusion VI

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 20 August 1993
Contents: 9 Sessions, 45 Papers, 0 Presentations
Conference: Optical Tools for Manufacturing and Advanced Automation 1993
Volume Number: 2059

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
Sessions:
  • Sensor Models and Validation
  • Robust Methods for 3D Perception
  • Task-Directed Sensing
  • Feature Matching and Tracking
  • Task-Directed Sensing
  • 3D Sensing and Robotic Applications
  • 3D Shape Reconstruction
  • Task-Directed Sensing
  • Feature Matching and Tracking
  • Signal Analysis and Understanding
  • Sensor Models and Validation
  • Poster Session
  • Proceedings-Only Papers
  • 3D Sensing and Robotic Applications
Sensor Models and Validation
Sensor models and a framework for sensor management
Alex P. Gaskell, Penelope J. Probert
We describe the use of Bayesian belief networks and decision theoretic principles for sensor management in multi-sensor systems. This framework provides a way of representing sensory data and choosing actions under uncertainty. The work considers how to distribute functionality between sensors and the controller. Use is made of logical sensors based on complementary physical sensors to provide information at the task level of abstraction represented within the network. We are applying these methods in the area of low level planning in mobile robotics. A key feature of the work is the development of quantified models to represent diverse sensors, in particular the sonar array and infra-red triangulation sensors we use on our AGV. We need to develop a model which can handle these very different sensors but provides a common interface to the sensor management process. We do this by quantifying the uncertainty through probabilistic models of the sensors, taking into account their physical characteristics and interaction with the expected environment. Modelling the sensor characteristics to an appropriate level of detail has the advantage of giving more accurate and robust mapping between the physical and logical sensor, as well as a better understanding of environmental dependency and its limitations. We describe a model of a sonar array, which explicitly takes into account features such as beam-width and ranging errors, and its integration into the sensor management process.
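As a rough illustration of the decision-theoretic sensor-selection idea described in this abstract (not the authors' implementation), the sketch below chooses between two hypothetical range sensors, a wide-beam sonar and a narrower infra-red triangulation sensor, by comparing the expected posterior entropy of a discretized range belief under simple Gaussian error models. The grid resolution, error standard deviations, and noise-free-reading simplification are all assumptions made for the example.

```python
# Hypothetical sketch: decision-theoretic sensor selection over a discretized
# range belief, in the spirit of the Bayesian framework described above.
import numpy as np

ranges = np.linspace(0.1, 5.0, 200)                 # candidate obstacle ranges (m)
belief = np.full(ranges.shape, 1.0 / ranges.size)   # uniform prior over range

def likelihood(measured, sigma):
    """p(z | r) for a range sensor with Gaussian error of std. dev. sigma."""
    return np.exp(-0.5 * ((measured - ranges) / sigma) ** 2)

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_posterior_entropy(sigma):
    """Average posterior entropy after one reading, marginalised over the prior.
    For simplicity the simulated reading is taken as noise-free."""
    total = 0.0
    for r_true, prior in zip(ranges, belief):
        post = belief * likelihood(r_true, sigma)
        post /= post.sum()
        total += prior * entropy(post)
    return total

sensors = {"sonar": 0.30, "ir_triangulation": 0.05}   # assumed error std. devs (m)
best = min(sensors, key=lambda s: expected_posterior_entropy(sensors[s]))
print("sensor chosen by expected information gain:", best)
```

In a fuller treatment the expectation would be taken over noisy readings and the utility would also weigh sensing cost, as the decision-theoretic framing suggests.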
Variable state dimension filter applied to active camera calibration
Philip F. McLauchlan, David W. Murray
We present a new form of Kalman filter that allows the size of the state vector estimated by the filter to vary in an arbitrary way. The state vector is structured as a single global state vector and any number of local state vectors. The local state vectors are allowed to be coupled by the system plant equations to the global state vector, but not to each other. This means that the inverse covariance matrix contains mostly zeroes, and this allows the Kalman filter to be formulated such that the time complexity is a linear function of the number of local states, rather than cubic as would be the case with the normal Kalman filter. Local states may be added to or removed from the state vector at any time. The filter does not strictly allow state dynamics, but approximate methods are available under certain assumptions. We have implemented an active camera calibration algorithm for a high performance head/eye platform, Yorick, using the filter. This uses the trajectories of an arbitrary and changing number of tracked image features to update the calibration parameters over time. The algorithm is fully integrated into a parallel real-time vision system for gaze control.
Design and experimental validation of a multiband-radar data fusion system
Stelios C.A. Thomopoulos, Nickens N. Okello
A robust constant false alarm rate (CFAR) distributed detection system that operates in heavy clutter with unknown distribution is presented. The system is designed to provide CFARness under clutter power fluctuations and robustness under unknown clutter and noise distributions. The system is also designed to operate successfully under different-power sensors and exhibit fault-tolerance in the presence of sensor power fluctuations. The test statistic at each sensor is a robust CFAR t-statistic. In addition to the primary binary decisions, confidence levels are generated with each decision and used in the fusion logic to robustify the fusion performance and eliminate the weaknesses of the Boolean fusion logic. The test statistic and the fusion logic are analyzed theoretically for Weibull and log-normal clutter. The theoretical performance is compared against Monte-Carlo simulations that verify that the system exhibits the desired characteristics of CFARness, robustness, insensitivity to power fluctuations and fault-tolerance. The system is tested with experimental target-in-clear and target-in-clutter data and its experimental performance agrees with the theoretically predicted behavior.
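A minimal sketch of the kind of normalised test statistic and per-decision confidence described above, not the paper's exact statistic or fusion rule; the reference-window size, Weibull shape parameter, and threshold are illustrative assumptions.

```python
# Sketch of a CFAR-style normalised statistic at a single sensor.
import numpy as np

def cfar_statistic(cell_under_test, reference_cells):
    """Normalise the test cell by the sample mean and standard deviation of the
    reference window, so the false-alarm rate is insensitive to the (unknown)
    clutter power."""
    ref = np.asarray(reference_cells, dtype=float)
    return (cell_under_test - ref.mean()) / (ref.std(ddof=1) + 1e-12)

rng = np.random.default_rng(0)
reference = rng.weibull(1.5, size=32)          # clutter-only reference window
threshold = 4.0                                # set from the desired false-alarm rate

for name, cell in [("clutter-only cell", rng.weibull(1.5)),
                   ("target-plus-clutter cell", rng.weibull(1.5) + 5.0)]:
    t = cfar_statistic(cell, reference)
    decision = int(t > threshold)
    confidence = abs(t - threshold)            # crude confidence passed to the fusion centre
    print(f"{name}: statistic={t:.2f} decision={decision} confidence={confidence:.2f}")
```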
Robust Methods for 3D Perception
Robust computational vision
Brian G. Schunck
This paper presents a paradigm for formulating reliable machine vision algorithms using methods from robust statistics. Machine vision is the process of estimating features from images by fitting a model to visual data. Vision research has produced an understanding of the physics and mathematics of visual processes. The fact that computer graphics programs can produce realistic renderings of artificial scenes indicates that our understanding of vision processes must be quite good. The premise of this paper is that the problem in applying computer vision in realistic scenes is not the fault of the theory of vision. We have good models for visual phenomena, but can do a better job of applying the models to images. Our understanding of vision must be used in computations that are robust to the kinds of errors that occur in visual signals. This paper argues that vision algorithms should be formulated using methods from robust regression. The nature of errors in visual signals is discussed, and a prescription for formulating robust algorithms is described. To illustrate the concepts, robust methods have been applied to several problems: surface reconstruction, dynamic stereo, image flow estimation, and edge detection.
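To make the robust-regression prescription concrete, here is a small sketch of iteratively reweighted least squares with Huber weights applied to a line fit contaminated by gross outliers. The paper argues for robust estimators in general rather than this particular weight function, and the constants used are conventional defaults, not values from the paper.

```python
# Illustrative robust regression: IRLS with Huber weights on a 1-D line fit.
import numpy as np

def huber_weights(residuals, k=1.345):
    a = np.abs(residuals)
    w = np.ones_like(a)
    w[a > k] = k / a[a > k]
    return w

def robust_line_fit(x, y, iterations=20):
    A = np.column_stack([x, np.ones_like(x)])
    params = np.linalg.lstsq(A, y, rcond=None)[0]                     # ordinary LS start
    for _ in range(iterations):
        r = y - A @ params
        scale = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12  # MAD scale estimate
        w = huber_weights(r / scale)
        params = np.linalg.lstsq(np.sqrt(w)[:, None] * A, np.sqrt(w) * y, rcond=None)[0]
    return params

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, x.size)
y[::10] += 20.0                                                       # gross outliers
print("slope, intercept:", robust_line_fit(x, y))
```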
Automatic extraction of complex surface models from range images using a trimmed-rational Bezier surface
Pierre Boulanger, Iwao Sekita
This paper presents a new method for the extraction of a rational Bezier surface from a set of data points. The algorithm is divided into four parts. First, a least median square fitting algorithm is used to extract a Bezier surface from the data set. Second, from this initial surface model an analysis of the data set is performed to eliminate outliers. Third, the algorithm then improves the fit over the residual points by modifying the weights of a rational Bezier surface using a non-linear optimization method. A further improvement of the fit is achieved using a new intrinsic parameterization technique. Fourth, an approximation of the region boundary is performed using a NURB with knots. Experimental results show that the current algorithm is robust and can precisely approximate complex surfaces.
Extraction of parametrically defined geometric primitives
Gerhard Roth
Extraction is a generalization of fitting, and is sometimes given the name robust fitting. In ordinary fitting the assumption is made that all the points belong to the curve or surface being fit. In extraction, or robust fitting, this assumption does not hold. Thus an extraction routine must return not only the equation of the best primitive (curve or surface), but also which of the data points are described by this primitive. We give a short description of our extraction algorithm which is based on random sampling. Previously we have shown how our extraction algorithm can deal with curves and surfaces defined implicitly. In this paper we extend this algorithm to curves and surfaces defined parametrically. Being able to extract curves and surfaces defined both implicitly and parametrically makes this algorithm unique. We show a number of experimental results that demonstrate the extraction algorithm in operation.
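The extraction idea, returning both the primitive's parameters and the subset of data points it describes, can be sketched for a 2-D line with a simple random-sampling loop. This is a generic RANSAC-style illustration under assumed sample counts and tolerance, not the authors' algorithm for parametrically defined curves and surfaces.

```python
# Sketch of extraction by random sampling: returns the primitive and its points.
import numpy as np

def extract_line(points, trials=500, tol=0.05, seed=2):
    rng = np.random.default_rng(seed)
    best_params, best_inliers = None, np.zeros(len(points), dtype=bool)
    for _ in range(trials):
        p, q = points[rng.choice(len(points), 2, replace=False)]   # minimal sample
        d = q - p
        n = np.array([-d[1], d[0]]) / (np.linalg.norm(d) + 1e-12)  # unit normal to the line
        dist = np.abs((points - p) @ n)                            # point-to-line distances
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_params, best_inliers = (p, n), inliers
    return best_params, best_inliers

rng = np.random.default_rng(2)
t = rng.uniform(0, 10, 200)
line_pts = np.column_stack([t, 0.5 * t + 1.0]) + rng.normal(0, 0.02, (200, 2))
noise_pts = rng.uniform(0, 10, (200, 2))              # points not on the primitive
params, inliers = extract_line(np.vstack([line_pts, noise_pts]))
print("inliers found:", inliers.sum(), "of", 400)
```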
Robust surface perception in range data
Kim L. Boyer, M. J. Mirza
We present an autonomous, statistically robust, sequential function approximation approach to simultaneous parameterization and organization of (possibly partially occluded) surfaces in noisy, outlier-ridden (not Gaussian) range data. Unlike existing surface characterization techniques, our method generates complete surface hypotheses in parameter space. Given a noisy depth map of an unknown 3D scene, the algorithm first selects appropriate seed points representing possible surfaces. For each non-redundant seed it chooses the best approximating model from a given set of competing models using the Modified Akaike Information Criterion (MCAIC). With this best model, each surface is expanded from its seed over the entire image, and this step is repeated for all seeds. Those points which appear to be outliers with respect to the growing model are not included in the (possibly disconnected) surface. Point regions are removed from each newly grown surface in the prune stage. Noise, outliers, or coincidental surface alignment may cause some points to appear to belong to more than one surface. These ambiguities are resolved by a weighted voting scheme within a 5 X 5 decision window centered around the ambiguous point. The isolated point regions left after the resolve stage are removed and any missing points in the data are filled by the surface having a majority consensus in an 8-neighborhood.
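The model-selection step can be illustrated with a plain AIC comparison between a planar and a biquadratic patch fit; the paper's modified criterion and its seed/grow/prune machinery are not reproduced here, and the candidate models and synthetic data are assumptions.

```python
# Sketch: pick the best competing surface model for a seed region by AIC.
import numpy as np

def fit_and_aic(X, z, design):
    A = design(X)
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    rss = np.sum((z - A @ coeffs) ** 2)
    n, k = len(z), A.shape[1]
    return n * np.log(rss / n + 1e-12) + 2 * k           # Gaussian-noise AIC, up to a constant

planar = lambda X: np.column_stack([X[:, 0], X[:, 1], np.ones(len(X))])
quadric = lambda X: np.column_stack([X[:, 0]**2, X[:, 1]**2, X[:, 0] * X[:, 1],
                                     X[:, 0], X[:, 1], np.ones(len(X))])

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, (100, 2))
z = 0.8 * X[:, 0]**2 - 0.3 * X[:, 1] + rng.normal(0, 0.01, 100)   # curved seed region
scores = {"plane": fit_and_aic(X, z, planar), "quadric": fit_and_aic(X, z, quadric)}
print("selected model:", min(scores, key=scores.get))
```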
New random sampling operator for vision
Charles V. Stewart
A new robust operator is presented which is based on minimizing the probability that a fit to the data could have occurred at random. Through theoretical and experimental analysis we have shown that it is able to find accurate fits and approximately the correct number of inliers to the fit even when fewer than 50% of the points in an image region belong to any one surface.
Task-Directed Sensing
Directed-sensing strategies for feature-relative navigation
John J. Leonard, James G. Bellingham
Many important applications of autonomous underwater vehicles (AUVs) require operations in close proximity to man-made objects or natural bottom topography. In these situations, the vehicle must adapt its trajectory on-line in response to current threats and mission objectives. To provide this capability, we are developing a sonar-based navigation technique that emulates the manner in which a person navigates through an unknown room in the dark: by reaching out for and establishing contact with walls, tables, and chairs, managing transitions from one object to the next as one moves across the room. Our intuition here is that, in many ways, sonar is more like touch than vision. It may be possible to build a vehicle that can effectively use its sonar to `grab' an object of interest, say a cylindrical post for docking, and then `reel itself in' by feeding back sonar range measurements from the object to its dynamic controller. We envision an AUV that can establish `virtual tethers' with arbitrary objects in the water column or on the seabed. Fast, local processing can maintain `contact' with the objects or surfaces of interest. Control laws can be established to utilize streams of measurements from these features to achieve local, feature-relative navigation. While our research is driven by the severe challenges of the subsea environment, we anticipate that the approach will also be useful in land robot applications.
Active sensor fusion for mobile robot exploration and navigation
Robert Mandelbaum, Max Mintz
In this paper we present a paradigm for active sensor fusion and feature integration for the purposes of exploration of a static environment, and subsequent navigation therein. We describe the feature grid representation of the environment which is extensible to a wide range of sensor modalities, allows efficient access of information, and supports inter-agent cooperation. In particular, we have developed a testbed with which we investigate the fusion of data from acoustic range sensors with data from a structured-light sensor capable of delineating object extent. The acoustic sensors are employed for primary detection and localization of an object, as well as for extraction of geometric features such as planar surfaces and corners. Once an object has been identified, an active sensing strategy is invoked, causing the mobile robot to follow a trajectory which brings the object into view of the structured-light sensor. In this way, relatively accurate information can be accumulated regarding object position, extent and orientation.
Exception handling for sensor fusion
G. T. Chavez, Robin R. Murphy
This paper presents a control scheme for handling sensing failures (sensor malfunctions, significant degradations in performance due to changes in the environment, and errant expectations) in sensor fusion for autonomous mobile robots. The advantages of the exception handling mechanism are that it emphasizes a fast response to sensing failures, is able to use only a partial causal model of sensing failure, and leads to a graceful degradation of sensing if the sensing failure cannot be compensated for. The exception handling mechanism consists of two modules: error classification and error recovery. The error classification module in the exception handler attempts to classify the type and source(s) of the error using a modified generate-and-test procedure. If the source of the error is isolated, the error recovery module examines its cache of recovery schemes, which either repair or replace the current sensing configuration. If the failure is due to an error in expectation or cannot be identified, the planner is alerted. Experiments using actual sensor data collected by the CSM Mobile Robotics/Machine Perception Laboratory's Denning mobile robot demonstrate the operation of the exception handling mechanism.
Novel graphical environment for virtual and real-world operations of tracked mobile manipulators
ChuXin Chen, Mohan M. Trivedi, Mir Azam, et al.
A simulation, animation, visualization and interactive control (SAVIC) environment has been developed for the design and operation of an integrated mobile manipulator system. This unique system possesses the abilities for (1) multi-sensor simulation, (2) kinematics and locomotion animation, (3) dynamic motion and manipulation animation, (4) transformation between real and virtual modes within the same graphics system, (5) ease in exchanging software modules and hardware devices between real and virtual world operations, and (6) interfacing with a real robotic system. This paper describes a working system and illustrates the concepts by presenting the simulation, animation and control methodologies for a unique mobile robot with articulated tracks, a manipulator, and sensory modules.
Feature Matching and Tracking
Model-driven vision for in-door navigation
Henrik I. Christensen, N. O. S. Kirkeby, Steen Kristensen, et al.
Task-Directed Sensing
Information acquisition and fusion in the Mobile Perception Laboratory
Bruce A. Draper, Shashi Buluswar, Allen R. Hanson, et al.
The University of Massachusetts Mobile Perception Laboratory (MPL) is an autonomous outdoor vehicle, similar to CMU's NAVLAB II, that was built by UMass as an experimental testbed for high-level computer vision. Our goal in developing MPL is the integration of many of the vision algorithms developed over the past decade, at UMass and elsewhere, into a system which is capable of exhibiting useful goal-oriented autonomous navigation in real-world scenarios. To accomplish such high-level tasks, MPL has to acquire many types of information about its environment, not all of it image related. Rather than performing sensor level fusion of the image data, we have focused on the types of information required and the representations needed to express them. The problem, as we see it, is to integrate the information needed for a specific task (using task-specific and general constraints) by combining the appropriate representations at the appropriate time, whether they are derived from different sensors or from different interpretation techniques applied to a single sensor. We refer to this task as information fusion, rather than sensor fusion.
3D Sensing and Robotic Applications
Laser eye: a new 3D sensor for active vision
Piotr Jasiobedzki, Michael R. M. Jenkin, Evangelos E. Milios, et al.
This paper describes a new sensor that combines visual information from a CCD camera with sparse distance measurements from an infra-red laser range-finder. The camera and the range-finder are coupled together in such a way that their optical axes are parallel. A mirror with a different reflectivity for visible and for infra-red light is used to ensure collinearity of effective optical axes of the camera lens and the range-finder. The range is measured for an object in the center of the camera field of view. The Laser Eye is mounted on a robotic head and is used in an active vision system for an autonomous mobile robot (called ARK).
Robotic 3D dual-drive force/velocity control for tactile sensing and object recognition
Kelly A. Korzeniowski, William A. Wolovich
This work describes the development and testing of a 3-D dual-drive surface tracking controller that enables a robot to track along any specified path on the surface of an object. The dual-drive controller computes the normal and tangent vectors relative to movement along the path. The result is controlled movement in 3-D on the surface of an object. It is assumed that the path is generated by an external Recognizer in such a way that the data points collected by tactile sensing along the path will maximize the probability of correctly identifying the object. This tactile data collection method is referred to as `object-dependent' sensing because the location of the sensing paths is driven by comparisons made by the Recognizer to a model data base. The application for such a data collection system is object recognition tasks in environmental exploration and manipulation.
Informed peg-in-hole insertion using optical sensors
Eric Paulos, John F. Canny
Peg-in-hole insertion is not only a longstanding problem in robotics but the most common automated mechanical assembly task. In this paper we present a high precision, self-calibrating peg-in-hole insertion strategy using several very simple, inexpensive, and accurate optical sensors. The self-calibrating feature allows us to achieve successful dead-reckoning insertions with tolerances of 25 microns without any accurate initial position information for the robot, pegs, or holes. The program we implemented works for any cylindrical peg, and the sensing steps do not depend on the peg diameter, which the program does not know. The key to the strategy is the use of a fixed sensor to localize both a mobile sensor and the peg, while the mobile sensor localizes the hole. Our strategy is extremely fast, localizing pegs as they are en route to their insertion location without pausing. The result is that insertion times are dominated by the transport time between pick and place operations.
3D Shape Reconstruction
Global surface reconstruction from many views of the extremal boundary
W. Brent Seales, Olivier D. Faugeras
We present the results from a working system designed to reconstruct a complete 3D surface description from the extremal boundary of an object. Earlier work has shown that complete surface information (second order differential surface properties) can be recovered at edges generated by the extremal boundary of a 3D surface. In this paper we present new results in applying this theoretical framework to many views of real objects in order to show that many frames can be integrated into a common coordinate system to form a complete 3D model of an object. Our experiments place these multiple frames in a common coordinate system using known motion, if available, or by otherwise employing an algorithm for automatically computing object motion based on our classification of edges in the reconstruction process. We present experimental results on both real and synthetic data. Our experimental results on real objects show that with a calibrated trinocular camera system we can accurately reconstruct a complete surface description of 3D objects.
Building global surface models by purposive and qualitative viewpoint adjustment
Kiriakos N. Kutulakos, W. Brent Seales, Charles R. Dyer
We present an approach for recovering a global surface model of an object from the deformation of the occluding contour using an active (i.e., mobile) observer able to control its motion. In particular, we consider two problems: (1) How can the observer's viewpoint be controlled in order to generate a dense sequence of images that allows incremental reconstruction of an unknown surface? And (2) how can we construct a global surface model from the generated image sequence? We achieve the first goal by purposefully and qualitatively controlling the observer's instantaneous direction of motion in order to control the motion of the visible rim over the surface. We achieve the second goal by using a stationary calibrated trinocular camera rig and a mechanism for controlling the relative position and orientation of the viewed surface with respect to the trinocular rig. Unlike previous shape-from-motion approaches which derive quantitative shape information from an arbitrarily generated sequence of images, we develop a collection of simple and efficient viewing strategies that allow the observer to achieve the global reconstruction goal by maintaining specific geometric relationships with the viewed surface. These relationships depend only on tangent computations on the occluding contour. To demonstrate the feasibility and effectiveness of our approach we apply the developed algorithms to synthetic and real scenes.
Method for registering overlapping range images of arbitrarily shaped surfaces for 3D object reconstruction
Eric Bittar, Stephane Lavallee, Richard Szeliski
This paper presents a method to register overlapping 3-D surfaces which we use to reconstruct entire three-dimensional objects from sets of views. We use a range imaging sensor to digitize the object in several positions. Each pair of overlapping images is then registered using the algorithm developed in this paper. Rather than extracting and matching features, we match the complete surface, which we represent using a collection of points. This enables us to reconstruct smooth free-form objects which may lack sufficient features. Our algorithm is an extension of an algorithm we previously developed to register 3-D surfaces. This algorithm first creates an octree-spline from one set of points to quickly compute point to surface distances. It then uses an iterative nonlinear least squares minimization technique to minimize the sum of squared distances from the data point set to the octree point set. In this paper, we replace the squared distance with a function of the distance, which allows the elimination of points that are not in the shared region between the two sets. Once the object has been reconstructed by merging all the views, a continuous surface model is created from the set of points. This method has been successfully used on the limbs of a dummy and on a human head.
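The registration step can be sketched, in 2-D for brevity, as an iterative closest-point style minimisation in which distances beyond a truncation threshold are dropped, mimicking the paper's replacement of the squared distance by a function that discards points outside the shared region. The octree-spline distance map, the 3-D parameterisation, and the paper's specific robust function are not reproduced; the threshold and demonstration data are assumptions.

```python
# Sketch: rigid registration of two overlapping point sets with truncated distances.
import numpy as np

def nearest_distances(moving, fixed):
    d = np.linalg.norm(moving[:, None, :] - fixed[None, :, :], axis=2)
    idx = d.argmin(axis=1)
    return fixed[idx], d[np.arange(len(moving)), idx]

def register(moving, fixed, iterations=30, trunc=0.5):
    R, t = np.eye(2), np.zeros(2)
    for _ in range(iterations):
        cur = moving @ R.T + t
        match, dist = nearest_distances(cur, fixed)
        keep = dist < trunc                          # drop points outside the overlap
        P, Q = cur[keep], match[keep]
        Pc, Qc = P - P.mean(0), Q - Q.mean(0)
        U, _, Vt = np.linalg.svd(Pc.T @ Qc)          # Kabsch: best incremental rotation
        dR = Vt.T @ U.T
        if np.linalg.det(dR) < 0:                    # guard against a reflection
            Vt[-1] *= -1
            dR = Vt.T @ U.T
        R, t = dR @ R, dR @ (t - P.mean(0)) + Q.mean(0)
    return R, t

rng = np.random.default_rng(4)
fixed = rng.uniform(0.0, 1.0, (200, 2))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
moving = fixed @ R_true.T + np.array([0.2, -0.1])    # same surface under a known motion
R, t = register(moving, fixed)
print("mean alignment error:", np.linalg.norm(moving @ R.T + t - fixed, axis=1).mean())
```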
Accurate segmentation using multiple sources and probability networks
Simon J. Davies, A. David Marshall, Ralph R. Martin
Data from more than one source can be useful in that we can use information from one source to overcome a deficiency in another source. This can be especially beneficial when segmenting an image. We present a method for fusing data from more than one source of information or more than one segmentation method to achieve an accurate segmentation of an object. By this we mean a surface estimation consisting of surface and boundary properties (e.g., orientation, curvature, perimeter, etc.). This paper builds significantly on previous results where we outlined some simple, preliminary concepts used to obtain accurate estimation of the object's surface properties. Objects to be segmented may now consist of curved surfaces and curved boundaries. Probabilistic networks are used to process the variety of data that is available in order to provide the best segmentation results.
Task-Directed Sensing
Landmark-based 3D fusion of SPECT and CT images
Lisa Gottesfeld Brown, Gerald Q. Maguire Jr., Marilyn E. Noz
In this paper we present interactive visualization procedures for registration of SPECT and CT images based on landmarks. Because of the poor anatomic detail available in many SPECT images, registration of SPECT images with other modalities often requires the use of external markers. These markers may correspond to anatomic structures identifiable in the other modality image. In this work, we present a method to nonrigidly register SPECT and CT images based on automatic marker localization and interactive anatomic localization using 3D surface renderings of skin. The images are registered in 3D by fitting low order polynomials which are constrained to be near rigid. The method developed here exploits 3D information to attain greater accuracy and reduces the amount of time needed for expert interaction.
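As a sketch of landmark-based registration with a low-order polynomial, the snippet below fits a 3-D affine transform to paired landmarks by least squares; the near-rigidity constraint and the SPECT/CT specifics of the paper are not modelled, and the landmark coordinates are invented for illustration.

```python
# Sketch: least-squares affine warp from paired 3-D landmarks.
import numpy as np

def fit_affine(src, dst):
    """Least-squares 3-D affine transform mapping src landmarks onto dst landmarks."""
    A = np.column_stack([src, np.ones(len(src))])        # N x 4 design matrix
    coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)     # 4 x 3 affine parameters
    return coeffs

src = np.array([[10, 20, 5], [40, 22, 7], [15, 60, 9], [50, 55, 12], [30, 40, 20]], float)
dst = src @ np.array([[0.98, 0.05, 0], [-0.05, 0.98, 0], [0, 0, 1.0]]) + [2.0, -1.5, 0.5]
T = fit_affine(src, dst)
mapped = np.column_stack([src, np.ones(len(src))]) @ T
print("mean landmark error:", np.linalg.norm(mapped - dst, axis=1).mean())
```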
Feature Matching and Tracking
Model-based vision for car following
Henry Schneiderman, Marilyn Nashman, Ronald Lumia
This paper describes a vision processing algorithm that supports autonomous car following. The algorithm visually tracks the position of a `lead vehicle' from the vantage of a pursuing `chase vehicle.' The algorithm requires a 2-D model of the back of the lead vehicle. This model is composed of line segments corresponding to features that give rise to strong edges. There are seven sequential stages of computation: (1) Extracting edge points; (2) Associating extracted edge points with the model features; (3) Determining the position of each model feature; (4) Determining the model position; (5) Updating the motion model of the object; (6) Predicting the position of the object in the next image; (7) Predicting the location of all object features from the prediction of object position. All processing is confined to the 2-D image plane. The 2-D model location computed in this processing is used to determine the position of the lead vehicle with respect to a 3-D coordinate frame affixed to the chase vehicle. This algorithm has been used as part of a complete system to drive an autonomous vehicle, a High Mobility Multipurpose Wheeled Vehicle (HMMWV) such that it follows a lead vehicle at speeds up to 35 km/hr. The algorithm runs at an update rate of 15 Hertz and has a worst case computational delay of 128 ms. The algorithm is implemented under the NASA/NBS Standard Reference Model for Telerobotic Control System Architecture (NASREM) and runs on a dedicated vision processing engine and a VME-based multiprocessor system.
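Stages (4)-(7), updating a motion model and predicting where the model will appear in the next frame, can be illustrated with a simple alpha-beta tracker on the 2-D model position. The gains, the 15 Hz frame interval taken from the stated update rate, and the fake measurements are assumptions; the paper's own motion model is not specified here.

```python
# Illustrative alpha-beta tracker for a 2-D model position in the image plane.
import numpy as np

class AlphaBetaTracker:
    def __init__(self, x0, alpha=0.85, beta=0.005, dt=1.0 / 15.0):  # 15 Hz update rate
        self.x = np.asarray(x0, dtype=float)   # model position in the image (pixels)
        self.v = np.zeros_like(self.x)         # image-plane velocity (pixels/s)
        self.alpha, self.beta, self.dt = alpha, beta, dt

    def predict(self):
        return self.x + self.v * self.dt       # where to search in the next frame

    def update(self, measured):
        pred = self.predict()
        r = np.asarray(measured, dtype=float) - pred
        self.x = pred + self.alpha * r          # correct the position estimate
        self.v = self.v + (self.beta / self.dt) * r   # correct the velocity estimate
        return self.x

tracker = AlphaBetaTracker([320.0, 240.0])
for frame in range(5):                          # fake measurements drifting right
    measurement = [320.0 + 4.0 * frame, 240.0]
    print("estimate:", np.round(tracker.update(measurement), 1),
          "next prediction:", np.round(tracker.predict(), 1))
```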
Outlier detection and motion segmentation
Philip H. S. Torr, David W. Murray
We present a new method for solving the problem of motion segmentation, identifying the objects within an image moving independently of the background. We utilize the fact that two views of a static 3D point set are linked by a 3 X 3 Fundamental Matrix (F). The Fundamental Matrix contains all the information on structure and motion from a given set of point correspondences and is derived by a least squares method under the assumption that the majority of the image is undergoing a rigid motion. Least squares is the most commonly used method of parameter estimation in computer vision algorithms. However the estimated parameters from a least squares fit can be corrupted beyond recognition in the presence of gross errors or outliers which plague any data from real imagery. Features with a motion independent of the background are those statistically inconsistent with the calculated value of F. Well-founded methods for detecting these outlying points are described.
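A bare-bones sketch of the underlying computation: estimate F linearly from point correspondences and flag features whose epipolar residual is statistically inconsistent with the dominant motion. Coordinate normalisation, the paper's specific outlier tests, and robust re-estimation are omitted; the synthetic two-view data and thresholds are assumptions.

```python
# Sketch: linear fundamental-matrix estimate plus residual-based outlier flagging.
import numpy as np

def estimate_F(x1, x2):
    """x1, x2: Nx2 corresponding image points. Returns a rank-2 3x3 F."""
    A = np.column_stack([x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
                         x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
                         x1[:, 0], x1[:, 1], np.ones(len(x1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                                              # enforce rank 2
    return U @ np.diag(S) @ Vt

def epipolar_residuals(F, x1, x2):
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    return np.abs(np.sum(h2 * (h1 @ F.T), axis=1))          # algebraic residual |x2' F x1|

def flag_outliers(F, x1, x2, k=3.0):
    r = epipolar_residuals(F, x1, x2)
    scale = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12   # MAD scale
    return np.abs(r - np.median(r)) > k * scale

# toy check: scene points seen by two cameras related by a pure x-translation
rng = np.random.default_rng(5)
X = np.column_stack([rng.uniform(-1, 1, 60), rng.uniform(-1, 1, 60), rng.uniform(4, 6, 60)])
x1 = X[:, :2] / X[:, 2:3] + rng.normal(0, 0.002, (60, 2))
x2 = (X[:, :2] + np.array([0.5, 0.0])) / X[:, 2:3] + rng.normal(0, 0.002, (60, 2))
x2[:5] += rng.uniform(0.2, 0.4, (5, 2))                     # five independently moving features
F = estimate_F(x1, x2)
# should pick out (at least) the five corrupted correspondences
print("flagged as independent motion:", np.flatnonzero(flag_outliers(F, x1, x2)))
```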
Relaxation matching of road networks in aerial images using topological constraints
Richard C. Wilson, Edwin R. Hancock
This paper applies novel probabilistic relaxation techniques to the problem of matching road networks in different aerial images. The problem stems both from the fusion of image information obtained at different altitudes over the same region and from the matching of road map data to image data. Feature matching is hindered by segmentation error, changes of scale or perspective and distortion of the image plane. Initially the image is segmented into the road network using a dictionary based relaxation line finder, the output of which corresponds closely with ground-truth road map data. Since the contour dictionary explicitly encodes junction structure, such features are robustly detected and provide the basis for a graph based representation of road structure. In the matching phase scale and rotation measurements are used to compute an initial match of the junctions from two images. Probabilistic relaxation is then used to update junction match probabilities in light of the topological constraints provided by the road network. The main advantage of this topological representation of constraints is that it renders the matching process robust to scale and viewpoint changes.
Signal Analysis and Understanding
Locating the mouth region in images of human faces
H. J. Grech-Cini, Gerard T. McKee
Being able to see the face of a speaker can improve speech recognition performance by as much as a shift from 20% to 80% intelligibility under certain circumstances. Lip movements provide a major source of visual cues in speech recognition. In our research we are concerned with locating, tracking, characterizing, and exploiting the lip movements for this purpose. In this paper we focus on the first of these problems. Using a technique based on n-Tuples we locate the `eye-nose-region' (ENR) of the face in images and infer the location of the mouth via a `face model.' We describe this method in detail and present initial test results.
Fusing multiple reprocessings of signal data
Frank Klassner, Victor Lesser, S. Hamid Nawab
In the analysis of signals from complex environments, often it is not possible to rely on a single set of signal processing algorithms (SPAs) to produce a set of data correlates that permit meaningful interpretation. In such situations, what is needed is the structured fusion of data from multiple applications of SPAs (reprocessings) with different parameter values. We present the integrated processing and understanding of signals (IPUS) architecture as a framework for structuring interaction between the search for SPAs appropriate to the environment and the search for interpretation models to explain the SPAs' output data correlates. In this paper we describe our use of IPUS to control the integration of output from multiple SPA applications in a system for acoustic signal interpretation of household sounds.
Verification of nonlinear dynamic structural test results by combined image processing and acoustic analysis
Yair Tene, Noam Tene, G. Tene
An interactive data fusion methodology of video, audio, and nonlinear structural dynamic analysis for potential application in forensic engineering is presented. The methodology was developed and successfully demonstrated in the analysis of a heavy transportable bridge collapse during preparation for testing. Multiple bridge element failures were identified after the collapse, including fracture, cracks and rupture of high performance structural materials. Videotape recording by hand held camcorder was the only source of information about the collapse sequence. The interactive data fusion methodology resulted in extracting relevant information from the videotape and from dynamic nonlinear structural analysis, leading to a full account of the sequence of events during the bridge collapse.
Wavelet-based sensor fusion
Terrance L. Huntsberger, Bjorn D. Jawerth
Sensor fusion can be performed either on the raw sensor output or after a segmentation step has been done. Our previous work has concentrated on neural network models for sensor fusion after segmentation. Although this method has been shown to be fast and reliable, there is still the overhead entailed from using entire images. The wavelet transform is a multiresolution method that is used to decompose images into detail and average channels. These channels maintain all of the image information and sensor fusion logic operations can be done within the wavelet coefficient space. In addition, image compression can be done within this same space for possible remote transmission. This paper examines sensor fusion within the wavelet coefficient space. The results of some experimental studies performed on the 1024 node NCUBE/10 at the University of South Carolina are also included.
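A common way to realise fusion in the wavelet coefficient space, kept here as an illustrative sketch rather than the paper's rule, is to average the approximation channels and keep the larger-magnitude detail coefficient from either sensor. This assumes the PyWavelets package and two pre-registered images of equal size.

```python
# Sketch of sensor fusion in the wavelet coefficient space (single-level).
import numpy as np
import pywt

def fuse_wavelet(img_a, img_b, wavelet="haar"):
    cA_a, (cH_a, cV_a, cD_a) = pywt.dwt2(img_a, wavelet)
    cA_b, (cH_b, cV_b, cD_b) = pywt.dwt2(img_b, wavelet)
    pick = lambda a, b: np.where(np.abs(a) >= np.abs(b), a, b)   # max-abs detail rule
    fused = ((cA_a + cA_b) / 2.0,
             (pick(cH_a, cH_b), pick(cV_a, cV_b), pick(cD_a, cD_b)))
    return pywt.idwt2(fused, wavelet)

a = np.zeros((64, 64)); a[16:32, :] = 1.0      # toy "sensor" images
b = np.zeros((64, 64)); b[:, 16:32] = 1.0
print("fused image shape:", fuse_wavelet(a, b).shape)
```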
Sensor Models and Validation
Data fusion through nondeterministic approaches: a comparison
Muhamad Abdulghafour, Mongi A. Abidi
Information gathered by different knowledge sources from the same scene is often uncertain, imprecise, fuzzy, or incomplete. Using a multi-sensory system to integrate several types of data should yield more meaningful information otherwise unavailable or difficult to acquire by a single sensory modality. In this paper, we examine a number of non-deterministic methods for solving the fusion problem. Within the framework of fuzzy set theory, we present a new technique for data fusion. We develop a fusion formula based on the measure of fuzziness. The fusion formula is mathematically tested against several desirable properties of fusion operators. Also, the Super Bayesian Approach and Dempster's rule of combination are presented. These approaches were implemented and tested with real range and intensity images acquired by an Odetics Laser Range Scanner. The goal was to obtain better scene descriptions through a segmentation process of both images. A systematic method for evaluating and comparing segmentation results is presented. Various levels of noise were added to the real data and segmentation results from all three approaches were evaluated.
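For reference, Dempster's rule of combination mentioned above can be written in a few lines; the frame of discernment and the mass assignments below are invented labels for two hypothetical segmentation sources, not the paper's data.

```python
# Minimal sketch of Dempster's rule of combination for two evidence sources.
from itertools import product

def dempster_combine(m1, m2):
    """m1, m2: dicts mapping frozenset hypotheses to mass. Returns the fused masses."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb                 # mass assigned to contradictory pairs
    return {h: v / (1.0 - conflict) for h, v in combined.items()}

FLOOR, WALL = frozenset({"floor"}), frozenset({"wall"})
EITHER = FLOOR | WALL                           # ignorance: mass on the whole frame
range_source = {FLOOR: 0.6, WALL: 0.1, EITHER: 0.3}      # masses from a range segmentation
intensity_source = {FLOOR: 0.2, WALL: 0.5, EITHER: 0.3}  # masses from an intensity segmentation
print(dempster_combine(range_source, intensity_source))
```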
Poster Session
Multisensor data fusion of points, line segments, and surface segments in 3D space
Joost van Lawick van Pabst, P. F.C. Krekel
Recently, interest in multi-sensor data, and consequently data fusion, for world modeling has been growing. Multi-sensor data fusion is the process of matching and integrating data from multiple sensors. The objectives of the geometric fusion process are to enhance accuracy and resolution in the world model and to improve the effectiveness and the robustness of the overall imaging system. A solid description and parametric representation of the extracted 3D features from sensor data, here referred to as geometric primitives, the building blocks of the world model, are fundamental to model sensor uncertainties and to integrate sensor data. Parametric representations of three types of geometric primitives (points, line segments, and surface segments in 3D space) are presented. Furthermore, given these parametric representations, the application of the (Extended) Kalman Filter for data association and fusion is described. Special attention is paid to the fusion of geometric primitives of different modality. The resulting integrated geometric primitives are stored in a 3D world model that can be used for tele-presence and autonomous system control purposes.
Data fusion for detection of early-stage lung cancer cells using evidential reasoning
Lei-Jian Liu, Jingyu Yang, Jian Lu
Data fusion has already been widely used in various applications in which multiple sources of information are present. One of the most widely used applications of data fusion is in the field of object recognition and classification, since it can efficiently improve accuracy and fault tolerance. This presentation describes a cytological color image processing system, based on data fusion, developed for the detection of early-stage lung cancer in health inspection. Most existing microscopic diagnostic systems use morphological, textural, and gray or color features separately, which results in unstable diagnoses; this system makes use of all these features by fusing the multiple classification results obtained from morphological, textural, and chromatic features respectively. Data fusion is achieved using Dempster-Shafer Evidential Reasoning (DSER). In the current system, all the nuclei are first segmented by thresholding in a special color space which is a non-linear transformation of the (R,G,B) color space. Then, using morphological, textural, and chromatic features respectively, the segmented cells are classified as normal or abnormal. Using DSER, the classification results obtained above are fused into a final result. Finally, a decision strategy based on the fused data is presented to obtain the final classification results. Experimental results are given to show the feasibility of the data fusion approach proposed here.
Sensor planning strategy in a CAD model-based vision system
HuiQun Liu, Xueyin Lin
This paper presents the sensor planning strategy in a robot vision system in which 3D information of the object is obtained from different viewpoints by the structured light sensor. Sensor planning strategy has a great influence on the efficiency and reliability of the vision system. Both the 3D data obtained from the object in the scene and the vision model provide important information for the recognition and localization of the object. The sensor planning module is responsible for making good use of this information in finding the next viewpoint so that the most useful information can be acquired as soon as possible. Rules are made and adopted in the sensor planning module to identify the object or to determine its position. Different surface types of the objects have different features that require special measures to detect. The rules are designed to deal with variations of the object's appearance so that the vision system can proceed toward the correct solution. The sensor planning strategy works in two steps: first it analyzes what features are to be detected according to the model guidance and the current hypotheses of the object, next it tries to select the proper rule to estimate where to find the predicted features and the corresponding next viewpoint of the sensor.
Improving robot's indoor navigation capabilities by integrating visual, sonar, and odometric measurements
Tarcisio Coianiz, Marco Aste
In the framework of autonomous navigation, an application of Kalman filtering to the problem of multi-sensor information processing is presented. In particular, estimation of model parameters is considered when a mobile robot equipped with a set of sonars and a standard TV camera moves along corridors, and the environment is affected by sharp discontinuities due to the presence of recesses or protrusions. Experiments performed in real-world situations are also presented and discussed.
Genetic algorithms in hypothesize-and-verify image interpretation
Milan Sonka, Satish K. Tadikonda, Steve M. Collins
Image interpretation plays an important role in computer vision and robotics. We describe here a new approach to image understanding whose novel feature is that it integrates segmentation and interpretation into a single feedback process that incorporates contextual knowledge and uses a genetic algorithm technique to produce an optimal image interpretation. In this paper, we describe the principles of our approach, demonstrate its feasibility, and assess the accuracy of the proposed method using artificially-generated images.
Real-time implementation of structure estimation using image streams
Arun K. Dalmia, Mohan M. Trivedi
The computation of 3-D structure from motion using a monocular sequence of images in the paradigm of active vision is investigated in this paper. Robotic tasks such as navigation, manipulation, and object recognition all require a 3-D description of the scene. The 3-D description for these tasks varies in resolution, accuracy, robustness, range, and time. In developing such a system, it must have the capability to actively control the imaging parameters so that a 3-D description sufficient for that task is generated. The objective of this paper is to investigate a strategy suitable for fast and active 3-D structure estimation. First, the process of 3-D structure estimation in the paradigm of active vision is discussed. Second, the implications of image streams for 3-D structure computation are analyzed. The three approaches to structure from motion, feature-based, optical-flow-based, and spatial and temporal gradient-based, are reviewed for their efficacy in fast and active estimation of 3-D structure. Finally, a pipeline based implementation of a spatial and temporal gradient based approach is presented along with detailed experimental analysis.
Regularization approach to multisensor reconstruction
Jagath C. Rajapakse, Raj S. Acharya
Due to the presence of noise and uncertainty, a single set of visual data is not always sufficient to successfully perform the intended vision task. In this paper, we consider the visual reconstruction problem with multi-sensor data and propose a computational method for fusion of sensor information. Correlation among different sensor information appears in some hidden representation of data. In our representation of visual data, the fusion of information occurs at the places where visual discontinuities appear. The appearance of discontinuities in one set of data is constrained by the discontinuity contours in other sets of data. This is achieved by introducing energies of external discontinuity cliques of multiple sensor data to the minimizing energy functional. The fusion reduces the reconstruction error of the individual sensors, improves the detection of discontinuities, and reduces the false detection of discontinuities. The computational method presented in this paper can deal with fusion of surfaces with higher order discontinuities and with data sets belonging to different visual cues. Previously proposed Markov Random Field (MRF) techniques can deal with surfaces having only first order discontinuities. An implementation of the proposed fusion method on simulated images, and a comparison of the results with other methods, are presented.
Continuous reconstruction of scene objects
Steen Kristensen, Henrik I. Christensen
In this paper we describe an approach to continuous scene modeling for an autonomous mobile robot navigation system operating in indoor environments. The continuous scene modeling is based on a cooperative sensor system that comprises two parts: binocular region based stereo, i.e., a passive depth extraction technique, and depth from focus, i.e., an active depth extraction technique. The region based stereo technique provides an overview of the scene aided by a 3D a priori world model. Since feature based stereo is vulnerable to occlusions, a depth from focus technique is selectively employed at locations where potential occlusions are detected in order to extract the correct depth. Scene maintenance over time is done by generation of expectation images, based on previously sensed scene objects, that are matched with images recorded by the on-robot stereo camera head. This match allows for detection of previously undetected scene objects and for updating of already known scene objects. The operation of the system is demonstrated on an indoor image sequence.
Nonanthropomorphic viewing for teleoperation
Workspace viewing in teleoperation systems is normally constrained by the fixed position of the cameras relative to the teleoperator. `Virtual viewing' and `free-flying' cameras reduce these constraints but alter the mapping between teleoperator control and perception leading to reduced visual performance during teleoperation. We are investigating the effects of non-anthropomorphic viewing on teleoperation performance and its implications for the design of teleoperation systems. In this paper we present results from an initial set of experiments.
Coping with delays for real-time gaze control: the fall and rise of the Smith `Predictor'
Paul M. Sharkey, David W. Murray
In this paper we describe how to cope with the delays inherent in a real time control system for a steerable stereo head/eye platform. A purposive and reactive system requires the use of fast vision algorithms to provide the controller with the error signals to drive the platform. The time-critical implementation of these algorithms is necessary, not only to enable short-latency reaction to real world events, but also to provide sufficiently high frequency results with small enough delays that the controller remains stable. However, even with precise knowledge of that delay, nonlinearities in the plant make modelling of that plant impossible, thus precluding the use of a Smith Regulator. Moreover, the major delay in the system is in the feedback (image capture and vision processing) rather than feed forward (controller) loop. Delays ranging between 40 msecs and 80 msecs are common for the simple 2D processes, but might extend to several hundred milliseconds for more sophisticated 3D processes. The strategy presented gives precise control over the gaze direction of the cameras despite the lack of a priori knowledge of the delays involved. The resulting controller is shown to have a similar structure to the Smith Regulator, but with essential modifications.
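For readers unfamiliar with the structure being referenced, the toy simulation below shows a classical discrete-time Smith predictor: an internal delay-free model of the plant supplies the feedback that the delayed measurement cannot. The abstract notes that plant nonlinearities rule out this regulator in its pure form, so this is only a sketch of the baseline structure; the plant parameters, delay, and gain are assumptions.

```python
# Toy discrete-time Smith-predictor loop with a delayed measurement path.
from collections import deque

# first-order plant y[k+1] = a*y[k] + b*u[k]; the controller only sees the
# output delayed by `delay` samples (e.g., vision latency)
a, b, delay, Kp = 0.9, 0.1, 4, 3.0
setpoint = 1.0

y, y_model = 0.0, 0.0
meas_buf = deque([0.0] * delay, maxlen=delay)    # delayed plant measurements
model_buf = deque([0.0] * delay, maxlen=delay)   # delayed internal-model outputs

for k in range(60):
    # Smith-predictor feedback: the stale measurement plus the model's estimate
    # of what the plant has done during the delay interval
    feedback = meas_buf[0] + (y_model - model_buf[0])
    u = Kp * (setpoint - feedback)
    meas_buf.append(y)
    model_buf.append(y_model)
    y = a * y + b * u              # real plant
    y_model = a * y_model + b * u  # delay-free internal model

print("output after 60 steps:", round(y, 3))     # settles at the P-control steady state
```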
Multiresolution fusion of FLIR and ladar data for target detection with JDEF
Stelios C.A. Thomopoulos, Byron H. Chen
A joint detection and estimation filter (JDEF) is proposed for fusing Forward Looking Infrared (FLIR) and Laser Radar (Ladar) data for target detection. The JDEF has two main components: a minimum mean square error estimator followed by a Bayesian detector. The estimator is a 2-D adaptive Kalman filter based on conditional Markovian field (CMF) modeling. The detector is a recursive maximum a posteriori probability (MAP) detector based on the generalized likelihood ratio. The estimator removes the noise and provides the detector with the estimated mean and variance at each pixel. The detector combines the information from the estimator as well as the information from the neighboring pixels to decide whether the current pixel belongs to a target or to clutter. By fusing the data from FLIR and Ladar images with the JDEF, the locations of the possible targets are efficiently determined and the targets are accurately segmented. The filter has successfully been applied to both synthetic data and real data. The results are presented.
Proceedings-Only Papers
Modular agents for robot navigation
Hans S. Dulimarta, Anil K. Jain
Building an expandable control strategy for a mobile robot system that is capable of performing a given task or tasks is an important issue. For instance, new hardware and peripherals keep emerging as better and faster technologies become available. In this paper, we describe a system that is designed for controlling a mobile robot system with multiple sensors or peripherals. We propose a control scheme that employs at least as many server modules as there are hardware or peripheral components. Thus, a server is dedicated to controlling one hardware component and responds to all requests designated to it. Since hardware components are independent of each other, the server modules can run concurrently and new peripherals can be easily added without modifying the existing system. A typical configuration of a system consists of several task specific modules, each carrying a subgoal as part of a common goal that the robot has to achieve. Information sharing among these modules is accomplished via communication with a central data server. All modules can send query or update requests of the information used by the entire system. The control system is designed to enable all the modules to run concurrently on different machines. Preliminary results on an indoor navigation task are encouraging.
Neural networks for processing data from multiple redundant sensors for mine systems management, operation, maintenance, and control
Aaron Gordon, Hong Chang, Robert H. King
We have developed a neural-network approach to classifying signals by fusing information from multiple sensors. During the past three years, we have developed concepts and algorithms for an intelligent decision support system (IDSS) for mine managers. The goal of the IDSS is to detect the activities of machines in an underground coal mine and to produce management reports similar to traditional industrial engineering time studies. The data we operate on is the power usage of the various machines, taken every 50 milliseconds. Currently we are working with data from three machines which interact with each other: a continuous miner and two shuttle cars. Detection of events was first done using numerical techniques to arrive at locally best guesses and rule-based techniques to fuse the information from the different machines. Our current research involves dynamic recurrent neural networks (a variation of recurrent cascade correlation) which replace the numerical and rule-based techniques. Our current neural networks can accurately label approximately 90% of the machine events in the training set and approximately 70% in new data sets. Neural network techniques are able to adjust to the dynamic mine environment much better than the previous algorithms; consequently, the neural network approach is more acceptable in the applications environment.
Eye-hand relations for sensor placement and object location determination
Xiang Wan, Guang-you Xu
The eye-on-hand configuration is an important way to build an active vision system. Most tasks that an eye-on-hand system can perform are based on the estimation of the eye-hand relation. Traditionally the eye-hand relation is defined as a 3D-to-3D coordinate transformation. This kind of definition only views the eye (camera) as a coordinate frame and is useful for sensor placement. When this eye-hand relation is used for object location determination (after the sensor is actively placed), it causes much larger errors than a more direct approach, which in this situation defines the eye-hand relation as the 3D-to-2D perspective transformation between the last joint coordinate frame and the camera image plane. In this paper the meanings of the eye-hand relations are extended for different tasks: one for sensor placement and one for object location determination. We also present a new method for the calculation of eye-hand relations by making the last joint coordinate frame `touchable.' We call it the direct method because some specially designed motions are performed by the robot arm to estimate the relation between the robot base frame and the world frame. When only the rotation matrix is obtained, the eye-hand relations can be computed by moving the camera twice and calibrating the 3D camera pose at three stations. When the full transformation matrix is obtained, calibrating the camera at one station may yield the solution of the eye-hand relations. Experimental results with real data are included. The advantages of the direct method are its efficiency, accuracy, and reproducibility.
3D Sensing and Robotic Applications
Three-dimensional object recognition and pose determination based on combined-edge and surface-shape data
Yi Tan, Herbert Freeman
This paper describes a new approach to the classical machine vision problem of 3-D object recognition and pose determination. The process begins by extracting contour edges from a given patterned-light image of an unknown object and postulates an initial object line structure. This line structure is used to guide placement of a set of small surface-attribute- determining windows, called `surface-attribute probes' (SAPs), over the image to determine whether the areas being examined show the presence of one or more intersecting planar or curved regions. The resulting information is used to refine the object line structure, i.e., to determine all face intersections and characterize each face. The process is repeated until no further refinement in region segmentation and surface-attribute determination appears feasible.