Proceedings Volume 4197

Intelligent Robots and Computer Vision XIX: Algorithms, Techniques, and Active Vision

David P. Casasent
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 11 October 2000
Contents: 9 Sessions, 40 Papers, 0 Presentations
Conference: Intelligent Systems and Smart Manufacturing 2000
Volume Number: 4197

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.

Sessions:
  • Invited Session I
  • Invited Session II
  • Image Processing
  • Inspection
  • Mobile Robots
  • Sensing, Calibration, Control, Hardware, and Software I
  • Sensing, Calibration, Control, Hardware, and Software II
  • Applications
  • Color Image Processing
  • Inspection
Invited Session I
Face recognition with pose variations
David P. Casasent, Ashit Talukder
Abstract not available.
Intelligent robot trends and predictions for the first year of the new millennium
An intelligent robot is a remarkably useful combination of a manipulator, sensors and controls. The current use of these machines in outer space, medicine, hazardous materials, defense applications and industry is being pursued with vigor. In factory automation, industrial robots can improve productivity, increase product quality and improve competitiveness. The computer and the robot have both been developed during recent times. The intelligent robot combines both technologies and requires a thorough understanding and knowledge of mechatronics. Today's robotic machines are faster, cheaper, more repeatable, more reliable and safer than ever. The knowledge base of inverse kinematic and dynamic solutions and intelligent controls is increasing. More attention is being given by industry to robots, vision and motion controls. New areas of usage are emerging for service robots, remote manipulators and automated guided vehicles. Economically, the robotics industry now has more than a billion-dollar market in the U.S. and is growing. Feasibility studies show decreasing costs for robots and unaudited healthy rates of return for a variety of robotic applications. However, the road from inspiration to successful application can be long and difficult, often taking decades to achieve a new product. A greater emphasis on mechatronics is needed in our universities. Certainly, more cooperation between government, industry and universities is needed to speed the development of intelligent robots that will benefit industry and society. The fearful robot stories may help us prevent future disaster. The inspirational robot ideas may inspire the scientists of tomorrow. However, the intelligent robot ideas, which can be reduced to practice, will change the world.
New hierarchical approach to pattern matching for industrial applications
Markus Brandner, Axel J. Pinz, Wolfgang Poelzleitner
In this paper a new hierarchical structure for fast, robust geometry-based pattern matching is proposed. As opposed to many pattern matching systems reported in the literature, we use a structure comprising a number of alternating feature processing and constraint layers. Feature extraction accuracy ranges from coarse at the bottom of the structure to fine at the top. A 2D quasi-affine matching system based on the proposed structure has been implemented. Experiments show the reduction in the amount of image data processed in every layer of the structure as a consequence of applying constraints to the data between two adjacent feature extraction layers. The structure can utilize scalable feature extraction algorithms as well as incorporate a priori knowledge into the feature extraction.
Invited Session II
Diffractive 3D phase gratings will qualify optical sensors and image processing technologies for reaching performances of human vision
Diffractive 2D and 3D grating optics are enabling technologies well on the way to technically realizing specific features well known from human vision. How does the human eye do its job in visual information processing?
Wearable computer for mobile augmented-reality-based controlling of an intelligent robot
Tuukka Turunen, Juha Roening, Sami Ahola, et al.
An intelligent robot can be utilized to perform tasks that are either hazardous or unpleasant for humans. Such tasks include working in disaster areas or conditions that are, for example, too hot. An intelligent robot can work on its own to some extent, but in some cases the aid of humans is needed. This requires means for controlling the robot from somewhere else, i.e., teleoperation. Mobile augmented reality can be utilized as a user interface to the environment, as it enhances the user's perception of the situation compared to other interfacing methods and allows the user to perform other tasks while controlling the intelligent robot. Augmented reality is a method that combines virtual objects into the user's perception of the real world. As computer technology evolves, it is possible to build very small devices that have sufficient capabilities for augmented reality applications. We have evaluated the existing wearable computers and mobile augmented reality systems to build a prototype of a future mobile terminal, the CyPhone. A wearable computer with sufficient system resources for applications, wireless communication media with sufficient throughput and enough interfaces for peripherals has been built at the University of Oulu. It is self-sustained in energy, with enough operating time for the applications to be useful, and uses accurate positioning systems.
Knowledge and vision engines: a new generation of image understanding systems combining computational intelligence methods and model-based knowledge representation and reasoning
Vision is a part of a larger informational system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, that is, an interpretation of visual information in terms of such knowledge models. The solution to image understanding problems is suggested in the form of active multilevel hierarchical networks represented dually as discrete and continuous structures. Computational intelligence methods transform images into model-based knowledge representation. The certainty dimension converts attractors in neural networks into fuzzy sets, preserving input-output relationships. Symbols naturally emerge in such networks. Symbolic space is a dual structure that combines a closed distributed space, split by a set of fuzzy regions, with a discrete set of symbols equivalent to the cores of the regions, represented as points in the certainty dimension. Model space carries knowledge in the form of links and relations between the symbols, and supports graph, diagrammatic and topological operations. The composition of spaces works similarly to M. Minsky's frames and agents, Gerald Edelman's maps of maps, etc., combining machine learning, classification and analogy together with induction, deduction and other methods of higher-level model-based reasoning. Based on such principles, an image understanding system can convert images into knowledge models, effectively resolving uncertainty and ambiguity via feedback projections, and does not require supercomputers.
Image Processing
Detection and avoidance of simulated potholes in autonomous vehicle navigation in an unstructured environment
Jaiganesh Karuppuswamy, Vishnuvardhanaraj Selvaraj, Meyyappa Murugappa Ganesh, et al.
In the navigation of an autonomous vehicle, tracking and avoidance of obstacles presents an interesting problem, as it involves the integration of the vision and motion systems. In an unstructured environment the problem becomes much more severe, as the obstacles have to be clearly recognized before any decisive action can be taken. In this paper, we discuss a solution to the detection and avoidance of simulated potholes in the path of an autonomous vehicle operating in an unstructured environment. Pothole avoidance may be considered similar to other obstacle avoidance except that the potholes are depressions rather than extrusions from a surface. A non-contact vision approach has been taken, since potholes usually differ significantly in visual appearance from the background surface. Large potholes more than 2 feet in diameter will be detected. Furthermore, only white potholes will be detected on a background of grass, asphalt, sand or green painted bridges. The signals from the environment are captured by the vehicle's vision systems and pre-processed appropriately. A histogram is used to determine a brightness threshold that decides whether a pothole is within the field of view. Then a binary image is formed, and regions are detected in it. Regions that have a diameter close to 2 feet and a ratio of circumference to diameter close to pi are considered potholes. The neuro-fuzzy logic controller, where navigational strategies are evaluated, uses these signals to decide a final course of navigation. The primary significance of the solution is that it is interfaced seamlessly into the existing central logic controller. The solution can also be easily extended to detect and avoid any two-dimensional shape.
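The region tests described in this abstract translate directly into a short pipeline. The following is a minimal sketch, assuming a grayscale image, a known scale in pixels per foot, and a threshold already chosen from the histogram; it replaces the circumference/diameter test with an equivalent area-based circularity check, and all names are illustrative rather than the authors' code.

```python
import numpy as np
from scipy import ndimage

def detect_potholes(gray, pixels_per_foot, thresh):
    """Find bright regions that are roughly circular and >= 2 ft across."""
    binary = gray > thresh                       # threshold from the histogram
    labels, n = ndimage.label(binary)            # connected bright regions
    hits = []
    for i, sl in enumerate(ndimage.find_objects(labels)):
        mask = labels[sl] == i + 1
        h, w = mask.shape
        d = max(h, w)                            # diameter estimate in pixels
        # Area test equivalent to the circumference/diameter ~ pi test:
        # a disc of diameter d has area pi * (d/2)**2.
        circularity = mask.sum() / (np.pi * (d / 2.0) ** 2)
        if d / pixels_per_foot >= 2.0 and 0.75 <= circularity <= 1.25:
            hits.append(sl)
    return hits
```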
Simple obstacle detection to prevent miscalculation of line location and orientation in line following using statistically calculated expected values
Visual line following in mobile robotics can be made more complex when objects are placed on or around the line being followed. An algorithm is presented that suggests a manner in which a good line track can be discriminated from a bad line track using the expected size of the line. The mobile robot in this case can determine the width of the line. It calculates a mean size for the line as it moves and maintains a fixed-size set of samples, which enables it to adapt to changing conditions. If a measurement is taken that falls outside of what the robot expects, it treats the measurement as undependable and as such can take measures to deal with what it believes to be erroneous data. Techniques for dealing with erroneous data include attempting to look around the obstacle or making an educated guess as to where the line should be. The system discussed has the advantage of not needing any extra equipment to discover whether an obstacle is corrupting its measurements. Instead, the robot is able to determine whether data is good or bad based upon what it expects to find.
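A minimal sketch of such a statistical plausibility test, assuming one width sample arrives per frame; the window size and the 3-sigma bound are illustrative assumptions, not values from the paper.

```python
from collections import deque
import statistics

class LineWidthValidator:
    """Adaptive expected-value test for measured line widths."""
    def __init__(self, window=50, k=3.0):
        self.samples = deque(maxlen=window)   # fixed-size sample set
        self.k = k
    def is_plausible(self, width):
        if len(self.samples) < 5:             # not enough history yet
            self.samples.append(width)
            return True
        mu = statistics.mean(self.samples)
        sigma = statistics.stdev(self.samples) or 1e-6
        ok = abs(width - mu) <= self.k * sigma
        if ok:                                # only learn from good tracks
            self.samples.append(width)
        return ok
```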
Parallel image processing for line detection in shared-memory and distributed environments
Ville Kyrki, Jouni Ikonen, Jari Porras, et al.
Parallel systems provide a robust approach for high-performance computing. Lately, the use of parallel computing has become more accessible as new parallel environments have evolved. The low cost and high performance of off-the-shelf PC processors have made PC-based multiprocessor systems popular. These systems typically contain two or four processors. Standardized POSIX threads have formed an environment for the effective utilization of several processors. Moreover, distributed computing using networks of workstations has increased. The motivation for this work is to apply these techniques in computer vision. The Hough Transform (HT) is a well-known method for detecting global features in digital images. However, in practice, the sequential HT is a slow method for large images. We study the behavior of the line-detecting HT with both message-passing workstation networks and shared-memory multiprocessor systems. The parallel approaches suggested in this paper seem to decrease the computation time of the HT significantly. Thus, the methods are useful for real-world applications.
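The standard way to data-parallelize the HT voting stage, whether with threads or message passing, is to partition the edge points and merge private accumulators at the end. A sketch under those assumptions (in CPython, threads mainly illustrate the partitioning; real speedup needs native threads or separate processes):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def hough_votes(points, rho_max, n_theta=180, n_rho=400):
    """Vote one chunk of edge points into a private accumulator."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    acc = np.zeros((n_rho, n_theta), dtype=np.int32)
    for x, y in points:
        rhos = x * cos_t + y * sin_t            # rho for every theta at once
        idx = np.clip(np.round((rhos + rho_max) / (2 * rho_max)
                               * (n_rho - 1)).astype(int), 0, n_rho - 1)
        acc[idx, np.arange(n_theta)] += 1
    return acc

def parallel_hough(edge_points, rho_max, workers=4):
    """Partition the points, vote in parallel, merge the accumulators."""
    chunks = np.array_split(np.asarray(edge_points), workers)
    with ThreadPoolExecutor(workers) as ex:
        accs = ex.map(lambda c: hough_votes(c, rho_max), chunks)
        return sum(accs)
```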
Morphological image processing of a bubble in laser-induced shock-wave lithotripsy
In this work, morphological processing is applied to study a bubble, which is a crucial factor in laser-induced shock-wave lithotripsy. Erosion, dilation and subtraction are applied for edge detection of the bubble; hence its position and shape are measured. The image of the bubble is extracted from the actual image of the stone (shadowgram). The image is obtained by fast photography using an N-Dye laser for illumination. A Ho:YAG laser is used for fragmentation of the stone. Using a time delay, an oscilloscope and a computer, the two laser pulses are synchronized. A microscope and a Kodak camera are used to photograph the stone and the phenomena around it.
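The erosion/dilation/subtraction step is the classic morphological gradient; a minimal sketch using SciPy's grayscale morphology (the structuring-element size is an illustrative choice):

```python
from scipy import ndimage

def morphological_edges(img, size=3):
    """Edge map as dilation minus erosion (morphological gradient)."""
    dil = ndimage.grey_dilation(img, size=(size, size))
    ero = ndimage.grey_erosion(img, size=(size, size))
    return dil - ero   # non-negative, since dilation >= erosion pointwise
```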
Image analysis using select-only morphological operators
This paper describes a new selective morphology (SM) that provides the morphological operators with selecting-only and filtering-only properties (S-operators and F-operators). In this framework, S-opening and S-closing operators perform the extreme monotonous reconstruction of the source image, starting from the results of erosion and dilation, respectively. F-opening and F-closing are formed as an algebraic combination of S-opening and S-closing and the usual morphological opening and closing. It is proved that SM filters have most of the mathematical properties of MM filters. In addition, S-operators preserve the connectivity and shape (edges) of restored image areas. Some examples of S-morphological object extraction and F-morphological feature extraction are outlined. A significant improvement in extraction quality is demonstrated in comparison with the usual MM operators. An example of a special SM based on contour filtering is described.
Design of corner detection used for calibration of x-ray device
Jianfeng Lu, Xuelei Hu, Jingyu Yang, et al.
This paper proposes a practical method for corner detection used in X-ray device calibration. Based on the square steel pattern, the original image is first segmented into several small squares; then edge detection is applied to each small square; after this, the Hough transform is used for line approximation and the cross points of the lines are solved. These cross points are exactly the corners needed. The experimental results show the proposed method is feasible and robust.
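Once the Hough transform returns each edge as a (rho, theta) pair, each corner is the solution of a 2x2 linear system. A small sketch of that final step (the rest of the pipeline is omitted):

```python
import numpy as np

def line_intersection(rho1, theta1, rho2, theta2):
    """Cross point of two Hough lines x*cos(t) + y*sin(t) = rho;
    returns None for (near-)parallel lines."""
    A = np.array([[np.cos(theta1), np.sin(theta1)],
                  [np.cos(theta2), np.sin(theta2)]])
    b = np.array([rho1, rho2])
    if abs(np.linalg.det(A)) < 1e-9:
        return None
    return np.linalg.solve(A, b)   # (x, y) of the detected corner
```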
Inspection
Sequence of texture images and process control
Ari J. E. Visa, Sami Autio
The demand for higher product quality, together with recent progress in texture recognition, has made it possible to reconsider old processes in a new way. It has been common for human operators to use visual appearance to control, e.g., mixing, floating, or cooking processes. Here a methodology based on time series of textured images is reported and demonstrated. The methodology is aimed at helping human operators either to monitor or to control the processes. The main idea is that a sequence of images is taken. Each image in the sequence is interpreted and characterized as a texture image. The textured image is transformed. It is also possible to extract suitable features from the transformed texture image. The transformed images or the features are used together with the corresponding values from preceding and succeeding images to characterize the process. Some results are reported here. Key points on how to apply the methodology are also discussed.
Flexible workobject localization for CAD-based robotics
Mikko Sallinen, Tapio A. Heikkila
In this paper a method to locate work objects with splined surfaces and estimate the spatial uncertainties of the estimated parameters is presented. The reference B-spline surface patch is selected from a work object CAD model and is defined in the form of control vertices. The process includes the hand-eye calibration of the sensor, determination of the work object localization, and surface treatment, e.g., inspection. The hand-eye calibration and work object localization are carried out using Bayesian estimation with sensor fusion. Use of the recursive sensor fusion method makes calibration more flexible and accurate in handling large data sets. The spatial uncertainties, in the form of eigenvalues in the directions of the eigenvectors, are analyzed from the error covariance matrices of the estimated parameters.
Surface defect detection with histogram-based texture features
In this paper the performance of two histogram-based texture analysis techniques for surface defect detection is evaluated. These techniques are the co-occurrence matrix method and the local binary pattern method. Both methods yield a set of texture features that are computed from a small image window. An unsupervised segmentation procedure is used in the experiments. It is based on the statistical self-organizing map algorithm, trained only with fault-free surface samples. Results of experiments with both feature sets are good, and there is no clear difference in their performances. Differences are found in their computational requirements, where the features of the local binary pattern method are better in several aspects.
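For reference, the basic local binary pattern feature is cheap to state and compute: each pixel is coded by thresholding its eight neighbours against it, and a window is described by the histogram of the codes. A minimal sketch of the standard formulation (the paper's exact variant and window size are not specified here):

```python
import numpy as np

def lbp_histogram(window):
    """256-bin histogram of 8-neighbour local binary patterns."""
    H, W = window.shape
    c = window[1:H-1, 1:W-1]                        # centre pixels
    offsets = [(-1,-1), (-1,0), (-1,1), (0,1), (1,1), (1,0), (1,-1), (0,-1)]
    code = np.zeros(c.shape, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        n = window[1+dy:H-1+dy, 1+dx:W-1+dx]        # shifted neighbours
        code |= (n >= c).astype(np.int32) << bit    # one bit per neighbour
    return np.bincount(code.ravel(), minlength=256)
```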
Mobile Robots
Video switching and sensor fusion for multicamera sensing systems
Mayank Saxena, Vishnuvardhanaraj Selvaraj, Rahul Dhareshwar, et al.
With the low cost of solid-state camera systems, it is now possible to include many cameras on a mobile robot or other machine. However, video processing is still relatively expensive. Therefore it is often desirable to share several cameras with a single processor. The purpose of this paper is to describe the design of a video switching system that permits eight cameras to be multiplexed with a single chip. Multiples of eight could also easily be accommodated. The heart of the system is a Maxim video switch. The user simply selects, via a three-bit control signal, which camera signal is routed to the output. The output of the video switch is then the desired camera image. One application of this video switch is a four-camera input system for a mobile robot being constructed at the University of Cincinnati. Other applications include surveillance and other mobile systems. The decision as to which camera to observe can be made automatically by a computer, providing great versatility. For example, supplemental motion detectors could be used to activate the camera selection for a surveillance system. Higher-level logic has been used in our mobile robot application. Still higher-level logic could be used to fuse the video information in various ways before processing. The significance of this device is that it provides a wealth of video information to be used at the discretion of either a human viewer or an automatic system.
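Driving such a switch from software reduces to writing the channel number onto three select lines. A hypothetical sketch; `write_pin` stands in for whatever digital-output call the host controller actually provides, since the abstract specifies only the 3-bit control word:

```python
def select_camera(write_pin, channel):
    """Drive the three select lines of an 8-to-1 video switch.

    write_pin(line, value) is an assumed digital-output interface,
    not a documented API of the Maxim part."""
    if not 0 <= channel <= 7:
        raise ValueError("channel must be 0..7")
    for bit in range(3):
        write_pin(bit, (channel >> bit) & 1)   # one select line per bit
```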
Computer vision system for an autonomous mobile robot
Xiaoqun Liao, Jin Cao, Ming Cao, et al.
The purpose of this paper is to compare three methods for 3-D measurement of line position used for vision guidance to navigate an autonomous mobile robot. A model is first developed, using homogeneous coordinates, to map 3-D ground points into image points. Then, using the ground plane constraint, the inverse transformation that maps image points into 3-D ground points is determined. The system identification problem is then solved using a calibration device. Calibration data is used to determine the model parameters by minimizing the mean square error between model and calibration points. A novel simplification is then presented which provides surprisingly accurate results. This method is called the magic matrix approach and uses only the calibration data. A more standard variation of this approach is also considered. The significance of this work is that it shows that three methods based on 3-D measurements may be used for mobile robot navigation, and that a simple method can achieve accuracy to a fraction of an inch, which is sufficient in some applications.
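Under the ground-plane constraint, the image-to-ground map is a 3x3 homogeneous matrix that can be fitted from calibration pairs by least squares. A sketch in that spirit (a standard DLT-style fit; the paper's own "magic matrix" construction may differ in detail):

```python
import numpy as np

def fit_ground_homography(image_pts, ground_pts):
    """Least-squares 3x3 matrix H mapping image (u, v) to ground (x, y),
    fitted from calibration pairs, with H[2, 2] fixed to 1."""
    A, b = [], []
    for (u, v), (x, y) in zip(image_pts, ground_pts):
        A.append([u, v, 1, 0, 0, 0, -u * x, -v * x]); b.append(x)
        A.append([0, 0, 0, u, v, 1, -u * y, -v * y]); b.append(y)
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def image_to_ground(H, u, v):
    """Apply the fitted homography to one pixel."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w
```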
Identification and distortion rectification of signboards in real scene image for robot navigation
Yongmei Liu, Tsuyoshi Yamamura, Toshimitsu Tanaka, et al.
This paper describes an approach for finding a signboard along with the camera viewing direction in a gray scene image and rectifying a distorted signboard image taken from a nonorthogonal viewing direction. First, we use heuristics of both characters and character lines to discriminate characters from other objects in a gray scene image. Then, we attempt to compute the orientation of the camera relative to a detected signboard from a single view. Finally, we rectify the distortion of signboard images that are viewed at an angle. The only restriction for viewing direction estimation and distortion rectification is that a detected signboard is rectangular. We evaluate our methods on both simulation images and real scene images. Experimental results demonstrate the effectiveness of the proposed methods.
Sensing, Calibration, Control, Hardware, and Software I
Techniques for fisheye lens calibration using a minimal number of measurements
Terrell Nathan Mundhenk, Michael J. Rivett, Xiaoqun Liao, et al.
A method is discussed describing how different types of omnidirectional fisheye lenses can be calibrated for use in robotic vision. The technique discussed allows full calibration and correction of x,y pixel coordinates while taking only two uncalibrated and one calibrated measurement. These are done by finding the observed x,y coordinates of a calibration target. Any fisheye lens that has a roughly spherical shape can have its distortion corrected with this technique. Two measurements are taken to discover the edges and centroid of the lens. These can be taken automatically by the computer and do not require any knowledge about the lens or the location of the calibration target. A third measurement is then taken to discover the degree of spherical distortion. This is done by comparing the expected measurement to the measurement obtained and then fitting a curve that describes the degree of distortion. Once the degree of distortion is known and a simple curve has been fitted to the distortion shape, the equation of that distortion and the simple dimensions of the lens are plugged into an equation that remains the same for all types of lenses. The technique has the advantage of needing only one calibrated measurement to discover the type of lens being used.
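Applied per pixel, the correction is purely radial about the lens centroid. A sketch, assuming the centroid (cx, cy) and lens radius R come from the two uncalibrated measurements and `correct` is the curve fitted from the single calibrated one; all names are illustrative, not the authors' code:

```python
import math

def undistort_point(x, y, cx, cy, R, correct):
    """Map an observed fisheye pixel (x, y) to corrected coordinates.

    correct: assumed callable (e.g. a fitted polynomial) mapping the
    normalised observed radius to the normalised true radius."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy) / R        # normalised observed radius
    if r == 0:
        return x, y                   # the centre is undistorted
    scale = correct(r) / r            # radial rescaling factor
    return cx + dx * scale, cy + dy * scale
```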
Optimization of camera parameter calibration based on a new criterion with effective visual angle
Yoshihiko Nomura, Takashi Fujimoto, Norihiko Kato, et al.
The 3D information obtained from a camera, actually a 2D image, is given as visual angles, and so we should evaluate the accuracy of calibration based on the visual angles. When we evaluate the accuracy of the visual angle, which direction should we take as the reference? True values of the coordinates of the principal points have ordinarily been used as the reference of the visual angle. But this kind of evaluation criterion makes us overestimate visual-angle errors, although it is simple and conforms to a fail-safe principle. It also has the ill effect that we must change the fiducial-chart setup to suit an optimal condition that is given to each camera according to its principal distance. We propose a novel criterion for visual-angle evaluation: the calibrated coordinates of the principal points are taken as the references of the visual angle. It yields useful results: it markedly decreases visual-angle estimation errors, and it also makes the optimal setup condition independent of the principal distance. Furthermore, we present a useful formula enabling us to estimate the calibration error for every camera in advance.
Free-form 3D object compression using surface signature
Hesham F. Anan, Sameh M. Yamany
This paper introduces a new transform, called the signature transform, to concisely represent free-form 3D objects. The signature transform is based on a free-form surface representation called the surface signature. The surface signature captures information about a 3D surface as viewed from a special point called the anchor, such as the curvature and the distance from the anchor point. The surface signature stores this information in the form of a 2D image called the surface signature image. The signature transform uses different variations of the surface signature as viewed from selected landmark points. The selection of anchor points is crucial to the success of the signature transform; an approach for selecting landmark points based on curvature value is presented. The signature transform can then be used as a form of progressive compression of objects that allows viewing and manipulating the 3D object even if not all the compression data have been received. Unlike previously existing progressive compression techniques, the signature transform does not require receiving the data in a special order, nor does it have key frames in the representation.
Curvature-based range image classification for object recognition
Jan Boehm, Claus Brenner
This work focuses on the extraction of features from dense range images for object recognition. The object recognition process is based on a CAD model of the subject. Curvature information derived from the CAD model is used to support the feature extraction process. We perform a curvature based classification of the range image to achieve a segmentation into meaningful surface patches, which are later to be matched with the surfaces of the CAD model.
Real-time image processing architecture for robot vision
Stelian Persa, Pieter P. Jonker
This paper presents a study of the impact of MMX technology and PIII Streaming SIMD (Single Instruction stream, Multiple Data stream) Extensions on image processing and machine vision applications, which, because of their hard real-time constraints, are an undoubtedly challenging task. A comparison with traditional scalar code and with another parallel SIMD architecture (the IMAP-VISION board) is discussed, with emphasis on the particular programming strategies for speed optimization. More precisely, we discuss the low-level and intermediate-level image processing algorithms which are best suited for parallel SIMD implementation. High-level image processing algorithms are more suitable for parallel implementation on MIMD architectures. While the IMAP-VISION system performs better because of its large number of processing elements, the MMX processor and PIII (with Streaming SIMD Extensions) remain good candidates for low-level image processing.
Sensing, Calibration, Control, Hardware, and Software II
Video flow active control by means of adaptive shifted foveal geometries
Cristina Urdiales, Juan Antonio Rodriguez, Antonio J. Bandera, et al.
This paper presents a control mechanism for video transmission that relies on transmitting non-uniform resolution images depending on the delay of the communication channel. These images are built in an active way to keep the areas of interest of the image at the highest resolution available. In order to shift the area of high resolution over the image and to achieve a data structure easy to process using conventional algorithms, a shifted-fovea multiresolution geometry of adaptive size is used. Moreover, if delays are nevertheless too high, the different resolution areas of the image can be transmitted at different rates. A functional system has been developed for corridor surveillance with static cameras. Tests with real video images have proven that the method allows an almost constant rate of images per second as long as the channel is not collapsed.
Dynamically reconfigurable real-time software components in the RTLinux environment
Kristiina Valtanen, Tuomo Nayha
In this work an efficient real-time software technique, the port-based object technique, is decoupled from its special framework and transferred into a standard PC operating system environment in order to make its use more attractive, for example in low-cost embedded systems. The port-based object technique is based on the Chimera methodology introduced by the Robotics Institute of Carnegie Mellon University in the U.S.A., and it has been used in the control of advanced sensor-based applications in a customized real-time operating system environment. The aim of this software technique is to improve the entire real-time software process. Port-based objects are independent and clearly structured software components which can be configured by an end-user to obtain the desired operation. The software structure is dynamically reconfigurable, so a modification of the configuration can be made on the fly without the need to reboot the system. The simple structure speeds up real-time software development, testing and maintenance and also makes timing analysis easier. In addition, reuse of the software becomes natural. In this work, the implementation and use of the basic software structures needed by the port-based object technique are considered in the RTLinux environment, a Linux operating system to which real-time features have been added by the RTLinux operating system extension. Both Linux and RTLinux are commonly used and freely distributable. The results of the work on the software structures needed for the port-based object model encourage the adoption of the port-based object technique in a standard operating system. The implementation possibilities provided by RTLinux give natural support for the port-based object technique.
Examples of design and achievement of vision systems for mobile robotics applications
Patrick J. Bonnin, Laurent Cabaret, Ludovic Raulet, et al.
Our goal is to design and achieve a multiple-purpose vision system for various robotics applications: wheeled robots (like cars for autonomous driving), legged robots (six-legged, four-legged such as SONY's AIBO, and humanoid), and flying robots (to inspect bridges, for example), in various conditions: indoor or outdoor. Considering that the constraints depend on the application, we propose an edge segmentation implemented either in software, or in hardware using CPLDs (ASICs or FPGAs could be used too). After discussing the criteria of our choice, we propose a chain of image processing operators constituting an edge segmentation. Although this chain is quite simple and very fast to perform, results appear satisfactory. We propose a software implementation of it. Its temporal optimization is based on: its implementation under the pixel data flow programming model, the gathering of local processing where possible, the simplification of computations, and the use of fast-access data structures. Then, we describe a first dedicated hardware implementation of the first part, which requires 9 CPLDs in this low-cost version. It is technically possible, but more expensive, to implement these algorithms using only a single FPGA.
Applications
Survey of robot lawn mowers
Rob Warren Hicks II, Ernest L. Hall
Lawn mowing is considered by many to be one of the most boring and tiring routine household tasks. It is also one of the most promising personal robot applications. Several devices have now been invented and some manufactured products are available for lawn mowing. The purpose of this paper is to survey the state of the art in robotic lawn mowers to highlight the requirements and capabilities of current devices. A brief survey of available robot products, typical patents and some test-bed prototypes is presented. Some enabling technologies which could make the devices more capable are also suggested. Some predictions indicate that the robot lawn mower will be the breakthrough device in robotics. The significance of this research lies in the presentation of an overview of a potential major market for personal robots.
Toward a vision system for blinds
Edwige E. Pissaloux, Hichem A. Bouayed, Samer M. Abdallah
This paper addresses the design principles and definition of a non-invasive vision system assisting blind people in their movements in a non-cooperating environment. Global environment perception and understanding is a key problem for such systems. Optical vision is one of the senses providing global information on the nearest environment; its association with a convenient global environment representation transmitted via a tactile interface allows blind users to locate all moving and fixed obstacles. Adding a vision sensor to an electronic travel aid (ETA) increases the user's autonomy and helps them orient themselves and move safely in a 3D environment. The announced principles for a vision-based ETA can easily be transposed to any robotic vision system, especially for humanoids.
Automatic alignment of electron tomography images using markers
Sami Sebastian Brandt, Jukka Heikkonen
Electron tomography means reconstructing the interior of an object from its electron microscope images. In order to successfully perform the 3D reconstruction, the motion between consecutive images must be solved, i.e., the images have to be aligned or registered. Using a set of two-dimensional electron microscope images of a three-dimensional object, we propose a method where the registration is automated using small colloidal gold particles as reference markers between images. The alignment problem is divided into several subproblems: (1) finding initial matches from successive images, (2) estimating the epipolar geometry between images, (3) finding and localizing the gold particles with subpixel accuracy in each image, (4) predicting the probable matching gold particles using the epipolar geometry with the disparity information, (5) matching and tracking the gold particles through the tilt series, and (6) optimizing the transformation parameters for the whole image set. The results show the reliability of the suggested method as well as high accuracy in registration.
Nonlinear combining of heterogeneous features in content-based image retrieval
In content-based image retrieval (CBIR), retrieval results based on different features can vary with the way the feature values are combined. Most existing approaches assume a linear relationship between different features, and the usefulness of such systems has been limited by the difficulty of representing high-level concepts using low-level features. In this paper, we introduce the Neural Network-based Image Retrieval (NNIR) system, a human-computer interaction approach to CBIR. By using a Radial Basis Function (RBF) network, this approach determines a nonlinear relationship between features so that more accurate similarity comparisons between images can be supported. The experimental results show that the proposed approach has superior retrieval performance to the existing linear combining approach, the rank-based method, and the Back Propagation-based method. Although the proposed retrieval model is for CBIR, it can easily be extended to handle other media types such as video and audio.
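The nonlinear combination step can be pictured as an RBF layer over per-feature distances. A sketch under that reading; the shapes and parameters below are illustrative stand-ins for quantities NNIR would learn from training or relevance-feedback data, not the paper's actual architecture:

```python
import numpy as np

def rbf_score(feature_dists, centers, widths, weights):
    """Nonlinear combination of per-feature distances via an RBF layer.

    feature_dists: (F,) distances between query and candidate, one per
    feature type; centers (K, F), widths (K,), weights (K,) are assumed
    to be learned parameters."""
    sq = ((feature_dists - centers) ** 2).sum(axis=1)   # (K,) squared dists
    activations = np.exp(-sq / (2.0 * widths ** 2))     # Gaussian units
    return float(weights @ activations)                 # combined similarity
```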
Automatic image generation by genetic algorithms for testing halftoning methods
Timo J. Mantere, Jarmo T. Alander
Automatic test image generation by genetic algorithms is introduced in this work. In general, the proposed method has potential in functional software testing. This study was done by joining two different projects: the first concentrates on software test data generation by genetic algorithms, and the second studied digital halftoning for an ink-jet marking machine, also by genetic algorithm optimization. The object software halftones images with different image filters. The goal was to reveal whether a genetic algorithm is able to generate images that are difficult for the object software to halftone, in other words to find whether some prominent characteristics of the original image disappear or ghost images appear due to the halftoning process. The preliminary results showed that the genetic algorithm is able to find images that are considerably changed when halftoned, and thus reveal potential problems with the halftoning method, i.e. essentially tests for errors in the halftoning software.
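The search loop itself is a conventional GA with the halftoner inside the fitness function. A toy sketch under that reading; the bit-vector image representation, the `halftone` and `damage` callables, and all parameters are illustrative assumptions, not the authors' implementation:

```python
import random

def evolve_test_images(population, halftone, damage, generations=50, p_mut=0.01):
    """Evolve bit-vector 'images' that maximise the damage halftoning
    does to them. Assumes a population of at least four individuals."""
    score = lambda im: damage(im, halftone(im))
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: len(population) // 2]   # elitist selection
        children = []
        while len(parents) + len(children) < len(population):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(a))          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut)   # per-bit mutation
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=score)                  # hardest test image
```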
Knowledge representation and knowledge module structure for uncalibrated vision-guided robots
Minh-Chinh Nguyen, Doan-Trong Bui
A new concept for knowledge representation and the structure of the knowledge module for vision-guided robots is introduced. It allows the robot to acquire, accumulate and adapt automatically whatever knowledge it may need and to gain experience in the course of its normal operation, i.e., learning by doing, and thus to improve its skills and operating speed over time. The knowledge module is structured into a set of fairly independent submodules, each performing a limited task, and sub-knowledge bases, each containing limited knowledge. Such a structure allows the acquired knowledge to be used flexibly and efficiently. It also makes it easy to extend the knowledge base when the number of the robot's degrees of freedom that must be controlled increases. The concept was realized and evaluated in real-world experiments on an uncalibrated vision-guided 5-DOF manipulator grasping a variety of differently shaped objects.
Face detection by learning from positive examples
Shijin Li, Jianfeng Lu, Jingyu Yang
Many existing methods for face detection use both positive examples (faces) and negative examples (nonfaces). By learning only from the positive examples, a novel face detection algorithm is devised, which is made up of two parts. The first is a frontal-view upright face detection algorithm based on the well-known singular value feature (SVF) and hidden Markov models (HMM). The algorithm couples the virtues of both SVF and HMM and produces excellent detection results. First, it is tested on the second part of a large face image library, NUSTFDB603-II, whose first part is used to train the HMM and in which there are 954 face images of 96 persons. The detection rate is 98.32% while only one false alarm is reported. Then it is tested on a collected photo album and detected 85.1 percent of its 484 people, while 97 false alarms are also reported. The second part of our algorithm is the extension of the first to rotation-invariant face detection. Several HMMs are employed simultaneously and the angle of the "face" image is obtained. Then the HMM for detecting upright faces is employed to verify the faceness of the test pattern. This rotation-invariant algorithm is tested on another image set in which there are 173 persons whose faces are rotated randomly. The detection rate is 72.2%, and 34 false alarms are reported.
Estimation of optical flow in a three-dimensional voting space for brightness change
Hiroki Imamura, Yukiko Kenmochi, Kazunori Kotani
The precision of optical flow estimation declines in situations where the brightness changes. Conventional methods have been proposed to solve this problem, each with different effective properties for optical flow estimation under brightness change. We aim to propose a method which combines all the effective properties of the conventional methods for optical flow estimation under brightness change.
Adaptive visual tracking of moving object with neural-PID controller
Liucun Zhu, Hideo Fujimoto, Akihito Sano, et al.
In this paper, an adaptive approach to a nonlinear system is presented for real-time robotic visual tracking of a moving polyhedral object. A light-stripe vision system consists of a laser-stripe sensor and a CCD camera fixed to the robot end-effector, and projects planar light on the polyhedral object. Geometric conditions can be provided to assure the location of features of the polyhedral faces. The objective is to predict the location of features of the object on the image plane based on the light-stripe vision system, and then to determine an optimal control input that will move the camera so that the image features align with their desired positions. We first give the observation and state-space equations using the motion rules of the camera and the object. Then the system can be represented as an MIMO ARMAX model and an efficient estimation model. The estimation model can perform on-line estimation of the 3D parameters relating the camera and the object. Those parameters are used to calculate the system sensitivity of a neural network. The control scheme adopts a neural-PID controller that can adjust the PID controller parameters. The paper concludes with simulation results, and the computer simulation shows that the proposed method is effective for visual tracking combining vision and control.
Color Image Processing
Diffractive 3D-grating optical sensor with trichromatic color constancy adaptation to variable illuminants
Norbert Lauinger, B. Badenhop
Since O. Lummer [6] and the industrial development and production of artificial illuminants it has become more and more evident that between sunlight and human vision a specific, until today unexplained, "resonance" condition apparently exists. Therefore the recommendation holds to approximate as much as possible the spectral energy distribution of artificial illuminants to that of sunlight. Especially in human color vision, spectral shifts of illuminants always lead to hue shifts (combined brightness, hue and saturation shifts) in the perception of colors. These hue shifts in human vision become adaptively compensated with more or less time delay, leading to relatively good "color constancy" under variable illuminants. One explanation model, always far from perfect, the von Kries model, attributes this adaptive compensation of hue shifts to the photopigments in the cones of the human retina. Other, less perfect, models attributing this adaptation to cortical functions also exist [1, 2]. In parallel, the need becomes evident to realize future color sensors "capable to measure colors normalized to the spectral sensitivity curves of human vision" [7]. It might be registered with satisfaction that a growing objectivity comes into this psychophysical field of color constancy in human vision through the publication of more and more precise data on relevant parameters in the physical conditions of the experiments.
Color space conversion for linear color grading
Color grading is an important process in various industries such as food processing, fruit and vegetable grading, etc. Quality and price are often determined by the color of the product. For example, a darker red color for apples means a higher price. In color machine vision applications, an image is acquired with a color CCD camera that outputs color information in three channels: red, green, and blue. When grading color, these three primary colors must be processed to determine the color level for separation. A very popular color space conversion technique for color image processing is RGB-to-HSI, where HSI represents hue, saturation, and intensity, respectively. However, the conversion result is still 3D information, which makes determining color grades very difficult. A new color space conversion technique that can be implemented for high-speed real-time processing for color grading is introduced in this paper. Depending on the application, different color space conversion equations must be used. The result of this technique is a simple one-dimensional array that represents different color levels. This linear array makes linear color grading adjustment possible.
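For contrast with the 1D output the paper proposes, the conventional RGB-to-HSI conversion it refers to looks like this (one common formulation; the paper's own application-specific grading equations are not reproduced here):

```python
import math

def rgb_to_hsi(r, g, b):
    """Classic RGB-to-HSI conversion; inputs are 0-255 channel values."""
    r, g, b = (v / 255.0 for v in (r, g, b))
    i = (r + g + b) / 3.0                              # intensity
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i      # saturation
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b)) or 1e-12
    h = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    if b > g:                                          # hue in [0, 360);
        h = 360.0 - h                                  # undefined for grays
    return h, s, i
```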
Finding colored objects under different illumination conditions in robotic applications
Dietmar Legenstein, Markus Vincze, Stefan Chroust
Color information is very useful for object recognition, but the measured image color of objects depends on the scene illumination. Human vision exhibits color constancy for an object under a wide range of illumination conditions. A similar ability is required if computer vision systems are to recognize objects in uncontrolled environments. Until now there has been no satisfactory approach for finding a colored object under different illuminations and using different cameras. To find the colored object, the following approach is proposed and has been verified with several demonstrations. The first step is to take a picture of the known object and a reference template of eighteen colored strips. Using the color values of the reference template, the object color can be related to the reference colors. The key idea is that the robot can take the template with the 18 colored strips with it. In other rooms, with other illumination conditions, the robot can look at this reference template and calibrate its camera to the different illumination conditions. The color calibration of the camera is achieved with a transformation matrix, and the standard deviation of the color transformation is used to evaluate the quality of the transformation. Under the new illumination conditions, this transformation is used to search for the color of the object in the new room. The found object color is related to the reference values of the sample using the transformation into the new color space. The colored object itself is found using multi-spectral classification via a Gaussian distribution. First demonstrations indicate that this approach is very robust against different illumination conditions, ranging from neon light to direct sunlight, and can be used rather simply by each robot.
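Fitting the transformation matrix from the 18 template strips is an ordinary least-squares problem, and its residual spread gives exactly the quality measure the abstract mentions. A minimal sketch (array shapes and naming are illustrative):

```python
import numpy as np

def fit_color_transform(measured, reference):
    """Fit a 3x3 matrix T with measured @ T ~ reference, from the 18
    template strips seen under the new illuminant (both arrays (18, 3)).
    Returns T and the per-channel residual standard deviation."""
    M = np.asarray(measured, float)
    R = np.asarray(reference, float)
    T, *_ = np.linalg.lstsq(M, R, rcond=None)
    residuals = M @ T - R            # spread measures transform quality
    return T, residuals.std(axis=0)

# Usage: map any observed pixel into the reference color space.
# calibrated_rgb = observed_rgb @ T
```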
Comparison of PCA and ICA in color recognition
Hannu Tapani Laamanen, Timo Jaeaeskelaeinen, Jussi P. S. Parkkinen
It has been shown that a large dataset of color spectra can be represented as linear combinations of a few principal spectra. These principal spectra, which form the basis of a vector subspace, are usually generated by Principal Component Analysis (PCA), a method widely applied to the analysis of spectral data. The objective of the present study was the comparison of PCA and its extension, Independent Component Analysis (ICA). ICA is a statistical signal processing technique which tries to express measured signals as a linear combination of unknown source signals. Both methods were applied to a set of 1269 reflectance spectra of the chips in the Munsell Book of Color-Matte Finish Collection and a set of 922 reflectance spectra of the samples in the Pantone Color Formula Guide. Several bases with different numbers of principal spectra were generated. Each Munsell and Pantone basis was used to reconstruct both the Munsell and the Pantone color spectra. The accuracy of the reconstruction was measured mainly by means of ΔE*ab (CIELAB) color differences, but the spectral reconstruction errors were also determined. The dimension of the subspaces leading to a given reconstruction accuracy is discussed in the paper.
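The PCA half of the comparison is compact enough to state directly: build a basis from the mean-centered spectra via SVD, then reconstruct each test spectrum from its first k coefficients. A sketch (the array layout is an assumption):

```python
import numpy as np

def principal_spectra(spectra, k):
    """First k principal spectra of an (n_samples, n_wavelengths) set."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:k]                      # rows are principal spectra

def reconstruct(spectrum, mean, basis):
    """Project a spectrum onto the basis and rebuild it; the residual
    (or a CIELAB color difference after conversion) measures how well
    the basis spans the data."""
    coeffs = (spectrum - mean) @ basis.T
    return mean + coeffs @ basis
```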
Inspection
Vision system and three-dimensional modeling techniques for quantification of the morphology of irregular particles
Particulate materials undergo processing in many industries, and therefore there are significant commercial motivators for attaining improvements in the flow and packing behavior of powders. This can be achieved by modeling the effects of particle size, friction, and most importantly, particle shape or morphology. The method presented here for simulating powders employs a random number generator to construct a model of a random particle by combining a sphere with a number of smaller spheres. The resulting 3D model particle has a nodular type of morphology, which is similar to that exhibited by the atomized powders that are used in the bulk of powder metallurgy (PM) manufacture. The irregularity of the model particles is dependent upon vision system data gathered from microscopic analysis of real powder particles. A methodology is proposed whereby randomly generated model particles of various sized and irregularities can be combined in a random packing simulation. The proposed Monte Carlo technique would allow incorporation of the effects of gravity, wall friction, and inter-particle friction. The improvements in simulation realism that this method is expected to provide would prove useful for controlling powder production, and for predicting die fill behavior during the production of PM parts.