Proceedings Volume 6051

Optomechatronic Machine Vision


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 5 December 2005
Contents: 10 Sessions, 46 Papers, 0 Presentations
Conference: Optomechatronic Technologies 2005
Volume Number: 6051

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Pattern Search
  • Range Imaging and Calibration
  • Face and Gesture
  • Image-Based Measurement
  • Imaging Basics
  • Object Detection and Visualization
  • 3D Measurement: Range Image Processing
  • Security and Safety
  • Tracking and Calibration
  • Poster Session
Pattern Search
Fast and robust rotation-invariant search by using orientation code difference histogram (OCDH)
A rotation-invariant template matching scheme using the Orientation Code Difference Histogram (OCDH) is proposed. Orientation code features, which are based on local distributions of pixel brightness and are substantially robust against severe changes in illumination, play the main role in designing the rotation-invariant matching algorithm. Since the difference between any pair of orientation codes is invariant under rotation of the image, a histogram feature can be built from these differences, aggregating effective clues for finding rotated instances through simple procedures. With gray-scale images as targets, the rotation angle of an image can be accurately estimated by the proposed method, which is fast and robust even in the presence of irregularities such as brightness changes caused by shading or highlighting. We propose a two-stage framework for rotation-invariant template matching based on OCDH: in the first stage, candidate positions are selected by evaluating the OCDH at every position, and in the second stage they are tested by a verification step that is also based on orientation code features. The effectiveness of the proposed matching method has been shown through many kinds of experiments with real-world images.
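As a rough illustration of the idea that pairwise orientation-code differences are rotation invariant, the following is a minimal sketch, not the authors' implementation; the Sobel-based code computation, the code count N = 16, and the low-contrast threshold are assumptions of this sketch.

```python
import numpy as np
from scipy import ndimage

N_CODES = 16          # number of quantized gradient orientations (assumed)
LOW_CONTRAST = 8.0    # pixels with gradient magnitude below this get no code

def orientation_codes(img):
    """Quantize each pixel's gradient orientation into N_CODES codes (-1 = invalid)."""
    gy = ndimage.sobel(img.astype(float), axis=0)
    gx = ndimage.sobel(img.astype(float), axis=1)
    codes = np.floor((np.arctan2(gy, gx) + np.pi) / (2 * np.pi) * N_CODES) % N_CODES
    codes = codes.astype(int)
    codes[np.hypot(gx, gy) < LOW_CONTRAST] = -1
    return codes

def ocdh(codes):
    """Histogram of code differences over all pixel pairs in a region.
    A rotation shifts every code by the same amount, so pairwise differences
    (mod N_CODES) are unchanged; the all-pairs difference histogram equals the
    circular autocorrelation of the plain code histogram."""
    valid = codes[codes >= 0]
    h = np.bincount(valid, minlength=N_CODES).astype(float)
    h /= max(h.sum(), 1.0)
    return np.array([np.dot(h, np.roll(h, -d)) for d in range(N_CODES)])

def ocdh_distance(h1, h2):
    """L1 distance between OCDH features of a template and a candidate window."""
    return np.abs(h1 - h2).sum()
```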
Efficient template matching algorithm based on interval estimations on correlation
We propose an efficient template matching algorithm for binary image search. With template matching techniques, the computational cost depends on the image size; for large images, much time is spent searching the scene image for objects similar to the template. We design a scanning-type upper-limit estimation that allows unnecessary correlation calculations to be skipped. To calculate the scanning-type upper limits, the template and scene images are divided into two regions: an R-region and a P-region. In the R-region, an upper limit of the correlation coefficient is derived as an interval estimate based on a mathematical analysis of the correlations of the object image and a pivot image. In the P-region, another upper limit is formulated from the numbers of white and black pixels in the template and the object image. Using these upper limits, a scanning-type upper-limit estimate of the correlation coefficient is formulated for the efficient matching algorithm. Because the estimated upper limits never fall below the true correlation values, the proposed search attains the same accuracy as the conventional exhaustive search. Experiments with document images show the effectiveness and efficiency of the proposed matching algorithm: its computation time is between 5% and 20% of that of the conventional search.
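The pruning principle can be illustrated with a simpler count-based upper bound on the binary correlation coefficient (an analogue of the P-region limit, not the authors' exact R-/P-region formulation). The bound below uses only pixel counts and never falls below the true correlation, so skipping windows whose bound cannot beat the current best preserves accuracy.

```python
import numpy as np

def count_bound(n, s_t, s_w):
    """Upper bound on the correlation coefficient between a binary template
    with s_t ones and a binary window with s_w ones (n pixels each), using
    sum(template * window) <= min(s_t, s_w)."""
    var_t = n * s_t - s_t * s_t
    var_w = n * s_w - s_w * s_w
    if var_t <= 0 or var_w <= 0:
        return 1.0                       # degenerate window: do not prune
    return (n * min(s_t, s_w) - s_t * s_w) / np.sqrt(var_t * var_w)

def search(scene, tmpl):
    """Return the (row, col) maximizing the correlation coefficient, skipping
    full evaluation wherever the bound cannot beat the current best.
    scene/tmpl are 0/1 numpy arrays, e.g. thresholded document images."""
    H, W = scene.shape
    h, w = tmpl.shape
    n = h * w
    s_t = int(tmpl.sum())
    t0 = tmpl - tmpl.mean()
    best, best_pos, skipped = -2.0, None, 0
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            win = scene[r:r + h, c:c + w]
            s_w = int(win.sum())         # in practice taken from an integral image
            if count_bound(n, s_t, s_w) <= best:
                skipped += 1             # bound proves this window cannot win
                continue
            w0 = win - win.mean()
            denom = np.sqrt((t0 * t0).sum() * (w0 * w0).sum())
            score = (t0 * w0).sum() / denom if denom > 0 else -1.0
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best, skipped
```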
A fast tag searching method based on orientation code entropy and density
This paper proposes a fast method for searching images of the environment for tagged regions, even in the presence of scale changes. A scheme has been proposed for extracting feature areas as tags based on a robust image registration algorithm called Orientation Code Matching. Extracted tags are stored as template images and used in tag searching. As the number of tags grows, the searching cost becomes a serious problem; in addition, changes in viewing position cause scale changes in the image and matching failures. In our scheme, richness in features is important for tag generation, and entropy is used to evaluate the diversity of edge directions, which is stable under scale changes of the image. This characteristic helps limit the search area and reduce the calculation cost. Scaling factors are estimated from the orientation code density, defined as the percentage of effective codes in fixed-size tag areas; an estimated scaling factor is applied to match the scale of the template images to that of the observation images. Experiments on real scenes compare computation times and verify the effectiveness of the estimated scaling factors.
Vehicle detection using Gaussian mixture model for infrared orientation code image
Nami Hirata, Haruhisa Okuda, Makito Seki, et al.
This paper describes an approach to the detection of vehicles in infrared images. Stable vehicle detection is important for future intelligent transport systems and is generally done by background subtraction and object modeling. To avoid the daylight- and weather-dependent influences of varying illumination in visible images acquired with conventional ITV cameras, some researchers have been using infrared (IR) images. IR images make it easy to extract foreground vehicle regions from background scenes, but their lack of clarity makes object modeling difficult. We therefore propose a method that describes the internal pattern of each vehicle by using Gaussian mixture models (GMM) in the orientation-code image (OCI) space. Each pixel of an OCI carries information about the maximum-gradient orientation of the IR image rather than intensity information. Gradient orientation information does not depend on contrast and can describe the internal pattern structures of objects even in unclear IR images. We use the GMM to describe the topological structures of the internal patterns of vehicles; this approach also suppresses the influence of small differences between patterns. Evaluation tests with actual infrared video sequences have shown that the proposed algorithm provides stable vehicle detection.
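A minimal sketch of the overall idea: fit a Gaussian mixture to samples drawn from an orientation-code image and score candidate regions by likelihood. The feature design (normalized position plus code angle), the component count, and scikit-learn's GaussianMixture are assumptions of this sketch, not the authors' model.

```python
import numpy as np
from scipy import ndimage
from sklearn.mixture import GaussianMixture

N_CODES = 16

def orientation_code_image(img, low_contrast=8.0):
    gy = ndimage.sobel(img.astype(float), axis=0)
    gx = ndimage.sobel(img.astype(float), axis=1)
    codes = np.floor((np.arctan2(gy, gx) + np.pi) / (2 * np.pi) * N_CODES) % N_CODES
    codes[np.hypot(gx, gy) < low_contrast] = -1     # flat pixels get no code
    return codes.astype(int)

def region_features(codes):
    """(x, y, cos, sin) samples for valid-code pixels, with x, y scaled to [0, 1]."""
    ys, xs = np.nonzero(codes >= 0)
    ang = codes[ys, xs] * 2 * np.pi / N_CODES
    h, w = codes.shape
    return np.column_stack([xs / w, ys / h, np.cos(ang), np.sin(ang)])

def fit_vehicle_model(train_patches, n_components=5):
    """Fit a GMM to orientation-code samples from example vehicle patches."""
    X = np.vstack([region_features(orientation_code_image(p)) for p in train_patches])
    return GaussianMixture(n_components=n_components, covariance_type="full",
                           random_state=0).fit(X)

def vehicle_score(model, patch):
    """Higher average log-likelihood -> candidate looks more like a vehicle."""
    X = region_features(orientation_code_image(patch))
    return model.score(X) if len(X) else -np.inf
```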
Real time texture and global characters based station keeping for underwater vehicle
Xiaomin Liu, Feng Zhu, Yingming Hao, et al.
A station-keeping visual servoing method for an underwater vehicle, based on the analysis of texture and global region characteristics, is described in this paper. Most other systems with the same function rely on artificial targets or other salient features such as corners, lines, or outlines for servoing. In some cases, however, the target, especially a natural object, does not offer many salient features for identification, whereas texture and certain region characteristics can be regarded as inherent features. Natural texture elements have spatial relationships that change with the distance and relative position between the camera and the texture elements. After an analysis of texture, this paper gives an automatic texture-region recognition and tracking algorithm. A satisfactory result from simulated servo control of a four-degree-of-freedom underwater vehicle is shown at the end of the paper.
Object detection using independent local feature extractor
Ryouta Nakano, Kazuhiro Hotta, Haruhisa Takahashi
This paper presents an object detection method using independent local feature extractors. In general, objects can be considered combinations of characteristic parts; therefore, if local parts specialized for the recognition target are obtained automatically from training samples, a good object detector can be developed. For this purpose, we use Independent Component Analysis (ICA), which decomposes a signal into independent elementary signals. The basis vectors obtained by ICA are used as independent local feature extractors specialized for the detection target. The feature extractors are applied to a candidate region, and their outputs are used for classification. However, the extracted features are independent local features, and the relative information between neighboring positions of the independent features may be more effective for object detection than the simple independent features alone. To extract this relative information, higher-order local autocorrelation features are used. To classify detection targets and non-targets, we use a Support Vector Machine, a well-known binary classifier. The proposed method is applied to a car detection problem, and superior results are obtained in comparison with Principal Component Analysis.
Range Imaging and Calibration
A new foveated wide angle lens with high resolving power and without brightness loss in the periphery
K. Wakamiya, T. Senga, K. Isagi, et al.
A new foveated wide-angle lens with high resolving power and without brightness loss in the periphery is developed. A "foveated lens" is an optical system that has both a large field of view and high resolution in the center of the view field. While it is ideal to keep uniform brightness over the whole view field, widening the field of view without losing brightness in the periphery had not been achieved by several previous designs and fabrications. In the new design, telecentric light is kept on the image side and a more negative distortion is added to the sine law. As a result, a view field of more than 140 degrees is achieved with very little loss of brightness in the periphery. The evaluation methods and results for the fundamental optical performance of the fabricated lenses are described. The performance of the proposed lens for several basic visual tasks, such as the readability of characters, the visibility in the view circumference, and the possibility of stereo vision, is compared with that of other optical systems with a wide field of view. Prospective applications of the proposed lens are also discussed.
Range measurement by a digital camera using flash
Various methods have been proposed for range measurement or three-dimensional shape reconstruction. However, most of them need large-scale equipment or a special environment. This paper proposes a technique that obtains a range image easily in a general environment using only an off-the-shelf digital camera. Distance is calculated from the irradiance of the scene lit by the camera's flash, using the fact that the intensity of the reflected flash light is inversely proportional to the square of the distance to the object. The irradiance is obtained by subtracting an image taken without the flash from an image taken with the flash; the image without the flash is also used to obtain the reflectance ratio at each pixel. The intensity of the reflected flash light is affected by the inclination of the object surface, so a method is proposed to estimate the inclination at each pixel from the change of the irradiance between adjacent pixels. The inclination is formulated as a function of this rate of change and can thus be calculated from the rate, which is easily obtained from the image. Additionally, color information is obtained simultaneously because visible light is used. The method assumes that the object surface has no specular reflection and that the flash is located at the same position as the center of the lens. Experiments show that a range image is roughly obtained by the proposed method and, furthermore, that proper distances are obtained for inclined surfaces.
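The core relation is that the flash-only irradiance falls off as reflectance over distance squared, so relative distance behaves like sqrt(reflectance / flash irradiance). A minimal sketch under simplifying assumptions (surface inclination ignored, no-flash image used as a reflectance proxy):

```python
import numpy as np

def relative_range(with_flash, without_flash, eps=1e-3):
    """with_flash / without_flash: linear-intensity grayscale images (floats).
    Returns a range map up to an unknown global scale factor."""
    flash_only = np.clip(with_flash - without_flash, eps, None)   # flash irradiance
    reflectance = np.clip(without_flash, eps, None)               # proxy for albedo
    d_rel = np.sqrt(reflectance / flash_only)
    return d_rel / d_rel.max()                                    # normalize to [0, 1]
```

A single known reference distance in the scene would fix the unknown global scale.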
Model-based segmentation and recognition from range data
This paper aims at developing a model-based system for the recognition of three-dimensional objects with curved surfaces using range images. The model data are represented using a CAD model, providing a mathematically precise and reliable description of arbitrary shapes. The proposed method is based on model-based range image segmentation, using curvature as an invariant feature. By integrating model information into the segmentation stage, the segmentation process is guided to provide a partitioning corresponding to that of the CAD model. The work provides a way to detect objects in arbitrary positions and to derive the transformation onto a CAD model, and thereby contributes to the development of automated systems in the areas of inspection, manufacturing, and robotics.
Automatic analysis for neuron by confocal laser scanning microscope
Kouhei Satou, Yoshimitsu Aoki, Nobuko Mataga, et al.
The aim of this study is to develop a system that recognizes both the macro- and microscopic configurations of nerve cells and automatically performs the necessary 3-D measurements and functional classification of spines. The acquisition of 3-D images of cranial nerves has been enabled by the use of a confocal laser scanning microscope, although the highly accurate 3-D measurements of the microscopic structures of cranial nerves and their classification based on their configurations have not yet been accomplished. In this study, in order to obtain highly accurate measurements of the microscopic structures of cranial nerves, existing positions of spines were predicted by the 2-D image processing of tomographic images. Next, based on the positions that were predicted on the 2-D images, the positions and configurations of the spines were determined more accurately by 3-D image processing of the volume data. We report the successful construction of an automatic analysis system that uses a coarse-to-fine technique to analyze the microscopic structures of cranial nerves with high speed and accuracy by combining 2-D and 3-D image analyses.
Face and Gesture
Face and facial parts tracking for acquiring nonverbal information
Since the basic configuration of the facial caricaturing system PICASSO has been constructed at our laboratory, it is strongly desirable to obtain sufficient input images from a person behaving naturally in front of the PICASSO camera system. From this viewpoint, we developed a PC-based face tracking system for capturing facial images of sufficient size by means of a PTZ (pan-tilt-zoom) camera working in collaboration with a fixed CCD camera. Irises are successfully recognized in the motion images captured by the PTZ camera; these irises can provide a key feature for realizing an automated facial recognition system. In this system, a person behaving naturally in pose and facial expression within the field of view of the fixed CCD camera can be tracked stably, and the high-resolution images from the PTZ camera were successfully analyzed for iris recognition and facial part extraction. The face tracking and recognition system is characterized by a novel template replacement scheme applied across successive image frames. Experimental results are also demonstrated in this paper. The system runs at a practical speed of 6-9 fps on an ordinary PC connected to the two cameras.
Real-time iris detection on faces with coronal and transversal axis rotation
Claudio A. Perez, Vanel A. Lazcano
Real-time face and iris detection in video sequences is important in diverse applications such as the study of eye function, drowsiness detection, man-machine interfaces, face recognition, security, and multimedia retrieval. In this work we present an extension of our previous method to incorporate real-time face and iris detection in faces with coronal and transversal axis rotations. The method is based on anthropometric templates and consists of three stages: coarse face detection, fine face detection, and iris detection. In the coarse face detection, a directional image is computed and the contribution of each directional vector is weighted in an accumulator; the highest score in the accumulator is taken as the coarse face position. Then a high-resolution directional image is computed. Face templates were constructed off-line for coronal and transversal face rotations, using face features such as the elliptical shape and the locations of the eyebrows, nose, and lips. A line integral is computed with these templates over the fine directional image to find the actual face location, size, and rotation angle. This information provides a region in which to search for the eyes, and the iris boundary is detected within this region by the ratio between two line integrals using a semicircular template. Results computed on five video sequences, which include coronal and transversal rotations over more than 1900 frames, show a correct face detection rate above 92% and an iris detection rate above 86%.
Robust face detection using individual face parts classifiers based on AdaBoost
Kiyoto Ichikawa, Takeshi Mita, Osamu Hori
We present a robust frontal face detection method that identifies face positions in images by combining the results of a low-resolution whole-face classifier and individual face-part classifiers. Our approach is to use face-part information and to change the identification strategy based on the results from the individual face-part classifiers. Faces are detected by scanning the classifiers over an input image. The classifiers for whole-face and individual face-part detection were implemented with an AdaBoost algorithm. We propose a novel method based on a decision tree to improve the performance of face detectors on occluded faces; the decision tree distinguishes partially occluded faces based on the results from the individual classifiers. Preliminary experiments on a test sample set containing non-occluded and occluded faces indicated that our method achieves better results than conventional methods, and experiments on real images also showed better results.
Fusion of hand and arm gestures
D. Coquin, E. Benoit, H. Sawada, et al.
In order to improve the link between an operator and a machine, some human-oriented communication systems now use natural languages such as speech or gesture. The goal of this paper is to present a gesture recognition system based on the fusion of measurements from different kinds of sources. Sensors are needed that can capture at least the position and the orientation of the hand, such as a Dataglove and a video camera. The Dataglove gives a measure of the hand posture, and the video camera gives a measure of the overall arm gesture, which represents the physical and spatial properties of the gesture and is based on a 2D skeleton representation of the arm. The measurements used are partially complementary and partially redundant. The application is distributed over intelligent cooperating sensors. The paper presents the measurement of the hand and arm gestures, the fusion processes, and the implementation.
Image-Based Measurement
Defect classification for the inspection of TFT LCD glass
DaeCheol Lim, Dae-Gyu Seo, DaeHwa Jeong
Serious pattern defects and particles co-exist on the glass, and only a few of the defects cause serious quality problems. If there were a way to classify defects by their potential severity, it would help control product quality and reduce review time. This paper presents a method to classify defects using review images. First, several defect types were investigated to develop an algorithm. Next, the efficiency of the algorithm was verified in a plant; the results were good enough to make use of the classified defect-type information. Finally, the algorithm was applied to filter out information about trivial defects, which increased the throughput of the whole process with little risk.
Comparison of linear and non-linear calibration methods for phase-shifting surface-geometry measurement
In fringe-projection surface-geometry measurement, phase unwrapping techniques produce a continuous phase distribution that contains the height information of the 3-D object surface. To convert the phase distribution to the height of the 3-D object surface, a phase-height conversion algorithm is needed; it is essentially determined by the system calibration, which depends on the system geometry. Both linear and non-linear approaches have been used to determine the mapping relationship between the phase distribution and the height of the object; however, the latter has often involved complex derivations. In this paper, the mapping relationship between the phase and the height of the object surface is formulated using linear mapping, and using non-linear equations developed through a simplified geometrical derivation, and a comparison is made between the two approaches. For both methods the system calibration is carried out using a least-squares approach, and the accuracy of the calibration is determined both by simulation and by experiment. The accuracy of measurement using the linear calibration data was higher than that using the non-linear calibration data over most of the measurement depth range.
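For the linear option, the calibration reduces to a per-pixel least-squares fit of height against unwrapped phase using reference planes at known heights. The following is a minimal sketch of that fit, assuming a per-pixel model z = a + b*phi; the non-linear variant would replace this model with the geometry-derived mapping.

```python
import numpy as np

def fit_linear_phase_to_height(phase_stack, heights):
    """phase_stack: (K, H, W) unwrapped phase maps of K flat reference planes.
    heights: length-K array of the known plane heights.
    Returns per-pixel coefficients (a, b) of the model z = a + b * phi."""
    phi = np.asarray(phase_stack, dtype=float)            # (K, H, W)
    z = np.asarray(heights, dtype=float)[:, None, None]   # (K, 1, 1)
    phi_mean = phi.mean(axis=0)
    z_mean = z.mean()
    num = ((phi - phi_mean) * (z - z_mean)).sum(axis=0)   # per-pixel covariance
    den = ((phi - phi_mean) ** 2).sum(axis=0)             # per-pixel variance
    b = num / den
    a = z_mean - b * phi_mean
    return a, b

def phase_to_height(phase_map, a, b):
    """Convert a measured phase map to a height map with the fitted coefficients."""
    return a + b * phase_map
```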
Non-contact measuring system in sinusoidal phase modulating interferometry using a laser diode
Ki-Young Pyo, Geun-Young Lee, Weon-Jae Ryu, et al.
Recently, laser interferometry has been widely used as a measuring technique in many fields because of its high resolution and its ability to measure a broad area in real time all at once. Conventional laser interferometry, for example out-of-plane ESPI (electronic speckle pattern interferometry), in-plane ESPI, shearography, and holography, uses a PZT or other components as phase-shifting instrumentation to extract 3-D deformation data, vibration modes, and so on. However, in most cases the PZT has disadvantages, including nonlinear errors and a limited service life. In the present study, a new type of laser interferometry using a laser diode is proposed. With laser diode sinusoidal phase modulating (LD-SPM) interferometry, the phase can be modulated directly by controlling the laser diode injection current, thereby eliminating the need for the PZT and its components and making the interferometer more compact.
Imaging Basics
OK-quantization method and its theoretical and experimental properties for simultaneous digitization both in space and in value
The OK-Quantization Theory for digitization in value ensures that the probability density function of the image can be reconstructed. This paper presents experimental demonstrations of reducing the number of gray levels and shows, mainly, that there is a necessary analytical relationship between sampling and quantization, based on the equivalence of two kinds of integral, the Riemann and Lebesgue integrals, for calculating the volume of the image.
Simultaneous observation of phase-stepped images for birefringence measurement
S. Yoneyama, H. Kikuta, K. Moriwaki
An instantaneous phase-stepping and subsequent phase analysis method, using a CCD camera with a form-birefringent micro-retarder array, is proposed for two-dimensional birefringence distribution measurement. A birefringent sample placed behind a polarizer and a quarter-wave plate is analyzed by the proposed method. Light emerging from the sample is recorded by a CCD camera that has a micro-retarder array on the CCD plane. This micro-retarder array has four different principal directions; that is, an image obtained by the CCD camera contains four data sets corresponding to the four different optical axes of the retarder. The four images separated from the recorded image are reconstructed using gray-level interpolation. Then the distributions of the Stokes parameters, which represent the state of polarization, are calculated from the four images, and the birefringence parameters, that is, the principal direction and the phase retardation, are obtained from these Stokes parameters. This method is applicable to real-time inspection of optical elements as well as to the study of the mechanics of time-dependent phenomena, because multiple exposures are unnecessary to acquire sufficient data for the analysis.
Object Detection and Visualization
Object detection based on radial reach filter under the change of background
In this paper, we propose a new method of object detection. Various object detection methods have been proposed in the past; in particular, background subtraction is effective. However, methods based on brightness differences are easily influenced by changes in lighting conditions. In this paper, we use the Radial Reach Filter (RRF), which is known to be robust against changes in lighting conditions. However, RRF does not consider changes caused by moving objects in the background image. We therefore propose a new object detection method that takes the motion of moving objects in the background image into account, and we verify its effectiveness through experiments using time-series images.
The support system of firefighters by detecting objects in smoke space
In recent years, crisis-management response to terrorist attacks and natural disasters, as well as the acceleration of rescue operations, has become an important issue. We aim to build a support system for firefighters by applying various engineering techniques such as information technology and radar technology. In rescue operations, one of the biggest problems is that the firefighters' view is obstructed by dense smoke; one current countermeasure is the use of search sticks, much as a blind person uses a cane when walking in town. The most important task for firefighters is to understand the situation inside a space filled with dense smoke. Our system therefore supports firefighters' activity by visualizing such a space. First, we scan the smoke-filled target space using millimeter-wave radar combined with a gyro sensor. Multiple directional scans are obtained, and a 3D map is constructed from the high-reflection point data set using 3D image processing techniques (3D grouping and labeling). In this paper, we introduce the system and report the results of an experiment in a real smoke-filled space, together with practical achievements.
Mixed reality orthognathic surgical simulation by entity model manipulation and 3D-image display
Tatsunari Shimonagayoshi, Yoshimitsu Aoki, Kenji Fushima, et al.
In orthognathic surgery, 3D surgical planning that considers the balance between the front and back positions and the symmetry of the jawbone, as well as the dental occlusion of the teeth, is essential. In this study, a support system for orthodontic surgery has been developed to visualize the changes in the mandible and the occlusal condition and to determine the optimum position in mandibular osteotomy. By integrating an operating portion, in which the optimum occlusal position is determined by manipulating a physical tooth model, with a 3D image display portion, in which the 3D-CT skeletal images are simultaneously displayed in real time, the mandibular position and posture that improve both the skeletal morphology and the occlusal condition can be determined. The realistic operation of the physical model combined with the virtual 3D image display enabled the construction of a surgical simulation system that involves augmented reality.
3D Measurement: Range Image Processing
Fast and high-accurate 3D registration algorithm using hierarchical M-ICP
Haruhisa Okuda, Yasuo Kitaaki, Manabu Hashimoto, et al.
This paper presents a novel fast and highly accurate 3-D registration algorithm. The ICP (Iterative Closest Point) algorithm aligns the 3-D points of two data sets by iteratively matching each point to its closest counterpart and minimizing an evaluation value. The algorithm is widely used because it is applicable to many tasks, but it is computationally expensive and very sensitive to outliers, since it uses all the data points of both sets and least-squares optimization. We previously proposed the M-ICP algorithm, an extension of ICP based on modified M-estimation that is robust against gross outlying noise. The algorithm proposed here, HM-ICP (hierarchical M-ICP), extends M-ICP by selecting regions for matching and searching the selected regions hierarchically. Regions are selected by evaluating the variance of the distance values in the target region and by homogeneous topological mapping. Fundamental experiments using real 3-D measurement data sets show the effectiveness of the proposed method: the computational cost is reduced by more than four orders of magnitude, and the error is less than 0.1% of the measurement distance.
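For orientation, the sketch below shows ICP iterations with M-estimator-style reweighting of correspondences, the kind of robust registration that M-ICP and HM-ICP build on; the Tukey weights, median scaling, and k-d-tree nearest-neighbour search are illustrative choices, and the paper's region selection and hierarchical search are not reproduced.

```python
import numpy as np
from scipy.spatial import cKDTree

def tukey_weights(residuals, c):
    """Tukey biweight: down-weights and finally rejects large residuals."""
    w = np.zeros_like(residuals)
    inliers = residuals < c
    w[inliers] = (1.0 - (residuals[inliers] / c) ** 2) ** 2
    return w

def rigid_fit(src, dst, w):
    """Weighted least-squares rigid transform (R, t) mapping src -> dst."""
    w = w / max(w.sum(), 1e-12)
    mu_s = (w[:, None] * src).sum(axis=0)
    mu_d = (w[:, None] * dst).sum(axis=0)
    H = (w[:, None] * (src - mu_s)).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, mu_d - R @ mu_s

def robust_icp(src, dst, iters=30, c=3.0):
    """src, dst: (N, 3) / (M, 3) point clouds. Returns R, t aligning src to dst."""
    tree = cKDTree(dst)
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = src @ R.T + t
        d, idx = tree.query(moved)              # closest points in dst
        scale = np.median(d) + 1e-12
        w = tukey_weights(d / scale, c)         # suppress gross outliers
        R, t = rigid_fit(src, dst[idx], w)
    return R, t
```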
Mask scale adjusting stereo matching method for an object with curved surface
A stereo matching process is needed to obtain 3D information from a stereo image pair. Many sophisticated methods have been proposed for establishing the correspondence in a stereo pair, but they all aim at global matching over the whole scene. Recent studies on the human visual system report volume perception under binocular viewing, in which an object can be perceived as extending to a depth considerably greater than that expected from its visible parts; the binocularly unpaired areas of the object were found to play an indispensable role in this volume perception. To simulate volume perception in computer vision, a more precise matching process is required in the areas adjacent to the unpaired regions of an object. However, because of some inherent problems, conventional stereo matching methods cannot obtain detailed information in the areas adjacent to the contour of an object with a curved surface, even under ideal, noise-free conditions. The authors investigated stereo matching in these contour-adjacent areas in detail and found that the different shrinking ratios between the two stereo images are the essential factor causing the difficulty. To solve this problem, the authors devise a mask-scale-adjusting stereo matching method that improves the matching accuracy, especially in the areas adjacent to the binocularly unpaired parts. By applying the proposed method, more precise 3D information about an object can be obtained, and volume perception could then be simulated in computer vision.
Influence of the projected grid pattern distortions on measurement accuracy for phase shift based 3D inspection
Igor Dunin-Barkowski, Jae Seon Kim
Recently, 3D inspection tasks have become more and more important, especially in the electronics manufacturing industry, in areas such as solder paste inspection, wafer bump inspection, ball grid array (BGA) and leadless package inspection, and pre- and post-reflow surface mount technology (SMT) board inspection, and the number of these tasks is growing rapidly. The main trend in these applications is that the sizes of the inspected objects are decreasing to tens of microns, which raises the accuracy requirements for inspection systems, including both range and lateral resolution. All these factors create strong demand for 3D measurement methods that combine high resolution and accuracy with high-speed scanning and measurement. This is why phase-shift profilometry methods, based on projecting a structured moire-like light pattern onto the object's surface and measuring the resultant phase shift, are becoming more and more popular owing to their efficiency, precision, and robustness. To ensure the precision of these methods, the projected pattern must be perfectly sinusoidal and have a predefined grid pitch (period). These parameters are subject to various distortions caused by several factors, among them variations of the projector's working distance due to changes in object position, grating distortions, non-sinusoidality of the grating's transmission profile, and grating pitch variations. Compensation methods for neutralizing the influence of these factors are presented in this paper, along with simulation results and experimental results obtained with a 3D measurement head developed by the authors for solder paste inspection.
Security and Safety
Development of a bathroom watching system based on breath detection and silhouette extraction
Tomofumi Nishiura, Masato Nakajima
Sudden death in bathrooms is an important social problem in Japan. This paper proposes a bathroom watching system with the aim of detecting bathing people who are drowning. This system employs a fiber grating vision system and a color camera to detect breathing and the position of a bather, as well as to perform a self-diagnosis of the system operational state. The effectiveness of these functions was verified through experiments.
Real-time violent action detector for elevator
Kentaro Hayashi, Makito Seki, Takahide Hirai, et al.
This paper presents a new critical-event detection method simplified for an embedded appliance mounted on an elevator car. We first define critical events as unusual actions such as violent actions and struggles, and introduce the violent action degree (VAD). We use an optical-flow-based method to analyze the current state of motion through an ITV (industrial television) camera. After the motion analysis, we calculate a normalized statistical value, the VAD, which is the product of the optical flow direction variance, the optical flow magnitude variance, and the optical flow area. Our method calculates the variance of this statistical value and normalizes the value by that variance; finally, critical events are detected by thresholding the VAD. We implemented this method on an embedded appliance that has an A/D converter with a specially designed frame buffer, a 400-MIPS high-performance microprocessor, dynamic memory, and flash ROM. Since the method must run at 4 Hz or faster to maintain detection performance, we shrink the images to 80 x 60 pixels, adopt a recursive correlation method, and analyze the optical flow; the specially designed frame buffer enables two sequential images to be captured at any time. With these measures we achieve a processing rate of about 8 Hz. Our method detects 80% of critical events with a false acceptance rate of at most 6%.
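A minimal sketch of a VAD-style score, the product of flow direction variance, flow magnitude variance, and flow area, computed per frame pair; Farneback dense flow and the circular direction variance are substitutions for the paper's recursive-correlation flow on 80 x 60 images.

```python
import cv2
import numpy as np

def violent_action_degree(prev_gray, cur_gray, mag_thresh=1.0):
    """Score one frame pair: high when many pixels move fast in scattered directions."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    moving = mag > mag_thresh
    if moving.sum() < 10:
        return 0.0
    area = moving.mean()                                  # fraction of moving pixels
    mag_var = mag[moving].var()                           # magnitude variance
    dir_var = 1.0 - np.hypot(np.cos(ang[moving]).mean(),  # circular direction variance
                             np.sin(ang[moving]).mean())
    return dir_var * mag_var * area

# Online use (illustrative): keep a running mean/std of the score and flag a
# critical event when the normalized score exceeds a threshold k, e.g.
#   if (vad - running_mean) / (running_std + 1e-6) > k: raise_alarm()
```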
Tracking and Calibration
Robust tracking based on orientation code matching under irregular conditions
Feature extraction and tracking are widely applied in today's industrial world and remain an important topic in machine vision. In this paper, we present a new feature extraction and tracking method that is robust against illumination changes such as shading and highlighting, as well as against scaling and rotation of objects. The method is composed mainly of two algorithms: an entropy filter and Orientation Code Matching (OCM). The entropy filter highlights image areas with a diverse distribution of orientation codes; the orientation code is determined by detecting the orientation of maximum intensity change among the eight neighboring pixels and is defined as a simple integer value. Using the entropy filter, we can extract good features to track from the images. Then OCM, a template matching method based on orientation codes, is applied to track the features in each frame, providing robustness against illumination changes. Moreover, updating these features (templates) every frame allows complicated motions of the tracked objects, such as scaling and rotation, to be followed. In this paper, we report the details of our algorithms and evaluations comparing them with other well-known feature extraction and tracking methods. As application examples, tracking of planar landmarks and faces is attempted, and the results are also reported.
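A minimal sketch of the two building blocks: an entropy filter over local orientation-code histograms to select feature regions, and an OCM-style dissimilarity (mean cyclic code difference) for matching. The code count, window size, and low-contrast handling are assumptions of this sketch.

```python
import numpy as np
from scipy import ndimage

N_CODES = 16

def orientation_codes(img, low_contrast=8.0):
    gy = ndimage.sobel(img.astype(float), axis=0)
    gx = ndimage.sobel(img.astype(float), axis=1)
    codes = np.floor((np.arctan2(gy, gx) + np.pi) / (2 * np.pi) * N_CODES) % N_CODES
    codes = codes.astype(int)
    codes[np.hypot(gx, gy) < low_contrast] = -1      # invalid (flat) pixels
    return codes

def entropy_map(codes, win=15):
    """Entropy of the orientation-code distribution in each win x win window.
    High entropy = diverse edge directions = a good feature region to track."""
    h, w = codes.shape
    out = np.zeros((h, w))
    half = win // 2
    for y in range(half, h - half):
        for x in range(half, w - half):
            block = codes[y - half:y + half + 1, x - half:x + half + 1]
            valid = block[block >= 0]
            if valid.size == 0:
                continue
            p = np.bincount(valid, minlength=N_CODES) / valid.size
            p = p[p > 0]
            out[y, x] = -(p * np.log2(p)).sum()
    return out

def ocm_dissimilarity(tmpl_codes, win_codes):
    """Mean cyclic difference between corresponding orientation codes."""
    valid = (tmpl_codes >= 0) & (win_codes >= 0)
    d = np.abs(tmpl_codes[valid] - win_codes[valid])
    d = np.minimum(d, N_CODES - d)                   # cyclic code distance
    return d.mean() if d.size else N_CODES / 4.0     # neutral score if nothing valid
```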
Robust object tracking under pose variation
This paper presents a robust object tracking method under pose variation. In practical environments, the illumination and pose of objects change dynamically, so robustness to these changes is required for practical applications. However, it is difficult for a single tracking model to be robust to all such changes; robustness to slight variations and ease of model updating are therefore required. For this purpose, Kernel Principal Component Analysis (KPCA) of local parts is used. KPCA of local parts was originally proposed for pose-independent object recognition, and it can be trained using local parts cropped from only one or two object images. This is a desirable property for tracking, because only one target image is given in practical applications. In addition, the model (subspace) can be updated easily by solving an eigenvalue problem. However, a simple update rule in which only the tracked region is used to update the model for the next frame may propagate errors to the following frames; the first given image, which is the only supervised sample, should therefore be used effectively. To reduce the influence of error propagation, the first given image and the region tracked in the t-th frame are both used to construct the subspace. The performance of the proposed method is evaluated using a test face sequence captured under pose, scaling, and illumination variations, and its effectiveness is shown by comparison with template matching with updating. In addition, an adaptive update rule using the similarity to the current subspace is proposed, and its effectiveness is shown experimentally.
Analysis of the 3D trajectory of absolute motion of an object using a motionless monocular camera
Huynh Quang Huy Viet, Makoto Sato, Hiromi T. Tanaka
Based on the triangulation method, the 3D motion of an object can be completely recovered by a stereo camera. However, whether the 3D motion of an object can be completely recovered by a motionless, fixed monocular camera has remained an open question. In this paper we propose a method that uses a motionless monocular camera whose focus is changed cyclically to recover the absolute 3D motion of an object. We name this method motion from focus.
Poster Session
Automatic recognition of coded-pattern sequence by using image cross-correlation
Sidong Zhong, Zhi Gao
In order to solve the problem of automatic target recognition in photogrammetry, a method for recognizing coded-pattern sequences using image cross-correlation is presented. Coded-pattern sequences are series of patterns that possess unique identification information, which is extracted to perform the recognition. The concrete operation is to cross-correlate the real pattern image with a synthetic template image of every possible pattern. The basis of this method is the signal-processing principle that cross-correlation can detect the resemblance of signals at different offsets. Experimental results show that the method is applicable in many situations and offers both high accuracy and high speed.
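A minimal sketch of the identification step: cross-correlate the observed pattern against a bank of synthetic templates and keep the best score. OpenCV's normalized correlation (TM_CCOEFF_NORMED) is an illustrative implementation choice, not necessarily the authors'.

```python
import cv2
import numpy as np

def identify_pattern(observed, templates):
    """observed: grayscale image of the detected pattern (uint8).
    templates: dict mapping pattern id -> grayscale template image (uint8).
    Returns (best_id, best_score)."""
    best_id, best_score = None, -1.0
    for pid, tmpl in templates.items():
        # resize the template to the observed pattern's size, then correlate
        t = cv2.resize(tmpl, (observed.shape[1], observed.shape[0]))
        score = cv2.matchTemplate(observed, t, cv2.TM_CCOEFF_NORMED)[0, 0]
        if score > best_score:
            best_id, best_score = pid, float(score)
    return best_id, best_score
```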
Robust template matching by using variable size block division
Kosuke Mitani, Hitoshi Saji
Template matching is used in many applications, such as object recognition and motion tracking. In this study, we propose a template matching method that is robust against rotation and occlusion. For this purpose, we first divide a template image into several blocks; in the division, each block size is varied on the basis of the brightness distribution in the block region. Next, we search for the matching position of each block using a color-histogram matching method whose result is rotation invariant. Then, from the matching coordinates of each block, we compute the Helmert transformation parameters and vote for the corresponding coordinates in the parameter space. Finally, we obtain the matching position of the template by finding the optimum Helmert transformation parameters at the coordinates where the vote count is maximal. We evaluate the efficacy of our method through several experiments. The method enables the robust extraction of an object that is rotated or occluded and is usable in many applications.
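For reference, the Helmert (2D similarity) transformation used in the voting step can be estimated from block correspondences by linear least squares, as in this minimal sketch (the voting and histogram matching stages are not shown):

```python
import numpy as np

def fit_helmert(src, dst):
    """src, dst: (N, 2) arrays of matched block coordinates.
    Model: x' = a*x - b*y + tx,  y' = b*x + a*y + ty
    (a = s*cos(theta), b = s*sin(theta)). Returns (a, b, tx, ty)."""
    x, y = src[:, 0], src[:, 1]
    A = np.zeros((2 * len(src), 4))
    A[0::2] = np.column_stack([x, -y, np.ones_like(x), np.zeros_like(x)])  # x-equations
    A[1::2] = np.column_stack([y,  x, np.zeros_like(x), np.ones_like(x)])  # y-equations
    rhs = dst.reshape(-1)                     # interleaved x'0, y'0, x'1, y'1, ...
    params, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return params                             # a, b, tx, ty

def helmert_apply(params, pts):
    a, b, tx, ty = params
    x, y = pts[:, 0], pts[:, 1]
    return np.column_stack([a * x - b * y + tx, b * x + a * y + ty])
```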
Application of successive test feature classifier to dynamic recognition problems
A novel successive learning algorithm is proposed for efficiently handling sequentially provided training data, based on the Test Feature Classifier (TFC), which is non-parametric and effective even for small data sets. We previously proposed the TFC, which utilizes prime test features (PTFs), i.e., combinatorial feature subsets, to obtain excellent performance. TFC has the following characteristics: non-parametric learning and no misclassification of the training data, and its effectiveness has been confirmed in several real-world applications. However, TFC must be reconstructed whenever any subset of the data changes. In successive learning, after a set of unknown objects is recognized, the objects are fed into the classifier to obtain a modified classifier. We propose an efficient algorithm for the reconstruction of PTFs, formalized for the cases of addition and deletion of training data. In a verification experiment, the successive learning algorithm saved about 70% of the total computational cost compared with batch learning. We applied the proposed successive TFC to dynamic recognition problems in which the characteristics of the training data change over time and examined its behavior in fundamental experiments. The proposed successive TFC was also compared with the Support Vector Machine (SVM), which is well established both algorithmically and in practical applications, and showed higher performance.
Query-by-sketch image retrieval using relevance feedback
Gosuke Ohashi, Yoshifumi Shimodaira
The present paper describes a query-by-sketch image retrieval system that aims to reduce the semantic gap by adopting relevance feedback. To reduce the semantic gap between low-level visual features and high-level semantics in this content-based image retrieval system, users' sketches play an important role in the relevance feedback: when users mark output images similar to their intent with "relevant" labels, these images are treated as relevant to the sketch image in positive feedback. The method was applied to 5,500 images in the Corel Photo Gallery, and experimental results show that the proposed method is effective in retrieving images.
Measuring time sequence of 3D facial shapes using binocular stereo with color slits
Takahiro Arai, Naoki Ikegaya, Hitoshi Saji
Three-dimensional (3D) shape measurement of the human face is useful in various applications, and much research has been reported. In particular, many applications require methods for measuring time series of 3D shapes with simple devices. At present, however, most measurement methods rely on large-scale devices and impose strong constraints on the subject, so their applications are limited. In our research, we use a binocular stereo method with color slits: we measure the 3D shape from a pair of stereo images and repeat the measurement sequentially at high speed, enabling real-time measurement of a human face in motion. Until now, when featureless objects such as a human face were measured, conventional methods found corresponding points by projecting slit light onto the subject and scanning it. Consequently, these methods require a long time to measure the subject, and the shape cannot be measured accurately when the subject moves. In our method, we extract the color slits projected onto the face and establish their correspondence between the two images captured at the same instant; hence, we achieve high-speed stereo matching even when the subject moves.
Development of caricaturing robot and its prospect through the prototype robot exhibition in EXPO 2005
Takayuki Fujiwara, Takashi Watanabe, Takuma Funahashi, et al.
We developed the facial caricaturing robot COOPER, which was exhibited at the Prototype Robot Exhibition of EXPO 2005 in Aichi, Japan, for 11 days from June 9 to June 19. COOPER watches the face of a person seated in a chair, obtains facial images, and analyzes them to extract 251 feature points and generate deformed facial line drawings. The caricature was drawn on a specialized shrimp rice cracker in 4 minutes. To achieve this, we customized the original PICASSO system to cope with the illumination conditions in the EXPO pavilion. This paper outlines COOPER and details its image processing, and discusses prospects for future work based on the more than 395 facial caricatures obtained at EXPO 2005.
Discrimination of gender using facial image with expression change
Jun Kuniyada, Takahiro Fukuda, Kenji Terada
By carrying out marketing research, the managers of large department stores and small convenience stores obtain information such as the male-to-female ratio and the age groups of visitors and use it to improve their management plans. However, this work is carried out manually and is a heavy burden for small stores. In this paper, the authors propose a method of gender discrimination that extracts differences in facial expression change from color facial images. Many methods for the automatic recognition of individuals using moving or still facial images exist in the field of image processing, but it is very difficult to discriminate gender under the influence of hairstyle, clothes, and so on. We therefore propose a method that is not affected by individual characteristics, such as the size and position of facial parts, by paying attention to changes of expression. The method requires two facial images, one with an expression and one expressionless. First, the facial surface region and the regions of facial parts such as the eyes, nose, and mouth are extracted from the facial image using the hue and saturation of the HSV color system and emphasized edge information. Next, features are extracted by calculating the rate of change of each facial part caused by the expression change. In the last step, the feature values of the input data are compared with those in the database, and the gender is discriminated. Experiments were carried out for laughing and smiling expressions, and good gender discrimination results were obtained.
Surveillance of the plant growth using the camera image
In this paper, we propose a method for monitoring plant growth using camera images. The method observes the growing conditions of plants in a greenhouse. A plate known as HORIBA is prepared to capture harmful insects; an image of the plate is obtained by the camera and used for processing. The resolution of the image is 1280 x 960. In the first step, the regions of harmful insects (flies) are extracted from the HORIBA image using color information. In the next step, template matching is performed to examine the shape correlation at four different angles; with four different templates, 16 results are obtained. The logical sum of the results is calculated for the final estimation. Experimental results are also shown in this paper.
Content-based retrieval using MPEG-7 visual descriptor and hippocampal neural network
Young Ho Kim, Lyang-Jae Joung, Dae-Seong Kang
With the development of digital technology, many kinds of multimedia data are used in various ways, and requirements for their effective use are increasing. To deliver the information users want quickly and precisely, an effective retrieval method is required. The MPEG-1, MPEG-2, and MPEG-4 technologies are aimed at compression, storage, and transmission and cannot address this need, so MPEG-7 has been introduced as a new technology for the effective management and retrieval of multimedia data. In this paper, we extract content-based features using the color descriptors among the MPEG-7 standard visual descriptors and reduce the feature data by applying PCA (principal component analysis). We model the cerebral cortex and hippocampal neural networks after the organization of the human brain: the features of the input image data are labeled into reaction patterns following the hippocampal neuron structure and the tuning in the dentate gyrus region, and noise is removed through an auto-associative memory step in the CA3 region. In the CA1 region, which receives information from CA3, long-term or short-term memories are learned by the neurons. The hippocampal neural network allows neurons to separate and combine dynamically, to be expanded by attaching additional information through synapses, and to add new features according to the situation and the user's demands. When a user issues a query, the system first compares the feature values stored in long-term memory, learns the feature vectors quickly, and constructs optimized features, so that indexing and retrieval are fast. Using the MPEG-7 standard visual descriptors as content-based feature values also improves retrieval efficiency.
Extraction from biological volume data of a region of interest with non-uniform intensity
Hiroyuki Shimai, Hideo Yokota, Sakiko Nakamura, et al.
A method is proposed for extracting the region of interest from biological volume data when non-uniform intensity is involved. Binary processing using thresholding and active contour models is often used to extract the region of interest from biological images, but this approach suffers from two defects: no allowance is made for the three-dimensional structure of the region of interest, and preprocessing of the data is required. The region growing method, on the other hand, can extract a complex 3-D structure and requires no preprocessing, although it is difficult to apply to biological volume data that contain a lot of noise and have non-uniform intensity in the region of interest. This paper reports improvements to the conventional region growing method that make it more robust against non-uniform intensity and noise. This is achieved by paying attention only to the local area, using information inside and outside the region of interest, and using a median value. The method can be applied easily, as the number of parameters is smaller than in conventional techniques and no prior knowledge of the original data is needed.
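A minimal sketch of region growing with a locally adaptive criterion in the spirit described above: a voxel joins the region if its intensity is close to the median of already-accepted voxels in its neighbourhood, which tolerates slow intensity drift. The tolerance, window size, and 6-connectivity are assumptions of this sketch.

```python
import numpy as np
from collections import deque

def grow_region(volume, seed, tol=0.15, window=2):
    """volume: 3-D float array scaled to [0, 1]; seed: (z, y, x) start voxel."""
    mask = np.zeros(volume.shape, dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        # local reference: median of accepted voxels in a small neighbourhood
        zs = slice(max(z - window, 0), z + window + 1)
        ys = slice(max(y - window, 0), y + window + 1)
        xs = slice(max(x - window, 0), x + window + 1)
        ref = np.median(volume[zs, ys, xs][mask[zs, ys, xs]])
        for dz, dy, dx in offsets:
            nz, ny, nx = z + dz, y + dy, x + dx
            if (0 <= nz < volume.shape[0] and 0 <= ny < volume.shape[1]
                    and 0 <= nx < volume.shape[2] and not mask[nz, ny, nx]
                    and abs(volume[nz, ny, nx] - ref) < tol):
                mask[nz, ny, nx] = True
                queue.append((nz, ny, nx))
    return mask
```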
A remote camera operation system using a marker attached cap
In this paper, we propose a convenient system that controls a remote camera according to the gaze direction of the operator, which is approximated by calculating the face direction through image processing. The operator wears a cap with attached markers, and the system takes an image of the operator from above with a single video camera. Three markers are placed on the cap; three is the minimum number needed to calculate the tilt angle of the head. Using more markers makes the system more robust to occlusion and allows a wider range of head motion. The markers must not all lie on a single three-dimensional straight line. To compensate for changes in marker color due to illumination conditions, the threshold for marker extraction is decided adaptively using a k-means clustering method. The system was implemented in MATLAB on a personal computer, and real-time operation was achieved. Experimental results confirmed the robustness of the system, and the tilt and pan angles of the head could be calculated with sufficient accuracy for practical use.
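A minimal sketch of the adaptive thresholding idea: cluster the pixel colors with k-means and keep the cluster closest to a nominal marker color. scikit-learn's KMeans, k = 4, and the nominal-color rule are assumptions of this sketch (the original system was implemented in MATLAB).

```python
import numpy as np
from sklearn.cluster import KMeans

def marker_mask(rgb_image, marker_color, k=4, sample=20000, seed=0):
    """rgb_image: (H, W, 3) uint8 overhead image; marker_color: nominal (r, g, b)."""
    h, w, _ = rgb_image.shape
    pixels = rgb_image.reshape(-1, 3).astype(float)
    # cluster a random subsample of pixel colors for speed
    rng = np.random.default_rng(seed)
    subset = pixels[rng.choice(len(pixels), size=min(sample, len(pixels)), replace=False)]
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(subset)
    # the marker cluster is the one whose center is closest to the nominal color
    marker_cluster = np.argmin(np.linalg.norm(
        km.cluster_centers_ - np.asarray(marker_color, dtype=float), axis=1))
    labels = km.predict(pixels)
    return (labels == marker_cluster).reshape(h, w)
```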
Development of the crone seedlings handling system using 3D-sensor and force control gripper
Hirotaka Hojo, Hiroshi Takarada, Takahisa Hiroyasu, et al.
Crone seedlings have unstable forms and are hard to handle. In order to transplant crone seedlings automatically, 3D shape recognition and force control of the grippers are indispensable. We have introduced a new handling technology that combines 3D measurement by the relative stereo method with a gripping method based on gripping-stroke control of a high-elasticity forceps structure. In this gripping method, the gripping force is controlled according to the shoot diameter, which is measured by the relative stereo method. An experimental crone-seedling transplanting system using the new handling technique is demonstrated.
Improvement of relief algorithm to prevent inpatient's downfall accident with night-vision CCD camera
Noriyuki Matsuda, Takeshi Yamamoto, Masafumi Miwa, et al.
"ROSAI" hospital, Wakayama City in Japan, reported that inpatient's bed-downfall is one of the most serious accidents in hospital at night. Many inpatients have been having serious damages from downfall accidents from a bed. To prevent accidents, the hospital tested several sensors in a sickroom to send warning-signal of inpatient's downfall accidents to a nurse. However, it sent too much inadequate wrong warning about inpatients' sleeping situation. To send a nurse useful information, precise automatic detection for an inpatient's sleeping situation is necessary. In this paper, we focus on a clustering-algorithm which evaluates inpatient's situation from multiple angles by several kinds of sensor including night-vision CCD camera. This paper indicates new relief algorithm to improve the weakness about exceptional cases.
3-D sensor using relative stereo method for bio-seedlings transplanting system
In a plant factory for crone seedlings, most of the production processes are highly automated, but the transplanting of small seedlings is hard to automate because the shapes of the seedlings are not stable and handling them requires observing their shapes. Here, a 3-D vision system for a robot used in the transplanting process in a plant factory is introduced. The system employs the relative stereo method and a slit-light measuring method; it can detect the shapes of the small seedlings and decide the cutting point. In this paper, the structure of the vision system and its image processing method are explained.
An effective data acquisition system using image processing
The authors investigate a data acquisition system utilising the widely available digital multimeter and webcam. The system is suited to applications that require sampling rates of less than about 1 Hz, such as ambient temperature recording or monitoring the charging state of rechargeable batteries. The data displayed on the external digital readout is acquired into the computer through the process of template matching. MATLAB is used as the programming language for processing the captured 2-D images in this demonstration. An RC charging experiment with a time constant of approximately 33 s is set up to verify the accuracy of the image-to-data conversion. It is found that the acquired data match the steady-state voltage value displayed by the digital meter after an error detection technique has been devised and implemented in the data acquisition script file. It is possible to acquire a number of different readings simultaneously from various sources with this imaging method by placing several digital readouts within the camera's field of view.
Quantitative evaluation method of the bubble structure of sponge cake by using morphology image processing
Hironobu Tatebe, Kunihito Kato, Kazuhiko Yamamoto, et al.
Now a day, many evaluation methods for the food industry by using image processing are proposed. These methods are becoming new evaluation method besides the sensory test and the solid-state measurement that are using for the quality evaluation. An advantage of the image processing is to be able to evaluate objectively. The goal of our research is structure evaluation of sponge cake by using image processing. In this paper, we propose a feature extraction method of the bobble structure in the sponge cake. Analysis of the bubble structure is one of the important properties to understand characteristics of the cake from the image. In order to take the cake image, first we cut cakes and measured that's surface by using the CIS scanner. Because the depth of field of this type scanner is very shallow, the bubble region of the surface has low gray scale values, and it has a feature that is blur. We extracted bubble regions from the surface images based on these features. First, input image is binarized, and the feature of bubble is extracted by the morphology analysis. In order to evaluate the result of feature extraction, we compared correlation with "Size of the bubble" of the sensory test result. From a result, the bubble extraction by using morphology analysis gives good correlation. It is shown that our method is as well as the subjectivity evaluation.