2017 International Conference on Robotics and Machine Vision | (2018) | Publications

Volume Details

Date Published: 4 January 2018

Contents: 5 Sessions, 25 Papers, 0 Presentations

Conference: Second International Conference on Robotics and Machine Vision 2017

Volume Number: 10613

Front Matter: Volume 10613
Object Detection and Pattern Recognition
Image Processing and Applications
Modern Information Theory and Signal Processing
Robot Design and Control Engineering

Front Matter: Volume 10613

This PDF file contains the front matter associated with SPIE Proceedings Volume 10613, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.

Object Detection and Pattern Recognition

Relative velocity discretization for moving targets detection in FMCW SAR

Pu Cheng, Jianwei Wan, Qin Xin, et al.

Frequency-Modulated Continuous-Wave Synthetic Aperture Radar (FMCW SAR) is a promising compact remote imaging sensor. In this paper, a ground moving target refocusing method is presented to provide FMCW SAR systems with a simultaneous moving target indication capability. The method is modified from the range migration algorithm. To discriminate the target optimally, the concept of relative motion is utilized: the moving target is refocused like a fixed target, and its migrations in both the range and azimuth directions are completely compensated. Blind hypotheses of the relative velocities are used in the detection phase of moving targets. The step size between the hypotheses involves a trade-off between computational load and detectability. In this paper, we determine the discretization based on the principle of stationary phase. The discretization reduces the computational burden while preserving detectability.

Binary image filtering for object detection based on Haar feature density map

Chengqi Li, Zhigang Ren, Bo Yang

A central problem is detecting objects of interest in an image sequence captured from the same scene. Image differencing is a commonly used method for detecting such objects; however, massive noise exists in the binarized difference image, so removing this noise is an active research issue. To remove the noise in the binary difference image, we propose a novel filtering algorithm based on a Haar feature density map. First, the Haar feature density distribution map of the binary image is calculated. Second, the density distribution map is binarized to remove noise. Finally, the objects of interest can be easily detected. Experiments show that the Haar feature density map achieves a better filtering effect than conventional filtering algorithms for binary images (such as median filtering and morphological operations).
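
The density-map filtering steps in the abstract above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the authors' algorithm: the abstract does not say which Haar features are used, so the sketch uses a single all-positive box feature (the local fraction of foreground pixels) computed with an integral image. The names `density_filter`, `radius`, and `min_density` are hypothetical.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, y0, x0, y1, x1):
    """Sum of pixels in rows y0..y1-1 and columns x0..x1-1."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


def density_filter(img, radius=1, min_density=0.5):
    """Binarize the local foreground density of a 0/1 image (assumed feature)."""
    h, w = len(img), len(img[0])
    ii = integral_image(img)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clip the window at the image border.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            density = box_sum(ii, y0, x0, y1, x1) / ((y1 - y0) * (x1 - x0))
            out[y][x] = 1 if density >= min_density else 0
    return out
```

With this choice of feature, an isolated noise pixel yields a low local density and is suppressed, while the interior of a compact object keeps a high density and survives the threshold.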

Posture recognition associated with lifting of heavy objects using Kinect and Adaboost

Sayli Raut, Navaneethakrishna M., Ramakrishnan S.

Lifting of heavy objects is a common task in industry. Recent statistics from the Bureau of Labour indicate that back injuries account for one of every five injuries in the workplace. Eighty per cent of these injuries occur to the lower back and are associated with manual materials handling tasks. According to the Industrial ergonomic safety manual, squatting is the correct posture for lifting a heavy object. In this work, an attempt has been made to monitor the posture of workers during squat and stoop using 3D motion capture and machine learning techniques. For this, Microsoft Kinect V2 is used for capturing the depth data. Further, Dynamic Time Warping and Euclidean distance algorithms are used for feature extraction. The AdaBoost algorithm is used for classification of stoop and squat. The results show that the 3D image data is large and complex to analyze. The application of nonlinear and linear metrics captures the variation in the lifting pattern. Additionally, the features extracted from these metrics resulted in classification accuracies of 85% and 81%, respectively. This framework may be used to alert workers in industrial ergonomic environments.
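
The two distance measures named in the abstract above can be sketched as follows. This is a textbook illustration on 1D sequences, not the authors' feature pipeline; the function names and the use of an absolute-difference local cost are assumptions.

```python
import math


def euclidean_distance(a, b):
    """Linear metric: requires sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Nonlinear metric: classic dynamic time warping with |x - y| local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Unlike the Euclidean distance, dynamic time warping tolerates sequences of different lengths and local changes in speed, which is one reason it is often used for motion trajectories.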

A comparison between skeleton and bounding box models for falling direction recognition

Lalita Narupiyakul, Nitikorn Srisrisawang

A fall can lead to a serious medical condition in people of every age. However, in the case of the elderly, the risk of serious injury is much higher. Because one way of preventing serious injury is to treat the fallen person as soon as possible, several works have attempted to implement different algorithms to recognize falls. Our work compares the performance of two models based on feature extraction: (i) Body joint data (Skeleton Data), which are the joints’ positions in 3 axes, and (ii) Bounding box (Box-size Data) covering all body joints. The machine learning algorithms chosen are Decision Tree (DT), Naïve Bayes (NB), K-nearest neighbors (KNN), Linear discriminant analysis (LDA), Voting Classification (VC), and Gradient boosting (GB). The results illustrate that the models trained with Skeleton data perform far better than those trained with Box-size data (with an average accuracy of 94-81% and 80-75%, respectively). KNN shows the best performance in both the Body joint model and the Bounding box model. In conclusion, KNN with the Body joint model performs best overall.

Illumination robust face recognition using spatial adaptive shadow compensation based on face intensity prior

Cheng-Ta Hsieh, Kae-Horng Huang, Chang-Hsing Lee, et al.

Robust face recognition under illumination variations is an important and challenging task in a face recognition system, particularly for face recognition in the wild. In this paper, a face image preprocessing approach, called spatial adaptive shadow compensation (SASC), is proposed to eliminate shadows in the face image due to different lighting directions. First, spatial adaptive histogram equalization (SAHE), which uses a face intensity prior model, is proposed to enhance the contrast of each local face region without generating visible noise in smooth face areas. Adaptive shadow compensation (ASC), which performs shadow compensation in each local image block, is then used to produce a well-compensated face image appropriate for face feature extraction and recognition. Finally, null-space linear discriminant analysis (NLDA) is employed to extract discriminant features from SASC-compensated images. Experiments performed on the Yale B, Yale B extended, and CMU PIE face databases have shown that the proposed SASC always yields the best face recognition accuracy. That is, SASC is more robust for face recognition under illumination variations than other shadow compensation approaches.

Parking-lines detection based on an improved Hough transform

Shuyu Jiang, Yinan Lu, Yuan Chen, et al.

Parking-line recognition is a prerequisite for automatic vehicle parking systems. This paper adopts the Otsu threshold segmentation method, the Sobel operator, and an improved Hough transform to detect parking lines. The experimental results show that the algorithm can effectively and accurately identify the parking lines.
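
For orientation, the voting step of a standard Hough transform for lines can be sketched as below. The abstract does not describe what its improvement consists of, so this is only the classical version, applied to edge points that are assumed to come from an earlier thresholding and edge-detection stage; `hough_peak` and its parameters are hypothetical names.

```python
import math
from collections import Counter


def hough_peak(points, theta_step_deg=1):
    """Return (theta_deg, rho, votes) of the strongest line through the points.

    Lines are parameterized as rho = x*cos(theta) + y*sin(theta), with theta
    sampled in whole degrees over [0, 180) and rho rounded to the nearest pixel.
    """
    votes = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(theta_deg, rho)] += 1
    (theta_deg, rho), count = votes.most_common(1)[0]
    return theta_deg, rho, count
```

For example, five edge points on the vertical line x = 3 all vote for the accumulator cell with theta = 0 and rho = 3.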

Image Processing and Applications

Feature extraction of the wafer probe marks in IC packaging

Cheng-Yu Tsai, Chia-Te Lin, Chen-Ting Kao, et al.

This paper presents an image processing approach to extract six features of the probe mark on semiconductor wafer pads. The electrical characteristics of the chip pad must be tested using a probing needle before wire-bonding to the wafer. However, this test leaves probe marks on the pad. A large probe mark area results in poor adhesion forces at the bond ball of the pad, thus leading to undesirable products. In this paper, we present a method to extract six features of the wafer probe marks in IC packaging for further digital image processing.

Multi-focus image fusion algorithm based on non-subsampled shearlet transform and focus measure

Hongmei Wang, Mir Soban Ahmed

A novel multi-focus image fusion algorithm is proposed in the Shearlet domain. The core idea of this paper is to utilize the focus measure to detect the focused region from the multi-focus images. The proposed algorithm can be divided into three procedures: image decomposition, sub-band coefficient selection, and image reconstruction. First, the multi-focus images are decomposed by the non-subsampled Shearlet transform (NSST) to obtain the low frequency sub-bands and high frequency sub-bands. For the low frequency sub-bands, saliency detection and an improved sum-modified-Laplacian are combined to detect the focused regions. A modified edge measure algorithm is utilized to guide the coefficient combination for high frequency sub-bands at different levels. Moreover, in order to avoid erroneous results introduced by the above procedures, a mathematical morphology technique is used to revise the decision maps of the low frequency sub-bands and high frequency sub-bands. The final fused image is obtained by taking the inverse NSST. The performance of the proposed method is tested extensively on a series of multi-focus images. Experimental results indicate that the proposed method outperforms some state-of-the-art fusion methods in terms of both subjective observation and objective evaluation.
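
As background for the focus measure mentioned in the abstract above, the classical sum-modified-Laplacian can be sketched as follows. The paper's improved variant, its saliency term, and the NSST decomposition are not reproduced here; the function names are hypothetical, and the sketch works on plain 2D lists of intensities.

```python
def modified_laplacian(img, y, x):
    """Modified Laplacian at an interior pixel: absolute second differences
    in the horizontal and vertical directions, added together."""
    c = img[y][x]
    horizontal = abs(2 * c - img[y][x - 1] - img[y][x + 1])
    vertical = abs(2 * c - img[y - 1][x] - img[y + 1][x])
    return horizontal + vertical


def focus_measure(img):
    """Sum of the modified Laplacian over all interior pixels."""
    h, w = len(img), len(img[0])
    return sum(
        modified_laplacian(img, y, x)
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    )


def sharper_source(img_a, img_b):
    """Pick the source with the larger focus measure (ties go to 'a')."""
    return "a" if focus_measure(img_a) >= focus_measure(img_b) else "b"
```

A region with strong local intensity variation scores high, while a blurred or flat region scores close to zero, so comparing the measure across sources gives a simple focus decision.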

Spatial and spectral analysis of corneal epithelium injury using hyperspectral images

Siti Salwa Md Noor, Kaleena Michael, Stephen Marshall, et al.

Eye assessment is essential in preventing blindness. Currently, the existing methods to assess corneal epithelium injury are complex and require expert knowledge. Hence, we have introduced a non-invasive technique using hyperspectral imaging (HSI) and an image analysis algorithm of corneal epithelium injury. Three groups of images were compared and analyzed, namely healthy eyes, injured eyes, and injured eyes with stain. Dimensionality reduction using principal component analysis (PCA) was applied to reduce massive data and redundancies. The first 10 principal components (PCs) were selected for further processing. The mean vector of 10 PCs with 45 pairs of all combinations was computed and sent to two classifiers. A quadratic Bayes normal classifier (QDC) and a support vector classifier (SVC) were used in this study to discriminate the eleven eyes into three groups. As a result, the combined classifier of QDC and SVC showed optimal performance with 2D PCA features (2DPCA-QDSVC) and was utilized to classify normal and abnormal tissues, using color image segmentation. The result was compared with human segmentation. The outcome showed that the proposed algorithm produced extremely promising results to assist the clinician in quantifying a cornea injury.
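
The figure of 45 pairs in the abstract above is consistent with taking every unordered pair of the 10 selected principal components, since C(10, 2) = 45. A short check, assuming that reading (the abstract does not spell out how the pairs are formed):

```python
from itertools import combinations

# Hypothetical labels 1..10 for the first ten principal components.
pc_indices = range(1, 11)

# Every unordered pair of distinct components, e.g. (1, 2), (1, 3), ..., (9, 10).
pc_pairs = list(combinations(pc_indices, 2))

print(len(pc_pairs))
```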

Virtual expansion of the technical vision system for smart vehicles based on multi-agent cooperation model

Nina Krapukhina, Roman Senchenko, Nikolay Kamenov

Road safety and driving in dense traffic flows pose challenges in receiving information about surrounding moving objects, some of which can be in the vehicle’s blind spot. This work suggests an approach to virtual monitoring of the objects in the current road scene via a system of multiple cooperating smart vehicles exchanging information. It also describes the intelligent agent model and provides methods and algorithms for identifying and evaluating various characteristics of moving objects in a video stream. The authors also suggest ways of integrating the information from the technical vision system into the model, with further expansion of virtual monitoring for the system’s objects. Implementation of this approach can help to expand the virtual field of view of a technical vision system.

Acceleration of planes segmentation using normals from previous frame

Pavel Gritsenko, Igor Gritsenko, Askar Seidakhmet, et al.

One of the major problems in the integration of robots is enabling them to function in a human environment. In terms of computer vision, the major feature of human-made rooms is the presence of planes [1, 2, 20, 21, 23]. In this article, we present an algorithm designed to speed up plane segmentation. The algorithm uses information about the location of a plane and its normal vector to speed up the segmentation process in the next frame. In conjunction with it, we address aspects of ICP SLAM such as performance and map representation.

Modern Information Theory and Signal Processing

Advertisement recognition using mode voting acoustic fingerprint

Reza Fahmi, Hosein Abedi Firouzjaee, Ali Janalizadeh Choobbasti, et al.

The emergence of media outlets and public relations tools such as TV, radio, and the Internet since the 20th century has provided companies with a good platform for advertising their goods and services. Advertisement recognition is an important task that can help companies measure the efficiency of their advertising campaigns in the market and compare their performance with competitors in order to get better business insights. Advertisement recognition is usually performed manually with the help of human labor or through automated methods that are mainly based on heuristic features. These methods usually lack scalability and the ability to generalize to different situations. In this paper, we present an automated method for advertisement recognition based on audio processing that could make this process fairly simple and take the human factor out of the equation. This method has ultimately been used at Miras information technology to monitor 56 TV channels and detect all ad video clips broadcast over some networks.

Adaptive EMG noise reduction in ECG signals using noise level approximation

Mohamed Marouf, Lazar Saranovac

In this paper, the use of noise level approximation for adaptive Electromyogram (EMG) noise reduction in Electrocardiogram (ECG) signals is introduced. To achieve adequate adaptiveness, a translation-invariant noise level approximation is employed. The approximation is done in the form of a guiding signal extracted as an estimation of the signal quality vs. EMG noise. The noise reduction framework is based on a bank of low pass filters. The adaptive noise reduction is achieved by selecting the appropriate filter with respect to the guiding signal, aiming to obtain the best trade-off between the signal distortion caused by filtering and the signal readability. For evaluation purposes, both real EMG and artificial noises are used. The tested ECG signals are from the MIT-BIH Arrhythmia Database Directory, while both real and artificial records of EMG noise are added and used in the evaluation process. First, a comparison with state-of-the-art methods is conducted to verify the performance of the proposed approach in terms of noise cancellation while preserving the QRS complex waves. Additionally, the signal-to-noise ratio improvement after the adaptive noise reduction is computed and presented for the proposed method. Finally, the impact of the adaptive noise reduction method on QRS complex detection is studied. The tested signals are delineated using a state-of-the-art method, and the QRS detection improvement for different SNRs is presented.
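
The idea of choosing a filter from a low pass bank according to a guiding signal can be illustrated with the rough sketch below. It is not the authors' method: the moving-average filters, the window lengths, and the assumption that the guiding signal is a per-sample noise level in [0, 1] are all hypothetical choices made for illustration.

```python
def moving_average(signal, window):
    """Centered moving average; the window is truncated at the signal edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def adaptive_denoise(signal, noise_level, windows=(1, 5, 9)):
    """Per sample, pick the output of one filter from the bank.

    noise_level[i] is an assumed guiding value in [0, 1]; higher values
    select a longer (stronger) low pass filter for that sample.
    """
    bank = [moving_average(signal, w) for w in windows]
    out = []
    for i, level in enumerate(noise_level):
        k = min(int(level * len(windows)), len(windows) - 1)
        out.append(bank[k][i])
    return out
```

Where the guiding value is low, the sample passes through the window of length 1 unchanged, so clean segments are not distorted; where it is high, stronger smoothing is applied.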

A novel ECG data compression method based on adaptive Fourier decomposition

Chunyu Tan, Liming Zhang

This paper presents a novel electrocardiogram (ECG) compression method based on adaptive Fourier decomposition (AFD). AFD is a newly developed signal decomposition approach, which can decompose a signal with fast convergence, and hence reconstruct ECG signals with high fidelity. Unlike most of the high performance algorithms, our method does not make use of any preprocessing operation before compression. Huffman coding is employed for further compression. Validated with 48 ECG recordings of MIT-BIH arrhythmia database, the proposed method achieves the compression ratio (CR) of 35.53 and the percentage root mean square difference (PRD) of 1.47% on average with N = 8 decomposition times and a robust PRD-CR relationship. The results demonstrate that the proposed method has a good performance compared with the state-of-the-art ECG compressors.
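
The two figures of merit quoted in the abstract above are commonly defined as in the sketch below. Several PRD variants exist in the ECG compression literature (for example, with the signal mean removed), and the abstract does not say which one is used, so the non-normalized form here is an assumption; the function names are hypothetical.

```python
import math


def compression_ratio(original_bits, compressed_bits):
    """CR: size of the original representation over the compressed one."""
    return original_bits / compressed_bits


def prd_percent(original, reconstructed):
    """Percentage root mean square difference between two signals."""
    error_energy = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    signal_energy = sum(o ** 2 for o in original)
    return 100.0 * math.sqrt(error_energy / signal_energy)
```

A higher CR means stronger compression and a lower PRD means a more faithful reconstruction, which is why the two are usually reported together.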

Structural damage detection based on wavelet transform in strain energy signal processing

Pengbo Wang

Structural damage detection is of great significance for engineering applications. Most damage detection methods are vibration-based methods. In this paper, we propose a method for damage detection of structures under static loads. The wavelet transform technique is introduced in the spatially distributed strain energy signal processing. We can use the singularity of wavelet coefficients to determine the damaged location. Numerical examples under four cases of damages are provided to illustrate the applicability of the proposed method. We use the element stiffness reduction to simulate the damage. The results of the damaged cases have indicated that the different damage locations in a structure can be precisely determined using the proposed method. The damage detection method proposed in this paper can be introduced for engineering applications.
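
Locating damage from singular wavelet coefficients can be illustrated with the toy sketch below. The abstract does not name the wavelet or the scale, so an undecimated single-level Haar detail (a scaled first difference) is assumed, and the strain energy values and threshold are made up for the example.

```python
import math


def haar_details(signal):
    """Undecimated single-level Haar detail coefficients.

    Coefficient i compares neighbouring samples i and i + 1.
    """
    return [
        (signal[i + 1] - signal[i]) / math.sqrt(2)
        for i in range(len(signal) - 1)
    ]


def singular_indices(signal, threshold):
    """Indices whose detail coefficient magnitude exceeds the threshold."""
    return [i for i, d in enumerate(haar_details(signal)) if abs(d) > threshold]


# Hypothetical strain energy per element, with a local jump at element 6
# standing in for a stiffness reduction.
strain_energy = [1.0] * 10
strain_energy[6] = 1.5
print(singular_indices(strain_energy, 0.1))
```

In this toy case the large coefficients sit on both sides of the perturbed element, so they bracket the damaged location.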

Modeling and prediction of human word search behavior in interactive machine translation

Duo Ji, Bai Yu, Bin Ma, et al.

As a kind of computer-aided translation method, Interactive Machine Translation technology reduces the repetitive and mechanical operations of manual translation through a variety of methods so as to improve translation efficiency, and it plays an important role in practical translation work. In this paper, we take the behavior of users frequently searching for words during translation as the research object, and transform this behavior into a translation selection problem under the current translation. The paper presents a prediction model of word search behavior that makes comprehensive use of an alignment model, a translation model, and a language model. It achieves highly accurate prediction of word search behavior and reduces the switching between mouse and keyboard operations during the users' translation process.

An 1.4 ppm/°C bandgap voltage reference with automatic curvature-compensation technique

Zekun Zhou, Hongming Yu, Yue Shi, et al.

A high-precision bandgap voltage reference (BGR) with a novel curvature-compensation scheme is proposed in this paper. The temperature coefficient (TC) can be automatically optimized with a built-in adaptive curvature-compensation technique, which is realized through digital control. First, an exponential curvature compensation method is adopted to reduce the TC to a certain degree, especially in the low temperature range. Then, the temperature drift of the BGR in the higher temperature range can be further minimized by dynamically tracking the zero-temperature-coefficient point as the temperature changes. With the help of the proposed adaptive signal processing, the output voltage of the BGR can approximately maintain zero TC over a wider temperature range. Experimental results of the proposed BGR, which is implemented in a 0.35-μm BCD process, show that a TC of 1.4 ppm/°C is realized under a power supply voltage of 3.6 V and that the power supply rejection of the proposed circuit is -67 dB.

Differential effects of gender on entropy perception

Kleddao Satcharoen

The purpose of this research is to examine differences in perception of entropy (color intensity) between male and female computer users. The objectives include identifying gender-based differences in entropy intention and exploring the potential effects of these differences (if any) on user interface design. The research is an effort to contribute to an emerging field of interest in gender as it relates to science, engineering and technology (SET), particularly user interface design. Currently, there is limited evidence on the role of gender in user interface design and in use of technology generally, with most efforts at gender-differentiated or customized design based on stereotypes and assumptions about female use of technology or the assumption of a default position based on male preferences. Image entropy was selected as a potential characteristic where gender could be a factor in perception because of known differences in color perception acuity between male and female individuals, even where there is no known color perception abnormality (which is more common with males). Although the literature review suggested that training could offset differences in color perception and identification, tests in untrained subject groups routinely show that females are more able to identify, match, and differentiate colors, and that there is a stronger emotional and psychosocial association of color for females. Since image entropy is associated with information content and image salience, the ability to identify areas of high entropy could make a difference in user perception and technological capabilities.

Robot Design and Control Engineering

Image registration algorithm for high-voltage electric power live line working robot based on binocular vision

Chengqi Li, Zhigang Ren, Bo Yang, et al.

In the process of dismounting and assembling the drop switch with the high-voltage electric power live line working (EPL2W) robot, one of the key problems is the positioning precision of the manipulators, the gripper, and the bolts used to fix the drop switch. To solve it, we study the binocular vision system theory of the robot and the characteristics of dismounting and assembling the drop switch. We propose a coarse-to-fine image registration algorithm based on image correlation, which significantly improves the positioning precision of the manipulators and bolts. The algorithm performs three steps. First, the target points are marked in the right and left views, and the system judges whether the target point in the right view satisfies the minimum registration accuracy by using the similarity of the target points’ backgrounds in the two views; this is a typical coarse-to-fine strategy. Second, the system calculates the epipolar line and generates a sequence of regions that may contain matching points from the neighborhood of the epipolar line; the optimal matching image is determined by calculating the similarity between the template image in the left view and each region in the sequence using correlation matching. Finally, the precise coordinates of the target points in the right and left views are calculated from the optimal matching image. The experimental results indicate that the positioning accuracy of the image coordinates is within 2 pixels and the positioning accuracy in the world coordinate system is within 3 mm, so the positioning accuracy of binocular vision satisfies the requirements for dismounting and assembling the drop switch.
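
The correlation matching step in the abstract above, which compares a left-view template with candidate regions near the epipolar line, can be sketched with normalized cross-correlation. The abstract does not name the exact similarity measure, so this choice is an assumption; the patches are flattened to 1D lists of intensities, the epipolar geometry is omitted, and the function names are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches, in [-1, 1]."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # A constant patch carries no structure to correlate.
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(template, candidates):
    """Index and score of the candidate patch most similar to the template."""
    scores = [ncc(template, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

In a stereo setting, `candidates` would hold the regions sampled along the neighborhood of the epipolar line in the right view, and the best index gives the matched position.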

A formation control strategy with coupling weights for the multi-robot system

Xudong Liang, Siming Wang, Weijie Li

The distributed formation problem of a multi-robot system with general linear dynamics and a directed communication topology is discussed. To prevent the multi-robot system from losing the desired formation in a complex communication environment, a distributed cooperative algorithm with coupling weights based on the Zipf distribution is designed. The asymptotic stability condition for the formation of the multi-robot system is given, and graph theory and Lyapunov theory are used to prove that, under this condition, the formation converges to the desired geometric formation and follows the desired motion of the virtual leader. Nontrivial simulations are performed to validate the effectiveness of the distributed cooperative algorithm with coupling weights.
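
For reference, normalized Zipf weights over ranked items can be computed as below. How the paper ranks neighbours and inserts these weights into the control law is not stated in the abstract, so this sketch only shows the distribution itself; `zipf_weights` and the exponent `s` are hypothetical names.

```python
def zipf_weights(n, s=1.0):
    """Normalized Zipf weights for ranks 1..n: weight_k is proportional to 1 / k**s."""
    raw = [1.0 / (k ** s) for k in range(1, n + 1)]
    total = sum(raw)
    return [r / total for r in raw]
```

With `s = 1`, the top-ranked item receives twice the weight of the second and three times the weight of the third, and the weights sum to one.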

Implementation and performance evaluation open-source controller for precision control of gripper

Seung-Yong Lee, Un-Hyeong Ham, Young-Woo Park, et al.

This paper proposes an integrated embedded operating system for a gripper, which includes an external interface structure for sophisticated gripper control. The system has multiple functions that control the gripping module and measure the pose of the gripper body with respect to the contact environment. A dedicated gripper controller based on open source is developed, and an external communication interface between the robot controller and the gripper controller is designed. The experimental environment for the fixed-cycle test integrates the magic gripper software system and hardware on commercial business. As a result, a deviation of approximately 2% is measured, and the system is verified for gripper control.

Research on key technology of prognostic and health management for autonomous underwater vehicle

Zhi Zhou

Autonomous Underwater Vehicles (AUVs) are untethered, autonomously moving underwater robots. They have a wide operating range that can reach thousands of kilometers. Because they offer wide range, good maneuverability, safety, and intelligence, they have become an important tool for various underwater tasks. How to improve the diagnosis accuracy of AUV electrical system faults, and how to repair AUVs using this information, are a focus of navies around the world. In turn, ensuring safe and reliable operation of the system is very important for improving AUV sailing performance. To solve these problems, this paper studies prognostic and health management (PHM) technology and applies it to AUVs; the overall framework and key technologies are proposed, such as data acquisition, feature extraction, fault diagnosis, and failure prediction.

Intelligent navigation and accurate positioning of an assist robot in indoor environments

Bin Hua, Endri Rama, Genci Capi, et al.

Robot navigation and accurate positioning in indoor environments are still challenging tasks, especially in applications that assist disabled and/or elderly people in museum or art gallery environments. In this paper, we present a human-like navigation method in which neural networks control the wheelchair robot to reach the goal location safely, by imitating the supervisor’s motions, and to position itself at the intended location. In a museum-like environment, the mobile robot starts navigation from various positions and uses a low-cost camera to track the target picture and a laser range finder to navigate safely. Results show that the neural controller with the Conjugate Gradient Backpropagation training algorithm gives a robust response and guides the mobile robot accurately to the goal position.

Structural optimization of Beach-Cleaner snatch mechanism

Lian-ge Ouyang, Qin-rui Wei, Shui-ting Zhou, et al.

In the working process of a Beach-Cleaner snatch mechanism, the angular speed of the second knuckle arm was too high, which caused the pick-up device to crash into the basic arm during folding. A rational joint position that reduces the second knuckle arm angular speed and the force along the axis direction at the most dangerous point can be obtained from a kinematics simulation of the snatch mechanism in Automatic Dynamic Analysis of Mechanical Systems (ADAMS). The feasibility of the scheme was validated by analyzing the optimized model in ANSYS. The analysis results revealed that the opening angle between the basic arm and the second knuckle arm improved from 125.0° to 135.24° and the second knuckle arm angular speed decreased from 990.74 rad/s to 58.53 rad/s, which not only improved the work efficiency of the snatch mechanism but also improved its operating smoothness.