Proceedings Volume 9067

Sixth International Conference on Machine Vision (ICMV 2013)

Branislav Vuksanovic, Jianhong Zhou, Antanas Verikas
View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 20 December 2013
Contents: 1 Sessions, 89 Papers, 0 Presentations
Conference: Sixth International Conference on Machine Vision (ICMV 13) 2013
Volume Number: 9067

Table of Contents

Sixth International Conference on Machine Vision (ICMV 2013)
Front Matter: Volume 9067
This PDF file contains the front matter associated with SPIE Proceedings Volume 9067 including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
Stereo matching using belief propagation with spatiotemporal consistency
Yingyun Yang, Xie Song, Qin Zhang
In this paper, we propose a stereo matching approach using belief propagation for video disparity estimation by establishing a novel spatiotemporal belief propagation model. The proposed model extends 2D belief propagation algorithm to 3D mode by propagating the belief of preceding frame to the following frame. Additionally, the propagating messages of the preceding frame are translated through referring to motion vector and then used as the initial values of message for the current frame. Meanwhile, the consistency of the motion vector is incorporated to the smoothness constraint for the current frame. The proposed spatiotemporal model of belief propagation has more systematic and comprehensive combination of temporal correlation compared to previous works. The experimental results show that it outperforms the algorithms based on 2D belief propagation especially for non-deformation motion in middle-low speed.
Occluded object imaging via optimal camera selection
Tao Yang, Yanning Zhang, Xiaomin Tong, et al.
High-performance imaging of occluded objects in cluttered scenes is a significant and challenging task for many computer vision applications. Recently, camera array synthetic aperture imaging has proved to be an effective way of seeing objects through occlusion. However, the imaging quality of the occluded object is often significantly degraded by the shadows of the foreground occluder. Although some works have been presented to label the foreground occluder via object segmentation or 3D reconstruction, these methods fail in the case of complicated occluders and severe occlusion. In this paper, we present a novel optimal camera selection algorithm to solve the above problem. The main characteristics of this algorithm include: (1) Instead of synthetic aperture imaging, we formulate the occluded object imaging problem as an optimal camera selection and mosaicking problem. To the best of our knowledge, our proposed method is the first one for occluded object mosaicking. (2) A greedy optimization framework is presented to propagate the visibility information among various depth focus planes. (3) A multiple label energy minimization formulation is designed in each plane to select the optimal camera. The energy is estimated in the synthetic aperture image volume and integrates the multi-view intensity consistency, previous visibility property and camera view smoothness; it is minimized via graph cuts. We compare our method with the state-of-the-art synthetic aperture imaging algorithms, and extensive experimental results with qualitative and quantitative analysis demonstrate the effectiveness and superiority of our approach.
Design of directional selection for three-dimensional complex discrete wavelet transform
Takeshi Kato, Zhong Zhang, Hiroshi Toda, et al.
In this paper, we propose a novel design method of directional selection for the three-dimensional complex discrete wavelet transform (3D-CDWT) (hereafter, all abbreviations will be written with capital letters). Previously, the complex discrete wavelet transform has been able to extract directional edges from images using the directional selection property. This can be applied to shape and texture analysis. However, the angular range of each directional edge is still unclear and the angular ranges are fixed. Thus, the transform is difficult to apply to image processing. Therefore, we firstly clarify the angular range of directional edges by using frequency characteristics. Secondly, we propose a design method that gives the directional edges a desired angular range. As a result, we can clarify the angular ranges of directional edges and it is possible to obtain directional edges with an arbitrary angular range.
Pedestrian cue detection: colour inverse maximum likelihood ratio
Malik Braik, David Pycock
This paper presents an adaptable method for identifying pedestrian cues. Cue detection is investigated for adults in isolation and in groups. The aim is to detect a single cue for each pedestrian. Colour Inverse Maximum Likelihood Ratio (IMLR) criteria are employed to distinguish object and background regions using a mask designed to accommodate a wide range of appearances. The adaptability and specificity of the method are demonstrated using images containing trees and street furniture; structures that are often confused with pedestrians by computer vision systems. Test images of low contrast are also included to assess the sensitivity of the cue detection process. Evaluation with over 250 images gives a false positive error rate of 10% and a false negative error rate of 1.5% under exacting detection criteria with a complexity of where n is the number of image points considered. The speed of execution is 8 ms per frame for images of 640 by 480 pixels on an Intel Core i3-2310M™ CPU running at 2.10 GHz with 4.00 GB RAM.
A method of real-time detection for distant moving obstacles by monocular vision
Bao-zhi Jia, Ming Zhu
In this paper, we propose an approach for detection of distant moving obstacles like cars and bicycles by a monocular camera to cooperate with ultrasonic sensors in low-cost condition. We are aiming at detecting distant obstacles that move toward our autonomous navigation car in order to give alarm and keep away from them. Method of frame differencing is applied to find obstacles after compensation of camera’s ego-motion. Meanwhile, each obstacle is separated from others in an independent area and given a confidence level to indicate whether it is coming closer. The results on an open dataset and our own autonomous navigation car have proved that the method is effective for detection of distant moving obstacles in real-time.
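As a rough illustration of the frame-differencing step mentioned in the abstract above, the following minimal sketch marks pixels whose intensity changes between two frames by more than a threshold. It assumes the frames are already compensated for ego-motion and are given as grayscale lists of rows; the function name `frame_difference_mask` and the threshold value are illustrative assumptions, not details from the paper.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Return a binary mask (1 = changed pixel) from two equally sized grayscale frames."""
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same height")
    mask = []
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        if len(prev_row) != len(curr_row):
            raise ValueError("frames must have the same width")
        # A pixel counts as moving when its absolute intensity change exceeds the threshold.
        mask.append([1 if abs(c - p) > threshold else 0
                     for p, c in zip(prev_row, curr_row)])
    return mask


# Toy 3x4 frames: a bright blob appears in the right-hand columns.
prev_frame = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
curr_frame = [[10, 10, 10, 10],
              [10, 10, 200, 200],
              [10, 12, 200, 200]]
mask = frame_difference_mask(prev_frame, curr_frame)
```

In a fuller pipeline the connected regions of such a mask would then be separated into individual obstacle candidates; that step is not shown here.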
Finger mouse system based on computer vision in complex backgrounds
Jun Xu, Xiong Zhang
This paper presents a human-computer interaction system and realizes a real-time virtual mouse. Our system emulates the dragging and selecting functions of a mouse by recognizing bare hands, hence the control style is simple and intuitive. A single camera is used to capture hand images and a DSP chip is embedded as the image processing platform. To deal with complex backgrounds, particularly where skin-like or moving objects appear, we develop novel hand recognition algorithms. Hand segmentation is achieved by skin color cue and background difference. Each input image is corrected according to the luminance and then skin color is extracted by Gaussian model. We employ a Camshift tracking algorithm which receives feedbacks from the recognition module. In fingertip recognition, a method combining template matching and circle drawing is proposed. Our system has advantages of good real-time performance, easy integration and energy conservation. Experiments show that the system is robust to the scaling and rotation of hands.
A defects detection system for the surfaces of stampings
Baowen Chen, Jun Jiang, Jun Cheng, et al.
Detecting defects on the surfaces of stampings plays a critical role in the manufacturing process. Many methods have been proposed to detect and identify simple defects on stampings. However, these methods suffer from large system size, high cost, and low speed for inspection. This paper proposes a new visual system for detecting defects on the surfaces of stampings. A set of LED bar lights are used to illuminate the stamping surface from the four sides. This can ensure that the irradiation directions are parallel to the surface. Thus, it can enhance the imaging of the defects and punching edges in the vertical orientation of the surface, which facilitates the location of the defects such as scratch and pitting and the measurement of the punching sizes. Thereby, the defects can be classified using simple shape and dimension analysis. The proposed system is a part of the automated sorting system. Practical operations verify the effectiveness of the proposed system.
Face recognition using sparse representation classifier with Volterra kernels
Hengjian Li, Lianhai Wang, Jiashu Zhang, et al.
Sparse representation based classification (SRC) cannot reliably classify samples from different classes that are distributed along the same direction. To solve this problem, a Volterra kernel sparse representation based classification (Volterra-SRC) algorithm is proposed in this paper. Firstly, the original face images are divided into non-overlapping patches and then mapped into a high dimensional space by utilizing the Volterra kernels. During the training stage, following the Fisher criterion, the objective function is defined to obtain the optimal Volterra kernels by maximizing inter-class distances and minimizing intra-class distances simultaneously. During the testing stage, a voting procedure is introduced in conjunction with a sparse representation based classification to decide to which class each individual patch belongs. Finally, the aggregate classification results of all patches in a face are used to determine the overall recognition outcome for the given face image. We report experiments on the ORL and Extended Yale B benchmark face databases and show that our proposed Volterra-SRC algorithm consistently outperforms the original SRC and offers advantages and robustness when the number of training samples is small.
Multi-view urban scene reconstruction in non-uniform volume
Runchao Mao, Qiang Wu, Yu Qiao, et al.
This paper presents a new fully automatic approach for multi-view urban scene reconstruction. Our algorithm is based on the Manhattan-World assumption, which can provide compact models while preserving fidelity of synthetic architectures. Starting from a dense point cloud, we extract its main axes by global optimization, and construct a nonuniform volume based on them. A graph model is created from volume facets rather than voxels. Appropriate edge weights are defined to ensure the validity and quality of the surface reconstruction. Compared with the common point-cloud-to-model methods, the proposed methodology exploits image information to unveil the real structures of holes in the point cloud. Experiments demonstrate the encouraging performance of the algorithm.
Automatic 2D-to-3D video conversion by monocular depth cues fusion and utilizing human face landmarks
In this paper, we propose a hybrid 2D-to-3D video conversion system to recover the 3D structure of the scene. Depending on the scene characteristics, geometric or height depth information is adopted to form the initial depth map. This depth map is fused with color-based depth cues to construct the final depth map of the scene background. The depths of the foreground objects are estimated after their classification into human and non-human regions. Specifically, the depth of a non-human foreground object is directly calculated from the depth of the region behind it in the background. To acquire more accurate depth for the regions containing a human, the estimation of the distance between face landmarks is also taken into account. Finally, the computed depth information of the foreground regions is superimposed on the background depth map to generate the complete depth map of the scene which is the main goal in the process of converting 2D video to 3D.
Semantic labeling of indoor scenes from RGB-D images with discriminative learning
Bo Liu, Haoqi Fan
Recently emerged RGB-D sensors provide great promise for indoor scene understanding, which is a fundamental and challenging problem in computer vision. We present a discriminative model in this paper to semantically label indoor scenes from RGB-D images. Unlike previous work which only labels pre-determined superpixels, we characterize the scenes with a set of planes and compose them into objects. The optimal composition and the corresponding labels are inferred simultaneously using a greedy algorithm. Our model considers unary features and pairwise and co-occurrence context, as well as latent variables that account for multi-mode distributions of each object category. We train the model with a latent structural SVM learning framework. Our approach achieves state-of-the-art performance on the Cornell RGB-D indoor scene dataset [1].
Variational Bayesian level set for image segmentation
Han-Bing Qu, Lin Xiang, Jia-Qiang Wang, et al.
In this paper, we present a variational Bayesian framework for level set image segmentation, which utilizes Gaussian mixtures model to approximate the posteriors of image intensities inside and outside of the zero level set, respectively. The active curve will evolve according to the approximate log marginal probability of each region and a partition of image is obtained by the sign of the level set function. Our method provides a flexible probabilistic framework to model image data with flexible Gaussian mixtures model. Experimental results demonstrate our approach is comparable to classical level set segmentation method.
Face and eyes localization for pose and light invariant face image
In this paper, a new pose-robust and light-invariant face and eyes localization method using integral projection of adaptive probability is proposed for multiview face recognition. First, an automatic preprocessing method is proposed to balance the brightness and color of the original image. Then an unsupervised multi-component skin model is established to obtain the adaptive skin probability. Finally, the integral projection of the adaptive probability is used to realize pose-robust face localization. Experimental results show the effectiveness of the proposed method.
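The integral projection named in the abstract above can be sketched as row and column sums of a skin-probability map, with the face extent taken where the projections are large. This is only a minimal illustration: the helper names, the toy probability map and the "at least half of the maximum" cutoff rule are assumptions made for the example, not the paper's procedure.

```python
def integral_projections(prob_map):
    """Return (row_sums, col_sums) of a 2D probability map given as a list of rows."""
    row_sums = [sum(row) for row in prob_map]
    col_sums = [sum(col) for col in zip(*prob_map)]
    return row_sums, col_sums


def bounding_range(projection, fraction=0.5):
    """Return (first, last) indices whose projection is at least fraction * max."""
    cutoff = fraction * max(projection)
    indices = [i for i, value in enumerate(projection) if value >= cutoff]
    return indices[0], indices[-1]


# Toy skin-probability map with a high-probability block in the middle.
prob_map = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.9, 0.0],
    [0.1, 0.8, 0.9, 0.8, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.0],
]
rows, cols = integral_projections(prob_map)
top, bottom = bounding_range(rows)   # vertical extent of the candidate face region
left, right = bounding_range(cols)   # horizontal extent of the candidate face region
```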
Robust moving ship detection using context-based motion analysis and occlusion handling
This paper proposes an original moving ship detection approach in video surveillance systems, especially concentrating on occlusion problems among ships and vegetation using context information. Firstly, an over-segmentation is performed to divide and classify by SVM (Support Vector Machine) segments into water or non-water, while exploiting the context that ships move only in water. We assume the ship motion to be characterized by motion saliency and consistency, such that each ship distinguishes itself. Therefore, based on the water context model, non-water segments are merged into regions with motion similarity. Then, moving ships are detected by measuring the motion saliency of those regions. Experiments on real-life surveillance videos prove the accuracy and robustness of the proposed approach. We especially pay attention to testing in the cases of severe occlusions between ships and between ship and vegetation. The proposed algorithm outperforms, in terms of precision and recall, our earlier work and a proposal using SVM-based ship detection.
Mutual information-based facial expression recognition
Mliki Hazar, Mohamed Hammami, Ben-Abdallah Hanêne
This paper introduces a novel low-computation discriminative regions representation for the expression analysis task. The proposed approach relies on interesting studies in psychology which show that most of the descriptive and responsible regions for facial expression are located around some face parts. The contributions of this work lie in the proposal of a new approach which supports automatic facial expression recognition based on automatic regions selection. The regions selection step aims to select the descriptive regions responsible for facial expression and was performed using the Mutual Information (MI) technique. For facial feature extraction, we have applied Local Binary Patterns (LBP) on the gradient image to encode salient micro-patterns of facial expressions. Experimental studies have shown that using discriminative regions provides better results than using the whole face regions whilst reducing the feature vector dimension.
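For readers unfamiliar with Local Binary Patterns, the basic 3x3 operator can be sketched as follows: each of the eight neighbours contributes one bit, set when the neighbour is at least as large as the centre pixel. The neighbour ordering, the helper names and the toy patch are assumptions chosen for this illustration; the abstract above applies LBP to a gradient image, whereas the sketch accepts any 2D list of values.

```python
# Clockwise neighbour offsets (row, col), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit LBP code of interior pixel (r, c): bit k is set if neighbour k >= centre."""
    centre = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 6, 3]]
code = lbp_code(patch, 1, 1)   # neighbours 9, 7 and 6 are >= 6, giving bits 1, 3 and 5
hist = lbp_histogram(patch)
```

A histogram of such codes per selected region is a common way to build the feature vector; how the paper assembles its final descriptor is not stated in the abstract.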
Automatically measuring the effect of strategy drawing features on pupils’ handwriting and gender
Narges Tabatabaey-Mashadi, Rubita Sudirman, Richard M. Guest, et al.
Children’s dynamic drawing strategies have been recently recognized as indicators of handwriting ability. However the influence of each feature in predicting handwriting is unknown due to lack of a measuring system. An automated measuring algorithm suitable for psychological assessment and non-subjective scoring is presented here. Using the weight vector and classification rate of a machine learning algorithm, an overall feature’s effect is calculated which is comparable in different groupings. In this study thirteen previously detected drawing strategy features are measured for their influence on handwriting and gender. Features are extracted from drawing a triangle, Beery VMI and Bender Gestalt tangent patterns. Samples are related to 203 pupils (77 below average writers, and 101 female). The results show that the number of strokes in drawing the triangle pattern plays a major role in both groupings; however Left Tendency flag feature is affected by children’s handwriting about 2.5 times greater than their gender. Experiments indicate that different forms of a feature sometimes show different influences.
Robust matching of SIFT keypoints via adaptive distance ratio thresholding
Liang Mi, Yu Qiao, Jie Yang, et al.
This paper presents a robust method to search for the correct SIFT keypoint matches with adaptive distance ratio threshold. Firstly, the reference image is analyzed by extracting some characteristics of its SIFT keypoints, such as their distance to the object boundary and the number of their neighborhood keypoints. The matching credit of each keypoint is evaluated based on its characteristics. Secondly, an adaptive distance ratio threshold for the keypoint is determined based on its matching credit to identify the correctness of its best match in the source image. The adaptive threshold loosens the matching conditions for keypoints of high matching credits and tightens the conditions for those of low matching credits. Our approach improves the scheme of SIFT keypoint matching by applying adaptive distance ratio threshold rather than global threshold that ignores different matching credits of various keypoints. The experiment results show that our algorithm outperforms the standard SIFT matching method in some complicated cases of object recognition, in which it discards more false matches as well as preserves more correct matches.
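The idea of a per-keypoint distance ratio threshold described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' method: the linear mapping from matching credit to threshold, the bounds 0.6 and 0.9, and the function names are hypothetical, and descriptors are plain tuples compared by Euclidean distance.

```python
import math


def adaptive_threshold(credit, low=0.6, high=0.9):
    """Map a matching credit in [0, 1] to a ratio threshold (hypothetical linear rule)."""
    credit = min(max(credit, 0.0), 1.0)
    return low + (high - low) * credit


def match_keypoints(ref_desc, ref_credit, src_desc):
    """Return {ref_index: src_index} for matches passing the per-keypoint ratio test."""
    matches = {}
    for i, d in enumerate(ref_desc):
        # Distances to all source descriptors, sorted ascending.
        dists = sorted((math.dist(d, s), j) for j, s in enumerate(src_desc))
        if len(dists) < 2:
            continue
        (best, j), (second, _) = dists[0], dists[1]
        # Accept when the best match is sufficiently closer than the second best.
        if second > 0 and best / second < adaptive_threshold(ref_credit[i]):
            matches[i] = j
    return matches


ref_desc = [(0.0, 0.0), (10.0, 10.0)]
src_desc = [(3.0, 0.0), (4.0, 0.0), (10.0, 10.5)]
# Keypoint 0 has a best/second ratio of 0.75: kept with high credit, dropped with low credit.
loose = match_keypoints(ref_desc, [1.0, 0.0], src_desc)
strict = match_keypoints(ref_desc, [0.0, 0.0], src_desc)
```

A higher credit yields a larger threshold and therefore a looser test, which matches the behaviour the abstract describes for keypoints of high matching credit.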
Object class and instance recognition on rgb-d data
Viktor Seib, Susanne Christ-Friedmann, Susanne Thierfelder, et al.
We present a novel approach for combining 3D depth and visual information for object class and object instance recognition. Object classes are recognized by first assigning local geometric primitive labels using a CRF, followed by an SVM classification. Object instances are recognized using Hough-transform clustering of SURF features. Both algorithms perform well on publicly available object databases as well as on acquired data with an RGB-D camera. The object instance recognition algorithm was further evaluated during the RoboCup world championship 2012 in Mexico-City and won the first place in the Technical Challenge of the @Home-league.
Aspects on the design, implementation, and simulation of a tracked mini robot destined for special applications in theatres of operations
Silviu-Mihai Petrişor, Ghiţă Bârsan
The authors of this paper wish to highlight elements regarding the organology, functioning and simulation, in a real workspace, of a tracked mini robot structure destined for special applications in theatres of operation, a technological product which is subject to a national patent granted to our institution (patent no. RO a 2012 01051), the result of research activities undertaken under a contract won by national competition, a grant for young research teams, PN-RUTE-2010 type. The issues outlined in this paper are aspects related to the original invention in comparison with other mini-robot structures, the inventors presenting succinctly the technological product description and its applicability both in the military and applicative area as well as in the educational one. Additionally, the advantages of using the technological product are shown in a real workspace, the constructive and functional solution before, finally, presenting, based on the modelling of the mechanical structure of the tilting module attached to the mini-robot, an application on the simulation and programming of the mini-robot under study.
Breast cancer mitosis detection in histopathological images with spatial feature extraction
Abdülkadir Albayrak, Gökhan Bilgin
In this work, cellular mitosis detection in histopathological images has been investigated. Mitosis detection is a very expensive and time-consuming process. The development of digital imaging in pathology has enabled a reasonable and effective solution to this problem. Segmentation of digital images provides easier analysis of cell structures in histopathological data. To differentiate normal and mitotic cells in histopathological images, the feature extraction step is crucial for the system accuracy. A mitotic cell has more distinctive textural dissimilarities than the other normal cells. Hence, it is important to incorporate spatial information in feature extraction or in post-processing steps. As a main part of this study, the Haralick texture descriptor has been proposed with different spatial window sizes in RGB and L*a*b* color spaces. So, spatial dependencies of normal and mitotic cellular pixels can be evaluated within different pixel neighborhoods. Extracted features are compared with various sample sizes by Support Vector Machines using the k-fold cross validation method. According to the presented results, it has been shown that separation accuracy on mitotic and non-mitotic cellular pixels gets better with the increasing size of the spatial window.
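Haralick texture features are computed from a grey-level co-occurrence matrix (GLCM). The sketch below builds a symmetric, normalised GLCM for one pixel offset and derives two standard Haralick features, contrast and energy (angular second moment). It is a minimal illustration for a single window: the function names, the choice of offset and the toy images are assumptions, the window must contain at least one valid pixel pair, and the paper's window sizes and colour-space handling are not reproduced.

```python
def glcm(image, levels, dr=0, dc=1):
    """Symmetric, normalised grey-level co-occurrence matrix for offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def haralick_contrast(p):
    """Haralick contrast: sum over (i, j) of (i - j)^2 * p[i][j]."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def haralick_energy(p):
    """Angular second moment: sum of squared GLCM entries."""
    return sum(v * v for row in p for v in row)


flat = [[1, 1], [1, 1]]      # uniform window: no local variation
checker = [[0, 1], [1, 0]]   # alternating window: maximal variation for two levels
flat_contrast = haralick_contrast(glcm(flat, levels=2))
checker_contrast = haralick_contrast(glcm(checker, levels=2))
checker_energy = haralick_energy(glcm(checker, levels=2))
```

Evaluating such features over windows of several sizes around each pixel gives the kind of spatially dependent descriptor the abstract refers to.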
Improved segmentation of occluded and adjoining vehicles in traffic surveillance videos
Occlusion in image processing refers to concealment of any part of the object or the whole object from view of an observer. Real time videos captured by static cameras on roads often encounter overlapping and hence, occlusion of vehicles. Occlusion in traffic surveillance videos usually occurs when an object which is being tracked is hidden by another object. This makes it difficult for the object detection algorithms to distinguish all the vehicles efficiently. Also morphological operations tend to join the close proximity vehicles resulting in formation of a single bounding box around more than one vehicle. Such problems lead to errors in further video processing, like counting of vehicles in a video. The proposed system brings forward efficient moving object detection and tracking approach to reduce such errors. The paper uses successive frame subtraction technique for detection of moving objects. Further, this paper implements the watershed algorithm to segment the overlapped and adjoining vehicles. The segmentation results have been improved by the use of noise and morphological operations.
Hyperspectral image classification based on NMF Features Selection Method
Bolanle T. Abe, J. A. Jordaan
Hyperspectral instruments are capable of collecting hundreds of images corresponding to wavelength channels for the same area on the earth surface. Due to the huge number of features (bands) in hyperspectral imagery, land cover classification procedures are computationally expensive and pose a problem known as the curse of dimensionality. In addition, higher correlation among contiguous bands increases the redundancy within the bands. Hence, dimension reduction of hyperspectral data is crucial to obtain good classification accuracy. This paper presents a new feature selection technique. A Non-negative Matrix Factorization (NMF) algorithm is proposed to obtain reduced relevant features in the input domain of each class label. This aims to reduce the classification error and the dimensionality of classification challenges. The Indiana pines of the Northwest Indiana dataset is used to evaluate the performance of the proposed method through experiments of features selection and classification. The Waikato Environment for Knowledge Analysis (WEKA) data mining framework is selected as a tool to implement the classification using Support Vector Machines and Neural Network. The selected feature subsets are subjected to land cover classification to investigate the performance of the classifiers and how the feature size affects classification accuracy. The results obtained show that the performances of the classifiers are significant. The study makes a positive contribution to the problems of hyperspectral imagery by exploring NMF, SVMs and NN to improve classification accuracy. The performances of the classifiers are valuable for a decision maker to consider tradeoffs in method accuracy versus method complexity.
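As background for the NMF step named in the abstract above, the following sketch factors a small non-negative matrix V into W times H using the well-known multiplicative update rules. It only illustrates the factorization itself: how the paper turns the factors into a feature selection is not described in the abstract and is not attempted here, and the helper names, the toy matrix (rows as pixels, columns as bands) and the fixed random seed are assumptions for the example.

```python
import random


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy data: every row is a non-negative mix of two spectral patterns.
v = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [4, 3, 2, 1],
     [5, 5, 5, 5]]
w, h = nmf(v, rank=2)
error = frobenius_error(v, w, h)
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what makes the factors interpretable as additive parts.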
Ensemble classifier using GRG algorithm for land cover classification
Bolanle T. Abe, J. A. Jordaan, Tshilidzi Marwala
Image processing is of great value because it enables satellite images to be translated into useful information. The preprocessing of remotely sensed images before feature extraction is important to remove noise and improve the ability to interpret image data more accurately. All images should appear as if they were acquired from the same sensor at the end of image preprocessing. A major challenge associated with hyperspectral imagery in remote sensing analysis is the mixed pixels which are due to the high-dimensional nature of the data. This study makes a positive contribution to the problem of land cover classification by exploring the Generalized Reduced Gradient (GRG) algorithm on hyperspectral datasets by using Washington DC mall and Indiana pines test site of Northwestern Indiana, USA as study sites. The algorithm was used to estimate the fractional abundance in the datasets for land cover classification. Ensemble classifiers such as random forest, bagging and support vector machines were implemented in the Waikato Environment for Knowledge Analysis (WEKA) to carry out the classification procedures. Experimental results show that the random forest ensemble outperformed the other ensemble methods. The comparison of the classifiers is crucial for a decision maker to consider compromises in accuracy technique against complexity technique.
A new approach of facial features’ localization using a morphological operation in still and sequence images
Kenz Amhmed Bozed, Osei Adjei, Ali Mansour
Facial features’ localization is a crucial step for many systems of face detection and facial expression recognition. It plays an essential role in human face analysis especially in searching for facial features (mouth, nose and eyes) when the face region is included within the image. The fundamental technique used in facial analysis is to detect the face and subsequently the associated salient features. In this paper, a new algorithm based on morphological properties of the face region is proposed for the extraction of salient features. A morphological operation is used to locate the pupils of the eyes and estimate the mouth position according to them. The boundaries of the located features are computed once the features are located. This algorithm is applied to individual images and subsequently to video sequences. The experimental results achieved from this work indicate that the algorithm has been very successful in recognizing different types of facial expressions.
Performance comparisons for well-known edge detectors with proposed Yong operator
Ching Yee Yong, Rubita Sudirman, Nasrul Humaimi Mahmood, et al.
This study investigates and acts as a trial outcome for an image processing technique. The proposed Yong operator edge detection filter was developed to analyse and access the information in an image that can be used in hospitals, clinics and imaging research. It aims to enhance and obtain the ideal edges from real-life images of moderate complexity. Results show that the proposed Yong operator is able to connect broken edges, reduce false edges, resist noise and handle fragmentation, which the well-known operators are not able to do. The proposed operator successfully interpreted several complicated image data types with relatively inexpensive computation and short processing time.
A new occluded target location method based on straight line
Feng Gao, Shaohua Qiu, Gongjian Wen
A new partially occluded target location method based on straight lines is proposed. It is divided into four steps. Firstly, we manually label the straight lines of the target of interest in the history image and store the line points together with the gradient orientation. The labeled lines, the length of which is restricted, should be distributed symmetrically. Then, we transform the stored lines using the transformation model whose parameters are derived from the geometry calibration result of the real-time image. Afterwards, we construct a pyramid structure of the real-time image and search for the optimal match position. The geometry coherence rule is used to obtain a holistic optimal match result. Lastly, we compare the matching measure with the threshold to decide whether the same match process needs to be performed using the higher-resolution image, and output the match result. The experimental results, tested on real-time remote sensing images, especially ones in which part of the target is occluded, show that the proposed algorithm for target location is accurate and effective.
Cascading conditional random fields for image registration
F. C. Calnegru
This article presents a new Markov Random Field based algorithm for parametric image registration. The algorithm consists in approximating the parameters of the registering transformation, by cascading a number of second order conditional random fields, until a certain condition is met, and then, in refining those parameters through estimating the energy minimum of a third order conditional random field. By casting the registration task in this computational framework, we circumvent the problems associated with estimating the parameters in a higher order Markov Random Field, as well as the accuracy issues introduced by approximating the energy that has to be minimized. The main features of our algorithm are speed, generality, being able to cope with all the types of similarity measures, and accuracy.
Visual odometry with high resolution time-of-flight cameras
Yosef Dalbah, Nicolas Dingeldey, Friedrich M. Wahl
We present a method that estimates the ego motion of a vehicle based on camera data of high resolution (207x204 pixels) Time-of-Flight cameras using visual odometry techniques. Translation and rotation of camera motion in six degrees of freedom are calculated. Point correspondences in consecutive amplitude image pairs are built. By consideration of the depth image of the camera 3D point correspondences are derived from the two dimensional point correspondences. Camera motion between the two images is then computed by registration of the two resulting point clouds. The process is optimized by incorporation of outlier removal and a multi sensor setup. The presented optimizations raise the precision and robustness of the method and enable visual odometry by Time-of-Flight camera data as an alternative to common odometry systems in low speed scenarios.
Weakly supervised automatic segmentation and 3D modeling of the knee joint from MR images
Amal Amami, Zouhour Ben Azouz
Automatic segmentation and 3D modeling of the knee joint from MR images is a challenging task. Most existing techniques require the tedious manual segmentation of a training set of MRIs. We present an approach that requires the manual segmentation of only one MR image. It is based on a volumetric active appearance model (AAM). First, a dense tetrahedral mesh is automatically created on an arbitrarily selected reference MR image. Second, a pairwise non-rigid registration between each MRI from a training set and the reference MRI is computed. The non-rigid registration is based on a piecewise affine deformation using the created tetrahedral mesh. The minimum description length is then used to bring all the MR images into correspondence. An average image and tetrahedral mesh, as well as a set of main modes of variation, are generated using the established correspondence. Any manual segmentation of the average MRI can be mapped to other MR images using the AAM. The proposed approach has the advantage of simultaneously generating 3D reconstructions of the surface and a 3D solid model of the knee joint. The generated surfaces and tetrahedral meshes have the useful property of being in correspondence across different MR images. This paper shows preliminary results of the proposed approach. It demonstrates the automatic segmentation and 3D reconstruction of a knee joint obtained by mapping a manual segmentation of a reference image.
Feature measures for the segmentation of neuronal membrane using a machine learning algorithm
Saadia Iftikhar, Afzal Godil
In this paper, we present a Support Vector Machine (SVM) based pixel classifier for a semi-automated segmentation algorithm that detects neuronal membrane structures in stacks of electron microscopy images of brain tissue samples. The algorithm uses high-dimensional feature spaces extracted from center-surround patches, some distinct edge-sensitive features for each pixel in the image, and a training dataset for separating neuronal membrane structures from background. Threshold conditions are then applied to remove small regions that fall below a size criterion, and morphological operations, such as filling the detected objects, are applied to make the objects compact. The performance of the segmentation method is measured on unseen data using three distinct error measures (pixel error, warping error, and Rand error) as well as a pixel-by-pixel accuracy measure against the respective ground truth. The trained SVM classifier achieves its best values for these three errors at 0.23, 0.016 and 0.15, respectively, while the best pixel-by-pixel accuracy reaches 77% on the given dataset. The results presented here are a further step towards solving hard problems such as segmentation in medical image analysis. In the future, we plan to extend the method to a 3D segmentation approach for 3D datasets, both to retain the topological structures in the dataset and to ease further analysis.
Unattended vehicle detection for automatic traffic light control
Aya Salama Abdel Hady, Mohamed Moustafa
Machine vision based traffic light control depends mainly on measuring traffic statistics at crossroads. Most previous studies have not taken unattended vehicles into consideration when calculating either the traffic density or the traffic flow. In this paper, we propose incorporating unattended vehicles into a new metric for measuring traffic congestion. In addition to vehicle motion analysis, the opening of the driver's side door is an important indicator that a vehicle is going to be unattended. Therefore, we focus in this paper on how to detect this event for stationary vehicles from a live camera or a video feed. Through a set of experiments, we found that a Scale Invariant Feature Transform (SIFT) feature descriptor with a Support Vector Machine (SVM) classifier successfully distinguished open-door vehicles from closed-door ones in 96.7% of our test dataset.
A Kinect based sign language recognition system using spatio-temporal features
Abbas Memiş, Songül Albayrak
This paper presents a sign language recognition system that uses spatio-temporal features on RGB video images and depth maps for dynamic gestures of Turkish Sign Language. The proposed system uses a motion difference and accumulation approach for temporal gesture analysis. The motion accumulation method, which is effective for temporal domain analysis of gestures, produces an accumulated motion image by combining the differences of successive video frames. Then, the 2D Discrete Cosine Transform (DCT) is applied to the accumulated motion images, so that temporal domain features are transformed into the spatial domain. These processes are performed on RGB images and depth maps separately. The DCT coefficients that represent sign gestures are selected via zigzag scanning, and feature vectors are generated from them. To recognize sign gestures, a K-Nearest Neighbor classifier with Manhattan distance is used. The performance of the proposed system is evaluated on a sign database that contains 1002 isolated dynamic signs belonging to 111 words of Turkish Sign Language (TSL) in three different categories. The proposed sign language recognition system has promising success rates.
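The pipeline described in this abstract (frame differences, accumulation, 2D DCT, zigzag selection, Manhattan nearest neighbor) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of absolute differences, the orthonormal DCT-II variant, and the 1-nearest-neighbor rule are all assumptions made for illustration, and the naive DCT is only practical for tiny images.

```python
import math

def accumulate_motion(frames):
    """Sum absolute differences of successive frames into one image (assumed form)."""
    h, w = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * w for _ in range(h)]
    for prev, cur in zip(frames, frames[1:]):
        for y in range(h):
            for x in range(w):
                acc[y][x] += abs(cur[y][x] - prev[y][x])
    return acc

def dct2(img):
    """Naive orthonormal 2D DCT-II, O(n^4); fine for tiny images only."""
    h, w = len(img), len(img[0])

    def alpha(k, n):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            s = 0.0
            for y in range(h):
                for x in range(w):
                    s += (img[y][x]
                          * math.cos(math.pi * (2 * y + 1) * u / (2 * h))
                          * math.cos(math.pi * (2 * x + 1) * v / (2 * w)))
            out[u][v] = alpha(u, h) * alpha(v, w) * s
    return out

def zigzag(mat, count):
    """Return the first `count` entries in JPEG-style zigzag order."""
    h, w = len(mat), len(mat[0])
    cells = sorted(
        ((y, x) for y in range(h) for x in range(w)),
        # Sort by anti-diagonal; alternate the direction on odd/even diagonals.
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [mat[y][x] for y, x in cells[:count]]

def motion_features(frames, count):
    """Feature vector: low-frequency DCT coefficients of the accumulated motion."""
    return zigzag(dct2(accumulate_motion(frames)), count)

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest_label(query, gallery):
    """1-nearest-neighbor over (feature_vector, label) pairs, Manhattan distance."""
    return min(gallery, key=lambda item: manhattan(query, item[0]))[1]
```

In this sketch the same `motion_features` call would be run once on the RGB-derived frames and once on the depth maps, matching the abstract's statement that the two streams are processed separately.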
Vehicle classification in video using virtual detection lines
Roberts Kadiķis, Kārlis Freivalds
A video processing algorithm for vehicle parameter acquisition and classification is presented. The algorithm is based on a combination of several detection lines. As vehicles pass, intervals are created on the detection lines. Intervals on different lines that belong to the same vehicle are combined. Further processing of the vehicle intervals allows vehicle parameters to be acquired and vehicles to be classified. The accuracy of vehicle counting and classification is analyzed on different videos.
License plate detection algorithm
Michael Broitman, Yuri Klopovsky, Normunds Silinskis
A novel algorithm for vehicle license plate localization is proposed. The algorithm is based on pixel intensity transition gradient analysis. Nearly 2500 natural-scene gray-level vehicle images with different backgrounds and ambient illumination were tested. The best set of algorithm parameters produces a detection rate of up to 0.94. Taking into account the abnormal camera location during our tests, and the resulting geometric distortion and interference from trees, this result can be considered acceptable. Correlations between the source data (such as license plate dimensions and texture, camera location, and others) and the parameters of the algorithm were also determined.
Action classification using a discriminative non-parametric Hidden Markov Model
Natraj Raman, S. J. Maybank, Dell Zhang
We classify human actions occurring in videos, using the skeletal joint positions extracted from a depth image sequence as features. Each action class is represented by a non-parametric Hidden Markov Model (NP-HMM) and the model parameters are learnt in a discriminative way. Specifically, we use a Bayesian framework based on Hierarchical Dirichlet Process (HDP) to automatically infer the cardinality of hidden states and formulate a discriminative function based on distance between Gaussian distributions to improve classification performance. We use elliptical slice sampling to efficiently sample parameters from the complex posterior distribution induced by our discriminative likelihood function. We illustrate our classification results for action class models trained using this technique.
A new accurate pill recognition system using imprint information
Zhiyuan Chen, Sei-ichiro Kamata
Great achievements in modern medicine benefit human beings. They have also brought about an explosive growth in the number of pharmaceuticals currently on the market. In daily life, pharmaceuticals sometimes confuse people when they are found unlabeled. In this paper, we propose an automatic pill recognition technique to solve this problem. It relies mainly on the imprint feature of the pills, which is extracted by the proposed MSWT (modified stroke width transform) and described by WSC (weighted shape context). Experiments show that our proposed pill recognition method reaches an accuracy rate of up to 92.03% within the top 5 ranks when classifying more than 10 thousand query pill images into around 2000 categories.
PCA facial expression recognition
Inas H. El-Hori, Zahraa K. El-Momen, Ali Ganoun
This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. A comparative study of two Facial Expression Recognition (FER) techniques, namely Principal Component Analysis (PCA) and PCA with Gabor filters (GF), is carried out. The objective of this research is to show that PCA with Gabor filters is superior to plain PCA in terms of recognition rate. To test and evaluate their performance, experiments are performed with both techniques on a real database. The five universally accepted principal emotions to be recognized are happy, sad, disgust, and angry, along with neutral. Recognition rates are obtained for all the facial expressions.
Application of edge detection algorithm for vision guided robotics assembly system
Bunil Kumar Balabantaray, Panchanand Jha, Bibhuti Bhusan Biswal
A machine vision system has a major role in making a robotic assembly system autonomous. Part detection and identification of the correct part are important tasks which need to be carefully performed by a vision system to initiate the process. This process consists of many sub-processes in which image capturing, digitizing, enhancing, etc. account for reconstructing the part for subsequent operations. Edge detection on the grabbed image therefore plays an important role in the entire image processing activity, and one needs to choose the correct tool for the process with respect to the given environment. In this paper, a comparative study of edge detection algorithms for grasping the object in a robot assembly system is presented. The work is performed in Matlab R2010a Simulink. Four algorithms are considered: the Canny, Roberts, Prewitt and Sobel edge detection algorithms. An attempt has been made to find the best algorithm for the problem. It is found that Canny's edge detection algorithm gives a better result and the minimum error for the intended task.
Part identification in robotic assembly using vision system
Bunil Kumar Balabantaray, Bibhuti Bhusan Biswal
A machine vision system plays an important role in making a robotic assembly system autonomous. Identification of the correct part is an important task which needs to be carefully performed by a vision system to feed the robot with correct information for further processing. This process consists of many sub-processes in which image capturing, digitizing, enhancing, etc. account for reconstructing the part for subsequent operations. Interest point detection on the grabbed image therefore plays an important role in the entire image processing activity, and one needs to choose the correct tool for the process with respect to the given environment. In this paper, three major corner detection algorithms are analyzed on the basis of their accuracy, speed and robustness to noise. The work is performed in Matlab R2012a. An attempt has been made to find the best algorithm for the problem.
Forensic analysis of social networking application on iOS devices
Shuhui Zhang, Lianhai Wang
The increased use of social networking applications on iPhone and iPad makes these devices a goldmine for forensic investigators. QQ, WeChat, Sina Weibo and Skype are very popular in China but have not drawn much attention from researchers. These social networking applications are used not only on computers, but also on mobile phones and tablets. This paper focuses on conducting forensic analysis of these four social networking applications on iPhone and iPad devices. The tests consisted of installing the social networking applications on each device, conducting common user activities through each application, and performing correlation analysis with other activities. Advice for forensic investigators is also given, which could help them describe criminal behavior and reconstruct the crime scene.
Real-time oriented edge detection via difference of shifted image
Kiseon Jeong, Moonyong Jin, Daegyu Hwang, et al.
We propose a novel oriented edge detection method called Difference of Shifted Image (DoSI), which uses only subtractions between neighboring pixels via a padding-based shifting operation. First, because no multiplications are involved, DoSI can quickly extract an oriented edge component for each direction of the 8-neighborhood. Then, a final edge map is built from all edge components by taking the maximum value at each pixel. Moreover, we propose various types of oriented edge operators based on the Prewitt, Sobel and Laplacian operators. They are obtained by combining some of the oriented edge components produced by DoSI. They perform similarly to existing edge operators based on convolution, and their procedures can be implemented in parallel. The experimental results show that the proposed edge detection methods require less computation time than convolution-based methods and that most of them are similar in edge description ability to the existing oriented edge operators.
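The shift-and-subtract idea in this abstract can be sketched in a few lines. The sketch below is illustrative rather than the authors' code: it assumes edge-replicate padding (implemented here by clamping indices), signed differences, and the names `shifted_difference` and `edge_map`, none of which are specified in the abstract.

```python
# The eight neighbor offsets (dy, dx) of the 8-neighborhood.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]

def shifted_difference(img, dy, dx):
    """Oriented edge component: image minus its copy shifted by (dy, dx).

    Clamping the neighbor index to the image bounds is equivalent to
    replicate padding, an assumption made for this sketch.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        ny = min(max(y + dy, 0), h - 1)
        row = []
        for x in range(w):
            nx = min(max(x + dx, 0), w - 1)
            row.append(img[y][x] - img[ny][nx])  # subtraction only
        out.append(row)
    return out

def edge_map(img):
    """Final edge map: per-pixel maximum over the eight oriented components."""
    h, w = len(img), len(img[0])
    components = [shifted_difference(img, dy, dx) for dy, dx in DIRECTIONS]
    return [[max(c[y][x] for c in components) for x in range(w)]
            for y in range(h)]

if __name__ == "__main__":
    step = [[0, 0, 10, 10] for _ in range(3)]  # vertical step edge
    for row in edge_map(step):
        print(row)
```

Each oriented component is independent of the others, which is what would make a parallel implementation straightforward.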
Real-time pedestrian detection based on GMM and HOG cascade
Moonyong Jin, Kiseon Jeong, Sook Yoon, et al.
Most human detection methods use HOG (Histogram of Oriented Gradients). In a fixed-camera environment, it is possible to build a background model using a GMM (Gaussian mixture model) and easily extract motion using background subtraction. However, it is difficult to recognize pedestrians among the extracted motions. In this paper, we propose an efficient coarse-to-fine pedestrian detection framework which combines motion detection and a HOG cascade to make a faster pedestrian detector. First, motion detection is used as the coarse detection in order to reduce the area of interest to be covered by the pedestrian detector. Then the HOG cascade, which detects pedestrians, is executed only on the blobs or ROIs selected by the coarse detection. The experimental results on the PETS2009 768x576 dataset show that the proposed method, with a processing speed of 11.46 fps, is 7.5 times faster than HOG and 2.2 times faster than the HOG cascade.
Violence detection based on histogram of optical flow orientation
Zhijie Yang, Tao Zhang, Jie Yang, et al.
In this paper, we propose a novel approach for violence detection and localization in a public scene. Currently, violence detection is considerably under-researched compared with common action recognition. Although existing methods can detect the presence of violence in a video, they cannot precisely locate the regions in the scene where violence is happening. This paper tackles that challenge and proposes a novel method to locate violence in the scene, which is important for public surveillance. The Gaussian Mixture Model is extended into the optical flow domain in order to detect candidate violence regions. In each region, a new descriptor, the Histogram of Optical Flow Orientation (HOFO), is proposed to measure the spatio-temporal features. A linear SVM is trained on the descriptor. The performance of the method is demonstrated on the publicly available BEHAVE and CAVIAR data sets.
Practical algorithmic probability: an image inpainting example
The possibility of practically applying algorithmic probability is analyzed using the example of the image inpainting problem, which precisely corresponds to the prediction problem. Such consideration is fruitful both for the theory of universal prediction and for practical image inpainting methods. Efficient application of algorithmic probability implies that its computation is essentially optimized for some specific data representation. In this paper, we consider one image representation, namely the spectral representation, for which an image inpainting algorithm based on the spectrum entropy criterion is proposed. This algorithm showed promising results in spite of the very simple representation. The same approach can be used to introduce an ALP-based criterion for more powerful image representations.
Traffic congestion classification using motion vector statistical features
Amina Riaz, Shoab A. Khan
Due to the rapid increase in population, one of the major problems faced by urban areas is traffic congestion. In this paper we propose a method for classifying highway traffic congestion using the statistical properties of motion vectors. Motion vectors are estimated using the pyramidal Kanade-Lucas-Tomasi (KLT) tracker algorithm. Motion vector features are then extracted and used to classify the traffic patterns into three categories: light, medium and heavy. Classification using a neural network on a publicly available dataset shows an accuracy of 95.28%, with robustness to environmental conditions such as variable luminance. Our system provides a more accurate solution to the problem than previously proposed systems.
Classification of wet aged related macular degeneration using optical coherence tomographic images
Anam Haq, Fouwad Jamil Mir, Ubaid Ullah Yasin, et al.
Wet age-related macular degeneration (AMD) is a type of age-related macular degeneration. To detect wet AMD, we look for pigment epithelium detachment (PED) and fluid-filled regions caused by choroidal neovascularization (CNV). This form of AMD can cause vision loss if not treated in time. In this article we propose an automated system for the detection of wet AMD in optical coherence tomographic (OCT) images. The proposed system extracts PED and CNV from OCT images using segmentation and morphological operations, and then a detailed feature set is extracted. These features are passed to a classifier for classification. Finally, performance measures such as accuracy, sensitivity and specificity are calculated to compare classifiers, and the classifier delivering the best performance is selected. Our system gives higher performance using an SVM than with the other methods.
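For reference, the three performance measures named in this abstract have standard definitions in terms of confusion-matrix counts. The sketch below shows those definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard accuracy, sensitivity and specificity from confusion counts.

    tp/fn: diseased cases classified correctly/incorrectly.
    tn/fp: healthy cases classified correctly/incorrectly.
    Assumes at least one positive and one negative case (no zero division).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # fraction of all cases correct
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
    }

if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(classification_metrics(tp=8, fp=5, tn=85, fn=2))
```

Comparing classifiers on all three values, rather than accuracy alone, matters when healthy scans greatly outnumber diseased ones.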
Robust place recognition with an application to semantic topological mapping
J. R. Siddiqui, S. Khatibi
The problem of robust and invariant representation of places is addressed. A place recognition technique is proposed, followed by an application to semantic topological mapping. The proposed technique is evaluated on a robot localization database which consists of a large set of images taken under various weather conditions. The results show that the proposed method can robustly recognize places and is invariant to geometric transformations, brightness changes and noise. A comparative analysis with state-of-the-art semantic place description methods shows that the method outperforms the competing methods and exhibits better average recognition rates.
Exploring manifold structure of face images via multiple graphs
Masheal Alghamdi
Geometric structure in the data provides important information for face image recognition and classification tasks. Graph regularized non-negative matrix factorization (GrNMF) performs well in this task. However, it is sensitive to parameter selection. Wang et al. proposed multiple graph regularized non-negative matrix factorization (MultiGrNMF) to solve the parameter selection problem and tested it on medical images. In this paper, we introduce the MultiGrNMF algorithm in the context of still face image classification, and conduct a comparative study of NMF, GrNMF, and MultiGrNMF using two well-known face databases. Experimental results show that MultiGrNMF outperforms NMF and GrNMF in most cases.
Robust visual tracking based on online learning of joint sparse dictionary
Qiaozhe Li, Yu Qiao, Jie Yang, et al.
In this paper, we propose a robust visual tracking algorithm based on online learning of a joint sparse dictionary. The joint sparse dictionary consists of positive and negative sub-dictionaries, which model foreground and background objects respectively. An online dictionary learning method is developed to update the joint sparse dictionary by selecting both positive and negative bases from bags of positive and negative image patches/templates during tracking. A linear classifier is trained with sparse coefficients of image patches in the current frame, which are calculated using the joint sparse dictionary. This classifier is then used to locate the target in the next frame. Experimental results show that our tracking method is robust against object variation, occlusion and illumination change.
Depth consistency evaluation for error-pose detection
Sou-Young Jin, Ho-Jin Choi, Youssef Iraqi
With the development of depth sensors such as Kinect, it is now possible to predict human body poses from a depth map without any manual labeling. The predicted poses can be used as meaningful features for many applications such as human action recognition. However, existing pose estimation algorithms are not perfect, which can seriously affect the performance of downstream applications. In this paper, we propose a novel method to detect erroneous poses. Human poses are captured by the Kinect SDK, which predicts body joints and connects them with straight lines to represent a pose. We observe that the depth gradient of pixels located on a body part is consistent when the body part is predicted correctly. Based on this observation, our algorithm examines the depth gradients of pixels on each body part. During depth gradient processing, our algorithm also considers occlusions: once a sudden change is detected in the depth values on a body part, we check whether the gradient is still consistent when the sudden-change region is excluded. We tested our algorithm on many human activities, and our experimental results show that it detects erroneous poses acceptably well in real time.
Clustering space-time interest points for action representation
Sou-Young Jin, Ho-Jin Choi
This paper presents a novel approach to representing human actions in a video. Our approach addresses a limitation of local representations, i.e. space-time interest points, which cannot adequately represent actions in a video because they lack global information about the geometric relationships among interest points. It adds geometric relationships to interest points by clustering them using squared Euclidean distances and then representing each cluster with a minimum hexahedron. Within each video, we build a multi-dimensional histogram based on the characteristics of the hexahedrons in the video for recognition. The experimental results show that the proposed representation effectively adds global information on top of local interest points and increases the accuracy of action recognition.
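A rough sketch of the clustering and hexahedron steps follows. It rests on several assumptions that the abstract does not confirm: clustering is done with a Lloyd-style (k-means) loop from given initial centers, the "minimum hexahedron" is taken to be an axis-aligned bounding box in (x, y, t), and the box side lengths stand in for the hexahedron characteristics. All function names are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def cluster_points(points, centers, iterations=10):
    """Lloyd-style clustering; returns one list of points per center."""
    centers = [tuple(c) for c in centers]
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: squared_distance(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return groups

def bounding_box(group):
    """Axis-aligned box (min corner, max corner) enclosing a non-empty group."""
    mins = tuple(min(axis) for axis in zip(*group))
    maxs = tuple(max(axis) for axis in zip(*group))
    return mins, maxs

def box_features(group):
    """Side lengths of the box, an assumed stand-in for hexahedron characteristics."""
    mins, maxs = bounding_box(group)
    return tuple(hi - lo for lo, hi in zip(mins, maxs))
```

Per-video histograms could then be built by binning `box_features` over all clusters; how the paper actually defines those characteristics is not stated in the abstract.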
Multiband space time processing for torpedo alert sonar
Yang Chen, Anbang Zhao
A space-time processing technique using harmonic CW waves is introduced to enhance the moving-target detection performance of CW-based active towed sonar. The detection ability of CW and harmonic CW waves in a multi-path channel is analyzed comparatively. The simulation results indicate that the harmonic CW wave performs better in a multi-path channel.
Using motion correction to improve real-time cardiac MRI reconstruction
E. Bilgazyev, I. Uyanik, M. Unan, et al.
Cardiac gating or breath-hold MRI acquisition is challenging. In particular, data collected in a short amount of time might be insufficient for the diagnosis of patients with impaired breath-holding capabilities and/or arrhythmia. A major challenge in cardiac MRI is the motion of the heart itself, the pulsatile blood flow, and the respiratory motion. Furthermore, the up-and-down motion of the diaphragm in the chest is transferred to the heart when a patient breathes. As a result, artifacts arise from changes in signal intensity or phase as a function of time, resulting in blurry images. This paper describes a novel reconstruction strategy for real-time cardiac MRI that requires neither an electrocardiogram nor breath holding. In this research we focused on automating our proposed method and evaluating its performance on real-time MRI data to ensure a good basis for the signal extraction, which in turn assists the reconstruction. The proposed method enables one to extract cardiac beating waveforms directly from real-time cardiac MRI series collected from freely breathing patients without cardiac gating. Our method requires only minimal user involvement as an initialization step. Thereafter, the method follows the registered area in every frame and updates itself.
Local stereo matching using binary weighted normalized cross-correlation
Tong Liu, Liyan Qiao, Xiyuan Peng
Significant achievements have been attained in the field of dense stereo correspondence by local algorithms since the introduction of adaptive support weights by Yoon [1]. However, most algorithms suffer from photometric distortions and low-texture areas. In this paper, we present a novel stereo matching algorithm that is sensitive to low-texture changes within support windows while remaining insensitive to radiometric variations between the left and right images. The algorithm performs Normalized Cross-Correlation with a Binary Weighted support window (BWNCC), using the k-nearest neighbors algorithm to resolve boundary problems. We also propose to accelerate BWNCC with transform domain computation. Experimental results confirm that the proposed method is robust and has accuracy comparable to the state of the art.
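The combination of a binary support mask with normalized cross-correlation can be sketched as follows. How the paper derives its binary weights is not described in the abstract; this sketch assumes that the k window pixels closest in intensity to the center pixel receive weight 1 and all others 0. Function names are hypothetical, and windows are given as flat lists.

```python
import math

def knn_binary_weights(window, center_index, k):
    """Weight 1 for the k pixels closest in intensity to the center (assumed rule)."""
    center = window[center_index]
    order = sorted(range(len(window)), key=lambda i: abs(window[i] - center))
    chosen = set(order[:k])
    return [1 if i in chosen else 0 for i in range(len(window))]

def weighted_ncc(left, right, weights):
    """Normalized cross-correlation restricted to pixels with weight 1."""
    n = sum(weights)
    if n == 0:
        return 0.0
    mean_l = sum(w * a for w, a in zip(weights, left)) / n
    mean_r = sum(w * b for w, b in zip(weights, right)) / n
    cov = sum(w * (a - mean_l) * (b - mean_r)
              for w, a, b in zip(weights, left, right))
    var_l = sum(w * (a - mean_l) ** 2 for w, a in zip(weights, left))
    var_r = sum(w * (b - mean_r) ** 2 for w, b in zip(weights, right))
    if var_l == 0 or var_r == 0:
        return 0.0  # flat patch: correlation undefined, treated as no match
    return cov / math.sqrt(var_l * var_r)

if __name__ == "__main__":
    left = [10, 12, 11, 50, 13, 9, 52, 51, 12]   # 3x3 window, row-major
    right = [2 * v + 5 for v in left]            # gain and offset change
    mask = knn_binary_weights(left, center_index=4, k=5)
    print(mask, weighted_ncc(left, right, mask))
```

Because the means are subtracted and the result is divided by both standard deviations, a gain and offset change between the two images leaves the score unchanged, which is the radiometric insensitivity the abstract refers to.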
Sparse representation based face recognition using weighted regions
Emil Bilgazyev, E. Yeniaras, I. Uyanik, et al.
Face recognition is a challenging research topic, especially when the training (gallery) and recognition (probe) images are acquired using different cameras under varying conditions. Even a small noise or occlusion in the images can compromise the accuracy of recognition. Lately, sparse encoding based classification algorithms gave promising results for such uncontrollable scenarios. In this paper, we introduce a novel methodology by modeling the sparse encoding with weighted patches to increase the robustness of face recognition even further. In the training phase, we define a mask (i.e., weight matrix) using a sparse representation selecting the facial regions, and in the recognition phase, we perform comparison on selected facial regions. The algorithm was evaluated both quantitatively and qualitatively using two comprehensive surveillance facial image databases, i.e., SCface and MFPV, with the results clearly superior to common state-of-the-art methodologies in different scenarios. Publisher’s Note: This paper, originally published on 24 December 2013, was replaced with a revised version on 11 June 2014. If you downloaded the original PDF but are unable to access the revision, please contact SPIE Digital Library Customer Service for assistance.
Quality enhancement of low-resolution image by using natural images
E. Bilgazyev, E. Yeniaras, I. Uyanik, et al.
In this paper, we propose a new algorithm to estimate a super-resolution image from a given low-resolution image, by adding high-frequency information that is extracted from natural high-resolution images in the training dataset. The selection of the high-frequency information from the training dataset is accomplished in two steps: a nearest-neighbor search algorithm is used to select the closest images from the training dataset, which can be implemented in the GPU, and a sparse-representation algorithm is used to estimate a weight parameter to combine the high-frequency information of selected images. This simple but very powerful super-resolution algorithm can produce state-of-the-art results. Qualitatively and quantitatively, we demonstrate that the proposed algorithm outperforms existing common practices.
Securing palmprint authentication systems using spoof detection approach
Automated human authentication using features extracted from palmprint images has been studied extensively in the literature. The primary focus of the studies thus far has been the improvement of matching performance. As more biometric systems are deployed for a wide range of applications, the threat of impostor attacks on these systems is on the rise. The most common among the various types of attacks is the sensor-level spoof attack using fake hands created from different materials. This paper investigates an approach for securing palmprint-based biometric systems against spoof attacks that use photographs of the human hand to circumvent the system. The approach is based on the analysis of local texture patterns of acquired palmprint images for extracting discriminatory features. A trained binary classifier uses the discriminating information to determine whether the input image is of a real hand or a fake one. Experimental results, using 611 palmprint images corresponding to 100 subjects in the publicly available IITD palmprint image database, show that 1) palmprint authentication systems are highly vulnerable to spoof attacks and 2) the proposed spoof detection approach is effective at discriminating between real and fake image samples. In particular, the proposed approach achieves a best classification accuracy of 97.35%.
Application of discriminative models for interactive query refinement in video retrieval
Amit Srivastava, Saurabh Khanwalkar, Anoop Kumar
The ability to quickly search large volumes of video for specific actions or events can provide a dramatic new capability to intelligence agencies. Example-based queries from video are a form of content-based information retrieval (CBIR) where the objective is to retrieve clips from a video corpus, or stream, using a representative query sample to find more like it. Often, the accuracy of video retrieval is largely limited by the gap between the available video descriptors and the underlying query concept, and such exemplar queries return many irrelevant results along with relevant ones. In this paper, we present an Interactive Query Refinement (IQR) system which acts as a powerful tool to leverage human feedback and allow intelligence analysts to iteratively refine search queries for improved precision in the retrieved results. In our approach to IQR, we leverage discriminative models that operate on high-dimensional features derived from low-level video descriptors in an iterative framework. Our IQR model solicits relevance feedback on examples selected from the region of uncertainty and updates the discriminating boundary to produce a relevance-ranked results list. We achieved a 358% relative improvement in Mean Average Precision (MAP) over the initial retrieval list at a rank cutoff of 100 over 4 iterations. We compare our discriminative IQR model approach to a naïve IQR and show that our model-based approach yields a 49% relative improvement over the no-model naïve system.
DTCWT based high capacity steganography using coefficient replacement and adaptive scaling
N. Sathisha, R. Priya, K. Suresh Babu, et al.
Steganography is used for secure communication. In this paper we propose Dual Tree Complex Wavelet Transform (DTCWT) based high capacity steganography using coefficient replacement and adaptive scaling. The DTCWT is applied to the cover image and Lifting Wavelet Transform2 (LWT2) is applied to the payload to convert them from the spatial domain into the transform domain. A new concept is introduced: the HH sub-band coefficients of the DTCWT of the cover image are replaced by the LL sub-band coefficients of the payload to generate an intermediate stego object. An adaptive scaling factor based on the entropy of the cover image is used to scale down the coefficient values of the intermediate stego object and generate the final stego object. It is observed that capacity and security are increased in the proposed algorithm compared to existing algorithms.
Face recognition using transform domain texture features
Rangaswamy Y., Ramya S K, K B Raja, et al.
Face recognition is an efficient biometric approach to identifying a person. In this paper, we propose Face Recognition using Transform Domain Texture Features (FRTDTF). The face images are preprocessed and two sets of texture features are extracted. For the first feature set, the Discrete Wavelet Transform (DWT) is applied to the face image and only the high frequency sub band coefficients are considered, to extract edge information efficiently. The Dual Tree Complex Wavelet Transform (DTCWT) is applied to the high frequency sub bands of the DWT to derive low and high frequency DTCWT coefficients. The texture features of the DTCWT coefficients are computed using Overlapping Local Binary Pattern (OLBP) to generate feature set 1. For the second feature set, the DTCWT is applied to the preprocessed face image and all frequency sub band coefficients are considered, to extract significant information and edge information of the face image. The texture features of the DTCWT matrix are computed using OLBP to generate feature set 2. The final feature set is the concatenation of feature set 1 and feature set 2. The Euclidean distance (ED) is used to compare test image features with features of face images in the database. It is observed that the performance parameter values are better for the proposed algorithm than for existing algorithms.
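To make the texture-feature step more concrete, here is a minimal sketch of a basic 3x3 Local Binary Pattern histogram followed by Euclidean-distance matching. It is an assumption-laden stand-in: it applies plain LBP to raw pixel values rather than the overlapping variant (OLBP) on DTCWT coefficients described in the abstract, and the neighbour ordering and function names are illustrative choices only.

```python
import math

# Clockwise neighbour offsets starting at the top-left pixel (an arbitrary convention).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """8-bit LBP code of an interior pixel: each neighbour >= centre sets one bit."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_match(test_features, database):
    """Return the label of the database entry closest to the test features."""
    return min(database, key=lambda label: euclidean_distance(test_features, database[label]))
```

In such a pipeline the histogram of a test image would be compared against stored histograms with `nearest_match`, where `database` maps a (hypothetical) identity label to its feature vector.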
A new approach for vehicle color recognition based on specular-free image
Wei Hu, Jie Yang, Li Bai, et al.
Vehicle color recognition plays an important role in intelligent transportation systems. Most state-of-the-art methods roughly take all pixels into consideration, but many parts of cars, such as windows and wheels, contain no color information. These methods also do not work well enough in reducing the influence of sunlight. In this paper, we propose a novel approach that aims to estimate the RGB value of the car body rather than just classify the vehicle’s color, and that achieves state-of-the-art performance. We filter out the useless parts automatically and estimate the influence of sunlight on each pixel by introducing the specular-free image and the weighted-light-influence image. Experimental results demonstrate the performance of the proposed scheme in differentiating cars with very similar colors.
An analysis of inhibitory pseudo-interconnections in unsupervised neural networks
Minh-Triet Tran, Nam Do-Hoang Le
Lateral connection is a fundamental element of human neural networks which enables sparse learning and topographical order in feature maps. Due to its high complexity and computational cost, computer scientists tend to simplify it in practical implementations. To retain the simplicity of traditional networks while preserving the effects of interconnections, the authors employ numerical filters in unsupervised learning networks. These filters suppress low activations and decorrelate high ones, which is similar to how inhibitory lateral connections behave. Inhibitory networks outperform the conventional approach on both standard datasets, CIFAR-10 and STL-10. Our method also yields competitive results in comparison with other single-layer unsupervised networks. Furthermore, it is promising to apply inhibitory networks to deep learning systems for complex recognition problems.
An improved background subtraction approach in target detection and tracking
Hao Lai, Yuesheng Zhu, Zhenming Nong
In this paper, a novel background subtraction approach is proposed to avoid stationary foreground objects being merged into the background in target detection and tracking, in which an improved background model is designed by using virtual frames and the blur can be attenuated with this model when an object moves again after it stays for a long time. Moreover, the proposed model is fused with the eigenbackgrounds to improve the environmental adaptability. Our experimental results indicate that the proposed approach enhances the performance of target detection and tracking in intelligent surveillance and is superior to some state-of-the-art methods according to the precision-recall measurement.
Realtime hand detection system using convex shape detector in sequential depth images
Chung-Li Tai, Chia-Chang Li, Duan-Li Liao
In this paper, a real-time hand detection and tracking system is proposed. A calibrated stereo vision system is used to obtain disparity images, and real-world coordinates are obtained by geometric transformation. Unlike other pixel-based shape detectors, for which edge information is necessary, the proposed convex shape detector, which is based on real-world coordinates, is applied directly to depth images to detect hands regardless of distance. Around waving gesture recognition and simple hand tracking are also implemented in this work. The acceptable accuracy of the proposed system is examined in a verification process. Experimental results of hand detection and tracking prove the robustness and the feasibility of the proposed method.
Saliency based skin detection in complex scenes
Kashif Ahmad, Nasir Ahmad, Rehan Khan, et al.
Background clutter badly affects the performance of skin detection. In highly cluttered images, skin detection becomes more difficult and the algorithm cannot differentiate between skin and non-skin pixels. In this paper, we introduce a saliency algorithm for removing irrelevant information, especially skin-like regions, from the background of human images, in order to tackle the background cluttering problem and improve the performance of skin detection algorithms in images with complex backgrounds. Extensive experimentation on highly cluttered and complex images shows that the saliency algorithm enhances the performance of skin detection algorithms not only in terms of false positive rate but also in true positive rate, true negative rate, false negative rate, accuracy and precision.
Video geographic information system using mobile mapping in mobilephone camera
Jinsuk Kang, Jae-Joon Lee
The aim of this paper is to develop core technologies such as automatic shape extraction from images (video), spatial-temporal data processing, and efficient modeling, and thereby make it inexpensive and fast to build and process huge 3D geographic data. The upgrade and maintenance of the technologies are also easy due to the component-based system architecture. Therefore, we designed and implemented the Video mobile GIS using a real-time database system, which consisted of a real-time GIS engine, a middleware, and a mobile client.
Ground-based visual guidance in autonomous UAV landing
Yu Zhang, Lincheng Shen, Yirui Cong, et al.
Visual guidance has attracted more and more attention in the navigation field thanks to its accuracy and robustness. This paper presents a ground-based visual guidance system for autonomous Unmanned Aerial Vehicle (UAV) landing. The system consists of two cameras and pan-tilt units (PTU) mounted on both sides of the runway. In this system, computer vision is adopted for UAV detection and tracking. To be more specific, triangulation, a geometric method in binocular vision, is employed to calculate the 3D coordinates of the UAV in order to provide landing guidance parameters and finally achieve autonomous UAV landing. The 3D positioning principles adopted in ground-based measurement are simulated and verified. The results show that the accuracy can be achieved and relevant requirements are satisfied by ground-based visual guidance.
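The triangulation step can be sketched as follows, under simplifying assumptions that are not taken from the paper: each pan-tilt unit is reduced to a known position plus a pan and tilt angle pointing at the UAV, and the 3D position is estimated as the midpoint of the shortest segment between the two viewing rays. All names and the coordinate convention are hypothetical.

```python
import math


def ray_direction(pan, tilt):
    """Unit viewing direction for pan (about the z axis, from the x axis) and
    tilt (elevation above the x-y plane), both in radians."""
    return (math.cos(tilt) * math.cos(pan),
            math.cos(tilt) * math.sin(pan),
            math.sin(tilt))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are (nearly) parallel")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))
```

With noise-free angles the two rays intersect and the midpoint coincides with the true position; with measurement noise the midpoint is one simple compromise between the two rays.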
The decoding method based on wavelet image En vector quantization
Chun-yang Liu, Hui Li, Tao Wang
With the rapid progress of internet technology, large scale integrated circuits and computer technology, digital image processing technology has been greatly developed. Vector quantization plays a very important role in digital image compression. Compared with scalar quantization, it has the advantages of a higher compression ratio and a simple image decoding algorithm. Vector quantization, therefore, has been widely used in many practical fields. This paper combines the wavelet analysis method with the vector quantization En encoder and tests the result on standard images. The experimental results show a great improvement in PSNR compared with the LBG algorithm.
Determining noise performance of co-occurrence GMuLBP on object detection task
Nuh Alpaslan, Mehmet Murat Turhan, Davut Hanbay
Object detection is currently one of the most actively researched areas of computer vision, image processing and analysis. Image co-occurrence has shown significant performance on the object detection task because it considers the characteristics of objects and the spatial relationship between them simultaneously. CoHOG has achieved great success on different object detection tasks, especially human detection. However, CoHOG is sensitive to noise and does not consider gradient magnitude, which significantly affects object detection accuracy. To overcome these disadvantages, CoGMuLBP was proposed. CoGMuLBP uses a new statistical orientation assignment method based on uniform LBP instead of the common gradient orientation. In this study, the detection accuracies of CoGMuLBP and CoHOG are calculated on three different datasets with an NN classifier. In addition, to evaluate the noise performance of the methods, Gaussian noise was added to the test images and the performances were recalculated. Numerical experiments performed on three different datasets show that 1) CoGMuLBP has higher detection accuracy than CoHOG; 2) using uniform LBP based gradient orientation improves detection accuracy; and 3) CoGMuLBP is more robust to Gaussian noise and illumination changes. These results demonstrate the effectiveness of CoGMuLBP for object detection.
Completeness set proof of precondition and post-condition types of activity in any EPM
Qian Yu, Tong Li, JinZhuo Liu, et al.
A software evolution process model (EPM) is created in terms of a formal evolution process meta-model (EPMM) and a semi-formal approach to modeling based on EPMM [1]. In order to better manage and control the software evolution process and make the best of existing software technology, a method to transform any EPM into its execution model based on logic programming has been proposed. Completeness of the conversion depends on completeness of the rules, that is, every expression of the original model has a correspondence in the target model. Since the transformation rules are proposed based on the precondition or post-condition types of activities in any EPM, we need to prove that the activity type set in any EPM is a completeness set. To this end, the preconditions and post-conditions of activities in EPM are classified by analyzing all expressions in EPMs and the semantics of activity execution. The type completeness set of an activity’s precondition and post-condition is presented. Lastly, we prove by mathematical induction that the activity type set in any EPM is a completeness set.
Tangent bundle Manifold Learning for image analysis
A. P. Kuleshov, A. V. Bernstein
Image applications require additional special features of Manifold Learning (ML) methods. To deal with some of these features, we introduce an amplification of ML, called Tangent Bundle ML (TBML), in which proximity is required not only between the original Data manifold and the data-based Reconstructed manifold but also between their tangent spaces. We present a new geometrically motivated Grassmann and Stiefel Eigenmaps method for TBML, which also gives a new solution for ML.
Similarity measures for pattern matching on-the-fly
Recently, we presented a new OCR-concept [1] for historic prints. The core part is the glyph recognition based on pattern matching with patterns that are derived from computer font glyphs and are generated on-the-fly. The classification of a sample is organized as a search process for the most similar glyph pattern. In this paper, we investigate several similarity measures which are of vital importance for this concept.
Simulation analysis and design on the structure of electromagnetic dumping device
Xiao-ning Chen, Bin Zhang, Yong Geng, et al.
The paper first introduces the development of the dumping device. In accordance with the existing parameters of the dumping device, it puts forward a new way of utilizing electromagnetic force to provide power to the dumping device. It analyzes the three classic approaches to electromagnetic launch technology and selects coil electromagnetic emission for the structural design of the electromagnetic dumping device. Finally, COMSOL Multiphysics software is used to simulate the initial launch position, which is the basic structural parameter of the electromagnetic dumping device, and the structural setting of the device is optimized.
Gender classification from neutral and expressive faces
Yasmina Andreu, Pedro García-Sevilla, Ramón A. Mollineda
This paper presents a statistical study of local vs. global approaches for classifying gender from neutral and expressive faces. A cross-dataset evaluation is provided by using different training and test face databases, as well as several well-known classifiers (1-NN, PCA+LDA and SVM) and widely used features for facial description. Three statistical tests have proved that local approaches are more suitable than global ones for solving gender classification problems over expressive faces when training with non-expressive faces. However, if a large set of expressive faces is available for training, global solutions outperform local ones.
Fast optical flow estimation based on multi-grid
Xiuzhi Li, Songmin Jia, Jun Tan, et al.
Estimation efficiency is one of the key topics in computationally intense optical flow algorithms. Traditional numerical iterative methods are effective at eliminating the high frequency components of the estimation error, while leaving most of the low frequency components unchanged. In this paper, we consider a multi-grid based real-time implementation of dense optical flow computation using the classical Horn-Schunck model. For this purpose, the construction of the linear system of equations required by the linear multi-grid model is carefully studied, and the overall multi-grid framework is presented. The efficiency and effectiveness of the proposed algorithm are validated by experimental results.
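The smoothing behaviour that motivates multi-grid can be seen on a small model problem. The sketch below is not the Horn-Schunck system from the paper; it assumes a 1D Laplace equation with zero boundary values and a weighted Jacobi smoother, and measures how much of a low-frequency and a high-frequency error mode survives a few sweeps. The grid size, weight and function names are illustrative choices.

```python
import math


def jacobi_sweeps(error, sweeps, omega=2.0 / 3.0):
    """Apply weighted Jacobi sweeps for the 1D Laplace equation to an error vector.

    The end points are boundary values and are left untouched.
    """
    e = list(error)
    for _ in range(sweeps):
        prev = e[:]
        for i in range(1, len(e) - 1):
            neighbour_mean = 0.5 * (prev[i - 1] + prev[i + 1])
            e[i] = (1 - omega) * prev[i] + omega * neighbour_mean
    return e


def sine_mode(k, n):
    """Error mode sin(k*pi*i/n) sampled on grid points i = 0 .. n."""
    return [math.sin(k * math.pi * i / n) for i in range(n + 1)]


def max_norm(v):
    return max(abs(x) for x in v)


def damping(k, n=64, sweeps=10):
    """Fraction of mode k that remains after the given number of sweeps."""
    e0 = sine_mode(k, n)
    return max_norm(jacobi_sweeps(e0, sweeps)) / max_norm(e0)


print("low frequency  (k=1): ", damping(1))
print("high frequency (k=48):", damping(48))
```

The low-frequency mode is barely reduced while the high-frequency mode is almost eliminated, which is why multi-grid schemes move the remaining smooth error to a coarser grid, where it appears as a higher-frequency component.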
Conditional fault-tolerant cycles in folded hypercubes with faulty elements
Jian-Wei Zheng, Da-chang Guo, Ri-Fei Liang
As an attractive variation of the hypercube $Q_n$, the $n$-dimensional folded hypercube $FQ_n$ can be obtained by adding $2^{n-1}$ complementary edges between the vertices of the hypercube. Let $FF_v$ (respectively, $FF_e$) denote the set of faulty vertices (respectively, faulty edges) in an $n$-dimensional folded hypercube. In this paper, we prove that $FQ_n - FF_v - FF_e$ contains a fault-free cycle of length at least $2^n - |FF_v|$ if $FQ_n$ satisfies both of the following constraints: (1) each vertex in $FQ_n$ is incident to at least two fault-free edges, and (2) $|FF_e| + |FF_v| \le 2n - 3$ when $n \ge 4$.
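The construction of the folded hypercube can be checked with a few lines of code. This sketch labels vertices with the integers 0 to 2^n - 1 (an assumed encoding), joins vertices that differ in one bit to form the hypercube, and joins each vertex to its bitwise complement to form the complementary edges. It covers only the graph construction, not the fault-tolerant cycle result.

```python
def folded_hypercube_edges(n):
    """Return (hypercube_edges, complementary_edges) of the n-dimensional
    folded hypercube, with vertices labelled 0 .. 2**n - 1."""
    size = 1 << n
    mask = size - 1  # all n bits set
    cube, comp = set(), set()
    for v in range(size):
        for i in range(n):
            u = v ^ (1 << i)  # neighbour differing in bit i
            if v < u:
                cube.add((v, u))
        u = v ^ mask  # bitwise complement of v
        if v < u:
            comp.add((v, u))
    return cube, comp


cube, comp = folded_hypercube_edges(4)
print(len(cube), len(comp))  # 4 * 2**3 hypercube edges and 2**3 complementary edges
```

For n at least 2 the two edge sets are disjoint, so every vertex has degree n + 1.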
Exploiting context in kernel-mapping recommender system algorithms
Mustansar Ali Ghazanfar, Adam Prügel-Bennett
Making effective recommendations from a domain consisting of millions of ratings is a major research challenge in the application of machine learning. Kernel Mapping Recommender (KMR) algorithms have been proposed that provide state-of-the-art performance. In this paper, we show how context information can be added to KMR algorithms. We consider the trusted friends of a user as their social context and show how this information can be used to provide more personalised, refined, and trustworthy recommendations. The limited set of friends, however, restricts the amount of data available to create useful recommendations. This paper sheds light on this issue and specifically on the number of friends necessary to get satisfactory recommendations.
A relay selection algorithm for radio and television services based on time-delay and bandwidth
Chaoyi Zhang, Muqing Wu, Linlin Luan, et al.
This paper presents a relay routing method for radio and TV services. Using each node’s time-delay and power information, the method obtains the value of the system interrupt decision and uses it as a decision threshold to select the relay node. Taking link priorities and fairness into consideration, we also design a relay routing protocol that can dynamically change the route when the network changes. Simulation results show that this protocol can expand coverage, reduce communication blind spots, increase system throughput and enhance the quality of service.
Study on dynamic services composition of web services based on BPEL
Jinyue Gao, Fei Huang, Gongxuan Zhang
Starting from "service", the core concept of SOA (Service-Oriented Architecture), service composition is discussed in detail. Based on service relationship network modeling, a dynamic service composition approach based on the Business Process Execution Language (BPEL) is proposed in this paper. Two concepts, service agent and service quality, are also described, which enable dynamic execution of the service process.
Automatic music genres classification as a pattern recognition problem
Ihtisham Ul Haq, Fauzia Khan, Sana Sharif, et al.
Music genres are the simplest and most effective descriptors for searching music libraries, stores or catalogues. The paper compares the results of two automatic music genre classification systems implemented using two different yet simple classifiers (K-Nearest Neighbor and Naïve Bayes). First a 10-12 second sample is selected and features are extracted from it; then, based on those features, the results of both classifiers are presented in the form of an accuracy table and a confusion matrix. An experiment carried out on 60 test samples shows that a sample taken from the middle of a song represents the true essence of its genre better than samples taken from the beginning or end of the song. The techniques achieved an accuracy of 91% and 78% using the Naïve Bayes and KNN classifiers respectively.
Predicting performance interference of application in virtualized environments
Yu Dai, Lei Yang, Hexu Xing, et al.
This paper proposes a method for predicting the performance interference of applications in a virtualized environment. In this method, we first analyze the relationship between the performance interference degree and the system-level workloads, and based on this we propose a linear regression algorithm to model the relationship between the performance interference degree and the system-level workloads, using historical data about the performance interference degree as the training data set. For applications without historical data about the performance interference degree, we develop a method for predicting the performance interference by clustering the available models of performance interference and matching the workload pattern of the application against the workload patterns of the available models, in order to generate the performance interference model for the application whose performance interference is to be predicted. By use of the available models, the performance interference of the application can be predicted without historical data about the performance interference among the applications co-located on the same physical host. The experiments show the effectiveness of the proposed measurement and prediction methods for the performance interference among virtual machines.
A combined SIFT/SURF descriptor for automatic face recognition
Ladislav Lenc, Pavel Král
This paper deals with Automatic Face Recognition (AFR). A novel approach which combines the SIFT and SURF features for the face representation is proposed. The obtained combined SIFT/SURF descriptor is then used for face comparison by the adapted Kepenekci matching method. The proposed method is evaluated on the FERET and CTK corpora. The obtained recognition rates are 98.4% and 64.6% respectively. These recognition scores show that our approach outperforms significantly all other methods on these corpora. The differences between recognition error rates of the proposed approach and the second best one are 41% and 7% in relative value respectively.
Classifying imbalanced data using an SVM ensemble with k-means clustering in semiconductor test process
Eun-mi Park, Jee-Hyong Lee
In the semiconductor manufacturing process, it is important to predict defective chips in advance for reduction of test cost and early stabilization of the production process. However, highly imbalanced datasets in the semiconductor test process degrade the performance of prediction. In order to enhance an SVM Ensemble, this study presents an improved methodology using the K-means, which clusters the majority class and the minority class before training an SVM. A result of the experiment with the actual data of the semiconductor test process is reported to demonstrate that our approach outperforms other methods in terms of classifying the imbalanced dataset.
Design of personalized search engine based on user-webpage dynamic model
Jihan Li, Shanglin Li, Yingke Zhu, et al.
Personalized search engine focuses on establishing a user-webpage dynamic model. In this model, users' personalized factors are introduced so that the search engine is better able to provide the user with targeted feedback. This paper constructs user and webpage dynamic vector tables, introduces singular value decomposition analysis in the processes of topic categorization, and extends the traditional PageRank algorithm.
An effective self-assessment based on concept map extraction from test-sheet for personalized learning
Keng-Hou Liew, Yu-Shih Lin, Yi-Chun Chang, et al.
Examination is a traditional way to assess learners’ learning status, progress and performance after a learning activity. Besides the test grade, a test sheet hides some implicit information such as test concepts and their relationships, importance, and prerequisites. This implicit information can be extracted and used to construct a concept map by considering that (1) test concepts covered in the same question have strong relationships, and (2) test concepts covered by questions in the same test sheet are related. Concept maps have been successfully employed in many studies to help instructors and learners organize relationships among concepts. However, concept map construction depends on experts, who need to spend effort and time organizing the domain knowledge. In addition, previous research on automatic concept map construction is limited to considering all learners of a class, and has not considered personalized learning. To cope with this problem, this paper proposes a new approach to automatically extract and construct a concept map based on the implicit information in a test sheet. Furthermore, the proposed approach can also help learners with self-assessment and self-diagnosis. Finally, an example is given to illustrate the effectiveness of the proposed approach.
Corpus analysis and automatic detection of emotion-including keywords
Bo Yuan, Xiangqing He, Ying Liu
Emotion words play a vital role in many sentiment analysis tasks. Previous research uses sentiment dictionaries to detect the subjectivity or polarity of words. In this paper, we dive into Emotion-Inducing Keywords (EIK), which refer to the words in use that convey emotion. We first analyze an emotion corpus to explore the pragmatic aspects of EIK. Then we design an effective framework for automatically detecting EIK in sentences by utilizing linguistic features and context information. Our system dramatically outperforms traditional dictionary-based methods in Precision, Recall and F1-score.
On the Laplacian model for particle-based simulation using Moving-Particle Semi-implicit (MPS) Method
Khai-Ching Ng, Tony Wen-Hann Sheu
A general form of Laplacian model is derived for the numerical framework of Moving Particle Semi-implicit (MPS) method. The existing proposals of MPS Laplacian model available in the open literature can indeed be reproduced from this general Laplacian model. Most importantly, the numerical accuracy of the evaluated Laplacian term particularly on the irregular particle layout can be further improved by adjusting the tuning parameter introduced in the general Laplacian model.
SPMBR: a scalable algorithm for mining sequential patterns based on bitmaps
Xiwei Xu, Changhai Zhang
Some existing sequential pattern mining algorithms generate too many candidate sequences and thus increase the processing cost of support counting. Therefore, we present an effective and scalable algorithm called SPMBR (Sequential Patterns Mining based on Bitmap Representation) to solve the problem of mining sequential patterns in large databases. Our method differs from previous related work on mining sequential patterns. The main difference is that the database of sequential patterns is represented by bitmaps, and a simplified bitmap structure is presented first. The algorithm first generates candidate sequences by SE (Sequence Extension) and IE (Item Extension), and then obtains all frequent sequences by comparing the original bitmap and the extended item bitmap. This method simplifies the problem of mining sequential patterns and avoids the high processing cost of support counting. Both theory and experiments indicate that SPMBR has superior performance on large transaction databases, that the memory required for storing temporal data during the mining process is much smaller, and that all sequential patterns can be mined.
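Bitmap-based support counting with sequence extension and item extension can be sketched as below. This is a generic illustration in the style of earlier bitmap miners, not the simplified bitmap structure of SPMBR itself, whose details are not given in the abstract. It assumes each sequence is a list of itemsets, and stores one Python integer per sequence in which bit j marks the j-th itemset; all function names are hypothetical.

```python
def item_bitmaps(database):
    """Map each item to a list of bitmaps, one per sequence.

    Bit j of the bitmap for sequence s is set if itemset j of s contains the item.
    """
    bitmaps = {}
    for s, sequence in enumerate(database):
        for j, itemset in enumerate(sequence):
            for item in itemset:
                bitmaps.setdefault(item, [0] * len(database))[s] |= 1 << j
    return bitmaps


def sequence_extend(prefix, item_bm, lengths):
    """SE step: the item must occur in an itemset strictly after the first
    position at which the prefix pattern ends."""
    result = []
    for p, b, length in zip(prefix, item_bm, lengths):
        if p == 0:
            result.append(0)
            continue
        first = (p & -p).bit_length() - 1  # index of the lowest set bit
        later = ((1 << length) - 1) & ~((1 << (first + 1)) - 1)
        result.append(later & b)
    return result


def item_extend(prefix, item_bm):
    """IE step: the item must occur in the same itemset where the prefix ends."""
    return [p & b for p, b in zip(prefix, item_bm)]


def support(bitmap):
    """Number of sequences in which the pattern occurs at least once."""
    return sum(1 for b in bitmap if b)


# Toy database of three sequences (made-up data).
db = [[{"a", "b"}, {"c"}, {"a"}],
      [{"a"}, {"b", "c"}],
      [{"c"}, {"a"}]]
lengths = [len(seq) for seq in db]
bm = item_bitmaps(db)
print(support(sequence_extend(bm["a"], bm["c"], lengths)))  # pattern <(a)(c)>
print(support(item_extend(bm["a"], bm["b"])))               # pattern <(a b)>
```

Support counting reduces to bitwise operations on the stored bitmaps, so candidate patterns can be evaluated without rescanning the sequence database.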