Proceedings Volume 9875

Eighth International Conference on Machine Vision (ICMV 2015)

Antanas Verikas, Petia Radeva, Dmitry Nikolaev

Volume Details

Date Published: 29 December 2015
Contents: 11 Sessions, 82 Papers, 0 Presentations
Conference: Eighth International Conference on Machine Vision 2015
Volume Number: 9875

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9875
  • Image Transform and Analysis
  • Image Segmentation
  • Image Detection and Pattern Recognition
  • Medical Image Processing
  • Image Processing and Application
  • Computer Vision and Visualization
  • Signal Analysis and Processing
  • Communication and Information System
  • Computer Theory and Application
  • Mechanical Control and System
Front Matter: Volume 9875
Front Matter: Volume 9875
This PDF file contains the front matter associated with SPIE Proceedings Volume 9875, including the Title Page, Copyright information, Table of Contents, Introduction (if any), and Conference Committee listing.
Image Transform and Analysis
A new dehazing algorithm based on overlapped sub-block homomorphic filtering
Lu Yu, Xuebin Liu, Guizhong Liu
Since images captured under hazy weather conditions are blurred, a new dehazing algorithm based on overlapped sub-block homomorphic filtering in HSV color space is proposed. First, the hazy image is transformed from RGB to HSV color space. Second, the luminance component V is processed with overlapped sub-block homomorphic filtering. Finally, the processed image is converted from HSV back to RGB color space, yielding the dehazed image. Following the established algorithm model, the dehazed images are evaluated with six objective parameters: mean value, standard deviation, entropy, average gradient, edge intensity, and contrast. The experimental results show that the algorithm has a good dehazing effect: it not only mitigates image degradation but also amplifies image details and effectively enhances image contrast.
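As a rough illustration of the pipeline described in this abstract, the sketch below applies a single global homomorphic filter to the V channel in HSV space; the paper uses overlapped sub-blocks, and the filter gains and cutoff here are illustrative assumptions.

```python
import cv2
import numpy as np

def dehaze_homomorphic(bgr, gamma_l=0.6, gamma_h=1.4, d0=30.0):
    # RGB (BGR in OpenCV) -> HSV, filter only the luminance channel V.
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    v = hsv[:, :, 2] / 255.0
    log_v = np.log1p(v)                        # multiplicative model -> additive
    F = np.fft.fftshift(np.fft.fft2(log_v))
    rows, cols = v.shape
    u = np.arange(rows) - rows / 2
    w = np.arange(cols) - cols / 2
    D2 = u[:, None] ** 2 + w[None, :] ** 2     # squared distance from DC
    # High-emphasis Gaussian homomorphic filter: attenuate low, boost high freq.
    H = (gamma_h - gamma_l) * (1 - np.exp(-D2 / (2 * d0 ** 2))) + gamma_l
    filtered = np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))
    v_out = np.expm1(filtered)
    hsv[:, :, 2] = np.clip(v_out * 255.0, 0, 255)
    # HSV -> BGR gives the dehazed image.
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
```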
Determination of mango fruit from binary image using randomized Hough transform
Mohamed Rizon, Nurul Ain Najihah Yusri, Mohd Fadzil Abdul Kadir, et al.
A method for detecting mango fruit in an RGB input image is proposed in this research. The input image is processed into a binary image using texture analysis and morphological operations (dilation and erosion). The Randomized Hough Transform (RHT) is then used to find the best ellipse fit for each binary region. Through the texture analysis, the system can detect mango fruits that partially overlap each other or are partially occluded by leaves; the combination of texture analysis and morphological operators isolates such fruits. The parameters derived from the RHT were used to calculate the center of each ellipse, which serves as the gripping point for a fruit-picking robot. As a result, the detection rate reached 95% for fruit that is partially overlapped or partially covered by leaves.
A method of periodic pattern localization on document images
Timofey S. Chernov, Dmitry P. Nikolaev, Vitali M. Kliatskine
Periodic patterns are often present on document images as holograms, watermarks or guilloche elements, which are mostly used for fraud protection. Localizing such patterns lets an embedded OCR system vary its settings depending on pattern presence in particular image regions, and improves the precision of pattern removal so as to preserve as much useful data as possible. Many noise detection and removal methods for document images deal with unstructured noise or clutter on documents with simple backgrounds. In this paper we propose a method for periodic pattern localization on document images, based on the discrete Fourier transform, that works well on documents with complex backgrounds.
An evaluation of popular hyperspectral images classification approaches
Andrey Kuznetsov, Vladislav Myasnikov
This work addresses the problem of selecting the best hyperspectral image classification algorithm. The following algorithms are compared: a decision tree using full cross-validation; the C4.5 decision tree; a Bayesian classifier; the maximum-likelihood method; an MSE-minimization classifier, including a special case, classification by conjugation; spectral angle classifiers (using the empirical mean and the nearest neighbor); a spectral mismatch classifier; and the support vector machine (SVM). Experiments are conducted on AVIRIS and SpecTIR hyperspectral images.
In search of a new initialization of K-means clustering for color quantization
Color quantization is still an important auxiliary operation in the processing of color images. K-means clustering (KM), when used to quantize color, requires appropriate initialization. In this paper, we propose combined KM methods that are initialized with the results of well-known quantization algorithms such as Wu's, NeuQuant (NQ) and Neural Gas (NG). This approach, assessed by three quality indices (PSNR, ΔE and ΔM), improves the results. Experimental results of such combined quantization indicate that the deterministic Wu+KM and the random NG+KM approaches lead to the best quantized images.
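A minimal sketch of this combined initialization idea, using Pillow's default median-cut quantizer as a stand-in for Wu's/NQ/NG to produce the initial K-means palette; the paper's specific quantizers and quality indices are not reproduced here.

```python
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

def quantize_combined(path, n_colors=16):
    img = Image.open(path).convert("RGB")
    pixels = np.asarray(img, dtype=np.float64).reshape(-1, 3)
    # Coarse palette from a classic quantizer (Pillow's median cut) as KM init.
    coarse = img.quantize(colors=n_colors)
    palette = np.array(coarse.getpalette()[: 3 * n_colors],
                       dtype=np.float64).reshape(-1, 3)
    # K-means refines the coarse palette (n_init=1 since init is explicit).
    km = KMeans(n_clusters=n_colors, init=palette, n_init=1).fit(pixels)
    labels = km.predict(pixels)
    out = km.cluster_centers_[labels].reshape(np.asarray(img).shape)
    return Image.fromarray(np.uint8(np.clip(out, 0, 255)))
```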
Locally isometric and conformal parameterization of image manifold
A. V. Bernstein, A. P. Kuleshov, Yu. A. Yanovich
Images can be represented as vectors in a high-dimensional Image space whose components specify light intensities at image pixels. To avoid the 'curse of dimensionality', the original high-dimensional image data are transformed into lower-dimensional features preserving certain subject-driven data properties. These properties can include 'information preservation' when the constructed low-dimensional features are used instead of the original high-dimensional vectors, as well as preservation of the distances and angles between the original high-dimensional image vectors. Under the commonly used Manifold assumption that the high-dimensional image data lie on or near a certain unknown low-dimensional Image manifold embedded in an ambient high-dimensional 'observation' space, constructing the lower-dimensional features amounts to constructing an Embedding mapping from the Image manifold to the Feature space, which in turn determines a low-dimensional parameterization of the Image manifold. We propose a new geometrically motivated Embedding method which constructs a low-dimensional parameterization of the Image manifold and provides the information-preserving property as well as the locally isometric and conformal properties.
Nonlinear mapping methods with adjustable computational complexity for hyperspectral image analysis
E. V. Myasnikov
Nonlinear mapping (Sammon mapping) is a well-known dimensionality reduction technique. Recently, several nonlinear mapping methods with reduced computational complexity have been proposed, but they do not provide flexible control over computational complexity. In this paper a nonlinear mapping method with adjustable computational complexity is proposed. The proposed method is based on hierarchical decomposition of the multidimensional space, priority queues, and a simple optimization procedure, providing a fast and flexible dimensionality reduction process. The proposed method is compared to an alternative one based on stochastic optimization. The experiments are carried out on well-known hyperspectral images. The studied methods are evaluated in terms of data mapping error and runtime. Experimental results for both two- and three-dimensional output spaces are presented.
Fast Hough transform analysis: pattern deviation from line segment
E. Ershov, A. Terekhin, D. Nikolaev, et al.
In this paper, we analyze properties of dyadic patterns. These patterns were proposed to approximate line segments in the fast Hough transform (FHT). Initially, these patterns had only a recursive computational scheme. We provide a simple closed-form expression for calculating point coordinates and their deviation from the corresponding ideal lines.
On evaluation of depth accuracy in consumer depth sensors
Azim Zaliha Abd Aziz, Hong Wei, James Ferryman
This paper presents an experimental study of different depth sensors. The aim is to answer the question of whether these sensors give accurate data for general depth image analysis. The study examines the depth accuracy of three popularly used depth sensors: the ASUS Xtion Pro Live, the Kinect for Xbox 360 and the Kinect for Windows v2. The main focus is the stability of pixels in depth images captured at several different sensor-object distances, measured as the depth returned by the sensors within specified time intervals. The experimental results show that the fluctuation (in mm) of randomly selected pixels within the target area increases with increasing distance to the sensor, especially for the Kinect for Xbox 360 and the ASUS Xtion Pro Live. Both of these sensors exhibit pixel fluctuations between 20 mm and 30 mm at sensor-object distances beyond 1500 mm. However, the pixel stability of the Kinect for Windows v2 is not much affected by the distance between the sensor and the object: its maximum fluctuation across all selected pixels is approximately 5 mm at sensor-object distances between 800 mm and 3000 mm. The best stability is therefore achieved within the sensors' optimal distance ranges.
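A hedged sketch of the stability measurement protocol described here: grab a sequence of depth frames of a static target and compute the temporal fluctuation of randomly selected pixels. `grab_depth_frame` is a hypothetical placeholder for the sensor SDK call, and the frame counts are illustrative.

```python
import numpy as np

def pixel_fluctuation(grab_depth_frame, n_frames=100, n_pixels=10, seed=0):
    # Stack N depth frames of a static scene: shape (N, H, W), values in mm.
    frames = np.stack([grab_depth_frame() for _ in range(n_frames)])
    rng = np.random.default_rng(seed)
    ys = rng.integers(0, frames.shape[1], n_pixels)
    xs = rng.integers(0, frames.shape[2], n_pixels)
    # Temporal standard deviation (mm) of each selected pixel across frames.
    return frames[:, ys, xs].std(axis=0)
```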
Image Segmentation
Moving cast shadow resistant for foreground segmentation based on shadow properties analysis
Hao Zhou, Yun Gao, Guowu Yuan, et al.
Moving object detection is a fundamental task in machine vision applications, and detection of moving cast shadows is one of the major concerns for accurate video segmentation. Since detected moving object areas often contain shadow points, errors in measurement, localization, segmentation, classification and tracking may arise. A novel shadow elimination algorithm is proposed in this paper. A set of suspected moving object areas is detected with an adaptive Gaussian approach, a model is established based on analysis of shadow optical properties, and shadow regions are discriminated from the set of moving pixels using the properties of brightness, chromaticity and texture in sequence.
A new interactive algorithm for image segmentation
In this paper, a new interactive algorithm for image segmentation is proposed. First, threshold segmentation is applied to the original image to obtain the corresponding foreground and background images. In the foreground image, some manually selected contour pixels of the pattern to be segmented become the initial seed points; the set of seed points is denoted E. Then the distances between each pixel in the foreground image and each seed point in E are computed. If the minimum distance for a pixel is less than a threshold, that pixel is labeled as a seed point and added to E. This continues until all pixels in the foreground image have been processed; the seed points in the final set E compose the segmented image. Finally, the effectiveness of the proposed algorithm is demonstrated by simulation.
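A direct, unoptimized sketch of the described seed-growing loop; purely spatial distances are assumed, while the paper's distance definition may also involve intensity.

```python
import numpy as np

def grow_from_seeds(binary_fg, seeds, dist_thresh=3.0):
    """binary_fg: 2D bool array from thresholding; seeds: iterable of (row, col)."""
    E = [tuple(s) for s in seeds]                 # current seed set
    remaining = {tuple(p) for p in np.argwhere(binary_fg)} - set(E)
    changed = True
    while changed and remaining:
        changed = False
        seed_arr = np.array(E, dtype=float)
        for p in list(remaining):
            # Minimum spatial distance from pixel p to the current seed set.
            d = np.min(np.hypot(seed_arr[:, 0] - p[0], seed_arr[:, 1] - p[1]))
            if d < dist_thresh:
                E.append(p)
                remaining.discard(p)
                seed_arr = np.array(E, dtype=float)  # grow the seed set
                changed = True
    mask = np.zeros_like(binary_fg, dtype=bool)
    rows, cols = zip(*E)
    mask[list(rows), list(cols)] = True
    return mask
```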
A variable parameter parametric snake method
A. Marouf, A. Houacine
In this paper, we introduce a new approach to the parametric snake method that uses variable snake parameters. Adopting fixed parameter values for all points of the snake, as is usual, is by itself a limitation that leads to poor performance in terms of convergence and tracking properties. A better choice is one that allows the parameters to vary with local image region properties as well as with contour shape and position. However, such variability is not an easy task in general, and a precise method needs to be defined to ensure contour-point-dependent tuning across iterations. We were particularly interested in applying this idea to the recently presented parametric method [1], in which an attraction term is used to improve the convergence of the standard parametric snake without a significant increase in computational load. We show here that improved performance can ensue from applying the variable parameter concept. For this purpose, the method is first analyzed and a procedure is then developed to ensure automatic variable parameter tuning. The interest of our approach is illustrated through object segmentation results.
Image Detection and Pattern Recognition
3D fast wavelet network model-assisted 3D face recognition
Salwa Said, Olfa Jemai, Mourad Zaied, et al.
In recent years, 3D shape has emerged in face recognition due to its robustness to pose and illumination changes. These attractive benefits do not remove all the challenges to achieving a satisfactory recognition rate; other challenges, such as facial expressions and the computing time of matching algorithms, remain to be explored. In this context, we propose a 3D face recognition approach using 3D wavelet networks. Our approach contains two stages: a learning stage and a recognition stage. For training, we propose a novel algorithm based on the 3D fast wavelet transform. From the 3D coordinates of the face (x, y, z), we perform voxelization to obtain a 3D volume, which is decomposed by the 3D fast wavelet transform and then modeled with a wavelet network; the associated weights are taken as the feature vector representing each training face. In the recognition stage, a face of unknown identity is projected onto all the training wavelet networks, yielding a new feature vector after each projection, and a similarity score is computed between the old and the obtained feature vectors. To show the efficiency of our approach, experiments were performed on the full FRGC v2 benchmark.
Hand posture recognizer based on separator wavelet networks
Tahani Bouchrika, Olfa Jemai, Mourad Zaied, et al.
This paper presents a novel hand posture recognizer based on separator wavelet networks (SWNs). Aiming to create a robust and rapid hand posture recognizer, we contribute a new training algorithm for the fast-wavelet-transform-based wavelet network classifier (FWN), which reduces the number of WNs modeling the training data. To do so, inspired by the AdaBoost feature selection method, we create SWNs (n−1 WNs for n classes) instead of modeling each training sample by its own wavelet network (WN). The new training algorithm positively influences the recognition phase, which becomes more rapid thanks to the reduced number of comparisons between test image WNs and training WNs. Comparisons with other works employing universal hand posture datasets are presented and discussed. The results obtained show that the new hand posture recognizer is comparable to previously established ones.
Robust head pose estimation using locality-constrained sparse coding
Hyunduk Kim, Sang-Heon Lee, Myoung-Kyu Sohn
The sparse coding (SC) method has been shown to deliver successful results in a variety of computer vision applications. However, it does not consider the underlying structure of the data in the feature space. On the other hand, locality-constrained linear coding (LLC) utilizes a locality constraint to project each input datum into its local coordinate system. Based on the recent success of LLC, we propose a novel locality-constrained sparse coding (LSC) method to overcome the limitation of SC. In experiments, the proposed algorithms were applied to head pose estimation. Experimental results demonstrate that the LSC method outperforms state-of-the-art methods.
A region finding method to remove the noise from the images of the human hand gesture recognition system
The performance of human hand gesture recognition systems depends on the quality of the images presented to the system. Since these systems work in real-time environments, the images may be corrupted by environmental noise; removing this noise enhances system performance. Various noise removal methods have been presented in previous research, but each has its own limitations. We present a region-finding method for dealing with environmental noise that gives better results and enhances the performance of human hand gesture recognition systems, improving their recognition rate.
Implementation of age and gender recognition system for intelligent digital signage
Sang-Heon Lee, Myoung-Kyu Sohn, Hyunduk Kim
Intelligent digital signage systems transmit customized advertising and information by analyzing users and customers, unlike existing systems that present advertising as a broadcast without regard to customer type. Development of intelligent digital signage systems is currently being pushed forward vigorously. Although there are many different methods for analyzing customers, in this study we designed a system capable of analyzing the gender and age of customers based on images obtained from a camera. We conducted age and gender recognition experiments using a public database. The experiments were performed with a histogram matching method on Local Binary Pattern (LBP) features extracted after the facial area in the input image was normalized. The results showed that the gender recognition rate was as high as approximately 97% on average. Age recognition was conducted based on categorization into 5 age classes; age recognition rates for women and men were about 67% and 68%, respectively, when conducted separately for each gender.
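A hedged sketch of the LBP-plus-histogram-matching step; the LBP radius, the number of sampling points and the chi-square distance are assumptions, as the abstract does not specify them.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(face_gray, P=8, R=1):
    # Uniform LBP over a normalized grayscale face crop; P+2 histogram bins.
    lbp = local_binary_pattern(face_gray, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def chi_square(h1, h2, eps=1e-10):
    # Common histogram-matching distance for LBP descriptors.
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def classify(face_gray, class_templates):
    """class_templates: dict mapping label (e.g., age class) -> template histogram."""
    h = lbp_histogram(face_gray)
    return min(class_templates, key=lambda c: chi_square(h, class_templates[c]))
```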
Weighting video information into a multikernel SVM for human action recognition
Jordi Bautista-Ballester, Jaume Vergés-Llahí, Domenec Puig
Action classification using a Bag of Words (BoW) representation has shown computational simplicity and good performance, but the increasing number of categories, including actions with high confusion, and the addition of significant contextual information have led most authors to focus their efforts on combining image descriptors. In this approach we code the action videos using a BoW representation with diverse image descriptors and feed them to an SVM whose optimal kernel is a weighted linear combination of learned single kernels. Experiments have been carried out on the HMDB action database, and the gain achieved with our approach over the state of the art reaches an improvement of 14.63% in accuracy.
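A minimal sketch of the weighted multikernel construction; learning the per-descriptor weights (e.g., by cross-validation) is left out, and the chi-square kernel choice with fixed weights is an assumption.

```python
import numpy as np
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

def train_multikernel(bow_per_descriptor, y, weights, gamma=1.0):
    """bow_per_descriptor: list of (n_samples, n_words) non-negative BoW arrays,
    one per image descriptor; weights: one non-negative weight per descriptor."""
    # Weighted linear combination of single kernels -> one combined Gram matrix.
    K = sum(w * chi2_kernel(X, X, gamma=gamma)
            for w, X in zip(weights, bow_per_descriptor))
    # Prediction requires the same weighted combination of test-vs-train kernels.
    return SVC(kernel="precomputed").fit(K, y)
```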
Toward an optimal convolutional neural network for traffic sign recognition
Hamed Habibi Aghdam, Elnaz Jahani Heravi, Domenec Puig
Convolutional Neural Networks (CNN) beat human performance in the German Traffic Sign Benchmark competition. Both the winner and the runner-up teams trained CNNs to recognize 43 traffic signs. However, neither network is computationally efficient, since they have many free parameters and use computationally expensive activation functions. In this paper, we propose a new architecture that reduces the number of parameters by 27% and 22% compared with the two networks. Furthermore, our network uses Leaky Rectified Linear Units (Leaky ReLU) as the activation function, which needs only a few operations to produce its result. Specifically, compared with the hyperbolic tangent and rectified sigmoid activation functions utilized in the two networks, Leaky ReLU needs only one multiplication operation, which makes it computationally much more efficient than the other two functions. Our experiments on the German Traffic Sign Benchmark dataset show a 0.6% improvement on the best reported classification accuracy while reducing the overall number of parameters by 85% compared with the winning network in the competition.
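The advocated activation in code form: one comparison and one multiplication per unit, versus the exponentials required by tanh-like functions.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # alpha is the usual small leak coefficient for negative inputs.
    return np.where(x > 0, x, alpha * x)
```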
Improving neural network performance on SIMD architectures
Neural network calculations for image recognition problems can be very time consuming. In this paper we propose three methods for increasing neural network performance on SIMD architectures. The use of SIMD extensions, available on a number of modern CPUs, is a way to speed up neural network processing; in our experiments, we use ARM NEON as the example SIMD architecture. The first method uses the half-precision float data type for matrix computations. The second uses a fixed-point data type for the same purpose. The third considers a vectorized implementation of the activation functions. For each method we set up a series of experiments on convolutional and fully connected networks designed for image recognition tasks.
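An illustration of the fixed-point idea from the second method, written in NumPy rather than ARM NEON intrinsics; the Q8.8-style format and scales are assumptions, and real code would guard the int32 accumulator against overflow for large layers.

```python
import numpy as np

def to_fixed(x, frac_bits=8):
    # Quantize float values to int16 with frac_bits fractional bits (Q8.8 here).
    return np.clip(np.round(x * (1 << frac_bits)), -32768, 32767).astype(np.int16)

def fixed_point_matmul(a_f32, b_f32, frac_bits=8):
    a_q = to_fixed(a_f32, frac_bits)
    b_q = to_fixed(b_f32, frac_bits)
    # Integer multiply-accumulate, as a SIMD unit would do, in an int32 accumulator.
    acc = a_q.astype(np.int32) @ b_q.astype(np.int32)
    # Rescale the product back to float (two fixed-point factors -> 2x frac bits).
    return acc.astype(np.float32) / (1 << (2 * frac_bits))
```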
Application of Random Ferns for non-planar object detection
Alexey Mastov, Ivan Konovalenko, Anton Grigoryev
The real-time object detection task is considered as part of a project devoted to the development of an autonomous ground robot. This problem has been successfully solved with the Random Ferns algorithm, which belongs to the class of keypoint-based methods and uses fast machine learning algorithms for the keypoint matching step. As objects in the real world are not always planar, in this article we describe experiments applying this algorithm to non-planar objects. We also introduce a method for fast detection of a special class of non-planar objects: those which can be decomposed into planar parts (e.g., the faces of a box). This decomposition needs one detector for each side, which may significantly affect detection speed. The proposed approach copes with this by omitting steps repeated for each detector and by organizing a special queue of detectors, making the algorithm three times faster than the naive one.
Viola-Jones based hybrid framework for real-time object detection in multispectral images
This paper describes a method for real-time object detection based on a hybrid of a Viola-Jones cascade with a convolutional neural network. This scheme allows flexible trade-offs between detection quality and computational performance. We also propose a generalization of this method to multispectral images that effectively and efficiently utilizes information from each spectral channel. The new scheme is experimentally compared to traditional Viola-Jones, showing improved detection quality with adjustable performance.
Geometric filtration of classification-based object detectors in realtime road scene recognition systems
Viktor Prun, Dmitry Bocharov, Ivan Koptelov, et al.
We study the improvement of classification-based object detectors through the inclusion of geometry-oriented filters. The configuration of the observed 3D scene may be used as a priori or a posteriori information for object filtration. A priori information is used to select only those object parameters (size and position on the image plane) that are in accordance with the scene, restricting implausible combinations of parameters. On the other hand, detection robustness can be enhanced by rejecting detection results using a posteriori information about the 3D scene; for example, the relative location of detected objects can be used as a criterion for filtration. We included the proposed filters in the object detection modules of two different industrial vision-based recognition systems and compared the detection quality before and after the improvement. Filtering with a priori information leads to a significant decrease in the detector's running time per frame and an increase in the number of correctly detected objects. Including the filter based on a posteriori information decreases the object detection false positive rate.
Segments graph-based approach for smartphone document capture
Alexander E. Zhukovsky, Vladimir V. Arlazarov, Vasiliy V. Postnikov, et al.
Document capture with a smartphone camera is here to stay: interactive applications for document capture and enhancement have filled mobile application stores. However, judging from the experience of using such applications, they are not yet ready to compete with stationary scanners when high quality and reliability are required. This paper is devoted to analyzing the problem of document detection in images and evaluating the quality of existing mobile applications. Based on this analysis, we present a new, reliable algorithm for document capture based on boundary segment detection and the construction of a segment graph to fit a rectangular projective model. The algorithm achieves about 95% document detection quality and outperforms all of the reviewed algorithms implemented in mobile applications.
An analysis of automatic human detection and tracking
This paper presents an automatic method to detect and follow people in video streams. The method uses two techniques to determine the initial position of the person at the beginning of the video: one based on optical flow and the other based on the Histogram of Oriented Gradients (HOG). After defining the initial bounding box, tracking is done using four different trackers: the Median Flow tracker, the TLD tracker, the Mean Shift tracker, and a modified version of the Mean Shift tracker using the HSV color space. The results of these methods are then compared.
Approach to recognition of flexible form for credit card expiration date recognition as example
In this paper we consider the task of finding information fields within a document of flexible form, using the credit card expiration date field as an example. We discuss the main difficulties and suggest possible solutions. In our case this task is to be solved on mobile devices, so computational complexity has to be as low as possible. We provide results of the analysis of the suggested algorithm: the error distribution of the recognition system shows that the suggested algorithm solves the task with the required accuracy.
Adaptive WildNet Face network for detecting face in the wild
Dinh-Luan Nguyen, Vinh-Tiep Nguyen, Minh-Triet Tran, et al.
Combining Convolutional Neural Networks and Deformable Part Models is a new trend in the object detection area. Following this trend, we propose the Adaptive WildNet Face network, which uses the Deformable Part Models structure to exploit the advantages of the two methods in face detection. We evaluate the merit of our method on the Face Detection Data Set and Benchmark (FDDB). Experimental results show that our method achieves up to 86.22% true positives at 1000 false positives on FDDB. Our method is among the state-of-the-art methods on the FDDB dataset and opens a new way to detect faces in images in the wild.
Face detection using beta wavelet filter and cascade classifier entrained with Adaboost
Rim Afdhal, Akram Bahar, Ridha Ejbali, et al.
Face detection has been one of the most studied topics in the computer vision literature due to its relevant role in applications such as video surveillance, human-computer interfaces and face image database management. We present a face detection approach consisting of two steps: a training phase based on the AdaBoost algorithm, and a detection phase. The proposed approach enhances the Viola-Jones algorithm by replacing Haar descriptors with Beta wavelets. The results obtained show excellent detection performance, not only when a face is in front of the camera but also when it is oriented towards the right or the left. Moreover, thanks to the short time needed for detection, our approach can be applied in real time.
Medical Image Processing
Comparative analysis of codeword representation by clustering methods for the classification of histological tissue types
Ahmet Saygili, Gunalp Uysal, Gokhan Bilgin
In this study, the classification of several histological tissue types, i.e., muscle, nerve, connective and epithelial tissue cells, is studied in high-resolution histological images. In the feature extraction step, the bag-of-features method is utilized to reveal the distinguishing features of each tissue cell type. Small local sub-image blocks (patches) are extracted to find discriminative patterns for the strategy that follows. For detecting points of interest in local patches, the Harris corner detection method is applied. Afterwards, discriminative features are extracted at these points of interest using the scale-invariant feature transform (SIFT). Several codeword representations are obtained by clustering approaches (k-means, fuzzy c-means, and expectation maximization with Gaussian mixture models) and evaluated in a comparative manner. In the last step, classification of the tissue cell data is performed using k-nearest neighbor and support vector machine methods.
Stored-fluorography mode reduces radiation dose during cardiac catheterization measured with OSLD dosimeter
Chien-Yi Ting, Zhih-Cherng Chen, Kuo-Ting Tang, et al.
Coronary angiography is an imperative tool for the diagnosis of coronary artery diseases, in which cine-angiography is a commonly used mode. Although angiography proceeds under radiation, the potential risk of radiation exposure for both patients and operators is seldom noticed. In this study, the absorbed radiation dose in stored-fluorography mode was compared with that in cine-angiography mode using optically stimulated luminescence dosimeters (OSLDs). Patients receiving coronary angiography via the radial artery approach were randomized into a stored-fluorography group (N=30) or a cine-angiography group (N=30). The exclusion criteria were: 1. pregnancy or breastfeeding; 2. chronic kidney disease with a glomerular filtration rate below 60 mL/min. During the coronary angiography, the absorbed dose of the patients and the radiation exposure of the operator were measured with OSLDs. The absorbed dose of the patients in the stored-fluorography group (3.13±0.25 mGy) was markedly lower than that in the cine-angiography group (65.57±5.37 mGy; P<0.001). For the operator, a statistically significant difference (P<0.001) was also found between the stored-fluorography group (0.09163 μGy) and the cine-angiography group (0.6519 μGy). Compared with the traditional cine-angiography mode, the stored-fluorography mode can substantially reduce the radiation exposure of both patients and operator during coronary angiography.
Three-dimensional assessment of scoliosis based on ultrasound data
Junhua Zhang, Hongjian Li, Bo Yu
In this study, an approach was proposed to assess 3D scoliotic deformity based on ultrasound data. The 3D spine model was reconstructed using a freehand 3D ultrasound imaging system, and the geometric torsion was then calculated from the reconstructed spine model. A thoracic spine phantom set at a given pose was used in the experiment. The geometric torsion of the spine phantom calculated from the freehand ultrasound imaging system was 0.041 mm⁻¹, which was close to that calculated from biplanar radiographs (0.025 mm⁻¹). Ultrasound is therefore a promising technique for the 3D assessment of scoliosis.
Image Processing and Application
Extraction of latent images from printed media
Vladislav Sergeyev, Victor Fedoseev
In this paper we propose an automatic technology for the extraction of latent images from printed media such as documents, banknotes, financial securities, etc. This technology includes image processing by an adaptively constructed Gabor filter bank to obtain feature images, as well as subsequent stages of feature selection, grouping and multicomponent segmentation. The main advantage of the proposed technique is its versatility: it allows extracting latent images produced by different texture variations. Experimental results comparing the performance of the method with another known system for latent image extraction are given.
Grid fill algorithm for vector graphics render on mobile devices
Jixian Zhang, Kun Yue, Guowu Yuan, et al.
The performance of vector graphics rendering has always been one of the key elements in mobile devices, and the most important step in improving it is to enhance the efficiency of polygon fill algorithms. In this paper, we propose a new and more efficient polygon fill algorithm based on the scan line algorithm: the Grid Fill Algorithm (GFA). First, we elaborate the GFA for solid fill. Second, we describe techniques for implementing antialiasing and self-intersecting polygon fill with the GFA. Then we discuss the implementation of the GFA for gradient fill. Compared to other fill algorithms, the GFA generally performs better and achieves faster fill speed, which suits the inherent characteristics of mobile devices. Experimental results show that better fill effects can be achieved using the GFA.
A reversible data hiding method based on OWD predictor
To improve the embedding capacity of reversible image data hiding methods, it is important to make the prediction error histogram more compact, with a higher peak. This paper proposes a reversible data hiding method based on an optimal weight detection (OWD) predictor that improves on existing prediction methods by using optimal weights to increase the prediction accuracy of pixel values. Experimental results show that, compared with other reversible data hiding methods, the proposed method significantly improves the embedding capacity while preserving image quality.
Model based and model free methods for features extraction to recognize gait using fast wavelet network classifier
Human gait is an attractive modality for recognizing people at a distance; gait recognition systems aim to identify people by studying their manner of walking. In this paper, we contribute a new approach to gait recognition based on a fast wavelet network (FWN) classifier. To guarantee the effectiveness of our gait recognizer, we employ both static and dynamic gait characteristics: a model-based method is used to extract the static features (dimensions of the body parts), while a model-free method is used for the dynamic features (silhouette appearance and motion). The combination of these two methods aims at strengthening the WN classification results. Experimental results on universal datasets show that our new gait recognizer performs better than already established ones.
Modification of the method of parametric estimation of atmospheric distortion in MODTRAN model
The paper presents a modification of the method of parametric estimation of atmospheric distortion in the MODTRAN model, together with experimental research on the method. The experimental research showed that the base method takes into account neither the physical meaning of the atmospheric spherical albedo parameter nor the presence of outliers in the source data, which decreases the overall atmospheric correction accuracy. The proposed modification improves the accuracy of atmospheric correction in comparison with the base method. It consists in adding a nonnegativity constraint on the estimated atmospheric spherical albedo and a preprocessing stage aimed at adjusting the source data.
Image contrast enhancement using Chebyshev wavelet moments
Dm. V. Uchaev, D. V. Uchaev, V. A. Malinnikov
A new algorithm for image contrast enhancement in the Chebyshev moment transform (CMT) domain is introduced. The algorithm is based on a contrast measure defined as the ratio of high-frequency to zero-frequency content in the bands of the CMT matrix. Our algorithm makes it possible to enhance a large number of high-spatial-frequency coefficients, which are responsible for image details, without severely degrading low-frequency contributions. To enhance the high-frequency Chebyshev coefficients we use a multifractal spectrum of scaling exponents (SEs) for Chebyshev wavelet moment (CWM) magnitudes, where CWMs are a multiscale realization of Chebyshev moments (CMs). This multifractal spectrum is well suited to extracting meaningful structures in images of natural scenes, because such images have a multifractal character. Experiments with test images show advantages of the proposed algorithm compared to other widely used image enhancement algorithms; its main advantage is that it highlights image details very well during contrast enhancement.
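One plausible form of such a band-wise contrast measure, hedged as an assumption reconstructed from the abstract rather than the paper's exact definition:

```latex
% High-frequency content of CMT band B_k relative to the zero-frequency term T_{00}
C_k = \frac{\sum_{(p,q) \in B_k} \lvert T_{pq} \rvert}{\lvert T_{00} \rvert},
\qquad k = 1, \dots, K
```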
Multi-resolution Gabor wavelet feature extraction for needle detection in 3D ultrasound
Arash Pourtaherian, Svitlana Zinger, Nenad Mihajlovic, et al.
Ultrasound imaging is employed for needle guidance in various minimally invasive procedures such as biopsy, regional anesthesia and brachytherapy. Unfortunately, needle guidance using 2D ultrasound is very challenging due to poor needle visibility and a limited field of view. Nowadays, 3D ultrasound systems are available and more widely used. Consequently, with an appropriate 3D image-based needle detection technique, needle guidance and interventions may be significantly improved and simplified. In this paper, we present a multi-resolution Gabor transformation for automated and reliable extraction of needle-like structures in a 3D ultrasound volume, and we study and identify the best combination of Gabor wavelet frequencies. High precision in detecting needle voxels leads to robust and accurate localization of the needle for intervention support. Evaluation on several ex-vivo cases shows that the multi-resolution analysis significantly improves the precision of needle voxel detection from 0.23 to 0.32 at a high recall rate of 0.75 (a 40% gain); better robustness and confidence were confirmed in practical experiments.
Towards social interaction detection in egocentric photo-streams
Detecting social interaction in videos relying solely on visual cues is a valuable task that has received increasing attention in recent years. In this work, we address this problem in the challenging domain of egocentric photo-streams captured by a low-temporal-resolution wearable camera (2 fpm). The major difficulties to be handled in this context are the sparsity of observations and the unpredictability of camera motion and attention orientation, since the camera is worn as part of clothing. Our method consists of four steps: multi-face localization and tracking, 3D localization, pose estimation, and analysis of f-formations. By estimating pair-to-pair interaction probabilities over the sequence, our method determines the presence or absence of interaction with the camera wearer and specifies which people are more involved in the interaction. We tested our method on a dataset of 18,000 images and show its reliability for the considered purpose.
Multi-view score fusion for content-based mammogram retrieval
Sami Dhahbi, Walid Barhoumi, Ezzeddine Zagrouba
Screening mammography provides two views of each breast: the Medio-Lateral Oblique (MLO) and Cranio-Caudal (CC) views. However, current content-based image retrieval (CBIR) systems analyze each view independently, in spite of their complementarity. To further improve retrieval performance, this paper introduces a two-view CBIR system that combines the retrieval results of the MLO and CC views. First, we compute the similarity scores between the MLO (resp. CC) ROIs in the database and the MLO (resp. CC) query ROI; these ROIs are characterized using curvelet moments. Then a new linear weighted sum scheme combines the MLO and CC scores, assigning a weight to each view according to the distribution of the classes of its neighbors. The ROIs with the highest fused scores are displayed to the radiologist and used to compute the malignancy likelihood of the lesion. Experiments performed on mammograms from the Digital Database for Screening Mammography (DDSM) show the effectiveness of the proposed method.
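The generic linear form of such a two-view fusion; the paper's actual weights depend on the class distribution of each view's neighbors, so this display is a hedged sketch.

```latex
% Weighted sum of per-view similarity scores for query ROI q
s(q) = w_{\mathrm{MLO}}\, s_{\mathrm{MLO}}(q) + w_{\mathrm{CC}}\, s_{\mathrm{CC}}(q),
\qquad w_{\mathrm{MLO}} + w_{\mathrm{CC}} = 1, \quad w_{\mathrm{MLO}}, w_{\mathrm{CC}} \ge 0
```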
Capturing the best hyperspectral image in different lighting conditions
Andrzej Kordecki, Artur Bal
The quality of an image often determines its usability in further applications; hence, it is essential to ensure the best possible image quality at the image acquisition stage. Lighting conditions are one of the most important factors affecting the quality of the obtained image. In the case of hyperspectral imaging, compared to standard image acquisition, the selection of appropriate light sources involves additional difficulties connected with the spectral nature of the light. The article describes how light sources for such an application can be selected. The proposed selection criterion is based on the accuracy of the measured spectral reflectance of the object. The presented method was tested on a real object with three different types of light source.
Demosaicing as the problem of regularization
Irina Kunina, Aleksey Volkov, Sergey Gladilin, et al.
Demosaicing is the process of reconstructing a full-color image from the Bayer mosaic used in digital cameras for image formation. This problem is usually treated as an interpolation problem. In this paper, we propose to consider demosaicing as the problem of solving an underdetermined system of algebraic equations using regularization methods. We consider regularization with the standard l1/2, l1 and l2 norms and their effect on image reconstruction quality. The experimental results show that the proposed technique can both be used within existing methods and serve as the basis for new ones.
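The underdetermined-system view can be written as follows; this is a hedged sketch consistent with the abstract, and the paper's exact operator and penalty may differ.

```latex
% b: observed Bayer samples; A: sampling operator keeping one color per pixel;
% x: full-color image to reconstruct; p selects the regularizer studied in the paper
\hat{x} = \arg\min_{x} \; \lVert Ax - b \rVert_2^2 + \lambda \, \lVert x \rVert_p^p,
\qquad p \in \{1/2,\; 1,\; 2\}
```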
A new study on mammographic image denoising using multiresolution techniques
Mammography is the simplest and most effective technology for early detection of breast cancer. However, lesion areas of the breast are difficult to detect because mammograms are contaminated with noise. This work discusses various multiresolution denoising techniques, including classical methods based on wavelets and contourlets, and also investigates emerging multiresolution methods. A new denoising method based on the dual-tree contourlet transform (DCT) is proposed; this transform possesses the advantages of approximate shift invariance, directionality and anisotropy. The proposed denoising method was applied to mammograms, and the experimental results show that the emerging multiresolution methods succeed in maintaining edges and texture details, achieving better performance than the other methods both in visual effect and in terms of Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM) values.
An optimized structure on FPGA of key point description in SIFT algorithm
Chenyu Xu, Jinlong Peng, En Zhu, et al.
The SIFT algorithm is one of the most significant and effective algorithms for describing image features in the field of image matching, and implementing it in a hardware environment is both valuable and difficult. In this paper, we mainly discuss the realization of keypoint description in the SIFT algorithm, along with the matching process. For keypoint description, we propose a new method of generating histograms that avoids the rotation of adjacent regions and ensures rotational invariance. For matching, we replace the conventional Euclidean distance with the Hamming distance. The experimental results show that the proposed structure is real-time, accurate, and efficient. Future work is still needed to improve its performance in harsher conditions.
A new method for robust video watermarking resistant against key estimation attacks
Vitaly Mitekin
This paper presents a new method for high-capacity robust digital video watermarking, together with embedding and extraction algorithms based on this method. The proposed method uses password-based two-dimensional pseudonoise arrays for watermark embedding, making brute-force attacks aimed at steganographic key retrieval mostly impractical. The proposed algorithm for generating 2D "noise-like" watermarking patterns also significantly decreases the watermark collision probability (i.e., the probability of correct watermark detection and extraction using an incorrect steganographic key or password). Experimental research provided in this work shows that a simple correlation-based watermark detection procedure can be used, providing watermark robustness against lossy compression and watermark estimation attacks. At the same time, without decreasing the robustness of the embedded watermark, the average complexity of a brute-force key retrieval attack can be increased to 10^14 watermark extraction attempts (compared to 10^4-10^6 for known robust watermarking schemes). Experimental results also show that at the lowest embedding intensity the watermark preserves its robustness against lossy compression of the host video while preserving higher video quality (PSNR up to 51 dB) compared to known wavelet-based and DCT-based watermarking algorithms.
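A hedged sketch of password-keyed pseudonoise embedding with correlation detection, the general scheme this abstract builds on; the paper's pattern generation algorithm is more elaborate, and the detection threshold here is illustrative.

```python
import hashlib
import numpy as np

def pn_pattern(password, shape):
    # Derive a deterministic seed from the password, then a +/-1 noise-like array.
    seed = int.from_bytes(hashlib.sha256(password.encode()).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=shape)

def embed(frame, password, strength=2.0):
    # Additive embedding of the keyed pseudonoise pattern into one frame.
    return frame.astype(np.float64) + strength * pn_pattern(password, frame.shape)

def detect(frame, password, threshold=0.01):
    # Correlate the (mean-removed) frame with the pattern for the given password;
    # a wrong password yields near-zero correlation.
    p = pn_pattern(password, frame.shape)
    f = frame.astype(np.float64)
    corr = np.mean((f - f.mean()) * p)
    return corr / f.std() > threshold
```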
Automatic and robust method for registration of optical imagery with point cloud data
Addressing the difficulty of automatic and robust registration of optical imagery with point cloud data, this paper proposes a new method based on SIFT and Mutual Information (MI). SIFT features are first extracted and matched, and the result is used to derive the coarse geometric relationship between the optical imagery and the point cloud data. Second, an MI-based similarity measure is used to derive conjugate points, and the RANSAC algorithm is adopted to eliminate erroneous matches. The procedure of MI matching and mismatch removal is repeated down to the finest pyramid image level, and the transformation model is determined from the matching results. Experiments demonstrate the potential of the MI-based measure for registering optical imagery with point cloud data, and highlight the feasibility and robustness of the proposed method for automated registration of multi-modal, multi-temporal remote sensing data in a wide range of applications.
CT metal artifact reduction by soft inequality constraints
Marina Chukalina, Dmitry Nikolaev, Valerii Sokolov, et al.
Artifacts arising from incorrect reconstruction (known as metal-like artifacts) may obscure or simulate pathology in medical applications, and hide or mimic cracks and cavities in scanned objects in industrial tomographic scans. One of the main causes of such artifacts is photon starvation on rays that pass through highly absorbing regions. We introduce a way to suppress such artifacts in reconstructions using soft penalties that mimic linear inequalities on the photon-starved rays. An efficient algorithm using this information is provided, and the effect of these inequalities on reconstruction quality is studied.
A deep convolutional neural network for recognizing foods
Elnaz Jahani Heravi, Hamed Habibi Aghdam, Domenec Puig
Controlling food intake is an efficient way for individuals to tackle the obesity problem affecting countries worldwide, and it becomes achievable through a smartphone application able to recognize foods and compute their calories. State-of-the-art methods are chiefly based on hand-crafted feature extraction such as HOG and Gabor features. Recent advances on large-scale object recognition datasets such as ImageNet have revealed that deep Convolutional Neural Networks (CNNs) possess more representational power than hand-crafted features; the main challenge with CNNs is to find the appropriate architecture for each problem. In this paper, we propose a deep CNN with 769,988 parameters. Our experiments show that the proposed CNN outperforms the state-of-the-art methods, improving on the best result of traditional methods by 17%. Moreover, using an ensemble of two CNNs trained two different times, we are able to improve the classification performance by 21.5%.
A high capacity multiple watermarking scheme based on Fourier descriptor and Sudoku
Digital watermarking is a technology for hiding significant information, mainly used to protect digital data. A high-capacity multiple watermarking method is proposed that adapts the Fourier descriptor to pre-process the watermarks, while a Sudoku puzzle is used as a reference matrix in the embedding process and as a key in the extraction process. Applying the Fourier descriptor dramatically reduces the required capacity, while the Sudoku puzzle guarantees the security of the watermarks. Unlike previous algorithms applying Sudoku puzzles in the spatial domain, the proposed algorithm works in the transform domain by applying LWT2. In addition, the proposed algorithm can detect the tamper location accurately. The experimental results demonstrate that the goals mentioned above have been achieved.
Fast roadway detection using car cabin video camera
Daria Krokhina, Veniamin Blinov, Sergey Gladilin, et al.
We describe a fast method for road detection in images from a vehicle cabin camera. Straight sections of roadway are detected using the Fast Hough Transform and dynamic programming. We assume that the location of the horizon line in the image and the road pattern are known. The developed method is fast enough to detect the roadway in each frame of the video stream in real time and may be further accelerated by the use of tracking.
Computer Vision and Visualization
Computer vision based room interior design
This paper introduces a new application of computer vision. To the best of the author's knowledge, it is the first attempt to incorporate computer vision techniques into room interior design. Computer-vision-based interior design is achieved in two steps: object identification and color assignment. An image segmentation approach is used to identify the objects in the room, and different color schemes are used to assign colors to these objects. The proposed approach is applied to simple as well as complex images from online sources. It not only accelerates the process of interior design but also makes it more efficient by providing multiple alternatives.
Multi-shot person re-identification approach based key frame selection
Yousra Hadj Hassen, Walid Ayedi, Tarek Ouni, et al.
This paper presents a novel approach to the problem of person re-identification across non-overlapping camera views. We propose an appearance-based method for person re-identification that condenses a set of frames of the same individual into a multi-class SVM (Support Vector Machine) classifier. Choosing the most distinct and expressive frames for each target is very challenging, and efficient person re-identification algorithms are computationally expensive due to the large amount of data used. One of the original aspects of our method is how it selects different shots during person tracking within each camera to guarantee efficient re-identification. We evaluate our approach on the publicly available PRID 2011 multi-shot re-identification dataset and demonstrate its performance in comparison with a variant without the proposed key frame selection.
Visual navigation of the UAVs on the basis of 3D natural landmarks
Simon Karpenko, Ivan Konovalenko, Alexander Miller, et al.
This work considers the tracking of a UAV (unmanned aerial vehicle) on the basis of onboard observations of natural landmarks, including azimuth and elevation angles. It is assumed that the UAV's cameras are able to capture the angular position of reference points and to measure the angles of the sight line. Such measurements involve the real position of the UAV in implicit form, and therefore some nonlinear filter, such as the Extended Kalman Filter (EKF), must be used in order to incorporate these measurements into UAV control. Recently it was shown that a modified pseudomeasurement method may be used to control a UAV on the basis of observations of reference points assigned along the UAV path in advance. However, the use of such a set of points requires a cumbersome recognition procedure and a huge volume of on-board memory. Natural landmarks serving as such reference points, which may be determined on-line, can significantly reduce the on-board memory and the computational difficulties. The principal difference of this work is the use of 3D reference point coordinates, which permits determining the position of the UAV more precisely and thereby guiding it along the path with higher accuracy, which is extremely important for the successful performance of autonomous missions. The article suggests the new RANSAC for ISOMETRY algorithm and the use of recently developed estimation and control algorithms for tracking a given reference path under external perturbations and noisy angular measurements.
Building a robust vehicle detection and classification module
Anton Grigoryev, Timur Khanipov, Ivan Koptelov, et al.
The growing adoption of intelligent transportation systems (ITS) and autonomous driving requires robust real-time solutions for various event and object detection problems. Most real-world systems still cannot rely on computer vision algorithms and employ a wide range of costly additional hardware such as LIDARs. In this paper we explore the engineering challenges encountered in building a highly robust visual vehicle detection and classification module that works under a broad range of environmental and road conditions. The resulting technology is competitive with traditional non-visual means of traffic monitoring. The main focus of the paper is on software and hardware architecture, algorithm selection and domain-specific heuristics that help the computer vision system avoid implausible answers.
Problem-oriented stereo vision quality evaluation complex
D. Sidorchuk, N. Gusamutdinova, I. Konovalenko, et al.
We describe an original low-cost hardware setup for efficient testing of stereo vision algorithms. The method combines a special hardware arrangement with a mathematical model; it is easy to construct and precise in the applications of interest. For a known scene we derive its analytical representation, called the virtual scene. Using a four-point correspondence between the scene and the virtual one, we compute the extrinsic camera parameters and project the virtual scene onto the image plane, which yields the ground truth for the depth map. Another result presented in this paper is a new depth map quality metric, whose main purpose is to tune stereo algorithms for a particular problem, e.g., obstacle avoidance.
Characterizing the influence of surface roughness and inclination on 3D vision sensor performance
John R. Hodgson, Peter Kinnell, Laura Justham, et al.
This paper reports a methodology for evaluating the performance of 3D scanners, focusing on the influence of surface roughness and inclination on the number of acquired data points and on measurement noise. Point clouds were captured of samples mounted on a robotic pan-tilt stage using an Ensenso active stereo 3D scanner. The samples have isotropic texture and range in surface roughness (Ra) from 0.09 to 0.46 μm. By extracting the point cloud quality indicators, point density and standard deviation, at a multitude of inclinations, maps of scanner performance are created. These maps highlight the performance envelope of the sensor, the aim being to predict and compare scanner performance on real-world surfaces rather than idealized artifacts. The results highlight the need to characterize 3D vision sensors by their measurement limits as well as their best-case performance, determined either by theoretical calculation or by measurements in ideal circumstances.
Research and implementation of visualization techniques for 3D explosion fields
Jianguo Ning, Xiangzhao Xu, Tianbao Ma, et al.
Visualization of scalar data in 3D explosion fields was devised to address the complex physics and huge data volumes in numerical simulations of explosion mechanics problems. To enhance the explosion effects and ease image analysis, an adjustment coefficient was added to the original Phong illumination model. A variety of accelerated volume rendering algorithms and multithreading techniques were used to achieve fast rendering and real-time interactive control of 3D explosion fields. Cutaway views were implemented so that an arbitrary section of a 3D explosion field can be viewed conveniently. Slices can be extracted along the three axes of the 3D explosion field, and the value at an arbitrary point on a slice can be obtained. The experimental results show that the volume rendering acceleration algorithm generates high-quality images and increases the speed of image generation while supporting fast interactive control.
An unsupervised method for summarizing egocentric sport videos
Hamed Habibi Aghdam, Elnaz Jahani Heravi, Domenec Puig
People are increasingly interested in recording their sport activities with head-worn or hand-held cameras. These videos, called egocentric sport videos, have motion and appearance patterns different from those of life-logging videos. While a life-logging video can be described in terms of well-defined human-object interactions, it is not trivial to describe egocentric sport videos using well-defined activities. For this reason, summarizing egocentric sport videos based on human-object interaction may fail to produce meaningful results. In this paper, we propose an unsupervised method for summarizing egocentric videos by identifying the key-frames of the video. Our method utilizes both appearance and motion information, and it automatically finds the number of key-frames. Our blind user study on a new dataset collected from YouTube shows that in 93.5% of cases the users choose the proposed method as their first video summary choice. In addition, our method is within the top two choices of the users in 99% of cases.
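A toy stand-in for the key-frame idea, assuming simple appearance (colour histogram) and motion (mean optical-flow magnitude) features and a jump threshold; the paper's actual features and selection rule differ, and the threshold here is arbitrary.

    import numpy as np
    import cv2

    def keyframes(video_path, thresh=0.5):
        """Start a new segment whenever the combined appearance/motion feature
        jumps; the middle frame of each segment serves as a key-frame, so the
        number of key-frames is found automatically."""
        cap = cv2.VideoCapture(video_path)
        feats, prev_gray = [], None
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                                [0, 256] * 3).flatten()
            hist /= hist.sum() + 1e-9
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            motion = 0.0
            if prev_gray is not None:
                flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                                    0.5, 3, 15, 3, 5, 1.2, 0)
                motion = float(np.linalg.norm(flow, axis=2).mean())
            feats.append(np.append(hist, motion))
            prev_gray = gray
        cap.release()
        feats = np.asarray(feats)
        cuts = [0] + [i for i in range(1, len(feats))
                      if np.linalg.norm(feats[i] - feats[i - 1]) > thresh] + [len(feats)]
        return [(a + b) // 2 for a, b in zip(cuts[:-1], cuts[1:])]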
3D vision assisted flexible robotic assembly of machine components
Philips S. Ogun, Zahid Usman, Karthick Dharmaraj, et al.
Robotic assembly systems either make use of expensive fixtures to hold components in predefined locations, or the poses of the components are determined using various machine vision techniques. Vision-guided assembly robots can handle subtle variations in geometries and poses of parts. Therefore, they provide greater flexibility than the use of fixtures. However, the currently established vision-guided assembly systems use 2D vision, which is limited to three degrees of freedom. The work reported in this paper is focused on flexible automated assembly of clearance fit machine components using 3D vision. The recognition and the estimation of the poses of the components are achieved by matching their CAD models with the acquired point cloud data of the scene. Experimental results obtained from a robot demonstrating the assembly of a set of rings on a shaft show that the developed system is not only reliable and accurate, but also fast enough for industrial deployment.
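A minimal pose-estimation sketch in the spirit of the matching step described above, using point-to-point ICP from the Open3D library; the file names, sampling density, and identity initial guess are placeholders, not the authors' pipeline.

    import numpy as np
    import open3d as o3d

    # Sample the CAD model into a point cloud and load the scanned scene
    # (hypothetical file names).
    model = o3d.io.read_triangle_mesh("ring_cad.stl").sample_points_uniformly(5000)
    scene = o3d.io.read_point_cloud("scene_scan.ply")

    # Align the model to the scene; the resulting 4x4 transform is the
    # component's estimated pose.
    result = o3d.pipelines.registration.registration_icp(
        model, scene, 0.005, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    pose = result.transformation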
A multi level system design for vigilance measurement based on head posture estimation and eyes blinking
Ines Teyeb, Olfa Jemai, Mourad Zaied, et al.
Driving safety is an important concern for society. The major challenge in the field of accident avoidance systems is driver vigilance monitoring. A lack of vigilance shows itself in various ways, such as fatigue, drowsiness, and distraction; hence the need for a reliable system that detects decreases in driver vigilance and alerts drivers before a mishap happens. In this paper, we present a novel approach for vigilance estimation based on a multilevel system combining head movement analysis and eye-blink analysis. We use the Viola-Jones algorithm to analyse head movement and a wavelet-network classifier to measure eyelid closure. The contribution of our application is classifying the vigilance state at multiple levels, in contrast to the binary classification (awake or hypovigilant) used in most existing systems.
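The detection front end can be sketched with the Haar cascades that ship with OpenCV; the paper's wavelet-network classifier for eyelid closure is replaced here by a crude "eyes found?" proxy, so this illustrates the Viola-Jones stage only.

    import cv2

    face_cc = cv2.CascadeClassifier(cv2.data.haarcascades
                                    + "haarcascade_frontalface_default.xml")
    eye_cc = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

    cap = cv2.VideoCapture(0)                       # driver-facing camera
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in face_cc.detectMultiScale(gray, 1.3, 5):
            roi = gray[y:y + h // 2, x:x + w]       # eyes lie in the upper face half
            eyes = eye_cc.detectMultiScale(roi, 1.1, 3)
            state = "eyes open" if len(eyes) >= 2 else "possible blink/closure"
            cv2.putText(frame, state, (x, y - 8),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        cv2.imshow("vigilance", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()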
Signal Analysis and Processing
Coding efficiency of AVS 2.0 for CBAC and CABAC engines
Jing Cui, Youngkyu Choi, Soo-Ik Chae
In this paper we compare the coding efficiency of two entropy-coding engines for AVS 2.0 [1]: the Context-based Binary Arithmetic Coding (CBAC) [2] of AVS 2.0 and the Context-Adaptive Binary Arithmetic Coder (CABAC) [3] of HEVC [4]. For a fair comparison, the CABAC is embedded in the AVS 2.0 reference code RD10.1, mirroring our previous work in which the CBAC was embedded in HEVC [5]. In the RD code, the rate estimation table is employed only for RDOQ; to reduce the computational complexity of the video encoder, we modified the code so that the rate estimation table is employed for all RDO decisions. Furthermore, we simplify the rate estimation table by reducing the bit depth of its fractional part from 8 to 2. The simulation results show that the CABAC incurs a BD-rate loss of about 0.7% compared to the CBAC, suggesting that the CBAC is slightly more efficient than the CABAC in AVS 2.0.
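The effect of the fractional-bit reduction can be pictured with a small fixed-point helper (an illustrative sketch, not the RD10.1 code): a rate estimate stored with f fractional bits is quantized to a step of 2^-f, so going from 8 to 2 fractional bits coarsens the step from 1/256 to 1/4 of a bit.

    def quantize_rate(rate_bits: float, frac_bits: int = 2) -> float:
        """Round a rate estimate onto a fixed-point grid with `frac_bits`
        fractional bits (step = 2**-frac_bits)."""
        scale = 1 << frac_bits
        return round(rate_bits * scale) / scale

    print(quantize_rate(3.14159, 8))   # 3.140625  (step 1/256)
    print(quantize_rate(3.14159, 2))   # 3.25      (step 1/4)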
Multichannel active control of nonlinear noise processes using diagonal structure bilinear FXLMS algorithm
Dong Chen, Ding Yuan, Tan Li, et al.
A novel nonlinear adaptive algorithm, the diagonal-structure bilinear filtered-x least mean square (DBFXLMS) algorithm, is proposed in this paper for multichannel nonlinear active noise control. The performance of the proposed algorithm and its computational complexity are compared with those of the second-order Volterra filtered-x LMS (VFXLMS) algorithm and the filtered-s least mean square (FSLMS) algorithm, in terms of normalized mean square error (NMSE), for multichannel active control of nonlinear noise processes. Both the simulations and the computational complexity analyses demonstrate that the proposed method improves on the compared algorithms.
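For orientation, a single-channel filtered-x LMS baseline is sketched below; the diagonal-structure bilinear and multichannel extensions of the paper build on this update. The secondary-path model s_hat and the step size are placeholders, and the secondary-path filtering of the control output is omitted for brevity.

    import numpy as np

    def fxlms(x, d, s_hat, L=32, mu=1e-3):
        """Filtered-x LMS: adapt control filter w using the reference x filtered
        through the secondary-path estimate s_hat, driving the error e to zero."""
        w = np.zeros(L)
        xf = np.convolve(x, s_hat)[:len(x)]     # filtered reference signal
        e = np.zeros(len(x))
        for n in range(L, len(x)):
            x_vec = x[n - L:n][::-1]            # reference regressor
            xf_vec = xf[n - L:n][::-1]          # filtered-x regressor
            y = w @ x_vec                       # anti-noise sample
            e[n] = d[n] - y                     # residual at the error sensor
            w += mu * e[n] * xf_vec             # LMS update
        return w, e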
Blind separation of convolutive sEMG mixtures based on independent vector analysis
Xiaomei Wang, Yina Guo, Wenyan Tian
An independent vector analysis (IVA) method based on a variable-step gradient algorithm is proposed in this paper. Guided by the physiological properties of sEMG, the IVA model is applied to the frequency-domain separation of convolutive sEMG mixtures to extract motor unit action potential information from sEMG signals. The decomposition capability of the proposed method is compared with that of independent component analysis (ICA), and experimental results show that the variable-step gradient IVA method outperforms ICA in blind separation of convolutive sEMG mixtures.
An energy-efficient SIMD DSP with multiple VLIW configurations and an advanced memory access unit for LTE-A modem LSIs
Mitsuru Tomono, Makiko Ito, Yoshitaka Nomura, et al.
Energy efficiency is the most important factor in the design of wireless modem LSIs for mobile handset systems. We have developed an energy-efficient SIMD DSP for LTE-A modem LSIs. Our DSP has two main hardware features for reducing energy consumption: multiple VLIW configurations that minimize accesses to instruction memories, and an advanced memory access unit that realizes the complex memory accesses required for wireless baseband processing. With these features, our DSP is about 1.7 times faster than a base DSP on average for standard LTE-A libraries, and it achieves about 20% better energy efficiency than a base DSP for LTE-A modem LSIs.
Communication and Information System
Mobile indoor localization using Kalman filter and trilateration technique
Abdul Wahid, Su Mi Kim, Jaeho Choi
In this paper, an indoor localization method based on Kalman-filtered RSSI is presented. The indoor communications environment is rather harsh for mobiles, since a substantial number of objects distort the RSSI signals; fading and interference are the main sources of the distortion. A Kalman filter is adopted to smooth the RSSI signals, and the trilateration method is applied to obtain robust and accurate coordinates of the mobile station. From indoor experiments using WiFi stations, we have found that the proposed algorithm provides higher accuracy with relatively lower power consumption than a conventional method.
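A minimal sketch of the two stages, assuming a scalar random-walk model for each access point's RSSI and a log-distance path-loss conversion; all constants are illustrative, not the paper's calibration.

    import numpy as np

    def kalman_1d(z, q=0.01, r=4.0):
        """Smooth a raw RSSI stream z (dBm) with a one-dimensional Kalman filter."""
        x, p, out = z[0], 1.0, []
        for meas in z:
            p += q                              # predict
            k = p / (p + r)                     # Kalman gain
            x += k * (meas - x)                 # update
            p *= 1.0 - k
            out.append(x)
        return np.array(out)

    def rssi_to_dist(rssi, tx_power=-40.0, n=2.5):
        """Log-distance path loss: d = 10 ** ((P0 - RSSI) / (10 * n))."""
        return 10.0 ** ((tx_power - rssi) / (10.0 * n))

    def trilaterate(anchors, dists):
        """Least-squares trilateration from >= 3 anchor positions and ranges."""
        a0, d0 = anchors[0], dists[0]
        A = 2.0 * (anchors[1:] - a0)
        b = (d0**2 - dists[1:]**2
             + np.sum(anchors[1:]**2, axis=1) - np.sum(a0**2))
        return np.linalg.lstsq(A, b, rcond=None)[0]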
User-scheduling algorithm for a MU-MIMO system
Haiyang Yu, Jaeho Choi
A user-scheduling algorithm for MU-MIMO systems is presented in this paper. The algorithm is a codebook-based precoding method suitable for the IEEE 802.16m mobile broadband standard. The proposed algorithm can effectively improve both the sum capacity and the fairness among users.
Extrinsic information transfer charts and constituent decoder for turbo coded communications
Wenjun Yu, Jaeho Choi
Turbo codes have achieved near-Shannon-limit performance in data communication over noisy channels. The recently introduced Extrinsic Information Transfer (EXIT) charts [1] have become an essential part of turbo code design and are used as a complementary design tool alongside traditional bit error rate simulations. Additionally, compressive turbo codes have been shown to achieve near-entropy performance in various source coding problems [2], [3], [4]. The main objective of this paper is the extension of EXIT charts from turbo channel codes to turbo source codes, as well as the extension of this technique to analog and finite-precision iterative decoders.
Analysis of the fuzzy greatest of CFAR detector in homogeneous and non-homogeneous Weibull clutter
Mohamed Baadeche, Faouzi Soltani
In this paper, we analyze the distributed FGO-CFAR detector in homogeneous and non-homogeneous Weibull clutter under the assumption of a known shape parameter. The non-homogeneity is modeled by the presence of a clutter edge in the reference window. We derive the membership function that maps the observations to the false-alarm space and compute the threshold at the data fusion center. Applying the 'Maximum', 'Minimum', 'Algebraic Sum', and 'Algebraic Product' fuzzy rules for L detectors at the data fusion center, we find that the best performance is obtained with the 'Algebraic Product' fuzzy rule, followed by the 'Minimum' rule; in these two cases the probability of detection increases significantly with the number of detectors.
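The four fusion rules quoted above reduce, for membership values m_1..m_L in [0, 1], to the standard fuzzy operators; this sketches the fusion step alone (the membership functions themselves depend on the Weibull model and are not reproduced here).

    import numpy as np

    def fuse(m, rule):
        """Combine per-detector membership values with one of the four rules."""
        m = np.asarray(m, dtype=float)
        if rule == "maximum":
            return m.max()
        if rule == "minimum":
            return m.min()
        if rule == "algebraic_sum":
            return 1.0 - np.prod(1.0 - m)       # 1 - prod(1 - m_i)
        if rule == "algebraic_product":
            return np.prod(m)                   # prod(m_i)
        raise ValueError(rule)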
Improved metropolis light transport algorithm based on multiple importance sampling
Huaiqing He, Jiaqian Yang, Haohan Liu
Metropolis light transport is an unbiased and robust Monte Carlo method that efficiently reduces noise when rendering realistic images for the global illumination problem. We improve the basic Metropolis light transport algorithm by combining it with multiple importance sampling, which mitigates the strong correlation and high variance between samples produced by the basic algorithm. Experiments show that, for the same scene settings, the images generated by the improved algorithm are of better quality than those produced by basic Metropolis light transport.
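Multiple importance sampling typically weights each sample with the balance heuristic; the compact sketch below shows that standard weight, while the paper's full integrator is of course more involved.

    def balance_heuristic(i, x, pdfs):
        """Balance-heuristic weight w_i(x) = p_i(x) / sum_k p_k(x) for a sample x
        drawn from strategy i; `pdfs` is a list of callables returning densities."""
        vals = [p(x) for p in pdfs]
        return vals[i] / sum(vals)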
Computer Theory and Application
Ensembles of detectors for online detection of transient changes
Classical change-point detection procedures assume that the change-point model is known and that a change establishes a new observation regime, i.e. that the change lasts infinitely long. These modeling assumptions contradict the statements of applied problems. As a result, even theoretically optimal statistics very often fail in practice when detecting transient changes online. In this work, to overcome the limitations of classical change-point detection procedures, we consider approaches to constructing ensembles of change-point detectors, i.e. algorithms that use many detectors to reliably identify a change-point. We propose a learning paradigm and specific implementations of ensembles for detecting short-term (transient) changes in observed time series. We demonstrate by means of numerical experiments that the performance of an ensemble is superior to that of conventional change-point detection procedures.
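A toy version of the ensemble idea, using one-sided CUSUM statistics with different sensitivities as base detectors and a majority vote; the base detectors, parameters, and voting rule are stand-ins, not the learning paradigm proposed in the paper.

    import numpy as np

    def cusum_alarms(x, mu0, k=0.5, h=5.0):
        """One-sided CUSUM; an alarm is raised while the drift statistic exceeds h."""
        s, alarms = 0.0, np.zeros(len(x), dtype=bool)
        for t, v in enumerate(x):
            s = max(0.0, s + (v - mu0) - k)
            alarms[t] = s > h
        return alarms

    def ensemble_alarms(x, mu0, params=((0.25, 3.0), (0.5, 5.0), (1.0, 8.0)),
                        vote=0.5):
        """Majority vote over detectors with different (k, h) sensitivities."""
        votes = np.mean([cusum_alarms(x, mu0, k, h) for k, h in params], axis=0)
        return votes >= vote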
Nonparametric decomposition of quasi-periodic time series for change-point detection
The paper is concerned with the sequential online change-point detection problem for a dynamical system driven by a quasi-periodic stochastic process. We propose a multicomponent time series model and an effective online decomposition algorithm to approximate the components of the model. Assuming stationarity of the obtained components, we approach the change-point detection problem on a per-component basis and propose two online change-point detection schemes corresponding to two real-world scenarios. Experimental results for the decomposition and detection algorithms on synthesized and real-world datasets demonstrate the efficiency of our change-point detection framework.
Influence of resampling on accuracy of imbalanced classification
E. Burnaev, P. Erofeev, A. Papanov
In many real-world binary classification tasks (e.g. detection of certain objects in images), the available dataset is imbalanced, i.e., it has far fewer representatives of one class (the minor class) than of the other. Accurate prediction of the minor class is generally crucial but hard to achieve, since there is little information about it. One approach to this problem is to resample the dataset beforehand, i.e., to add new elements to the dataset or remove existing ones. Resampling can be done in various ways, which raises the problem of choosing the most appropriate one. In this paper we experimentally investigate the impact of resampling on classification accuracy, compare resampling methods, and highlight the key points and difficulties of resampling.
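One of the simplest schemes, random oversampling of the minor class, can be sketched as follows; the target ratio and sampling policy are illustrative, and the paper compares a range of such methods rather than prescribing this one.

    import numpy as np

    def oversample(X, y, minority=1, ratio=1.0, seed=0):
        """Duplicate random minor-class rows until the minor class reaches
        `ratio` times the size of the major class."""
        rng = np.random.default_rng(seed)
        minor = np.flatnonzero(y == minority)
        major = np.flatnonzero(y != minority)
        need = int(ratio * len(major)) - len(minor)
        if need > 0:
            extra = rng.choice(minor, size=need, replace=True)
            X = np.vstack([X, X[extra]])
            y = np.concatenate([y, y[extra]])
        return X, y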
Accuracy assessment of Kinect for Xbox One in point-based tracking applications
We present the accuracy assessment of a point-based tracking system built on Kinect v2. In our approach, color, IR and depth data were used to determine the positions of spherical markers. To accomplish this task, we calibrated the depth/infrared and color cameras using a custom method. As a reference tool we used Polaris Spectra optical tracking system. The mean error obtained within the range from 0.9 to 2.9 m was 61.6 mm. Although the depth component of the error turned out to be the largest, the random error of depth estimation was only 1.24 mm on average. Our Kinect-based system also allowed for reliable angular measurements within the range of ±20° from the sensor’s optical axis.
The suitability of lightfield camera depth maps for coordinate measurement applications
Shreedhar Rangappa, Mitul Tailor, Jon Petzing, et al.
Plenoptic cameras can capture 3D information in one exposure without the need for structured illumination, allowing grey-scale depth maps of the captured image to be created. The Lytro, a consumer-grade plenoptic camera, provides a cost-effective means of measuring the depth of multiple objects under controlled lighting conditions. In this research, camera control variables, environmental sensitivity, image distortion characteristics, and the effective working range of two first-generation Lytro cameras were evaluated. In addition, a calibration process was created for the Lytro cameras to deliver three-dimensional output depth maps expressed in SI units (metres). The results show depth accuracy and repeatability of +10.0 mm to -20.0 mm and 0.5 mm, respectively. For the lateral X and Y coordinates, the accuracy was +1.56 μm to −2.59 μm and the repeatability was 0.25 μm.
Design of virtual display and testing system for moving mass electromechanical actuator
Zhigang Gao, Keda Geng, Jun Zhou, et al.
To address the control, measurement, and virtual motion display of a moving mass electromechanical actuator (MMEA), a virtual testing system was developed based on a PC-DAQ architecture and the LabVIEW software platform. It accomplishes comprehensive test tasks such as drive control of the MMEA, measurement of kinematic parameters, measurement of the centroid position, and virtual display of movement. The system aligns the acquisition times of multiple measurement channels on different DAQ cards; on this basis, the research focused on dynamic 3D virtual display in LabVIEW, where the virtual display of the MMEA was realized both by calling DLLs and by using 3D graph drawing controls. Considering the required cooperation with the rest of the virtual testing system, including the hardware drivers and the data acquisition software, the 3D graph drawing controls method was selected, as it permits synchronized measurement, control, and display. The system can measure the dynamic centroid position and the kinematic position of the movable mass block while controlling the MMEA, and the 3D virtual display interface is realistic and smooth, solving the problem of displaying and replaying the motion of the MMEA inside its closed shell.
Model selection for anomaly detection
E. Burnaev, P. Erofeev, D. Smolyakov
Anomaly detection based on one-class classification algorithms is broadly used in many applied domains, such as image processing (e.g. deciding whether a patient is "cancerous" or "healthy" from a mammography image) and network intrusion detection. The performance of an anomaly detection algorithm crucially depends on the kernel used to measure similarity in the feature space. The standard approaches to kernel selection used in two-class classification problems (e.g. cross-validation) cannot be applied directly due to the specific nature of the data (the absence of data from a second, abnormal class). In this paper we generalize several kernel selection methods from the binary-class case to the one-class case and perform an extensive comparison of these approaches using both synthetic and real-world data.
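One simple selection criterion for the one-class setting can be sketched with scikit-learn: pick the RBF kernel width whose model rejects the fewest held-out normal points, since no abnormal data are available for ordinary cross-validation. This is an illustrative criterion only, not necessarily one of the methods generalized in the paper.

    import numpy as np
    from sklearn.svm import OneClassSVM

    def select_gamma(X_train, X_val, gammas=(0.01, 0.1, 1.0, 10.0), nu=0.1):
        """Choose the gamma minimizing the false-alarm rate on held-out normals."""
        best_gamma, best_rej = None, np.inf
        for g in gammas:
            model = OneClassSVM(kernel="rbf", gamma=g, nu=nu).fit(X_train)
            rej = np.mean(model.predict(X_val) == -1)
            if rej < best_rej:
                best_gamma, best_rej = g, rej
        return best_gamma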
A scalable and practical one-pass clustering algorithm for recommender system
Asra Khalid, Mustansar Ali Ghazanfar, Awais Azam, et al.
K-Means clustering-based recommendation algorithms have been proposed with the claim of increasing the scalability of recommender systems. One potential drawback of these algorithms is that they perform training offline and hence cannot accommodate incremental updates as new data arrive, making them unsuitable for dynamic environments. Following this line of research, a new clustering algorithm called One-Pass is proposed, which is simple, fast, and accurate. We show empirically that the proposed algorithm outperforms K-Means in terms of recommendation and training time while maintaining a good level of accuracy.
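A minimal one-pass ("leader") clustering sketch conveys the incremental flavour: each point joins the nearest centroid within a radius or seeds a new cluster, so new ratings are absorbed without offline retraining. The radius and running-mean update are assumptions, not the authors' exact algorithm.

    import numpy as np

    class OnePassClusterer:
        def __init__(self, radius=1.0):
            self.radius, self.centroids, self.counts = radius, [], []

        def add(self, x):
            """Assign x to the nearest cluster within `radius`, else create one."""
            x = np.asarray(x, dtype=float)
            if self.centroids:
                d = [np.linalg.norm(x - c) for c in self.centroids]
                j = int(np.argmin(d))
                if d[j] <= self.radius:
                    self.counts[j] += 1
                    self.centroids[j] += (x - self.centroids[j]) / self.counts[j]
                    return j
            self.centroids.append(x.copy())
            self.counts.append(1)
            return len(self.centroids) - 1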
Computer control by hand gestures
Intidhar Jemel, Ridha Ejbali, Mourad Zaied
This work fits into the context of automatic gesture interpretation based on computer vision. The aim of our work is to transform a conventional screen into a surface that allows users to employ their hands as pointing devices. The approach can be summarized in three main steps: detecting hands in a video, tracking the detected hands, and converting the paths traced by the hands into computer commands. To realize this application, the hand to be followed must be detected, and a classification phase is essential for the control part. For this reason, we use a neuro-fuzzy classifier for classification and a pattern-matching method for detection.
Mechanical Control and System
Optimization of deformations and hoop stresses in TSV liners to boost interconnect reliability in electronic appliances
Mary Atieno Juma, Xuliang Zhang, Song Bai He, et al.
Recently, there has been a great deal of research on electronic products, because more and more functions are being integrated into devices while final product sizes must remain small to meet market demand. Much of this research concerns through-silicon vias (TSVs). In this paper, through-silicon via liners are investigated. Liners of silicon dioxide, polystyrene, and polypropylene carbonate are subjected to pressure on their inner surfaces, which yields hoop stresses across their thickness; deflections also occur, proof that deformation really takes place. In one of our earlier papers, hoop stresses for the same materials were investigated; the values were somewhat higher and differed for each material used. In this paper, we use a global cylindrical, partial-cylinder model with varying theta in the ANSYS 14 analysis system to model the through-silicon via liners. The resulting values are lower, meaning that the reliability of the liners has been optimized and boosted. The silicon dioxide liner had the lowest hoop stress around its circumference and the lowest deflection, meaning that it remains one of the most reliable materials for manufacturing through-silicon via liners; however, over-dependence on it can be avoided by also using the other liners.
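For orientation, the hoop stress in a thick-walled cylindrical liner under internal pressure follows the textbook Lamé solution; the dimensions and pressure below are invented for illustration and do not correspond to the paper's finite-element model.

    def hoop_stress(p, r_i, r_o, r):
        """Lamé hoop stress for internal pressure p:
        sigma_theta(r) = p * r_i**2 * (1 + r_o**2 / r**2) / (r_o**2 - r_i**2)."""
        return p * r_i**2 * (1.0 + r_o**2 / r**2) / (r_o**2 - r_i**2)

    # Example: 5 um inner radius, 5.5 um outer radius, 10 MPa internal pressure;
    # the stress is largest at the inner surface.
    print(hoop_stress(10e6, 5e-6, 5.5e-6, 5e-6))   # Pa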
Stochastic control of light UAV at landing with the aid of bearing-only observations
Alexander Miller, Boris Miller
This work considers the tracking of a UAV (unmanned aerial vehicle) during landing on an unprepared field. Despite advances in UAV guidance, autonomous landing remains one of the most serious problems. The principal difficulties are the absence of precise measurements of the UAV position with respect to the landing field and the action of external atmospheric perturbations (turbulence and wind), so the control problem for UAV landing is a nonlinear stochastic one with incomplete information. The aim of the article is the development of stochastic control algorithms based on the pseudomeasurement Kalman filter for autonomous UAV landing with the aid of ground-based optical/radio radars, in the case of strong wind and a large initial error in the UAV's entrance into the area covered by the radars. The novelty of the article is a joint control-observation algorithm based on an unbiased pseudomeasurement Kalman filter, which provides quadratic characteristics of the estimation errors. The latter property is highly important for UAV control based on fusing data from the INS (inertial navigation system) with the bearing observations obtained from external terrain-based locators. The principal difficulty in UAV landing control is the absence of direct control tools at the terrain end, so control must be based on the angular-range data obtained by terrain locators, which must be transmitted from the terrain location station to the UAV control unit. The stochastic approach thus looks very effective for this challenging UAV landing problem.
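The pseudomeasurement construction for a bearing-only fix can be sketched as follows: a bearing beta measured from a ground locator at (xs, ys) yields the linear constraint sin(beta)*(x - xs) - cos(beta)*(y - ys) = 0, which a standard Kalman update can absorb. This 2-D position-only sketch with an assumed noise variance illustrates the update alone; the paper treats the full joint control-observation problem.

    import numpy as np

    def pseudo_update(x_est, P, beta, station, r_var=1.0):
        """One Kalman update with the bearing pseudomeasurement."""
        xs, ys = station
        H = np.array([[np.sin(beta), -np.cos(beta)]])
        z = np.array([np.sin(beta) * xs - np.cos(beta) * ys])
        y_res = z - H @ x_est                    # innovation
        S = H @ P @ H.T + r_var                  # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
        return x_est + (K @ y_res).ravel(), (np.eye(2) - K @ H) @ P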
Complex approach to long-term multi-agent mapping in low dynamic environments
In this paper we consider the problem of multi-agent continuous mapping of a changing, low-dynamic environment. The mapping problem is well studied; however, the use of multiple agents and operation in a non-static environment complicate it and present a number of challenges (e.g. double-counting, robust data association, memory and bandwidth limits). All these problems are interrelated but are very rarely considered together, despite the fact that each has drawn the attention of researchers. In this paper we devise an architecture that solves the considered problems in an internally consistent manner.