Proceedings Volume 9443

Sixth International Conference on Graphic and Image Processing (ICGIP 2014)


Volume Details

Date Published: 17 March 2015
Contents: 10 Sessions, 114 Papers, 0 Presentations
Conference: Sixth International Conference on Graphic and Image Processing (ICGIP 2014) 2014
Volume Number: 9443

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9443
  • Face Recognition
  • Feature Detection and Target Tracking
  • Image Processing
  • Image Analysis and Information Encryption
  • Modeling and Visualization
  • Video Analysis and Processing
  • Medical Signal Processing
  • Signal Processing
  • Information Systems and Image Processing Applications
Front Matter: Volume 9443
This PDF file contains the front matter associated with SPIE Proceedings Volume 9443, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Face Recognition
Two dimensional discriminant neighborhood preserving embedding in face recognition
Meng Pang, Jifeng Jiang, Chuang Lin, et al.
One of the key issues in face recognition is extracting the features of face images. In this paper, we propose a novel method, named two-dimensional discriminant neighborhood preserving embedding (2DDNPE), for image feature extraction and face recognition. 2DDNPE benefits from four techniques: neighborhood preserving embedding (NPE), locality preserving projection (LPP), image-based projection, and the Fisher criterion. Firstly, NPE and LPP are two popular manifold learning techniques which can optimally preserve the local geometric structures of the original samples from different angles. Secondly, image-based projection enables us to extract the optimal projection vectors directly from two-dimensional image matrices rather than vectors, which avoids the small sample size problem and preserves useful structural information embedded in the original images. Finally, the Fisher criterion applied in 2DDNPE boosts face recognition rates by minimizing the within-class distance while maximizing the between-class distance. To evaluate the performance of 2DDNPE, several experiments are conducted on the ORL and Yale face datasets. The results corroborate that 2DDNPE outperforms existing 1D feature extraction methods, such as NPE, LPP, LDA and PCA, across all experiments with respect to recognition rate and training time. 2DDNPE also delivers consistently promising results compared with other competing 2D methods such as 2DNPP, 2DLPP, 2DLDA and 2DPCA.
A review of recent advances in 3D face recognition
Jing Luo, Shuze Geng, Zhaoxia Xiao, et al.
Face recognition based on machine vision has achieved great advances and is widely used in various fields. However, face recognition still faces challenges such as facial pose, variations in illumination, and facial expression. This paper therefore reviews recent advances in 3D face recognition. 3D face recognition approaches are categorized into four groups: minutiae approaches, space transform approaches, geometric feature approaches, and model approaches. Several typical approaches are compared in detail, including their feature extraction, recognition algorithms, and performance. Finally, the paper summarizes the remaining challenges in 3D face recognition and future trends. This paper aims to help researchers working on face recognition.
Supervised descent method with low rank and sparsity constraints for robust face alignment
Yubao Sun, Bin Hu, Jiankang Deng, et al.
Supervised Descent Method (SDM) learns the descent directions of a nonlinear least squares objective in a supervised manner and has been used efficiently for face alignment. However, SDM may still fail in cases of partial occlusion and serious pose variation. To deal with this issue, we present a new method for robust face alignment that utilizes the low-rank prior of the human face and enforces a sparse structure on the descent directions. Our approach consists of low-rank face frontalization and sparse descent steps. Firstly, in terms of the low-rank prior of the face image, we recover a low-rank face from its deformed image and the associated deformation despite significant distortion and corruption. Alignment of the recovered frontal face image is simpler and more effective. Then, we propose a sparsity-regularized supervised descent model that enforces the sparse structure of the descent directions under an l1 constraint, which makes the model more efficient in computation and robust to partial occlusion. Extensive results on several benchmarks demonstrate that the proposed method is robust to facial occlusions and pose variations.
Hardware-software face detection system based on multi-block local binary patterns
Laurentiu Acasandrei, Angel Barriga
Face detection is an important aspect of biometrics, video surveillance and human-computer interaction. Due to the complexity of the detection algorithms, any face detection system requires a huge amount of computational and memory resources. In this communication, an accelerated implementation of the MB-LBP face detection algorithm targeting low-frequency, low-memory and low-power embedded systems is presented. The resulting implementation is time-deterministic and uses a customizable AMBA IP hardware accelerator. The IP implements the kernel operations of the MB-LBP algorithm and can be used as a universal accelerator for MB-LBP-based applications. The IP employs 8 parallel MB-LBP feature evaluator cores, uses deterministic bandwidth, has a low area profile, and consumes ~95 mW on a Virtex5 XC5VLX50T. The resulting implementation achieves an acceleration gain of 5 to 8 times, while the hardware MB-LBP feature evaluation gain is between 69 and 139 times.
Particle swarm optimization based articulated human pose tracking using enhanced silhouette extraction
Sanjay Saini, Dayang Rohaya Bt Awang Rambli, Suziah Bt Sulaiman, et al.
In this paper, we address the problem of three-dimensional human pose tracking and estimation using Particle Swarm Optimization (PSO) with an improved silhouette extraction mechanism. The tracking problem is formulated as a nonlinear function optimization problem, so the main objective is to optimize the fitness function between the 3D human model and the image observations. In order to improve the tracking performance, new shadow detection and removal and a level-set mechanism are applied during silhouette extraction. Both the silhouette and edge likelihoods are used in the fitness function. Experiments using the HumanEva-II dataset demonstrate that the proposed approach performs considerably better than the baseline algorithm, which uses the Annealed Particle Filter (APF).
Gestalt interest points for image description in weight-invariant face recognition
In this work, we propose two improvements to the Gestalt Interest Points (GIP) algorithm for the recognition of faces of people who have undergone significant weight change. The basic assumption is that some interest points contribute more to the description of such objects than others. We assume that we can eliminate certain interest points to make the whole method more efficient while retaining our classification results. To find out which Gestalt interest points can be eliminated, we conducted experiments concerning the contrast and orientation of face features. Furthermore, we investigated the robustness of GIP against image rotation. The experiments show that our method is rotationally invariant and, in this practically relevant forensic domain, outperforms state-of-the-art methods such as SIFT, SURF, ORB and FREAK.
Combining appearance and geometric features for facial expression recognition
Hui Yu, Honghai Liu
This paper introduces a method for facial expression recognition combining appearance and geometric facial features. The proposed framework consistently combines multiple facial representations at both global and local levels. First, covariance descriptors are computed to represent regional features combining various feature information with a low dimensionality. Then geometric features are detected to provide a general facial movement description of the facial expression. These appearance and geometric features are combined to form a vector representation of the facial expression. The proposed method is tested on the CK+ database and shows encouraging performance.
Toward retail product recognition on grocery shelves
Gül Varol, Rıdvan Salih Kuzu
This paper addresses the problem of retail product recognition on grocery shelf images. We present a technique for accomplishing this task with a low time complexity. We decompose the problem into detection and recognition. The former is achieved by a generic product detection module which is trained on a specific class of products (e.g. tobacco packages). Cascade object detection framework of Viola and Jones [1] is used for this purpose. We further make use of Support Vector Machines (SVMs) to recognize the brand inside each detected region. We extract both shape and color information; and apply feature-level fusion from two separate descriptors computed with the bag of words approach. Furthermore, we introduce a dataset (available on request) that we have collected for similar research purposes. Results are presented on this dataset of more than 5,000 images consisting of 10 tobacco brands. We show that satisfactory detection and classification can be achieved on devices with cheap computational power. Potential applications of the proposed approach include planogram compliance control, inventory management and assisting visually impaired people during shopping.
An integrated modeling approach to age invariant face recognition
Fahad Bashir Alvi, Russel Pears
This research study proposes a novel method for face recognition based on anthropometric features, using an integrated approach comprising a global model and a personalized model. The system is aimed at situations where lighting, illumination, and pose variations cause problems in face recognition. The personalized model covers individual aging patterns, while the global model captures general aging patterns in the database. We introduce a de-aging factor that de-ages each individual in the database test and training sets. We used the k-nearest-neighbor approach for building the personalized and global models, and regression analysis was applied to build the models. During the test phase, we resort to voting on different features. We used the FG-NET database for checking the results of our technique and achieved a 65 percent Rank-1 identification rate.
Online Farsi digit recognition using their upper half structure
In this paper, we investigate the efficiency of the upper-half structure of Farsi numerical digits. In other words, half of the data (the upper half of the digit shapes) is exploited for the recognition of Farsi numerical digits. This method can be used for both offline and online recognition. Using half of the data is more effective for processing speed and data transfer and, in this application, for accuracy. A hidden Markov model (HMM) was used to classify online Farsi digits. Evaluation was performed on the TMU dataset, which contains more than 1200 samples of online handwritten Farsi digits. The proposed method yielded a higher recognition rate.
Feature Detection and Target Tracking
Fast ellipse detection by elliptical arcs extracting and grouping
Yipeng Li, Chunhui Zhao
A novel and simple ellipse detection method is proposed in this paper. First, the Canny operator is applied to the gray image to obtain an edge image. Second, all edge segments are extracted from the edge image, and the gradients of the edge segments are output for further analysis. According to gradient direction, the edge segments are split into primitive lines and arcs. Elliptical arcs are then extracted from the splitting results, and an efficient grouping strategy is proposed to group elliptical arcs coming from the same ellipse into a candidate ellipse. Finally, least-squares fitting is applied to estimate the parameters of these candidate ellipses. Experimental results show that the proposed method is robust to noise and fast enough for real-time implementation.
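A minimal sketch of this pipeline using OpenCV's Canny detector, contour extraction, and least-squares ellipse fitting (cv2.fitEllipse); the paper's gradient-based arc splitting and grouping steps are simplified here to whole-contour fitting, and min_arc_len is an assumed parameter:

```python
import cv2
import numpy as np

def detect_ellipses(gray, min_arc_len=30):
    """Hedged sketch: Canny edges -> contours -> least-squares ellipse fit."""
    edges = cv2.Canny(gray, 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST,
                                   cv2.CHAIN_APPROX_NONE)
    candidates = []
    for c in contours:
        if len(c) < min_arc_len:   # too short to constrain an ellipse fit
            continue
        # fitEllipse returns ((cx, cy), (minor, major), angle);
        # a residual check could further filter poor fits
        candidates.append(cv2.fitEllipse(c))
    return candidates
```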
Stereo vision-based pedestrian detection using dense disparity map-based detection and segmentation
Chung-Hee Lee, Dongyoung Kim
In this paper, we propose a stereo vision-based pedestrian detection method using a dense disparity map-based detection and segmentation algorithm. To enhance pedestrian detection performance, we use a dense disparity map extracted by a global stereo matching algorithm. First, we extract road feature information from the dense disparity map, which serves as the basis for deciding the presence or absence of obstacles on the road. Extracting the road feature from the disparity map is very important for detecting obstacles robustly regardless of external traffic situations. Obstacle detection is performed with the road feature information so that only obstacles are detected in the entire image; in other words, pedestrian candidates, including various upright objects, are detected in the obstacle detection stage. Each obstacle area tends to include multiple objects, so a disparity map-based segmentation is performed to accurately separate each obstacle area into individual obstacles. Accurate pedestrian areas are then extracted from the segmented obstacle areas using road contact and pedestrian height information; this stage reduces false alarms and enhances computing speed. To recognize pedestrians, a classifier is applied to each verified pedestrian candidate. Finally, we perform a verification stage to examine the recognized pedestrians in detail. Our algorithms are verified by experiments using the ETH database.
Multi-lane detection based on multiple vanishing points detection
Chuanxiang Li, Yiming Nie, Bin Dai, et al.
Lane detection plays a significant role in Advanced Driver Assistance Systems (ADAS) for intelligent vehicles. In this paper we present a multi-lane detection method based on the detection of multiple vanishing points. A new multi-lane model assumes that a single lane, which has two approximately parallel boundaries, may not be parallel to other lanes on the road plane; non-parallel lanes are associated with different vanishing points. A biologically plausible model is used to detect multiple vanishing points and fit the lane model. Experimental results show that the proposed method can detect both parallel and non-parallel lanes.
Edge detection and reduction of brightness of students’ bubble form images
Sümeyya İlkin, Suhap Şahin
Optical Mark Recognition (OMR) is a traditional data input technique and an important human-computer interaction technique that is widely used in educational testing. This paper proposes a new approach to grading multiple-choice tests based on a smartphone camera. The system's key techniques and their implementations, which include image scanning, edge detection and brightness reduction on colorful bubble form images, are presented.
Detection and recognition of uneaten fish food pellets in aquaculture using image processing
Huanyu Liu, Lihong Xu, Dawei Li
The waste of fish food has always been a serious problem in aquaculture. On one hand, leftover fish food is a big waste for the aquaculture industry because fish food accounts for a large proportion of the investment. On the other hand, leftover fish food may pollute the water and make fish sick. In general, the reason for fish food waste is that there is no feedback about the consumption of delivered fish food after feeding, so it is extremely difficult for fish farmers to determine the amount of feedstuff that should be delivered each time and the feeding intervals. In this paper, we propose an effective method using image processing techniques to solve this problem. During feeding events, we use an underwater camera with supplementary LED lights to obtain images of uneaten fish food pellets on the tank bottom. An algorithm is then developed to count the leftover pellets using adaptive Otsu thresholding and a linear-time connected-component labeling algorithm. The proposed algorithm proves to be effective in handling non-uniform lighting, and very accurate pellet counts are obtained in experiments.
Improved video copy detection algorithm based on multi-scale Harris feature points
In order to meet the real-time requirements of video copy detection, a robust video copy detection algorithm is proposed. Harris feature points are extracted based on a local feature descriptor, video frames are divided into blocks, and video fingerprints are generated by calculating feature point amplitude and angle differences. A matching-result graph is formed from the matched video frames, and copy videos are detected by searching for the longest path in the graph. Compared with other video detection algorithms, the proposed algorithm offers good robustness and discrimination accuracy, and experiments show that detection speed is further improved.
Detecting moving objects under a moving camera in complex environments
Genyuan Zhang, Qin Yu, Sisi Yang, et al.
Robust detection of moving objects in image sequences is an essential part of many vision applications. However, it is not easily achievable with a moving camera, since the camera's motion and the objects' motions are mixed together. In this paper we propose a method to detect moving objects under a moving camera. The camera ego-motion is compensated using corresponding feature sets. The difference image between two consecutive ego-motion-compensated images is transformed into a binary image using the k-means algorithm. According to the clustering results, the region of interest where moving objects are likely to exist is searched by a projection approach. Then local thresholding and contour filling are applied to detect the moving objects accurately. Experimental results on real image sequences demonstrate that our method can extract intact moving objects efficiently in the case of a moving camera.
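A sketch of this two-stage idea in OpenCV, under the assumption that the scene is roughly planar so ego-motion can be compensated by a RANSAC homography between ORB feature matches; the paper's projection search and contour-filling refinement are omitted:

```python
import cv2
import numpy as np

def moving_object_mask(prev, curr):
    """Hedged sketch: homography-based ego-motion compensation,
    then 2-cluster k-means binarization of the frame difference."""
    orb = cv2.ORB_create(1000)
    k1, d1 = orb.detectAndCompute(prev, None)
    k2, d2 = orb.detectAndCompute(curr, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    warped = cv2.warpPerspective(prev, H, prev.shape[::-1])  # (w, h) dsize
    diff = cv2.absdiff(curr, warped).reshape(-1, 1).astype(np.float32)
    # k-means with k=2 separates "static" from "moving" difference values
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(diff, 2, None, criteria,
                                    3, cv2.KMEANS_PP_CENTERS)
    moving = centers.argmax()   # cluster with larger mean difference
    return (labels.reshape(curr.shape) == moving).astype(np.uint8) * 255
```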
Abnormal behaviors detection using particle motion model
Yutao Chen, Hong Zhang, Feiyang Cheng, et al.
Human abnormal behavior detection is one of the most challenging tasks in video surveillance for public security control. The recently published Interaction Energy Potential model is an effective and competitive method for detecting abnormal behaviors, but its model of abnormal behaviors is not accurate enough, so it has some limitations. In order to solve this problem, we propose a novel Particle Motion model. Firstly, we extract the foreground to improve the accuracy of interest point detection, since complex backgrounds usually degrade its effectiveness considerably. Secondly, we detect interest points using graphical features; the movement of each human target can then be represented by the movements of the detected interest points of that target. We track these interest points in videos to record their positions and velocities, from which the velocity angles, position angles and distances between each pair of points can be calculated. Finally, the proposed Particle Motion model calculates an eigenvalue for each frame, and an adaptive threshold method is proposed to detect abnormal behaviors. Experimental results on the BEHAVE dataset and online videos show that our method can detect fight and robbery events effectively and has promising performance.
A novel hybrid motion detection algorithm based on 2D histogram
Xiaomeng Su, Haiying Wang
This article proposes a novel hybrid motion detection algorithm based on a 2-D (two-dimensional) spatio-temporal state histogram. The new algorithm combines the idea of image change detection based on a 2-D histogram with spatio-temporal entropy image segmentation. It quantifies the continuity of pixel states in the time and space domains using a TDF (Time Domain Filter) and an SDF (Space Domain Filter), respectively, and then puts both channels of output data from the TDF and SDF into a 2-D histogram. In the 2-D histogram, a curve division method helps to separate the foreground state points from the background ones more accurately. Innovatively, the new algorithm converts the video sequence into its histogram sequence, transforming differences of pixel values in the video sequence into differences of pixel positions in the 2-D histogram. Experimental results on different types of scenes with added Gaussian noise show that the proposed technique has a strong ability to detect moving objects.
Infrared small target detection based on visual attention
Detecting dim, small targets in infrared images and videos is one of the most important techniques in many computer vision applications, such as video surveillance and precision-guided infrared imaging. In this paper, we propose a real-time target detection approach for infrared imagery that combines saliency detection with local average filtering. First, we compute the log amplitude spectrum of the infrared image. Second, we find the spikes of the amplitude spectrum using a cubic facet model and suppress the sharp spikes using local average filtering. Finally, the detection result in the spatial domain is obtained by reconstructing the 2D signal from the original phase and the filtered amplitude spectrum. Experimental results on infrared images with different types of backgrounds demonstrate the high efficiency and accuracy of the proposed method in detecting dim, small targets.
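A compact sketch of the spectral part of this pipeline with numpy/scipy; the cubic facet model for spike localization is simplified here to plain local averaging of the log-amplitude spectrum, and the window size k is an assumed parameter:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def small_target_map(img, k=5):
    """Hedged sketch: suppress spectral spikes, reconstruct with original phase."""
    F = np.fft.fft2(img.astype(np.float64))
    log_amp = np.log1p(np.abs(F))       # log amplitude spectrum
    phase = np.angle(F)                 # original phase, kept unchanged
    smoothed = uniform_filter(log_amp, size=k)   # local average filtering
    # reconstruct from the filtered amplitude and the original phase;
    # bright residuals in the result indicate dim, small targets
    rec = np.fft.ifft2(np.expm1(smoothed) * np.exp(1j * phase))
    return np.abs(rec) ** 2
```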
Saliency region and density maximization for salient object detection
Xin He, Huiyun Jing
In this paper, we propose an alternative salient object detection method based on maximum saliency region and density. The proposed approach can automatically detect the salient object with a well-defined boundary. Saliency region and density maximization is used as the quality function to find the optimal window containing a salient object. And for efficiently executing window search, a branch-and-bound search algorithm based on saliency region and density is proposed. Then the located window is used to initialize the GrabCut method, and the salient object with a well-defined boundary is extracted through applying GrabCut. Experimental results show that the proposed salient object detection approach outperforms the state-of-the-art methods.
A framework for small infrared target real-time visual enhancement
Xiaoliang Sun, Gucan Long, Yang Shang, et al.
This paper proposes a framework for real-time visual enhancement of small infrared targets. The framework consists of three parts: energy accumulation for small infrared target enhancement, noise suppression, and weighted fusion. A dynamic-programming-based track-before-detect algorithm is adopted in the energy accumulation to detect the target accurately and enhance the target's intensity notably. In the noise suppression, the target region is weighted by a Gaussian mask according to the target's Gaussian shape. In order to fuse the processed target region and the unprocessed background smoothly, the intensity in the target region is treated as a weight in the fusion. Experiments on real small infrared target images indicate that the proposed framework can enhance the small infrared target markedly and improve the image's visual quality notably. The proposed framework outperforms traditional algorithms in enhancing small infrared targets, especially for images in which the target is hardly visible.
An approach for tissue density classification in mammographic images using artificial neural network based on wavelet and curvelet transforms
Hüseyin Yaşar, Murat Ceylan
Breast cancer is one of the most common types of cancer in women, and breast density is an important indicator of cancer risk. In addition, dense tissue may complicate diagnosis by hiding abnormalities in the breast. For this reason, automatic classification of breast density is a significant part of the diagnostic process. In this study, a new system based on an Artificial Neural Network (ANN) and multi-resolution analysis is suggested. Wavelet and curvelet analyses, the most commonly used multi-resolution analyses, are employed. Four statistics (minimum, maximum, mean and standard deviation) are extracted from the sub-band images produced by the multi-resolution analysis. To test the success of the system, 322 images from the MIAS database are used. The obtained results for the different backgrounds are satisfying: the highest classification accuracies are 97.16% with the wavelet transform and ANN for fatty backgrounds and 79.80% with the wavelet transform and ANN for fatty-glandular backgrounds. For dense backgrounds, the wavelet transform with ANN and the curvelet transform with ANN give the same accuracy of 84.82%. The mean classification results over the three tissue types (fatty, fatty-glandular, dense) are 84.47% with the ANN alone, 85.71% with curvelet analysis and ANN, and 87.26% with wavelet analysis and ANN.
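A sketch of the feature extraction step with PyWavelets (the curvelet branch is omitted, as curvelets lack a widely available Python routine); the wavelet name and decomposition level are assumed values:

```python
import numpy as np
import pywt

def subband_stats(img, wavelet='db2', level=2):
    """Hedged sketch: decompose the image into wavelet sub-bands and take
    min, max, mean and std of each, giving a feature vector for the ANN."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    # coeffs = [cA_n, (cH_n, cV_n, cD_n), ..., (cH_1, cV_1, cD_1)]
    bands = [coeffs[0]] + [b for lvl in coeffs[1:] for b in lvl]
    feats = []
    for b in bands:
        feats += [b.min(), b.max(), b.mean(), b.std()]
    return np.array(feats)
```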
An adaptive interval generation method for efficient distance coding of binary images
Amir L. Liaghati, W. David Pan
We propose an adaptive method for more efficient distance coding of binary images. The proposed method partitions the image into blocks in which interval sequences of zeros or ones can be calculated, as opposed to the conventional method, where intervals are calculated by following a fixed scan order. In the proposed method, one can adaptively choose either a horizontal or a vertical scan within each block, depending on criteria based on entropy values. The resulting intervals tend to have lower entropies than those of conventional non-adaptive methods, thereby allowing higher compression when distance coding with a lossless codec. Our simulations on various test images demonstrated that (i) the proposed method achieves significantly higher compression than the non-adaptive distance coding method, and (ii) the proposed method can be used as an efficient preprocessor for a lossless coder, offering higher compression than directly coding the original images.
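A sketch of the adaptive choice for one block, using the entropy of the interval (distance) sequence between successive one-valued pixels; the interval definition and per-block decision rule here are assumptions for illustration:

```python
import numpy as np

def interval_entropy(bits):
    """Entropy of the distances between successive 1s in a binary sequence."""
    idx = np.flatnonzero(bits)
    if idx.size < 2:
        return 0.0
    _, counts = np.unique(np.diff(idx), return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def choose_scan(block):
    """Hedged sketch: scan the block horizontally and vertically,
    keep whichever interval sequence has lower entropy."""
    h = interval_entropy(block.flatten(order='C'))  # row-major scan
    v = interval_entropy(block.flatten(order='F'))  # column-major scan
    return ('horizontal', h) if h <= v else ('vertical', v)
```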
Research on target tracking in coal mine based on optical flow method
Hongye Xue, Qingwei Xiao
To recognize, track and count the bolting machines in coal mine video images, a real-time target tracking method based on Lucas-Kanade sparse optical flow is proposed in this paper. The method judges whether the moving target deviates from its trajectory and predicts and corrects the position of the moving target. It solves the problem of failing to track or losing the target because of weak light, uneven illumination and occlusion. Recognition and tracking were implemented on the VC++ platform with the OpenCV library, and the validity of the method is verified by the experimental results.
Adaptive object tracking via both positive and negative models matching
Shaomei Li, Chao Gao, Yawen Wang
To mitigate the tracking drift that often occurs in adaptive tracking, an algorithm based on the fusion of tracking and detection is proposed in this paper. Firstly, object tracking is posed as a binary classification problem and modeled by partial least squares (PLS) analysis. Secondly, the object is tracked frame by frame via particle filtering. Thirdly, tracking reliability is validated by matching against both positive and negative models. Finally, when drift occurs, the object is relocated based on SIFT feature matching and voting, and the object appearance model is updated at the same time. The algorithm can not only sense tracking drift but also relocate the object whenever needed. Experimental results demonstrate that this algorithm outperforms state-of-the-art algorithms on many challenging sequences.
Missile placement analysis based on improved SURF feature matching algorithm
Kaida Yang, Wenjie Zhao, Dejun Li, et al.
Precise battle damage assessment by using video images to analyze missile placement is a new study area. This article proposes an improved speeded-up robust features algorithm, named restricted speeded-up robust features (RSURF), which combines the combat application of TV-command-guided missiles with the characteristics of video images. Its restrictions are mainly reflected in two aspects: the first is to restrict the extraction area of feature points; the second is to restrict the number of feature points. The process of missile placement analysis based on video images was designed, and video splicing and random sample consensus purification were achieved. The RSURF algorithm is shown to have good real-time performance while guaranteeing accuracy.
A modified dual-band ratio temperature measurement method for remote target using temperature change information
Temperature is an important feature of infrared targets. However, because the attenuation and distortion parameters of the radiation transmission process are unknown, precise temperature measurement is a difficult task. In this paper, a modified Dual-Band Ratio (DBR) temperature measurement method for remote targets is proposed. The method is based on a newly presented quantity derived from the temperature change process, named the Dual-Band Differential Ratio (DBDR). Firstly, the temperature of the target is estimated by the traditional DBR method; then a correction using DBDR information is carried out to improve the measurement accuracy. Experimental results show that the proposed method improves the temperature measurement accuracy and can be carried out without any prior information about the target.
A vision framework for the localization of soccer players and ball on the pitch using Handycams
Tiago Vilas, J. M. F. Rodrigues, P. J. S. Cardoso, et al.
The current performance requirements in soccer make imperative the use of new technologies for game observation and analysis, such that detailed information about the teams' actions is provided. This paper summarizes a framework to collect the positions of soccer players and the ball using one or more Full HD Handycams placed no more than 20 cm apart in the stands, as well as how this framework connects to the FootData project. The system is based on four main modules: the detection and delimitation of the soccer pitch; the detection of the ball and the players and the assignment of players to their teams; the tracking of players and ball; and, finally, the computation of their localization (in meters) on the pitch.
Combined block-matching and adaptive differential motion estimation in a hierarchical multi-scale framework
Matthias Brüggemann, Rüdiger Kays, Paul Springer, et al.
In this paper we present a combination of block-matching and differential motion field estimation. We initialize the motion field using a predictive hierarchical block-matching approach. This vector field is refined by a pixel-recursive differential motion estimation method. We integrate image warping and adaptive filter kernels into the Horn and Schunck differential optical flow estimation approach to break the block structure of the initial correspondence vector fields and compute motion field updates to fulfill the smoothness constraint inside motion boundaries. The influence of occlusion areas is reduced by integrating an in-the-loop occlusion detection and adjusting the adaptive filter weights in the iteration process. We integrate the combined estimation into a hierarchical multi-scale framework. The refined motion on the current scale is upscaled and used as prediction for block-matching motion estimation on the next scale. With the proposed system we are able to combine the advantages of block-matching and differential motion estimation and achieve a dense vector field with floating point precision even for large motion.
Vehicle tracking process based on combination of SURF and color feature
Xiaofeng Lu, Lei Wang
In this paper, we describe a novel method for visual vehicle tracking based on the combination of speeded-up robust features (SURF) points and a color feature. The whole tracking process is constructed in the framework of a particle filter. To further improve the precision and stability of tracking, a dynamic update mechanism for the target template is proposed to capture appearance changes. This mechanism includes two strategies: adopting new feature points and discarding bad feature points. A novel distance kernel function is adopted to allocate the weight of each particle and to improve the stability of the tracking template. The experiments show that our algorithm tracks targets more robustly and adaptively than traditional algorithms.
Image Processing
Image denoising using ridgelet shrinkage
Pawan Kumar, Kishore Bhurchandi
Protecting fine details and edges while denoising digital images is a challenging area of research due to the changing characteristics of both noise and signal. Denoising removes noise from corrupted images, but in the process fine details such as weak edges and textures are hampered. In this paper we propose an algorithm based on the Ridgelet transform to denoise images while protecting fine details. We apply cycle spinning on Ridgelet coefficients with soft thresholding, naming the algorithm Ridgelet Shrinkage, in order to suppress noise and preserve details: the Ridgelet projections filter out noise while protecting details, and the shrinkage suppresses noise further. The proposed algorithm outperforms the Wavelet Shrinkage and Non-Local (NL) Means denoising algorithms, both numerically, in terms of Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM), and visually.
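A minimal sketch of cycle spinning with soft thresholding; since no standard Python ridgelet routine exists, a single-level wavelet transform from PyWavelets stands in for the ridgelet transform, and the threshold and shift set are assumed values:

```python
import numpy as np
import pywt

def soft(x, t):
    """Soft-thresholding (shrinkage) operator."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def cycle_spin_denoise(img, thresh=20.0, shifts=(0, 2, 4, 6)):
    """Hedged sketch: denoise under several circular shifts, unshift,
    and average the results (cycle spinning)."""
    out = np.zeros_like(img, dtype=np.float64)
    for s in shifts:
        shifted = np.roll(np.roll(img, s, axis=0), s, axis=1)
        cA, (cH, cV, cD) = pywt.dwt2(shifted, 'db4')
        den = pywt.idwt2((cA, tuple(soft(c, thresh) for c in (cH, cV, cD))),
                         'db4')
        den = den[:img.shape[0], :img.shape[1]]   # trim transform padding
        out += np.roll(np.roll(den, -s, axis=0), -s, axis=1)
    return out / len(shifts)
```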
Sparse principle component analysis for single image super-resolution
Qianying Zhang, Jitao Wu
In this paper, we propose a novel image super-resolution method based on sparse principal component analysis. Various coupled sub-dictionaries are trained to represent high-resolution and low-resolution image patches. The proposed method simultaneously exploits the incoherence of the sub-dictionaries and the nonlocal self-similarity existing in natural images; the purpose of introducing these two regularization terms is to design a novel dictionary learning algorithm with good reconstruction. Furthermore, in the dictionary learning process, the algorithm can update the dictionary as a whole and reduce the computational cost significantly. Experimental results show the efficiency of the proposed method compared to existing algorithms in terms of both PSNR and visual perception.
Image quality assessment using Takagi-Sugeno-Kang fuzzy model
Dragana Đorđević, Dragan Kukolj, Peter Schelkens
The main aim of this paper is to present a non-linear image quality assessment model based on a fuzzy logic estimator, namely the Takagi-Sugeno-Kang fuzzy model. This image quality assessment model uses a clustered space of input objective metrics. The main advantages of the introduced quality model are the simplicity and understandability of its fuzzy rules. A 3rd-order polynomial model was chosen as the reference model. The parameters of the Takagi-Sugeno-Kang fuzzy model are optimized according to the criterion of mapping the selected set of input objective quality measures to the Mean Opinion Score (MOS) scale.
A novel SAR fusion image segmentation method based on triplet Markov field
Jiajing Wang, Shuhong Jiao, Zhenyu Sun
Markov random fields (MRF) have been widely used in SAR image segmentation because of the advantage of directly modeling the posterior distribution, which suppresses the influence of speckle on the segmentation result. However, when real SAR images are nonstationary, unsupervised segmentation by MRF can give poor results. The recently proposed triplet Markov field (TMF) model is well suited to nonstationary SAR image processing due to the introduction of an auxiliary field that reflects the nonstationarity. In addition, on account of the texture features of SAR images, a fusion segmentation method is proposed that fuses the gray-level image and a texture feature image. The effectiveness of the proposed method is demonstrated by segmentation experiments on a synthetic SAR image and real SAR images, where it performs better than state-of-the-art methods.
An approach to the segmentation of multi-page document flow using binary classification
Onur Agin, Cagdas Ulas, Mehmet Ahat, et al.
In this paper, we present a method for segmentation of document page flow applied to heterogeneous real bank documents. The approach is based on the content of images and also incorporates font-based features inside the documents. Our method involves a bag of visual words (BoVW) model on the designed image-based feature descriptors and a novel approach to combine the consecutive pages of a document into a single feature vector that represents the transition between these pages. The transitions here can be represented by one of two classes: continuity of the same document or beginning of a new document. Using the transition feature vectors, we utilize three different binary classifiers to make predictions on the relationship between consecutive pages. Our initial results demonstrate that the proposed method can exhibit promising performance for document flow segmentation at this stage.
Local homogeneity combined with DCT statistics to blind noisy image quality assessment
Lingxian Yang, Li Chen, Heping Chen
In this paper, a novel method for blind noisy image quality assessment is proposed. First, since the human visual system (HVS) is believed to be more sensitive to locally smooth areas in a noisy image, an adaptive local homogeneous block selection algorithm is proposed that constructs a new homogeneous image, named homogeneity blocks (HB), based on per-pixel characteristics. Second, the discrete cosine transform (DCT) is applied to each HB, and the high-frequency components are used to evaluate the image noise level. Finally, a modified peak signal-to-noise ratio (MPSNR) image quality assessment approach is proposed based on analyzing changes in the DCT kurtosis distribution together with the noise level above. Simulations show that the quality scores produced by the proposed algorithm correlate well with human perception of quality and also have stable performance.
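A sketch of the two measurement steps under stated assumptions: block homogeneity is approximated by low variance, and the noise level by the RMS of high-frequency DCT coefficients; the block size and selection ratio are assumed parameters:

```python
import numpy as np
from scipy.fftpack import dct

def noise_level_estimate(img, block=8, keep_frac=0.1):
    """Hedged sketch: pick the most homogeneous blocks, DCT them,
    and use high-frequency energy as a noise-level estimate."""
    h = img.shape[0] // block * block
    w = img.shape[1] // block * block
    blocks = (img[:h, :w].reshape(h // block, block, w // block, block)
                         .swapaxes(1, 2).reshape(-1, block, block)
                         .astype(np.float64))
    order = np.argsort(blocks.var(axis=(1, 2)))     # lowest variance first
    chosen = blocks[order[:max(1, int(keep_frac * len(blocks)))]]
    energies = []
    for b in chosen:
        d = dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')
        d[:2, :2] = 0.0          # drop the low-frequency content
        energies.append(np.sqrt((d ** 2).mean()))
    return float(np.mean(energies))
```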
KM_GrabCut: a fast interactive image segmentation algorithm
Jianbo Li, Yiping Yao, Wenjie Tang
Image segmentation is critical for image processing. Among the many algorithms, GrabCut is well known for requiring little user interaction and producing desirable segmentation results. However, it takes a lot of time to adjust the Gaussian Mixture Model (GMM) and to cut the weighted graph iteratively with the Max-Flow/Min-Cut algorithm. To solve this problem, we first build a common algorithmic framework which can be shared by the class of GrabCut-like segmentation algorithms, and then propose the KM_GrabCut algorithm based on this framework. KM_GrabCut first uses the K-means clustering algorithm to cluster pixels in the foreground and background respectively, then constructs a GMM from each clustering result and cuts the corresponding weighted graph only once. Experimental results demonstrate that KM_GrabCut outperforms GrabCut in speed, with comparable segmentation results and user interaction.
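A sketch of the single-cut idea with OpenCV, assuming a user rectangle as the interaction; note that cv2.grabCut builds its GMMs internally with its own k-means initialization, so this approximates rather than reproduces the paper's explicit K-means construction:

```python
import cv2
import numpy as np

def single_cut_grabcut(img, rect):
    """Hedged sketch: initialize from a rectangle and run a single
    GrabCut iteration instead of iterating to convergence."""
    mask = np.zeros(img.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)   # background GMM buffer
    fgd = np.zeros((1, 65), np.float64)   # foreground GMM buffer
    cv2.grabCut(img, mask, rect, bgd, fgd, 1, cv2.GC_INIT_WITH_RECT)
    fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0)
    return fg.astype(np.uint8)
```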
Estimation of variance of sea surfaces slopes through the variance of the glitter patterns’ images
An estimation of the variance of sea surface slopes is realized through the variances of images consisting of bright and dark regions called the glitter pattern. The probability distribution of the sea surface slopes is used for a non-Gaussian case, taking into account the skewness and kurtosis of the sea surface slopes. These variance relationships are calculated for five different angles of light incidence on the sea surface and for four different heights of the image sensor. The brightness in the glitter pattern is modeled using a Gaussian function with information about the incident and reflected light angles in its argument. Some computational aspects and applications for optical engineering are mentioned.
Effective and fully automatic image segmentation using quantum entropy and pulse-coupled neural networks
Songlin Du, Yaping Yan, Yide Ma
A novel image segmentation algorithm which uses quantum entropy and pulse-coupled neural networks (PCNN) is proposed in this paper. Optimal iteration of the PCNN is one of the key factors affecting segmentation accuracy. We borrow quantum entropy from quantum information to act as a criterion in determining optimal iteration of the PCNN. Optimal iteration is captured while total quantum entropy of the segments reaches a maximum. Moreover, compared with other PCNN-employed algorithms, the proposed algorithm works without any manual intervention, because all parameters of the PCNN are set automatically. Experimental results prove that the proposed method can achieve much lower probabilities of error segmentation than other PCNN-based image segmentation algorithms, and this suggests that higher image segmentation quality is achieved by the proposed method.
Sparse representation using multiple dictionaries for single image super-resolution
Yih-Lon Lin, Chung-Ming Sung, Yu-Min Chiang
New algorithms are proposed in this paper for single image super-resolution using multiple dictionaries based on sparse representation. In the proposed algorithms, a classifier is constructed which is based on the edge properties of image patches via the two lowest discrete cosine transformation (DCT) coefficients. The classifier partitions all training patches into three classes. Training patches from each of the three classes can then be used for the training of the corresponding dictionary via the K-SVD (singular value decomposition) algorithm. Experimental results show that the high resolution image quality using the proposed algorithms is better than that using the traditional bi-cubic interpolation and Yang’s method.
An image denoising algorithm based on clustering and median filtering
YuLing Wang, Ming Li, Li Li
An improved median de-noising method is proposed, namely an image de-noising algorithm based on clustering and median filtering. The algorithm is a fast image de-noising method based on the clustering idea: singular points are isolated from the image and then clustered. Its advantages are that it better protects the details of an image and substantially reduces computation. Compared with the traditional median filter, mean filter and Wiener filter, our approach is more adaptive and achieves better results. However, for images with complex details, such as texture images, the experimental results show that the proposed algorithm performs comparatively less well in de-noising.
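A minimal sketch of the selective idea, assuming the "singular points" are impulse-like outliers: detect pixels that deviate strongly from their local median and replace only those, leaving the remaining detail untouched; the threshold tol and window size are assumed parameters, not taken from the paper:

```python
import numpy as np
from scipy.ndimage import median_filter

def selective_median(img, k=3, tol=40):
    """Hedged sketch: median-filter only the outlier ("singular") pixels."""
    med = median_filter(img, size=k)
    # pixels far from their local median are treated as impulse noise
    singular = np.abs(img.astype(np.int32) - med.astype(np.int32)) > tol
    out = img.copy()
    out[singular] = med[singular]
    return out
```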
Block-matching 3D transform based multi-focus image fusion
Feng Zhu, Yingkun Hou, Minxian Li, et al.
Block-matching 3-D transform (BM3D) is an excellent image denoising algorithm; its success comes from making full use of image self-similarity. Profiting from the superior performance of BM3D, this paper proposes a multi-focus image fusion algorithm. BM3D is applied to each of the two original images; averaging their low-frequency coefficients gives the low-frequency coefficients of the result image, while the larger of the two high-frequency coefficients is chosen as the high-frequency coefficient of the result image. Applying the inverse 3-D transform then yields the fused image. Experimental results show that the proposed algorithm is better than existing multi-focus image fusion algorithms in both subjective visual quality and objective evaluation.
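A sketch of the coefficient fusion rule alone, with a single-level wavelet transform from PyWavelets standing in for the BM3D 3-D transform (which has no off-the-shelf coefficient-level API); inputs are assumed to be same-size grayscale arrays with even dimensions:

```python
import numpy as np
import pywt

def fuse_multifocus(a, b, wavelet='db2'):
    """Hedged sketch of the fusion rule: average the low-frequency
    coefficients, keep the larger-magnitude high-frequency coefficient."""
    cA1, dets1 = pywt.dwt2(a, wavelet)
    cA2, dets2 = pywt.dwt2(b, wavelet)
    cA = (cA1 + cA2) / 2.0                              # average low freq
    dets = tuple(np.where(np.abs(d1) >= np.abs(d2), d1, d2)
                 for d1, d2 in zip(dets1, dets2))       # max-abs high freq
    return pywt.idwt2((cA, dets), wavelet)
```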
A combined image steganographic method using multi-way pixel-value differencing
In order to increase the hiding capacity and provide visually imperceptible changes to an image, a multi-way pixel-value differencing scheme is proposed in this paper. A threshold value is designated to distinguish between edge-like and smooth blocks. The smooth blocks adopt the original tri-way pixel-value differencing (TPVD) method, whereas the edge-like blocks apply the proposed mode selection algorithm to further determine the data embedding direction by utilizing horizontal or vertical edges. In addition, a further altering process is conducted to recover the original partitioning results after the embedding procedure, so that secret data can be extracted from the stego-images without the participation of the original cover images. The experimental results show that the proposed method can embed a large hiding capacity in a cover image while the image quality remains high.
Multi-modal image fusion based on ROI and Laplacian Pyramid
In this paper, we propose a region-of-interest-based (ROI-adaptive) fusion algorithm for infrared and visible images using the Laplacian pyramid method. Firstly, we estimate the saliency map of the infrared image and then, by normalizing the saliency map, divide the infrared image into two parts: the regions of interest (RoI) and the regions of non-interest (nRoI). The visible image is also segmented into two parts using a Gaussian high-pass filter: the regions of high frequency (RoH) and the regions of low frequency (RoL). Secondly, we down-sample both the nRoI of the infrared image and the RoL of the visible image as the input of the next pyramid level. Finally, we use the normalized saliency map of the infrared image as the weighting coefficient to obtain the base image at the top level, and choose the maximum gray value of the infrared RoI and visible RoH to obtain the detail image. In this way, our method keeps the target features of the infrared image and the texture detail of the visible image at the same time. Experimental results show that this fusion scheme performs better than other fusion algorithms, both for the human visual system and on quantitative metrics.
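A simplified sketch of Laplacian-pyramid fusion with OpenCV, assuming a precomputed saliency map w normalized to [0, 1] and same-size grayscale inputs; the RoI/nRoI split is reduced here to saliency weighting on the base level and a max-abs rule on the detail levels:

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels=4):
    """Gaussian pyramid differences plus the coarsest Gaussian level."""
    gp = [img.astype(np.float32)]
    for _ in range(levels):
        gp.append(cv2.pyrDown(gp[-1]))
    lp = [gp[i] - cv2.pyrUp(gp[i + 1], dstsize=gp[i].shape[1::-1])
          for i in range(levels)]
    return lp + [gp[-1]]

def fuse_pyramids(ir, vis, w, levels=4):
    """Hedged sketch: max-abs on details, saliency-weighted base level."""
    pi, pv = laplacian_pyramid(ir, levels), laplacian_pyramid(vis, levels)
    wk = w.astype(np.float32)
    for _ in range(levels):              # bring weights to base resolution
        wk = cv2.pyrDown(wk)
    fused = [np.where(np.abs(a) >= np.abs(b), a, b)
             for a, b in zip(pi[:-1], pv[:-1])]
    fused.append(wk * pi[-1] + (1.0 - wk) * pv[-1])
    out = fused[-1]
    for lvl in reversed(fused[:-1]):     # collapse the pyramid
        out = cv2.pyrUp(out, dstsize=lvl.shape[1::-1]) + lvl
    return out
```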
Image fusion based on fractional Fourier domain phase and amplitude
Longlong Li, Qiming Zou, Qian Huang, et al.
When the fractional Fourier transform (FRFT) is applied to an image, the phase and amplitude portions have different capabilities for reflecting the spectral information of the source image; generally, the phase portion is more important. Taking advantage of these characteristics, a novel image fusion algorithm, named FRFT-phase-amplitude, is proposed. Firstly, the FRFT is applied to the source images, and the amplitude and phase information are separated. Secondly, the amplitude portions are fused in the FRFT domain using the largest-absolute-value fusion rule. Thirdly, the inverse fractional Fourier transform (IFRFT) is applied to the phase portions to get reconstructed phase images, which are fused in the spatial domain by selecting the larger pixel value; the FRFT is then applied to this fused phase image. Finally, the fused phase portion is combined with the fused amplitude portion in the fractional domain, and the IFRFT is applied to the combination to create the fused image. Experiments reveal that the FRFT-phase-amplitude algorithm produces better fusion effects than methods based on the wavelet transform and FRFT.
A multi-scale fusion-based dark channel prior dehazing algorithm
Yujun Zeng, Xiaolin Liu
In model-based image dehazing, the accuracy of the transmission estimate is crucial, having a decisive effect on the final result. Considering that an ideal transmission map must be smooth, edge-preserving and free of redundant false details, a fusion-based dark channel prior (DCP) dehazing algorithm is presented in this paper. On the basis of DCP, a pixel-wise and a patch-wise transmission map are obtained, to which an L0 smoothing filter and a large-scale Gaussian filter are applied, respectively. A much more accurate refined transmission map is then attained through fusion, and a haze-free image is restored using the atmospheric degradation model. Furthermore, a novel scheme for setting the lower bound of the transmission adaptively is also put forward. Experiments demonstrate better and faster dehazing than the original DCP algorithm and state-of-the-art dehazing methods, especially in suppressing halo artifacts, restoring details and coping with haze in small-scale areas of depth discontinuity occluded by the foreground.
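For reference, a sketch of the standard DCP building blocks the paper refines (the dark channel and the raw transmission estimate); the L0/Gaussian filtering and fusion refinement, and the adaptive lower bound, are not shown:

```python
import cv2
import numpy as np

def dark_channel(img, patch=15):
    """Per-pixel minimum over color channels, then a patch-wise
    minimum filter (implemented as a morphological erosion)."""
    mins = img.min(axis=2)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
    return cv2.erode(mins, kernel)

def transmission(img, A, omega=0.95, patch=15):
    """Standard DCP transmission estimate t = 1 - omega * dark(I / A),
    where A is the estimated atmospheric light per channel."""
    norm = img.astype(np.float64) / A
    return 1.0 - omega * dark_channel(norm, patch)
```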
An optimizing processing approach to contrast correction based on nonlinear mapping of windowed tone
Ming Gao, Shiyin Qin
A contrast correction method based on nonlinear mapping of windowed tone is presented. The main idea is to apply a local nonlinear mapping model to small, overlapping windows that traverse the whole image. First, high dynamic range (HDR) image contrast correction is introduced; then, through formula derivation, a model for decision optimization of the contrast correction is established, in which some constraints take the form of two adaptive guide images based on human visual properties so as to improve the optimal solution. Finally, the optimal contrast correction is implemented by solving the optimization problem through a linearized reduction. A series of experiments with HDR natural images was carried out, and the objective quality metrics show that the proposed method effectively improves and optimizes contrast correction, outperforming existing methods.
Image Analysis and Information Encryption
Image haze removal algorithm for transmission lines based on weighted Gaussian PDF
Wanguo Wang, Jingjing Zhang, Li Li, et al.
Histogram specification is a useful algorithm in the field of image enhancement. This paper proposes an image haze removal algorithm based on histogram specification with a weighted Gaussian probability density function (Gaussian PDF). Firstly, we consider the characteristics of image histograms captured in sunny, foggy and hazy weather. Then, we address the weakness of plain specification by changing the variance and using a weighted Gaussian PDF. The algorithm can effectively remove fog, and experimental results show the superiority of the proposed algorithm compared with plain histogram specification. It also has the advantages of low computational complexity, high efficiency, and no need for manual intervention.
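A minimal sketch of histogram specification toward a Gaussian target, the operation the paper builds on; mu and sigma are assumed values, and the paper's weighting of multiple Gaussians is omitted:

```python
import numpy as np

def specify_gaussian(img, mu=128.0, sigma=48.0):
    """Hedged sketch: map the empirical CDF of a uint8 image onto the
    CDF of N(mu, sigma^2) sampled on the 0..255 gray levels."""
    levels = np.arange(256)
    target = np.exp(-0.5 * ((levels - mu) / sigma) ** 2)
    target_cdf = np.cumsum(target / target.sum())
    hist = np.bincount(img.ravel(), minlength=256)
    src_cdf = np.cumsum(hist / hist.sum())
    # for each source level, pick the target level with the closest CDF
    mapping = np.searchsorted(target_cdf, src_cdf).clip(0, 255)
    return mapping[img].astype(np.uint8)
```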
An improved Bayesian matting method based on image statistic characteristics
Wei Sun, Siwei Luo, Lina Wu
Image matting is an important task in image and video editing and has been studied for more than 30 years. In this paper, we propose an improved interactive matting method. Starting from a coarse user-guided trimap, we first perform a color estimation based on texture and color information and use the result to refine the original trimap. Then, with the new trimap, we apply a soft matting process, which is an improved Bayesian matting with smoothness constraints. Experimental results on natural images show that this method is useful, especially for images that have similar texture features in the background or for which it is hard to give a precise trimap.
Genetic algorithm for bundle adjustment in aerial panoramic stitching
This paper presents a genetic algorithm for bundle adjustment in aerial panoramic stitching. Compared with the conventional LM (Levenberg-Marquardt) algorithm for bundle adjustment, the proposed bundle adjustment combined with genetic algorithm optimization eliminates the possibility of getting stuck in a local minimum and does not require an initial estimate of the desired parameters, naturally avoiding the associated steps, which include the normalization of matches, the computation of the homography transformation, and the calculation of the rotation transformation and the focal length. Since the proposed bundle adjustment is composed of the directional vectors of matches and takes advantage of the genetic algorithm (GA), the Jacobian matrix and the normalization of the residual error are not involved in the search process. The experiments verify that the proposed bundle adjustment based on the genetic algorithm can yield the global solution even under unstable aerial imaging conditions.
A new SVD based fragile image watermarking by using genetic algorithm
Veysel Aslantas, Mevlut Dogru
In this paper, a novel fragile image watermarking scheme based on singular value decomposition (SVD) and a genetic algorithm (GA) is proposed. Each line of the watermark is scaled using multiple scaling factors (SFs). The host image is divided into blocks, and the watermarked image is obtained by embedding a different line of the watermark into the singular values (SVs) of each block. In the proposed method, the SFs are optimized using the GA to obtain maximum transparency. Experimental results indicate that the method reaches the highest possible transparency. The fragility of the watermark is tested under various attacks such as rotation, rescaling and sharpening: when no attack occurs, exactly the original watermark is extracted; when an attack occurs, the extracted watermark is intensely distorted.
Dominant color correlogram descriptor for content-based image retrieval
Atoany Fierro-Radilla, Karina Perez-Daniel, Mariko Nakano-Miyatake, et al.
Content-based image retrieval (CBIR) has become an interesting and urgent research topic due to the increasing need for indexing and classification of multimedia content in large databases. Low-level visual descriptors, such as color-based, texture-based and shape-based descriptors, have been used for the CBIR task. In this paper we propose a color-based descriptor that describes image contents well, integrating the global features provided by the dominant color with the local features provided by the color correlogram. The performance of the proposed descriptor, called the Dominant Color Correlogram Descriptor (DCCD), is evaluated against some MPEG-7 visual descriptors and other color-based descriptors reported in the literature, using two image datasets of different size and content. Performance is assessed using three metrics commonly used in image retrieval: ARP (Average Retrieval Precision), ARR (Average Retrieval Rate) and ANMRR (Average Normalized Modified Retrieval Rank). Precision-recall curves are also provided, showing the better performance of the proposed descriptor compared with other color-based descriptors.
Weakly supervised glasses removal
Zhicheng Wang, Yisu Zhou, Lijie Wen
Glasses removal is an important task in face recognition. In this paper, we provide a weakly supervised method to remove eyeglasses from an input face image automatically. We choose sparse coding as the face reconstruction method and optical flow to find the exact shape of the glasses, and we combine the two processes iteratively to remove glasses more accurately. The experimental results reveal that our method works much better than either algorithm alone and can remove various glasses to obtain natural-looking glassless facial images.
Satellite image scene classification using spatial information
Weiwei Song, Dunwei Wen, Ke Wang, et al.
In order to enhance the descriptive capacity of local features and improve the classification performance on high-resolution (HR) satellite images, we present an HR satellite image scene classification method that makes use of the spatial information of local features. First, the spatial pyramid matching model (SPMM) is adopted to encode the spatial information of local features. Then, images are represented by the local feature descriptors and the encoding information. Finally, a support vector machine (SVM) classifier is employed to classify the image scenes. Experimental results on a real satellite image dataset show that our method classifies the scene classes with 82.6% accuracy, which indicates that the method works well for describing HR satellite images and classifying different scenes.
A comparative study on manifold learning of hyperspectral data for land cover classification
Ceyda Nur Ozturk, Gokhan Bilgin
This paper focuses on the land cover classification problem, employing a number of manifold learning algorithms in the feature extraction phase and then running single and ensemble classifiers in the modeling phase. Manifolds are learned on training samples selected randomly from the available data, while the remaining test samples are transformed via the learnt mappings for linear methods and via a radial-basis-function neural network based interpolation method for nonlinear ones. The classification accuracies of the original data and the embedded manifolds are investigated with several classifiers. Experimental results on a 200-band hyperspectral image indicate that the support vector machine is the best classifier for most of the methods, being nearly as accurate as the best classification rate on the original data. Furthermore, our modified version of the random subspace classifier can even outperform the classification accuracy of the original data for the local Fisher's discriminant analysis method, despite a considerable decrease in the extrinsic dimension.
A comparison of image inpainting techniques
Image inpainting is an important research topic in the field of image processing. The objective of inpainting is to "guess" lost information from the surrounding image information, which can be applied to old photo restoration, object removal and demosaicing. Building on previous literature on image inpainting and image modeling, this paper provides an overview of state-of-the-art image inpainting methods. The survey first covers mathematical models of inpainting and different kinds of image impairment. It then turns to the main components of an image, structure and texture, and states how the inpainting models and algorithms deal with the two separately, using PDE-based methods, exemplar-based methods, etc. Afterwards, sparse-representation-based inpainting and related techniques are introduced. Experimental analysis is presented to evaluate the relative merits of the different algorithms, using the Peak Signal-to-Noise Ratio (PSNR) as well as direct visual perception.
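Since PSNR is the quantitative measure used throughout such comparisons, here is the standard computation as a minimal sketch:

```python
import numpy as np

def psnr(original, restored, peak=255.0):
    """Peak Signal-to-Noise Ratio (dB) between the original image and an
    inpainted result; higher means a closer reconstruction."""
    diff = original.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```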
Example-based automatic generation of image filters and classifiers based on image-value pairs
Munehiro Doi, Yoshinori Dobashi, Hideaki Tamori, et al.
We propose a novel method for the automatic generation of spatial image filter sequences based on Genetic Programming (GP). In this method, the filter sequences consist of filters that process image-value pairs. This idea allows the filter sequences to contain not only image processing operations but also numerical operations. We also exploit the popular multi-objective optimization approach to generate robust filter sequences. We demonstrate the generation of a background elimination filter from pictures of flowers, as well as the generation of image classification filters.
Learning self-adaptive color harmony model for aesthetic quality classification
Zhijie Kuang, Peng Lu, Xiaojie Wang, et al.
Color harmony is one of the key aspects of aesthetic quality classification for photos. Existing color harmony models either lack quantization schemes or can assess only simple color patterns; therefore, they cannot be applied directly to assess the color harmony of photos. To address this problem, we propose a simple data-based self-adaptive color harmony model. In this model, the hue distribution of a photo is fitted by a mean-shift-based method, features are then extracted according to this distribution, and finally a Gaussian mixture model is applied to learn from the features extracted from all the photos. Experimental results on datasets of eight categories show that the proposed method outperforms classic rule-based methods and a state-of-the-art data-based model.
Image registration on fractional Fourier transform domain
Huixian Niu, Enqing Chen, Lin Qi, et al.
In recent years, the fractional Fourier transform (FRFT), which contains both spatial and frequency information, has been a hot topic. Image registration (IR), as an important preprocessing procedure, is very promising to implement in the FRFT domain. A novel method based on the properties of the FRFT and the conventional phase correlation technique is proposed in this paper. This method not only obtains more accurate results than previous FRFT-based methods, but also avoids iterative operations, which greatly reduces the computational complexity. Simulation results demonstrate the superiority of the proposed method over existing FRFT-based approaches.
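The conventional phase correlation step that the method builds on can be sketched as follows (this is the classical FFT-domain version, not the paper's FRFT variant):

```python
import numpy as np

def phase_correlation(a, b):
    """Recover the integer translation between two same-size images by
    correlating phase-only spectra: the correlation peak sits at the shift."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    R = A * np.conj(B)
    R /= np.maximum(np.abs(R), 1e-12)        # keep phase only
    corr = np.abs(np.fft.ifft2(R))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map the peak location to signed shifts
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return dy, dx
```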
A novel data hiding scheme for block truncation coding compressed images using dynamic programming strategy
Ching-Chun Chang, Yanjun Liu, Son T Nguyen
Data hiding is a technique that embeds information into digital cover data. Research has concentrated on the uncompressed spatial domain, and data hiding is considered more challenging to perform in the compressed domain, e.g., vector quantization, JPEG and block truncation coding (BTC). In this paper, we propose a new data hiding scheme for BTC-compressed images. In the proposed scheme, a dynamic programming strategy is used to search for the optimal bijective mapping function for LSB substitution. Then, according to the optimal solution, each mean value embeds three secret bits to obtain high hiding capacity with low distortion. The experimental results indicated that the proposed scheme obtains both higher hiding capacity and higher hiding efficiency than four existing schemes, while ensuring good visual quality of the stego-image. In addition, the proposed scheme achieves a bit rate as low as the original BTC algorithm.
Artificial frame filling using adaptive neural fuzzy inference system for particle image velocimetry dataset
Bayram Akdemir, Sercan Doğan, Muharrem Hilmi Aksoy, et al.
Liquid behaviors are very important for many areas, especially for mechanical engineering. A fast camera is one way to observe and study liquid behavior: it traces dust or colored markers travelling in the liquid and takes as many pictures per second as possible. Every image has a large data structure due to its resolution, and for fast liquid velocities it is not easy to evaluate the flow or produce a fluent sequence from the captured images. Artificial intelligence is widely used in science to solve nonlinear problems, and the adaptive neural fuzzy inference system (ANFIS) is a common technique in the literature. Any particle in a liquid has a two-dimensional velocity and its derivatives. In this study, ANFIS has been used offline to create artificial frames between consecutive real frames in order to improve image continuity: it uses velocities and vorticities to create a crossing-point vector between the previous and following points. This evaluation makes the images much more understandable at chaotic or high-vorticity points. After applying ANFIS, the image dataset doubles in size, with virtual and real frames alternating. The obtained quality is evaluated using the R² statistic and the mean squared error; R² measures statistical similarity, and values of 0.82, 0.81, 0.85 and 0.80 were obtained for the velocities and their derivatives, respectively.
Cropping and noise resilient steganography algorithm using secret image sharing
Oswaldo Juarez-Sandoval, Atoany Fierro-Radilla, Angelina Espejel-Trujillo, et al.
This paper proposes an image steganography scheme in which a secret image is hidden in a cover image using a secret image sharing (SIS) scheme. Taking advantage of the fault-tolerant property of (k,n)-threshold SIS, where any k of the n shares (k≤n) recover the secret data without ambiguity, the proposed steganography algorithm becomes resilient to cropping and impulsive noise contamination. Among the many SIS schemes proposed to date, Lin and Chan's scheme is selected due to its lossless recovery capability for a large amount of secret data. The proposed scheme is evaluated from several points of view, such as the imperceptibility of the stego-image with respect to its original cover image and the robustness of the hidden data to cropping and impulsive noise contamination. The evaluation results show a high quality of the extracted secret image even when the stego-image suffers more than 20% cropping or high-density noise contamination.
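The (k,n)-threshold property can be illustrated with a generic polynomial secret-sharing sketch over a prime field; Lin and Chan's scheme differs in its details, so this is only an illustration of the fault-tolerance principle:

```python
import random

PRIME = 257  # a small prime > 255 so 8-bit pixel values fit (a toy choice)

def make_shares(secret, k, n):
    """(k, n)-threshold sharing of one value: evaluate a random degree-(k-1)
    polynomial with constant term `secret` at x = 1..n."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 over GF(PRIME)."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num = den = 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % PRIME
                den = den * (xj - xm) % PRIME
        secret = (secret + yj * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

shares = make_shares(200, k=3, n=5)
assert recover(shares[:3]) == 200   # any 3 of the 5 shares suffice
```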
A robust method for estimating motorbike count based on visual information learning
Kien C. Huynh, Dung N. Thai, Sach T. Le, et al.
Estimating the number of vehicles in traffic videos is an important and challenging task in traffic surveillance, especially with a high level of occlusion between vehicles, e.g., in crowded urban areas with people and/or motorbikes. Under such conditions, separating individual vehicles from foreground silhouettes often requires complicated computation [1][2][3]. Thus, the counting problem has gradually shifted to drawing statistical inferences about target object density from shape [4], local features [5], etc. Those studies indicate a correlation between local features and the number of target objects, but they are inadequate for constructing an accurate vehicle density estimation model. In this paper, we present a reliable method that is robust to illumination changes and partial affine transformations and can achieve high accuracy in the presence of occlusions. First, local features are extracted from images of the scene using the Speeded-Up Robust Features (SURF) method. For each image, a global feature vector is computed using a Bag-of-Words model constructed from these local features. Finally, a mapping between the extracted global feature vectors and their labels (the number of motorbikes) is learned. That mapping provides a strong prediction model for estimating the number of motorbikes in new images. The experimental results show that our proposed method achieves better accuracy than existing ones.
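A hedged sketch of the learning stage might look like the following, with SVR standing in for the unspecified regression model and the local descriptors assumed to be precomputed (SURF requires opencv-contrib; any local descriptor works for the sketch):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVR

def bow_vector(descriptors, kmeans):
    """Bag-of-Words: normalized histogram of visual-word assignments."""
    words = kmeans.predict(descriptors)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

def train_counter(train_desc, train_counts, n_words=200):
    """train_desc: per-image arrays of local descriptors;
    train_counts: labelled motorbike counts (both assumed given)."""
    kmeans = KMeans(n_clusters=n_words, n_init=4).fit(np.vstack(train_desc))
    X = np.array([bow_vector(d, kmeans) for d in train_desc])
    model = SVR(kernel='rbf').fit(X, train_counts)
    return kmeans, model  # predict: model.predict([bow_vector(d, kmeans)])
```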
Variational optical flow estimation for images with spectral and photometric sensor diversity
Tomas Bengtsson, Tomas McKelvey, Konstantin Lindström
Motion estimation of objects in image sequences is an essential computer vision task. To this end, optical flow methods compute pixel-level motion, with the purpose of providing low-level input to higher-level algorithms and applications. Robust flow estimation is crucial for the success of those applications, which in turn depends on the quality of the captured image data. This work explores the use of sensor diversity in the image data within a variational optical flow framework. In particular, a custom image sensor setup intended for vehicle applications is tested. Experimental results demonstrate improved flow estimation performance when IR sensitivity or flash illumination is added to the system.
Top-down vertical itemset mining
Vertical itemset mining is an important frequent pattern mining problem with broad applications. It is challenging because a traditional horizontal algorithm may need to examine a combinatorially explosive number of possible item patterns. Since high-dimensional datasets typically contain a large number of columns and a small number of rows, vertical itemset mining algorithms, which extract the frequent itemsets of a dataset by producing combinations of row ids, are a good alternative to horizontal algorithms for mining frequent itemsets from high-dimensional datasets. Since a rowset can be produced simply from its subsets by adding a new row id, many bottom-up vertical itemset mining algorithms have been designed and presented in the literature. However, bottom-up vertical mining algorithms suffer from a main drawback: they start generating and testing rowsets from the small rowsets and proceed to the larger ones, whereas small rowsets cannot produce frequent itemsets because they contain fewer rows than the minimum support threshold. In this paper, we describe a new efficient top-down vertical algorithm called VTD (Vertical Top Down) for mining frequent itemsets in high-dimensional datasets. Our top-down approach employs the minimum support threshold to prune rowsets from which no frequent itemset can be extracted. Several experiments on real bioinformatics datasets showed that VTD is orders of magnitude faster than previous closed pattern mining algorithms and substantially outperforms the best former algorithms.
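A toy rendering of the top-down idea, under our reading of the abstract: rowsets are enumerated from the full set of row ids downward, and any rowset with fewer rows than the support threshold is pruned before its itemset is ever computed. All names here are illustrative:

```python
def vtd(rows, min_sup):
    """Top-down rowset enumeration sketch (exponential worst case).
    rows: one frozenset of items per row (the transposed dataset).
    Returns frequent itemsets mapped to their supporting rowset size."""
    n = len(rows)
    found = {}

    def visit(rowset, start):
        if len(rowset) < min_sup:
            return                         # prune: too few supporting rows
        items = frozenset.intersection(*(rows[i] for i in rowset))
        if items:
            found[items] = max(found.get(items, 0), len(rowset))
        for i in range(start, n):          # remove one more row id
            if i in rowset:
                visit(rowset - {i}, i + 1)

    visit(frozenset(range(n)), 0)
    return found

data = [frozenset('abc'), frozenset('abd'), frozenset('ab'), frozenset('cd')]
print(vtd(data, min_sup=3))   # {'a','b'} is supported by 3 rows
```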
Concave points for separating touching particles
Separation of touching objects/particles is a necessary step before measuring morphological characteristics. An approach for identifying and splitting touching char particles is presented, based on two processes. First, concave points are detected using a concavity measure and a list of touching-point candidates is built. Second, separation lines are identified using location, length, blur and size, and a decision criterion is derived for deciding whether or not to split a particle. The proposed approach is evaluated using 180 images of char particles and compared to the watershed algorithm. The evaluation was twofold: quantifying the accuracy of identifying touching particles and measuring the separation quality, with expert criteria used as ground truth for the qualitative evaluation. Good agreement between visual judgement and the automatic results was obtained using the proposed approach.
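One common way to obtain concave-point candidates, offered here only as an analogous sketch (the paper's concavity measure may differ), is via convexity defects of the particle contour:

```python
import cv2

def concave_points(contour, min_depth=5.0):
    """Candidate touching points: contour points lying in deep convexity
    defects (far from the particle's convex hull)."""
    hull = cv2.convexHull(contour, returnPoints=False)
    defects = cv2.convexityDefects(contour, hull)
    points = []
    if defects is not None:
        for start, end, far, depth in defects[:, 0]:
            if depth / 256.0 > min_depth:   # depth is fixed-point (x256)
                points.append(tuple(contour[far][0]))
    return points
```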
Modeling and Visualization
An efficient framework for modeling clouds from Landsat8 images
Chunqiang Yuan, Jing Guo
Clouds play an important role in creating realistic outdoor scenes for video games and flight simulation applications. Classic methods have been proposed for cumulus cloud modeling; however, they are not flexible for modeling large cloud scenes with hundreds of clouds, in that the user must repeatedly model each cloud and adjust its various properties. This paper presents a meteorologically based method to reconstruct cumulus clouds from high-resolution Landsat8 satellite images. From the input satellite images, the clouds are first segmented from the background. Then, the cloud top surface is estimated from the temperature in the infrared image. After that, under the mild assumption of a flat base for cumulus clouds, the base height of each cloud is computed by averaging the top height of the pixels on the cloud edge. Next, the extinction is generated from the visible image. Finally, we enrich the initial cloud shapes using a fractal method and represent the recovered clouds as a particle system. The experimental results demonstrate that our method can yield realistic cloud scenes resembling those in the satellite images.
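The height estimation step can be illustrated with a simple lapse-rate conversion from IR brightness temperature; the constant lapse rate and the function name are assumptions for illustration:

```python
import numpy as np

def cloud_top_height(bt_kelvin, surface_temp, lapse_rate=6.5e-3):
    """Rough cloud-top height (m) from IR brightness temperature, assuming
    a constant atmospheric lapse rate of 6.5 K/km: the colder the cloud
    top appears, the higher it sits above the surface."""
    return np.maximum(surface_temp - bt_kelvin, 0.0) / lapse_rate
```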
Heuristic-driven graph wavelet modeling of complex terrain
Teodor Cioacă, Bogdan Dumitrescu, Mihai-Sorin Stupariu, et al.
We present a novel method for building a multi-resolution representation of large digital surface models. The surface points coincide with the nodes of a planar graph which can be processed using a critically sampled, invertible lifting scheme. To drive the lazy wavelet node partitioning, we employ an attribute-aware cost function based on the generalized quadric error metric. The resulting algorithm can be applied to multivariate data by storing additional attributes at the graph's nodes. We discuss how the cost computation mechanism can be coupled with the lifting scheme and examine the results by evaluating the root mean square error. The algorithm is experimentally tested on two multivariate LiDAR sets representing terrain surface and vegetation structure with different sampling densities.
Modeling synthetic radar image from a digital terrain model
Philippe Durand, Luan Jaupi, Dariush Ghorbanzadeh, et al.
In this paper we propose to simulate SAR radar images as acquired by aircraft or satellite. This corresponds to a real problem: an airborne radar data acquisition campaign was conducted in the south-east of France. We want to estimate the geometric deformations to which a digital terrain model can be subjected. By extrapolation, this construction should also make it possible to understand the image distortion if the plane is replaced by a satellite, and thus to judge the relevance of a space mission for quantifying geological and geomorphological data. The radar wave is an electromagnetic wave and has the advantage of overcoming atmospheric conditions: the larger the wavelength, the better it crosses the cloud layer. Imaging radar therefore provides continuous monitoring.
Gaze estimation using a hybrid appearance and motion descriptor
Chunshui Xiong, Lei Huang, Changping Liu
Realizing a robust and low-cost gaze estimation system is a challenging problem. Existing appearance-based and feature-based methods have both achieved impressive progress in the past several years, but their improvements are still limited by feature representation. Therefore, in this paper, we propose a novel descriptor combining eye appearance and pupil center-cornea reflections (PCCR). The hybrid gaze descriptor represents eye structure at both the feature level and the topology level. At the feature level, a glints-centered appearance descriptor is presented to capture the intensity and contour information of the eye, and a polynomial representation of the normalized PCCR vector is employed to capture the motion information of the eyeball. At the topology level, partial least squares is applied for feature fusion and selection. Finally, sparse-representation-based regression is employed to map the descriptor to the point of gaze (PoG). Experimental results show that the proposed method achieves high accuracy and has good tolerance to head movements.
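The polynomial PCCR mapping is classically calibrated by least squares; the following sketch assumes a second-order polynomial, which is a common choice rather than the paper's exact formulation:

```python
import numpy as np

def fit_gaze_polynomial(pccr, pog):
    """Fit a second-order polynomial mapping from normalized PCCR vectors
    (vx, vy) to screen points of gaze, via least squares."""
    vx, vy = pccr[:, 0], pccr[:, 1]
    A = np.column_stack([np.ones_like(vx), vx, vy, vx * vy, vx**2, vy**2])
    coeff, *_ = np.linalg.lstsq(A, pog, rcond=None)   # (6, 2) coefficients
    return coeff

def predict_pog(coeff, v):
    vx, vy = v
    return np.array([1, vx, vy, vx * vy, vx**2, vy**2]) @ coeff
```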
Anaglyph videoanimations from oblique stereoimages
The paper deals with an approach for compiling animations from pairs of oblique stereo images. The authors investigated as simple and cheap a way as possible, so that the approach would be available to a wide range of ordinary users with common equipment. They concentrated on three procedures of oblique stereo image handling to compile sets of images, animations and analogue documents. After capturing a construction site with a pair of web cameras, the data were corrected, photogrammetrically adjusted (for radial distortion) and exported. First, a set of anaglyph images was compiled; these were then trimmed and a timeline was inserted. The final anaglyph animations are compiled in various versions. In addition, an anaglyph book containing 150 images was created in a special way that lets the user easily browse its content. The main outputs are several unique anaglyph products, but the more beneficial outputs are the developed procedures of anaglyph visualization, which can be applied with minor modifications to photographing any objects.
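The core anaglyph composition is simple enough to state exactly: the red channel comes from the left view and the green/blue channels from the right view:

```python
import numpy as np

def make_anaglyph(left_rgb, right_rgb):
    """Classic red-cyan anaglyph from a rectified stereo pair
    (assumes RGB channel order and equal image sizes)."""
    out = right_rgb.copy()          # green and blue from the right image
    out[..., 0] = left_rgb[..., 0]  # red from the left image
    return out
```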
3D reconstruction and visualization of plant leaves
Xiaomeng Gu, Lihong Xu, Dawei Li, et al.
In this paper, a three-dimensional reconstruction method based on point clouds and texture images is used to realize the visualization of the leaves of greenhouse crops. We take Epipremnum aureum as the object of study and focus on applying a triangular meshing method to organize and categorize the scattered point cloud input data of leaves, and then construct a triangulated surface with interconnection topology to simulate the real surface of the object. Finally, we texture-map the leaf surface with real images to present a life-like 3D model which can be used to simulate the growth of greenhouse plants.
Attitude measurement by using target Schlieren graph and 3D digital model in wind tunnel
Lei Cheng, Yinong Yang, Bindang Xue, et al.
Schlieren photography is standard equipment in wind tunnels: it records the varying density of the flow and also shows the attitude of the model. In this paper, a method is proposed to estimate model attitudes by matching the projection drawings of a 3D digital model with the schlieren photographs and high-speed camera images. A simulation experiment was designed to test the method; the results show a maximum error of less than 0.1°. We also applied the method to wind tunnel test data, and the experimental results show that the proposed system can meet the demands of wind tunnel testing.
Design of 3D simulation engine for oilfield safety training
Hua-Ming Li, Bao-Sheng Kang
Aiming at the demand for rapid custom development of 3D simulation systems for oilfield safety training, this paper designs and implements a 3D simulation engine based on a script-driven method, a multi-layer structure, pre-defined entity objects, and high-level tools such as a scene editor, script editor and program loader. A scripting language has been defined to control the system's progress, events and results. A training teacher can use this engine to edit 3D virtual scenes, set the properties of entity objects and define the logic script of a task, producing a 3D simulation training system without any programming skills. By extending the entity classes, the engine can be quickly applied to other virtual training areas.
Reconstruction of indoor scene from a single image
Di Wu, Hongyu Li, Lin Zhang
Given a single image of an indoor scene without any prior knowledge, is it possible for a computer to automatically reconstruct the structure of the scene? This letter proposes a reconstruction method, called RISSIM, to recover the 3D model of an indoor scene from a single image. The proposed method is composed of three steps: the estimation of vanishing points, the detection and classification of lines, and plane mapping. To find vanishing points, a new feature descriptor, named "OCR", is defined to describe the texture orientation. With phase congruency and the Harris detector, the line segments can be detected exactly, which is a prerequisite. Perspective transformation provides a reliable way to represent points on the image in a 3D model. Experimental results show that the 3D structure of an indoor scene can be reconstructed well from a single image, although the available depth information is limited.
Improved stereo matching applied to digitization of greenhouse plants
Peng Zhang, Lihong Xu, Dawei Li, et al.
The digitization of greenhouse plants is an important aspect of digital agriculture. Its ultimate aim is to reconstruct a visible and interoperable virtual plant model on the computer by using state-of-the-art image processing and computer graphics technologies. The most prominent difficulties in the digitization of greenhouse plants include how to acquire the three-dimensional shape data of the plants and how to carry out a realistic stereo reconstruction. Concerning these issues, an effective method for the digitization of greenhouse plants using a binocular stereo vision system is proposed in this paper. Stereo vision is a technique for inferring depth information from two or more cameras; it consists of four parts: camera calibration, stereo rectification, stereo correspondence search and triangulation. Through the final triangulation procedure, the 3D point cloud of the plant can be obtained. The proposed stereo vision system can facilitate further segmentation of plant organs such as stems and leaves; moreover, it can provide reliable digital samples for the visualization of greenhouse tomato plants.
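The correspondence-search and triangulation stages map naturally onto OpenCV primitives; the sketch below assumes calibration and rectification have already produced the reprojection matrix Q, and the matcher parameters are illustrative:

```python
import cv2

def depth_from_stereo(left_gray, right_gray, Q):
    """Correspondence search (semi-global block matching) followed by
    triangulation via reprojection with the rectification matrix Q."""
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=96,
                                 blockSize=7)
    disp = sgbm.compute(left_gray, right_gray).astype('float32') / 16.0
    return cv2.reprojectImageTo3D(disp, Q)   # (H, W, 3) point cloud
```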
Video Analysis and Processing
An experimental evaluation of some background subtraction algorithms under a variety of video surveillance challenges
This paper analyses the behavior of some existing background subtraction algorithms for possible use in automated video surveillance applications. The performance of the analyzed algorithms has been demonstrated by their authors on selected video sequences to show the merits of their approaches; nevertheless, choosing an adequate approach for a given application is not an easy task. In this study, using background subtraction evaluation metrics combined with visual inspection, we assess in depth the performance of four algorithms under a variety of video surveillance challenges. This experimental analysis highlights the advantages and limitations of each approach and helps in choosing the suitable method for a given video surveillance scenario.
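Typical pixel-level evaluation metrics for such a study can be computed as follows (precision, recall and F-measure against a ground-truth foreground mask):

```python
import numpy as np

def bgs_metrics(mask, gt):
    """Pixel-level background-subtraction metrics; `mask` is the detected
    foreground and `gt` the ground truth, both boolean arrays."""
    tp = np.sum(mask & gt)
    fp = np.sum(mask & ~gt)
    fn = np.sum(~mask & gt)
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return precision, recall, f1
```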
Low complexity data duplication with selective slice dropping for reliable video communication
Sajid Nazir, Dejan Vukobratovic, Gorry Fairhurst
Video quality can be improved by selective duplication of important video data. This paper proposes a simple and novel scheme that exploits the space occupancy of bi-directionally predicted slices to duplicate selected information in order to improve video quality. The idea of data duplication is not new; however, the selection of the data for duplication and its placement within the transmitted data can have profound effects on performance. The novelty of the proposed scheme is that the duplicated data does not add payload: it is shown that sufficient space exists within the video data to accommodate duplication of important data without increasing its overall size. The amount of duplicated data can also be tailored to suit a specific transmission scenario. The results show significant gains in video quality with the proposed duplication schemes.
Video object segmentation via adaptive threshold based on background model diversity
Background subtraction can be presented as a classification process applied to the upcoming frames of a video stream, taking into consideration in some cases temporal information, in other cases spatial consistency, and in recent years both. The classification has often relied on a fixed threshold value. In this paper, a framework for background subtraction and moving object detection based on an adaptive threshold measure and a short/long frame differencing procedure is proposed. The presented framework derives the adaptive threshold from the mean squared differences of a sampled background model. In addition, an intuitive update policy which is neither conservative nor blind is presented. The algorithm succeeds in extracting the moving foreground and isolating an accurate background.
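A minimal sketch of an adaptive, per-pixel threshold derived from the diversity of a sampled background model, under our reading of the abstract (the scale factor k is an assumption):

```python
import numpy as np

def adaptive_threshold_mask(frame, bg_samples, k=2.5):
    """Classify a pixel as foreground when its squared difference to the
    background-model mean exceeds k times the model's own per-pixel
    variance; bg_samples has shape (n_samples, H, W)."""
    mean = bg_samples.mean(axis=0)
    var = bg_samples.var(axis=0) + 1e-6   # avoid division-like degeneracy
    return (frame - mean) ** 2 > k * var
```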
Fast video super-resolution via sparse coding
Methods for super-resolution can be classified into three categories: (i) interpolation-based methods, (ii) reconstruction-based methods and (iii) learning-based methods. Learning-based methods usually have the best performance due to the learning process; however, they cannot be applied directly to video super-resolution because of their great computational complexity. We propose a fast sparsity-based video super-resolution algorithm that utilizes inter-frame information. First, the background is extracted via existing methods, in this paper a Gaussian Mixture Model (GMM). Second, we construct background and foreground patch dictionaries by randomly sampling patches from high-resolution video. During video super-resolution, only the foreground regions are reconstructed with the foreground dictionary via sparse coding; the background is updated, and only its changed regions are reconstructed with the background dictionary in the same way. Finally, the background and foreground are fused to obtain the super-resolution result. The experiments show that this makes sparsity-based video super-resolution much faster, with comparable or even better performance.
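The per-patch sparse-coding reconstruction can be sketched with orthogonal matching pursuit, assuming coupled low-/high-resolution dictionaries are given (the paper's exact solver and sparsity level are not specified here):

```python
from sklearn.linear_model import orthogonal_mp

def sr_patch(lr_patch, D_lr, D_hr, n_nonzero=5):
    """Sparse-coding super-resolution for one patch: code the low-res
    patch over the low-res dictionary D_lr, then reuse the sparse
    coefficients with the coupled high-res dictionary D_hr."""
    alpha = orthogonal_mp(D_lr, lr_patch, n_nonzero_coefs=n_nonzero)
    return D_hr @ alpha
```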
Real-time video analysis for retail stores
Ehtesham Hassan, Avinash Kumar Maurya
With the advancement of video processing technologies, we can capture subtle human responses in a retail store environment that play a decisive role in store management. In this paper, we present a novel surveillance-video-based analytics system for retail stores targeting localized and global traffic estimates. Developing an intelligent system for human traffic estimation in real life poses a challenging problem because of the variation and noise involved. In this direction, we begin with a novel human tracking system based on an intelligent combination of motion-based and image-level object detection; an initial evaluation of this approach on an available standard dataset yields promising results. Exact traffic estimation in a retail store requires correct separation of customers from service providers, so we present a role-based human classification framework using a Gaussian mixture model, with a novel feature descriptor named the graded colour histogram for object representation. Using our role-based human classification and tracking system, we define a computationally efficient framework for generating two types of analytics: region-specific people counts and dwell-time estimation. The system has been extensively evaluated and tested on four hours of real-life video captured in a retail store.
A robust mean-shift tracking through occlusion and scale based on object trajectory for surveillance camera
Object tracking is an important part of surveillance systems. One of the algorithms used for this task is mean shift, thanks to its robustness, computational efficiency and ease of implementation. However, traditional mean shift cannot effectively track a moving object when its scale changes, because of the fixed size of the tracking window, and can lose the target during an occlusion. In this study, a method based on the trajectory direction of the moving object is presented to deal with the problem of scale change. Furthermore, a histogram similarity metric is used to detect when target occlusion occurs, and a multi-kernel method is proposed to estimate which part of the target is not occluded; this part is used to extrapolate the motion of the object and estimate its position. Experimental results show that the improved methods adapt well to changes in the scale and occlusion of the target.
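A histogram similarity metric of the kind used for occlusion detection is the Bhattacharyya coefficient; the threshold below is an illustrative assumption:

```python
import numpy as np

def bhattacharyya(p, q):
    """Similarity between the target histogram p and the candidate
    histogram q (both normalized to sum to 1); a drop below a threshold
    signals likely occlusion."""
    return np.sum(np.sqrt(p * q))

# e.g. occluded = bhattacharyya(target_hist, candidate_hist) < 0.6
```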
The application of autostereoscopic display in smart home system based on mobile devices
Yongjun Zhang, Zhi Ling
A smart home is a system for controlling home devices, and such systems are more and more popular in our daily life. Mobile intelligent terminals for smart homes have been developed, making remote control and monitoring possible with smartphones or tablets. Meanwhile, 3D stereo display technology has developed rapidly in recent years. Therefore, an iPad-based smart home system that adopts an autostereoscopic display as its control interface is proposed to improve the user experience. In consideration of the iPad's limited hardware capabilities, we introduce a 3D image synthesis method based on parallel processing with the Graphics Processing Unit (GPU), implemented with the OpenGL ES Application Programming Interface (API) on iOS for real-time autostereoscopic display. Compared to a traditional smart home system, applying an autostereoscopic display to the control interface enhances the realism, user-friendliness and visual comfort of the interface.
Medical Signal Processing
Centerline-based vessel segmentation using graph cuts
Xin Hu, Yuanzhi Cheng
Complete and accurate segmentation of vessels from 3D (three-dimensional) CT images is challenging due to low contrast combined with noise and high variation in vessel size. We describe a novel centerline-based method to produce accurate vessel segmentation. It starts by locating the vessel centerline, which is then used as guidance for graph cuts with edge weights depending on the intensity along the centerline. The main advantage of our framework is that it detects vessel boundaries in problematic regions that contain small vessels and noise. A comparison has been made with two state-of-the-art vessel segmentation methods. Quantitative results on synthetic data indicate that our method is more accurate than these methods, and experimental results on clinical data show that it detects more detailed vessel information. It is more accurate and robust than these state-of-the-art methods and is, therefore, better suited for automatic vessel extraction.
A statistical description of 3D lung texture from CT data
Kraisorn Chaisaowong, Andreas Paul
A method is described to create a statistical description of 3D lung texture from CT data. Second-order statistics, i.e. the gray-level co-occurrence matrix (GLCM), are applied to characterize lung texture by defining the joint probability distribution of pixel pairs. The GLCM is extended to three-dimensional image regions to deal with CT volume data. For a fine-scale lung segmentation, the 3D GLCMs of both the lung and the thorax without the lung are required. Once the co-occurrence densities are measured, 3D models of the joint probability density function, for each direction of the involved voxel pairs and for each class (lung or thorax), are estimated using a mixture of Gaussians through the expectation-maximization algorithm. This leads to a feature space that describes 3D lung texture.
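A 3D GLCM for one voxel-pair direction can be computed as in the sketch below; repeating it over the 13 unique 3D directions yields the full set of co-occurrence densities (the gray-level count is an assumed parameter):

```python
import numpy as np

def glcm_3d(volume, offset=(0, 0, 1), levels=16):
    """Gray-level co-occurrence matrix for one 3D voxel-pair direction,
    returned as a joint probability over quantized gray-level pairs."""
    v = volume.astype(np.int64) * levels // (int(volume.max()) + 1)
    dz, dy, dx = offset
    # Slice the volume into aligned (center, neighbour) voxel arrays
    a = v[max(0, -dz):v.shape[0] - max(0, dz),
          max(0, -dy):v.shape[1] - max(0, dy),
          max(0, -dx):v.shape[2] - max(0, dx)]
    b = v[max(0, dz):v.shape[0] - max(0, -dz),
          max(0, dy):v.shape[1] - max(0, -dy),
          max(0, dx):v.shape[2] - max(0, -dx)]
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (a.ravel(), b.ravel()), 1)
    return glcm / glcm.sum()
```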
Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches
Mahdieh Khalilinezhad, Behrooz Minaei, Gianni Vernazza, et al.
Data mining (DM) is the process of discovering knowledge in large databases. Applications of data mining in blood transfusion organizations can be useful for improving the performance of blood donation services. The aim of this research is the prediction of the healthiness of blood donors in a Blood Transfusion Organization (BTO). For this goal, three well-known algorithms, Decision Tree C4.5, the Naïve Bayesian classifier, and the Support Vector Machine, were chosen and applied to a real database of 11006 donors. Seven fields, namely sex, age, job, education, marital status, type of donor, and the results of blood tests (doctors' comments and lab results about healthy or unhealthy blood donors), were selected as input to these algorithms. The results of the three algorithms were compared and an error cost analysis was performed. According to the obtained results, the best algorithm, with low error cost and high accuracy, is the SVM. This research helps the BTO build a model of blood donors in each area in order to predict healthy or unhealthy blood donations, and could be useful when used in parallel with laboratory tests to better separate unhealthy blood.
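A minimal reproduction of such a comparison with scikit-learn might look as follows (sklearn's decision tree is CART, used here as a stand-in for C4.5; encoding of the categorical donor fields is assumed done):

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def compare_classifiers(X, y):
    """Cross-validated accuracy for the three classifier families.
    X: encoded donor attributes, y: healthy/unhealthy labels."""
    models = {'DecisionTree': DecisionTreeClassifier(),
              'NaiveBayes': GaussianNB(),
              'SVM': SVC(kernel='rbf')}
    return {name: cross_val_score(m, X, y, cv=10).mean()
            for name, m in models.items()}
```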
Assessment of an ICA-based noise reduction method for multi-channel auditory evoked potentials
Siavash Mirahmadizoghi, Steven Bell, David Simpson
In this work, a new independent component analysis (ICA) based method for noise reduction in evoked potentials is evaluated on auditory late responses (ALR) captured with a 63-channel electroencephalogram (EEG) from 10 normal-hearing subjects. The performance of the new method is compared with a single-channel alternative in terms of signal-to-noise ratio (SNR), the number of channels with an SNR above an empirically derived statistical critical value, and an estimate of the hearing threshold. The results show that the multichannel signal processing method can significantly enhance the quality of the signal and also detects hearing thresholds significantly lower than the single-channel alternative.
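The ICA denoising principle, sketched with scikit-learn's FastICA (the paper's component-selection criterion is not specified, so the indices of components to keep are an input here):

```python
import numpy as np
from sklearn.decomposition import FastICA

def ica_denoise(eeg, keep):
    """Unmix multi-channel EEG (samples x channels) into independent
    components, zero all components not listed in `keep`, and
    reconstruct the cleaned channel signals."""
    ica = FastICA(n_components=eeg.shape[1], max_iter=1000)
    sources = ica.fit_transform(eeg)          # (samples, components)
    mask = np.zeros(sources.shape[1], dtype=bool)
    mask[keep] = True
    sources[:, ~mask] = 0.0                   # discard noise components
    return ica.inverse_transform(sources)
```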
Compensatory neurofuzzy model for discrete data classification in biomedical
Biomedical data fall into two main categories: signals and discrete data, so studies in this area concern either biomedical signal classification or biomedical discrete data classification. Artificial intelligence models exist for the classification of ECG, EMG or EEG signals; likewise, the literature contains many models for classifying discrete data, such as the results of blood analyses or biopsies in the medical process. Not every algorithm achieves a high accuracy rate on both signals and discrete data. In this study, a compensatory neurofuzzy network model is presented for the classification of discrete data in biomedical pattern recognition. The compensatory neurofuzzy network is a hybrid, binary classifier in which the parameters of the fuzzy systems are updated by the backpropagation algorithm. The classifier was applied to two benchmark datasets (the Wisconsin Breast Cancer dataset and the Pima Indian Diabetes dataset). Experimental studies show that the compensatory neurofuzzy network model achieved a 96.11% accuracy rate in the classification of the breast cancer dataset and a 69.08% accuracy rate on the diabetes dataset with only 10 iterations.
Pupil segmentation using active contour with shape prior
Iris segmentation is the process of defining the valid part of the eye image used for further processing (feature extraction, matching and decision making), and it mostly starts with segmentation of the pupil boundary. Most pupil segmentation techniques are based on the assumption that the pupil is circular in shape. In this paper, we propose a new pupil segmentation technique which combines shape, location and spatial information for accurate and efficient segmentation of the pupil. Initially, the pupil's position and radius are estimated using a statistical approach and the circular Hough transform. In order to segment the irregular boundary of the pupil, an active contour model is initialized close to the estimated boundary using information from the first step, and segmentation is achieved using an energy-minimization-based active contour. Pre-processing and post-processing are carried out to remove noise and occlusions, respectively. Experimental results on CASIA V1.0 and V4.0 show that the proposed method is highly effective at segmenting the irregular boundaries of the pupil.
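The first, coarse stage maps directly onto the circular Hough transform; a sketch with OpenCV, where all parameter values are illustrative assumptions:

```python
import cv2

def rough_pupil(gray):
    """Coarse pupil estimate via the circular Hough transform; the
    returned circle then initializes the active contour."""
    blurred = cv2.medianBlur(gray, 5)
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=2,
                               minDist=gray.shape[0],   # expect one pupil
                               param1=100, param2=30,
                               minRadius=15, maxRadius=80)
    return None if circles is None else circles[0, 0]   # (x, y, r)
```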
A new algorithm for segmentation of cardiac quiescent phases and cardiac time intervals using seismocardiography
Mojtaba Jafari Tadi, Tero Koivisto, Mikko Pänkäälä, et al.
Systolic time intervals (STI) have significant diagnostic value for the clinical assessment of the left ventricle in adults. This study was conducted to explore the feasibility of using seismocardiography (SCG) to measure the systolic timings of the cardiac cycle accurately. An algorithm was developed for the automatic localization of cardiac events (e.g. the opening and closing moments of the aortic and mitral valves). Synchronously acquired SCG and electrocardiography (ECG) enabled an accurate beat-to-beat estimation of the electromechanical systole (QS2), the pre-ejection period (PEP) index and the left ventricular ejection time (LVET) index. The performance of the algorithm was evaluated on a healthy test group with no evidence of cardiovascular disease (CVD). STI values were corrected based on Weissler's regression method in order to assess the correlation between the heart rate (HR) and the STIs; the results show that the STIs correlate poorly with HR in this test group. An algorithm was also developed to visualize the quiescent phases of the cardiac cycle: a color map displaying the magnitude of SCG accelerations over multiple heartbeats visualizes the average cardiac motion and thereby helps identify quiescent phases. A high correlation between the heart rate and the duration of the cardiac quiescent phases was observed.
SA-SVM based automated diagnostic system for skin cancer
Ammara Masood, Adel Al-Jumaily
Early diagnosis of skin cancer is one of the greatest challenges, due to the lack of experience of general practitioners (GPs). This paper presents a clinical decision support system aimed at saving time and resources in the diagnostic process. Segmentation, feature extraction, pattern recognition and lesion classification are the important steps in the proposed system. The system analyses the images to extract the affected area using a novel segmentation method, H-FCM-LS. The underlying features which indicate the difference between melanoma and benign lesions are obtained through intensity-, spatial/frequency- and texture-based methods. For classification, a self-advising SVM is adopted, which shows an improved classification rate compared to the standard SVM. The work also analyzes the performance of linear and kernel-based SVMs on this specific skin lesion diagnostic problem and discusses the corresponding findings. The best diagnostic rates obtained with the proposed method are around 90.5%.
Signal Processing
A self-adaptive anti-vibration pipeline-filtering algorithm
The mobile pipeline-filtering algorithm is a real-time algorithm that performs well in detecting small dim targets, but it is particularly sensitive to interframe vibration in image sequences. When searching for small dim targets at sea with an infrared imaging system, the irregular and random vibration of the airborne imaging platform severely interferes with mobile pipeline filtering. This paper puts forward a pipeline-filtering algorithm with good self-adaptive anti-vibration performance. Using block matching with the normalized cross-correlation coefficient (NCC), the interframe vibration of the image sequence is measured in real time and used to correct the coordinates of the single-frame detection results; the corrected detection results are then used to complete the mobile pipeline filtering. Experimental results show that the algorithm overcomes interframe vibration in image sequences and thus achieves accurate detection of small dim maritime targets.
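The NCC block-matching step can be sketched with OpenCV's normalized template matching; block size and placement are assumptions:

```python
import cv2

def interframe_shift(prev, curr, block=64):
    """Estimate interframe vibration: match a central block of the
    previous frame inside the current frame with the normalized
    cross-correlation coefficient, returning the (dx, dy) shift."""
    h, w = prev.shape
    y0, x0 = (h - block) // 2, (w - block) // 2
    patch = prev[y0:y0 + block, x0:x0 + block]
    resp = cv2.matchTemplate(curr, patch, cv2.TM_CCOEFF_NORMED)
    _, _, _, (mx, my) = cv2.minMaxLoc(resp)   # best-match top-left corner
    return mx - x0, my - y0
```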
Sensor signals monitoring and control using wavelets transform representation algorithm
Okuwobi Idowu Paul, Yonghua Lu
The usefulness of wavelet transforms has been compared and contrasted with Fourier transforms: most importantly, the wavelet transform provides a much-needed alternative for certain applications such as pattern-based monitoring and control. Effort has been made to provide a technique that extracts essential trends from process signals and provides a compact representation. The effectiveness of a signal processing technique depends to a large extent on the nature of the signals involved; a technique that works for specific signal trends might not be effective for others. Moreover, in the pre-processing stage, signal extension has been identified as the critical factor influencing signal representation and the retention of trends. This paper introduces a new algorithm for solving these problems in sensor signal monitoring and control: the New Extension Technique (NET), which provides an accurate wavelet decomposition irrespective of the nature of the signal. The method uses a statistical approach to provide a good approximation of the signal outside its boundaries, depending on the signal trends at the boundaries. Different statistical approaches were adopted for this purpose, and four extension methods, NET A, NET B, NET C and NET D, were introduced in order to ascertain which provides a reliable extension in all cases. The concept behind these methods is the same: the signal samples close to the boundary are considered and a mean value is determined, with the procedure for determining this mean value differing between the four methods. The signal is then extended by making it symmetric with respect to that mean value and inverting it.
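Under our reading of the abstract, the common idea can be sketched as mirroring the signal about a boundary mean before decomposition; the averaging window is an assumed parameter, and this is not any one of NET A-D specifically:

```python
import numpy as np
import pywt

def net_style_extend(signal, n_ext, n_avg=8):
    """Extend each boundary by inverting (mirroring) the signal about the
    mean of its nearest n_avg samples, so the extension follows the local
    trend instead of a fixed symmetric/periodic rule."""
    left_m = signal[:n_avg].mean()
    right_m = signal[-n_avg:].mean()
    left = 2 * left_m - signal[n_ext:0:-1]           # mirror about left mean
    right = 2 * right_m - signal[-2:-n_ext - 2:-1]   # mirror about right mean
    return np.concatenate([left, signal, right])

x = np.cumsum(np.random.randn(256))                  # a drifting test signal
coeffs = pywt.wavedec(net_style_extend(x, 16), 'db4')
```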
Groupwise surface correspondence using particle filtering
Guangxu Li, Hyoungseop Kim, Joo Kooi Tan, et al.
To obtain an effective interpretation of organic shape using statistical shape models (SSMs), the correspondence of landmarks across all training samples is the most challenging part of model building. In this study, a coarse-to-fine groupwise correspondence method for 3-D polygonal surfaces is proposed. We prepare a reference model in advance, and all training samples are mapped to a unified spherical parameter space. According to the positions of the landmarks of the reference model, candidate regions for correspondence are chosen. Finally, we refine the perceptually correct correspondences between landmarks using a particle filter algorithm, where the likelihood of local surface features is introduced as the criterion. The proposed method was applied to the correspondence of 9 left lung training samples. Experimental results show that the proposed method is flexible and under-constrained.
Using neural networks in remote sensing monitoring of exogenous processes
Ruslan Sharapov, Alexey Varlamov
This paper considers the use of remote sensing for monitoring exogenous processes. Satellite observations can be used to detect newly formed landslides, landslips and karst collapses. Practice shows that satellite images of the same area taken at different times can differ significantly from each other. For this reason, it is necessary to correct the images to bring them to the same form, removing the impact of changes in weather conditions, etc. In addition, clouds need to be detected in the images, since they interfere with image analysis. Manifestations of exogenous processes can be detected after these steps. Neural networks can be used for both image correction and object detection. The paper presents an algorithm for image correction and the structure of a neural network.
Barcode localization with region based gradient statistical analysis
Zhiyuan Chen, Yuming Zhao
Barcodes, as a data representation method, have been adopted in a wide range of areas. Especially with the rise of smartphones and hand-held devices equipped with high-resolution cameras and great computational power, barcode techniques have found ever more extensive applications. In industrial settings, barcode reading systems are required to be robust to blur, illumination change, pitch, rotation and scale change. This paper presents a new idea for localizing barcodes based on region-based gradient statistical analysis. On this basis, four algorithms have been developed for Linear, PDF417, Stacked 1D1D and Stacked 1D2D barcodes, respectively. Evaluated on our challenging dataset of more than 17000 images, our methods achieve an average localization accuracy of 82.17% across 8 kinds of distortion, within an average time of 12 ms.
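A minimal version of the region-based gradient statistic for 1-D codes: blocks whose gradient energy is strongly anisotropic are barcode candidates (block size and ratio threshold are assumptions):

```python
import cv2
import numpy as np

def barcode_blocks(gray, block=32, ratio_thresh=3.0):
    """Flag blocks with strongly anisotropic gradients: bar patterns have
    much more energy across the bars than along them."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    h, w = gray.shape
    mask = np.zeros((h // block, w // block), dtype=bool)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            sx = np.abs(gx[i*block:(i+1)*block, j*block:(j+1)*block]).sum()
            sy = np.abs(gy[i*block:(i+1)*block, j*block:(j+1)*block]).sum()
            big, small = max(sx, sy), min(sx, sy)
            mask[i, j] = big > ratio_thresh * max(small, 1.0)
    return mask   # connected True blocks are barcode candidates
```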
An artificial target location method for Curiosity rover
Ying Li, Jing Peng, Ying Du
Template matching is a common method for object recognition and location, but its premise is that the target should not change much in shape from the template image; when non-coplanar rotation exists, traditional template matching is helpless. By analyzing the artificial target of the Curiosity rover, a two-step artificial target location method is proposed. First, a least-squares ellipse fitting method is used to recognize the artificial target in the image and locate the center of each ellipse preliminarily. Second, according to the preliminary ellipse fitting result, the image is cut into pieces such that each piece contains only one ellipse, and the Hough transform is then used to locate the center of the artificial target precisely. Before edge detection, mathematical morphology is applied to remove the influence of shadows in the image, and the Otsu algorithm is used to choose the threshold of the Canny edge detector adaptively. Experiments carried out on artificial target images from the Curiosity rover show the robustness of the algorithm under non-ideal illumination, with a location accuracy within 1 pixel.
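The first stage, least-squares ellipse fitting, maps directly onto OpenCV (the sketch assumes OpenCV 4's findContours signature and a pre-thresholded binary image):

```python
import cv2

def fit_target_ellipses(binary):
    """Preliminary localization: least-squares ellipse fits on the
    contours of a thresholded (e.g. Otsu + Canny) binary image."""
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # fitEllipse needs at least 5 contour points
    return [cv2.fitEllipse(c) for c in contours if len(c) >= 5]
    # each result is ((cx, cy), (major, minor), angle)
```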
Target confirmation and relocation using the correlation filter in mean shift tracking
Yi Song, Shuxiao Li, Hongxing Chang
Accurate localization of the target is critical for robust visual tracking. This paper addresses target position confirmation and relocation in mean shift tracking, and proposes a novel method that integrates a MOSSE-based correlation filter into the mean shift tracker to obtain accurate localization. To confirm whether the estimated location of the target is accurate, four measures are evaluated; if the proposed conditions for relocating the target are satisfied, the estimated target position is adjusted to be more accurate. When the target is occluded, a relocation approach based on the correlation filter finds the target after the occlusion. The target model and the filter template are updated in each frame according to the evaluation results for the estimated target. Experimental results show that integrating the correlation filter helps the mean shift tracker locate and relocate the target well.
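For reference, the MOSSE correlation filter maintains Fourier-domain numerator and denominator sums; a minimal sketch (the learning rate follows the original MOSSE paper and is an assumption here):

```python
import numpy as np

def mosse_update(A, B, patch, gauss_target, lr=0.125):
    """Accumulate the MOSSE numerator A and denominator B in the Fourier
    domain; the (conjugate) filter is H* = A / B."""
    F = np.fft.fft2(patch)                 # new training patch
    G = np.fft.fft2(gauss_target)          # desired Gaussian response
    A = lr * (G * np.conj(F)) + (1 - lr) * A
    B = lr * (F * np.conj(F)) + (1 - lr) * B
    return A, B

def mosse_response(A, B, patch, eps=1e-5):
    """Correlation response over a search patch; its peak relocates the
    target and its sharpness can confirm the mean-shift estimate."""
    H_conj = A / (B + eps)
    return np.real(np.fft.ifft2(np.fft.fft2(patch) * H_conj))
```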
A novel color filter array and demosaicking algorithm for hexagonal grids
Alexander Fröhlich, Andreas Unterweger
We propose a new color filter array for hexagonal sampling grids and a corresponding demosaicking algorithm. By exploiting properties of the human visual system in their design, we show that our proposed color filter array and its demosaicking algorithm are able to outperform the widely used Bayer pattern with state-of-the-art demosaicking algorithms in terms of both objective and subjective image quality.
Information Systems and Image Processing Applications
Privacy protection in surveillance systems based on JPEG DCT baseline compression and spectral domain watermarking
Thomas Sablik, Jörg Velten, Anton Kummert
A novel system for automatic privacy protection in digital media, based on spectral domain watermarking and JPEG compression, is described in the present paper. In a first step, private areas are detected: the presented method uses Haar cascades to detect faces, integral images are used to speed up the calculations, and multiple detections of one face are combined. Succeeding steps comprise embedding the data into the image as part of JPEG compression using spectral domain methods and protecting the area of privacy. The embedding process is integrated into and adapted to JPEG compression: a spread spectrum watermarking method is used to embed the size and position of the private areas into the cover image. Different embedding methods are compared regarding their robustness, and the performance of the method on tampered images is presented.
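The Haar-cascade detection stage corresponds directly to OpenCV's stock detector (the integral-image acceleration is internal to it); the cascade choice and parameters below are illustrative:

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

def private_areas(frame):
    """Detect faces as the private areas to protect; minNeighbors merges
    multiple detections of one face into a single rectangle."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    # returns a list of (x, y, w, h) rectangles
```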
Intelligent elevator management system using image processing
In the modern era, the increase in the number of shopping malls and industrial buildings has led to an exponential increase in the use of elevator systems, and thus an increased need for an effective control system to manage them. This paper introduces an effective method to control the movement of elevators by locating waiting passengers and dispatching elevators based on various conditions such as load and proximity. The method continuously monitors the weight limit of each elevator while also using image processing to determine the number of persons waiting for an elevator on each floor; the Canny edge detection technique is used to count the waiting persons. The algorithm thus takes many cases into account and selects the correct elevator to serve the persons waiting on different floors.
Fast color image matting by online active contour model
Xiaomin Xie, Changming Wang, Aijun Zhang
Compared with grayscale images, color images contain more useful information. In this paper, an online active contour model for color image matting is proposed, in which the objects of color images are detected according to their colors. The new scheme first identifies the objects to be segmented by setting the initial contour. Then a new energy functional, based on the intensities in each channel, is minimized through an efficient level set formulation, so fewer iterations and less calculation time are needed. Finally, morphological opening and closing operations are adopted for regularization. Experimental results demonstrate the efficiency and effectiveness of the proposed approach compared with current active contour models.
Image authentication via sparsity-based phase-shifting digital holography
Wen Chen, Xudong Chen
Digital holography has been widely studied in recent years, and a number of applications have been demonstrated. In this paper, we demonstrate that sparsity-based phase-shifting digital holography can be applied to image authentication. In phase-shifting digital holography, the holograms are recorded sequentially, and only small parts of each hologram are made available for numerical reconstruction. It is found that a nonlinear correlation algorithm can then be applied to authenticate the reconstructed object simply, and the results illustrate that the recovered image can be correctly verified. In the developed system, the recorded holograms are highly compressed, which facilitates data storage and transmission, and a simple authentication strategy is established instead of applying relatively complex algorithms (such as compressive sensing) to recover the object.
The optimization algorithm based knot and control point automatic adjustment
Xingyue Jia, Xiuyang Zhao
Addressing point cloud and mesh models that can be approximated by cubic B-spline surfaces, an algorithm for optimizing the knot vector based on the Gaussian Mixture Model (GMM) is proposed in this paper. In addition, the control points at sub-corner points are searched by Particle Swarm Optimization (PSO) in the process of stitching two B-spline surfaces with different knot vectors. Compared with conventional B-spline surface skinning, the proposed algorithms have two advantages. First, the global optimum is easy to find by statistically learning and sampling according to the probability distribution of the best individuals. Second, the stitched surface is much smoother and the precision of the approximating surface is higher. The effectiveness of the proposed algorithms has been demonstrated through experimental examples.
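A generic PSO loop of the kind used for the control-point search, as a self-contained sketch (the cost function, the unit-cube search domain and the hyperparameters are problem-specific assumptions):

```python
import numpy as np

def pso(cost, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5):
    """Minimal particle swarm optimizer: particles track their personal
    best and are attracted to the global best found so far."""
    x = np.random.rand(n_particles, dim)      # positions in [0, 1]^dim
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_cost = np.array([cost(p) for p in x])
    gbest = pbest[np.argmin(pbest_cost)].copy()
    for _ in range(iters):
        r1, r2 = np.random.rand(2, n_particles, 1)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        costs = np.array([cost(p) for p in x])
        better = costs < pbest_cost
        pbest[better], pbest_cost[better] = x[better], costs[better]
        gbest = pbest[np.argmin(pbest_cost)].copy()
    return gbest, pbest_cost.min()
```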
Towards relative gradient and its applications
Yang Wang, Hongzhi Liu, Zhonghai Wu
Image gradients, which represent directional changes of pixel values in an image, are widely considered important clues for salient features like edges. However, it is difficult to distinguish edges from details, which also have large gradients, based on gradients alone. In this paper, we propose a novel model called the relative gradient, which can overcome this problem and better distinguish edges from flat regions and details. We demonstrate the effectiveness of our model by improving some representative algorithms, using the relative gradient instead of the traditional gradient, in the contexts of edge detection and non-linear filtering. More applications can be found in image processing, analysis and related tasks.
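The abstract does not give the exact definition, but one plausible reading, gradient magnitude normalized by its local average, can be sketched as follows; the window size and epsilon are assumptions:

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def relative_gradient(img, win=11, eps=1e-3):
    """One possible 'relative gradient': gradient magnitude divided by
    its local mean, so edges stand out against surrounding detail
    regardless of absolute contrast."""
    gx = sobel(img.astype(float), axis=1)
    gy = sobel(img.astype(float), axis=0)
    mag = np.hypot(gx, gy)
    return mag / (uniform_filter(mag, size=win) + eps)
```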
A color constancy model with minimum brightness variance assumption
The realization of color constancy in computer vision is important for recognizing objects under varying light sources. This paper proposes a method to estimate the illuminant under a "minimum brightness variance assumption", which states that the variation of the brightness of the objects is as small as possible. Under this assumption, the illuminant is estimated to be red, for example, when the red parts of the objects in the scene are bright. In detail, we define an evaluation function that calculates the variance of the brightness in the scene, and we minimize this function to estimate the color of the illuminant and the colors of the objects. We conducted experiments with synthetic images and confirmed that the proposed method works well to reduce the influence of the illuminant on the objects in the scene.
Robust interest points matching based on local description and spatial constraints
Hana Gharbi, Sahbi Bahroun, Ezzeddine Zagrouba
Matching of interest points is a key and essential step in image description and search with local features. In this paper, we present a new matching method based on the prediction-validation principle, which matches pairs of interest points using their local descriptions together with added spatial constraints. The proposed method is independent of the detection process, in order to obtain robust matches under different changes such as scale, orientation and illumination. Our matching method consists of two main steps: the first computes local features around the interest points; the second adds spatial constraints in order to enhance the robustness of the matches. Experiments show that the proposed method produces robust matches with higher repeatability and reasonable computational efficiency compared to some state-of-the-art algorithms.
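A sketch of the two steps with a generic descriptor matcher: a Lowe-style ratio test on the local descriptions, then a simple bounded-displacement spatial constraint standing in for the paper's constraints:

```python
import cv2
import numpy as np

def constrained_matches(kp1, des1, kp2, des2, ratio=0.75, max_disp=50.0):
    """Ratio-test matching on local descriptors plus a spatial check."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) < 2:
            continue
        m, n = pair
        if m.distance < ratio * n.distance:          # local-description test
            p1 = np.array(kp1[m.queryIdx].pt)
            p2 = np.array(kp2[m.trainIdx].pt)
            if np.linalg.norm(p1 - p2) < max_disp:   # spatial constraint
                good.append(m)
    return good
```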
A new improved local Chan-Vese model
Ming Shen, Yiping Wu
Based on local image information, we propose a new improved local active contour model to segment inhomogeneous images. Unlike the improved Chan-Vese (ICV) and local Chan-Vese (LCV) models, the level set evolution equation of the proposed model is an ordinary differential equation; without the mean curvature and other complicated difference terms, the implementation becomes simpler by employing a finite difference scheme, and the efficiency of global segmentation is dramatically improved. Experimental results on synthetic images as well as real medical images demonstrate the segmentation accuracy, efficiency and robustness of the proposed method.
Visible-spectrum remote eye tracker for gaze communication
Takashi Imabuchi, Oky Dicky Ardiansyah Prima, Hikaru Kikuchi, et al.
Many approaches have been proposed to build eye trackers based on visible-spectrum imaging. These efforts make it possible to create inexpensive eye trackers capable of operating outdoors. Although the resulting tracking accuracy is acceptable for a visible-spectrum head-mounted eye tracker, such approaches face many limitations when used to build a remote eye tracker. In this study, we propose a high-accuracy remote eye tracker that uses visible-spectrum imaging, together with several gaze communication interfaces suited to the tracker. The gaze communication interfaces are designed to assist people with motor disabilities. Our results show that the proposed eye tracker achieved an average accuracy of 0.77° at a frame rate of 28 fps on a personal computer, and an average accuracy of 0.82° at a frame rate of 25 fps on a tablet device. The proposed gaze communication interfaces enable users to type a complete sentence of eleven Japanese characters in about a minute.
Crystallization mosaic effect generation by superpixels
Yuqi Xie, Pengbo Bo, Ye Yuan, et al.
Generating art effects from digital images with computational tools has been a hot research topic in recent years. We propose a new method for generating crystallization mosaic effects from color images. Two key problems in generating a pleasant mosaic effect are studied: grouping pixels into mosaic tiles, and arranging the tiles to adapt to image features. To obtain a visually pleasant effect, we create mosaic tiles by clustering pixels in a feature space of color information, while also taking the compactness of the tiles into consideration. Moreover, we propose a method for processing feature boundaries in images, which guides the arrangement of mosaic tiles near image features. The method yields mosaic tiles of nearly uniform shape that adapt to feature lines in an aesthetic way. Because the approach considers both the color distance and the Euclidean distance between pixels, it produces mosaic tiles in a more pleasing manner. Experiments demonstrate the computational efficiency of the method and its capability of generating visually pleasant mosaic tiles. Comparisons with existing approaches show the superiority of the new method.
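As a rough illustration, the sketch below uses SLIC superpixels, which cluster pixels by a combination of color distance and spatial (Euclidean) distance as the abstract describes, and paints each tile with its mean color. The segmentation parameters and rendering are illustrative; the paper's feature-boundary handling is not reproduced.

```python
# Illustrative mosaic via SLIC superpixels (color + spatial clustering),
# with each tile painted in its mean color; parameters are assumptions.
import numpy as np
from skimage import io, segmentation

img = io.imread('input.png')[:, :, :3]
labels = segmentation.slic(img, n_segments=800, compactness=20)
mosaic = np.zeros_like(img, dtype=float)
for lab in np.unique(labels):
    mask = labels == lab
    mosaic[mask] = img[mask].mean(axis=0)  # paint tile with its mean color
io.imsave('mosaic.png', mosaic.astype(np.uint8))
```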
The development of automated behavior analysis software
Yuki Jaana, Oky Dicky Ardiansyah Prima, Takashi Imabuchi, et al.
Measuring the behavior of participants in a conversation scene involves both verbal and nonverbal communication. The validity of such measurements may vary across observers owing to factors such as human error, poorly designed measurement systems, and inadequate observer training. Although systems have been introduced in previous studies to measure these behaviors automatically, they prevent participants from talking in a natural way. In this study, we propose a software application that automatically analyzes participants' behaviors, including utterances, facial expressions (happy or neutral), head nods, and poses, using only a single omnidirectional camera. The camera is small enough to be embedded into a table, allowing participants to hold a spontaneous conversation. The proposed software uses facial feature tracking based on a constrained local model to observe changes in the facial features captured by the camera, and the Japanese female facial expression database to recognize expressions. Our experimental results show significant correlations between the measurements obtained by human observers and by the software.
Learning historical heritage with a serious game: a user study of Heerlen Roman bathhouse
Wen Qi
Advances in computer games have shown their potential for developing edutainment content and services. Cultural heritage sites often use games to complement existing presentations and to create a memorable exhibition. Games offer opportunities to reorganize and conceptualize historical, cultural and technological information and knowledge about the exhibits. To demonstrate the benefits of serious games in facilitating learning activities, we designed a video game about the Heerlen Roman bathhouse heritage. This paper explains the design considerations of the game, with a particular focus on the link between gameplay and learning. In addition, we carried out a user study to observe and measure the learning effects of the game, collecting both quantitative and qualitative data to analyze the learners' performance. The results show that the game can indeed help learners understand important historical facts and related knowledge about the heritage being studied. Future directions include converting the first-person game into a third-person or multiplayer game.