High accuracy image restoration method for seeing through water
Author(s):
Kalyan K. Halder;
Murat Tahtali;
Sreenatha G. Anavatti
This paper presents an algorithm for recovering an image from a sequence of distorted versions of it, where the distortions are caused by a wavy water surface. A robust non-rigid image registration technique is employed to determine the pixel shift maps of all the frames in the sequence against a reference frame. An iterative image dewarping algorithm is applied to correct the geometric distortions of the sequence. A non-local means filter is used to mitigate noise and improve the signal-to-noise ratio (SNR). The performance of our proposed method is compared against the state-of-the-art method. Results show that our proposed method performs significantly better in removing geometric distortions of the underwater images.
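A minimal sketch of the dewarping-and-denoising stage described above, assuming the per-frame pixel shift maps have already been estimated by the registration step; the function names, interpolation settings and filter parameters are illustrative choices, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates
from skimage.restoration import denoise_nl_means

def dewarp_frame(frame, shift_x, shift_y):
    """Resample `frame` by pulling each pixel back along its estimated shift."""
    h, w = frame.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float64)
    # Pixel (y, x) of the corrected frame is read from (y + shift_y, x + shift_x).
    coords = np.stack([yy + shift_y, xx + shift_x])
    return map_coordinates(frame, coords, order=3, mode='nearest')

def restore_sequence(frames, shift_maps_x, shift_maps_y):
    """Dewarp every frame towards the reference, average, then denoise."""
    dewarped = [dewarp_frame(f, sx, sy)
                for f, sx, sy in zip(frames, shift_maps_x, shift_maps_y)]
    fused = np.mean(dewarped, axis=0)
    # Non-local means step to improve the SNR of the fused result.
    return denoise_nl_means(fused, h=0.05, patch_size=5, patch_distance=6)
```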
Full-body gestures and movements recognition: user descriptive and unsupervised learning approaches in GDL classifier
Author(s):
Tomasz Hachaj;
Marek R. Ogiela
Gesture Description Language (GDL) is a classifier that enables syntactic description and real-time recognition of full-body gestures and movements. Gestures are described in a dedicated computer language named Gesture Description Language script (GDLs). In this paper we introduce new GDLs formalisms that enable recognition of selected classes of movement trajectories. The second novelty is a new unsupervised learning method with which it is possible to automatically generate GDLs descriptions. We have initially evaluated both proposed extensions of GDL and obtained very promising results. Both the novel methodology and the evaluation results are described in this paper.
Distortion operator kernel and accuracy of iterative image restoration
Author(s):
Artyom Makovetskii;
Vitaly Kober
Variational functionals are commonly used for restoration of images distorted by a linear operator. In order to minimize a functional, the gradient descent method can be used. In this paper, we analyze the performance of the gradient descent method in the frequency domain and show that the method converges to the sum of the original undistorted function and the kernel function of a linear distortion operator. For uniform linear degradation, the kernel function is oscillating. It is shown that the use of metrical as well as topological characteristics can improve restoration quality. Computer simulation results are provided to illustrate the performance of the proposed algorithm.
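As an illustration of the analysis above, the sketch below runs plain gradient descent on the data-fidelity term ||h * f − g||² in the frequency domain; the step size, iteration count and kernel handling are assumptions for a normalized, centered blur kernel, not the authors' exact scheme.

```python
import numpy as np

def gradient_descent_deblur(g, h, tau=1.0, n_iter=200):
    """g: blurred image; h: blur kernel zero-padded to g.shape and centered; tau: step size."""
    G = np.fft.fft2(g)
    H = np.fft.fft2(np.fft.ifftshift(h))   # kernel assumed centered before the shift
    F = G.copy()                            # start from the observed image
    for _ in range(n_iter):
        grad = np.conj(H) * (H * F - G)     # frequency-domain gradient of the functional
        F = F - tau * grad
    return np.real(np.fft.ifft2(F))
```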
Nonlinear filtering for character recognition in low quality document images
Author(s):
Julia Diaz-Escobar;
Vitaly Kober
Optical character recognition in scanned printed documents is a well-studied task, where capture conditions such as sheet position, illumination, contrast and resolution are controlled. Nowadays, it is more practical to use mobile devices for document capture than a scanner. As a consequence, the quality of document images is often poor owing to the presence of geometric distortions, nonhomogeneous illumination, low resolution, etc. In this work we propose to use multiple adaptive nonlinear composite filters for detection and classification of characters. Computer simulation results obtained with the proposed system are presented and discussed.
Color image restoration based on camera microscanning
Author(s):
José L. López Martínez;
Vitaly Kober;
Manuel Escalante-Torres
In this work, we propose a method to restore color images from a set of degraded color images obtained with a microscanning imaging system. Using the set of observed images, image restoration is carried out by solving a system of equations that is derived from optimization of an objective function. Since the proposed method possesses a high computational complexity, a fast algorithm is developed. Experimental and computer simulation results obtained with the proposed method are analyzed in terms of restoration accuracy and tolerance to additive input noise.
Study of adaptive correlation filter synthesis guided by the peak and shape of the correlation output
Author(s):
Oliver G. Campos Trujillo;
Gerardo Díaz Blancas
In recent years, many proposals that take an adaptive perspective have been developed to address drawbacks such as geometric distortions, background noise and target discrimination. Their metrics are based only on the correlation peak output for filter synthesis. In this paper, the correlation shape is studied in order to implement adaptive correlation filters guided by both the peak and the shape of the correlation output. Furthermore, the shape of the correlation output is used to improve the search in the filter bank. In addition, parallel algorithms are developed to accelerate the search in the filter bank. Results such as synthesis time, filter performance and comparisons with other adaptive correlation filter proposals are shown.
Digital deblurring based on linear-scale differential analysis
Author(s):
Vitali Bezzubik;
Nikolai Belashenkov;
Gleb V. Vdovin
A novel method of sharpness improvement is proposed for digital images. The method is realized via linear multi-scale analysis of the source image and subsequent synthesis of the restored image. The analysis comprises the computation of intensity gradient values using special filters that provide simultaneous edge detection and noise filtering. Restoration of image sharpness is achieved by simple subtraction of a discrete recovery function from the blurred image. This recovery function is calculated as a sum of several normalized gradient responses found by the linear multi-scale analysis, using a spatial transposition of the gradient response values relative to the zero-crossing points of the first derivatives of the gradients. The proposed method restores the sharpness of edges in a digital image without an additional spatial noise filtering operation or a priori knowledge of the blur kernel.
Prediction-guided quantization for video tone mapping
Author(s):
Agnès Le Dauphin;
Ronan Boitard;
Dominique Thoreau;
Yannick Olivier;
Edouard Francois;
Fabrice LeLéannec
Tone Mapping Operators (TMOs) compress High Dynamic Range (HDR) content to address Low Dynamic Range (LDR) displays. However, before reaching the end-user, this tone mapped content is usually compressed for broadcasting or storage purposes. Any TMO includes a quantization step to convert floating point values to integer ones. In this work, we propose to adapt this quantization, in the loop of an encoder, to reduce the entropy of the tone mapped video content. Our technique provides an appropriate quantization for each mode of both the intra- and inter-prediction that is performed in the loop of a block-based encoder. The mode that minimizes a rate-distortion criterion uses its associated quantization to provide integer values for the rest of the encoding process. The method has been implemented in HEVC and was tested in two different scenarios: the compression of tone mapped LDR video content (using the HM10.0) and the compression of perceptually encoded HDR content (HM14.0). Results show average bit-rate reductions at the same PSNR, over all the sequences and TMOs considered, of 20.3% and 27.3% for tone mapped content and of 2.4% and 2.7% for HDR content.
Performance evaluation of objective quality metrics for HDR image compression
Author(s):
Giuseppe Valenzise;
Francesca De Simone;
Paul Lauga;
Frederic Dufaux
Due to the much larger luminance and contrast characteristics of high dynamic range (HDR) images, well-known objective quality metrics, widely used for the assessment of low dynamic range (LDR) content, cannot be directly applied to HDR images in order to predict their perceptual fidelity. To overcome this limitation, advanced fidelity metrics, such as the HDR-VDP, have been proposed to accurately predict visually significant differences. However, their complex calibration may make them difficult to use in practice. A simpler approach consists in computing arithmetic or structural fidelity metrics, such as PSNR and SSIM, on perceptually encoded luminance values but the performance of quality prediction in this case has not been clearly studied. In this paper, we aim at providing a better comprehension of the limits and the potentialities of this approach, by means of a subjective study. We compare the performance of HDR-VDP to that of PSNR and SSIM computed on perceptually encoded luminance values, when considering compressed HDR images. Our results show that these simpler metrics can be effectively employed to assess image fidelity for applications such as HDR image compression.
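A minimal sketch of the "simpler metrics" idea: compute PSNR on perceptually encoded luminance. A logarithmic encoding is used here purely as a stand-in for the perceptually uniform encoding referenced in the paper, and the luminance range is an assumed one.

```python
import numpy as np

def perceptual_encode(lum, l_min=1e-2, l_max=1e4):
    """Map absolute luminance (cd/m^2) to [0, 1] on a logarithmic scale."""
    lum = np.clip(lum, l_min, l_max)
    return (np.log10(lum) - np.log10(l_min)) / (np.log10(l_max) - np.log10(l_min))

def encoded_psnr(ref_lum, test_lum):
    """PSNR computed on the perceptually encoded luminance values."""
    r, t = perceptual_encode(ref_lum), perceptual_encode(test_lum)
    mse = np.mean((r - t) ** 2)
    return 10.0 * np.log10(1.0 / mse)
```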
Crowdsourcing evaluation of high dynamic range image compression
Author(s):
Philippe Hanhart;
Pavel Korshunov;
Touradj Ebrahimi
Crowdsourcing is becoming a popular cost-effective alternative to lab-based evaluations for subjective quality assessment. However, crowd-based evaluations are constrained by the limited availability of display devices used by typical online workers, which makes the evaluation of high dynamic range (HDR) content a challenging task. In this paper, we investigate the feasibility of using low dynamic range versions of original HDR content obtained with tone mapping operators (TMOs) in crowdsourcing evaluations. We conducted two crowdsourcing experiments by employing workers from the Microworkers platform. In the first experiment, we evaluate five HDR images encoded at different bit rates with the upcoming JPEG XT coding standard. To find the most suitable TMO, we create eleven tone-mapped versions of these five HDR images by using eleven different TMOs. The crowdsourcing results are compared to a reference ground truth obtained via a subjective assessment of the same HDR images on a Dolby `Pulsar' HDR monitor in a laboratory environment. The second crowdsourcing evaluation uses semantic differentiators to better understand the characteristics of the eleven different TMOs. The crowdsourcing evaluations show that some TMOs are more suitable than others for the evaluation of HDR image compression.
Evaluation of privacy in high dynamic range video sequences
Author(s):
Martin Řeřábek;
Lin Yuan;
Lukáš Krasula;
Pavel Korshunov;
Karel Fliegel;
Touradj Ebrahimi
The ability of high dynamic range (HDR) imaging to capture details in environments with high contrast has a significant impact on privacy in video surveillance. However, the extent to which HDR imaging affects privacy, when compared to typical low dynamic range (LDR) imaging, is neither well studied nor well understood. To study this question, a suitable dataset of images and video sequences is needed. Therefore, we have created a publicly available dataset of HDR video for privacy evaluation, PEViD-HDR, which is an HDR extension of the existing Privacy Evaluation Video Dataset (PEViD). The PEViD-HDR video dataset can help in the evaluation of privacy protection tools, as well as in demonstrating the importance of HDR imaging in video surveillance applications and its influence on the privacy-intelligibility trade-off. We conducted a preliminary subjective experiment demonstrating the usability of the created dataset for the evaluation of privacy issues in video. The results confirm that a tone-mapped HDR video contains more privacy-sensitive information and details compared to a typical LDR video.
I-vectors for image classification
Author(s):
David C. Smith
Recent state-of-the-art work on speaker recognition and verification uses a simple factor analysis to derive a low-dimensional "total variability space" which simultaneously captures speaker and channel variability. This approach simplified earlier work using joint factor analysis to separately model speaker and channel differences. Here we adapt this "i-vector" method to image classification by replacing speakers with image categories, voice cuts with images, and cepstral features with SURF local descriptors, where the role of channel variability is attributed to differences in image backgrounds or lighting conditions. A universal Gaussian mixture model (UGMM) is trained (unsupervised) on SURF descriptors extracted from a varied and extensive image corpus. Individual images are modeled by additively perturbing the supervector of stacked means of this UGMM by the product of a low-rank total variability matrix (TVM) and a normally distributed hidden random vector, X. The TVM is learned by applying an EM algorithm to maximize the sum of log-likelihoods of descriptors extracted from training images, where the likelihoods are computed with respect to the GMM obtained by perturbing the UGMM means via the TVM as above, leaving the UGMM covariances unchanged. Finally, the low-dimensional i-vector representation of an image is the expected value of the posterior distribution of X conditioned on the image's descriptors, and is computed via straightforward matrix manipulations involving the TVM and image-specific Baum-Welch statistics. We compare classification rates found with (i) i-vectors, (ii) PCA, (iii) Discriminant Attribute Projection (the last two trained on Gaussian MAP-adapted supervector image representations), and (iv) replacing the TVM with the matrix of dominant PCA eigenvectors before i-vector extraction.
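The i-vector extraction step described above follows the standard total variability formulation; a compact sketch under that assumption is given below, with illustrative variable names for the Baum-Welch statistics.

```python
import numpy as np

def extract_ivector(T, Sigma, N, F):
    """
    T     : (C*D, R) total variability matrix (C mixtures, D-dim features, rank R)
    Sigma : (C*D,)   stacked diagonal UGMM covariances
    N     : (C,)     zeroth-order Baum-Welch statistics of the image's descriptors
    F     : (C*D,)   centered first-order statistics (UGMM means already subtracted)
    Returns the R-dimensional i-vector: the posterior mean of the hidden vector X.
    """
    C = N.shape[0]
    D = T.shape[0] // C
    N_big = np.repeat(N, D)                 # one occupancy value per supervector dimension
    Tt_Sinv = T.T / Sigma                   # T^T Sigma^{-1}
    L = np.eye(T.shape[1]) + Tt_Sinv @ (N_big[:, None] * T)   # posterior precision
    return np.linalg.solve(L, Tt_Sinv @ F)  # posterior mean = L^{-1} T^T Sigma^{-1} F
```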
Classification and quantification of suspended dust from steel plants by using color and transmission image analysis
Author(s):
Yoshiyuki Umegaki;
Akira Kazama;
Yoshinori Fukuda
Several kinds of dust can arise from the ironmaking and steelmaking processes in steel works. In JFE Steel's steel plants, various measures have been taken to prevent the suspended dust from scattering to the surrounding area. To take effective preventive measures against dust scattering, it is important to identify dust sources and scattering routes through extensive observation and analysis of the dust particles. Conventionally, dust particles were sampled at many observation points in and around JFE's plants and the amount of particles of each kind was measured visually through a microscope. This approach, however, is inefficient when many dust samples must be measured, and the accuracy of the results depends on the operator. To achieve efficient, operator-independent measurement, a system that can classify and quantify the dust particles automatically has been developed [1]. The system extracts particles from color images of the dust and classifies the particles into three color types: black particles (coke, coal), red particles (iron ore, sintered ore) and white particles (slag, lime). These processes are done in the YCrCb color space, where colors are represented by luminance (Y) and chrominance (Cr and Cb). The YCrCb color space is better suited than the RGB color space for distinguishing the three color types. The thresholds for the classification are set automatically on the basis of the mean values of the luminance and chrominance in each image, so there is no need to tune the thresholds to each image manually. This scheme makes the results independent of the operator. Quick analysis is also realized because the operators only have to capture the images of the dust; the analysis itself is fully automated. Classification results of the sampled particles obtained by the developed system, and the resulting statistics in terms of color type, approach direction and diameter, are shown.
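A hedged sketch of the color-type decision: convert the RGB pixels of a segmented particle to YCrCb and threshold the mean luminance and chrominance. The conversion follows the usual BT.601 definition, while the threshold values and rules below are illustrative placeholders rather than the system's image-adaptive thresholds.

```python
import numpy as np

def rgb_to_ycrcb(rgb):
    """BT.601 luminance/chrominance from RGB values on a 0-255 scale."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 0.713 * (r - y)
    cb = 0.564 * (b - y)
    return y, cr, cb

def classify_particle(pixels_rgb):
    """pixels_rgb: (N, 3) array of the pixels belonging to one extracted particle."""
    y, cr, _ = rgb_to_ycrcb(pixels_rgb.astype(np.float64))
    y_mean, cr_mean = y.mean(), cr.mean()
    # The real system derives thresholds from the per-image mean Y/Cr/Cb;
    # the fixed values below are placeholders for illustration only.
    if y_mean < 80:
        return 'black (coke, coal)'
    elif cr_mean > 10:
        return 'red (iron ore, sintered ore)'
    else:
        return 'white (slag, lime)'
```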
System for objective assessment of image differences in digital cinema
Author(s):
Karel Fliegel;
Lukáš Krasula;
Petr Páta;
Jiří Myslík;
Josef Pecák;
Marek Jícha
There is high demand for quick digitization and subsequent image restoration of archived film records. Digitization is very urgent in many cases because various invaluable pieces of cultural heritage are stored on aging media. Only selected records can be reconstructed perfectly using painstaking manual or semi-automatic procedures. This paper aims to answer the question of what quality requirements the restoration process must meet so that the visual perception of the digitally restored film is acceptably close to that of the original analog film copy. This knowledge is very important for preserving the original artistic intention of the movie producers. A subjective experiment with artificially distorted images has been conducted in order to determine the visual impact of common image distortions in digital cinema. Typical color and contrast distortions were introduced and the test images were presented to viewers using a digital projector. Based on the outcome of this subjective evaluation, a system for objective assessment of image distortions has been developed and its performance tested. The system utilizes a calibrated digital single-lens reflex camera and subsequent analysis of suitable features of images captured from the projection screen. The evaluation of the captured image data has been optimized to predict differences between the reference and distorted images while achieving high correlation with the results of the subjective assessment. The system can be used to objectively determine the difference between analog film and digital cinema images on the projection screen.
Open source database of images DEIMOS: extension for large-scale subjective image quality assessment
Author(s):
Stanislav Vítek
DEIMOS (Database of Images: Open Source) is an open-source database of images and video sequences for testing, verification and comparison of various image and/or video processing techniques such as compression, reconstruction and enhancement. This paper deals with an extension of the database that allows performing large-scale web-based subjective image quality assessment. The extension implements both administrative and client interfaces. The proposed system is aimed mainly at mobile communication devices and takes advantage of HTML5 technology, which means that participants do not need to install any application and the assessment can be performed using a web browser. The assessment campaign administrator can select images from the large database and then apply rules defined by various test procedure recommendations. The standard test procedures may be fully customized and saved as a template. Alternatively, the administrator can define a custom test using images from the pool and other components, such as evaluation forms and ongoing questionnaires. The image sequence is delivered to the online client, e.g. a smartphone or tablet, as a fully automated assessment sequence, or the viewer can decide on the timing of the assessment if required. Environmental data and viewing conditions (e.g. illumination, vibrations, GPS coordinates, etc.) may be collected and subsequently analyzed.
MTF analysis for coded aperture imaging in a flat panel display
Author(s):
Sungjoo Suh;
Jae-Joon Han;
Dusik Park
In this paper, we analyze the modulation transfer function (MTF) of coded aperture imaging in a flat panel display. The flat panel display with a sensor panel forms lens-less multi-view cameras through the imaging pattern of modified uniformly redundant arrays (MURA) on the display panel. To analyze the MTF of the coded aperture imaging implemented on the display panel, we first mathematically model the encoding process of coded aperture imaging, where the projected image on the sensor panel is modeled as a convolution of the scaled object and a function of the imaging pattern. Then, the system point spread function is determined by incorporating a decoding process that depends on the pixel pitch of the display screen and the decoding function. Finally, the MTF of the system is derived as the magnitude of the Fourier transform of the determined system point spread function. To demonstrate the validity of the mathematically derived MTF, we build a coded aperture imaging system that can capture the scene in front of the display, where the system consists of a display screen and a sensor panel. Experimental results show that the derived MTF of coded aperture imaging in a flat panel display system corresponds well to the measured MTF.
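A generic sketch of the final step, deriving an MTF as the normalized magnitude of the Fourier transform of a system PSF; the sampling assumptions (pixel pitch, horizontal cut) are illustrative and not specific to the display-panel system.

```python
import numpy as np

def mtf_from_psf(psf, pixel_pitch):
    """Return spatial frequencies (cycles per unit of pixel_pitch) and a 1-D horizontal MTF cut."""
    otf = np.fft.fft2(psf / psf.sum())        # optical transfer function of the system
    mtf2d = np.abs(otf)
    mtf2d /= mtf2d[0, 0]                      # normalize so that MTF(0) = 1
    freqs = np.fft.fftfreq(psf.shape[1], d=pixel_pitch)
    half = psf.shape[1] // 2
    return freqs[:half], mtf2d[0, :half]      # non-negative frequencies along the horizontal axis
```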
Subjective evaluation of higher dynamic range video
Author(s):
Philippe Hanhart;
Pavel Korshunov;
Touradj Ebrahimi
High dynamic range (HDR) imaging is able to capture a wide range of luminance values, closer to what the human eye can perceive. However, for capture and display technologies, it is important to answer the question on the significance of higher dynamic range for user preference. This paper answers this question by investigating the added value of higher dynamic range via a rigorous set of subjective experiments using paired comparison methodology. Video sequences at four different peak luminance levels were displayed side-by-side on a Dolby Research HDR RGB backlight dual modulation display (aka ‘Pulsar’), which is capable of reliably displaying video content at 4000 cd/m² peak luminance. The results of the subjective experiment demonstrate that the preference of an average viewer increases logarithmically with the increase in the maximum luminance level at which HDR content is displayed, with 4000 cd/m² being the most attractive option.
Analysis of prediction algorithms for residual compression in a lossy to lossless scalable video coding system based on HEVC
Author(s):
Andreas Heindel;
Eugen Wige;
André Kaup
Lossless image and video compression is required in many professional applications. However, lossless coding results in a high data rate, which leads to a long wait for the user when the channel capacity is limited. To overcome this problem, scalable lossless coding is an elegant solution. It provides a fast accessible preview through a lossy compressed base layer, which can be refined to a lossless output when the enhancement layer is received. This paper therefore presents a lossy to lossless scalable coding system where the enhancement layer is coded by means of intra prediction and entropy coding. Several algorithms are evaluated for the prediction step in this paper. It turns out that Sample-based Weighted Prediction is a reasonable choice for typical consumer video sequences, while the Median Edge Detection algorithm is better suited for medical content from computed tomography. For both types of sequences the efficiency may be further improved by the much more complex Edge-Directed Prediction algorithm. In the best case, only about 2.7% additional data rate in total has to be invested for scalable coding compared to single-layer JPEG-LS compression for typical consumer video sequences. For the medical sequences, scalable coding is even more efficient than JPEG-LS compression for certain QP values.
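The Median Edge Detection (MED) predictor mentioned above is the standard LOCO-I/JPEG-LS predictor; its per-pixel form is shown below for reference.

```python
def med_predict(a, b, c):
    """MED prediction of a pixel from its left (a), upper (b) and upper-left (c) neighbors."""
    if c >= max(a, b):
        return min(a, b)      # edge detected above or to the left: pick the smaller neighbor
    elif c <= min(a, b):
        return max(a, b)      # opposite edge polarity: pick the larger neighbor
    else:
        return a + b - c      # smooth region: planar (gradient) prediction
```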
Method of automatic color rendering settings for machine vision systems
Author(s):
Denis D. Shitov;
Elena V. Gorbunova;
Aleksandr N. Chertov;
Valery V. Korotaev
Today, machine vision systems are widely used to solve various observation and control problems in science, technology and industrial applications. Generally, the color rendering of such systems is sufficient for visual inspection of the process on a screen. However, a color rendering problem arises when the exact color coordinates of the analyzed object must be measured automatically at each point of its surface image. To solve this problem there are a number of methods for setting up and improving the color rendering of machine vision systems. As a rule, they all have the following disadvantages: the interdependence of two or more color rendering settings functions, and the lack of accounting for the lighting conditions of the inspected objects and their possible changes. This paper presents results of the development of a specialized method for setting up the color rendering of machine vision systems. Using this method, the required accuracy of the color analysis of the observed object can be provided. An algorithm for automatic setting of color rendering, and software implementing it, were developed. An experimental study of the algorithm in a variety of lighting conditions and with several different machine vision systems was carried out.
Energy minimization of mobile video devices with a hardware H.264/AVC encoder based on energy-rate-distortion optimization
Author(s):
Donghun Kang;
Jungeon Lee;
Jongpil Jung;
Chul-Hee Lee;
Chong-Min Kyung
In battery-powered mobile video systems, reducing the encoder's compression energy consumption is critical to prolonging battery lifetime. Previous energy-rate-distortion (E-R-D) optimization methods based on a software codec are not suitable for practical mobile camera systems because the energy consumption is too large and the encoding rate is too low. In this paper, we propose an E-R-D model for a hardware codec based on a gate-level simulation framework that measures the switching activity and the energy consumption. From the proposed E-R-D model, an energy minimizing algorithm for mobile video camera sensors has been developed with the GOP (Group of Pictures) size and QP (Quantization Parameter) as run-time control variables. Our experimental results show that the proposed algorithm provides up to 31.76% energy consumption savings while satisfying the rate and distortion constraints.
Comparative assessment of H.265/MPEG-HEVC, VP9, and H.264/MPEG-AVC encoders for low-delay video applications
Author(s):
Dan Grois;
Detlev Marpe;
Tung Nguyen;
Ofer Hadar
The popularity of low-delay video applications has increased dramatically over the last years due to a rising demand for real-time video content (such as video conferencing or video surveillance), and also due to the increasing availability of relatively inexpensive heterogeneous devices (such as smartphones and tablets). To this end, this work presents a comparative assessment of the two latest video coding standards, H.265/MPEG-HEVC (High-Efficiency Video Coding) and H.264/MPEG-AVC (Advanced Video Coding), and of the VP9 proprietary video coding scheme. For evaluating H.264/MPEG-AVC, the open-source x264 encoder was selected, which has a multi-pass encoding mode, similarly to VP9. According to experimental results, which were obtained by using similar low-delay configurations for all three examined representative encoders, it was observed that H.265/MPEG-HEVC provides significant average bit-rate savings of 32.5% and 40.8% relative to VP9 and x264, respectively, for the 1-pass encoding, and average bit-rate savings of 32.6% and 42.2% for the 2-pass encoding. On the other hand, compared to the x264 encoder, typical low-delay encoding times of the VP9 encoder are about 2,000 times higher for the 1-pass encoding and about 400 times higher for the 2-pass encoding.
Joint-layer encoder optimization for HEVC scalable extensions
Author(s):
Chia-Ming Tsai;
Yuwen He;
Jie Dong;
Yan Ye;
Xiaoyu Xiu;
Yong He
Scalable video coding provides an efficient solution to support video playback on heterogeneous devices with various channel conditions in heterogeneous networks. SHVC is the latest scalable video coding standard based on the HEVC standard. To improve enhancement layer coding efficiency, inter-layer prediction, including texture and motion information generated from the base layer, is used for enhancement layer coding. However, the overall performance of the SHVC reference encoder is not fully optimized because the rate-distortion optimization (RDO) processes in the base and enhancement layers are considered independently. It is difficult to directly extend existing joint-layer optimization methods to SHVC due to the complicated coding tree block splitting decisions and the in-loop filtering process (e.g., deblocking and sample adaptive offset (SAO) filtering) in HEVC. To solve these problems, a joint-layer optimization method is proposed that adjusts the quantization parameter (QP) to optimally allocate the bit resource between layers. Furthermore, to allocate resources more appropriately, the proposed method also considers the viewing probability of the base and enhancement layers according to the packet loss rate. Based on the viewing probability, a novel joint-layer RD cost function is proposed for joint-layer RDO encoding. The QP values of those coding tree units (CTUs) belonging to lower layers referenced by higher layers are decreased accordingly, and the QP values of the remaining CTUs are increased to keep the total bits unchanged. Finally, the QP values with minimal joint-layer RD cost are selected to match the viewing probability. The proposed method was applied to the third temporal level (TL-3) pictures in the Random Access configuration. Simulation results demonstrate that the proposed joint-layer optimization method can improve coding performance by 1.3% for these TL-3 pictures compared to the SHVC reference encoder without joint-layer optimization.
Source coding for transmission of reconstructed dynamic geometry: a rate-distortion-complexity analysis of different approaches
Author(s):
Rufael N. Mekuria;
Pablo Cesar;
Dick C. A. Bulterman
Live 3D reconstruction of a human as a 3D mesh with commodity electronics is becoming a reality. Immersive applications (e.g. cloud gaming, tele-presence) benefit from effective transmission of such content over a bandwidth-limited link. In this paper we outline different approaches for compressing live reconstructed mesh geometry based on distributing mesh reconstruction functions between sender and receiver. We evaluate the rate-performance-complexity trade-offs of different configurations. First, we investigate 3D mesh compression methods (dynamic and static) from MPEG-4. Second, we evaluate the option of using octree-based point cloud compression and receiver-side surface reconstruction.
Research on test of product based on spatial sampling criteria and variable step sampling mechanism
Author(s):
Ruihong Li;
Yueping Han
This paper presents an effective approach for online testing of the assembly structures inside products using a multiple-views technique and an X-ray digital radiography system based on spatial sampling criteria and a variable step sampling mechanism. Although there are several objects inside one product to be tested, there is a maximal rotary step for an object within which the smallest structural size to be tested can still be detected. In the offline learning process, the object is rotated by this step and imaged repeatedly until a complete cycle is covered, producing an image sequence that includes the full structural information for recognition. The maximal rotary step is restricted by the smallest structural size and the inherent resolution of the imaging system. During the online inspection process, the program first finds the optimum matches of all the different target parts in the standard sequence, i.e., finds their exact angles within one cycle. Since most of the other targets in the product are larger than the smallest structure, the paper adopts a variable step-size sampling mechanism that rotates the product by specific angles with different steps according to the different objects inside the product and then performs matching. Experimental results show that the variable step-size method can greatly save time compared with the traditional fixed-step inspection method while the recognition accuracy is guaranteed.
Comparison of compression efficiency between HEVC/H.265 and VP9 based on subjective assessments
Author(s):
Martin Řeřábek;
Touradj Ebrahimi
The current increasing effort of broadcast providers to transmit UHD (Ultra High Definition) content is likely to increase the demand for ultra high definition televisions (UHDTVs). To compress UHDTV content, several alternative encoding mechanisms exist. In addition to internationally recognized standards, open access proprietary options, such as the VP9 video encoding scheme, have recently appeared and are gaining popularity. One of the main goals of these encoders is to efficiently compress video sequences beyond HDTV resolution for various scenarios, such as broadcasting or internet streaming. In this paper, a rate-distortion performance analysis and mutual comparison of one of the latest video coding standards, H.265/HEVC, with the recently released proprietary video coding scheme VP9 is presented for a broadcast scenario. In addition, H.264/AVC, currently one of the most popular and widespread encoders, has been included in the evaluation to serve as a comparison baseline. The comparison is performed by means of subjective evaluations showing the actual differences between encoding algorithms in terms of perceived quality. The results indicate a general dominance of the HEVC-based encoding algorithm in comparison to the other alternatives, while VP9 and AVC show similar performance.
Statistical feature selection for enhanced detection of brain tumor
Author(s):
Ahmad Chaddad;
Rivka R. Colen M.D.
Feature-based methods are widely used in brain tumor recognition systems, and robust early cancer detection is one of the most powerful applications of image processing tools. Specifically, statistical features, such as the geometric mean, harmonic mean, mean excluding outliers, median, percentiles, skewness and kurtosis, have been extracted from glioma brain tumors to aid in discriminating two levels, namely Level I and Level II, using the fluid-attenuated inversion recovery (FLAIR) sequence in the diagnosis of brain tumors. Statistical features describe the major characteristics of each glioma level, which is an important step in evaluating the heterogeneity of the pixels in the cancer area. In this paper, we address the task of feature selection to identify the relevant subset of features in the statistical domain, while discarding those that are either redundant or confusing, thereby improving the performance of the feature-based scheme to distinguish between Level I and Level II. We apply a decision structure algorithm to find the optimal combination of nonhomogeneity-based statistical features for the problem at hand. We employ a Naïve Bayes classifier to evaluate the performance of the optimal statistical feature based scheme in terms of its glioma Level I and Level II discrimination capability, using real data collected from 17 patients with glioblastoma multiforme (GBM). The dataset was provided by MD Anderson Cancer Center from a 3 Tesla MR imaging system. For the specific data analyzed, it is shown that the identified dominant features yield higher classification accuracy, with a lower number of false alarms and missed detections, compared to the full statistical feature set. This work proposed and analyzed specific GBM types, Level I and Level II, and the dominant features, which were selected automatically, were considered as aids to prognostic indicators to better determine prognosis from classical imaging studies.
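A sketch of the statistical feature vector listed above, computed from the FLAIR intensities inside a tumor region (assumed positive-valued), with a Gaussian Naive Bayes classifier as the evaluation stage; the trimming fraction and percentile choices are assumptions.

```python
import numpy as np
from scipy import stats
from sklearn.naive_bayes import GaussianNB

def roi_features(intensities):
    """Statistical features of the (positive) FLAIR intensities inside one tumor ROI."""
    x = np.asarray(intensities, dtype=np.float64)
    return np.array([
        stats.gmean(x),            # geometric mean
        stats.hmean(x),            # harmonic mean
        stats.trim_mean(x, 0.1),   # mean excluding outliers (10% trimmed on each tail)
        np.median(x),
        np.percentile(x, 25),
        np.percentile(x, 75),
        stats.skew(x),
        stats.kurtosis(x),
    ])

# Hypothetical usage: build the feature matrix from a list of ROIs and grade labels,
# then fit the Naive Bayes classifier used for Level I / Level II discrimination.
# X = np.vstack([roi_features(roi) for roi in rois])
# clf = GaussianNB().fit(X, labels)
# predictions = clf.predict(X_new)
```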
Facial recognition using composite correlation filters designed by multiobjective combinatorial optimization
Author(s):
Andres Cuevas;
Victor H. Diaz-Ramirez;
Vitaly Kober;
Leonardo Trujillo
Facial recognition is a difficult task due to variations in pose and facial expressions, as well as presence of noise and clutter in captured face images. In this work, we address facial recognition by means of composite correlation filters designed with multi-objective combinatorial optimization. Given a large set of available face images having variations in pose, gesticulations, and global illumination, a proposed algorithm synthesizes composite correlation filters by optimization of several performance criteria. The resultant filters are able to reliably detect and correctly classify face images of different subjects even when they are corrupted with additive noise and nonhomogeneous illumination. Computer simulation results obtained with the proposed approach are presented and discussed in terms of efficiency in face detection and reliability of facial classification. These results are also compared with those obtained with existing composite filters.
Reconstruction of compressive multispectral sensing data using a multilayered conditional random field approach
Author(s):
Farnoud Kazemzadeh;
Mohammad J. Shafiee;
Alexander Wong;
David A. Clausi
The prevalence of compressive sensing is continually growing in all facets of imaging science. Compressive sensing allows for the capture and reconstruction of an entire signal from a sparse (undersampled), yet sufficient, set of measurements that is representative of the target being observed. This compressive sensing strategy reduces the duration of the data capture, the size of the acquired data, and the cost and complexity of the imaging hardware while preserving the necessary underlying information. Compressive sensing systems require the accompaniment of advanced reconstruction algorithms to reconstruct complete signals from the sparse measurements made. Here, a new reconstruction algorithm is introduced specifically for the reconstruction of compressive multispectral (MS) sensing data that allows for high-quality reconstruction from acquisitions at sub-Nyquist rates. We propose a multilayered conditional random field (MCRF) model, which extends the CRF model by incorporating two joint layers of certainty and estimated states. The proposed algorithm treats the reconstruction of each spectral channel as an MCRF given the sparse MS measurements. Since the observations are incomplete, the MCRF incorporates an extra layer determining the certainty of the measurements. The proposed MCRF approach was evaluated using simulated compressive MS data acquisitions, and is shown to enable fast acquisition of MS sensing data with reduced imaging hardware cost and complexity.
Correction of defective pixels for medical and space imagers based on Ising theory
Author(s):
Eliahu Cohen;
Moriel Shnitser;
Tsvika Avraham;
Ofer Hadar
We propose novel models for image restoration based on statistical physics. We investigate the affinity between these fields and describe a framework from which interesting denoising algorithms can be derived: Ising-like models and simulated annealing techniques. When combined with known predictors such as Median and LOCO-I, these models become even more effective. In order to further examine the proposed models we apply them to two important problems: (i) digital cameras in space damaged by cosmic radiation; (ii) ultrasonic medical devices affected by speckle noise. The results, as well as benchmarks and comparisons, suggest in most cases a significant gain in PSNR and SSIM in comparison to other filters.
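As a toy illustration of the Ising-style framework, the sketch below denoises a binary (±1) image with an Ising prior coupled to the observed pixels; iterated conditional modes is used as a simple deterministic stand-in for the simulated-annealing optimization discussed in the paper, and the coupling weights are arbitrary.

```python
import numpy as np

def ising_icm_denoise(noisy, beta=2.0, eta=1.5, n_iter=10):
    """noisy: 2-D array with values in {-1, +1}; beta: neighbor coupling; eta: data coupling."""
    x = noisy.copy()
    h, w = x.shape
    for _ in range(n_iter):
        for i in range(h):
            for j in range(w):
                neighbors = sum(x[i + di, j + dj]
                                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                                if 0 <= i + di < h and 0 <= j + dj < w)
                # Greedy local update: choose the label that lowers the Ising energy,
                # i.e. agrees with the neighborhood (beta) and the observation (eta).
                x[i, j] = 1 if beta * neighbors + eta * noisy[i, j] > 0 else -1
    return x
```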
Estimation of grain size in asphalt samples using digital image analysis
Author(s):
Hanna Källén;
Anders Heyden;
Per Lindh
Asphalt is made of a mixture of stones of different sizes and a binder called bitumen; the size distribution of the stones is determined by the recipe of the asphalt. One quality check of asphalt is to see whether the real size distribution of asphalt samples is consistent with the recipe. This is usually done by first extracting the binder using methylene chloride, then sieving the stones and measuring how much passes each sieve size. Methylene chloride is highly toxic, and it is desirable to find the size distribution in some other way. In this paper we find the size distribution by slicing up the asphalt sample and using image analysis techniques to analyze the cross-sections. First, the stones are segmented from the background (bitumen), and then rectangles are fitted to the detected stones. We then estimate the sizes of the stones using the widths of the rectangles. The result is compared with both the recipe for the asphalt and the result from the standard analysis method, and our method shows good correlation with both.
Interactive alignment and image reconstruction for wafer-level multi-aperture camera systems
Author(s):
Alexander Oberdörster;
Andreas Brückner;
Hendrik P. A. Lensch
Assembly of miniaturized high-resolution cameras is typically carried out by active alignment. The sensor image is constantly monitored while the lens stack is adjusted. When sharpness is acceptable in all regions of the image, the lens position over the sensor is fixed. For multi-aperture cameras, this approach is not sufficient. During prototyping, it is beneficial to see the complete reconstructed image, assembled from all optical channels. However, typical reconstruction algorithms are high-quality offline methods that require calibration. As the geometric setup of the camera repeatedly changes during assembly, this would require frequent re-calibration. We present a real-time algorithm for an interactive preview of the reconstructed image during camera alignment. With this algorithm, systematic alignment errors can be tracked and corrected during assembly. Known imperfections of optical components can also be included in the reconstruction. Finally, the algorithm easily maps to very simple GPU operations, making it ideal for applications in mobile devices where power consumption is critical.
Target tracking using interest point detection and correlation filtering
Author(s):
Leopoldo N. Gaxiola;
Víctor H. Diaz-Ramirez;
Juan J. Tapia
A reliable method for real-time target tracking is presented. The method is based on an interest point detector and a bank of locally adaptive correlation filters. The point detector is used to identify local regions in the observed scene around potential locations of the target. The bank of correlation filters is employed to reliably detect the target and accurately estimate its position within the scene by processing the local regions identified by the detector. Using information from past state estimates of the target, the proposed algorithm predicts the state of the target in the next frame in order to perform fast and accurate tracking by focusing signal processing only on small regions of the scene in each frame. In order to achieve real-time performance, the proposed algorithm is implemented on a graphics processing unit. Experimental results obtained with the proposed method are presented, discussed, and compared with those obtained with a similar state-of-the-art target tracking algorithm.
An algorithm for the characterization of digital images of pigmented lesions of human skin
Author(s):
Laura Y. Mera-González;
José A. Delgado-Atencio;
Juan C. Valdiviezo-Navarro;
Margarita Cunill-Rodríguez
Melanoma is the deadliest form of skin cancer in humans worldwide, with an increasing number of victims every year. One traditional way of diagnosing melanoma is the so-called ABCDE rule, which stands for Asymmetry, Border, Color, Diameter and Evolution of the lesion. For melanoma lesions, color as a descriptor exhibits heterogeneous values, ranging from light brown to dark brown (sometimes bluish red or even white). Therefore, investigating color features from digital melanoma images could provide insights for developing automated algorithms for discriminating melanoma from common nevi. In this research work, an algorithm is proposed and tested to characterize the color in a pigmented lesion. The developed algorithm measures the hue of different sites in the same pigmented area of a digital image using the HSI color space. The algorithm was applied to 40 digital images of unequivocal melanomas and 40 images of common nevi, taken from several databases. Preliminary results indicate that visible color changes of melanoma sites are well accounted for by the proposed algorithm. Other factors, such as image quality and the influence of shiny areas on the results obtained with the proposed algorithm, are also discussed.
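A sketch of the hue computation in the HSI color space referred to above; this is the standard RGB-to-HSI hue formula, and the array layout is an assumption.

```python
import numpy as np

def hsi_hue(rgb):
    """rgb: (..., 3) float array in [0, 1]. Returns the HSI hue in degrees [0, 360)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12   # avoid division by zero
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    return np.where(b > g, 360.0 - theta, theta)              # reflect hue when B > G
```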
On the integer coding profile of JPEG XT
Author(s):
Thomas Richter
JPEG XT (ISO/IEC 18477), the latest standardization initiative of the JPEG committee, defines an image compression standard backward compatible with the well-known JPEG standard (ISO/IEC 10918-1). JPEG XT extends JPEG with features such as coding of images of higher bit depth, coding of floating point image formats and lossless compression, all of which are backward compatible with the legacy JPEG standard. In this work, the author presents profiles of JPEG XT that are especially suited for hardware implementations because they require only integer logic. All functional blocks of a JPEG XT codec are here implemented with integer or fixed point logic. A performance analysis and comparison with other profiles of JPEG XT concludes the work.
Nonlinear multi-scale complex wavelet diffusion based speckle reduction approach for 3D ultrasound images
Author(s):
Muhammad Shahin Uddin;
Murat Tahtali;
Andrew J. Lambert;
Mark R. Pickering;
Margaret Marchese;
Iain Stuart
3D ultrasound imaging has advantages as a non-invasive and a faster examination procedure capable of displaying volume information in real time. However, its resolution is affected by speckle noise. Speckle reduction and feature preservation are seemingly opposing goals. In this paper, a nonlinear multi-scale complex wavelet diffusion based algorithm for 3D ultrasound imaging is introduced. Speckle is suppressed and sharp edges are preserved by applying iterative multi-scale diffusion on the complex wavelet coefficients. The proposed method is validated using synthetic, real phantom, and clinical 3D images, and it is found to outperform other methods in both qualitative and quantitative measures.
To develop a geometric matching method for precision mold alignment machine
Author(s):
Chun-Jen Chen;
Chun-Li Chang;
Wenyuh Jywe
In order to develop a high accuracy optical alignment system for a precision molding machine, a geometric matching method is developed in this paper. The alignment system includes 4 high-magnification lenses, 4 CCD cameras and 4 LED light sources. In the precision molding machine, a bottom metal mold and a top glass mold are used to produce a micro lens. The combination of the two molds does not use any pin or other alignment part; the molds are aligned using only the optical alignment system. In this optical alignment system, an off-axis alignment method is used, and the alignment accuracy of the system is about 0.5 μm. There are 2 cross marks on the top glass mold and 2 cross marks on the bottom metal mold. This paper does not use edge detection to recognize the mark center, because the marks wear easily as the number of mold combinations increases. Therefore, this paper develops a geometric matching method to recognize the mark center.
Thermographic image analysis as a pre-screening tool for the detection of canine bone cancer
Author(s):
Samrat Subedi;
Scott E. Umbaugh;
Jiyuan Fu;
Dominic J. Marino;
Catherine A. Loughin;
Joseph Sackman
Canine bone cancer is a common type of cancer that grows fast and may be fatal. It usually appears in the limbs, in which case it is called "appendicular bone cancer." Diagnostic imaging methods such as X-rays, computed tomography (CT), and magnetic resonance imaging (MRI) are more common in bone cancer detection than invasive physical examinations such as biopsy. These imaging methods have some disadvantages, including high expense, high radiation dose, and the need to keep the patient (canine) motionless during the imaging procedures. This study investigates the possibility of using thermographic images as a pre-screening tool for the diagnosis of bone cancer in dogs. Experiments were performed with thermographic images from 40 dogs exhibiting bone cancer, using color normalization based on temperature data provided by the Long Island Veterinary Specialists. The images were first divided into four groups according to body parts (Elbow/Knee, Full Limb, Shoulder/Hip and Wrist). Each of the groups was then further divided into three sub-groups according to views (Anterior, Lateral and Posterior). Thermographic patterns of normal and abnormal dogs were analyzed using feature extraction and pattern classification tools. Texture, spectral and histogram features were extracted from the thermograms and used for pattern classification. The best classification success rate in canine bone cancer detection is 90%, with a sensitivity of 100% and a specificity of 80%, produced by the anterior view of the full-limb region with the nearest neighbor classification method and the normRGB-lum color normalization method. Our results show that it is possible to use thermographic imaging as a pre-screening tool for the detection of canine bone cancer.
Estimation and measurement of space variant features of imaging systems and analysis of their influence on accuracy in astronomical imaging
Author(s):
Elena Anisimova;
Jan Bednář;
Martin Blažek;
Petr Janout;
Karel Fliegel;
Petr Páta;
Stanislav Vítek;
Jan Švihlík
Additional monitoring equipment is commonly used in astronomical imaging. Such an electro-optical system usually complements the main telescope during acquisition of astronomical phenomena or supports its operation, e.g. by evaluating the weather conditions. Typically it is a wide-field imaging system, which consists of a digital camera equipped with a fish-eye lens. The wide-field imaging system cannot be considered space-invariant because of the space-variant nature of its input lens. In our previous research efforts we have focused on the measurement and analysis of images obtained from the subsidiary all-sky monitor WILLIAM (WIde-field aLL-sky Images Analyzing Monitoring system). The space-variant part of this imaging system is its input lens, with a 180° angle of view in the horizontal direction and 154° in the vertical direction. For precise astronomical measurement over the entire field of view, it is very important to know how the optical aberrations affect the characteristics of the imaging system, especially its PSF (Point Spread Function). Two methods were used for characterization of the space-variant PSF, i.e. measurement in the optical laboratory and estimation using acquired images and Zernike polynomials. An analysis of the results obtained using these two methods is presented in the paper. The accuracy of astronomical measurements is also discussed while considering the space-variant PSF of the system.
Polar format statistical image processing based fiber optic pressure sensors
Author(s):
Muhammed Burak Alver;
Onur Toker;
Kemal Fidanboylu
This paper presents a detailed study on the development of a fiber optic sensor system for designing a pressure sensor with different sensor configurations. The sensor used in the experiments is based on the modal power distribution (MPD) technique, i.e. the spatial modulation of the modal power in multimode fibers. Stress measurements and CCD camera based techniques were investigated in this research. Differently from earlier MPD works, all of the data gathered from the CCD camera are used instead of only a part of the data: the ring-shaped pictures taken from the CCD camera are converted to polar coordinates, so that stripe-shaped pictures are obtained. Four different features are calculated from these converted pictures. The first feature is the R component of the center of mass in polar form, which was expected to decrease monotonically with increasing applied pressure. The second and third features are the ring thickness in polar form with and without taking the brightness of each pixel into account; these two features are calculated to analyze the effect of each pixel's brightness, and it was expected that there would not be a large margin between them. The fourth feature is the ratio between the third feature and the first feature. MATLAB code was written to correlate these features with the force applied to the sensor, and various experiments were conducted to analyze this correlation. Pictures were taken from the CCD camera in 1 kg steps, and graphs of each feature versus the applied force were generated with the MATLAB code. Experimental results showed that the sensitivity of the proposed sensor is much higher than that of sensors using only a part of the collected data, as in earlier MPD studies. Furthermore, the results agree almost exactly with what was expected for the four proposed features. The results also showed that converting pictures to polar form increases the sensitivity and reliability.
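An illustrative sketch of the polar conversion and the first feature (the R component of the brightness-weighted center of mass); the grid sizes and function names are assumptions, not the authors' MATLAB implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def to_polar(img, center, r_max, n_r=256, n_theta=360):
    """Resample a ring-shaped image onto an (r, theta) grid, yielding a stripe-shaped image."""
    r = np.linspace(0, r_max, n_r)
    theta = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(r, theta, indexing='ij')
    ys = center[0] + rr * np.sin(tt)
    xs = center[1] + rr * np.cos(tt)
    return map_coordinates(img, [ys, xs], order=1), r

def radial_center_of_mass(polar_img, r):
    """Brightness-weighted mean radius of the ring (expected to drop with applied pressure)."""
    profile = polar_img.sum(axis=1)               # integrate intensity over the angular axis
    return np.sum(r * profile) / np.sum(profile)
```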
Quick-shift framework for color image segmentation based on invariant representation
Author(s):
Abdelhameed Ibrahim
An automatic quick-shift framework is proposed for color image segmentation based on an illumination-invariant representation. In practice, the quick-shift method is sensitive to the choice of parameters, so quick tuning by hand is not sufficient. Changing parameter values makes the proposed framework flexible and robust to different image characteristics. We eliminate factors that may affect natural image acquisition, such as shadows and highlights, by applying an invariant method. The method is valid for large-size images. The effectiveness of the proposed framework is examined in experiments on a variety of images including different objects made of metals and dielectrics.
A visible light imaging device for cardiac rate detection with reduced effect of body movement
Author(s):
Xiaotian Jiang;
Ming Liu;
Yuejin Zhao
A visible light imaging system to detect the human cardiac rate is proposed in this paper. A color camera and several LEDs acting as the lighting source were used to avoid interference from ambient light. The cardiac rate is acquired from the person's forehead based on photoplethysmography (PPG) theory. A template matching method is used after the capture of the video. The video signal is decomposed into three channels (RGB) and a region of interest is chosen over which the average gray value is taken. The green channel signal provides an excellent pulse waveform on account of the absorption characteristics of blood at green wavelengths. Through the fast Fourier transform, the cardiac rate is obtained exactly. However, the research goal was not only to obtain the cardiac rate accurately: with the template matching method, the effects of body movement are reduced to a large extent, so the pulse wave can be detected even while the person is moving and the waveform is largely optimized. Several experiments were conducted on volunteers, and the results were compared with those obtained by a finger-clip pulse oximeter; the results of the two methods agree closely. This method of detecting the cardiac rate and the pulse wave largely reduces the effects of body movement and can probably be widely used in the future.
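A minimal sketch of the cardiac-rate estimation step: take the FFT of the mean green-channel signal from the forehead region and pick the spectral peak inside a physiological band. The frame rate and band limits are illustrative assumptions, and the template-matching stage is not shown.

```python
import numpy as np

def cardiac_rate_bpm(green_means, fps=30.0, band=(0.75, 3.0)):
    """green_means: average green value of the forehead ROI for each frame."""
    x = np.asarray(green_means, dtype=np.float64)
    x = x - x.mean()                                      # remove the DC component
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    in_band = (freqs >= band[0]) & (freqs <= band[1])     # roughly 45-180 beats per minute
    peak_freq = freqs[in_band][np.argmax(spectrum[in_band])]
    return 60.0 * peak_freq                               # convert Hz to beats per minute
```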
On iris detection for mobile device applications
Author(s):
Magdi A. Mohamed;
Michel Sarkis;
Ning Bi;
Xin Zhong;
Yingyong Qi
A novel transform called Gradient Direction Transform for fast detection of naturally curved items in digital images is described in this article. This general purpose image transform is defined to suit platforms with limited memory and processing footprints by utilizing only additions and simple shift and bitwise operations. We present this unique algorithmic approach in application to real world problems of iris detection. The new approach is tested on a large data set and the experiments show promising and superior performance compared to existing techniques.
Learning to predict where human gaze is using quaternion DCT based regional saliency detection
Author(s):
Ting Li;
Yi Xu;
Chongyang Zhang
Many current visual attention approaches use semantic features to accurately capture human gaze. However, these approaches demand a high computational cost and can hardly be applied in daily use. Recently, some quaternion-based saliency detection models, such as PQFT (phase spectrum of the Quaternion Fourier Transform) and QDCT (Quaternion Discrete Cosine Transform), have been proposed to meet the real-time requirements of human gaze tracking tasks. However, current saliency detection methods use global PQFT and QDCT to locate jump edges in the input, which can hardly detect object boundaries accurately. To address this problem, we improve the QDCT-based saliency detection model by introducing a superpixel-wise regional saliency detection mechanism. The local smoothness of the saliency value distribution is emphasized to distinguish background noise from salient regions. Our measure, called saliency confidence, can distinguish the patches belonging to the salient object from those of the background and decides whether image patches belong to the same region. When an image patch belongs to a region consisting of other salient patches, this patch should be salient as well. Therefore, we use the saliency confidence map to obtain background and foreground weights and to optimize the saliency map obtained by QDCT. The optimization is accomplished by the least squares method. The proposed optimization approach unifies local and global saliency by combining QDCT with a measure of the similarity between image superpixels. We evaluate our model on four commonly used datasets (Toronto, MIT, OSIE and ASD) using standard precision-recall (PR) curves, the mean absolute error (MAE) and area under curve (AUC) measures. In comparison with most state-of-the-art models, our approach achieves higher consistency with human perception without training. It can estimate human gaze accurately even against cluttered backgrounds. Furthermore, it achieves a better compromise between speed and accuracy.
Visualization of photo album on mobile devices
Author(s):
Changhwan Chun;
Hyukzae Lee;
Daeyeong Kim;
Changick Kim
Show Abstract
Visualization of photo albums has recently attracted much attention with the goal of organizing personal photo albums on mobile devices. Although there are numerous album management systems, visualizing a photo cluster still remains a challenging issue. The most popular and reasonable visualization method for a photo album is to display the representative photo from each photo cluster. In this paper, we propose a method that selects the representative photo of a given photo cluster. To this end, three types of evaluations, namely aesthetic photo quality, visual similarity and semantic importance, are conducted within each cluster. The photo with the highest combined score from the three evaluations is selected as the representative photo for visualization. Experimental results confirm that the proposed algorithm provides reliable organization results for various personal albums.
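As a hedged illustration of the selection rule described above (the scoring functions themselves are placeholders; the paper defines its own evaluations), the following sketch combines three per-photo scores and picks the photo with the highest total.

```python
# Sketch of representative-photo selection from a cluster; the three score
# callables and the equal weights are illustrative assumptions.
def select_representative(photos, aesthetic_score, similarity_score, semantic_score,
                          weights=(1.0, 1.0, 1.0)):
    """photos: list of image objects; *_score: callables returning a float per photo."""
    wa, ws, wm = weights

    def total(p):
        return (wa * aesthetic_score(p) +
                ws * similarity_score(p, photos) +   # similarity to the rest of the cluster
                wm * semantic_score(p))

    return max(photos, key=total)
```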
Multiple objects tracking with HOGs matching in circular windows
Author(s):
Daniel Miramontes-Jaramillo;
Vitaly Kober;
Víctor H. Díaz-Ramírez
Show Abstract
In recent years, tracking applications have become very important with the development of new technologies such as smart TVs, Kinect, Google Glass and Oculus Rift. When tracking relies on a matching algorithm, a good prediction algorithm is required to reduce the search area for each tracked object as well as the processing time. In this work, we analyze the performance of different tracking algorithms based on prediction and matching for real-time tracking of multiple objects. The matching algorithm utilizes histograms of oriented gradients; it carries out matching in circular windows and possesses rotation invariance as well as tolerance to viewpoint and scale changes. The proposed algorithm is implemented on a personal computer with a GPU, and its performance is analyzed in terms of processing time in real scenarios. Such an implementation takes advantage of current technologies and helps to process video sequences in real time, tracking several objects at the same time.
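A rough sketch of matching with an orientation-gradient histogram inside a circular window is given below; it assumes grayscale patches, a simple 9-bin unsigned histogram and a dot-product match score, and is not the descriptor or matcher used in the paper.

```python
# Illustrative circular-window orientation-histogram matching in NumPy.
import numpy as np

def circular_hog(patch, bins=9):
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation in [0, pi)
    h, w = patch.shape
    yy, xx = np.mgrid[:h, :w]
    r = min(h, w) / 2.0
    mask = (yy - h / 2.0) ** 2 + (xx - w / 2.0) ** 2 <= r ** 2   # circular support
    hist, _ = np.histogram(ang[mask], bins=bins, range=(0, np.pi), weights=mag[mask])
    return hist / (np.linalg.norm(hist) + 1e-9)

def match_score(patch_a, patch_b):
    """Higher is better; 1.0 means identical orientation distributions."""
    return float(np.dot(circular_hog(patch_a), circular_hog(patch_b)))
```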
Markov random fields for static foreground classification in surveillance systems
Author(s):
Jack K. Fitzsimons;
Thomas T. Lu
Show Abstract
We present a novel technique for classifying static foreground in automated airport surveillance systems as abandoned or removed objects by representing the image as a Markov random field. The proposed algorithm computes and compares the net probability of the region of interest before and after the event occurs, thus determining which fits more naturally with its respective background. Tested on sequences from the PETS 2006, PETS 2007, AVSS 2007, CVSG, VISOR, CANDELA and WCAM datasets, the algorithm has proven capable of matching the results of the state of the art, is highly parallel, and has a degree of robustness to noise and illumination changes.
Single image dehazing using local adaptive signal processing
Author(s):
Jesus A. Valderrama;
Victor H. Diaz-Ramirez;
Vitaly Kober
Show Abstract
A local adaptive algorithm for single image dehazing is presented. The algorithm estimates a dehazed image from an observed hazy scene by optimizing an objective function whose parameters are adapted to the local statistics of the hazed image inside a moving window. The proposed objective function is based on a trade-off among several local rank-order statistics of the dehazed signal and the mean squared error between the hazed and dehazed signals. In order to achieve high-rate signal processing, the proposed algorithm is implemented on a graphics processing unit (GPU), exploiting massive parallelism. Experimental results obtained with a laboratory prototype are presented, discussed, and compared with results obtained with existing single image dehazing methods in terms of objective metrics and computational complexity.
Adaptive live IP multicast of SVC with unequal FEC
Author(s):
Avram Lev;
Amir Lasry;
Maoz Loants;
Ofer Hadar
Show Abstract
Ideally, video streaming systems should provide the best quality video a user's device can handle without compromising downloading speed. In this article, an improved video transmission system is presented which dynamically adjusts the video quality based on the user's current network state and repairs errors from data lost during transmission. The system incorporates three main components: scalable video coding (SVC) with three layers, multicast based on Receiver Layered Multicast (RLM), and an unequal forward error correction (FEC) algorithm. SVC provides an efficient method for offering different levels of video quality, stored as enhancement layers. In the presented system, a proportional-integral-derivative (PID) controller was implemented to dynamically adjust the video quality, adding or subtracting quality layers as appropriate. In addition, a two-dimensional FEC algorithm, taken from the Pro-MPEG Code of Practice #3 release 2, was added to compensate for data lost in transmission. Several bit error scenarios (step function, cosine wave) with different bandwidths and error values were simulated. The suggested scheme, which combines three-layer SVC video encoding over IP multicast with the unequal FEC algorithm, was investigated under different channel conditions, variable bandwidths and different bit error rates. The results indicate an improvement of the video quality in terms of PSNR over previous transmission schemes.
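To make the layer-adaptation idea concrete, here is a minimal sketch of a PID controller that adds or drops SVC enhancement layers based on measured bandwidth. The gains, the error definition and the switching rule are illustrative assumptions, not the paper's controller.

```python
# Illustrative PID-style controller for subscribing to 1..n_layers SVC layers.
class LayerController:
    def __init__(self, kp=0.5, ki=0.1, kd=0.05, n_layers=3):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.n_layers = n_layers
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, available_kbps, kbps_per_layer, current_layers, dt=1.0):
        """Return the number of layers (1..n_layers) to subscribe to next."""
        # Error: spare bandwidth relative to what the current subscription needs.
        error = available_kbps - current_layers * kbps_per_layer
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        control = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Positive control with enough headroom -> add a layer; negative -> drop one.
        if control > kbps_per_layer:
            current_layers += 1
        elif control < 0:
            current_layers -= 1
        return max(1, min(self.n_layers, current_layers))
```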
Real-time SHVC software decoding with multi-threaded parallel processing
Author(s):
Srinivas Gudumasu;
Yuwen He;
Yan Ye;
Yong He;
Eun-Seok Ryu;
Jie Dong;
Xiaoyu Xiu
Show Abstract
This paper proposes a parallel decoding framework for scalable HEVC (SHVC). Various optimization technologies are implemented on the basis of the SHVC reference software SHM-2.0 to achieve real-time decoding speed for the two-layer spatial scalability configuration. The SHVC decoder complexity is analyzed with profiling information. The decoding processes of the two layers and the up-sampling process are parallelized and scheduled by a high-level application task manager. Within each layer, multi-threaded decoding is applied to accelerate the layer decoding speed. Entropy decoding, reconstruction, and in-loop processing are pipelined with multiple threads based on groups of coding tree units (CTUs). A group of CTUs is treated as a processing unit in each pipeline stage to achieve a better trade-off between parallelism and synchronization. The motion compensation, inverse quantization, and inverse transform modules are further optimized with SSE4 SIMD instructions. Simulations on a desktop with an Intel Core i7-2600 processor running at 3.4 GHz show that the parallel SHVC software decoder is able to decode 1080p spatial 2x bitstreams at up to 60 fps (frames per second) and 1080p spatial 1.5x bitstreams at up to 50 fps for bitstreams generated with the SHVC common test conditions of the JCT-VC standardization group. The decoding performance at various bitrates with different optimization technologies and different numbers of threads is compared in terms of decoding speed and resource usage, including processor and memory.
Information embedding to a real object by projecting a checkered-pattern carrier-screen image
Author(s):
Rui Shogenji
Show Abstract
We propose a technique for embedding information into a real object by projecting a checkered-pattern carrier-screen image as illumination. A carrier-screen image is an information hiding technique in which the secret image can be decoded physically by superimposing a periodic pattern. As one kind of carrier-screen image, we have developed checkered-pattern carrier-screen images, which can be physically decoded by superimposing a sheet with a checkered pattern. The secret information can also be visualized by image sampling at a certain interval; as an example of decoding by image sampling, we propose a decoding method using a compact digital camera. The encoded carrier-screen image has an almost uniform appearance, because it is generated by modulating a checkered pattern. It is also easy to display on a liquid-crystal display, because it is represented on a square pixel structure. Experimental results of optical embedding and decoding with a digital camera show the effectiveness of the proposed system. Since the embedded information can be decoded using an ordinary digital camera, our system is expected to be useful not only for steganographic purposes but also as a prevention technique against unauthorized photography.
Adaptive image coding based on cubic-spline interpolation
Author(s):
Jian-Xing Jiang;
Shao-Hua Hong;
Tsung-Ching Lin;
Lin Wang;
Trieu-Kien Truong
Show Abstract
Previous work has shown that, at low bit rates, downsampling prior to coding and upsampling after decoding can achieve better compression performance than standard coding algorithms, e.g., JPEG and H.264/AVC. However, at high bit rates, the sampling-based schemes generate more distortion. Additionally, the maximum bit rate at which the sampling-based scheme outperforms the standard algorithm is image-dependent. In this paper, a practical adaptive image coding algorithm based on cubic-spline interpolation (CSI) is proposed. The proposed algorithm adaptively selects the coding method, either CSI-based modified JPEG or standard JPEG, for a given target bit rate using the so-called ρ-domain analysis. The experimental results indicate that, compared with standard JPEG, the proposed algorithm performs better at low bit rates and maintains the same performance at high bit rates.
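A minimal sketch of this kind of mode decision is shown below. It approximates the paper's cubic-spline interpolation with Pillow's bicubic resampling and replaces the ρ-domain rate analysis with a placeholder threshold, so it only illustrates the structure of the scheme, not its exact decision rule.

```python
# Sketch: choose plain JPEG or downsample+JPEG+upsample by target bit rate.
import io
from PIL import Image

def encode_adaptive(img, target_bpp, quality=50, low_rate_threshold=0.3, scale=0.5):
    """Return (mode, jpeg_bytes); threshold and scale are illustrative values."""
    buf = io.BytesIO()
    if target_bpp >= low_rate_threshold:
        mode = "standard"
        img.save(buf, format="JPEG", quality=quality)
    else:
        mode = "downsampled"
        small = img.resize((int(img.width * scale), int(img.height * scale)),
                           Image.BICUBIC)       # stand-in for cubic-spline interpolation
        small.save(buf, format="JPEG", quality=quality)
    return mode, buf.getvalue()

def decode_adaptive(mode, data, full_size):
    img = Image.open(io.BytesIO(data))
    if mode == "downsampled":
        img = img.resize(full_size, Image.BICUBIC)   # upsample after decoding
    return img
```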
Heterogeneous iris image hallucination using sparse representation on a learned heterogeneous patch dictionary
Author(s):
Yung-Hui Li;
Bo-Ren Zheng;
Dai-Yan Ji;
Chung-Hao Tien;
Po-Tsun Liu
Show Abstract
Cross-sensor iris matching may seriously degrade recognition performance because of the sensor mismatch between iris images acquired at the enrollment and test stages. In this paper, we propose two novel patch-based heterogeneous dictionary learning methods to attack this problem. The first method applies the latest sparse representation theory, while the second learns the correspondence relationship through PCA in the heterogeneous patch space. Both methods learn the basic atoms of iris textures across different image sensors and build connections between them. After such connections are built, it is possible at the test stage to hallucinate (synthesize) iris images across different sensors. By matching training images with hallucinated images, the recognition rate can be successfully enhanced. The experimental results are satisfactory both visually and in terms of recognition rate. Experimenting with an iris database of 3015 images, we show that the EER is reduced by 39.4% in relative terms by the proposed method.
Biometric analysis of the palm vein distribution by means of two different feature extraction techniques
Author(s):
R. Castro-Ortega;
C. Toxqui-Quitl;
J. Solís-Villarreal;
A. Padilla-Vivanco;
J. Castro-Ramos
Show Abstract
Vein patterns can be used for access, identification, and authentication purposes, and are more reliable than classical identification methods. Furthermore, these patterns can be used for venipuncture in the health field to locate patients' veins when they cannot be seen with the naked eye. In this paper, an image acquisition system is implemented to acquire digital images of people's hands in the near infrared. The system consists of a CCD camera and a light source with peak emission at 880 nm. This radiation penetrates the skin and is strongly absorbed by the deoxyhemoglobin present in venous blood. Our analysis is composed of several steps, the first of which is the enhancement of the acquired images using spatial filters. After that, adaptive thresholding and mathematical morphology operations are used to obtain the distribution of the vein patterns. The process is aimed at recognizing people through near-infrared images of their palm-dorsal vein distributions. This work compares two different feature extraction techniques, moments and veincode. The classification task is achieved using artificial neural networks. Two databases are used to analyze the performance of the algorithms: the first belongs to the Hong Kong Polytechnic University and the second is our own database.
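A hedged OpenCV sketch of the extraction chain described above (enhancement, adaptive thresholding, morphology) follows; the specific filters and parameter values are illustrative choices, not the authors' settings.

```python
# Sketch: near-infrared vein-pattern extraction with generic OpenCV operations.
import cv2

def extract_vein_pattern(nir_image):
    """nir_image: 8-bit grayscale near-infrared image of the hand."""
    # 1. Contrast enhancement with a spatial filter (CLAHE used here as an example).
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(nir_image)
    # 2. Smooth and adaptively threshold: veins appear darker than surrounding tissue.
    blurred = cv2.medianBlur(enhanced, 5)
    binary = cv2.adaptiveThreshold(blurred, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 25, 7)
    # 3. Morphological opening/closing to clean up the vein map.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    cleaned = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)
    return cleaned
```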
Principles of image processing in machine vision systems for the color analysis of minerals
Author(s):
Daria B. Petukhova;
Elena V. Gorbunova;
Aleksandr N. Chertov;
Valery V. Korotaev
Show Abstract
At present, color sorting is one of the promising methods for the enrichment of mineral raw materials. The method is based on registering color differences between images of the analyzed objects. As is generally known, the difficulty of delimiting close color tints when sorting low-contrast minerals is one of the main disadvantages of the color sorting method. It can be related to a wrong choice of color model and incomplete image processing in the machine vision system realizing the color sorting algorithm. Another problem is the need to reconfigure the image processing parameters when the type of analyzed minerals changes, because the optical properties of mineral samples vary from one deposit to another. Searching for suitable parameter values is therefore a non-trivial task, and it does not always have an acceptable solution. In addition, there are no uniform guidelines for determining the criteria of mineral sample separation. Ideally, the reconfiguration of the image processing parameters would be performed by machine learning, but in practice it is carried out by adjusting operating parameters that are satisfactory only for one specific enrichment task. This approach usually means that the machine vision system is unable to rapidly estimate the concentration rate of the analyzed mineral ore using the color sorting method. This paper presents the results of research aimed at addressing these shortcomings in the organization of image processing for machine vision systems used for color sorting of mineral samples. The principles of color analysis of low-contrast minerals using machine vision systems are also studied. In addition, a special processing algorithm for color images of mineral samples is developed; it automatically determines the criteria of mineral sample separation based on an analysis of representative mineral samples. Experimental studies of the proposed algorithm were performed using samples of gold and copper-nickel ores, and the obtained results confirmed its efficiency for mineral objects. The research results will allow: expanding the use of the color sorting method in the field of mineral raw material enrichment; facilitating the search for image processing parameter values for machine vision systems used for the color analysis of minerals; reducing the time required to reconfigure the image processing parameters when the type of analyzed minerals changes; and rapidly estimating the concentration rate of the analyzed mineral ore using the color sorting method.
The Empirical Mode Decomposition algorithm via Fast Fourier Transform
Author(s):
Oleg O. Myakinin;
Valery P. Zakharov;
Ivan A. Bratchenko;
Dmitry V. Kornilin;
Dmitry N. Artemyev;
Alexander G. Khramov
Show Abstract
In this paper we consider the problem of implementing a fast algorithm for the Empirical Mode Decomposition (EMD). EMD is one of the newest methods for the decomposition of non-linear and non-stationary signals. The basis of EMD is formed "on-the-fly", i.e. it depends on the distribution of the signal and is not given a priori, in contrast to the Fourier Transform (FT) or the Wavelet Transform (WT). EMD requires interpolating the sets of local extrema of the signal to find the upper and lower envelopes. Data interpolation on an irregular lattice is a very low-performance procedure; the classical description of EMD by Huang suggests doing it with splines, i.e. by solving a system of equations. The existence of a fast algorithm is the main advantage of the FT, and expressing an algorithm in terms of the Fast Fourier Transform (FFT) is standard practice for reducing the operation count. We offer a fast implementation of EMD (FEMD) through the FFT and some other cost-efficient algorithms. The basic two-stage interpolation algorithm for EMD is composed of an upscale procedure through the FFT and a downscale procedure that selects points of the signal. First we consider the set of local maxima (or minima) without reference to the OX axis, i.e. on a regular lattice. The upscale through the FFT changes the signal's length to the least common multiple (LCM) of all distances between neighboring extrema on the OX axis. If the LCM value is too large, it is necessary to limit the local set of extrema; in this case the procedure is an analog of spline interpolation. A demonstration of FEMD in a noise reduction task for optical coherence tomography (OCT) is shown.
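The sketch below illustrates only the FFT-based upscale step described above, using SciPy's FFT-based resampler as a stand-in for spectrum zero-padding; the LCM length choice is shown with a crude guard, and the downscale/point-selection stage of FEMD is omitted.

```python
# Sketch of band-limited FFT upscaling to an LCM-based grid (illustrative only).
import numpy as np
from math import lcm
from functools import reduce
from scipy.signal import resample

def upscale_between_extrema(signal, extrema_idx, max_len=1 << 16):
    """Resample `signal` to a length that is a multiple of every gap between extrema."""
    gaps = [int(g) for g in np.diff(np.sort(extrema_idx)) if g > 0]
    target = reduce(lcm, gaps, 1)
    # The paper limits the local extrema set when the LCM grows too large;
    # here a simple cap is used instead.
    target = min(max(target, len(signal)), max_len)
    return resample(signal, target)          # FFT-based band-limited interpolation
```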
Segmentation of astronomical images
Author(s):
Jan Švihlík;
Stanislav Vítek;
Karel Fliegel;
Petr Páta;
Elena Anisimova
Show Abstract
Object detection is one of the most important procedures in astronomical imaging. This paper deals with the segmentation of astronomical images based on a random forest classifier. We consider astronomical image data acquired using a photometric system with B, V, R and I filters. Each image is acquired in several realizations. All image realizations are corrected using a master dark frame and a master flat field obtained as an average of hundreds of images. Then profile photometry is applied to find the possible positions of stars. The classifier is trained on B, V, R and I image vectors. Training samples are defined by the user using ellipsoidal regions (20 selections for each of the two classes: object and background). The number of detected objects and their positions are compared with an astronomical object catalogue using the Euclidean distance. We can conclude that the performance of the presented technique is fully comparable to other state-of-the-art algorithms.
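For illustration, a minimal scikit-learn sketch of this kind of pixel-wise classification is shown below: each pixel's B, V, R, I intensities form a four-element feature vector, a random forest is trained on user-labelled samples, and the model is then applied to the whole image. The interface and parameters are assumptions, not the authors' pipeline.

```python
# Sketch: per-pixel object/background classification from B, V, R, I frames.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def segment_bvri(b, v, r, i, train_mask):
    """
    b, v, r, i : 2-D arrays of the same shape (one per photometric filter).
    train_mask : 2-D int array, 0 = unlabeled, 1 = object, 2 = background.
    Returns a boolean object mask for the whole image.
    """
    features = np.stack([b, v, r, i], axis=-1).reshape(-1, 4)
    labels = train_mask.reshape(-1)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(features[labels > 0], labels[labels > 0])   # train on labelled pixels only
    prediction = clf.predict(features).reshape(b.shape)
    return prediction == 1
```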
Real time soft-partition-based weighted sum filtering with GPU acceleration
Author(s):
Shuqun Zhang;
Bryan Furia
Show Abstract
Recently, image processing tasks such as noise reduction, restoration, and super-resolution using soft-partition-based weighted sum filters have shown state-of-the-art results. Partition-based weighted sum filters are spatially adaptive filtering techniques combining vector quantization and linear finite impulse response filtering, and have been shown to achieve much better results than spatially invariant filtering methods. However, they are computationally prohibitive for practical applications because of the enormous computation involved in both filtering and training; real-time filtering is impossible even for small image and window sizes. This paper presents fast implementations of soft-partition-based weighted sum filtering that exploit the massively parallel processing capabilities of a GPU within the CUDA framework. For the implementations, we focus on memory management and implementation strategies. The performance for various image and window sizes is measured and compared between the GPU-based and CPU-based implementations. The results show that the GPU-based implementations can significantly accelerate the computation of soft-partition-based weighted sum filtering and make real-time image filtering possible.
An object boundary detection system based on a 3D stereo monitor
Author(s):
Shuqun Zhang;
Bryan Furia
Show Abstract
In this paper we present an object boundary detection system using an off-the-shelf 3D stereo monitor. Instead of implementing algorithms, the system's image processing is based on exploiting the polarization properties of the liquid-crystal display and the way the image is displayed on the 3D monitor to enhance object boundaries. Users can view the enhanced object contour through polarization glasses in real time, and the result can also be recorded with a camera for further processing. Software is developed for user interaction and for providing feedback to obtain the best detection results. The effectiveness of the proposed system is demonstrated using several medical and biological images. The proposed system has the advantages of real-time high-speed processing, almost no numerical computation, and robustness to noise over traditional methods based on image processing algorithms.
Machine vision based on the concept of contrast sensitivity of the human eye
Author(s):
Vitali Bezzubik;
Nikolai Belashenkov;
Gleb Vdovin
Show Abstract
A model of the contrast sensitivity function (CSF) of a machine vision system, based on the CSF of the human visual system, is proposed. By analogy with the human eye, we apply the concept of the ganglion cell receptive field to the artificial light-sensitive elements. Following this concept further, we introduce quantitative metrics of local and global contrast of a digital image. We suggest that the contrast sensitivity threshold forms an iso-line in the contrast versus spatial frequency parameter space. The model, implemented in a computer vision system, has been compared to the results of contrast sensitivity research conducted directly with the human visual system, and demonstrated a good match.
Accuracy evaluation of segmentation for high resolution imagery and 3D laser point cloud data
Author(s):
Nina Ni;
Ninghua Chen;
Jianyu Chen
Show Abstract
High resolution satellite imagery and 3D laser point cloud data provide precise geometry, rich spectral information and clear textures of features. The segmentation of high resolution remote sensing images and 3D laser point clouds is the basis of object-oriented remote sensing image analysis, since the segmentation results directly influence the accuracy of subsequent analysis and discrimination. Currently, there is still no common segmentation theory to support these algorithms. Therefore, when facing a specific problem, the applicability of a segmentation method should be determined through segmentation accuracy assessment, and an optimal segmentation selected accordingly. To date, the most common ways of evaluating the effectiveness of a segmentation method are subjective evaluation and supervised evaluation. To provide a more objective evaluation, we carried out the following work: analysis and comparison of previously proposed image segmentation accuracy evaluation methods, namely area-based metrics, location-based metrics and combined metrics. 3D point cloud data gathered by a Riegl VZ-1000 scanner was used to perform a two-dimensional transformation of the point cloud. The object-oriented segmentation results of aquaculture farm, building and farmland polygons were used as test objects to evaluate segmentation accuracy.
MODIS images super-resolution algorithm via sparse representation
Author(s):
Yue Pang;
Lingjia Gu;
Ruizhi Ren;
Jian Sun
Show Abstract
Based on current mainstream algorithms, an effective super-resolution algorithm via sparse representation for MODIS remote sensing images is proposed in this paper. The basic idea behind the proposed algorithm is to obtain redundant dictionaries derived from high-resolution Landsat ETM+ images and low-resolution MODIS images, which then guide the reconstruction of high-resolution MODIS images. Feature extraction is a vital part of the dictionary training procedure: features are extracted from wavelet-domain images as training samples, and more effective dictionaries for high-resolution image reconstruction are obtained by applying the K-singular value decomposition (K-SVD) dictionary training algorithm. The experimental results demonstrate that the proposed algorithm improves the reconstruction quality both visually and quantitatively. Compared with the traditional algorithm, the PSNR increases by approximately 1.1 dB and the SSIM by 0.07. Moreover, both the quality and the computational efficiency of the proposed algorithm can be improved further given an appropriate number of atoms.
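A hedged sketch of the coupled-dictionary reconstruction step implied above is given below: each low-resolution patch is sparsely coded over the low-resolution dictionary and the same coefficients are applied to the high-resolution dictionary. Dictionary training (K-SVD on wavelet-domain features) is assumed to have been done elsewhere, and the function and parameter names are illustrative.

```python
# Sketch: sparse-representation patch reconstruction with coupled dictionaries.
from sklearn.linear_model import orthogonal_mp

def reconstruct_patch(lr_patch, D_lr, D_hr, n_nonzero=3):
    """
    lr_patch : (d_lr,) feature vector of a low-resolution MODIS patch.
    D_lr     : (d_lr, K) low-resolution dictionary (columns are atoms).
    D_hr     : (d_hr, K) coupled high-resolution dictionary.
    """
    # Sparse code over the low-resolution dictionary (orthogonal matching pursuit).
    alpha = orthogonal_mp(D_lr, lr_patch, n_nonzero_coefs=n_nonzero)
    # Apply the same coefficients to the high-resolution dictionary.
    return D_hr @ alpha
```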
A study on an automatic Ronchi test system
Author(s):
Chun-Li Chang;
Wen-Hong Wu;
Chun-Jen Chen
Show Abstract
In recent years, glasses have gradually become a personal accessory, so the demand for various types of glasses, especially safety glasses and sunglasses, has increased significantly, and the requirements for full inspection of their lenses have become stricter. In the past, fast optical quality inspection of lenses was performed with the Ronchi test, and the Ronchigram images were observed and judged by human eyes. However, observing Ronchi patterns by eye introduces a large measurement uncertainty. Therefore, this study presents the development of an automatic lens inspection instrument based on the Ronchi tester, which combines machine vision with image analysis and processing techniques and requires no human operation. In addition, an optical quality index based on the Ronchigram has been developed to classify the quality of the test lens. In this paper, we propose a lens quality index (LQI) to evaluate the optical quality of the lens to be inspected.
Relay-and-antenna selection and digital transceiver design for two-way AF-MIMO multiple-relay systems
Author(s):
Chia-Chang Hu;
Hao-Hsian Su;
Kang-Tsao Tang
Show Abstract
This paper considers a two-way multiple-input multiple-output (MIMO) relaying system with multiple relays between two terminal nodes. A relay antenna selection scheme based on channel singular value decomposition (SVD) is used to reduce energy consumption. To enhance the system performance, we apply an SVD-based algorithm with an MSE criterion that jointly calculates the optimal linear transceiver precoding at the source and relay nodes for amplify-and-forward (AF) protocols. In computer simulations, we use an iterative method to solve the non-convex joint source and relay power allocation problem. The simulation results show that the SVD-based precoding design, combined with the SVD-based relay and antenna selection scheme, achieves a superior system bit error rate (BER) performance and reduces the power consumption of the relay antennas.
Application of subsidence monitoring over Yangtze river marshland with ground-based SAR system IBIS
Author(s):
Zhiwei Qiu;
Jianping Yue;
Xueqin Wang
Show Abstract
Subsidence was monitored with the micro-deformation monitoring system IBIS, which is based on ground-based SAR interferometry. After reduction of atmospheric disturbances, the line-of-sight displacement can be projected onto the subsidence direction, yielding a continuous 24-hour deformation map of the observation area. These experiments show that ground-based InSAR technology can be applied to subsidence monitoring with millimeter precision, and that the IBIS system has an advantage in the dynamic monitoring of micro-deformation.
Objective evaluation of naturalness, contrast, and colorfulness of tone-mapped images
Author(s):
Lukáš Krasula;
Karel Fliegel;
Patrick Le Callet;
Miloš Klíma
Show Abstract
The main obstacle preventing High Dynamic Range (HDR) imaging from becoming standard in the image and video processing industry is the challenge of displaying the content; the prices of HDR screens are still too high for ordinary customers. During the last decade, a lot of effort has been dedicated to finding ways to compress the dynamic range for legacy displays while simultaneously preserving details in highlights and shadows, which cannot be achieved by standard systems. These dynamic range compression techniques are called tone-mapping operators (TMO) and introduce novel distortions, such as spatially non-linear distortion of contrast or corruption of naturalness. This paper provides an analysis of objective no-reference naturalness, contrast and colorfulness measures in the context of tone-mapped image evaluation. Reliable measures of these attributes could be further merged into a single overall quality metric. The main goal of the paper is to provide an initial study of the problem and identify potential candidates for such a combination.
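As an example of the kind of no-reference attribute measures studied above, the sketch below computes the Hasler-Süsstrunk colorfulness metric and a simple RMS contrast measure; these are illustrative candidates of the general class, not necessarily the measures selected in the paper.

```python
# Two simple no-reference attribute measures for an 8-bit RGB image.
import numpy as np

def colorfulness(rgb):
    """Hasler & Suesstrunk (2003) colorfulness of an RGB image in [0, 255]."""
    r, g, b = [rgb[..., c].astype(float) for c in range(3)]
    rg = r - g
    yb = 0.5 * (r + g) - b
    std = np.hypot(rg.std(), yb.std())     # spread of the opponent components
    mean = np.hypot(rg.mean(), yb.mean())  # distance from the neutral axis
    return std + 0.3 * mean

def rms_contrast(rgb):
    """Root-mean-square contrast of the luminance channel (Rec. 709 weights)."""
    lum = 0.2126 * rgb[..., 0] + 0.7152 * rgb[..., 1] + 0.0722 * rgb[..., 2]
    return float((lum.astype(float) / 255.0).std())
```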