Proceedings Volume 7798

Applications of Digital Image Processing XXXIII


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 20 August 2010
Contents: 17 Sessions, 77 Papers, 0 Presentations
Conference: SPIE Optical Engineering + Applications 2010
Volume Number: 7798

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 7798
  • Image Signal Processing I
  • Image Signal Processing II
  • Image Signal Processing III
  • Digital Cultural Heritage
  • Visual Search I
  • Visual Search II
  • Compression and Transforms for Images and Video I
  • Compression and Transforms for Images and Video II
  • Computational Imaging I: Joint Session with Conference 7800
  • Computational Imaging II: Joint Session with Conference 7800
  • Perceptual Coding of Still and Motion Images I
  • Perceptual Coding of Still and Motion Images II
  • Mobile Video: Processing, Communications, Display, and Applications I
  • Mobile Video: Processing, Communications, Display, and Applications II
  • Optics, Photonics and Digital Image Processing
  • Poster Session
Front Matter: Volume 7798
Front Matter: Volume 7798
This pdf file contains the front matter associated with SPIE Proceedings Volume 7798, including Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Image Signal Processing I
Multi-scale edge detection with local noise estimate
Bo Jiang, Zia-ur Rahman
The (unrealistic) assumption that noise can be modeled as independent, additive, and uniform can lead to problems when edge detection methods are applied to real or natural images. The main reason is that the filter scale and the gradient threshold are difficult to determine at a regional or local scale when the noise estimate is made on a global scale. A filter with one global scale might under-smooth areas of high noise but over-smooth less noisy areas. Similarly, a static, global threshold may not be appropriate for the entire image because different regions have different degrees of detail. Thus, some methods use more than one filter for detecting edges and discard thresholding in edge discrimination. A multi-scale description of the image mimics the receptive fields of neurons in the early visual cortex of animals: at small scales, details can be reliably detected, while at larger scales, the contours and the overall frame receive more attention. Image features can therefore be fully represented by combining a range of scales. The proposed multi-scale edge detection algorithm utilizes this hierarchical organization to detect and localize edges. Furthermore, instead of one default global threshold, a local dynamic threshold is introduced to discriminate edges from non-edges. Based on a critical value function, the local dynamic threshold for each scale is determined using a novel local noise estimation (LNE) method. Additionally, the proposed algorithm performs connectivity analysis on the edge map to ensure that small, disconnected edges are removed. Experiments in which this method is applied to a sequence of images of the same scene but with different signal-to-noise ratios (SNR) show the method to be robust to noise.
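As an illustration of the general idea (not the authors' LNE algorithm, which is defined in the paper), a multi-scale gradient detector with a per-pixel threshold derived from a local noise estimate might be sketched as follows; the window sizes, scales, and the factor `k` are illustrative assumptions:

```python
import numpy as np
from scipy import ndimage

def local_noise_sigma(img, size=15):
    """Rough local noise estimate: scaled MAD of the residual left by a median filter."""
    resid = img - ndimage.median_filter(img, 5)
    return ndimage.median_filter(np.abs(resid), size) / 0.6745

def multiscale_edges(img, scales=(1.0, 2.0, 4.0), k=4.0, min_size=10):
    """Union of per-scale gradient edges, each thresholded by a local dynamic threshold."""
    sigma = local_noise_sigma(img)
    edges = np.zeros(img.shape, dtype=bool)
    for s in scales:
        gx = ndimage.gaussian_filter(img, s, order=(0, 1))  # derivative along columns
        gy = ndimage.gaussian_filter(img, s, order=(1, 0))  # derivative along rows
        mag = np.hypot(gx, gy)
        edges |= mag > (k * sigma + 1e-3)      # small floor for noise-free regions
    # connectivity analysis: discard small, disconnected edge fragments
    labels, n = ndimage.label(edges)
    if n == 0:
        return edges
    sizes = ndimage.sum(edges, labels, index=range(1, n + 1))
    keep = np.concatenate(([False], sizes >= min_size))
    return keep[labels]
```

The per-pixel threshold `k * sigma` plays the role of the local dynamic threshold; the final labeling step is a simple stand-in for the connectivity analysis the abstract mentions.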
Correlation filter design using a single cluttered training image for detecting a noisy target in a nonoverlapping scene
Classical correlation filters for object detection and location estimation are designed under the assumption that the shape and intensity values of the object of interest are explicitly known. In this work we assume that the target is given at unknown coordinates in a reference image with a cluttered background corrupted by additive noise. We consider the nonoverlapping signal model for both the reference image and the input scene. Optimal correlation filters, with respect to signal-to-noise ratio and peak-to-output energy, for object detection and location estimation are derived. Estimation techniques are proposed for the parameters required for filter design. Computer simulation results obtained with the proposed filters are presented and compared with those of common correlation filters.
Denoising point clouds using pulling-back method
Chaomin Shen, Yaxin Peng, Guixu Zhang
We propose a method for denoising a point cloud by pulling every noisy point back to its supposed position. In R2, suppose the points of the cloud all lie on a presumed curve, except that some points are displaced from the curve by noise. For every point, the presumed curve is approximated in a small neighborhood by an osculating circle, and the point is then pulled to the circle, i.e., its new position is its projection onto the circle. In R3, the 2-D osculating circle is replaced by the Dupin indicatrix. The indicatrix is attached to the noisy point and therefore moves with it as the point moves along its normal direction. Along this normal direction, the distance from each of the k nearest points to its projection on the Dupin indicatrix is computed, and the noisy point's new position is the place where the sum of squared distances over all k nearest points reaches its minimum. The method is examined on point cloud data and the results are found satisfactory.
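A minimal 2-D sketch of the pulling-back step (an illustration of projecting onto a locally fitted circle, not the authors' exact procedure): the osculating circle is estimated from a point's neighbors with an algebraic (Kasa) least-squares fit, and the noisy point is then projected radially onto it.

```python
import numpy as np

def fit_circle(pts):
    """Algebraic (Kasa) least-squares circle fit to 2-D neighbor points.
    Solves x^2 + y^2 = 2*a*x + 2*b*y + c for center (a, b) and radius."""
    pts = np.asarray(pts, dtype=float)
    A = np.column_stack([2.0 * pts[:, 0], 2.0 * pts[:, 1], np.ones(len(pts))])
    rhs = (pts ** 2).sum(axis=1)
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return np.array([a, b]), np.sqrt(c + a * a + b * b)

def pull_back(p, center, radius):
    """New position of a noisy point: its radial projection onto the circle."""
    v = np.asarray(p, dtype=float) - center
    return center + radius * v / np.linalg.norm(v)
```

In R3 the circle would be replaced by the Dupin indicatrix and the projection constrained to the estimated normal direction, as the abstract describes.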
Image Signal Processing II
Image restoration based on multiple PSF information with applications to phase-coded imaging system
Po-Chang Chen, Yung-Lin Chen, Hsin-Yueh Sung
Conventional image restoration techniques generally use one point-spread function (PSF), corresponding to a single object distance (OD) and viewing angle (VA), in filter design. However, for imaging systems that require a better balance or a new tradeoff of image restoration across a range of ODs or VAs, the conventional design might be insufficient to give satisfactory results. In this paper, an extension of the minimum mean square error (MMSE) method is proposed. The proposed method defines a cost function as a linear combination of multiple mean square errors (MSEs). Each MSE measures the restoration performance at a specific OD and VA and can be computed from the restored image and its corresponding target image. Since the MSEs for different ODs are lumped into one cost function, the resulting filter provides a better balance in restoration compared with the conventional design. The method is applied to an extended depth-of-field (EDoF) imaging system, and computer simulations are performed to verify its effectiveness.
Motion-compensated compressed sensing for dynamic imaging
Rajagopalan Sundaresan, Yookyung Kim, Mariappan S. Nadar, et al.
The recently introduced Compressed Sensing (CS) theory explains how sparse or compressible signals can be reconstructed from far fewer samples than was previously believed possible. The CS theory has attracted significant attention for applications such as Magnetic Resonance Imaging (MRI), where long acquisition times have been problematic. This is especially true for dynamic MRI applications, where high spatio-temporal resolution is needed. For example, in cardiac cine MRI, it is desirable to acquire the whole cardiac volume within a single breath-hold in order to avoid artifacts due to respiratory motion. Conventional MRI techniques do not allow reconstruction of high-resolution image sequences from such a limited amount of data. Vaswani et al. recently proposed an extension of the CS framework to problems with partially known support (i.e., sparsity pattern). In their work, the problem of recursive reconstruction of time sequences of sparse signals was considered. Under the assumption that the support of the signal changes slowly over time, they proposed using the support of the previous frame as the "known" part of the support for the current frame. While this approach works well for image sequences with little or no motion, motion causes significant change in support between adjacent frames. In this paper, we illustrate how motion estimation and compensation techniques can be used to reconstruct more accurate estimates of support for image sequences with substantial motion (such as cardiac MRI). Experimental results using phantoms as well as real MRI data sets illustrate the improved performance of the proposed technique.
Using enhancement data to deinterlace 1080i HDTV
Andy L. Lin, Jae S. Lim
When interlaced scan (IS) is used for television transmission, the received video must be deinterlaced to be displayed on progressive scan (PS) displays. To achieve good performance, the deinterlacing operation is typically computationally expensive. We propose a receiver compatible approach which performs a deinterlacing operation inexpensively, with good performance. At the transmitter, the system analyzes the video and transmits an additional low bit-rate stream. Existing receivers ignore this information. New receivers utilize this stream and perform a deinterlacing operation inexpensively with good performance. Results indicate that this approach can improve the digital television standard in a receiver compatible manner.
Image Signal Processing III
Multispectral MRI-based virtual cystoscopy
Bladder cancer is the fifth leading cause of cancer deaths in the United States. Virtual cystoscopy (VC) can serve as a screening means for early detection of the cancer using non-invasive imaging and computer graphics technologies. Previous research has mainly focused on spiral CT (computed tomography), which invasively introduces air into the bladder lumen via a small catheter to provide contrast against the bladder wall. However, the tissue contrast around the bladder wall is still limited in CT-based VC. In addition, the CT-based technique involves additional radiation exposure. We have investigated a procedure to achieve the screening task by MRI (magnetic resonance imaging). It utilizes two unique features of MRI: (1) urine has distinct T1 and T2 relaxation times compared to its surrounding tissues, and (2) MRI has the potential to obtain good tissue contrast around the bladder wall. The procedure is fully non-invasive and easy to implement. In this paper, we propose an MRI-based VC system for computer-aided detection (CAD) of bladder tumors. The proposed VC system integrates partial volume-based segmentation incorporating texture information with fast marching-based CAD employing geometrical features for the detection of bladder tumors. The accuracy and efficiency of the integrated VC system are evaluated by testing the diagnoses against a database of patients.
Computational architecture for image processing on a small unmanned ground vehicle
Sean Ho, Hung Nguyen
Man-portable Unmanned Ground Vehicles (UGVs) have been fielded on the battlefield with limited computing power. This limitation constrains their use primarily to teleoperation control mode for clearing areas and bomb defusing. In order to extend their capability to include the reconnaissance and surveillance missions of dismounted soldiers, a separate processing payload is desired. This paper presents a processing architecture and the design details on the payload module that enables the PackBot to perform sophisticated, real-time image processing algorithms using data collected from its onboard imaging sensors including LADAR, IMU, visible, IR, stereo, and the Ladybug spherical cameras. The entire payload is constructed from currently available Commercial off-the-shelf (COTS) components including an Intel multi-core CPU and a Nvidia GPU. The result of this work enables a small UGV to perform computationally expensive image processing tasks that once were only feasible on a large workstation.
Automatic activity estimation based on object behaviour signature
Automatic estimation of human activities is a widely studied topic. However, the process becomes difficult when we want to estimate activities from a video stream, because human activities are dynamic and complex. Furthermore, we have to take into account the amount of information that images provide, which makes modelling and estimating activities difficult. In this paper we propose a method for activity estimation based on object behaviour. Objects are located in a delimited observation area and their handling is recorded with a video camera. Activity estimation can then be performed automatically by analyzing the video sequences. The proposed method is called "signature recognition" because it considers a space-time signature of the behaviour of objects that are used in particular activities (e.g., patients' care in a healthcare environment for elderly people with restricted mobility). A pulse is produced when an object appears in or disappears from the observation area, i.e., a change from zero to one or vice versa. These changes are produced by identifying the objects with a bank of nonlinear correlation filters. Each object is processed independently and produces its own pulses; hence we are able to recognize several objects with different patterns at the same time. The method is applied to estimate three healthcare-related activities of elderly people with restricted mobility.
Digital Cultural Heritage
Signal processing and analyzing works of art
Don H. Johnson, C. Richard Johnson Jr., Ella Hendriks
In examining paintings, art historians use a wide variety of physico-chemical methods to determine, for example, the paints, the ground (canvas primer) and any underdrawing the artist used. However, the art world has been little touched by signal processing algorithms. Our work develops algorithms to examine x-ray images of paintings, not to analyze the artist's brushstrokes but to characterize the weave of the canvas that supports the painting. The physics of radiography indicates that linear processing of the x-rays is most appropriate. Our spectral analysis algorithms have an accuracy superior to human spot-measurements and have the advantage that, through "short-space" Fourier analysis, they can be readily applied to entire x-rays. We have found that variations in the manufacturing process create a unique pattern of horizontal and vertical thread density variations in the bolts of canvas produced. In addition, we measure the thread angles, providing a way to determine the presence of cusping and to infer the location of the tacks used to stretch the canvas on a frame during the priming process. We have developed weave matching software that employs a new correlation measure to find paintings that share canvas weave characteristics. Using a corpus of over 290 paintings attributed to Vincent van Gogh, we have found several weave match cliques that we believe will refine the art historical record and provide more insight into the artist's creative processes.
Texton-based analysis of paintings
Laurens J. P. van der Maaten, Eric O. Postma
The visual examination of paintings is traditionally performed by skilled art historians using their eyes. Recent advances in intelligent systems may support art historians in determining the authenticity or date of creation of paintings. In this paper, we propose a technique for the examination of brushstroke structure that views the wildly overlapping brushstrokes as texture. The analysis of the painting texture is performed with the help of a texton codebook, i.e., a codebook of small prototypical textural patches. The texton codebook can be learned from a collection of paintings. Our textural analysis technique represents paintings in terms of histograms that measure the frequency with which the textons in the codebook occur in the painting (so-called texton histograms). We present experiments that show the validity and effectiveness of our technique for textural analysis on a collection of digitized high-resolution reproductions of paintings by Van Gogh and his contemporaries. As texton histograms cannot easily be interpreted by art experts, the paper proposes two approaches to visualize the results of the textural analysis. The first approach visualizes the similarities between the histogram representations of paintings by employing a recently proposed dimensionality reduction technique, called t-SNE. We show that t-SNE reveals a clear separation between paintings created by Van Gogh and those created by other painters. In addition, the period of creation is faithfully reflected in the t-SNE visualizations. The second approach visualizes the similarities and differences between paintings by highlighting regions in a painting in which the textural structure is unusual. We illustrate the validity of this approach by means of an experiment in which we highlight regions in a painting by Monet that are not very "Van Gogh-like". Taken together, we believe the tools developed in this study are well suited to assist art historians in their study of paintings.
Multispectral imaging for digital painting analysis: a Gauguin case study
Bruno Cornelis, Ann Dooms, Frederik Leen, et al.
This paper is an introduction to the analysis of multispectral recordings of paintings. First, we give an overview of the advantages of multispectral image analysis over more traditional techniques. The bands residing in the visible domain provide an accurate measurement of the color information, which can be used for analysis but also for conservation and archival purposes (i.e., preserving the artistic patrimony by building a digital library). Furthermore, inspection of the multispectral imagery by art experts and art conservators has shown that combining the information present in the spectral bands residing inside and outside the visible domain can lead to a richer analysis of paintings. In the remainder of the paper, practical applications of multispectral analysis are demonstrated, where we consider the acquisition of thirteen different high-resolution spectral bands. Nine of these reside in the visible domain, one in the near ultraviolet, and three in the infrared. The paper illustrates the promising future of multispectral analysis as a non-invasive tool for acquiring data which cannot be obtained by visual inspection alone and which is highly relevant to art preservation, authentication, and restoration. The demonstrated applications include detection of restored areas and detection of aging cracks.
Attenuating hue identification and color estimation for underpainting reconstruction from x-ray synchrotron imaging data
Anila Anitha, Shannon M. Hughes
This paper discusses two new developments in methods for virtually reconstructing paintings that have been painted over, using X-ray synchrotron imaging data of their canvases. First, X-ray synchrotron data often contain areas of information loss, in which the signal from underlayers was unable to penetrate particularly thick or X-ray-absorbent surface features. We present a new method for automatically identifying these areas so that they may be inpainted. Second, we present preliminary results in which we reconstruct the colors of the underpainting directly from the X-ray synchrotron imaging data. This is, to our knowledge, the first attempt at accurate color reconstruction from this type of data.
Visual Search I
Keypoint clustering for robust image matching
Sundeep Vaddadi, Onur Hamsici, Yuriy Reznik, et al.
A number of popular image matching algorithms, such as the Scale Invariant Feature Transform (SIFT), are based on local image features. They first detect interest points (or keypoints) across an image and then compute descriptors based on patches around them. In this paper, we observe that in textured or feature-rich images, keypoints typically appear in clusters following patterns in the underlying structure. We show that this clustering phenomenon can be used to: 1) enhance the recall and precision performance of the descriptor matching process, and 2) improve the convergence rate of the RANSAC algorithm used in the geometric verification stage.
Fast quantization and matching of histogram-based image features
Yuriy A. Reznik, Vijay Chandrasekhar, Gabriel Takacs, et al.
We review the construction of the Compressed Histogram of Gradients (CHoG) image feature descriptor and study the quantization problem that arises in its design. We explain our choice of algorithms for solving it, addressing both complexity and performance aspects. We also study the design of algorithms for decoding and matching compressed descriptors, and offer several techniques for speeding up these operations.
Permutable descriptors for orientation-invariant image matching
Gabriel Takacs, Vijay Chandrasekhar, Huizhong Chen, et al.
Orientation-invariant feature descriptors are widely used for image matching. We propose a new method of computing and comparing Histogram of Gradients (HoG) descriptors which allows for re-orientation through permutation. We do so by moving the orientation processing to the distance comparison rather than the descriptor computation. This improves upon prior work by increasing spatial distinctiveness. Our method allows for very fast descriptor computation, which is advantageous since many mobile applications of HoG descriptors require fast descriptor computation on hand-held devices.
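The core idea of re-orientation through permutation can be illustrated with a toy orientation histogram: rotating a patch by one bin width cyclically shifts its histogram, so a rotation-tolerant distance can be computed at match time by minimizing over cyclic permutations. This is an illustrative sketch, not the authors' descriptor:

```python
import numpy as np

def rotation_invariant_dist(h1, h2):
    """Minimum L2 distance between two orientation histograms over all
    cyclic permutations (one permutation per bin shift) of the second one."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return min(np.linalg.norm(h1 - np.roll(h2, s)) for s in range(len(h2)))
```

Deferring the orientation handling to this comparison step is what allows the descriptor itself to be computed without estimating a dominant orientation first.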
Object tracking in real environments
Alexander M. Nelson, Jeremiah J. Neubert
Modern tracking methods typically rely on features to track objects and function best with objects containing distinguishable features. Previously we proposed a graph cuts approach that utilizes intensity changes and the likelihood that the RGB intensities associated with a pixel belong to the object. We now propose a new method that models the RGB tuple as a single random variable. This allows for more robust segmentation, but requires more data to construct the color model. The results show the ability of the method to track in a variety of environments and with a large variety of objects.
Visual Search II
Three-dimensional target modeling with synthetic aperture radar
John R. Hupton, John A. Saghri
Conventional Synthetic Aperture Radar (SAR) offers high-resolution imaging of a target region in the range and cross-range dimensions along the ground plane. Little or no data is available in the range-altitude dimension, however, and target functions and models are limited to two-dimensional images. This paper first investigates some existing methods for the computation of target reflectivity data in the deficient elevation domain, and a new method is then proposed for three-dimensional (3-D) SAR target feature extraction. Simulations are implemented to test the decoupled least-squares technique for high-resolution spectral estimation of target reflectivity, and the accuracy of the technique is assessed. The technique is shown to be sufficiently accurate at resolving targets in the third axis, but is limited in practicality due to restrictive requirements on the input data. An attempt is then made to overcome some of the practical limitations inherent in the current 3-D SAR methods by proposing a new technique based on the direct extraction of 3-D target features from arbitrary SAR image inputs. The radar shadow present in SAR images of MSTAR vehicle targets is extracted and used in conjunction with the radar beam depression angle to compute physical target heights along the range axis. Multiple inputs of elevation data are then merged to forge rough 3-D target models.
A Bayesian network-based approach for identifying regions of interest utilizing global image features
Mustafa Jaber, Eli Saber
An image-understanding algorithm for identifying regions of interest (ROI) in digital images is proposed. Global and regional features that characterize relations between image segments are fused in a probabilistic framework to generate an ROI for an arbitrary image. Features are introduced as maps of spatial position, weighted similarity, and weighted homogeneity for image regions. The proposed methodology includes modules for image segmentation, feature extraction, and probabilistic reasoning. It differs from prior art by using machine learning techniques to discover the optimum Bayesian network structure and perform probabilistic inference. It also eliminates the necessity for semantic understanding at intermediate stages. Experimental results show a competitive performance in comparison with state-of-the-art techniques, with an accuracy rate of ~80% on a set of ~20,000 publicly available color images. Applications of the proposed algorithm include content-based image retrieval, image indexing, automatic image annotation, mobile phone imagery, and digital photo cropping.
Low-cost asset tracking using location-aware camera phones
David Chen, Sam Tsai, Kyu-Han Kim, et al.
Maintaining an accurate and up-to-date inventory of one's assets is a labor-intensive, tedious, and costly operation. To ease this difficult but important task, we design and implement a mobile asset tracking system for automatically generating an inventory by snapping photos of the assets with a smartphone. Since smartphones are becoming ubiquitous, construction and deployment of our inventory management solution is simple and cost-effective. Automatic asset recognition is achieved by first segmenting individual assets out of the query photo and then performing bag-of-visual-features (BoVF) image matching on the segmented regions. The smartphone's sensor readings, such as digital compass and accelerometer measurements, can be used to determine the location of each asset, and this location information is stored in the inventory for each recognized asset. As a special case study, we demonstrate a mobile book tracking system, where users snap photos of books stacked on bookshelves to generate a location-aware book inventory. It is shown that segmenting the book spines is very important for accurate feature-based image matching into a database of book spines. Segmentation also provides the exact orientation of each book spine, so more discriminative upright local features can be employed for improved recognition. This system's mobile client has been implemented for smartphones running the Symbian or Android operating systems. The client enables a user to snap a picture of a bookshelf and to subsequently view the recognized spines in the smartphone's viewfinder. Two different pose estimates, one from BoVF geometric matching and the other from segmentation boundaries, are both utilized to accurately draw the boundary of each spine in the viewfinder for easy visualization.
The BoVF representation also allows matching each photo of a bookshelf rack against a photo of the entire bookshelf, and the resulting feature matches are used in conjunction with the smartphone's orientation sensors to determine the exact location of each book.
Propagation of geotags based on object duplicate detection
Peter Vajda, Ivan Ivanov, Jong-Seok Lee, et al.
In this paper, we consider the use of object duplicate detection for the propagation of geotags from a small set of images with location names (IPTC) to a large set of non-tagged images. The motivation behind this idea is that images of individual locations usually contain specific objects such as monuments, buildings, or signs. Therefore, object duplicate detection can be used to establish the correspondence between tagged and non-tagged images. Our recent graph-based object duplicate detection approach is adapted for this task. The effectiveness of the approach is demonstrated through a set of experiments considering various locations.
Compression and Transforms for Images and Video I
Design of high-performance fixed-point transforms using the common factor method
Fixed-point implementations of transforms such as the Discrete Cosine Transform (DCT) remain fundamental building blocks of state-of-the-art video coding technologies. Recently, the 16x16 DCT has received attention as a transform suitable for the high-efficiency video coding project currently underway in the Joint Collaborative Team on Video Coding. By its definition, the 16x16 DCT is inherently more complex than transforms of traditional sizes such as the 4x4 or 8x8 DCTs. However, scaled architectures such as the one employed in the design of the 8x8 DCTs specified in ISO/IEC 23002-2 can also be utilized to mitigate the complexity of fixed-point approximations of higher-order transforms such as the 16x16 DCT. This paper demonstrates the application of the Common Factor method to design two scaled implementations of the 16x16 DCT. One implementation is characterized by its exceptionally low complexity, while the other is characterized by its relatively high precision. We review the Common Factor method as a way to arrive at fixed-point implementations that are optimized in terms of complexity and precision for such high-performance transforms.
Recent developments in standardization of high efficiency video coding (HEVC)
This paper reports on recent developments in video coding standardization, particularly focusing on the Call for Proposals (CfP) on video coding technology made jointly in January 2010 by ITU-T VCEG and ISO/IEC MPEG and the April 2010 responses to that Call. The new standardization initiative is referred to as High Efficiency Video Coding (HEVC) and its development has been undertaken by a new Joint Collaborative Team on Video Coding (JCT-VC) formed by the two organizations. The HEVC standard is intended to provide significantly better compression capability than the existing AVC (ITU-T H.264 | ISO/IEC MPEG-4 Part 10) standard. The results of the CfP are summarized, and the first steps towards the definition of the HEVC standard are described.
Efficient large size transforms for high-performance video coding
This paper describes the design of transforms for extended block sizes for video coding. The proposed transforms are orthogonal integer transforms, based on a simple recursive factorization structure, which allow very compact and efficient implementations. We discuss techniques used for finding integer and scale factors in these transforms and describe our final design. We evaluate the efficiency of the proposed transforms in VCEG's H.265/JMKTA framework and show that they achieve nearly identical performance compared to much more complex transforms in the current test model.
Compression and Transforms for Images and Video II
Low-complexity lossless codes for image and video coding
We describe the design of lossless block codes for geometric, Laplacian, and similar distributions frequently arising in image and video coding. The proposed codes can be understood as a generalization of Golomb codes, allowing more precise adaptation to the parameter values of the distributions and resulting in lower redundancy. The design of universal block codes for a class of geometric distributions is also studied.
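For reference, the Golomb-Rice special case (divisor 2^k) that such codes generalize encodes a non-negative integer as a unary quotient followed by a k-bit binary remainder. A small sketch of this baseline, not of the generalized block codes the abstract proposes:

```python
def rice_encode(n, k):
    """Golomb-Rice code for non-negative n with divisor 2**k:
    unary quotient ('1' * q + '0'), then the remainder in k bits."""
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + (format(r, "b").zfill(k) if k else "")

def rice_decode(bits, k):
    """Decode one codeword from a bit string; returns (value, bits consumed)."""
    i = 0
    while bits[i] == "1":
        i += 1
    q, i = i, i + 1                     # skip the '0' terminator
    r = int(bits[i:i + k], 2) if k else 0
    return (q << k) | r, i + k
```

Choosing k to match the decay rate of the geometric source is exactly the kind of parameter adaptation that the generalized codes refine.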
Embedded memory compression for video and graphics applications
Andy Teng, Dane Gokce, Mickey Aleksic, et al.
We describe the design of a low-complexity lossless and near-lossless image compression system with random access, suitable for embedded memory compression applications. The system employs a block-based DPCM coder using variable-length encoding of the residuals. As part of this design, we propose to use non-prefix (one-to-one) codes for coding the residuals, and show that they offer improvements in compression performance compared to conventional techniques such as Golomb-Rice and Huffman codes.
Self-derivation of motion estimation techniques to improve video coding efficiency
Yi-jen Chiu, Lidong Xu, Wenhao Zhang, et al.
This paper presents techniques to self-derive motion vectors (MVs) at the video decoder side to improve the coding efficiency of B pictures. Since the MVs are self-derived at the decoder, their transmission from the encoder to the decoder is skipped, and better coding efficiency can be achieved. The proposed techniques derive block-based MVs at the decoder side by exploiting the temporal correlation among the available pixels in previously decoded reference pictures. Decoder-side MV derivation is added as a coding mode candidate at the encoder, which can select this new mode during mode decision to better trade off rate and distortion and thus improve coding efficiency. Experiments demonstrate an overall BD bitrate improvement of about 7% on top of the ITU-T/VCEG Key Technology Area (KTA) reference software with the hierarchical IbBbBbBbP coding structure, under the common test conditions of the joint call for proposals for new video coding technology issued by the ISO/MPEG and ITU-T committees in January 2010.
Variable length coding for binary sources and applications in video compression
Gergely Korodi, Dake He, Paul Imthurn
This article introduces a lossless encoding scheme for interleaved input from a fixed number of binary sources, each one characterized by a known probability value. The algorithm achieves compression performance close to the entropy, providing very fast encoding and decoding speed. The algorithm can efficiently benefit from independent parallel decoding units, and it is demonstrated to have significant advantages in hardware implementations over previous technologies.
Subjective evaluation of next-generation video compression algorithms: a case study
Francesca De Simone, Lutz Goldmann, Jong-Seok Lee, et al.
This paper describes the details and the results of the subjective quality evaluation performed at EPFL, as a contribution to the effort of the Joint Collaborative Team on Video Coding (JCT-VC) for the definition of the next-generation video coding standard. The performance of 27 coding technologies has been evaluated with respect to two H.264/MPEG-4 AVC anchors, considering high definition (HD) test material. The test campaign involved a total of 494 naive observers and took place over a period of four weeks. While similar tests have been conducted as part of the standardization process of previous video coding technologies, the test campaign described in this paper is by far the most extensive in the history of video coding standardization. The obtained subjective quality scores show high consistency and support an accurate comparison of the performance of the different coding solutions.
Computational Imaging I: Joint Session with Conference 7800
High dynamic range video with ghost removal
Stephen Mangiat, Jerry Gibson
We propose a new method for ghost-free high dynamic range (HDR) video taken with a camera that captures alternating short and long exposures. These exposures may be combined using traditional HDR techniques; however, motion in a dynamic scene will lead to ghosting artifacts. Due to occlusions and fast moving objects, a gradient-based optical flow motion compensation method will fail to eliminate all ghosting. As such, we perform simpler block-based motion estimation and refine the motion vectors in saturated regions using color similarity in the adjacent frames. The block-based search allows motion to be calculated directly between adjacent frames over a larger search range, yet at the cost of decreased motion fidelity. To address this, we investigate a new method to fix registration errors and block artifacts using a cross-bilateral filter to preserve the edges and structure of the original frame while retaining the HDR color information. Results show promising dynamic range expansion for videos with fast local motion.
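The block-based motion estimation step described above can be sketched as a full-search SAD block match (the refinement of vectors in saturated regions and the cross-bilateral cleanup are omitted):

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-size 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def block_match(cur, ref, bx, by, bsize, search):
    """Full-search block matching: return the motion vector (dx, dy)
    within +/-search pixels that minimizes SAD between the block at
    (bx, by) in the current frame and the reference frame."""
    h, w = len(ref), len(ref[0])
    block = [row[bx:bx + bsize] for row in cur[by:by + bsize]]
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + bsize > w or y + bsize > h:
                continue                      # candidate outside the frame
            cand = [row[x:x + bsize] for row in ref[y:y + bsize]]
            cost = sad(block, cand)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best, best_cost
```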
ECME hard thresholding methods for image reconstruction from compressive samples
We propose two hard thresholding schemes for image reconstruction from compressive samples. The measurements follow an underdetermined linear model, where the regression-coefficient vector is a sum of an unknown deterministic sparse signal component and a zero-mean white Gaussian component with an unknown variance. We derive an expectation-conditional maximization either (ECME) iteration that converges to a local maximum of the likelihood function of the unknown parameters for a given image sparsity level. Here, we present and analyze a double overrelaxation (DORE) algorithm that applies two successive overrelaxation steps after one ECME iteration step, with the goal of accelerating the ECME iteration. To analyze the reconstruction accuracy, we introduce the minimum sparse subspace quotient (minimum SSQ), a more flexible measure of the sampling operator than the well-established restricted isometry property (RIP). We prove that, if the minimum SSQ is sufficiently large, the DORE algorithm achieves perfect or near-optimal recovery of the true image, provided that its transform coefficients are sparse or nearly sparse, respectively. We then describe a multiple-initialization DORE algorithm (DOREMI) that can significantly improve DORE's reconstruction performance. We present numerical examples where we compare our methods with existing compressive sampling image reconstruction approaches.
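To illustrate the hard-thresholding family this work belongs to, the following is a generic iterative hard thresholding (IHT) sketch, x ← H_s(x + μ Aᵀ(y − Ax)); it is a simpler relative of the ECME/DORE iterations described above, not the authors' algorithm:

```python
def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x; zero out the rest."""
    keep = set(sorted(range(len(x)), key=lambda i: -abs(x[i]))[:s])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]

def iht(A, y, s, steps=200, mu=0.5):
    """Generic iterative hard thresholding for y ~ A x with s-sparse x:
    repeat x <- H_s(x + mu * A^T (y - A x))."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(steps):
        # residual r = y - A x, gradient step g = A^T r
        r = [y[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        x = hard_threshold([x[j] + mu * g[j] for j in range(n)], s)
    return x
```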
A survey of image retargeting techniques
Daniel Vaquero, Matthew Turk, Kari Pulli, et al.
Advances in imaging technology have made the capture and display of digital images ubiquitous. A variety of displays are used to view them, ranging from high-resolution computer monitors to low-resolution mobile devices, and images often have to undergo changes in size and aspect ratio to adapt to different screens. Also, displaying and printing documents with embedded images frequently entail resizing of the images to comply with the overall layout. Straightforward image resizing operators, such as scaling, often do not produce satisfactory results, since they are oblivious to image content. In this work, we review and categorize algorithms for content-aware image retargeting, i.e., resizing an image while taking its content into consideration to preserve important regions and minimize distortions. This is a challenging problem, as it requires preserving the relevant information while maintaining an aesthetically pleasing image for the user. The techniques typically start by computing an importance map which represents the relevance of every pixel, and then apply an operator that resizes the image while taking into account the importance map and additional constraints. We intend this review to be useful to researchers and practitioners interested in image retargeting.
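The gradient-magnitude importance map that most retargeting pipelines start from can be sketched as follows (L1 norm of simple forward differences; real systems add saliency, face detection, and other cues):

```python
def importance_map(img):
    """L1 gradient-magnitude importance map: |dI/dx| + |dI/dy| using
    forward differences, with values replicated at the borders.
    img is a 2-D list of gray levels; higher output = more important."""
    h, w = len(img), len(img[0])
    imp = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            imp[y][x] = abs(gx) + abs(gy)
    return imp
```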
Objective and subjective measurement and modeling of image quality: a case study
The image structure quality resulting from several CMOS pixel structures (conventional, backside-illuminated, and diagonally oriented) has been compared using three complementary techniques: (1) objective measurements of noise equivalent quanta (NEQ) as a function of spatial frequency; (2) perceptual modeling of the multivariate quality loss from blur and noise in units of just noticeable differences (JNDs); and (3) subjective measurement with the softcopy quality ruler, also producing results in JNDs. The results of the perceptual modeling and subjective measurement were in good quantitative agreement. NEQ is not perceptually uniform and so could only be correlated qualitatively with the other methods, but it was helpful in understanding how performance might vary by application, given the spatial frequencies at which the curves crossed. The strengths and weaknesses of each approach are compared; all three have potential utility in evaluating computational imaging systems.
Computationally efficient deblurring of shift-variant highly defocused images
Shekhar B. Sastry, Muralidhara Subbarao
A localized and efficient iterative approach is presented for deblurring images that are highly defocused with arbitrary shift-variant point spread functions (PSF). This approach extends a recently proposed local technique, the RT technique, which works only for medium levels of blur to work for significantly higher levels of blur. The RT technique is used to localize the blurring kernel at each pixel, and a region around the pixel with size comparable to the size of the blurring kernel is divided into several smaller regions (intervals). The blurred image in each interval is modeled separately by truncated Taylor-series polynomials. This step improves the accuracy of the image model for low order truncated Taylor-series expansions. The blurred image value at each pixel is expressed as the sum of multiple partial blur integrals with each integral term corresponding to one interval. Then an expression is derived for the focused image value at a pixel in terms of the derivatives of the blurred image in the central image region and solutions in the surrounding regions. This expression is solved iteratively at each pixel in parallel to obtain a focused image. The starting solution is assumed to be either zero or the blurred image itself. It is found that this new technique can effectively invert large blurs for which the original RT method failed. In our experiments the truncated Taylor series expansion was limited to third order. Theory and algorithms as well as experimental results on both simulation and real data are presented.
Computational Imaging II: Joint Session with Conference 7800
The restoration of large blur image based on POCS algorithm
Jun Luo, Xinyu Zhang, Changsheng Xie, et al.
When static images are acquired with a very high speed camera, they may exhibit large blur, and the PSF may exceed half the image resolution, a condition we call ultra-half-length blur. All experiments are based on images of 256 by 256 pixels. We first consider horizontal blur between 50 and 60 pixels; we then increase the horizontal blur length to 100 pixels, close to half the horizontal resolution of the image. These blurred images exhibit pixel aliasing to a certain extent, and the POCS algorithm can still restore them. For ultra-half-length blur with more than 150 pixels of horizontal blur, details can no longer be distinguished in the blurred images, and the traditional POCS algorithm cannot restore a high-resolution image. We therefore establish a PSF model for ultra-half-length blur: we first use different interpolations for the SR estimate, and then apply two POCS algorithms based on the ultra-half-length-blur PSF model, one pixel-by-pixel and the other every-other-pixel. Finally, we identify the ultimate performance of the POCS algorithm through these large-blur experiments.
Generating highly realistic 3D animation video with depth-of-field and motion blur effects
Karthik Sathyanarayana, Muralidhara Subbarao
A computationally efficient algorithm is described for generating shift-variant defocus and motion blur effects for animation video. This algorithm precisely models rigid body motion of 3-D objects, including arbitrary translational and rotational motion. Camera parameters such as aperture diameter, focal length, and the location of the image detector are used to calculate the blur circle radius of point spread functions (PSFs) modeled by Gaussian and cylindrical functions. In addition, a novel and simple method similar to image inpainting is described for filling missing pixels that arise due to object motion, round-off errors, interpolation, or changes in magnification. Performance of the algorithms is demonstrated on a set of 3D shapes such as sphere, cylinder, cone, etc. The software tool developed in this research is also useful in computer vision and image processing research. It can be used for simulating test data with known ground truth in the testing and evaluation of depth-from-defocus and image/video de-blurring algorithms.
Perceptual Coding of Still and Motion Images I
Evaluation of MPEG4-SVC for QoE protection in the context of transmission errors
Scalable Video Coding (SVC) provides a way to encapsulate several video layers with increasing quality and resolution in a single bitstream. Thus it is particularly well suited to heterogeneous networks and a wide variety of decoding devices. In this paper, we evaluate the usefulness of SVC in a different context: error concealment after transmission over networks subject to packet loss. The encoded scalable video streams contain two layers with different spatial and temporal resolutions designed for mobile video communications with medium size and average to low bitrates. The main idea is to use the base layer to conceal errors in the higher layers if they are corrupted or lost. The base layer is first upscaled either spatially or temporally to reach the same resolution as the layer to conceal. Two error-concealment techniques using the base layer are then proposed for the MPEG-4 SVC standard, involving frame-level concealment and pixel-level concealment. These techniques are compared to the upscaled base layer as well as to a classical single-layer MPEG-4 AVC/H.264 error-concealment technique. The comparison is carried out through a subjective experiment, in order to evaluate the Quality-of-Experience of the proposed techniques. We study several scenarios involving various bitrates and resolutions for the base layer of the SVC streams. The results show that SVC-based error concealment can provide significantly higher visual quality than single-layer-based techniques. Moreover, we demonstrate that the resolution and bitrate of the base layer have a strong impact on the perceived quality of the concealment.
Rate allocation as quality index performance test
Thomas Richter
In a recent work,16 the author proposed to study the performance of still image quality indices such as the SSIM by using it as the objective function of a rate allocation algorithm. The outcome of that work was not only a multi-scale SSIM optimal JPEG 2000 implementation, but also a first-order approximation of the MS-SSIM that is surprisingly similar to more traditional contrast-sensitivity and visual masking based approaches. It will be seen in this work that the only difference between the latter works and the MS-SSIM index is the choice of the exponent of the masking term, and furthermore, that a slight modification of the SSIM definition that reproduces more traditional exponents is able to improve the correlation with subjective tests and also improves the performance of the SSIM optimized JPEG 2000 code. That is, understanding the duality of quality indices and rate allocation helps to improve both the visual performance and the performance of the index.
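For reference, the quality index under discussion has a simple closed form. This sketch computes the single-window SSIM statistics of Wang et al. over two flat pixel lists; MS-SSIM applies the same terms over local windows and multiple scales:

```python
def ssim(x, y, L=255, K1=0.01, K2=0.03):
    """Single-window SSIM between two equal-length pixel lists:
    the product of luminance and contrast/structure terms, with the
    usual stabilizing constants c1 = (K1*L)^2 and c2 = (K2*L)^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    c1, c2 = (K1 * L) ** 2, (K2 * L) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))
```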
A compressive sensing approach to perceptual image coding
There exist limitations in the human visual system (HVS) which allow images and video to be reconstructed using fewer bits for the same perceived image quality. In this paper we will review the basis of spatial masking at edges and show a new method for generating a just-noticeable distortion (JND) threshold. This JND threshold is then used in a spatial noise shaping algorithm using a compressive sensing technique to provide a perceptual coding approach for JPEG2000 coding of images. Results of subjective tests show that the new spatial noise shaping framework can provide significant savings in bit-rate compared to the standard approach. The algorithm also allows much more precise control of distortion than existing spatial domain techniques and is fully compliant with part 1 of the JPEG2000 standard.
Perceptual Coding of Still and Motion Images II
Perceptually optimized quantization tables for H.264/AVC
Heng Chen, Geert Braeckman, Joeri Barbarien, et al.
The H.264/AVC video coding standard currently represents the state-of-the-art in video compression technology. The initial version of the standard only supported a single quantization step size for all the coefficients in a transformed block. Later, support for custom quantization tables was added, which allows the quantization step size to be specified independently for each coefficient in a transformed block. In this way, different quantization can be applied to the high-frequency and low-frequency coefficients, reflecting the human visual system's different sensitivity to high-frequency and low-frequency spatial variations in the signal. In this paper, we design custom quantization tables taking into account the properties of the human visual system as well as the viewing conditions. Our proposed design is based on a model for the human visual system's contrast sensitivity function, which specifies the contrast sensitivity as a function of the spatial frequency of the signal. By calculating the spatial frequencies corresponding to each of the transform's basis functions, taking into account viewing distance and dot pitch of the screen, the sensitivity of the human visual system to variations in the transform coefficient corresponding to each basis function can be determined and used to define the corresponding quantization step size. Experimental results, whereby the video quality is measured using VQM, show that the designed quantization tables yield improved performance compared to uniform quantization and to the default quantization tables provided as a part of the reference encoder.
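The derivation described above can be sketched as follows, assuming the Mannos-Sakrison CSF model and illustrative viewing parameters (the paper's exact CSF model and transform details may differ). Sensitivity below the CSF peak is clamped to the peak value, since low frequencies ride on the local mean:

```python
import math

def csf(f):
    """Mannos-Sakrison contrast sensitivity (peak near 8 cycles/degree)."""
    return 2.6 * (0.0192 + 0.114 * f) * math.exp(-(0.114 * f) ** 1.1)

def quant_table(n=4, dist_mm=500.0, pitch_mm=0.25, base_step=10.0):
    """Derive an n x n quantization table from the CSF.  Basis (i, j)
    of an n-point transform peaks near sqrt(i*i + j*j) / (2n) cycles
    per pixel; viewing distance and dot pitch convert this to cycles
    per degree.  Step sizes scale inversely with sensitivity."""
    ppd = math.radians(1) * dist_mm / pitch_mm        # pixels per degree
    peak = 8.0                                        # clamp below CSF peak
    return [[round(base_step * csf(peak)
                   / csf(max(math.hypot(i, j) / (2 * n) * ppd, peak)), 1)
             for j in range(n)] for i in range(n)]
```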
Open source database of images DEIMOS: high dynamic range images
Efficient development of image processing techniques requires a database of suitable test images for verification of performance, optimization, and other related purposes. In this paper, DEIMOS, an open-source database, is described, including its structure and interface. A selected application example on high dynamic range content illustrates the database's features. This HDR image database contains a variety of natural scenes captured with a digital single-lens reflex (DSLR) camera under different conditions. The important capture parameters, as well as the important characteristics of the camera, are part of the database to ensure that the creation of each image is well documented. The DEIMOS database is created gradually, step by step, based upon the contributions of team members.
Research of color distribution index in CIE L*a*b* color space
An index for evaluating a display's ability to reproduce color is required. The color distribution index (CDI) was previously proposed to assess the color distribution of the reproduction in the CIE Lu'v' color space. A cell of just noticeable difference (JND) in luminance and chromaticity (u'v') was proposed to determine whether the reproduced colors fall within a given region of the display's color volume. The human eye perceives fewer colors at low luminance; however, the scale of the chromaticity (u'v') JND at low luminance was the same as at other luminances, so the CDI is distorted at low luminance. In this paper, to account for perceptible vision at low luminance, we replace the chromaticity (u'v') JND with a chromaticity (a*b*) JND, and the color distribution is discussed in the CIE L*a*b* color space. We find that the CDI at low luminance is higher in the CIE L*a*b* color space than in the CIE Lu'v' color space, and that different gamma curves and different bit depths also affect the CDI. As displays keep approaching 100% true color reproduction, such an index for evaluating color reproduction is required.
Mobile Video: Processing, Communications, Display, and Applications I
Video quality management for mobile video application
Kai-Chieh Yang, Khaled El-Maleh, Vasudev Bhaskaren
This paper first briefly reviews sources of visual quality degradation during video compression, along with different video quality assessment techniques. It then extends the discussion beyond video compression to the other modules in the video application pipeline. Each video application is composed of different processing modules, such as the sensor, video encoder, and display, and visual experience is not always determined by a single module. Hence, the way visual experience is quantified should vary accordingly across applications. Furthermore, users have very different expectations of visual experience in each application, so the quality assessment approach should be chosen based on those expectations.
Low-complexity H.264/AVC motion compensated prediction for mobile video applications
Szu-Wei Lee, C.-C. Jay Kuo
The performance of the motion-compensated prediction (MCP) in video coding is degraded by aliasing due to spatial sampling. To alleviate this problem in H.264/AVC, a low-pass filter is used by the fractional-pel motion estimation (FME) to suppress the aliasing component. However, the FME significantly increases the computational complexity of H.264/AVC encoding. In this work, we first perform a joint quantization and aliasing analysis of the H.264/AVC MCP process and show that the impact of the aliasing component can be alleviated by the quantization process. Then, we propose a fast motion estimation (ME) algorithm that uses the FME and the integer-pel motion estimation (IME) adaptively. The adaptive FME/IME algorithm examines the coding modes of the reference block for the current coding block, and then decides whether the FME or the IME should be applied to the current coding block. Experimental results show that the proposed adaptive FME/IME algorithm can help the encoder generate a bit stream at much lower computational complexity with small degradation in the coding gain as compared with a pure FME algorithm.
Decoder friendly H.264/AVC deblocking filter design
Szu-Wei Lee, C.-C. Jay Kuo
The complexity model of the H.264 deblocking filter (DBF) is studied in this work. The DBF process consists of three main modules: 1) boundary strength computation, 2) edge detection, and 3) low-pass filtering. The complexities of all three are considered in the proposed model. DBF-based decoding complexity control is also investigated. It is shown experimentally that the proposed complexity model provides reasonably accurate estimates. Moreover, an H.264 encoder equipped with the complexity model and the DBF-based decoding complexity control algorithms can generate bit streams that save a significant amount of decoding complexity while offering quality similar to that of streams generated by a typical H.264 encoder.
Mobile Video: Processing, Communications, Display, and Applications II
Postprocessing and denoising of video using sparse multiresolutional transforms
Osman G. Sezer, Onur G. Guleryuz
This paper describes the construction of a set of sparsity-distortion-optimized orthonormal transforms designed for wavelet-domain image denoising. The optimization operates over sub-bands of given orientation and exploits intra-scale dependencies of wavelet coefficients over image singularities. When applied on top of standard wavelet transforms, the resulting sparse representation provides compaction that can be exploited in transform-domain denoising via cycle-spinning.1 Our construction deviates from the literature, which mainly focuses on model-based methods, by offering a data-driven optimization of wavelet representations. In translation-invariant denoising, the proposed method consistently outperforms the original wavelet representation and can reach improvements of up to 3 dB.
Image retargeting for small display devices
Chanho Jung, Changick Kim
In this paper, we propose a novel image importance model for image retargeting. The most widely used image importance model in existing image retargeting methods is the L1-norm or L2-norm of the gradient magnitude. It works well in non-complex environments. However, the gradient-magnitude-based image importance model often leads to severe visual distortions when the scene is cluttered or the background is complex. In contrast to most previous approaches, we focus on gradient domain statistics (GDS), rather than the gradient magnitude itself, for more effective image retargeting. In our work, image retargeting is developed from the perspective of human visual perception. We assume that human visual perception is highly adaptive and sensitive to structural information in an image rather than to non-structural information. We do not model the image structure explicitly, since there are diverse aspects of image structure. Instead, our method obtains the structural information in an image by exploiting the gradient domain statistics in an implicit manner. Experimental results show that the proposed method is more effective than previous image retargeting methods.
Adaptive image backlight compensation for mobile phone users
Haejung Kong, Chanho Jung, Wonjun Kim, et al.
User-friendliness and cost-effectiveness have contributed to the growing popularity of mobile phone cameras. However, images captured by such mobile phone cameras are easily distorted by a wide range of factors, such as backlight, over-saturation, and low contrast. Although several approaches have been proposed to solve the backlight problem, most of them still suffer from distorted background colors and high computational complexity. Thus, they are not deployable in mobile applications requiring real-time processing with very limited resources. In this paper, we present a novel framework to compensate image backlight for mobile phone applications, based on an adaptive pixel-wise gamma correction which is computationally efficient. The proposed method is composed of two sequential stages: 1) illumination condition identification and 2) adaptive backlight compensation. Input images are first classified into facial and non-facial images to provide prior knowledge for identifying the illumination condition. We then further categorize the facial images into backlight and non-backlight images based on local image statistics obtained from the corresponding face regions. We finally compensate the image backlight using an adaptive pixel-wise gamma correction method while effectively preserving global and local contrast. To show the superiority of our algorithm, we compare our proposed method with other state-of-the-art methods in the literature.
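The core brightness-compensation step can be sketched with a global (non-pixel-wise) adaptive gamma: choose the gamma that maps the image's mean luminance onto a mid-tone target. The paper's method additionally adapts per pixel and per illumination class, so this is only illustrative, and the mid-tone target of 0.5 is an assumed parameter:

```python
import math

def adaptive_gamma(img, target_mean=0.5):
    """Global adaptive gamma correction: choose gamma so that the
    image's mean luminance maps onto a mid-tone target, then apply it
    to every pixel.  Pixels are floats in [0, 1]."""
    flat = [p for row in img for p in row]
    mean = sum(flat) / len(flat)
    # Solve mean ** gamma == target_mean for gamma (mean clamped away
    # from 0 and 1 to keep the logarithms finite).
    gamma = math.log(target_mean) / math.log(max(min(mean, 0.999), 1e-3))
    return [[p ** gamma for p in row] for row in img]
```

A dark (backlit) image gets gamma < 1 and is brightened; an over-bright image gets gamma > 1 and is darkened.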
Remote gaming on resource-constrained devices
Waazim Reza, Hari Kalva, Richard Kaufman
Games have become important applications on mobile devices. A mobile gaming approach known as remote gaming is being developed to support games on low cost mobile devices. In the remote gaming approach, the responsibility of rendering a game and advancing the game play is put on remote servers instead of the resource constrained mobile devices. The games rendered on the servers are encoded as video and streamed to mobile devices. Mobile devices gather user input and stream the commands back to the servers to advance game play. With this solution, mobile devices with video playback and network connectivity can become game consoles. In this paper we present the design and development of such a system and evaluate the performance and design considerations to maximize the end user gaming experience.
Optics, Photonics and Digital Image Processing
Multivariate image analysis of laser-induced photothermal imaging used for detection of caries tooth
Time-resolved photothermal imaging has been investigated to characterize teeth for the purpose of discriminating between normal and carious areas of the hard tissue using a thermal camera. Ultrasonic thermoelastic waves were generated in the hard tissue by the absorption of fiber-coupled Q-switched Nd:YAG laser pulses operating at 1064 nm, in conjunction with a laser-induced photothermal technique used to detect the thermal radiation waves for diagnosis of the human tooth. The concepts behind the use of photothermal techniques for off-line detection of caries features were presented by our group in earlier work. This paper illustrates the application of multivariate image analysis (MIA) techniques to detect the presence of dental caries. MIA is used to rapidly detect the presence and quantity of common caries features as teeth are scanned by high-resolution color (RGB) thermal cameras. Multivariate principal component analysis is used to decompose the acquired three-channel tooth images into a two-dimensional principal component (PC) space. Masking score point clusters in the score space and highlighting the corresponding pixels in the image space of the two dominant PCs enables isolation of caries defect pixels based on contrast and color information. The technique provides a qualitative result that can be used for early-stage caries detection, and can potentially be applied on-line or in real time to prescreen for caries through vision-based systems such as a real-time thermal camera. Experimental results on a large number of extracted teeth, as well as on a thermal image panorama of the teeth of a human volunteer, are investigated and presented.
A multi-pedestrian detection and counting system using fusion of stereo camera and laser scanner
Bo Ling, Spandan Tiwari, Zhuang Li, et al.
Automated vehicle counting technology has been in use for many years, but developments in automated pedestrian counting technology have been limited. Pedestrians are more difficult to detect, track, and count because their paths are much less constrained. In this paper, we present an advanced pedestrian counting system using a stereo camera and a laser scanner. A mapping algorithm has been developed to map the detection locations in the laser scanner coordinates to the stereo-image coordinates. For pedestrian tracking, we apply nonparametric statistical hypothesis tests such as the Kolmogorov-Smirnov test for association of close tracks, and incorporate pedestrian image features such as SIFT (Scale-Invariant Feature Transform) into a Kalman filter for multi-pedestrian tracking. Test results based on data collected at a street intersection have demonstrated that this pedestrian counting system can accurately detect, track, and count multiple pedestrians walking in a large group.
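The two-sample Kolmogorov-Smirnov statistic used for track association can be sketched as follows (a small statistic between the feature samples of two nearby tracks suggests they belong to the same pedestrian):

```python
import bisect

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical
    gap between the empirical CDFs of samples a and b.  A small value
    suggests the two samples come from the same distribution."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for v in sorted(set(a) | set(b)):
        fa = bisect.bisect_right(a, v) / len(a)   # empirical CDF of a at v
        fb = bisect.bisect_right(b, v) / len(b)
        d = max(d, abs(fa - fb))
    return d
```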
Defect detection and classification of machined surfaces under multiple illuminant directions
Yi Liao, Xin Weng, C. W. Swonger, et al.
Continuous improvement of product quality is crucial to a successful and competitive automotive manufacturing industry in the 21st century. The presence of surface porosity on flat machined surfaces such as cylinder heads/blocks and transmission cases may allow leaks of coolant, oil, or combustion gas between critical mating surfaces, thus causing damage to the engine or transmission. Therefore, 100% inline inspection plays an important role in improving product quality. Although image processing and machine vision techniques have been applied to machined surface inspection and much improved over the past 20 years, in today's automotive industry surface porosity inspection is still done by skilled humans, which is costly, tedious, time consuming, and not capable of reliably detecting small defects. In our study, an automated defect detection and classification system for flat machined surfaces has been designed and constructed. In this paper, the importance of the illuminant direction in a machine vision system is first emphasized, and the surface defect inspection system under multiple directional illuminations is then described. Image processing algorithms were developed to detect and classify five types of 2D or 3D surface defects (pore, 2D blemish, residue dirt, scratch, and gouge). The steps of image processing include: (1) image acquisition and contrast enhancement; (2) defect segmentation and feature extraction; and (3) defect classification. An artificial machined surface and an actual automotive part (a cylinder head surface) were tested; as a result, microscopic surface defects can be accurately detected and assigned to a surface defect class. The cycle time of this system is sufficiently fast that implementation of 100% inline inspection is feasible. The field of view of this system is 150 mm × 225 mm, and surfaces larger than the field of view can be stitched together in software.
Comparison between two different methods to obtain the wavefront aberration function
Angel S. Cruz Félix, Jorge Ibarra, Estela López, et al.
The analysis and measurement of the wavefront aberration function are very important tools that allow us to evaluate the performance of any specified optical system. This technology has been adopted in visual optics for the analysis of optical aberrations in the human eye, before and after laser refractive surgery. We have been working on the characterization and evaluation of the objective performance of human eyes that have been subjected to two different surface ablation techniques known as ASA and PASA1. However, optical aberrations in the human eye are time-dependent2 and, hence, difficult to analyze. In order to obtain a static profile of the post-operative wavefront aberration function, we applied these ablation techniques directly to hard contact lenses. In this work we show the comparison between two different methods to obtain the wavefront aberration function from a reference refractive surface, in order to generalize this method and be able to fully characterize hard contact lenses that have been subjected to the different ablation techniques typically used in refractive surgery for vision correction. For the first method we used a Shack-Hartmann wavefront sensor, and for the second method we used a Mach-Zehnder type interferometer. We show the preliminary results of this characterization.
Refractive power maps of the anterior surface of the cornea according to different models
Lucerito Morales-Tellez, Marco A. Rosales, Estela López-Olazagasti, et al.
In order to explore and analyze the effect of an ablation performed on the anterior corneal surface, it is useful to calculate the refractive power maps of the original and treated corneas. The optical characteristics of the anterior corneal surface are typically simulated with different models, with different degrees of simplification. To predict which ablation would improve the refractive power of a given cornea, which is directly related to the spherical aberration associated with the shape of the anterior corneal surface, it is important to analyze those simplifications. This information is displayed in a refractive power map, which yields the true refractive power of the corneal surface, point by point, expressed in diopters. The aim of the present work is twofold: first, different corneal models are simulated in order to compare the spherical aberration produced by each; second, the simulations are arranged to foresee how the visual performance of an eye can be improved by modifying the anterior surface of its cornea, through the corresponding power maps.
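The basic conversion behind such a map can be sketched with the paraxial single-surface formula P = (n_k − 1)/R, with R the local radius of curvature in meters. The keratometric index 1.3375 used below is a common convention, not necessarily the value used by the authors, and the spherical model cornea is illustrative only.

```python
import numpy as np

# Point-by-point paraxial refractive power of the anterior corneal
# surface: P = (n_k - 1) / R, with R the local radius of curvature in
# meters. n_k = 1.3375 is a common keratometric convention (assumed
# here; the paper's models may use other values).
N_K = 1.3375

def power_map(radius_map_m):
    """Convert a map of local curvature radii (m) to diopters."""
    return (N_K - 1.0) / radius_map_m

# Spherical model cornea, R = 7.8 mm everywhere
R = np.full((5, 5), 7.8e-3)
P = power_map(R)
print(round(float(P[0, 0]), 2))  # → 43.27
```

A real power map would derive R point by point from the measured or modeled surface elevation, which is where the different corneal models diverge.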
Rapid ideal template creation for the inspection of MEMS based on self-similarity characteristics
Surface metrology of MEMS requires high-resolution sensors due to their fine structures. An automated multiscale measurement system with multiple sensors at multiple scales enables fast acquisition of surface data by using high-resolution sensors only at the locations where they are required. We propose a technique that exploits the fact that MEMS often have features (e.g., combs) repeating across the surface. These features can be segmented and fused to generate an ideal template. We present an automated similarity search approach based on feature detection, rotation-invariant matching, and the sum of absolute differences to find similar structures on the specimen. Similar segments are then fused and replaced in the original image to generate an ideal template.
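The sum-of-absolute-differences stage of such a similarity search can be sketched as an exhaustive window scan. This is only the SAD core; the paper's method adds feature detection and rotation-invariant matching on top of it.

```python
import numpy as np

def sad_match(image, template):
    """Exhaustive sum-of-absolute-differences search: return the
    position where the template best matches the image, plus the
    SAD score there (0 means a perfect match)."""
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = np.inf, None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            sad = np.abs(image[y:y+th, x:x+tw] - template).sum()
            if sad < best:
                best, best_pos = sad, (y, x)
    return best_pos, best

img = np.zeros((20, 20))
img[5:9, 7:11] = 1.0          # one repeating 'comb' feature
tpl = np.ones((4, 4))
pos, score = sad_match(img, tpl)
print(pos, score)  # → (5, 7) 0.0
```

In production such a scan would be accelerated (e.g., with integral images or FFT-based correlation) rather than run as nested Python loops.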
Automatic alignment of multi-temporal images of planetary nebulae using local optimization
Automatic alignment of time-separated astronomical images has historically proven difficult, mainly because of the sporadic and unpredictable noise associated with astronomical images. A few examples of these effects are image distortion due to optics, cosmic ray hits, transient background sources (supernovae), and various artifact sources associated with the CCD imager itself. In this paper a new automated image registration method is introduced for aligning two time-separated images while minimizing the inherent errors and unpredictabilities. Using local optimization, the two images are aligned when the root mean square of the difference between the two images is minimized. The dataset consists of images of galactic planetary nebulae acquired by the Hubble Space Telescope. The aligned centroids inferred by the proposed method agree, with high confidence, with results from previously aligned images upon inspection. It is also demonstrated that the method is robust, does not require extensive user input, and is highly sensitive to minor adjustments.
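The RMS-minimization idea can be sketched with a brute-force integer-shift search, standing in for the paper's local optimization. Real astronomical data would additionally need sub-pixel refinement and rejection of cosmic-ray hits and other transients; the synthetic "nebula" below is illustrative only.

```python
import numpy as np

def rms(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def align_by_rms(ref, moving, max_shift=5):
    """Find the integer (dy, dx) shift of `moving` that minimizes the
    RMS difference with `ref`. A brute-force stand-in for the local
    optimization used in the paper."""
    best, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(moving, dy, axis=0), dx, axis=1)
            e = rms(ref, shifted)
            if e < best:
                best, best_shift = e, (dy, dx)
    return best_shift, best

ref = np.zeros((32, 32)); ref[10:14, 12:16] = 1.0   # synthetic source
mov = np.roll(np.roll(ref, 3, axis=0), -2, axis=1)  # shifted epoch
shift, err = align_by_rms(ref, mov)
print(shift, round(err, 6))  # → (-3, 2) 0.0
```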
Poster Session
The development of an automatic scanning path generation method for the spinneret test
Chun-Jen Chen, Min-Wei Hung, Wenyuh Jywe, et al.
An automatic scanning path generation method is developed, based on a 3-axis automatic inspection system used to measure the clearance ratio of spinneret plates. The user can rely on this method to automatically generate the scanning path for an unknown spinneret plate in the spinneret test. The scanning path can then be learned by the inspection system and repeated for other identical spinnerets. Two types of spinnerets are used in this paper to describe the automatic scanning path generation method. The 3-axis automatic inspection system includes a 3-axis motorized linear stage, a telecentric lens, a top light source, a bottom light source, a CCD camera, and a control PC.
Meteor automatic imager and analyzer: analysis of noise characteristics and possible noise suppression
This paper is devoted to noise analysis and noise suppression in a system for double-station observation of meteors, now known as MAIA (Meteor Automatic Imager and Analyzer). The noise analysis is based on the acquisition of test video sequences under different light conditions and their subsequent analysis. The main goal is to find a suitable noise model and determine whether the noise is signal-dependent. The noise and image models in the wavelet domain are based on the Gaussian mixture model (GMM) or the generalized Laplacian model (GLM), with model parameters estimated by the method of moments. Both GMM and GLM can model various types of probability density functions. Finally, an advanced de-noising algorithm using a Bayesian estimator is applied.
Correlation-based nonlinear composite filters applied to image recognition
Correlation-based pattern recognition has been an area of extensive research in the past few decades. Recently, composite nonlinear correlation filters invariant to translation, rotation, and scale were proposed. The design of the filters is based on logical operations and nonlinear correlation. In this work, nonlinear filters are designed and applied to non-homogeneously illuminated images acquired with an optical microscope. The target images are embedded in a cluttered background, non-homogeneously illuminated, and corrupted by random noise, which makes the recognition task difficult. The performance of the nonlinear composite filters is compared with that of other composite correlation filters in terms of discrimination capability.
Automated tracking of yeast cell lineages
Kyungnam Kim, Amy C. Rowat, Anne E. Carpenter
We propose a cell progeny tracking method that sequentially employs image alignment, chamber cropping, cell segmentation, per-cell feature measurement, and progeny (lineage) tracking modules. It enables biologists to keep track of phenotypic patterns not only over time but also over multiple generations. Yeast cells encapsulated in chambers of a polydimethylsiloxane (PDMS) microfluidic device were imaged over time to monitor changes in fluorescence levels. We implemented our method in an automated cell image analysis tool, CellProfiler, and performed initial testing. Once refined and validated, the approach could be adapted to other cell segmentation and progeny tracking experiments.
Center location error correction of circular targets
Circular targets are commonly used in vision measurement and photogrammetry. Due to the asymmetry of perspective projection, the geometric centroid of the elliptical projection and the true projection of the target center are not identical, which leads to a systematic center location error. A method to correct this center location error is presented in this paper. The surface normal directions of the circular targets are first determined by camera calibration. Then the correction values for the geometric centroids are calculated using space analytic geometry. The experimental results show that accuracy is improved after error correction with our method.
Performance of visual tasks from contour information
Yitzhak Yitzhaky, Liron Itan
A recently proposed visual aid for patients with a restricted visual field (tunnel vision) combines a see-through head-mounted display (HMD) with a simultaneous minified contour view of the wide-field image of the environment. Such a widening of the effective visual field is helpful for tasks such as visual search, mobility, and orientation. The sufficiency of contours (outlines of the objects in the image) for performing everyday visual tasks by human observers is of major importance for this application, as well as for other applications, and for a basic understanding of human vision. Due to their efficiency as object descriptors, contours are widely used in computer vision applications, and many methods have therefore been developed for their automatic extraction from the image. The purpose of this research is to examine and compare the use of different types of automatically created contours, and contour representations, for practical everyday visual operations on commonly observed images. The visual operations include visual search for items such as keys, a remote control, etc. Considering different recognition levels, identification of an object is distinguished from detection (when it is not clearly identified). Some new non-conventional visually based contour representations were developed for this purpose. Experiments were performed with normal-vision subjects, by superposing contours of the wide field of the scene over a narrow-field (see-through) background. Results show that about 85% success is obtained for searched-object identification when the best contour versions are employed.
Augmented reality system
Chien-Liang Lin, Yu-Zheng Su, Min-Wei Hung, et al.
In recent years, Augmented Reality (AR) [1-3] has become very popular in universities and research organizations. AR technology has been widely used in Virtual Reality (VR) fields such as sophisticated weapons, flight vehicle development, data model visualization, virtual training, entertainment, and the arts. AR enhances the displayed output of a real environment with specific user-interactive functions and specific object recognition. It can be used in medical treatment, anatomy training, precision instrument casting, warplane guidance, engineering, and remote robot control. AR has many advantages over VR. The system developed here combines sensors, software, and imaging algorithms to give users a sense of something real, actual, and existing. The imaging algorithms include a gray-level method, an image binarization method, and a white balance method, in order to achieve accurate image recognition and overcome the effects of lighting.
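Two of the named preprocessing stages can be sketched as follows. The gray-world white balance and the fixed binarization threshold are common textbook variants chosen for illustration; the abstract does not specify which exact methods the system uses.

```python
import numpy as np

def gray_world_balance(rgb):
    """Gray-world white balance: scale each channel so its mean matches
    the global mean. One common variant, assumed here for illustration."""
    means = rgb.reshape(-1, 3).mean(axis=0)
    gain = means.mean() / means
    return np.clip(rgb * gain, 0, 255)

def binarize(gray, thresh=128):
    """Fixed-threshold binarization (the threshold value is illustrative;
    an adaptive threshold would better handle uneven lighting)."""
    return (gray >= thresh).astype(np.uint8)

img = np.zeros((4, 4, 3))
img[..., 0] = 200.0  # strong red cast
img[..., 1] = 100.0
img[..., 2] = 100.0
balanced = gray_world_balance(img)
print(balanced[0, 0])  # all three channels pulled toward a common mean
```

After balancing, each pixel's channels equal the global mean (here 400/3 ≈ 133.3), removing the color cast before recognition.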
Image restoration with local adaptive methods
Local adaptive processing in sliding transform domains for image restoration and noise removal with preservation of edges and detail boundaries represents a substantial advance in signal and image processing, thanks to its robustness to signal imperfections and its local adaptivity (context sensitivity). Local filters in the domain of orthogonal transforms modify the transform coefficients of the signal at each position of a moving window to obtain an estimate of only the central pixel of the window. A minimum mean-square error estimator in the domain of sliding discrete cosine and sine transforms is derived for noise removal and restoration. This estimator is based on fast inverse sliding transforms. To provide image processing at a high rate, fast recursive algorithms for computing the sliding sinusoidal transforms are utilized. The algorithms are based on a recursive relationship between three subsequent local spectra. Computer simulation results using synthetic and real images are provided and discussed.
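The sliding-window principle, where only the central pixel of each window is re-estimated from modified transform coefficients, can be sketched as follows. Hard thresholding stands in for the paper's MMSE estimator, and no fast recursive spectrum update is used; the window size and threshold are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dctn, idctn

def sliding_dct_denoise(img, win=8, thresh=30.0):
    """Sliding-window DCT denoising sketch: at each window position,
    hard-threshold the DCT coefficients and keep only the estimate of
    the central pixel. An illustrative stand-in for the paper's MMSE
    estimator with fast recursive transforms."""
    pad = win // 2
    padded = np.pad(img, pad, mode='reflect')
    out = np.empty_like(img, dtype=float)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            block = padded[y:y+win, x:x+win]
            c = dctn(block, norm='ortho')
            c[np.abs(c) < thresh] = 0.0     # suppress small (noise) coefficients
            est = idctn(c, norm='ortho')
            out[y, x] = est[pad, pad]       # keep only the central pixel
    return out

rng = np.random.default_rng(0)
clean = np.full((16, 16), 100.0)
noisy = clean + rng.normal(0, 10, clean.shape)
denoised = sliding_dct_denoise(noisy)
print(abs(denoised - clean).mean() < abs(noisy - clean).mean())  # → True
```

The recursive algorithms mentioned in the abstract avoid recomputing the full transform at each window position, which is what makes this per-pixel scheme practical at high rates.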
Improvement of visual perception in cloudy environments
A new iterative algorithm for the improvement of visual perception in cloudy environments is presented. The proposed approach is based on a heuristic search algorithm used to estimate the depth map of a scene captured under bad weather conditions. Using the suggested algorithm, an undegraded signal can be estimated locally in an iterative way, increasing the confidence of decision making in computer vision applications for human assistance. Computer simulation results obtained with the proposed algorithm are provided and discussed in terms of performance metrics and computational complexity.
Performance test of optical and electronic image stabilizer for digital imaging system
Qi Li, Zhihai Xu, Huajun Feng, et al.
We designed and fabricated a test apparatus to analyze the performance characteristics of optical and electronic image stabilization. The imaging system (a digital video camera with an image stabilization function) was fixed on a platform; the vibration frequency of the platform varies with the input voltage of an electric motor, and the vibration amplitude is changed by adjusting the position of the motor shaft. We start the vibration platform and acquire an ordinary image sequence; we then turn on the stabilizer and record an image sequence under optical stabilization; afterwards, with the optical stabilizer turned off, motion detection and compensation are applied to the acquired frames to obtain an image sequence with electronic image stabilization. We analyzed and processed the two kinds of image sequences from the test apparatus and drew some conclusions about the performance characteristics of the image stabilizers. Electronic image stabilization performs better at low frequencies, and optical image stabilization performs better at high frequencies. Furthermore, the improvement in the degree of image stability provided by electronic image stabilization is essentially independent of the vibration frequency, while the improvement provided by optical image stabilization increases significantly with the vibration frequency.
Meteor automatic imager and analyzer: system design and its parameters
A system for double-station observation of meteors, now known as MAIA (Meteor Automatic Imager and Analyzer), is introduced in this paper. The system is an evolution of the current analog solution. It is based on two stations with Gigabit Ethernet cameras, sensitive image intensifiers, and automatic processing of the recorded image data. The aim of this design is to capture and analyze images of meteors down to masses of fractions of a gram. This paper presents the measured electro-optical characteristics of the particular components and the overall performance of the new digital system in comparison with the current analog solution. First, the optimal settings of various parameters for each subsystem (primary lens, image intensifier, secondary lens, and camera) are determined. Then a set of test images is captured and analyzed. The analysis of images captured with both artificial and real targets verifies the suitability of the selected system design.
Analysis of the selection of overlapping region of sectioned restoration for images with space-variant point spread function
Xiaoping Tao, Jufeng Zhao, Huajun Feng, et al.
Classical image restoration is mostly based on image deconvolution under the assumptions of a linear system transformation, stationary signal statistics, and stationary, signal-independent noise. Unfortunately, these assumptions are not always accurate in real problems. For example, optical aberrations, local defocus, local motion blur, temperature variation, flexible media, and non-stationary platforms all cause different, uncertain degradations in different areas of an image. Therefore, overlapping-region sectioned restoration is suggested to reconstruct such blurred images with a space-variant point spread function (SVPSF). First, the full image is divided into several sub-sections, within which the PSF is nominally space-invariant (SI). After restoration with an SI algorithm, the sub-frames are spliced to construct the composite full frame. Moreover, an overlapping extension is employed to isolate the edge-ringing effects of circular convolution between the different restored sub-frames. In this paper, with the help of the SSIM (structural similarity) and GRM (gradient ringing matrix) image quality assessment approaches, we discuss the selection of the overlapping region for sectioned restoration with different algorithms, for images with signal-to-noise ratios (SNR) from 25 dB to 40 dB. Our investigation shows that the restored image quality is best when the overlapping region is as wide as the energy-distribution area of the degradation function.
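The tile-and-trim mechanics of overlapping-region sectioned restoration can be sketched as follows. The per-tile `restore` callable is a placeholder for a space-invariant deconvolution, and the tile and overlap sizes are illustrative, not the optimum discussed in the paper.

```python
import numpy as np

def sectioned_process(img, tile=16, overlap=4, restore=lambda b: b):
    """Overlapping-region sectioned processing sketch: each tile is
    restored independently (here `restore` is an identity placeholder
    for an SI deconvolution), and the overlap margins are discarded at
    stitching time so boundary ringing stays out of the final frame."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    step = tile - 2 * overlap
    for y0 in range(0, h, step):
        for x0 in range(0, w, step):
            ya, xa = max(y0 - overlap, 0), max(x0 - overlap, 0)
            yb, xb = min(y0 + step + overlap, h), min(x0 + step + overlap, w)
            block = restore(img[ya:yb, xa:xb])
            # keep only the interior (overlap trimmed) region
            out[y0:min(y0 + step, h), x0:min(x0 + step, w)] = \
                block[y0 - ya:y0 - ya + min(step, h - y0),
                      x0 - xa:x0 - xa + min(step, w - x0)]
    return out

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
print(np.allclose(sectioned_process(img), img))  # → True
```

With the identity placeholder the output reproduces the input exactly, confirming that the tiling and trimming bookkeeping is lossless; a real run would swap in a per-tile deconvolution.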
Image restoration of nonuniformly illuminated images with camera microscanning
Various techniques have been proposed for image recovery from degraded observed images. Most of these methods deal with linear degradations and carry out signal processing using a single observed image. In this paper, multiplicative, additive, and impulsive image degradations are investigated. We propose restoration algorithms based on three observed degraded images obtained from a microscanning camera. It is assumed that the degraded images contain information about the original image, an illumination function, and noise. Using the three degraded images and a mathematical model of the degradation, a set of equations is formed. By solving this system of equations with an iterative algorithm, the original image is recovered.
Vertex-based marching algorithms for finding multidimensional geometric intersections
Lubomir T. Dechevsky, Arne Lakså, Børre Bang, et al.
This article is a survey of the current state of the art in vertex-based marching algorithms for solving systems of nonlinear equations and multidimensional intersection problems. It also addresses ongoing research and future work on the topic. Among the new topics discussed here for the first time is the problem of characterizing the type of singularities of piecewise affine manifolds, which are the numerical approximations to the solution manifolds, as generated by the most advanced of the considered vertex-based algorithms: the Marching-Simplex algorithm. Several approaches are proposed for solving this problem, all of which are related to modifications, extensions, and generalizations of the Morse lemma in differential topology.
Calibration of a dual-PTZ camera system for stereo vision
Yau-Zen Chang, Jung-Fu Hou, Yi Hsiang Tsao, et al.
In this paper, we propose a calibration process for the intrinsic and extrinsic parameters of dual-PTZ camera systems. The calibration is based on a complete definition of six coordinate systems fixed at the image planes and at the pan and tilt rotation axes of the cameras. Misalignments between estimated and ideal coordinates of image corners are formed into cost values to be minimized by the Nelder-Mead simplex optimization method. Experimental results show that the system is able to obtain 3D coordinates of objects with a consistent accuracy of 1 mm when the distance between the dual-PTZ camera set and the objects is between 0.9 and 1.1 meters.
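The cost-minimization step can be sketched with a deliberately simplified model: here only two angular offsets are recovered from corner misalignments, whereas the paper's formulation spans six coordinate systems. The corner positions and the offset are synthetic.

```python
import numpy as np
from scipy.optimize import minimize

# Toy version of the calibration step: recover two offsets by
# minimizing the misalignment between ideal and 'observed' image-corner
# positions with the Nelder-Mead simplex method.
ideal = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
true_offset = np.array([0.03, -0.05])       # synthetic unknown bias
observed = ideal + true_offset

def cost(params):
    """Sum of squared corner misalignments for a candidate offset."""
    predicted = ideal + params
    return np.sum((predicted - observed) ** 2)

res = minimize(cost, x0=[0.0, 0.0], method='Nelder-Mead',
               options={'xatol': 1e-8, 'fatol': 1e-8})
print(res.x.round(4))
```

Nelder-Mead is derivative-free, which is convenient here because the full calibration cost involves camera projections that are awkward to differentiate analytically.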
Utilization of consumer level digital cameras in astronomy
This paper presents a study of the possible utilization of digital single-lens reflex (DSLR) cameras in astronomy. DSLRs have a great advantage over professional equipment in cost efficiency, with comparable usability for selected purposes. The quality of the electro-optical system in a DSLR camera determines the areas where it can be used with acceptable precision. First, a set of camera parameters important for astronomical use is introduced. The color filter array (CFA) structure, demosaicing algorithm, image sensor spectral properties, noise, and transfer characteristics are among the most important parameters, and these are analyzed further in the paper. Compression of astronomical images using the KLT approach is also described. The potential impact of these parameters on position and photometric measurements is presented, based on analysis and measurements with a wide-angle lens. The prospective utilization of a consumer DSLR camera as a substitute for expensive devices is discussed.
Use of the EM algorithm in image registration in a scene captured by a moving camera
Nader M. Namazi, William Scharpf, Jay Obermark, et al.
This paper presents the use of the Expectation-Maximization (EM) method for image motion registration in a scene captured by a moving camera. In [1] we presented a new iterative algorithm for the correction of geometrical distortion caused by global motion in a scene. A binary hypothesis test was subsequently established to classify the pixels in the corrected image as either locally moving (object motion) or not moving (stationary). The development involved unknown parameters, such as the noise variance and the motion variance, that needed to be estimated. This paper presents the use of the EM algorithm to estimate these parameters. We present experiments with real image sequences to validate the analytical developments.
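The flavor of such an estimation step can be sketched with a generic EM for a two-component, zero-mean Gaussian mixture over frame-difference residuals: one narrow component for stationary pixels (noise variance) and one broad component for moving pixels (motion variance). This is a textbook EM, not the authors' exact formulation, and the data are synthetic.

```python
import numpy as np

def em_two_variances(residuals, iters=50):
    """EM sketch: fit a zero-mean two-component Gaussian mixture to
    residuals, returning (noise variance, motion variance, mixing
    weight of the moving component)."""
    r2 = residuals ** 2
    v_noise, v_motion, w = r2.mean() * 0.5, r2.mean() * 2.0, 0.5
    for _ in range(iters):
        # E-step: responsibility of the 'moving' component per pixel
        p_m = w * np.exp(-r2 / (2 * v_motion)) / np.sqrt(v_motion)
        p_s = (1 - w) * np.exp(-r2 / (2 * v_noise)) / np.sqrt(v_noise)
        g = p_m / (p_m + p_s)
        # M-step: responsibility-weighted variance and weight updates
        v_motion = (g * r2).sum() / g.sum()
        v_noise = ((1 - g) * r2).sum() / (1 - g).sum()
        w = g.mean()
    return v_noise, v_motion, w

rng = np.random.default_rng(1)
res = np.concatenate([rng.normal(0, 1.0, 9000),    # stationary: var 1
                      rng.normal(0, 5.0, 1000)])   # moving: var 25
vn, vm, w = em_two_variances(res)
print(vn, vm, w)  # close to 1, 25, 0.1
```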
Scene kinetics mitigation using factor analysis with derivative factors
D. K. Melgaard, A. J. Scholand, K. W. Larson
Line-of-sight jitter in staring sensor data, combined with scene information, can obscure critical information for change analysis or target detection. Consequently, the jitter effects must be significantly reduced before data analysis. Conventional principal component analysis (PCA) has been used to obtain basis vectors for background estimation; however, PCA requires image frames that contain the jitter variation to be modeled. Since jitter is usually chaotic and asymmetric, a data set containing all the variation without the changes to be detected is typically not available. An alternative approach, scene kinetics mitigation, first obtains an image of the scene. It then computes derivatives of that image in the horizontal and vertical directions. The basis set for estimating the background and the jitter consists of the image and its derivative factors. This approach has several advantages: 1) only a small number of images are required to develop the model; 2) the model can estimate backgrounds with jitter different from that of the input training images; 3) the method is particularly effective for sub-pixel jitter; and 4) the model can be developed from images acquired before the change detection process. In addition, the scores from projecting the factors onto the background provide estimates of the jitter magnitude and direction for registration of the images. In this paper we present a discussion of the theoretical basis for this technique, provide examples of its application, and discuss its limitations.
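The derivative-factor idea rests on the first-order expansion frame ≈ I + dx·Ix + dy·Iy, so least-squares scores on the derivative factors directly estimate sub-pixel jitter. The sketch below uses that expansion with a synthetic smooth scene; the paper's full factor model may differ in detail. Note the recovered coefficient carries the Taylor-expansion sign convention (a scene shifted by +s yields a score of −s).

```python
import numpy as np

def estimate_jitter(reference, frame):
    """Estimate sub-pixel jitter by projecting a frame onto the scene
    image's spatial derivatives: frame - I ≈ dx*Ix + dy*Iy, solved by
    least squares. The scores (dx, dy) give jitter magnitude/direction."""
    Iy, Ix = np.gradient(reference)
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = (frame - reference).ravel()
    (dx, dy), *_ = np.linalg.lstsq(A, b, rcond=None)
    return dx, dy

# Smooth synthetic scene and a copy with 0.3-pixel horizontal jitter
y, x = np.mgrid[0:64, 0:64]
scene = np.sin(x / 6.0) + np.cos(y / 9.0)
shift = 0.3
jittered = np.sin((x - shift) / 6.0) + np.cos(y / 9.0)
dx, dy = estimate_jitter(scene, jittered)
print(dx, dy)  # dx near -0.3 (Taylor sign convention), dy near 0
```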
Novel gray coded pattern for unwrapping phase in fringe projection based 3D profiling
A method is proposed to reliably extract object profiles even in the presence of height discontinuities (which lead to 2nπ phase jumps). The method uses Fourier transform profilometry to extract the wrapped phase, and an additional image, formed by illuminating the object of interest with a novel gray-coded pattern, for phase unwrapping. Simulation results suggest that the proposed approach not only retains the advantages of the original method, but also contributes significantly to enhancing its performance. The fundamental advantage of this method stems from the fact that both the extraction of the wrapped phase and its unwrapping are performed with gray-scale images. Hence, unlike methods that use color, the proposed method does not require a color CCD camera and is ideal for profiling objects with multiple colors.
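The unwrapping step itself reduces to adding 2πk to the wrapped phase, where the fringe order k is recovered from the coded pattern. The sketch below assumes the gray-code decoding has already produced an integer order map, and shows the standard reflected-binary decode separately; the paper's specific code layout is not reproduced here.

```python
import numpy as np

def unwrap_with_code(wrapped, fringe_order):
    """Unwrap phase using a per-pixel fringe order k recovered from a
    gray-coded illumination pattern: phi = wrapped + 2*pi*k."""
    return wrapped + 2 * np.pi * fringe_order

def gray_to_binary(g):
    """Decode a standard reflected-binary (Gray) code integer."""
    b = g
    while g:
        g >>= 1
        b ^= g
    return b

# Ground-truth phase ramp spanning several 2*pi periods
phi = np.linspace(0, 6 * np.pi, 100)
wrapped = np.angle(np.exp(1j * phi))            # wrapped into (-pi, pi]
k = np.round((phi - wrapped) / (2 * np.pi)).astype(int)
print(np.allclose(unwrap_with_code(wrapped, k), phi))  # → True
print(gray_to_binary(0b110))  # Gray 110 → 4
```

Because k comes from a separate coded image rather than from neighboring pixels, the unwrapping is immune to the error propagation that spatial unwrapping suffers at height discontinuities.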
A comparison between intensity and depth images for extracting features related to wear labels in carpets
S. A. Orjuela, E. Vansteenkiste, F. Rooms, et al.
Carpet manufacturers certify their products with labels corresponding to the capability of the carpets to retain their original appearance. Traditionally, these labels are subjectively defined by reference cases in which human experts evaluate the degree of wear, quantified by a number called the wear label. Industry is very interested in converting these traditional standards into automated objective standards. With this purpose, research has been conducted using image analysis with either depth or intensity data. In this paper, we present a comparison of texture features extracted from both types of images. For this, we scanned 3D data and photographed eight carpet types specified in the EN1471 standard. The features are extracted by comparing the distributions of Local Binary Patterns (LBPs) computed from images of the original and the changed appearance. We assess how well the features can be arranged in the order of the wear labels and count the number of consecutive wear labels that can be statistically distinguished. We found that two of the eight carpet types are properly described using depth data and five using intensity data, while one type could not be described. These results suggest that the two types of images can be used complementarily to represent the wear labels. This could lead to an automated and universal labeling system for carpets.
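The LBP-distribution comparison at the core of such features can be sketched with the basic 8-neighbour LBP and histogram intersection. The carpet literature typically uses refined (uniform, rotation-invariant, multi-scale) LBP variants and different distribution distances; the synthetic textures and the intersection measure below are illustrative assumptions.

```python
import numpy as np

def lbp_image(gray):
    """Basic 8-neighbour Local Binary Pattern codes (refined variants
    used in carpet-wear studies build on this basic form)."""
    h, w = gray.shape
    codes = np.zeros((h - 2, w - 2), dtype=int)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = gray[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(int) << bit
    return codes

def lbp_histogram(gray):
    """Normalized 256-bin distribution of LBP codes."""
    hist = np.bincount(lbp_image(gray).ravel(), minlength=256)
    return hist / hist.sum()

rng = np.random.default_rng(2)
original = rng.integers(0, 256, (64, 64)).astype(float)
worn = original + rng.normal(0, 40, original.shape)   # appearance change
# Histogram intersection: 1.0 for identical texture, lower when worn
same = np.minimum(lbp_histogram(original), lbp_histogram(original)).sum()
diff = np.minimum(lbp_histogram(original), lbp_histogram(worn)).sum()
print(round(float(same), 2), diff < same)  # → 1.0 True
```

The drop in the intersection score between the original and changed textures is the kind of quantity one would then order against the wear labels.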