Proceedings Volume 5817

Visual Information Processing XIV

Zia-ur Rahman, Robert A. Schowengerdt, Stephen E. Reichenbach
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 25 May 2005
Contents: 7 Sessions, 35 Papers, 0 Presentations
Conference: Defense and Security 2005
Volume Number: 5817

Table of Contents

  • Image Understanding, Restoration, and Enhancement I
  • Applications of Image Processing
  • Video
  • Remote Sensing and Registration
  • Image Understanding, Restoration and Enhancement II
  • Compression
  • Poster Session
Image Understanding, Restoration, and Enhancement I
Multiframe image restoration with the 2D4 algorithm
We describe a new algorithm for combining multiple low-resolution images to obtain a high-resolution object estimate. Each camera is treated as a communication channel and we exploit sub-pixel shifts to achieve significant resolution enhancement. The 2D4 algorithm is an iterative likelihood-based method that is computationally less expensive than the two-dimensional Viterbi algorithm. In this paper, we modify the 2D4 algorithm and apply it to the multiframe image restoration problem. We demonstrate the reconstruction of a high-resolution scene from multiple blurred, noisy, and shifted low-resolution image measurements. We discuss the modifications and approximations to the 2D4 algorithm that are required to reduce its complexity for this application. We present the performance of this algorithm and compare it with the performance of Iterative Back Projection and optimal linear methods.
Minimum reconstruction error in feature-specific imaging
We describe theoretical and experimental results for a new class of optimal features for feature-specific imaging (FSI). In this paper, we theoretically solve the reconstruction problem without noise, and find a more general solution than principal component analysis (PCA). We present a generalized framework to find FSI projection matrices. Using Stochastic Tunneling, we find an optimal solution in the presence of noise and under an energy conservation constraint. We also show that a non-negativity requirement does not significantly reduce system performance. Finally, we propose an experimental system for FSI using a polarization-based optical pipeline processor.
Cubic convolution for super-resolution from microscanned images
Jiazheng Shi, Stephen E. Reichenbach, James D. Howe
This paper presents a computationally efficient method for super-resolution reconstruction and restoration from microscanned images. Microscanning creates multiple low-resolution images with slightly varying sample-scene phase shifts. Microscanning can be implemented with a physical microscanner built into specialized imaging systems or by simply panning and/or tilting traditional imaging systems to acquire a temporal sequence of images. Digital processing can combine the low-resolution images to produce an image with higher pixel resolution (i.e., super-resolution) and higher fidelity. The cubic convolution method developed in this paper employs one-pass, small-kernel convolution to perform reconstruction (increasing resolution) and restoration (improving fidelity). The approach is based on an end-to-end, continuous-discrete-continuous model of the microscanning imaging process. The derivation yields a parametric form that can be optimized for the characteristics of the scene and the imaging system. Because cubic convolution is constrained to a small spatial kernel, the approach is efficient and is amenable to adaptive processing and to parallel implementation. Experimental results with simulated imaging and with real microscanned images indicate that the cubic convolution method efficiently and effectively increases resolution and fidelity for significantly improved image quality.
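As a rough illustration of the small-kernel idea, the sketch below implements the classic one-dimensional cubic convolution kernel (Keys' form) and uses it to interpolate unit-spaced samples. This is an assumption-laden sketch, not the parametric kernel derived in the paper: the function names `cubic_kernel` and `interpolate`, the default parameter `a = -0.5`, and the border clamping are all illustrative choices, and the microscanning model itself is not reproduced.

```python
import math


def cubic_kernel(x, a=-0.5):
    """Classic piecewise-cubic convolution kernel with free parameter a.

    a = -0.5 is a commonly used default (an assumption here); the paper
    optimizes its own parametric form for the scene and imaging system.
    """
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0  # the kernel has support only on (-2, 2)


def interpolate(samples, t, a=-0.5):
    """Estimate the signal at fractional position t from unit-spaced samples."""
    base = math.floor(t)
    total = 0.0
    # Only the four nearest samples contribute: this is the "small kernel".
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), len(samples) - 1)  # clamp indices at the borders
        total += samples[idx] * cubic_kernel(t - k, a)
    return total
```

Because each output value depends on only four neighbouring samples, the cost per output sample is constant, which is the property that makes this family of methods attractive for adaptive and parallel processing.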
A boosting algorithm for texture classification and object detection
Vidya Manian, Miguel Velez-Reyes
Boosting techniques are useful for improving performance of classification methods. In this paper, we present an algorithm for performing texture classification using adaptive boosted learning. The classifier integrates several weak classifiers utilizing statistical multiresolution wavelet features. The boosting method uses a number of positive and negative examples for learning. The features are computed from training images. The classification errors are much lower than those of traditional parametric and non-parametric classifiers. This method is demonstrated with texture classification results. An application to object detection in multispectral images is presented. A good detection rate for objects using simple texture features from selective bands is obtained. Results show that texture features in several spectral bands can be effectively combined in a feature context to build adaptive classifiers for classification and object detection.
Analysis tools for computational imaging systems
G. E. Johnson, P. E. X. Silveira, E. Dowski
The analysis tools of traditional optical systems, such as modulation transfer functions, point spread functions, and resolution test charts, are often not sufficient when analyzing computational imaging systems. Computational imaging systems benefit from the combined use of optics and electronics for accomplishing a given imaging or system task. In traditional optical systems, the goal is essentially to form images that precisely depict a given object. Electronics are not required to form clear images, but could be required to analyze the images. In computational imaging systems, specialized images are formed by generalized aspheric optical elements that are jointly optimized with the electronic processing. The specialized images formed at a detector are not necessarily clear images. Electronic processing is used to remove the image blur or otherwise form a final image. Computational imaging systems offer the advantage of increased performance and decreased size, weight, and cost over traditional optical systems. The Ambiguity Function (AF), traditionally used for the design of radar waveforms, plays an important role in computational imaging systems. The AF provides a concise analysis of the optical transfer functions of imaging systems over defocus. The Wigner Distribution (WD), traditionally used for the design of time-varying systems, is related to the AF and provides a concise analysis of the point spread functions (PSF) of imaging systems over defocus. We will describe the relationships and utility of these functions to computational imaging systems.
Applications of Image Processing
Design methodology for optimal hardware implementation of wavelet transform domain algorithms
The work presented in this paper lays the foundation for the development of an end-to-end system design methodology for implementing wavelet domain image/video processing algorithms in hardware using Xilinx field programmable gate arrays (FPGAs). With the integration of the Xilinx System Generator toolbox, this methodology will allow algorithm developers to design and implement their code using the familiar MATLAB/Simulink development environment. By using this methodology, algorithm developers will not be required to become proficient in the intricacies of hardware design, thus reducing the design cycle and time-to-market.
Detecting changes in terrain using unmanned aerial vehicles
Zia-ur Rahman, Glenn D. Hines, Michael J. Logan
In recent years, small unmanned aerial vehicles (UAVs) have been used for more than the thrill they bring to model airplane enthusiasts. Their flexibility and low cost have made them a viable option for low-altitude reconnaissance. In a recent effort, we acquired video data from a small UAV during several passes over the same flight path. The objective of the exercise was to determine if objects had been added to the terrain along the flight path between flight passes. Several issues accrue to this simple-sounding problem: (1) lighting variations may cause false detection of objects because of changes in shadow orientation and strength between passes; (2) variations in the flight path due to wind speed and heading changes may cause misalignment of gross features, making the task of detecting changes between the frames very difficult; and (3) changes in the aircraft orientation and altitude lead to a change in size of the features from frame to frame, making a comparison difficult. In this paper, we discuss our efforts to perform this change detection, and the lessons that we learned from this exercise.
Visual enhancement of micro CT bone density images
John S. DaPonte, Michael Clark, Megan Damon, et al.
The primary goal of this research was to provide image processing support to aid in the identification of those subjects most affected by bone loss when exposed to weightlessness and provide insight into the causes for large variability. Past research has demonstrated that genetically distinct strains of mice exhibit different degrees of bone loss when subjected to simulated weightlessness. Bone loss is quantified by in vivo computed tomography (CT) imaging. The first step in evaluating bone density is to segment gray scale images into separate regions of bone and background. Two of the most common methods for implementing image segmentation are thresholding and edge detection. Thresholding is generally considered the simplest segmentation process, which can be performed by having a user visually select a threshold using a sliding scale. This is a highly subjective process with great potential for variation from one observer to another. One way to reduce inter-observer variability is to have several users independently set the threshold and average their results, but this is a very time-consuming process. A better approach is to apply an objective adaptive technique such as the Ridler-Calvard method. In our study we have concluded that thresholding was better than edge detection and that pre-processing these images with an iterative deconvolution algorithm prior to adaptive thresholding yields superior visualization when compared with images that have not been pre-processed or images that have been pre-processed with a filter.
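The objective adaptive thresholding named above (usually attributed to Ridler and Calvard, and also known as iterative selection) can be sketched in a few lines: start from the global mean, then repeatedly set the threshold to the midpoint of the mean background and mean foreground values until it stops moving. The function name, the tolerance, and the iteration cap below are illustrative assumptions, and the deconvolution pre-processing described in the abstract is not included.

```python
def iterative_threshold(pixels, tol=0.5, max_iter=100):
    """Iterative-selection threshold for a flat list of gray levels.

    tol and max_iter are illustrative defaults, not values from the paper.
    """
    t = sum(pixels) / len(pixels)  # initial guess: the global mean
    for _ in range(max_iter):
        low = [p for p in pixels if p <= t]   # provisional background
        high = [p for p in pixels if p > t]   # provisional foreground (bone)
        if not low or not high:
            break  # one class is empty, so the threshold cannot be refined
        new_t = 0.5 * (sum(low) / len(low) + sum(high) / len(high))
        if abs(new_t - t) < tol:
            return new_t
        t = new_t
    return t
```

For example, `iterative_threshold([10, 12, 11, 13, 200, 210, 205, 195])` settles between the two clusters, with no user-selected slider value involved.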
Practical steganographic capacity and the best cover image
The goal of this article is to investigate an alternative capacity for steganographic systems. We define steganographic capacity as the maximum number of bits that can be embedded within a digital signal while satisfying imperceptibility requirements. This capacity helps address two fundamental steganographic problems: first, how to choose the best cover image among classes of images, and second, which embedding method may be employed to reduce the detection of hidden information within the embedded areas. In addition, the new capacity may be used for 1) the separation of an analyzed image into embeddable areas, 2) the identification of maximum embedding capacities within a cover digital image, and 3) estimating the length in bits used for embedding information within the identified regions.
Video
Application of computer vision to automatic prescription verification in pharmaceutical mail order
In large volume pharmaceutical mail order, before shipping out prescriptions, licensed pharmacists ensure that the drug in the bottle matches the information provided in the patient prescription. Typically, the pharmacist has about 2 sec to complete the verification of one prescription. Performing about 1800 prescription verifications per hour is tedious and can generate human errors as a result of visual and brain fatigue. Available automatic drug verification systems are limited to a single pill at a time. This is not suitable for large volume pharmaceutical mail order, where a prescription can have as many as 60 pills and where thousands of prescriptions are filled every day. In an attempt to reduce human fatigue and cost and to limit human error, the automatic prescription verification system (APVS) was invented to meet the need of large scale pharmaceutical mail order. This paper deals with the design and implementation of the first prototype online automatic prescription verification machine to perform the same task currently done by a pharmacist. The emphasis here is on the visual aspects of the machine. The system has been successfully tested on 43,000 prescriptions.
Factors affecting development of a motion imagery quality metric
The motion imagery community would benefit from the availability of standard measures for assessing image interpretability. The National Imagery Interpretability Rating Scale (NIIRS) has served as a community standard for still imagery, but no comparable scale exists for motion imagery. Several considerations unique to motion imagery indicate that the standard methodology employed in the past for NIIRS development may not be applicable or, at a minimum, may require modifications. Traditional methods for NIIRS development rely on a close linkage between perceived image quality, as captured by specific image interpretation tasks, and the sensor parameters associated with image acquisition. The dynamic nature of motion imagery suggests that this type of linkage may not exist or may be modulated by other factors. An initial study was conducted to understand the effects target motion, camera motion, and scene complexity have on perceived image interpretability for motion imagery. This paper summarizes the findings from this evaluation. In addition, several issues emerged that require further investigation:
  • The effect of frame rate on the perceived interpretability of motion imagery
  • Interactions between color and target motion that could affect perceived interpretability
  • The relationships among resolution, viewing geometry, and image interpretability
  • The ability of an analyst to satisfy specific image exploitation tasks relative to different types of motion imagery clips
Plans are being developed to address each of these issues through direct evaluations. This paper discusses each of these concerns, presents the plans for evaluations, and explores the implications for development of a motion imagery quality metric.
Super-resolution video enhancement based on a constrained set of motion vectors
Modern video surveillance and target tracking applications utilize multiple cameras transmitting low-bit-rate video through channels of very limited bandwidth. The highly compressed video exhibits coding artifacts that can cause target detection and tracking procedures to fail. Thus, to lower the level of noise and retain the sharpness of the video frames, super-resolution techniques can be employed for video enhancement. In this paper, we propose an efficient super-resolution video enhancement scheme that is based on a constrained set of motion vectors. The proposed scheme computes the motion vectors using the original (uncompressed) video frames, and transmits only a small set of these vectors to the receiver. At the receiver, each pixel is assigned a motion vector from the constrained set to maximize the motion prediction performance. The size of the transmitted vector set is constrained to be less than 3% of the total coded bit stream. In the video enhancement process, an L2-norm minimization super-resolution procedure is applied. The proposed scheme is applied to enhance highly compressed, real-world video sequences. The results obtained show significant improvement in the visual quality of the video sequences, as well as in the performance of subsequent target detection and tracking procedures.
Remote Sensing and Registration
Performance of optimal registration estimators
Tuan Quang Pham, Marijn Bezuijen, Lucas J. van Vliet, et al.
This paper derives a theoretical limit for image registration and presents an iterative estimator that achieves the limit. The variance of any parametric registration is bounded by the Cramer-Rao bound (CRB). This bound is signal-dependent and is proportional to the variance of input noise. Since most available registration techniques are biased, they are not optimal. The bias, however, can be reduced to practically zero by an iterative gradient-based estimator. In the proximity of a solution, this estimator converges to the CRB with a quadratic rate. Images can be brought close to each other, thus speeding up the registration process, by a coarse-to-fine multi-scale registration. The performance of iterative registration is finally shown to significantly increase image resolution from multiple low resolution images under translational motions.
Mitigation of image impairments for multichannel remote sensing data fusion
Andriy Kurekin, Alexander N. Dolia, David Marshall, et al.
Whilst for the majority of applications image quality depends on sensor accuracy and principles of image formation, in remote sensing systems information is also degraded by communication errors. To improve image fusion results in the presence of communication and sensor impairments we propose a two-stage approach. Preliminary nonlinear locally-adaptive image processing is applied at the first stage for mitigating impairments produced in image sensors and communication systems, and fusion algorithms are used at the second stage. The efficiency of the proposed algorithms is demonstrated for satellite remote sensing images and simulated data with similar characteristics and distortions. The influence of image distortions and the effectiveness of mitigation are estimated for an image fusion architecture for low-level image classification based on artificial neural networks. Experimental results are presented providing quantitative assessment of the proposed algorithms.
Interpolation of remote sensing imagery
Interpolation of remote sensing imagery is a ubiquitous task, required for myriad purposes such as registration of multiple frames, correction of geometric distortions, and mitigation of platform vibration distortions in imagery. Interpolation is also a classically systemic task, in that interpolator performance in pixel placement, anti-aliasing, and blur, affects the design of other system components, notably reconstruction filters. Interpolator design in a system context is the problem which first motivated development of the latent and apparent image quality metrics previously presented at Visual Information Processing XI and XIII. This paper presents a suite of common interpolator design philosophies with length-4 examples of the designs analyzed in terms of signal processing and image quality metrics. Conclusions are drawn both with respect to the designs and with respect to the metrics.
Information in the joint aggregate pixel distribution of two images
Wit T. Wisniewski, Robert A. Schowengerdt
Pixel value distributions of most real images have structure that cannot be modeled by simple and commonly used probability distributions, such as Gaussian or log-normal distributions. Estimation of pixel value distribution in the joint measurement space (JMS) of two real images reveals the joint density structure and allows its interpretation by means of statistical dependence measures. A dependence measure is a general way to express similarity or divergence between images. Candidate dependence measures include adaptations of information measures such as Shannon Entropy and Fisher Information. The dependence measure built from Fisher Information is tested and demonstrated by experiments in Independent Components Analysis (ICA) and co-registration of synthetic and real Landsat TM images, including successful co-registration of images from different spectral bands with zero linear correlation.
Image Understanding, Restoration and Enhancement II
Optimizing OCR accuracy for bi-tonal, noisy scans of degraded Arabic documents
Paul Herceg, Benjamin Huyck, Christopher Johnson, et al.
Acquiring foreign language from degraded hardcopy documents is of interest to military and border control applications. Bi-tonal image scans are desirable because file size is small. However, the nature of hardcopy degradations and the scanner or image enhancement software capabilities used directly affect the quality of the captured image and the extent of language acquisition. We applied a collection of manual treatments to hardcopy Arabic documents to develop a corpus of bi-tonal images. We then used this corpus in an exploratory study to derive conclusions about how bi-tonal images could be enhanced. This paper discusses the manually degraded Arabic document corpus, the image enhancement study, and the significant optical character recognition (OCR) improvements obtained with simple scanner driver adjustments.
Filtering of impulse noise in digital signals using logical transform
Median filters excel at removing impulse noise from digital signals, with high accuracy and quick running-times. However, they have limitations if the resulting signals are to be used for feature recognition purposes, as they often remove crucial details and add unwanted noise. In this paper, an algorithm for digital filtering using the logical transform is proposed. This method is able to achieve mean-squared-error results similar to median type filters while maintaining image details. Variations of the algorithm allow for greater noise reduction, but at the cost of increased computation.
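For context, the baseline that the abstract compares against can be sketched as a one-dimensional sliding-window median filter. This is only the conventional median filter, not the proposed logical-transform algorithm, whose details are not given here; the function name, the default window width, and the border handling (edge replication) are assumptions made for illustration.

```python
def median_filter(signal, width=3):
    """Sliding-window median filter for a 1-D signal (baseline only).

    width must be odd; borders are handled by replicating the edge samples,
    which is an illustrative choice.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    n = len(signal)
    out = []
    for i in range(n):
        # Gather the neighbourhood, clamping indices at the signal borders.
        window = sorted(
            signal[min(max(j, 0), n - 1)] for j in range(i - half, i + half + 1)
        )
        out.append(window[half])  # middle element of the sorted window
    return out
```

A single impulse such as the 9 in `[1, 1, 9, 1, 1]` is removed completely, but a genuine one-sample feature would be removed in exactly the same way, which is the loss of detail the abstract points to.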
Noise, edge extraction, and visibility of features
Zia-ur Rahman, Daniel J. Jobson
Noise, whether due to the image-gathering device or some other reason, reduces the visibility of fine features in an image. Several techniques attempt to mitigate the impact of noise by performing a low-pass filtering operation on the acquired data. This is based on the assumption that the uncorrelated noise has high-frequency content and thus will be suppressed by low-pass filtering. A result of this operation is that edges in a noisy image also tend to get blurred, and, in some cases, may get completely lost due to the low-pass filtering. In this paper, we quantitatively assess the impact of noise on fine feature visibility by using computer-generated targets of known spatial detail. Additionally, we develop a new scheme for noise-reduction based on the connectivity of edge-features. The overall impact of this scheme is to reduce overall noise, yet retain the high frequency content that makes edge-features sharp.
Scene understanding based on network-symbolic models
New generations of smart weapons and unmanned vehicles must have reliable perceptual systems that are similar to human vision. Instead of precise computations of 3-dimensional models, a network-symbolic system converts image information into an “understandable” Network-Symbolic format, which is similar to relational knowledge models. The logic of visual scenes can be captured in the Network-Symbolic models and used for the disambiguation of visual information. It is hard to use geometric operations for processing of natural images. Instead, the brain builds a relational network-symbolic structure of the visual scene, using different clues to set up the relational order of surfaces and objects. Feature, symbol, and predicate are equivalent in the biologically inspired Network-Symbolic systems. A linking mechanism binds these features/symbols into coherent structures, and the image is converted from a “raster” into a “vector” representation that can be better interpreted by higher-level knowledge structures. View-based object recognition is a hard problem for traditional algorithms that directly match a primary view of an object to a model. In Network-Symbolic Models, the derived structure, not the primary view, is the subject of recognition. Such recognition is not affected by local changes and appearances of the object as seen from a set of similar views.
New methods of image enhancement
In this paper, applications of the tensor and paired representations of an image are presented for image enhancement. The proposed methods are based on the fact that the 2-D image can be represented by a set of 1-D "independent" signals that split the 2-D discrete Fourier transform (DFT) of the image into different groups of frequencies. Each splitting-signal carries information of the spectrum in a specific group. Rather than enhance the image by traditional methods of the Fourier transform (or other transforms), splitting-signals can be processed separately and the 2-D DFT of the processed image can be defined by 1-D DFTs of new splitting-signals. The processing of splitting-signals related to the paired representation is very effective, because there is no redundancy in the spectral information carried by different splitting-signals. The effectiveness of such an approach is illustrated through processing the image by the α-rooting method of enhancement. Images can be enhanced by processing only a few splitting-signals, to achieve enhancement that in many cases exceeds the enhancement by the α-rooting method and other known methods. The selection of such splitting-signals is described.
Logarithmic transform coefficient histogram matching with spatial equalization
In this paper we propose an image enhancement algorithm that is based on utilizing histogram data gathered from transform domain coefficients and that improves on the limitations of the histogram equalization method. Traditionally, classical histogram equalization has had some problems due to its inherent dynamic range expansion. Many images with data tightly clustered around certain intensity values can be over-enhanced by standard histogram equalization, leading to artifacts and overall tonal change of the image. In the transform domain, one has control over subtle image properties such as low and high frequency content with their respective magnitudes and phases. However, due to the nature of many of these transforms, the coefficient histograms may be so tightly packed that distinguishing them from one another may be impossible. By placing the transform coefficients in the logarithmic transform domain, it is easy to see the difference between different quality levels of images based upon their logarithmic transform coefficient histograms. Our results demonstrate that combining the spatial method of histogram equalization with logarithmic transform domain coefficient histograms achieves a much more balanced enhancement that outperforms classical histogram equalization.
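The dynamic range expansion attributed to classical histogram equalization can be seen in a small sketch of the standard spatial-domain method. This shows only the classical baseline, not the logarithmic transform-domain algorithm proposed in the paper; the function name `equalize` and the flat list of integer gray levels are assumptions made to keep the example self-contained.

```python
def equalize(pixels, levels=256):
    """Classical histogram equalization of integer gray levels in [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative histogram: cdf[v] is the number of pixels with value <= v.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # count at the darkest used level
    if n == cdf_min:
        return list(pixels)  # constant image: nothing to stretch

    # Map each level so the cumulative histogram becomes approximately linear.
    return [round((cdf[p] - cdf_min) * (levels - 1) / (n - cdf_min)) for p in pixels]
```

With tightly clustered input such as `[100, 101, 102, 103]`, the four adjacent levels are spread across the whole 0 to 255 range, which illustrates how small intensity differences can be exaggerated.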
Compression
Sensor-layer image compression based on the quantized cosine transform
We introduce a novel approach for compressive coding at the sensor layer for an integrated imaging system. Compression at the physical layer reduces the measurements-to-pixels ratio and the data volume for storage and transmission, without confounding image estimation or analysis. We introduce a particular compressive coding scheme based on the quantized cosine transform (QCT) and the corresponding image reconstruction scheme. The QCT is restricted to the ternary set {-1,0,1} for economic implementation with a focal plane optical pixel mask. Combined with the reconstruction scheme, the QCT-based coding is shown to compare favorably with existing coding schemes from the coded aperture literature, in terms of both reconstruction quality and photon efficiency.
Data modeling augmentation of JPEG for real-time streaming video
Holger M. Jaenisch, James W. Handley
This paper explores sub-sampling in conjunction with JPEG compression algorithms. Rather than directly compressing large high-resolution images, we propose decimation to thumbnails followed by compression. This enables Redundant Array of Independent Disks (RAID) compression and facilitates real-time streaming video with small bandwidth requirements. Image reconstruction occurs on demand at the receiver to any resolution required using Data Modeling based fractal interpolation. The receive side first uncompresses JPEG and then fractal interpolates to any required resolution. This device independent resolution capability is useful for real-time sharing of image data across virtual networks where each node has a different innate resolution capability. The same image is constructed to whatever limitations exist at each individual node, keeping image data device independent and image resolution scalable up or down as hardware/bandwidth limitations and options evolve.
Progressive resolution coding of hyperspectral imagery featuring region of interest access
We propose resolution progressive Three-Dimensional Set Partitioned Embedded bloCK (3D-SPECK), an embedded wavelet based algorithm for hyperspectral image compression. The proposed algorithm also supports random Region-Of-Interest (ROI) access. For a hyperspectral image sequence, integer wavelet transform is applied on all three dimensions. The transformed image sequence exhibits a hierarchical pyramidal structure. Each subband is treated as a code block. The algorithm encodes each code block separately to generate embedded sub-bitstream. The sub-bitstream for each subband is SNR progressive, and for the whole sequence, the overall bitstream is resolution progressive. Rate is allocated amongst the sub-bitstreams produced for each block. We always have the full number of bits possible devoted to that given scale, and only partial decoding is needed for the lower than full scales. The overall bitstream can serve the lossy-to-lossless hyperspectral image compression. Applying resolution scalable 3D-SPECK independently on each 3D tree can generate embedded bitstream to support random ROI access. Given the ROI, the algorithm can identify ROI and reconstruct only the ROI. The identification of ROI is done at the decoder side. Therefore, we only need to encode one embedded bitstream at the encoder side, and different users at the decoder side or the transmission end could decide their own different regions of interest and access or decode them. The structure of hyperspectral images reveals spectral responses that would seem ideal candidates for compression by 3D-SPECK. Results show that the proposed algorithm has excellent performance on hyperspectral image compression.
Poster Session
Edge comparison in a digital image
Timothy P. Donovan, Russell E. Zuck
The purpose of this paper is to study the relationship between a bit map digital image and a given object, called the search object. In particular, the goal is to signal whether it is likely that the search object appears, at least partially, in the image. The edges are detected using known techniques. The edges are then converted to sequences of pixels. Edges in the search object and in the digital image are then represented as objects, in the object oriented programming sense. Each edge or segment of an edge is represented as a normalized Bezier cubic parameterized curve. The conversion from a sequence of pixels to a Bezier polynomial representation is accomplished using least squares approximation techniques. The normalization process is intended to remove the effect of size in the edge or edge segment. The original Bezier representation is also maintained for each edge, as it provides necessary location information. In the event that two edges in the search object are matched with edges in the image, their relative orientation is checked using elementary vector analysis. If the edges match and their orientation is the same, then the system signals that the object is likely to appear in the image and the coordinates in the image of the object are returned. The functioning of the algorithm is not dependent on scaling, rotation, translation, or shading of the image.
Qualitative feature extraction from sensor data using short-time Fourier transform
Abolfazl Mahiari Amini
The information gathered from sensors is used to determine the health of a sensor. Once a normal mode of operation is established, any deviation from the normal behavior indicates a change. This change may be due to a malfunction of the sensor(s) or the system (or process). The step-up and step-down features, as well as sensor disturbances, are assumed to be exponential. An RC network is used to model the main process, which is defined by a step-up (charging), drift, and step-down (discharging). The sensor disturbances and spikes are added while the system is in drift. The system runs for a period of at least three time constants of the main process every time a process feature occurs (e.g. a step change). The Short-Time Fourier Transform of the signal is taken using the Hamming window. Three window widths are used. The DC value is removed from the windowed data prior to taking the FFT. The resulting three-dimensional spectral plots provide good time-frequency resolution. The results indicate distinct shapes corresponding to each process.
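The processing chain described above can be sketched in plain Python. The sketch makes several assumptions not stated in the abstract: the segment mean is subtracted before the Hamming taper is applied, a direct DFT stands in for the FFT, and the sample count, time constant, hop size, and the three window widths (16, 32, 64) are arbitrary illustrative values.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency bins (direct O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n)))
        for m in range(n // 2 + 1)
    ]


def stft_magnitudes(signal, width, hop):
    """Short-time spectra: remove DC from each segment, taper, transform."""
    window = hamming(width)
    frames = []
    for start in range(0, len(signal) - width + 1, hop):
        segment = signal[start:start + width]
        mean = sum(segment) / width  # DC value of this segment
        frames.append(
            dft_magnitudes([(s - mean) * w for s, w in zip(segment, window)])
        )
    return frames


def rc_step_up(n, tau):
    """Exponential charging curve of an RC network, sampled n times."""
    return [1.0 - math.exp(-k / tau) for k in range(n)]


# Simulate three time constants of a step-up and analyze at three widths.
signal = rc_step_up(96, tau=32.0)
spectra = {w: stft_magnitudes(signal, w, w // 2) for w in (16, 32, 64)}
```

Each entry of `spectra` is a list of frames over time, so plotting magnitude against frame index and frequency bin gives the kind of three-dimensional spectral surface the abstract refers to. Shorter windows favor time resolution and longer windows favor frequency resolution.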
Enhanced water level model in image classification
The water-level model is an effective method in density-based classification. We use biased sampling, local similarity, and popularity as preprocessing, and employ a merging operation in the water-level model for classification. Biased sampling captures information about the global structure, while similarity and local density are mainly used to understand the local structure. In biased sampling, images are divided into many l x l patches and a sample pixel is selected from each patch. Similarity at a point p, denoted by sim(p), measures the change of gray level between point p and its neighborhood N(p). Besides using biased sampling to combine spectral and spatial information, we use similarity and local popularity in selecting sample points. A sample point is chosen based on the minimum value of sim(p) + [1-P(p)] after normalization. The selected pixel is a better representative, especially near the border of an object. To make the model more effective, one has to deal with small spikes and bumps. To get rid of the small spikes, we establish a threshold |[f(P1)-f(P2)]*(P1-P2)| > c*l*l, where c is a constant, P1 is a local maximum point to be tested, and P2 is the local minimum nearest to P1. The condition depends only on the patch size l*l. The merging operation included in the model makes the process less sensitive to the threshold constant. DBSCAN is combined with the enhanced water-level model to reduce noise and to obtain connected components. Preliminary experiments have been conducted using the proposed methods, and the results are promising.
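The spike test can be illustrated on a one-dimensional density f. The sketch assumes that f is sampled on integer positions, that a peak satisfying the inequality is kept while one failing it is treated as a small spike, and that only strict interior extrema are considered. None of these details is given in the abstract, and the function names are hypothetical.

```python
def local_extrema(f):
    """Indices of strict interior local maxima and minima of a 1D sequence."""
    maxima, minima = [], []
    for i in range(1, len(f) - 1):
        if f[i] > f[i - 1] and f[i] > f[i + 1]:
            maxima.append(i)
        elif f[i] < f[i - 1] and f[i] < f[i + 1]:
            minima.append(i)
    return maxima, minima


def significant_peaks(f, c, l):
    """Keep maxima P1 with |[f(P1) - f(P2)] * (P1 - P2)| > c * l * l."""
    maxima, minima = local_extrema(f)
    kept = []
    for p1 in maxima:
        if not minima:
            kept.append(p1)  # nothing to compare against, keep the peak
            continue
        p2 = min(minima, key=lambda m: abs(m - p1))  # nearest local minimum
        if abs((f[p1] - f[p2]) * (p1 - p2)) > c * l * l:
            kept.append(p1)
    return kept


density = [0, 5, 40, 6, 5, 6.5, 5.5, 3, 50, 4]
peaks = significant_peaks(density, c=0.5, l=4)  # the bump at index 5 is dropped
```

Here the bump at index 5 rises only 1.5 above its nearest minimum one step away, so its product of 1.5 falls below the threshold c*l*l = 8 and it is discarded.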
Multi-focus image fusion using ratio of blurred and original image intensities
Qiguang Miao, Baoshu Wang
This paper presents a new multi-focus image fusion algorithm based on the Ratio of Blurred and Original Image Intensities. Sharpness is defined as the sum of squares of the gray-level gradient vector magnitude. By analyzing the imaging model of a geometric optical system and the effect of the point spread function (PSF), a simulated second imaging model of an optical system is proposed. After the second imaging, a clear object in an image becomes blurred and a blurry object becomes more blurred. The clear object is determined by comparing, for each pixel, the difference in sharpness between the two differently focused images and their second-imaging counterparts. In this way, the clear objects of each original image are determined automatically, and all of them are then merged into a new clear image. Experiments show that the proposed algorithm preserves edge and texture information better than the other multi-focus image fusion methods considered.
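One reading of this decision rule is sketched below, with strong simplifications: a 3x3 box filter stands in for the simulated PSF of the second imaging, sharpness is the per-pixel squared forward-difference gradient rather than a neighborhood sum, and each output pixel is copied from whichever source loses more sharpness when re-blurred. These choices are assumptions for illustration, not the authors' model.

```python
def box_blur(img):
    """3x3 mean filter with replicated borders (assumed stand-in for the PSF)."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    total += img[rr][cc]
            row.append(total / 9.0)
        out.append(row)
    return out


def sharpness(img):
    """Per-pixel squared gradient magnitude from forward differences."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[r][min(c + 1, w - 1)] - img[r][c]) ** 2
            + (img[min(r + 1, h - 1)][c] - img[r][c]) ** 2
            for c in range(w)
        ]
        for r in range(h)
    ]


def fuse(a, b):
    """Pick each pixel from the image whose sharpness drops more on re-blur."""
    sa, sb = sharpness(a), sharpness(b)
    sa2, sb2 = sharpness(box_blur(a)), sharpness(box_blur(b))
    h, w = len(a), len(a[0])
    return [
        [
            a[r][c] if sa[r][c] - sa2[r][c] >= sb[r][c] - sb2[r][c] else b[r][c]
            for c in range(w)
        ]
        for r in range(h)
    ]
```

An in-focus region has strong gradients that a second blur largely removes, whereas an already defocused region changes little, so the larger drop marks the clearer source at that pixel.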
Moving object detection by a novel spatio-temporal segmentation
Haitao Jia, Xie Mei
This paper presents a new method for moving object detection that unites temporal and spatial segmentation. It also proposes an automatic video object segmentation using the active contour model. The temporal segmentation adopts an improved difference frame and the phase correlation method, and the spatial segmentation uses the active contour model. The algorithm proposes a new viewpoint: only the external boundaries are stable features, and these must be extracted from the difference frame. This makes the algorithm highly robust. At the same time, the temporal segmentation gives the active contour model an initialization of the edge points. The algorithm then refines the edge points iteratively through the energy equation. In simulation, the algorithm achieves very good results.
Applications of pointed ultra-resolution method in colour imaging
Evgeni Nikolaevich Terentiev, Nikolay E. Terentiev
The laws of distortion (including the influence of turbulent fluctuations of the refractive index) are presented as a scalar product I(y)=(O, Ix), where Ix is the undistorted image, localized by the PSF O, and I(y) is the value of the distorted image Iy at the point y. The small-dimension variational problem for the resolving function R is well defined. The pointed algorithm of compensation for RGB PSF distortions is as follows. The scalar products I(z)=(R, Iy) are calculated in an analogous way for all the points x=z, Ix ~ Iz, where Iz is an image with the distortions compensated in the RGB planes. The pointed ultra-resolution method can be used successfully for all modern multi-sensor receiving systems. Examples of applying the pointed ultra-resolution method to colour Martian images and images from the Hubble telescope are considered.
Multilevel fusion using enhanced feature detection
Trond Ostrem, Samuel Peter Kozaitis, Ildiko Laszlo
We combined images from different sensors based on the magnitude of multiscale products of the wavelet transform. Using the product of nondownsampled wavelet coefficients across scales, we formed a fusion rule between two images. In this way, the products represented features that were related to contrast. Our approach therefore allowed us to enhance the contrast of two images using pixel-level fusion. In experiments using subjective tests, our approach compared favorably with using only a single level and with other methods.
Sharpening advanced land imager multispectral data using a sensor model
The Advanced Land Imager (ALI) instrument on NASA's Earth Observing One (EO-1) satellite provides nine spectral bands at 30 m ground sample distance (GSD) and a panchromatic band at 10 m GSD. This report describes an image sharpening technique in which the higher spatial resolution information of the panchromatic band is used to increase the spatial resolution of ALI multispectral (MS) data. To preserve the spectral characteristics, this technique combines reported deconvolution deblurring methods for the MS data with highpass filter-based fusion methods for the Pan data. The deblurring process uses the point spread function (PSF) model of the ALI sensor. Information includes calculation of the PSF from pre-launch calibration data. Performance was evaluated using simulated ALI MS data generated by degrading the spatial resolution of high-resolution IKONOS satellite MS data. A quantitative measure of performance was the error between the sharpened MS data and the high-resolution reference. This report also compares performance with that of a reported method that includes PSF information. Preliminary results indicate improved sharpening with the method reported here.
Video
Pose estimation and frontal face detection for face recognition
This paper proposes a pose estimation and frontal face detection algorithm for face recognition. Considering its application in a real-world environment, the algorithm has to be robust yet computationally efficient. The main contribution of this paper is efficient face localization and scale and pose estimation using color models. Simulation results showed a very low computational load compared with other face detection algorithms. The second contribution is the introduction of a low-dimensional statistical face geometrical model. Compared with other statistical face models, the proposed method models the face geometry efficiently. The algorithm is demonstrated on a real-time system. The simulation results indicate that the proposed algorithm is computationally efficient.
Applications of Image Processing
Intelligent person identification system using stereo camera-based height and stride estimation
Jung-Hwan Ko, Jae-Hun Jang, Eun-Soo Kim
In this paper, a stereo camera-based intelligent person identification system is suggested. In the proposed method, the face area of the moving target person is extracted from the left image of the input stereo image pair by using a threshold value of the YCbCr color model. By carrying out correlation between the face area segmented in this way and the right input image, the location coordinates of the target face can be acquired, and these values are then used to control the pan/tilt system through the modified PID-based recursive controller. Also, by using the geometric parameters between the target face and the stereo camera system, the vertical distance between the target and the stereo camera system can be calculated through a triangulation method. Using this calculated vertical distance and the pan and tilt angles, the target's real position data in world space can be acquired, and from them its height and stride values can finally be extracted. Experiments with video images of 16 moving persons show that a person could be identified with these extracted height and stride parameters.
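The abstract does not give the triangulation geometry. The sketch below uses the textbook parallel-axis stereo relation (distance = focal length x baseline / disparity) and a simple spherical conversion from pan and tilt angles; the camera model, axis conventions, and all parameter names are assumptions for illustration rather than the authors' formulation.

```python
import math


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance along the optical axis for a parallel-axis stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def world_position(distance_m, pan_rad, tilt_rad):
    """Target position (x, y, z) with the camera at the origin.

    Assumed convention: pan rotates about the vertical (y) axis,
    tilt raises the optical axis above the horizontal plane.
    """
    horizontal = distance_m * math.cos(tilt_rad)
    return (
        horizontal * math.sin(pan_rad),   # x: sideways offset
        distance_m * math.sin(tilt_rad),  # y: height relative to the camera
        horizontal * math.cos(pan_rad),   # z: forward distance
    )


def stride_length(pos_a, pos_b):
    """Ground-plane distance between two successive target positions."""
    return math.hypot(pos_b[0] - pos_a[0], pos_b[2] - pos_a[2])


distance = depth_from_disparity(focal_px=800.0, baseline_m=0.1, disparity_px=20.0)
position = world_position(distance, pan_rad=0.0, tilt_rad=0.0)
```

Under these assumptions, tracking the face position over successive frames gives the height from the y coordinate and an estimate of stride from the ground-plane displacement between positions.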