Proceedings Volume 6246

Visual Information Processing XV

View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 12 May 2006
Contents: 6 Sessions, 30 Papers, 0 Presentations
Conference: Defense and Security Symposium 2006
Volume Number: 6246

Table of Contents

  • Image Understanding, Restoration, and Reconstruction I
  • Video, Coding, and Compression
  • Image Understanding, Restoration, and Reconstruction II
  • Applications of Image Processing I
  • Applications of Image Processing II
  • Poster Session
Image Understanding, Restoration, and Reconstruction I
A comparison of visual statistics for the image enhancement of FORESITE aerial images with those of major image classes
Daniel J. Jobson, Zia-ur Rahman, Glenn A. Woodell, et al.
Aerial images from the Follow-On Radar, Enhanced and Synthetic Vision Systems Integration Technology Evaluation (FORESITE) flight tests with the NASA Langley Research Center's research Boeing 757 were acquired during severe haze and haze/mixed clouds visibility conditions. These images were enhanced using the Visual Servo (VS) process that makes use of the Multiscale Retinex. The images were then quantified with visual quality metrics used internally within the VS. One of these metrics, the Visual Contrast Measure, has been computed for hundreds of FORESITE images, and for major classes of imaging: terrestrial (consumer), orbital Earth observations, orbital Mars surface imaging, NOAA aerial photographs, and underwater imaging. The metric quantifies both the degree of visual impairment of the original, un-enhanced images and the degree of visibility improvement achieved by the enhancement process. The large aggregate data exhibits trends relating to degree of atmospheric visibility attenuation, and its impact on the limits of enhancement performance for the various image classes. Overall results support the idea that in most cases that do not involve extreme reduction in visibility, large gains in visual contrast are routinely achieved by VS processing. Additionally, for very poor visibility imaging, lesser, but still substantial, gains in visual contrast are also routinely achieved. Further, the data suggest that these visual quality metrics can be used as external standalone metrics for establishing performance parameters.
Novel method of tensor representation for reconstruction of 3D PET images from projections
In this paper, a novel transform-based method of reconstruction of three-dimensional (3-D) positron emission tomography (PET) images is proposed. The proposed method is based on the concept of the non-traditional tensor form of representation of the 3-D image with respect to the 3-D discrete Fourier transform (DFT). Such representation uses a minimal number of projections. The proposed algorithms are described in detail for an image (N × N × N), where N is a power of two. The paired transform is defined completely by projections along the discrete grid nested on the image domain. The measurement data set containing specified projections of the 3-D image are generated according to the tensor representation and the proposed algorithm is tested on the data. The algorithm for selecting a required number of projections is described. This algorithm allows the user to select the projections that contain the maximum information and automatically selects the rest of the projections, so that there is no redundancy in the spectral information of the projections.
Processing of visual information in the visual and object buffers of scene understanding based on network-symbolic models
Modern computer vision systems suffer from the lack of human-like abilities to understand a visual scene and to detect, unambiguously identify, and recognize objects. Bottom-up fine-scale segmentation of an image with grouping into regions can rarely be effective for real-world images if applied to the whole image without clear criteria for how to further combine the resulting small, distinctive neighboring regions into meaningful objects. On a certain scale, an object or a pattern can be perceived just as an object or a pattern rather than a set of neighboring regions. Therefore, a region of interest, where the object or pattern can be located, must be established first. Rough but wide peripheral human vision serves this goal, while narrow but precise foveal vision analyzes and recognizes the object from the center of the region of interest after separating it from its background. Unlike the traditional computer vision models, biologically-inspired Network-Symbolic models convert image information into an 'understandable' Network-Symbolic format, which is similar to relational knowledge models. The equivalent of interaction between peripheral and foveal systems in the network-symbolic system is achieved via interaction of the Visual and Object Buffers and the top-level knowledge system. This article describes the principles of data representation and processing of information in Visual and Object buffers that allow for scene analysis and understanding with identification and recognition of objects in the visual scene.
Multiscale self-similarity features of terrain surface
Xutao Li, Hanqiang Cao, Guangxi Zhu, et al.
Self-similarity features of natural surfaces play a key role in region segmentation and recognition. Owing to a long period of natural evolution, real terrain surfaces are composed of many self-similar structures. Consequently, the self-similarity is not perfect enough to remain invariant over the whole scale space, and the traditional single self-similarity parameter cannot represent such rich self-similarity. In this view, self-similarity is not a constant parameter over all scales but a set of multi-scale parameters. To describe the multi-scale self-similarities of real surfaces, we first adopt the Fractional Brownian Motion (FBM) model to estimate the self-similarity curve of the terrain surface. The curve is then divided into several linear regions that represent the relevant self-similarities. Based on these regions, we introduce a parameter called Self-similar Degree (SSD), defined by analogy with information entropy. A smaller SSD value indicates more consistent self-similarity. We take fifty sample terrain images and evaluate the SSD, which represents the multi-scale self-similarity features, for each sample. The samples are clustered by unsupervised fuzzy c-means clustering into various classes according to the SSD and the traditional monotone Hurst feature, respectively. The measured separability of the features shows that the new parameter SSD is an effective feature for terrain classification. The similarity feature set made up of the monotone Hurst parameter and SSD therefore provides more information than the traditional monotone feature, and consequently the performance of terrain classification is improved.
Video, Coding, and Compression
User evaluation of differential compression for motion imagery
Laurie Gibson, John M. Irvine, Gary O'Brien, et al.
Motion imagery will play a critical role in future combat operations. The ability to provide a real-time, dynamic view of the battlefield, as well as the capability to maintain persistent surveillance, together make motion imagery a valuable source of information for the soldier. Acquisition and exploitation of this rich source of information, however, depend on available communications bandwidth to transmit the necessary information to users. Methods for reducing bandwidth requirements include a variety of image compression and frame decimation techniques. This study explores spatially differential compression in which targets in the clips are losslessly compressed, while the background regions are highly compressed. This study evaluates the ability of users to perform standard target detection and identification tasks on the compressed product, compared to performance on uncompressed imagery or imagery compressed by other methods. The paper concludes with recommendations for future investigations.
Automatic network-adaptive ultra-low-bit-rate video coding
Wei-Jung Chien, Tuyet-Trang Lam, Glen P. Abousleman, et al.
This paper presents a software-only, real-time video coder/decoder (codec) for use with low-bandwidth channels where the bandwidth is unknown or varies with time. The codec incorporates a modified JPEG2000 core and interframe predictive coding, and can operate with network bandwidths of less than 1 kbits/second. The encoder and decoder establish two virtual connections over a single IP-based communications link. The first connection is UDP/IP guaranteed throughput, which is used to transmit the compressed video stream in real time, while the second is TCP/IP guaranteed delivery, which is used for two-way control and compression parameter updating. The TCP/IP link serves as a virtual feedback channel and enables the decoder to instruct the encoder to throttle back the transmission bit rate in response to the measured packet loss ratio. It also enables either side to initiate on-the-fly parameter updates such as bit rate, frame rate, frame size, and correlation parameter, among others. The codec also incorporates frame-rate throttling whereby the number of frames decoded is adjusted based upon the available processing resources. Thus, the proposed codec is capable of automatically adjusting the transmission bit rate and decoding frame rate to adapt to any network scenario. Video coding results for a variety of network bandwidths and configurations are presented to illustrate the vast capabilities of the proposed video coding system.
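The feedback loop described in this abstract (the decoder measures packet loss and tells the encoder to throttle back) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `next_bit_rate`, the loss threshold, the multiplicative back-off, the cautious probe upward when loss is low, and the floor and ceiling values are hypothetical and are not taken from the paper.

```python
def next_bit_rate(current_bps, loss_ratio, loss_threshold=0.05,
                  backoff=0.5, probe=1.1, floor_bps=500, ceiling_bps=64000):
    """Hypothetical rate update sent over the control (TCP) connection.

    current_bps: bit rate the encoder is using now.
    loss_ratio:  fraction of video (UDP) packets the decoder found missing.
    All default parameters are illustrative assumptions, not values from the paper.
    """
    if loss_ratio > loss_threshold:
        # Too many packets lost: ask the encoder to throttle back.
        proposed = current_bps * backoff
    else:
        # Channel looks healthy: probe gently for more bandwidth (an assumption).
        proposed = current_bps * probe
    # Keep the request inside the range the codec is assumed to support.
    return max(floor_bps, min(ceiling_bps, proposed))
```

For example, `next_bit_rate(8000, 0.2)` halves the rate to 4000 bits/s, while a loss-free report lets the rate creep upward until it reaches the ceiling.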
Human face detection in video using edge projections
Mehmet Türkan, Berkan Dülek, Ibrahim Onaran, et al.
In this paper, a human face detection method in images and video is presented. After determining possible face candidate regions using color information, each region is filtered by a high-pass filter of a wavelet transform. In this way, edges of the region are highlighted, and a caricature-like representation of candidate regions is obtained. Horizontal, vertical and filter-like projections of the region are used as feature signals in dynamic programming (DP) and support vector machine (SVM) based classifiers. It turns out that the support vector machine based classifier provides better detection rates compared to dynamic programming in our simulation studies.
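As a rough illustration of the projection features mentioned in this abstract, the sketch below high-pass filters a candidate region and sums the absolute detail values along rows and columns. The Haar-style pairwise difference stands in for the wavelet high-pass filter, which the abstract does not specify; the function names and the choice of which sum is called "horizontal" are likewise assumptions.

```python
def haar_highpass_rows(region):
    """Haar-style horizontal detail: half the difference of each adjacent pixel pair.

    Assumption: a Haar filter stands in for the unspecified wavelet high-pass filter.
    """
    return [[(row[i] - row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]
            for row in region]


def edge_projections(detail):
    """Return (horizontal, vertical) projections of the absolute detail values.

    horizontal[r] sums row r; vertical[c] sums column c.
    """
    horizontal = [sum(abs(v) for v in row) for row in detail]
    vertical = [sum(abs(row[c]) for row in detail) for c in range(len(detail[0]))]
    return horizontal, vertical
```

The two projection vectors are the kind of one-dimensional feature signals that could then be passed to a classifier such as an SVM.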
Image Understanding, Restoration, and Reconstruction II
New high-dynamic-range camera architecture
The need for high (wide) dynamic range cameras in the Security and Defense sectors is self-evident. Still, the development of a cost-effective and viable system proves to be an elusive goal. To this end we take a new approach which meets a number of requirements, most notably a high "fill" factor for the associated APS (active pixel sensor) array and a minimal technology development curve. The approach can be used with any sensor array technology supporting, on a granular level, random pixel access. To achieve high dynamic range, one of the presented camera systems classifies image pixels according to their probable brightness levels. Then it scans the pixels according to their probable brightness, with the pixels most likely to be the brightest being scanned first and the pixels most likely to be the darkest, last. Periodically the system re-adjusts the scanning strategy based on collected data or operator inputs. The overall exposure time is dictated by the sensitivity of the selected array and by the content and frame rate of the image. The local exposure time is determined by the predicted pixel brightness levels. The prediction method we use in this paper is simple duplication; i.e., the brightness of the vast majority of pixels is assumed to change little from frame to frame. This allows us to dedicate resources only to the few pixels undergoing large output excursions. Such an approach was found to require only minimal modifications to standard APS array architectures and fewer "off-sensor" resources than CAMs (Content Addressable Memories) or other DSP-intensive methods.
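The scanning strategy in this abstract can be sketched in a few lines, assuming the "simple duplication" predictor means that each pixel's predicted brightness is its value in the previous frame. The function name `scan_order` and the frame representation (a list of rows) are illustrative assumptions.

```python
def scan_order(previous_frame):
    """Return pixel coordinates (row, col) ordered brightest-first.

    Prediction by duplication: the previous frame's value is used as the
    predicted brightness of each pixel in the next frame.
    """
    coords = [(y, x)
              for y, row in enumerate(previous_frame)
              for x, _ in enumerate(row)]
    # Sort by descending predicted brightness; ties keep row-major order.
    return sorted(coords, key=lambda yx: -previous_frame[yx[0]][yx[1]])
```

Pixels near the front of the list would be read out first, which corresponds to giving the brightest pixels the shortest local exposure.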
Compressive imaging spectrometers using coded apertures
A spectral imager provides a 3-D data cube in which the spatial information (2-D) of the image is complemented by spectral information (1-D) about each spatial location. Typically, these systems are operated in a fully-determined (or overdetermined) manner so that the measurements can be computationally inverted into a reliable estimate of the source. We propose a notional system design that is highly underdetermined, yet still computationally invertible. This approach relies on recently-developed concepts in compressive sensing. Because the number of required measurements is greatly reduced from traditional designs, the result is a faster and more economical sensor system.
Super resolution reconstruction based on motion estimation error and edge adaptive constraints
To improve the image quality of super-resolution reconstruction, a method based on motion estimation error and edge constraints is proposed. Under the conditions of data consistency and amplitude restriction, the motion estimation error is analyzed and its variance calculated; meanwhile, to suppress ringing artifacts, an edge constraint is adopted and a clustering-based method for judging edge direction is proposed. The experimental results show that the algorithm outperforms both traditional linear interpolation and a method that does not consider motion estimation error, in visual effect as well as peak signal-to-noise ratio.
Applications of Image Processing I
Improved stego sensitivity measure for ± A steganalysis
This article addresses four basic goals: 1) the evaluation of sequentially and randomly embedded stego evidence within digital images, 2) the identification of a "steganographic fingerprint" for special domain based steganographic methods, 3) the reduction of the steganalysis false detection rate, and 4) the investigation of two well-known pixel comparison based steganalysis methods. We present an improved version of the Stego Sensitivity Measure, which is based on the statistics of sample pairs (the basic unit), rather than individual samples, and is very sensitive to ± A embedding. The presented measure enhances stego detection accuracy and localization of stego areas within sequentially and randomly embedded color or gray scale stego images. In addition, it estimates the message length of an embedded bit-stream within bit planes of a digital image, and it localizes detected steganography better while improving the estimate of the message length. It also identifies the "steganographic fingerprint" of special domain sequentially and randomly based steganographic methods. Numerical experimentation was conducted with an arbitrary image database of 200 color TIFF and RAW images taken with the Nikon D100 and the Canon EOS Digital Rebel cameras. Comparisons with two known steganalysis methods, Raw Quick Pairs and RS Steganalysis, are also shown, and reveal that: a) the false alarm rate for the proposed detection method is p = 0.9 for a database of 200 clean images, while RS Steganalysis shows a high false alarm rate for clean images of p = 2.8; b) Raw Quick Pairs and RS Steganalysis cannot be used for localization of steganographic regions due to the statistical properties of these detection methods.
Quantitative confirmation of visual improvements to micro-CT bone density images
John S. DaPonte, Michael Clark, Paul Nelson, et al.
The primary goal of this research was to investigate the ability of quantitative variables to confirm qualitative improvements of the deconvolution algorithm as a preprocessing step in evaluating micro-CT bone density images. The analysis of these types of images is important because they are necessary to evaluate various countermeasures used to reduce or potentially reverse bone loss experienced by some astronauts when exposed to extended weightlessness during space travel. Nine low-resolution (17.5 microns) CT bone density image sequences, ranging from 85 to 88 images per sequence, were processed with three preprocessing treatment groups consisting of no preprocessing, preprocessing with a deconvolution algorithm, and preprocessing with a Gaussian filter. The quantitative parameters investigated consisted of Bone Volume to Total Volume Ratio, the Structured Model Index, Fractal Dimension, Bone Area Ratio, Bone Thickness Ratio, Euler's Number, and the Measure of Enhancement. Trends found in these quantitative variables appear to corroborate the visual improvements observed in the past and suggest which quantitative parameters may be capable of distinguishing between groups that experience bone loss and others that do not.
Advanced image processing of aerial imagery
Glenn Woodell, Daniel J. Jobson, Zia-ur Rahman, et al.
Aerial imagery of the Earth is an invaluable tool for the assessment of ground features, especially during times of disaster. Researchers at NASA's Langley Research Center have developed techniques which have proven to be useful for such imagery. Aerial imagery from various sources, including Langley's Boeing 757 Aries aircraft, has been studied extensively. This paper discusses these studies and demonstrates that better-than-observer imagery can be obtained even when visibility is severely compromised. A real-time, multi-spectral experimental system will be described and numerous examples will be shown.
Applications of Image Processing II
Evaluation of sharpness measures and search algorithms for the auto-focusing of high-magnification images
Yi Yao, Besma Abidi, Narjes Doggaz, et al.
Digital imaging systems with extreme zoom capabilities are traditionally found in astronomy and wildlife monitoring. More recently, the need for such capabilities has extended to long-range surveillance and wide-area monitoring of sites such as forest fires, airport perimeters, harbors, and waterways. Auto-focusing is an indispensable function for imaging systems designed for such applications. This paper studies the feasibility of an image-based passive auto-focusing control for high-magnification systems based on off-the-shelf telescopes and digital cameras/camcorders, concentrating on two associated elements: the cost function (usually the image sharpness measure) and the search strategy. An extensive review of existing sharpness measures and search algorithms is conducted and their performances compared. In addition, their applicability and adaptability to a wide range of high magnifications (50×~1500×) are addressed. This study lays the foundation for the development of auto-focusing schemes with particular applications to high-magnification systems.
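The two elements named in this abstract, a sharpness cost function and a search strategy, can be illustrated with a small sketch. The abstract does not say which measures or searches were evaluated, so the squared-gradient measure and the coarse-to-fine scan below are generic stand-ins chosen for illustration, and `capture` is a hypothetical callback that returns an image (a list of rows) for a given focus position.

```python
def squared_gradient(img):
    """Sharpness as the sum of squared differences between neighboring pixels."""
    total = 0.0
    for row in img:                              # horizontal neighbors
        for a, b in zip(row, row[1:]):
            total += (b - a) ** 2
    for upper, lower in zip(img, img[1:]):       # vertical neighbors
        for a, b in zip(upper, lower):
            total += (b - a) ** 2
    return total


def coarse_to_fine_search(capture, lo, hi, coarse_step, measure=squared_gradient):
    """Find the integer focus position in [lo, hi] with the highest sharpness.

    A coarse scan locates the neighborhood of the peak; a unit-step scan
    around the coarse winner refines it. Assumes a single sharpness peak.
    """
    coarse_best = max(range(lo, hi + 1, coarse_step),
                      key=lambda p: measure(capture(p)))
    fine_lo = max(lo, coarse_best - coarse_step)
    fine_hi = min(hi, coarse_best + coarse_step)
    return max(range(fine_lo, fine_hi + 1), key=lambda p: measure(capture(p)))
```

The coarse step trades capture count against the risk of stepping over a narrow peak, which is one reason the choice of search strategy matters at high magnification.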
Gradient-based value mapping for colorization of two-dimensional fields
Arvind Visvanathan, Stephen E. Reichenbach, Qingping Tao
This paper develops a method for automatic colorization of two-dimensional fields presented as images, in order to visualize local changes in values. In many applications, local changes in values are as important as magnitudes of values. For example, in topography, both elevation and slope often must be considered. Gradient-based value mapping for colorization is a technique to visualize both value (e.g., intensity or elevation) and gradient (e.g., local differences or slope). The method maps pixel values to a color scale in a manner that emphasizes gradients in the image. The value mapping function is monotonically non-decreasing, to maintain ordinal relationships of values on the color scale. The color scale can be a grayscale or pseudocolor scale. The first step of the method is to compute the gradient at each pixel. Then, the pixels (with computed gradients) are sorted by value. The value mapping function is the inverse of the relative cumulative gradient magnitude function computed from the sorted array. The value mapping method is demonstrated with data from comprehensive two-dimensional gas chromatography (GCxGC), using both grayscale and a pseudocolor scale to visualize local changes related to both small and large peaks in the GCxGC data.
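The three steps in this abstract (compute gradients, sort pixels by value, accumulate gradient magnitude) can be sketched as follows. This is one reading of the abstract, not the authors' implementation: each pixel value is mapped to the relative cumulative gradient magnitude of all pixels at or below that value, and the forward-difference gradient and the function names are assumptions.

```python
def gradient_magnitude(img):
    """Forward-difference gradient magnitude; the last row and column are replicated."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][min(x + 1, w - 1)] - img[y][x]
            dy = img[min(y + 1, h - 1)][x] - img[y][x]
            mag[y][x] = (dx * dx + dy * dy) ** 0.5
    return mag


def value_mapping(img):
    """Map each pixel value to a position in [0, 1] on the color scale."""
    mag = gradient_magnitude(img)
    # Step 2: sort pixels (with their gradients) by value.
    pairs = sorted((img[y][x], mag[y][x])
                   for y in range(len(img)) for x in range(len(img[0])))
    total = sum(g for _, g in pairs)
    mapping, running = {}, 0.0
    # Step 3: relative cumulative gradient magnitude over the sorted pixels.
    for value, g in pairs:
        running += g
        mapping[value] = running / total if total > 0 else 0.0
    return mapping
```

Because the running sum never decreases, the mapping is monotonically non-decreasing in pixel value, and value ranges where gradients concentrate receive a larger share of the color scale.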
Automated, on-board terrain analysis for precision landings
Zia-ur Rahman, Daniel J. Jobson, Glenn A. Woodell, et al.
Advances in space robotics technology hinge to a large extent upon the development and deployment of sophisticated new vision-based methods for automated in-space mission operations and scientific survey. To this end, we have developed a new concept for automated terrain analysis that is based upon a generic image enhancement platform: multi-scale retinex (MSR) and visual servo (VS) processing. This pre-conditioning with the MSR and the VS produces a "canonical" visual representation that is largely independent of lighting variations and exposure errors. Enhanced imagery is then processed with a biologically inspired two-channel edge detection process, followed by a smoothness-based criterion for image segmentation. Landing sites can be automatically determined by examining the results of the smoothness-based segmentation, which shows those areas in the image that surpass a minimum degree of smoothness. Though the MSR has proven to be a very strong enhancement engine, the other elements of the approach (the VS, terrain map generation, and smoothness-based segmentation) are in early stages of development. Experimental results on data from the Mars Global Surveyor show that the imagery can be processed to automatically obtain smooth landing sites. In this paper, we describe the method used to obtain these landing sites, and also examine the smoothness criteria in terms of the imager and scene characteristics. Several examples of applying this method to simulated and real imagery are shown.
Paired directional transform-based methods of image enhancement
In this paper, an effective realization of the α-rooting method of image enhancement by splitting-signals is proposed. The splitting-signals completely determine the image and split its spectrum into disjoint subsets of frequencies. Image enhancement is reduced to processing separate splitting-signals. We focus on processing only one specified splitting-signal, to achieve effective image enhancement that in many cases exceeds the enhancement by known α-rooting and wavelet methods. An effective realization of enhancement of an (N × N) image is achieved by using one coefficient, instead of N/2 such coefficients for splitting-signals in the split α-rooting and N × N in traditional α-rooting. The proposed method does not require Fourier transforms; its realization is performed with N multiplications. The processing of the splitting-signal leads to the change of the image along the parallel lines by N different values, which leads to the concept of directional images and their application in enhancing the image along directions. A novel method of combining paired transform (pre-step of SMEME (spectral spatial maximum exclusive mean) filter) by wavelet transforms is proposed. While denoising directional clutters, the most corrupted splitting-signal is estimated and found, depending on the angle of long-waves.
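For orientation, the traditional Fourier-domain α-rooting that this abstract uses as a baseline can be sketched on a 1-D signal: each spectral coefficient keeps its phase while its magnitude is raised to the power α. This is the conventional method only, not the paper's splitting-signal realization (which avoids Fourier transforms), and the naive O(N²) DFT and the function names are illustrative choices.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (O(N^2)), adequate for a short example."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * p * k / n) for k in range(n))
            for p in range(n)]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[p] * cmath.exp(2j * cmath.pi * p * k / n) for p in range(n)) / n
            for k in range(n)]


def alpha_rooting(signal, alpha):
    """Traditional alpha-rooting: F(p) -> F(p) * |F(p)|**(alpha - 1)."""
    rooted = [c * abs(c) ** (alpha - 1) if abs(c) > 0 else 0j
              for c in dft(signal)]
    # For a real input the scaled spectrum stays conjugate-symmetric,
    # so the imaginary part of the inverse is only rounding noise.
    return [v.real for v in idft(rooted)]
```

With α = 1 the signal is returned unchanged; values of α below 1 compress the dynamic range of the spectrum, which relatively boosts weaker (typically higher-frequency) components.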
VQ/DCT-based robust image watermarking
In this paper, a robust watermarking technique based on Vector Quantization and the previously developed spread spectrum robust watermarking technique is proposed. In this work, the watermark is embedded in both the DCT and codebook domains. Results illustrate that the proposed technique provides an improvement over the spread spectrum watermarking technique in terms of robustness to various signal processing attacks. A discussion on the robustness of the technique against the dead-lock and collusion problems is also provided.
Animating climate model data
John S. DaPonte, Thomas Sadowski, Paul Thomas
This paper describes a collaborative project conducted by the Computer Science Department at Southern Connecticut State University and NASA's Goddard Institute for Space Studies (GISS). Animations of output from a climate simulation math model used at GISS to predict rainfall and circulation have been produced for West Africa from June to September 2002. These early results have assisted scientists at GISS in evaluating the accuracy of the RM3 climate model when compared to similar results obtained from satellite imagery. The results presented below will be refined to better meet the needs of GISS scientists and will be expanded to cover other geographic regions for a variety of time frames.
A system for tracking and recognizing pedestrian faces using a network of loosely coupled cameras
L. Gagnon, F. Laliberté, S. Foucher, et al.
A face recognition module has been developed for an intelligent multi-camera video surveillance system. The module can recognize a pedestrian face in terms of six basic emotions and the neutral state. Face and facial features detection (eyes, nasal root, nose and mouth) are first performed using cascades of boosted classifiers. These features are used to normalize the pose and dimension of the face image. Gabor filters are then sampled on a regular grid covering the face image to build a facial feature vector that feeds a nearest neighbor classifier with a cosine distance similarity measure for facial expression interpretation and face model construction. A graphical user interface allows the user to adjust the module parameters.
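The final classification stage in this abstract, a nearest neighbor classifier with a cosine similarity measure, can be sketched as below. The feature vectors here are plain lists standing in for the Gabor-based facial feature vectors, and the function names, gallery layout, and labels are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbor(query, gallery):
    """Return the label of the gallery vector most similar to the query.

    gallery: list of (label, feature_vector) pairs (an assumed layout).
    """
    label, _ = max(gallery, key=lambda item: cosine_similarity(query, item[1]))
    return label
```

Cosine similarity ignores the overall magnitude of the feature vector, so a uniformly brighter or higher-contrast face image does not by itself change the match.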
Poster Session
Moving traffic object retrieval in H.264/MPEG compressed video
Xu-li Shi, Guang Xiao, Shuo-zhong Wang, et al.
Moving object retrieval in the compressed domain plays an important role in many real-time applications, e.g., vehicle detection and classification. A number of retrieval techniques that operate in the compressed domain have been reported in the literature. H.264/AVC is the most recent video-coding standard and is likely to lead to a proliferation of retrieval techniques in the compressed domain. Up to now, however, little work on H.264/AVC compressed video has been reported. Compared with the MPEG standard, H.264/AVC employs several new coding block types and a different entropy coding method, which make moving object retrieval in H.264/AVC compressed video a new and challenging task. In this paper, an approach to extract and retrieve moving traffic objects in H.264/AVC compressed video is proposed. Our algorithm first interpolates the sparse motion vectors of each P-frame, which is composed of 4×4 blocks, 4×8 blocks, 8×4 blocks, and so on. After forward projecting each P-frame vector to the immediately adjacent I-frame and calculating the DCT coefficients of the I-frame using spatial intra-prediction information, the method extracts moving VOPs (video object planes) using an interactive 4×4 block classification process. In vehicle detection applications, a VOP segmented at 4×4 block-level accuracy is insufficient. Once the target VOP is located, its actual edges can be extracted by applying Canny edge detection only to the moving VOP at 4×4 block accuracy. The VOP at pixel accuracy is then obtained by decompressing the DCT blocks of the VOPs. An edge-tracking algorithm is applied to find the missing edge pixels. After the segmentation process, a retrieval algorithm based on CSS (Curvature Scale Space) is used to search for the vehicle shape of interest in the H.264/AVC compressed video sequence. Experiments show that our algorithm can extract and retrieve moving vehicles efficiently and robustly.
Application of DSP in the image transmission system
Feng Gui, LinQi Wei
This paper proposes a scheme for encoding and decoding still images and video on the TI TMS320C6416 DSP chip, and describes a reliable image transmission system built on it. To meet the application requirements, the software has six major modules: (1) initialization of the DSP chip and other hardware; (2) video acquisition and input control; (3) serial port communication; (4) RAM storage and communication, which acquires and releases the token ring; (5) video reconstruction and output control; and (6) the core of the software, the encoding and decoding program. The encoder applies a wavelet transform first, followed by run-length coding and Huffman coding; adaptive processing gives the image or video balanced resolution and a better visual effect. The decoder performs the reverse operations. After integrated system debugging, satisfactory results were obtained: a comparatively high compression ratio, good image quality, and near-real-time operation.
Object-oriented digital imaging technology
This paper proposes an object-oriented digital imaging technology for both still and moving pictures. This imaging method uses a non-scanning image format that is independent of the number of scanning lines and the frame rates of current scanning methods. The format is composed of digital bits describing image objects and the deviation of moving pictures. We have recently entered an era in which still and moving pictures can be exchanged between remote places over the Internet or wireless links, and in which image data are used for various control systems. The object-oriented digital imaging technology is therefore presented as a non-scanning, frameless imaging technology that suits flat panels and packet networks, is computer friendly, and supports precise and rapid object recognition.
A new error resilience method for FGS video enhancement bit-stream
Ran Ma, Zhao-yang Zhang, Ping An
Video streaming over the Internet usually encounters bandwidth variations and packet losses, which badly degrade the reconstructed video quality. Fine Granularity Scalability (FGS) provides good bit-rate adaptability to different bandwidth conditions over the Internet, owing to its fine granularity and error resilience. Multiple Description Coding (MDC) is an effective solution to packet losses, but it introduces a great deal of redundant information. For an FGS video bit-stream, the base layer is usually very small and of high importance, so error-free transmission can be achieved through classical error resilience techniques. As a result, the overall streaming quality depends mostly on the enhancement layer. Moreover, it is worth noting that different bit-planes are of different importance, which makes them suitable for an unequal protection (UEP) strategy. A new joint MDC and UEP method is therefore proposed in this paper to protect the enhancement layer. In the proposed method, the MDC encoder/decoder is embedded into the normal enhancement layer encoder/decoder. Considering both the unequal protection of bit-planes and the redundancy of MDC, the two most significant bit-planes adopt the MDC-based strategy, while the remaining bit-planes are encoded only by the normal enhancement layer coding system. Experimental results demonstrate the efficiency of the proposed method.
A optimized context-based adaptive binary arithmetic coding algorithm in progressive H.264 encoder
Guang Xiao, Xu-li Shi, Ping An, et al.
Context-based Adaptive Binary Arithmetic Coding (CABAC) is a new entropy coding method introduced in H.264/AVC that is highly efficient for video coding. In this method, the probability of the current symbol is estimated using a carefully designed context model, which is adaptive and can approach the statistical characteristics. An arithmetic coding mechanism then largely reduces the inter-symbol redundancy. Compared with the UVLC method in the prior standard, CABAC is more complicated but reduces the bit rate efficiently. Based on a thorough analysis of the CABAC coding and decoding methods, this paper proposes two methods, a sub-table method and a stream-reuse method, to improve the encoding efficiency of the H.264 JM code. In JM, the CABAC function produces the bits of every syntax element one by one, and the repeated multiplication operations in the CABAC function make it inefficient. The proposed algorithm creates tables beforehand and then produces all the bits of a syntax element. In JM, the intra-prediction and inter-prediction mode selection algorithms, which use different criteria, are based on the RDO (rate distortion optimization) model. One of the parameters of the RDO model is the bit rate produced by the CABAC operator. After intra-prediction or inter-prediction mode selection, the CABAC stream is discarded and recalculated for the output stream. The proposed stream-reuse algorithm keeps the stream created during mode selection in memory and reuses it in the encoding function. Experimental results show that our proposed algorithm can averagely speed up 17 to 78 MSEL higher speed for QCIF and CIF sequences individually compared with the original algorithm of JM at the cost of only a little memory space. The CABAC was realized in our progressive H.264 encoder.
Secure multimedia browser over network
In this paper, a secure multimedia browsing scheme is proposed, built on perceptual multimedia encryption and secure key distribution. In this scheme, multimedia data are encrypted perceptually under the control of a user key and a quality factor. This encryption process, combined with Advanced Video Coding (AVC), is of low cost and keeps the file format unchanged. The key distribution scheme deals with user input, authenticates users, and controls the secure multimedia sharing process. Thus, only registered users can obtain multimedia data, and they can be classified into several types according to their payment. The analyses and experimental results show that the scheme is suitable for secure multimedia applications such as Video-on-Demand (VOD) systems, Audio-on-Demand (AOD) systems, pay-TV, videoconferencing systems, and wireless or mobile multimedia.
A new edge detection based on pyramid-structure wavelet transform
Many advanced image processing tasks, such as segmentation and recognition, are based on contour extraction, which usually lacks the ability to locate edges precisely in heavily noisy images at a low computational burden. To address this problem, we propose a new approach to edge detection based on the pyramid-structure wavelet transform. In order to suppress noise and keep good edge continuity, the proposed edge representation considers both inter-correlations across multiple scales and intra-correlations within a single scale. The former are described by point-wise singularity; the latter are described by the magnitude and ratio of wavelet coefficients in different sub-bands. Based on this edge model, edge point location is then performed in the wavelet domain by synthesizing the edge information across multiple scales. The experimental results show that our approach achieves pixel-level edge detection with strong resistance to noise due to scattering in water.
Characterization of ultra-resolution method
Evgeni N. Terentiev, N. E. Terentiev
The Fredholm equation of the first kind is commonly used in tasks of compensation for Point Spread Function (PSF) distortion. However, there is a contradiction: the distortion at any given point depends on a small number of adjacent points, whereas the compensation task is posed for the image as a whole, i.e. for a large number of points. The ultra-resolution method does not make use of the Fredholm equation of the first kind. The PSF distortions are compensated point by point [1,2]. The method can be used at low signal-to-noise ratio. The characterization of the ultra-resolution method is connected with the investigation of the value of the compensated distortions as a function of the noise level and of the pass-band frequency of the method. Applying the resolving function for an object-target as a special PSF gives a fundamentally new approach to the targeting problem. Characterization in this case helps us to choose the accuracy of indication at the target point.
Arithmetic for color image morphological transform
LinQi Wei, Feng Gui
Morphology is an important method in image processing and computer vision. Morphology has been widely used in grayscale and bi-level image processing, but it is difficult to use in color image processing because a color image has three color components. Applying a morphological method to a color image may cause a color shift relative to the original image. Morphological image processing gives satisfactory results in noise suppression, image enhancement, image encoding, feature extraction, and texture analysis. It is therefore important to find a way to apply morphological methods to a color image while keeping the color unchanged. This paper puts forward an approach for color image morphological transforms. In our approach, we first convert the color model from RGB space to HSI space, then apply the morphological transform to the I component in HSI space while keeping the H and S components unchanged, and finally use the new I component and the two other components (H and S) to obtain the color image in RGB space. We define some new color morphological operators for the color image morphological transform and discuss their characteristics. Experimental results are also presented. The color morphological transform is an extension of morphology to color space. We find it to be an effective means of image processing and feature extraction for target shapes.
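The intensity-only pipeline in the abstract above can be sketched as follows, under two stated assumptions. First, the morphological operator is a flat square dilation, whereas the paper defines its own color operators that are not reproduced here. Second, in the common HSI model with I = (R+G+B)/3, the R, G, and B values are each proportional to I when H and S are held fixed, so replacing I by a new value I' is equivalent to scaling the RGB triple by I'/I. The sketch uses that shortcut in place of an explicit RGB to HSI round trip. The function names are hypothetical.

```python
def dilate_gray(img, radius=1):
    """Flat square dilation: each pixel becomes the maximum of its neighbourhood."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            row.append(max(
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ))
        out.append(row)
    return out


def dilate_intensity_keep_hue(rgb, radius=1):
    """Dilate only the intensity of an RGB image with values in [0, 1].

    `rgb` is a list of rows, each row a list of (r, g, b) tuples.
    """
    intensity = [[sum(p) / 3.0 for p in row] for row in rgb]
    new_intensity = dilate_gray(intensity, radius)
    out = []
    for y, row in enumerate(rgb):
        out_row = []
        for x, pixel in enumerate(row):
            i_old, i_new = intensity[y][x], new_intensity[y][x]
            if i_old == 0:
                # Black has no defined hue or saturation: fall back to grey.
                out_row.append((i_new, i_new, i_new))
            else:
                k = i_new / i_old
                # Clipping at 1.0 can still alter colour at the gamut edge.
                out_row.append(tuple(min(1.0, c * k) for c in pixel))
        out.append(out_row)
    return out


image = [[(0.25, 0.125, 0.0), (0.5, 0.25, 0.0)]]
result = dilate_intensity_keep_hue(image)
```

In this two-pixel example the darker pixel takes on the intensity of its brighter neighbour while its channel ratios, and hence its hue and saturation, stay the same.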
A new texture representation with multi-scale wavelet feature
The existing methods for texture modeling include co-occurrence statistics, filter banks, and random fields. However, most of these methods lack the capability to characterize the different scales of texture effectively. In this paper, we propose a texture representation that combines the local scale feature with the amplitude and phase of the wavelet modulus at multiple scales. The self-similarity of texture is not globally uniform and can be measured both in correlations across multiple scales and in statistical features within a single scale. In our approach, the local scale feature is represented by the optimal scale obtained through the evolution of the wavelet modulus across multiple scales. Then, for all the blocks of the same optimal scale, a statistical measurement of the amplitude is extracted to represent the energy within the corresponding frequency band, and a statistical measurement of the phase of the modulus is extracted to represent the texture's orientation. Our experiment indicates that, with the proposed texture representation, the separability of different texture patterns is greater than that of traditional features.