Proceedings Volume 3460

Applications of Digital Image Processing XXI


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 1 October 1998
Contents: 9 Sessions, 93 Papers, 0 Presentations
Conference: SPIE's International Symposium on Optical Science, Engineering, and Instrumentation 1998
Volume Number: 3460

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Implementation Issues
  • Image Analysis and Assessment I
  • Image Analysis and Assessment II
  • Image and Video Compression I
  • Image and Video Compression II
  • Image and Video Compression III
  • Restoration and Information Extraction I
  • Restoration and Information Extraction II
  • Poster Session
Implementation Issues
License plate recognition system
Nadeem A. M. Khan, Ron J. De la Haye, Hans A. Hegt
A powerful automated license plate recognition system is presented, which is able to read license numbers of cars even under non-ideal circumstances. At the front end of the system, a high-speed shutter camera and a frame grabber deliver digitized images of passing cars. In a license plate segmentation step, the approximate positions of the four corner points of the plate are determined. Due to the perspective view, these corner points may not correspond to a rectangle. By means of resampling, a rectangular license plate with a fixed size of 180 x 40 pixels is reconstructed. After image enhancement steps, the characters are approximately segmented based on the properties of a vertical projection of the license plate. Next, the separate characters are normalized with respect to contrast, intensity and size. Each character image is projected onto a low-dimensional space using the Karhunen-Loeve (KL) transform, which contains the relevant information to distinguish it from other characters. A problem with this transformation arises when the character is not properly segmented. We solved that problem by comparing the inverse KL transformed result with the original character: if they differ significantly, this may indicate a major segmentation error, which can then be corrected. This leads to much improved segmentation and thus a transformation that holds the information needed for classification. The KL-transformed characters can be classified by several methods; we obtained good results by classifying them using the Euclidean distance. A misclassification rate of 0.4% was achieved with a rejection rate of 13%. Further development of the system, for which a number of recommendations are given, is expected to increase its performance.
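The KL-projection, reconstruction-error check, and Euclidean-distance classification described above can be sketched in a few lines. The following Python sketch uses made-up random "character" vectors in place of real segmented plate characters; the subspace dimension, noise level, and all names are illustrative assumptions, not the paper's actual parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200-pixel "character" images for 3 classes,
# built as noisy copies of random prototypes (the paper uses real segmented
# license-plate characters instead).
prototypes = rng.random((3, 200)) > 0.5
X = np.vstack([p ^ (rng.random((30, 200)) < 0.05) for p in prototypes]).astype(float)
labels = np.repeat(np.arange(3), 30)

# Karhunen-Loeve (PCA) basis of the training set.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
basis = Vt[:10]                                   # 10-dimensional KL subspace

def kl_project(img):
    return (img - mean) @ basis.T

def reconstruction_error(img):
    # Inverse KL transform; a large residual hints at a segmentation error.
    recon = kl_project(img) @ basis + mean
    return float(np.linalg.norm(img - recon))

# Class means in KL space; classification by Euclidean distance.
class_means = np.array([kl_project(X[labels == c]).mean(axis=0) for c in range(3)])

def classify(img):
    return int(np.argmin(np.linalg.norm(kl_project(img) - class_means, axis=1)))

test_img = (prototypes[1] ^ (rng.random(200) < 0.05)).astype(float)
predicted = classify(test_img)
```

In practice a rejection threshold on the winning distance would implement the 13% rejection rate the abstract reports.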
Speckle-size metric to improve reconstruction of coherent speckle images
Barbara Tehan Landesman, David F. Olson
Robust reconstruction of coherent speckle images from non-imaged laser speckle patterns in the aperture plane of an optical system requires adequate sampling of the speckle intensity at the focal plane. Although detector size cannot be changed dynamically in the course of an experiment to achieve the necessary sampling in every frame, a measure of speckle size can be used to accept or reject individual frames in post-processing software to improve the final reconstructed image. This paper investigates the use of a speckle-size metric to gauge the integrity of speckle sampling in each frame of a series of coherent speckle images. Frames containing inadequate sampling are sorted out of the final reconstructed image. The quality of the final recovery for a variety of targets and imaging conditions is compared for sorted and non-sorted reconstructions.
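The abstract does not specify its metric; one common choice, sketched below under that assumption, is the width of the frame's intensity autocorrelation peak. The synthetic speckle generator and the FWHM-in-pixels measure are illustrative, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def synth_speckle(n, pupil_radius, rng):
    # Coherent speckle: random-phase field band-limited by a circular pupil.
    y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
    pupil = (x ** 2 + y ** 2) <= pupil_radius ** 2
    phase = np.exp(2j * np.pi * rng.random((n, n)))
    field = np.fft.ifft2(np.fft.ifftshift(pupil * phase))
    return np.abs(field) ** 2

def speckle_size(frame):
    # Mean speckle size taken as the full width at half maximum of the
    # normalized intensity autocorrelation peak, in pixels.
    f = frame - frame.mean()
    ac = np.fft.fftshift(np.fft.ifft2(np.abs(np.fft.fft2(f)) ** 2).real)
    row = ac[frame.shape[0] // 2] / ac.max()
    return int((row > 0.5).sum())

coarse = synth_speckle(128, 8, rng)    # small pupil -> large speckles
fine = synth_speckle(128, 32, rng)     # large pupil -> small speckles
size_coarse = speckle_size(coarse)
size_fine = speckle_size(fine)
```

A frame whose metric falls below roughly two pixels per speckle would be rejected as undersampled before reconstruction.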
Multiple-CCD stereo acquisition system for high-speed imaging
Eiji S. Yafuso, David T. Sass, Eustace L. Dereniak, et al.
A high-speed 3D imaging system has been developed using multiple independent CCD cameras with sequentially triggered acquisition and individual field storage capability. The system described here utilizes sixteen independent cameras. A stereo alignment and triggering scheme arranges the cameras into two angularly separated banks of eight cameras each. By simultaneously triggering correlated stereo pairs, an eight-frame sequence of stereo images is captured. The delays can be individually adjusted to yield a greater number of acquired frames during more rapid segments of the event, and the individual integration periods may be adjusted to ensure adequate radiometric response while minimizing image blur. Representation of the data as a 3D sequence introduces the issue of independent camera coordinate registration with the real scene. A discussion of the forward and inverse transform operator for the digital data is provided along with a description of the acquisition system.
Three-dimensional imaging of interphase cell nuclei with laser scanning microscopy
Wilfried Boecker, Thomas Radtke, Christian Streffer
During the past decade 3D image processing has become an important key component in biological research, mainly due to two different developments. The first is based on an optical instrument, the confocal laser scanning microscope, allowing optical sectioning of the biological specimen. The second is a biological preparatory method, the FISH technique (Fluorescence In Situ Hybridization), allowing labeling of certain cellular and sub-cellular compartments with highly specific fluorescent dyes. Together these methods make it possible to investigate the 3D biological framework within cells and nuclei. Image acquisition with confocal laser scanning microscopy must deal with different limits of resolution along and across the optical axis. Although lateral resolution is about 0.7 times better than in non-confocal arrangements, axial resolution is 3-4 times poorer than the lateral resolution (depending on the pinhole size). For 3D reconstruction it is desirable to improve axial resolution in order to provide nearly identical image information across the 3D specimen space. This presentation gives an overview of some of the most popular restoration and deblurring algorithms used in 3D image microscopy. After 3D image restoration, segmentation of certain details of the cell structure is usually the next step in image processing. We compared two different kinds of algorithms for segmentation of chromosome territories in interphase cell nuclei: one based on mathematical morphology, the other on split-and-merge methods. The segmented image regions provided the basis for chromosome domain reconstruction as well as for regional localization for subsequent quantitative measurements. As a result, the chromatin density within certain chromosome domains as well as some terminal DNA sequences (telomere signals) could be measured.
Liquid crystal temperature measurement for real-time control
Dina C. Birrell, John K. Eaton
A laboratory system for multiple-point, closed-loop surface temperature control has been developed to test control algorithms for manufacturing applications such as rapid-prototyping laminate manufacture and rapid thermal processing of semiconductor wafers. Accurate surface temperature measurements with high spatial resolution are required to provide a signal for feedback control. Image acquisition and conversion to temperature must be fast enough to operate in conjunction with the control method used. Optical methods are a natural fit, since many spatially distributed measurements can be obtained rapidly with a minimum of complexity. The operation of thermochromic liquid crystals in a real-time control loop is described. Every 200 msec, the temperature at each of 196 zones must be obtained. Two frames and multiple pixels per zone are averaged to reduce statistical uncertainty in the reading. An in situ calibrator was developed and used to study the errors inherent in this system, including errors due to hysteresis and uneven surface lighting. The statistical uncertainty in measured temperature due to the calibrator uncertainty varied with temperature, but was less than 0.03 °C over most of the range. The uncertainty in the in situ temperature reading was larger because fewer pixels could be averaged. This uncertainty also depended upon temperature, and was below 0.2 °C over a 7.5 °C range. Additional errors were investigated. Hysteresis, or path-dependent effects in the temperature response of the liquid crystals, was seen to depend upon the maximum temperature reached, even when the maximum temperature was within the usable range of the liquid crystals. Taking the crystals to temperatures above the usable range led to larger hysteresis errors, as much as 0.2 °C, while errors below 0.1 °C were seen for a smaller temperature excursion. Uneven lighting had a large effect on the measured saturation and intensity, and a significantly smaller effect on the hue. However, even the small effect on the hue caused an error of around 0.1 °C, so uneven lighting should be avoided if possible. Also, thresholding (determining whether crystals are reading valid temperatures) is complicated by variations in lighting intensity over the surface to be measured.
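The pixel-averaging and hue-to-temperature conversion step can be sketched as follows. The calibration numbers here are invented (and made exactly quadratic so the polynomial fit is exact); a real in-situ calibrator would supply measured hue-temperature pairs:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical calibration data: hue recorded at known temperatures. The
# curve is synthetic; an in-situ calibrator would provide measured pairs.
cal_hue = np.array([0.05, 0.15, 0.25, 0.35, 0.45, 0.55])
cal_temp = 30.0 + 12.0 * cal_hue + 4.0 * cal_hue ** 2    # degrees C

# Low-order polynomial fit of temperature as a function of hue.
coeffs = np.polyfit(cal_hue, cal_temp, 2)

def zone_temperature(hue_samples):
    # Average many pixels (and, in the real system, two frames) first to cut
    # statistical uncertainty, then convert the mean hue to temperature.
    return float(np.polyval(coeffs, np.mean(hue_samples)))

pixels = 0.25 + 0.01 * rng.standard_normal(200)   # one of the 196 zones
t = zone_temperature(pixels)
```

Averaging 200 pixels reduces the standard error of the mean hue by a factor of about 14, which is why the calibrator zones (many pixels) show lower uncertainty than the in-situ readings (fewer pixels).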
Aperture design and numerical reconstruction technique requirements for high-resolution imaging of neutrons in inertial confinement fusion (ICF)
In inertial confinement fusion (ICF) experiments, radiation from the compressed core is increasingly reabsorbed. For the largest experiments, the only radiation to escape is the 14 MeV fusion neutrons, to which we must turn to learn of the physical processes taking place. The most important parameters are the shape and the size of the compressed core, and this involves imaging the neutrons produced by the fusion reactions. The penumbral technique is ideally suited to neutron imaging, and its feasibility has been demonstrated at the Lawrence Livermore National Laboratory in the United States. At the Phebus laser facility in France, this method has been used to image compressed ICF cores with diameters of 150 micrometers yielding approximately 10^9 neutrons, and the overall spatial resolution obtained in the reconstructed source was approximately 100 micrometers. On the Laser Megajoule project, the equivalent of the National Ignition Facility in the United States, the spatial resolution required to diagnose high-convergence targets is 10 micrometers. We wish first to obtain a spatial resolution of 30 micrometers to image a source with a diameter <= 100 micrometers at a neutron yield in the range of 10^11 - 10^14 neutrons. A collaborative experimental program with the Laboratory for Laser Energetics at the University of Rochester is planned in this perspective. At the same time, there is a research program in collaboration with Laval University concerning coded-aperture designs and the associated reconstruction techniques. In this article we first review the basic requirements of such imagery and the concept of the penumbral imaging technique. Then we concentrate on the aperture design criteria and on the quantity of information necessary to achieve high spatial resolution. Finally, we survey the reconstruction techniques used, followed by results and a comparative evaluation of those methods.
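The core idea of penumbral imaging, that the detector records the source convolved with a large aperture and the source is recovered by deconvolution, can be illustrated with a toy model. Everything below (sizes, shapes, the Wiener-style regularizer) is an illustrative assumption, not the actual reconstruction technique surveyed in the paper:

```python
import numpy as np

# Toy penumbral-imaging model: the detector records the neutron source
# convolved with the aperture's transmission; the source is recovered with a
# Wiener-style inverse filter. All sizes and shapes are illustrative only.
n = 64
y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
source = ((x ** 2 + y ** 2) <= 4 ** 2).astype(float)       # compact core
aperture = ((x ** 2 + y ** 2) <= 12 ** 2).astype(float)    # large aperture

A = np.fft.fft2(np.fft.ifftshift(aperture))
recorded = np.fft.ifft2(np.fft.fft2(source) * A).real      # penumbral image

# Wiener-style deconvolution; eps regularizes the zeros of the aperture's
# transfer function and stands in for the noise-to-signal term.
eps = 1e-3
S = np.fft.fft2(recorded) * np.conj(A) / (np.abs(A) ** 2 + eps)
recovered = np.fft.ifft2(S).real
similarity = float(np.corrcoef(recovered.ravel(), source.ravel())[0, 1])
```

With real neutron statistics, the regularization term dominates the design trade-off: too little amplifies noise at the aperture transfer function's zeros, too much blurs the recovered core.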
Accurate distortion correction scheme for digital subtraction angiography
Haiying Liu, Hsiang-Hsin Hsiung
Robust geometric distortion characterization and correction methods have been developed and validated for DSA applications. A compact and efficient numerical parameterization technique has been used to characterize the 3D geometric image distortion of a clinical DSA system. A 3rd-order polynomial was found to be sufficient for the parameterization of the global image distortion at all rotational angles. One advantage of the new technique is that it is compact, in the sense that it does not require a large distortion-parameter look-up table. The new method promises the potential of using fewer, less regularly distributed calibration points for a complete 3D image distortion characterization of DSA systems. Furthermore, two different schemes based on pixel mapping for correcting image distortion have been successfully demonstrated, yielding satisfactory results.
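A 3rd-order polynomial parameterization of distortion can be fitted by least squares from calibration-grid points, replacing a per-pixel look-up table with 10 coefficients per coordinate. The grid and the smooth distortion below are synthetic stand-ins for measured phantom data:

```python
import numpy as np

# Synthetic calibration grid with a made-up smooth distortion (a clinical
# system would supply measured phantom-grid positions at each gantry angle).
gx, gy = np.meshgrid(np.linspace(-1, 1, 10), np.linspace(-1, 1, 10))
true_pts = np.column_stack([gx.ravel(), gy.ravel()])
r2 = (true_pts ** 2).sum(axis=1, keepdims=True)
distorted = true_pts * (1 + 0.03 * r2) + 0.01 * true_pts[:, ::-1]

def cubic_terms(p):
    # The 10 monomials of a full 3rd-order 2D polynomial.
    x, y = p[:, 0], p[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2,
                            x ** 3, x ** 2 * y, x * y ** 2, y ** 3])

# Least-squares fit of the correction distorted -> true: one 10-coefficient
# 3rd-order polynomial per coordinate, instead of a distortion look-up table.
coeff, *_ = np.linalg.lstsq(cubic_terms(distorted), true_pts, rcond=None)

corrected = cubic_terms(distorted) @ coeff
residual = float(np.abs(corrected - true_pts).max())
uncorrected = float(np.abs(distorted - true_pts).max())
```

Because the correction is an analytic polynomial, it can be evaluated at any pixel position, which is what makes sparser, less regular calibration grids viable.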
Automatic labeling of brain tissues in MRI using an encoder-segmented neural network
Ning Li, Youfu Li
Quantitative estimation of tissue labeling depends heavily on the efficiency of the image segmentation technique. In this paper, an encoder-segmented neural network (ESNN) is proposed to improve the efficiency of image segmentation. The features are ranked according to encoder indicators, by which insignificant feature vectors are eliminated from the original feature vectors and the important feature vectors are reorganized as encoded feature vectors for the subsequent clustering. The ESNN improves on the existing FCM algorithm in feature extraction and in the selection of the number of clusters. This method was successfully applied to automatic labeling of tissues in brain MRIs. Examples of the results are also presented for brain diagnosis using MR images.
Implementation of a watershed algorithm on FPGAs
Shahram Zahirazami, Mohamed Akil
In this article we present an implementation of a watershed algorithm on a multi-FPGA architecture. This implementation is based on a hierarchical FIFO (H-FIFO): a separate FIFO for each gray level. The gray-scale value of a pixel is taken as the altitude of the point, so the image is viewed as a relief. We proceed by a flooding step, as if we immerse the relief in a lake: the water rises, and when the water of two different catchment basins meets, a separator or `watershed' is constructed. This approach is data dependent, hence the processing time differs from image to image. The H-FIFO is used to guarantee the nature of immersion, which requires two types of priority: all the points at altitude `n' are processed before any point at altitude `n + 1', and within an altitude, water propagates with a constant velocity in all directions from the source. The operator needs two images as input: an original image or its gradient, and a marker image. A classic way to construct the marker image is to build an image of minimal regions, each with its own unique label. This label is the color of the water and is used to detect when two different waters touch each other. The algorithm first fills the hierarchical FIFO with the uncolored neighbors of all the labeled regions. Next it fetches the first pixel from the first non-empty FIFO and processes it: the pixel takes the color of its neighbor, and all its neighbors not already in the H-FIFO are put in their corresponding FIFOs. The process is over when the H-FIFO is empty. The result is a segmented and labeled image.
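The flooding procedure can be sketched in software. The following Python toy (not the paper's FPGA design) uses one FIFO per gray level and floods level by level; 4-connectivity, the tiny test image, and the no-explicit-ridge simplification are all choices made for illustration:

```python
import numpy as np
from collections import deque

def watershed(image, markers):
    # Simplified flooding watershed driven by one FIFO per gray level (the
    # hierarchical FIFO of the abstract). 4-connectivity; each pixel takes
    # the color of the basin that reaches it first, so no explicit ridges.
    h, w = image.shape
    labels = markers.copy()
    fifo = [deque() for _ in range(int(image.max()) + 1)]
    queued = np.zeros((h, w), dtype=bool)

    def neighbors(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if 0 <= r + dr < h and 0 <= c + dc < w:
                yield r + dr, c + dc

    # Seed the FIFOs with the unlabeled neighbors of every marker region.
    for r in range(h):
        for c in range(w):
            if labels[r, c]:
                for rr, cc in neighbors(r, c):
                    if not labels[rr, cc] and not queued[rr, cc]:
                        fifo[image[rr, cc]].append((rr, cc))
                        queued[rr, cc] = True

    # Flood level by level: every point at altitude n is processed before
    # any point at altitude n + 1.
    for level in range(len(fifo)):
        while fifo[level]:
            r, c = fifo[level].popleft()
            labels[r, c] = next(labels[rr, cc] for rr, cc in neighbors(r, c)
                                if labels[rr, cc])
            for rr, cc in neighbors(r, c):
                if not labels[rr, cc] and not queued[rr, cc]:
                    fifo[max(int(image[rr, cc]), level)].append((rr, cc))
                    queued[rr, cc] = True
    return labels

# Two basins separated by a ridge of altitude 3 down the middle column.
img = np.array([[0, 0, 3, 0, 0],
                [0, 0, 3, 0, 0],
                [0, 0, 3, 0, 0]])
mk = np.zeros_like(img)
mk[1, 0], mk[1, 4] = 1, 2                 # one marker per basin
seg = watershed(img, mk)
```

The `max(image[rr, cc], level)` push mirrors the hardware constraint that water cannot flow back to an already-drained level.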
Hardware implementation of image segmentation algorithm for real-time image compression
Piotr Wasilewski
Segmentation algorithms are a fast and simple technique for obtaining an image representation at different resolution levels, so they are widely used for image compression. Neither floating-point calculations nor large amounts of memory are required, so these algorithms can be easily implemented in relatively cheap and simple real-time systems. The proposed algorithm divides an image into rectangular blocks, which may overlap. The width and height of these blocks are set independently and can take optimal values from a preset range. Blocks are filled with the mean value of the pixels from the original image, and their sizes are increased as long as the mean square error for the block remains smaller than a preset value. Next, a hardware implementation in a single FPGA device is proposed. The paper also presents results obtained during off-line image compression. These results show better quality (in terms of PSNR) of the restored images compared to the standard quadtree algorithm. Simulations show that the proposed hardware architecture can process a standard monochrome CIF image at over 30 frames per second while preserving low cost and high quality.
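A greedy software sketch of such a block coder follows: grow a rectangle from each uncovered pixel while its mean-value approximation keeps the block MSE under a threshold. The growth order, size limits, and test image are illustrative assumptions, not the paper's hardware design:

```python
import numpy as np

def block_segment(img, max_mse, max_size=32):
    # Greedy variable-size block coder: from the first uncovered pixel, grow
    # a rectangle while its mean-value approximation keeps MSE under max_mse.
    h, w = img.shape
    covered = np.zeros((h, w), dtype=bool)
    blocks = []
    for r in range(h):
        for c in range(w):
            if covered[r, c]:
                continue
            bh = bw = 1
            while True:
                grown = False
                for nh, nw in ((bh + 1, bw), (bh, bw + 1)):
                    if nh > max_size or nw > max_size or r + nh > h or c + nw > w:
                        continue
                    blk = img[r:r + nh, c:c + nw]
                    if ((blk - blk.mean()) ** 2).mean() <= max_mse:
                        bh, bw = nh, nw
                        grown = True
                        break
                if not grown:
                    break
            blocks.append((r, c, bh, bw, img[r:r + bh, c:c + bw].mean()))
            covered[r:r + bh, c:c + bw] = True
    return blocks

def decode(blocks, shape):
    # Reconstruction: paint each block with its stored mean value.
    out = np.zeros(shape)
    for r, c, bh, bw, mean in blocks:
        out[r:r + bh, c:c + bw] = mean
    return out

# A piecewise-constant test image compresses to very few blocks.
img = np.zeros((16, 16))
img[:, 8:] = 200.0
blocks = block_segment(img, max_mse=1.0)
recon = decode(blocks, img.shape)
```

The compressed representation is just the list of `(row, col, height, width, mean)` tuples, which is what makes the scheme attractive for a single-FPGA implementation.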
Image Analysis and Assessment I
Adaptive multistage 2D image motion field estimation
Ulrich Neumann, Suya You
This paper addresses the problem of robust 2D image motion estimation in natural environments. We develop an adaptive tracking-region selection and optical-flow estimation technique. The strategy of adaptive region selection locates reliable tracking regions and makes their motion estimation more reliable and computationally efficient. The multi-stage estimation procedure makes it possible to discriminate between good and poor estimation areas, which maximizes the quality of the final motion estimation. Furthermore, the model fitting stage further reduces the estimation error and provides a more compact and flexible motion field representation that is better suited for high-level vision processing. We demonstrate the performance of our techniques on both synthetic and natural image sequences.
Reduction of checking points using unimodal error surface assumption for fast motion estimation
Jong-Nam Kim, Tae-Sun Choi
The three-step search (TSS) has played a key role in real-time video encoding because of its light computational complexity, regular search rule, and reasonable performance at reduced computation. Many modified TSS algorithms have been proposed to reduce the amount of computation or to improve the quality of the image predicted with the obtained motion vector. This paper presents a new concept of hierarchical search in motion estimation for further reduction of computational complexity and better error performance compared with conventional modified TSS algorithms. The structure of the proposed algorithm is similar to that of the conventional TSS algorithm; however, it uses a different search precision at each step. It is shown that the proposed algorithm is very efficient in terms of computational speed-up and has improved error performance over the conventional modified TSS algorithms. The proposed algorithm will be useful in software-based real-time video coding and low-bit-rate video coding.
Algorithm for finding parameter-dependent connected components of gray-scale images
Yang Wang, Prabir Bhattacharya
In a previous work we introduced the concept of a parameter-dependent connected component of gray-scale images, which takes into account both the gray values of the pixels and the differences between the gray values of neighboring pixels. This concept is a convenient tool for analyzing and understanding images at a higher level than the pixel level. In this paper, we describe an algorithm for finding the parameter-dependent components of a given image. We discuss different strategies used in the algorithm and analyze their effects through experimental results. Since the proposed algorithm is independent of the formation of the images, it can be used for the analysis of many types of images. The experimental results show that for appropriate values of the parameters, the objects of an image may be represented by its parameter-dependent components reasonably well. Thus, the proposed algorithm provides the possibility of analyzing images further at the component level.
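One plausible reading of such components, sketched below as an assumption rather than the paper's exact definition, is a flood fill that joins 4-neighbors whose gray values differ by at most a parameter:

```python
import numpy as np
from collections import deque

def param_components(img, max_diff):
    # Label connected components in which 4-neighbors differ by at most
    # max_diff in gray value (one plausible reading of parameter-dependent
    # components; the paper's definition also uses the gray values themselves).
    h, w = img.shape
    labels = np.zeros((h, w), dtype=int)
    nxt = 0
    for r in range(h):
        for c in range(w):
            if labels[r, c]:
                continue
            nxt += 1
            labels[r, c] = nxt
            q = deque([(r, c)])
            while q:
                cr, cc = q.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, c2 = cr + dr, cc + dc
                    if (0 <= rr < h and 0 <= c2 < w and not labels[rr, c2]
                            and abs(int(img[rr, c2]) - int(img[cr, cc])) <= max_diff):
                        labels[rr, c2] = nxt
                        q.append((rr, c2))
    return labels

# Two smooth regions separated by a large gray-value jump.
img = np.array([[10, 12, 50, 52],
                [11, 13, 51, 53],
                [10, 12, 50, 52]])
lab = param_components(img, max_diff=5)
```

Varying `max_diff` sweeps the decomposition from per-pixel components (0) to a single component (255), which is the parameter dependence the title refers to.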
Enhanced Voronoi diagram method for segmentation of partially overlapped thin objects
Q. M. Jonathan Wu, Xiaotian Shi, Ali Jerbi, et al.
A new clustering technique is developed for segmentation of partially overlapped thin objects. The technique is based on an enhanced Voronoi diagram which partitions random data into clusters where intra-class members possess features of close similarity. An important aspect of this study consists of introducing predicting directional vectors, reminiscent of the first and second principal components, in order to achieve better partitioning of data clusters. Computer implementations of this new partitioning scheme illustrate superior partitioning performance over the standard Voronoi approach. It is shown that the new scheme minimizes the error in data classification. A mathematical framework is provided in support of this new clustering method. Experimental results on partitioning glass fibers are presented to illustrate application of the technique to object segmentation.
Shape analysis modeling for character recognition
Nadeem A. M. Khan, Hans A. Hegt
Optimal shape modeling of character classes is crucial for achieving high performance in recognition of mixed-font, handwritten, or poor-quality text. A novel scheme is presented in this regard, focusing on constructing structural models that can be hierarchically examined. These models utilize a well-thought-out set of shape primitives. They are simplified enough to ignore variations in font type or writing style, yet retain enough detail for discrimination between samples of similar classes. Thus the number of models required per class can be kept minimal without sacrificing recognition accuracy. In this connection, a flexible multi-stage matching scheme exploiting the proposed modeling is also described. This leads to a system which is robust against various distortions and degradations, including those related to touching and broken characters. Finally, we present some examples and test results as a proof of concept demonstrating the validity and robustness of the approach.
Generalized adaptive strategies for edge detection in digital imagery
Edges in digital imagery can be identified from the zero-crossings of Laplacian-of-Gaussian (LOG) filtered images. Time- or frequency-sampled LOG filters have been developed for the detection and localization of edges in digital image data. The image is decomposed into overlapping subblocks and processed in the transform domain. Adaptive algorithms are developed to minimize spurious edge classifications. In order to achieve accurate and efficient implementations, the discrete symmetric cosine transform of the input data is employed in conjunction with adaptive filters. The adaptive selection of the filter coefficients is based on a gradient criterion. For instance, in the case of the frequency-sampled LOG filter, the filter parameter is systematically varied to force the rejection of false or weak edges. In addition, the proposed algorithms extend easily to higher dimensions. This is useful where 3D medical image data containing edge information has been corrupted by noise. This paper employs isotropic and non-isotropic filters to track edges in such images.
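The LOG zero-crossing idea itself can be sketched directly in the spatial domain (the paper works in the transform domain with adaptive filters; the kernel size, sigma, and rejection threshold below are illustrative choices):

```python
import numpy as np

def log_kernel(size=9, sigma=1.4):
    # Sampled Laplacian-of-Gaussian kernel, adjusted to zero sum so that
    # flat image regions map exactly to zero response.
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    s2 = sigma ** 2
    k = (x ** 2 + y ** 2 - 2 * s2) / s2 ** 2 * np.exp(-(x ** 2 + y ** 2) / (2 * s2))
    return k - k.mean()

def convolve2d(img, k):
    # Direct 2D convolution with edge padding (no SciPy dependency).
    r = k.shape[0] // 2
    padded = np.pad(img, r, mode='edge')
    out = np.zeros(img.shape)
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            out += k[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def zero_crossings(resp, thresh=1e-3):
    # Mark sign changes between neighbors; thresh rejects weak edges, a
    # crude stand-in for the paper's adaptive rejection criterion.
    zc = np.zeros(resp.shape, dtype=bool)
    sign = resp > 0
    zc[:, 1:] |= (sign[:, 1:] != sign[:, :-1]) & (np.abs(resp[:, 1:] - resp[:, :-1]) > thresh)
    zc[1:, :] |= (sign[1:, :] != sign[:-1, :]) & (np.abs(resp[1:, :] - resp[:-1, :]) > thresh)
    return zc

img = np.zeros((32, 32))
img[:, 16:] = 1.0                               # vertical step edge
edges = zero_crossings(convolve2d(img, log_kernel()))
```

Raising `thresh` is the simplest form of the weak-edge rejection the abstract describes; the adaptive version varies the filter parameter itself.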
Remote sensing through stratified random media using pupil-plane interferometry
Ervin Goldfain
We investigate a novel method for the retrieval of an arbitrary amplitude-object which is illuminated from the far-field and sampled through a stratified random medium of unknown statistics. The setup includes two observation paths, a CCD-based imaging system and a multiaperture interferometer placed in a plane conjugate to the entrance pupil of the imaging system. The interferometric baselines are arranged in closed loops to make the closure phase insensitive to random refractive fluctuations. The method may be beneficial to applications such as surveillance, speckle interferometry and biomedical imaging.
Registration method for free-form surfaces
Markus Rieder, Ruediger Dillmann
This paper presents a method for fast surface matching. The algorithm handles all six degrees of freedom and is based on the curvature of a surface. Two surfaces are sampled at discrete points and represented as sets of 3D vertices. The sampling rate is assumed to be at least twice the highest spatial frequency of the surface. Steps in the surface lead to curvature values higher than a threshold; the related vertices are marked and not taken into account in any further calculation. The Gaussian curvature of the two surfaces is computed, and a certain number of feature points are extracted from each surface. These feature points are connected to create triangles, and similar triangles found in both surfaces are compared. If they match, the rotation between the two triangles can be computed. A transformation histogram determines the rotation with the highest probability, and a subsequent displacement calculation specifies the displacement between the triangles with the best likelihood. Only displacements between triangles contributing to the calculated orientation vote for the correct displacement. The exact matching is done by a least-squares optimization procedure considering only the triangles connected with the initial transformation and possessing the same parameters, such as size and form, in both surfaces. The proposed method is applicable to range images without any edges or known reference points, as it is based on features inherent to free-form surfaces.
Image Analysis and Assessment II
Spatial and temporal IR variations of natural background
As the performance of systems for surveillance, reconnaissance, target detection, target recognition and target identification increases, in competition with the increased skill in reducing IR signatures, there has been an increasing demand for analyzing and predicting the spatial properties of targets and backgrounds. The temporal variation of spatial properties, measured as texture, for objects and backgrounds is of vital importance for target detection and for the assessment of signature reduction methods. One important question to be answered is: how does the texture of objects and backgrounds vary as a function of environmental parameters, e.g. weather? If that question could be answered, one important part of the problem of performing signature forecasts could be solved. In an attempt to predict the dependences between spatiotemporal IR signatures and weather parameters, diurnal time series of different texture measures for different areas in a natural background scene have been measured and related to different weather parameters, e.g. incidence, temperature and humidity. Examples of covariations between texture measures and weather parameters are given in the paper.
Multilevel Ising search for human face detection
Kazuhiro Hotta, Masaru Tanaka, Takio Kurita, et al.
In this paper, a multilevel Ising search method for human face detection is proposed to speed up the search. In order to utilize the information obtained from previously searched points, an Ising model is adopted to represent the candidate `face' positions and is combined with a scale-invariant human face detection method. In the face detection, the distance from the mean vector of the `face' class in discriminant space represents the likelihood of a face. By integrating the measured distance into the energy function of the Ising model as the external magnetic field, the search space is narrowed down effectively (the `face' candidates are reduced). By incorporating color information of the face region in the external magnetic field, the `face' candidates can be reduced further. In the multilevel Ising search, face candidates (spins) at different resolutions are represented in a pyramidal structure and a coarse-to-fine strategy is taken. We demonstrate that the proposed multilevel Ising search method can effectively reduce the search space and detect human faces correctly.
Diagonal forms of symmetric convolution matrices for asymmetric multidimensional sequences
Thomas M. Foltz, Byron M. Welsh
This paper presents diagonal forms of matrices representing symmetric convolution which is the underlying form of convolution for discrete trigonometric transforms. Symmetric convolution is identically equivalent to linear convolution for appropriately zero-padded sequences. These diagonal forms provide an alternate derivation of the symmetric convolution-multiplication property of the discrete trigonometric transforms. Derived in this manner, the symmetric convolution-multiplication property extends easily to multiple dimensions, and generalizes to multidimensional asymmetric sequences. The symmetric convolution of multidimensional asymmetric sequences can then be accomplished by taking the product of the trigonometric transforms of the sequences and then applying an inverse transform to the result. An example is given of how this theory can be used for applying a 2D FIR filter with nonlinear phase which models atmospheric turbulence.
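The underpinning of the convolution-multiplication property can be demonstrated numerically: symmetric convolution is circular convolution of the symmetrically extended sequences, and the result is itself symmetric, so its first N samples carry everything. The whole-sample-symmetric (WSWS) choice below is one of the extension types the trigonometric transforms correspond to:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 8
x = rng.standard_normal(N)
h = rng.standard_normal(N)

def wsws_extend(v):
    # Whole-sample symmetric extension: [v0..v(N-1), v(N-2)..v1], period 2N-2.
    return np.concatenate([v, v[-2:0:-1]])

xe, he = wsws_extend(x), wsws_extend(h)

# Circular convolution of the two symmetric extensions via the FFT.
y = np.fft.ifft(np.fft.fft(xe) * np.fft.fft(he)).real

# The result is itself whole-sample symmetric, so its first N samples are
# the symmetric convolution of x and h.
sym_conv = y[:N]
```

A DCT-based implementation evaluates the same product in the transform domain directly, which is what makes the property useful for FIR filtering as in the paper's atmospheric-turbulence example.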
Sandwich snakes: robust active contours
Fernando Augusto Velasco Avalos, Jose Luis Marroquin Zaleta
Snakes are active contours that minimize an energy function. Sandwich snakes are formed by two snakes, one inside and the other outside of the contour being sought. They have the same number of particles, which are connected in one-to-one correspondence. At the global minimum the two snakes have the same position.
Integration of retinal image sequences
In this paper a method for noise reduction in ocular fundus image sequences is described. The eye is the only part of the human body where the capillary network, along with the arterial and venous circulation, can be observed using a noninvasive technique. The study of the retinal vessels is very important both for the study of local pathology (retinal disease) and for the large amount of information it offers on systemic haemodynamics, as in hypertension, arteriosclerosis, and diabetes. In this paper a method for image integration of ocular fundus image sequences is described. The procedure can be divided into two steps: registration and fusion. First we describe an automatic alignment algorithm for the registration of ocular fundus images. In order to enhance vessel structures, we used a spatially oriented bank of filters designed to match the properties of the objects of interest. To evaluate interframe misalignment we adopted a fast cross-correlation algorithm. The performance of the alignment method has been estimated by simulating shifts between image pairs and by using a cross-validation approach. We then propose a temporal integration technique for image sequences so as to compute enhanced pictures of the overall capillary network. Image registration is combined with image enhancement by fusing subsequent frames of the same region. To evaluate the attainable results, the signal-to-noise ratio was estimated before and after integration. Experimental results on synthetic images of vessel-like structures with different kinds of additive Gaussian noise, as well as on real fundus images, are reported.
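The register-then-fuse pipeline can be sketched with an FFT-based cross-correlation for the alignment step and frame averaging for the fusion step. The synthetic noisy frame pair stands in for real fundus images, and integer-pixel shifts are an assumed simplification:

```python
import numpy as np

rng = np.random.default_rng(6)

def register_shift(a, b):
    # Integer (row, col) shift of b relative to a, from the peak of their
    # FFT-based circular cross-correlation (the fast correlation step used
    # for interframe alignment).
    A = np.fft.fft2(a - a.mean())
    B = np.fft.fft2(b - b.mean())
    cc = np.fft.ifft2(np.conj(A) * B).real
    idx = np.unravel_index(np.argmax(cc), cc.shape)
    # Map wrapped peak indices to signed shifts.
    return tuple(int(i) if i <= s // 2 else int(i) - s for i, s in zip(idx, cc.shape))

frame0 = rng.standard_normal((64, 64))
frame1 = np.roll(frame0, shift=(5, -3), axis=(0, 1)) + 0.1 * rng.standard_normal((64, 64))
shift = register_shift(frame0, frame1)

# Align, then fuse by averaging: structure is preserved while noise drops.
aligned = np.roll(frame1, shift=(-shift[0], -shift[1]), axis=(0, 1))
fused = 0.5 * (frame0 + aligned)
```

Averaging K aligned frames reduces uncorrelated noise by a factor of sqrt(K), which is the SNR gain the integration step measures.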
Dual-wavelength imaging for online identification of stem ends and calyxes
James Zhiqing Wen, Yang Tao
Machine vision and image processing techniques have become increasingly important for the fruit industry, especially when applied to quality inspection and defect sorting. However, automating the defect sorting process is still challenging due to the complexity of the process. One of the biggest difficulties in automated machine vision inspection of fruit defects is distinguishing the stem-end (stem cavity) and calyx (bloom bottom) from true defects such as bruises, insect damage, and blemishes. Traditional mechanical, image processing, and structured lighting methods have proved unable to solve this problem due to their limitations in accuracy, speed, and so on. In this paper, a novel method is developed based on dual-wavelength infrared imaging using both near-infrared and mid-infrared cameras. This method enables quick and accurate discrimination between true defects and stem-ends/calyxes. The results obtained are significant for automated apple defect detection and sorting.
Selective visual attention in object recognition and scene analysis
Adilson Gonzaga, Evelina M. de Almeida Neves, Annie France Frere
An important feature of the human visual system is its capacity for selective visual attention. The stimulus that reaches the primate retina is processed in two different cortical pathways: one specialized for object vision (`What') and the other for spatial vision (`Where'). In this way, the visual system is able to recognize objects independently of where they appear in the visual field. There are two major theories to explain human visual attention. According to the Object-Based theory, there is a limit on the number of isolated objects that can be perceived simultaneously, while the Space-Based theory holds that there is a limit on the spatial area from which information can be taken up. This paper deals with the Object-Based theory, which states that analysis of the visual world occurs in two stages. In the pre-attentive stage, the scene is segmented into isolated objects by region growing techniques. Invariant features (moments) are extracted and used as input to an Artificial Neural Network that gives the probable object location (`Where'). In the focal stage, particular objects are analyzed in detail through another neural network that performs the object recognition (`What'). The number of analyzed objects is determined by a top-down process that produces a consistent scene interpretation. Visual attention makes possible the development of more efficient and flexible interfaces between low-level sensory information and high-level processes.
Image and Video Compression I
icon_mobile_dropdown
Wireless digital map coding and communication for PDA (personal digital assistant) applications
Zhidong Yan, C.-C. Jay Kuo
The transmission of multi-scale digital map information via a wireless link of the personal digital assistant (PDA) system is investigated in this work. We consider a digital map representation of the vector model consisting of 29 layers, in which the road layer plays the most important role. Based on street segment length, the road layer is classified into different scales. A multi-scale map database can be constructed by adding the classification information without modifying the original database. Unlike the conventional digital map service, where the retrieved map data is first rendered as a bitmap image, compressed at the server side and then transmitted to the remote client via a wired link, we propose a new approach that can overcome the narrow bandwidth of the wireless channel. The basic idea is to transmit the map drawing commands rather than the rendered bitmap data, by assuming that the PDA has sufficient computational power to render the map at the client side. Preliminary experiments have been done to verify the effectiveness of the proposed scheme. It is demonstrated that acceptable transmission can be achieved through a wireless channel of 8 kbps.
Motion-compensated interpolation for low-bit-rate video quality enhancement
Fast motion-compensated frame interpolation (FMCI) schemes for the decoder of a block-based video codec operating at low bit rates are examined in this paper. The main objective is to improve the video quality by increasing the frame rate without a substantial increase in computational complexity. Two FMCI schemes are proposed, depending on the motion vector mapping strategy: the non-deformable and the deformable block-based FMCI schemes. They provide a trade-off between computational complexity and visual performance. With the proposed schemes, the decoder can perform frame interpolation using motion information received from the encoder. The complexity of FMCI is reduced since no additional motion search in the decoder is needed, as required by standard MCI. It has been observed from experimental results that the visual quality of coded low-bit-rate video is significantly improved at the expense of a small increase in the decoder's complexity.
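The non-deformable variant can be sketched as follows. This is an illustrative reading, not the paper's exact mapping: it assumes each block's decoded motion vector (dy, dx) describes motion from the previous to the current frame, halves it for the temporal midpoint, and averages the two motion-compensated anchors; edge handling is by simple padding.

```python
import numpy as np

def interpolate_midframe(prev, cur, mvs, bs=8):
    """Non-deformable block-based frame interpolation sketch: reuse the
    transmitted per-block motion vectors (no motion search at the decoder)
    and synthesize the temporally middle frame."""
    h, w = prev.shape
    pad = bs
    P = np.pad(prev.astype(float), pad, mode='edge')
    C = np.pad(cur.astype(float), pad, mode='edge')
    mid = np.zeros((h, w))
    for by in range(0, h, bs):
        for bx in range(0, w, bs):
            dy, dx = mvs[by // bs][bx // bs]
            hy, hx = dy // 2, dx // 2      # backward half-vector into prev
            fy, fx = dy - hy, dx - hx      # forward half-vector into cur
            p = P[pad + by - hy: pad + by - hy + bs,
                  pad + bx - hx: pad + bx - hx + bs]
            c = C[pad + by + fy: pad + by + fy + bs,
                  pad + bx + fx: pad + bx + fx + bs]
            mid[by:by + bs, bx:bx + bs] = 0.5 * (p + c)
    return mid
```

The deformable variant described in the abstract would instead warp each block, at higher cost.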
System for guaranteed high-quality video compression
Vaidyanath Mani, Ruifeng Xie, Echyede Cubillo, et al.
High quality video compression is necessary for reduction of transmission bandwidth and in archiving applications. We propose a compression scheme which, depending on the available bandwidth, can vary from lossless to lossy compression, but always with guaranteed quality. In the case of lossless compression, the customer receives the original content without any loss. Even the lower compression ratios obtained with lossless compression can represent significant savings in communication bandwidth. In the case of lossy compression, the maximum error between the recovered and the original video is mathematically bounded, and the amount of compression achieved is a function of the error bounds. Furthermore, errors are statistically independent of the video content, and thus guaranteed not to create any type of artifact. The recovered video therefore has the same quality, visually indistinguishable from the original, at all times and under all motion conditions.
Lossless compression of pseudocolor images
Ziya Arnavut, David Leavitt, Meral Abdulazizoglu
In a pseudo-color (color-mapped) image, pixel values represent indices that point to color values in a look-up table. Well-known linear predictive schemes, such as JPEG and CALIC, perform poorly when used with pseudo-color images, while universal compressors, such as Gzip, Pkzip and Compress, yield better compression gain. Recently, Burrows and Wheeler introduced the Block Sorting Lossless Data Compression Algorithm (BWA), which has received considerable attention. It achieves compression rates as good as context-based methods, such as PPM, but at execution speeds closer to Ziv-Lempel techniques. The BWA algorithm is mainly composed of a block-sorting transformation, known as the Burrows-Wheeler Transformation (BWT), followed by Move-To-Front (MTF) coding. In this paper, we introduce a new block transformation, the Linear Order Transformation (LOT). We delineate its relationship to BWT and show that LOT is faster than the BWT. We then show that when the MTF coder is employed after the LOT, the compression gain obtained for pseudo-color images is better than that of well-known compression techniques such as GIF, JPEG, CALIC, Gzip, LZW (Unix Compress) and the BWA.
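The BWA pipeline that LOT builds on is easy to sketch. The following toy version shows the two stages named in the abstract, BWT followed by MTF; it uses naive rotation sorting rather than a production suffix-sorting implementation, and LOT itself is not reproduced here.

```python
def bwt(s):
    """Burrows-Wheeler transform of `s` (an end marker makes it invertible):
    the last column of the lexicographically sorted rotation matrix."""
    s = s + "\0"
    rots = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rots)

def mtf(s):
    """Move-to-front coding: recently seen symbols get small indices, so the
    runs of like symbols produced by BWT become runs of small numbers that
    a final entropy coder compresses well."""
    table = sorted(set(s))
    out = []
    for ch in s:
        i = table.index(ch)
        out.append(i)
        table.insert(0, table.pop(i))
    return out
```

For example, `mtf(bwt("banana"))` groups the repeated characters of the transform into low-valued indices.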
Very low bit rate DCT coding by spectral similarity analysis
Yung-Gi Wu, Shen-Chuan Tai
Conventional transform coding schemes such as JPEG process the spectral data in a block-by-block manner because of its simple manipulation; nevertheless, this does not consider the similarity between the spectra of different blocks. The proposed method devises a translation function that reorganizes the individual spectrum data to generate global spectra according to their frequency bands. Among these different bands, a high degree of similarity exists. Our algorithm analyzes the similarity of these different spectral bands to reduce the bit rate for transmission or storage. Simulations are carried out on many different natural images to demonstrate that the proposed method can improve performance when compared with other existing transform coding schemes, especially at very low bit rates (below 0.25 bpp).
H.263+ I-frame coding with a hybrid DCT/wavelet transform
At low bit rates, the bit budget for I-frame coding in H.263+ can be too high to be practical. A hybrid DCT/wavelet transform based I-frame coding scheme is proposed in this work as a solution to the rate control problem. This new coder is compatible with the H.263+ bit stream syntax, and aims at an R-D optimized performance with a reasonable amount of computational complexity. By employing fast estimation of the coding efficiency with a rate-distortion model and performing an R-D based rate allocation, the hybrid coding scheme achieves higher coding gain at low bit rates.
Fast synchronization recovery for lossy image transmission with a suffix-rich Huffman code
Te-Chung Yang, C.-C. Jay Kuo
A new entropy codec, which can recover quickly from the loss of synchronization caused by transmission errors, is proposed and applied to wireless image transmission in this research. The entropy codec is designed based on the Huffman code with a careful choice of the assignment of 1's and 0's to each branch of the Huffman tree. The design satisfies the suffix-rich property, i.e. the number of codewords that are suffixes of other codewords is maximized. After the Huffman coding tree is constructed, the source can be coded using the traditional Huffman code. Thus, this coder introduces no overhead and sacrifices no coding efficiency. Statistically, the decoder can automatically recover the lost synchronization with the shortest error propagation length. Experimental results show that fast synchronization recovery reduces quality degradation on the reconstructed image while maintaining the same coding efficiency.
Lossless medical image compression with a hybrid coder
Jing-Dar Way, Po-Yuen Cheng
The volume of medical image data is expected to increase dramatically in the next decade due to the wide use of radiological images for medical diagnosis. The economics of distributing medical images dictate that data compression is essential. While lossy image compression exists, medical images must be recorded and transmitted losslessly before they reach the users, to avoid incorrect diagnoses caused by lost image data. Therefore, a low complexity, high performance lossless compression scheme that can approach the theoretical bound and operate in near real time is needed. In this paper, we propose a hybrid image coder to compress digitized medical images without any data loss. The hybrid coder consists of two key components: an embedded wavelet coder and a lossless run-length coder. In this system, the medical image is first compressed with the lossy wavelet coder, and the residual image between the original and the compressed version is further compressed with the run-length coder. Several optimization schemes have been used in these coders to increase the coding performance. It is shown that the proposed algorithm achieves a higher compression ratio than entropy coders such as arithmetic, Huffman and Lempel-Ziv coders.
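The lossy-plus-residual structure generalizes beyond this paper's specific coders. The following sketch substitutes a crude uniform quantizer for the wavelet coder and zlib for the run-length coder, purely to show that adding the losslessly coded residual back recovers the image bit-exactly; the component choices are illustrative, not the paper's.

```python
import numpy as np
import zlib

def hybrid_lossless(img, q=16):
    """Two-stage hybrid coding sketch: a lossy approximation plus a
    losslessly compressed residual; decoder output is bit-exact."""
    lossy = (img // q) * q                       # stage 1: lossy approximation
    residual = (img - lossy).astype(np.uint8)    # small-range residual
    packed = zlib.compress(residual.tobytes())   # stage 2: lossless residual
    # Decoder side: decompress the residual and add it to the lossy layer.
    res = np.frombuffer(zlib.decompress(packed), np.uint8).reshape(img.shape)
    return lossy + res, packed
```

The residual has far lower entropy than the original image, which is why the second stage compresses well.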
Wavelet packet image coding with optimized zerotree quantization
Kai Yang, Hiroyuki Kudo, Tsuneo Saito
This paper addresses the problems of how to exploit the space and frequency properties of wavelet coefficients, and how to design a wavelet packet coder optimally in the rate-distortion sense. From the localization properties of wavelets, the best quantizer for a wavelet coefficient is expected to match its local characteristics, i.e., to be adaptive in both the space and frequency domains. Previous image coders tended to design quantizers at the band or class level, which limited their performance, as the localization properties of wavelets are then difficult to exploit. In contrast with previous coders, we introduce a new image coding framework in which the compaction properties in the frequency domain are exploited through the selection of wavelet packets, and the compaction properties in the space domain are exploited with tree-structured wavelet representations. For each wavelet coefficient, its model is estimated from the quantized causal neighborhood; therefore, the optimal quantizer is spatially varying and rate sensitive, and the optimization problem is no longer a joint optimization problem as in SFQ-like coders. The simulation results demonstrate that the proposed coding performance is competitive, and often superior to that of state-of-the-art zerotree-based coding schemes.
Implementation Issues
icon_mobile_dropdown
Wavelet TCQ: submission to JPEG-2000
Philip J. Sementilli, Ali Bilgin, James H. Kasner, et al.
The Joint Photographic Experts Group (JPEG) within the ISO international standards organization is defining a new standard for still image compression--JPEG-2000. This paper describes the Wavelet Trellis Coded Quantization (WTCQ) algorithm submitted by SAIC and The University of Arizona to the JPEG-2000 standardization activity. WTCQ is the basis of the current Verification Model being used by JPEG participants to conduct algorithm experiments. The outcomes from these experiments will lead to the ultimate specification of the JPEG-2000 algorithm. Prior to describing WTCQ and its subsequent evolution into the initial JPEG-2000 VM, a brief overview of the objectives of JPEG-2000 and the process by which it is being developed is presented.
Image and Video Compression II
icon_mobile_dropdown
Progressive video coding for noisy channels
We extend the work of Sherwood and Zeger to progressive video coding for noisy channels. By utilizing a 3D extension of the set partitioning in hierarchical trees (SPIHT) algorithm, we cascade the resulting 3D SPIHT video coder with a rate-compatible punctured convolutional channel coder for transmission of video over a binary symmetric channel. Progressive coding is achieved by increasing the target rate of the 3D embedded SPIHT video coder as the channel condition improves. The performance of our proposed coding system is acceptable at low transmission rates and under poor channel conditions. Its low complexity makes it suitable for emerging applications such as video over wireless channels.
Real-time postprocessing technique for compression artifact reduction in low-bit-rate video coding
Mei-Yin Shen, C.-C. Jay Kuo
A computationally efficient postprocessing technique to reduce compression artifacts in low-bit-rate video coding is proposed in this research. We first formulate the artifact reduction problem as a robust estimation problem. Under this framework, the artifact-free image is obtained by minimizing a cost function that accounts for smoothness constraints as well as image fidelity. Instead of using the traditional approach that applies a gradient descent search for optimization, a set of nonlinear filters is proposed to approximate the global minimum, reducing the computational complexity so that real-time postprocessing is possible. We have performed experiments with the H.263 codec and observed that the proposed method is effective in reducing severe blocking and ringing artifacts, while maintaining low complexity and low memory bandwidth.
Watermark design for embedded wavelet image codec
Houng-Jyh Mike Wang, C.-C. Jay Kuo
A new scheme to search for perceptually significant wavelet coefficients for effective digital watermark casting is proposed in this research. An adaptive method is developed to determine significant subbands and select a number of significant coefficients in these subbands. Experimental results show that the cast watermark can be successfully retrieved after various attacks, including signal processing, geometric processing, noise addition, and JPEG and wavelet-based compression.
Spectral-signature-preserving compression of multispectral data
John A. Saghri, M. S. Laghari, A. Boujarwah, et al.
An enhancement to a previously developed Karhunen-Loeve/discrete cosine transform-based multispectral bandwidth compression technique is presented. This enhancement is achieved via the addition of a spectral screening module prior to the spectral decorrelation process. The objective of the spectral screening module is to identify a set of unique spectral signatures in a block of multispectral data to be used in the subsequent spectral decorrelation module. The number of unique signatures found depends on the desired spectral angle separation, irrespective of their frequency of occurrence. This set of unique spectral signatures, instead of the signature of each and every point of the block of data, is used to construct the spectral covariance matrix and the resulting Karhunen-Loeve spectral transformation matrix that spectrally decorrelates the multispectral images. The significance of this modification is that the covariance matrix so constructed is not based on the statistical significance of the individual spectra in the block, but rather on their uniqueness. Without this added spectral screening feature, small objects and ground features would likely be manifested in the low eigenplanes, mixed with all of the noise present in the scene. Since these lower eigenplanes are coded via the subsequent JPEG compression module at a much lower bit rate, the fidelity of these small objects would be severely impacted by the compression-induced error. The addition of the proposed spectral screening module, however, relegates these small objects to the higher eigenplanes and hence greatly enhances the preservation of their fidelity in the compression process. This modification alleviates the need to update the covariance matrix frequently over small sub-blocks, resulting in a reduced overhead bit requirement and a much simpler implementation task.
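The screening step can be sketched as a greedy pass over the pixel spectra. The greedy strategy and the threshold are one plausible reading of "unique signatures separated by a spectral angle", not necessarily the authors' exact procedure; note that a rare signature enters the kept set regardless of how often it occurs, which is the point of the modification.

```python
import numpy as np

def spectral_angle(a, b):
    """Angle in radians between two spectral vectors."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def screen_spectra(pixels, threshold):
    """Greedy spectral screening: keep a spectrum only if its angle to
    every spectrum kept so far exceeds `threshold`. The kept set, rather
    than all pixels, would then feed the covariance estimate."""
    unique = []
    for p in pixels:
        if all(spectral_angle(p, u) > threshold for u in unique):
            unique.append(p)
    return unique
```

A smaller threshold keeps more signatures; the abstract notes the count depends only on the desired angular separation.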
Image partition boundary coding
Paul J. Ausbeck Jr.
This paper introduces two image partition boundary coding models that are composed solely of binary decisions. Because of their simplified decision structure, the models can take advantage of various accelerating schemes for binary arithmetic coding. The number of decisions necessary to describe a partition using either model varies between one and two per pixel location and is proportional to partition complexity. The first model is a binary decomposition of Steve Tate's neighboring edge model. The decomposition employs boundary connectivity constraints to reduce the number of model parameters. The constraints also reduce the number of descriptive decisions to just over one per pixel for typical partitions. A theoretical zero-order entropy bound of 1.6 bits per pixel also results. The second model represents a partition as a sequence of strokes. A stroke consists of one or two three-way chains. Chain termination is accomplished without redundant boundary traversal by using a special termination decision at encounters with previously drawn chains. Chain initiation decisions are also conditioned on previously drawn edge patterns. Chain direction decisions are conditioned via a boundary state machine. The paper compares object-based boundary coding and pixel-based coding, placing the new coders in the latter category. A technique for determining the appropriate application domain of pixel-based codes is developed. The new coding models are placed into context with previous pixel-based work through the development of a new categorization of image partition representations. Four representations are defined: the map coloring, the edge map, the outline map, and the perimeter map. Experiments compare the new methods with other pixel-based methods and with a canonical object-based method.
Comparison of direct methods for restoration of motion-blurred images
Simple filters used to restore blurred images require knowledge of the point spread function (PSF) of the blurring system. Unfortunately, such knowledge is usually not available when the blur is caused by relative motion between the camera and the scene. Various methods addressing this problem have been developed over the last four decades. These methods can be divided into two types: direct methods, in which the restoration is performed in a single step, and indirect methods, in which the restoration is performed by an iterative technique. Direct methods usually require identification of the PSF as a first step, and then use it to restore the blurred image with a simple filter. Recently, a new direct method was developed. Motivated by this development, direct restoration methods (given only a single blurred image) are studied and compared in this paper for a variety of motion types. Criteria such as quality of restoration, sensitivity to noise and computational requirements are considered.
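Once the PSF has been identified, the "simple filter" step of a direct method is typically a frequency-domain inverse such as the Wiener filter. The sketch below shows that classic one-step restoration (not any specific method compared in the paper); the regularization constant `k` is an illustrative stand-in for the noise-to-signal power ratio.

```python
import numpy as np

def wiener_deblur(blurred, psf, k=0.01):
    """One-step Wiener restoration given a known (or identified) PSF.
    `k` regularizes the inverse near zeros of the blur spectrum."""
    H = np.fft.fft2(psf, s=blurred.shape)          # blur frequency response
    W = np.conj(H) / (np.abs(H) ** 2 + k)          # Wiener filter
    return np.fft.ifft2(np.fft.fft2(blurred) * W).real
```

For noiseless data a tiny `k` recovers the image almost exactly; with noise, `k` trades residual blur against noise amplification.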
Image and Video Compression III
icon_mobile_dropdown
Wavelet image compression with optimized perceptual quality
Yung-Kai Lai, C.-C. Jay Kuo
We propose a perceptual image compression method based on the wavelet transform. This method differs from conventional wavelet coding schemes in that Human Visual System characteristics are used in the quantization steps. Rather than the amplitudes of wavelet coefficients, the contrasts at each resolution are coded. The resulting compression scheme is able to distribute the visual error uniformly over the whole image, so that visual artifacts at low bit rates are minimized. Experimental results are given to show the superior visual performance of the new method in comparison with conventional wavelet coders.
Network-friendly video streaming via adaptive LMS bandwidth control
Yon Jun Chung, JongWon Kim, C.-C. Jay Kuo
In this research, we examine the problem of real-time video streaming over the Internet by introducing an adaptive least-mean-squares (LMS) bandwidth controller that adjusts the amount of video data uploaded to the network so that packet loss can be minimized in the face of network congestion. The adaptive LMS bandwidth controller, which resides at the client end, sends a feedback signal to the server indicating the bandwidth that the network can support at a specified packet loss rate. The available bandwidth is continuously updated to track the ever-changing network conditions. Simulation results are provided to demonstrate the superior performance of the proposed LMS bandwidth controller.
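The feedback loop can be illustrated with a toy gradient-style controller in the LMS spirit. This is emphatically not the paper's controller: the update rule, step size `mu`, initial estimate `b0`, and units are all hypothetical, chosen only to show the client nudging its bandwidth estimate against the gap between observed and target packet loss.

```python
def lms_bandwidth(losses, target_loss, b0=500.0, mu=2000.0):
    """Toy adaptive bandwidth controller sketch: for each observed packet
    loss rate, move the bandwidth estimate (kbps) opposite to the loss
    error, as an LMS-style filter would follow its error gradient."""
    b = b0
    feedback = []
    for loss in losses:
        b = max(b - mu * (loss - target_loss), 0.0)  # shrink when loss is high
        feedback.append(b)
    return feedback
```

Sustained loss above the target ramps the advertised bandwidth down; loss below the target lets it climb back up.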
Low-complexity image coding technique using visual patterns
A visual pattern-based image compression technique is presented, in which 4 X 4 image blocks are classified into perceptually significant `shade' and `edge' classes. The proposed technique makes use of neighboring blocks to encode a shade or an edge block by exploiting Human Visual System characteristics. To reduce the correlation present in the shade regions of an image, the mean intensity of a shade block is predicted from the neighboring shade blocks, and the error in the mean is computed. The error mean of a block is then encoded by choosing an appropriate quantizer based on its predicted mean. The quantizer has been designed, after a careful study of the distribution of the error means of shade blocks in test images and based on Weber's law, to maximize the compression ratio without introducing any visible error. Larger shade blocks (8 X 8 and 16 X 16) are also formed by merging adjacent shade blocks, which further reduces the inter-block correlation. An edge block is assumed to contain two uniform intensity regions (low and high intensity) separated by a transition region. Hence, an edge block can be encoded by coding its edge pattern, its low or high intensity, and its gradient. In order to reduce the inter-block correlation, the edge pattern and mean intensity (low or high) are predicted, and the error in the mean intensity is encoded using an appropriate quantizer. This technique therefore achieves higher compression ratios than other visual pattern-based techniques, at very low computational complexity.
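The predict-then-quantize idea for shade-block means can be sketched as follows. A fixed uniform step `q` stands in for the paper's Weber's-law quantizer, and prediction from the left and top reconstructed neighbors is an assumed causal pattern; both are illustrative.

```python
import numpy as np

def encode_shade_means(means, q=8):
    """Sketch of shade-block mean prediction: predict each block mean from
    already-decoded neighbors and send only the quantized prediction error.
    Returns the integer codes and the decoder's reconstruction."""
    h, w = means.shape
    codes = np.zeros((h, w), dtype=int)
    recon = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            preds = []
            if j > 0: preds.append(recon[i, j - 1])   # left neighbor
            if i > 0: preds.append(recon[i - 1, j])   # top neighbor
            pred = np.mean(preds) if preds else 128.0
            codes[i, j] = int(round((means[i, j] - pred) / q))
            recon[i, j] = pred + codes[i, j] * q      # decoder reconstruction
    return codes, recon
```

Because the residual is quantized against the decoder's own reconstruction, the per-block error never exceeds half a quantizer step, and flat regions collapse to runs of zero codes.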
Selection of transforms for better image compression
Nitendra Rajput, R. K. Shevgaonkar
This paper investigates the energy compaction properties of two transforms, namely the DCT and the WHT. We define a parameter called the Activity Index (AI) of an image, which is the ratio of the first-derivative energy in the edges to the total first-derivative energy of the image. The Sobel edge operator is used to obtain the edges in the image, and the first-derivative energy in these edges is computed. It is demonstrated that the activity index provides a good measure of the relative compression the two transforms would give, and can therefore help in choosing the transform for better compression of an image. Computations show that the WHT performs better for images with higher AI (close to 1), whereas if the AI is small (less than 0.5) the DCT's performance is superior. If the AI is close to 0.5, both transforms give more or less the same compression. The algorithm is tested on a variety of natural as well as robotic images.
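The Activity Index can be sketched directly from its definition. The gradient-magnitude threshold used to decide which pixels count as "edges" is an assumed detail (the paper does not state one here), and the brute-force convolution is for clarity rather than speed.

```python
import numpy as np

def activity_index(img, edge_thresh):
    """Activity Index sketch: Sobel gradient energy on edge pixels
    (magnitude above `edge_thresh`) divided by total gradient energy.
    Values near 1 mean derivative energy is concentrated in edges."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    def conv(a, k):
        h, w = a.shape
        out = np.zeros((h - 2, w - 2))
        for i in range(h - 2):
            for j in range(w - 2):
                out[i, j] = np.sum(a[i:i + 3, j:j + 3] * k)
        return out
    gx, gy = conv(img, kx), conv(img, ky)
    energy = gx ** 2 + gy ** 2
    total = energy.sum()
    if total == 0:
        return 0.0
    return energy[np.sqrt(energy) > edge_thresh].sum() / total
```

A sharp step image scores near 1 (favoring the WHT, per the abstract), while a smooth ramp scores near 0 (favoring the DCT).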
Modified block-matching motion estimation algorithm for object-based video coding
Mei-Juan Chen, Pei-Chun Lee, Po-Yuen Cheng
Block-matching motion estimation algorithms (BMAs) are widely used to eliminate temporal redundancy in video coding. BMAs make the implicit assumption that the motion within each block is uniform. This assumption is not always valid when the fixed block size does not match the real objects in an image; the block effect then becomes noticeable and the quality of the prediction suffers. In this paper, a block-classified motion estimation algorithm is presented. The proposed algorithm classifies the frame into stationary and moving object blocks. The object blocks are then adaptively segmented into different regions according to their motion and edge characteristics. The proposed method can estimate the edge blocks accurately. Experimental results show that this scheme has better performance, in terms of both objective and subjective measures, than the full search and variable block-size quadtree segmentation motion estimation algorithms.
Complexity and PSNR comparison of several fast motion estimation algorithms for MPEG-4
Peter M. Kuhn, Georg Diebel, Stephan Herrmann, et al.
A complexity and visual quality analysis of several fast motion estimation (ME) algorithms for the emerging MPEG-4 standard was performed as a basis for HW/SW partitioning in the VLSI implementation of a portable multimedia terminal. While the computational complexity of ME for previously standardized video coding schemes was predictable over time, the support of arbitrarily shaped visual objects (VOs), various coding options within MPEG-4, as well as content-dependent complexity (caused, e.g., by summation truncation for the SAD) now introduce content-dependent (and therefore time-dependent) computational requirements, which cannot be determined analytically. Therefore, a new time-dependent complexity analysis method, based on statistical analysis of the memory access bandwidth and the arithmetic and control instruction counts utilized by a real processor, was developed and applied. Fast ME algorithms can be classified into search area subsampling, pel decimation, feature matching, adaptive hierarchical ME, and simplified distance criteria. Several specific implementations of algorithms belonging to these classes are compared, in terms of complexity and PSNR, to ME algorithms for arbitrarily and rectangularly shaped VOs. It is shown that the average macroblock (MB) computational complexity per arbitrarily shaped P-VOP (video object plane) exhibits significant variation over time for the different motion estimation algorithms. These results indicate that theoretical estimates and the number of MBs per VOP are of limited applicability as approximations of computational complexity over time, which is required, e.g., for average system load specification (in contrast to worst-case specification), for real-time processor task scheduling, and for Quality of Service guarantees for several VOs.
Postprocessing of compressed 3D graphic data by using subdivision
Ka Man Cheang, Jiankun Li, C.-C. Jay Kuo
In this work, we present a postprocessing technique applied to a low-resolution 3D graphic model to obtain a visually more pleasing representation. Our method is an improved version of the Butterfly subdivision scheme developed by Zorin et al. Our main contribution is to exploit the flatness information of local areas of a 3D graphic model for adaptive refinement. Consequently, we can avoid unnecessary subdivision in regions that are relatively flat. The proposed algorithm not only reduces the computational complexity but also saves storage space. With the hierarchical mesh compression method developed by Li and Kuo as the baseline coding method, we show that the postprocessing technique can greatly improve the visual quality of the decoded 3D graphic model.
WCRP: a software development system for efficient wavelet-based image codec design
Yiliang Bao, Houng-Jyh Mike Wang, C.-C. Jay Kuo
A wavelet-based image codec compresses an image in three major steps: discrete wavelet transform, quantization and entropy coding, with many variants of each step. In this research, we present a versatile software development system called the wavelet compression research platform (WCRP). WCRP provides a framework to host components for all compression steps. For each compression stage, multiple components have been developed and are contained in WCRP. They include a selection of floating-point and integer filter sets, different transform strategies, a set of quantizers and two different arithmetic coders. A codec can be easily formed by picking components for the different stages. WCRP provides an excellent tool for testing the performance of various image codec designs. In addition, WCRP is an extensible system, i.e., new components available in the future can be easily incorporated and quickly tested, which makes the development of new algorithms much easier. WCRP has been used in developing a family of new quantization algorithms based on the concept of a Binary Description of multi-level wavelet coding objects. These quantization schemes can serve different applications, such as progressive fidelity coding, lossless coding and low complexity coding. Both the progressive fidelity and the lossless coding performance of our codec are among the best in their class. A codec of low implementational complexity is made possible by our memory-scalable quantization scheme.
Combined compression and denoising of images using vector quantization
Kannan Panchapakesan, Ali Bilgin, David G. Sheppard, et al.
Compression of a noisy source is usually a two-stage problem, involving the operations of estimation (denoising) and quantization. A survey of the literature on this problem reveals that, for the squared error distortion measure, the best possible compression strategy is to subject the noisy source to an optimal estimator followed by an optimal quantizer for the estimate. In this paper, we present a simple but sub-optimal vector quantization (VQ) strategy that combines estimation and compression in one efficient step. The idea is to train a VQ on pairs of noisy and clean images. When presented with a noisy image, our VQ-based system estimates the noise variance and then performs joint denoising and compression. Simulations performed on images corrupted by additive white Gaussian noise show significant denoising at various bit rates. The results also indicate that our system is robust enough to handle a wide range of noise variations, even though it is designed for a particular noise variance.
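The joint-step idea can be illustrated with a toy codebook: quantizing a noisy vector to the nearest codeword trained on clean data both compresses (only the index need be sent) and denoises (the reconstruction is a clean codeword). The k-means training below is a simple stand-in for an LBG-trained VQ, and the paper's noise-variance estimation step is omitted.

```python
import numpy as np

def train_codebook(clean_patches, n_codes, iters=10, seed=0):
    """Toy k-means codebook trained on clean patches (LBG stand-in)."""
    rng = np.random.default_rng(seed)
    cb = clean_patches[rng.choice(len(clean_patches), n_codes, replace=False)]
    for _ in range(iters):
        d = ((clean_patches[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        idx = d.argmin(1)
        for k in range(n_codes):
            if np.any(idx == k):
                cb[k] = clean_patches[idx == k].mean(0)
    return cb

def encode_decode(noisy_patches, cb):
    """Joint compression + denoising: replace each noisy patch by its
    nearest clean codeword; the transmitted index does both jobs."""
    d = ((noisy_patches[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
    return cb[d.argmin(1)]
```

Because the codewords live on the clean-signal manifold, the reconstruction error can be smaller than the noise itself.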
JPEG quantizer optimization for financial document images
Mahmoud R. El-Sakka, Khaled S. Hassanein, Mohamed S. Kamel
In this paper, two schemes for optimizing the JPEG quantizer are investigated. The first scheme starts from a given quantization table and executes a sub-optimal search over the quantization-table parameters. Those parameters, when changed by a pre-specified incremental step size, result in the most favorable move towards decreasing (increasing) the compressed file size with a minimal increase (maximum decrease) in error. This procedure is repeated until a pre-specified file size is reached. The second scheme is based on performing a mapping from the JPEG default quantization table to an optimized one. This mapping adapts the default quantization table to the statistics of the DCT coefficients of the image at hand. The superiority of these two optimized JPEG quantizers is established in terms of the visual quality of their reconstructed images, as determined by comparative visual image quality experiments involving 20 human subjects. These optimized quantizers were also demonstrated to yield a higher level of machine image quality, by improving the accuracy of cheque amount reading applications. Experimental results indicate that the second optimization scheme yields a higher level of visual image quality, while requiring a fraction of the processing time used by the first scheme.
Restoration and Information Extraction I
New least-squares minimum error variance deconvolution method
Mostafa Abdelhakam Ibrahim, Abdel-Wahab F. Hussein, Samia A. Mashali, et al.
We propose to minimize a cost function, which depends on the values of the input signal to a linear time-invariant system, to reach an optimal estimate of this input signal. The cost function is the squared error between the observed output and the convolution of the estimated input with the blurring system. The minimization is carried out with an optimization technique that requires an initial estimate of the input signal; the Van Cittert deconvolution method supplies this initial estimate. A singular value decomposition technique is then used to estimate the improved input signal.
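The Van Cittert iteration used for the initial estimate is the classical fixed-point update f_{k+1} = f_k + beta*(g - h*f_k), starting from the blurred signal itself. A minimal 1-D numpy sketch (our own naming, not the authors' code):

```python
import numpy as np

def van_cittert(g, h, beta=1.0, iters=30):
    """Van Cittert deconvolution: start from the blurred signal g and
    repeatedly add back the residual between g and the current
    estimate re-blurred with the kernel h."""
    f = g.copy()
    for _ in range(iters):
        f = f + beta * (g - np.convolve(f, h, mode='same'))
    return f
```

The iteration converges where the kernel's frequency response satisfies 0 < beta*H(w) < 2, which is why it serves only as an initial estimate to be refined by the SVD step.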
Efficient new superresolution algorithm based on the suppression of ringing artifacts
A new superresolution algorithm is proposed that can provide bandwidth extension of noisy images in a small number of iterations and is potentially capable of real-time operation. Attempts to restore band-limited images frequently introduce ringing artifacts. Methods designed to reduce this ringing often suppress sharp features in the scene. In images with a well-defined background intensity, superresolution techniques involving a positivity constraint are effective in suppressing this ringing and providing a high degree of bandwidth extension. Problems arise, however, in a general image where no such well-defined background exists. The first stage of the algorithm reported here addresses these problems by computing an effective background. Features that need to be enhanced then exist as blurred deviations from this background. The background is computed from the first and second differentials of the image with respect to further blurring. It has been possible to suppress ringing artifacts, and thereby achieve bandwidth extension, by comparing the calculated background with the known original blurred image. An iterative procedure based on Gerchberg's error energy reduction technique has produced good results. Computer calculations applied to both synthetic images and real millimeter-wave images show that the algorithm is effective, efficient, and largely immune to noise.
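Gerchberg's error energy reduction scheme, which the algorithm builds on, alternates between enforcing the measured low-frequency spectrum and a spatial-domain constraint. In this 1-D sketch the constraint is plain positivity, standing in for the paper's computed-background constraint; the function name and mask convention are our own assumptions.

```python
import numpy as np

def gerchberg_extrapolate(g, passband, iters=100):
    """Gerchberg-style bandwidth extrapolation: alternately enforce
    the measured low-frequency spectrum and positivity in the
    spatial domain.  `passband` is a boolean mask of known DFT bins."""
    G = np.fft.fft(g)
    f = g.copy()
    for _ in range(iters):
        F = np.fft.fft(f)
        F[passband] = G[passband]          # keep measured frequencies
        f = np.real(np.fft.ifft(F))
        f = np.maximum(f, 0.0)             # positivity constraint
    return f
```

Both steps are projections onto convex sets containing the true signal, so the restoration error is non-increasing; clipping the negative ringing lobes is what injects energy into the unmeasured high frequencies.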
Wavelet-based restoration for scenes with smooth bases
Viviana Sandor, Stephen K. Park
This paper demonstrates results of wavelet-based restoration for scenes with pixel-scale features and various degrees of smoothness. The model of choice is the so-called C/D/C system model that represents the image acquisition process by accounting for system blur, for the effects of aliasing, and for additive noise. Wavelet domain modeling discretizes both the image acquisition kernel and the representations of scenes and images. In this way the image restoration problem is formulated as a discrete least squares problem in the wavelet domain. The treatment of noise is related to the singular values of the image acquisition kernel. We show that pixel-scale features can be restored exactly in the absence of noise, for various degrees of smoothness. Results are similar in the presence of noise, except for some noise-amplification and ringing artifacts that we control with an automated choice of a restoration parameter. This paper extends work in wavelet-based restoration, and builds on research in C/D/C model-based restoration.
Automated change feature extraction systems in remote sensing
Xiaolong Dai
To enhance the ability of remote sensing systems to provide accurate, timely, and complete geospatial information at regional and/or global scale, automated change detection has been, and will continue to be, one of the important yet challenging problems in remote sensing. This research was designed to evaluate the requirements and develop the techniques for an automated change detection system at the landscape level using various geospatial data, including multisensor remotely sensed imagery and ancillary data layers. These techniques comprise three subsystems: automated computer image understanding, multisource data fusion, and database updating and visualization. This paper summarizes what has been achieved so far in this research. The experiments have focused on three major interrelated components. In the first component, the impact of misregistration on the accuracy of remotely sensed land cover change detection was quantitatively investigated using Landsat Thematic Mapper images. In the second component, a new feature-based approach to automated multitemporal and multisensor image registration was developed. Feature matching is done in both feature space and image space based on moment-invariant distance and chain-code correlation. The characteristic of this technique is that it combines moment-invariant shape descriptors with chain-code correlation to establish correspondences between regions in two images. In the third component, the algorithms for an automated change detection system utilizing neural networks were developed and implemented. This work has implications for improving the efficiency and accuracy of change feature extraction and quantification at all levels of application, ranging from local to global in scale.
Real-time blocking artifact reduction system based on spatially adaptive image restoration
Jeongsang Lee, Yoonsik Choe
This paper proposes an algorithm that removes blocking artifacts in DCT-coded images and can be implemented on a DSP chip. In general, a low-pass filter is used to remove blocking artifacts; however, such a filter also removes information in the high-frequency region and thereby degrades image quality. Conventional image-restoration approaches improve image quality but require too much computation, so they are not suitable for real-time processing. The constrained least squares (CLS) filter is one such approach: it enhances degraded images very well, but it is computationally expensive and sometimes iterative, which again makes it unsuitable for real-time use. This paper therefore recasts the CLS filter in FIR form. To obtain good image quality, blocks are classified by their characteristics and then filtered accordingly. The proposed algorithm removes the blocking artifacts of JPEG, MPEG, H.261, H.263, etc. The algorithm is implemented on a TMS320C31 DSP chip.
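A minimal illustration of boundary-only FIR deblocking in the spirit described above. The tap values and the restriction to 8-pixel block boundaries are generic assumptions for the sketch, not the CLS-derived filter or block classification of the paper.

```python
import numpy as np

def deblock_rows(img, block=8, taps=(0.25, 0.5, 0.25)):
    """Illustrative FIR deblocking: smooth only across vertical block
    boundaries with a short 3-tap filter, leaving block interiors
    (and thus most high-frequency detail) untouched."""
    out = img.astype(float).copy()
    a, b, c = taps
    for j in range(block, img.shape[1], block):
        left = img[:, j - 1].astype(float)
        right = img[:, j].astype(float)
        # filter the two pixels straddling the boundary
        out[:, j - 1] = a * img[:, j - 2] + b * left + c * right
        out[:, j] = a * left + b * right + c * img[:, j + 1]
    return out
```

Restricting the filter support to the boundary pixels is what keeps the computation low enough for a fixed-point DSP while avoiding the blur of a full-frame low-pass filter.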
Improved method for reduction of truncation artifact in magnetic resonance imaging
In Fourier magnetic resonance imaging (MRI), signals from different positions in space are phase-encoded by the application of a gradient before the total signal from the imaged subject is acquired. In practice, a limited number of phase-encoded signals are often acquired in order to minimize the duration of the studies and maintain adequate signal-to-noise ratio. However, this results in incomplete sampling in spatial frequency, or truncation of the k-space data. The truncated data, when Fourier transformed to reconstruct the image, give rise to images degraded by limited resolution and ringing near sharp edges, known as the truncation artifact. A variety of methods have been proposed to reconstruct images with reduced truncation artifact. In this work, we use a regularization method in the context of a Bayesian framework. Unlike approaches that operate on the raw data, the regularization approach is applied directly to the reconstructed image. In this framework, the 2D image is modeled as a random field whose posterior probability, conditioned on the observed image, is represented by the product of the likelihood of the observed data with a prior based on the local spatial structure of the underlying image. Since the truncation artifact appears in only one of the two spatial directions, the use of conventional piecewise-constant constraints may degrade soft edge regions in the other direction that are less affected by the truncation artifact. Here, we consider more elaborate forms of constraints than the conventional piecewise-smoothness constraints, which can capture actual spatial information about the MR images. In order to reduce the computational cost of optimizing non-convex objective functions, we use a deterministic annealing method. Our experimental results indicate that the proposed method not only reduces the truncation artifact, but also improves tissue regularity and boundary definition without degrading soft edge regions.
Edge-preserving MAP estimation of motion vector fields in noisy low-dose x-ray image sequences
Til Aach
We describe a motion-compensated temporally recursive noise reduction technique especially suited to sequences of moving X-ray images, focusing on a robust motion estimator able to deal with the high noise levels in such images. These noise levels are caused by the very low X-ray dose rates used in medical real-time imaging (quantum-limited imaging). The robustness of our motion estimator is achieved by spatiotemporal regularization using a generalized Gauss-Markov random field. Unlike quadratic regularization by Gauss-Markov random fields, generalized Gauss-Markov random fields are able to account for motion edges without the need to explicitly specify a detection threshold. Instead, our model controls edges by a `soft' parameter, which gradually allows the regularization term to behave like a median filter, preserving edges without using detection thresholds.
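The effect of the `soft' parameter can be seen numerically with the usual generalized Gauss-Markov potential |d|^p, 1 <= p <= 2 (the exact potential and parameter values of the paper are not given in the abstract; the numbers below are our own illustration):

```python
import numpy as np

def ggmrf_penalty(d, p):
    """Generalized Gauss-Markov potential |d|**p, 1 <= p <= 2.
    p = 2 is the usual quadratic prior; lowering p toward 1 penalizes
    large (edge-like) differences far less, so motion edges survive
    regularization without an explicit detection threshold."""
    return np.abs(d) ** p

edge = 10.0   # a large inter-pixel difference (a motion edge)
flat = 0.1    # a small difference (noise in a flat region)
# relative cost of an edge versus noise under each prior
quad_ratio = ggmrf_penalty(edge, 2.0) / ggmrf_penalty(flat, 2.0)
soft_ratio = ggmrf_penalty(edge, 1.1) / ggmrf_penalty(flat, 1.1)
```

Under the quadratic prior the edge costs 10,000 times as much as the noise difference, so the estimator smooths it away; with p near 1 the ratio drops by almost two orders of magnitude, which is the median-filter-like behavior described above.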
Texture-mapped 3D face modeling method on network using Java
Hiroyuki Sato, Masaharu Shimanuki, Takao Akatsuka
Texture mapping is frequently used for 3D objects in virtual environments. However, it is not easy to realize such texture mapping in a web browser, because execution of programs loaded over the network is restricted by WWW security. We propose a simple and effective method for mapping an image from a client file onto a 3D model loaded from a web server. All operations can be done in the web browser. The calculations for texture mapping and for the output 3D representation are executed on the client computer, which reduces the load on the web server. To demonstrate the procedure, we use a 3D head model as the target of texture mapping. A user can map his own face picture onto this model and use the generated texture-mapped 3D head model as his `avatar' in virtual space. The method is implemented in Java and JavaScript, and is evaluated on an actual computer system in terms of calculation time.
Automatic signature verification based on the wavelet descriptor
Zijun Yang, C.-C. Jay Kuo
Extracting the relevant features hidden in signatures is an important step in a signature verification system. A `good' feature makes it easier and more efficient to discriminate forged signatures from genuine ones. In this research, a new technique employing wavelet features for automatic signature verification is proposed. Signatures can be characterized by their position information either in the time domain, with a graphic tablet, or in the space domain, with an off-line tracing algorithm. An efficient tracing and representation algorithm for off-line signatures is considered here. First, the dynamic position function of static signatures is recovered by a new tracing algorithm. Then, we develop a hierarchical feature extraction method to represent signatures using the planar wavelet descriptor. The wavelet descriptor has many nice properties, such as multiresolution, invariance, uniqueness, stability, and spatial localization. With the wavelet descriptor, stable, efficient, and representative features are extracted from signatures. Experimental results demonstrate that features described by the wavelet descriptor perform better than Fourier descriptors in the signature verification application. Finally, we perform an extensive system evaluation and show that the wavelet descriptor technique provides significant advantages in feature extraction and information reduction for signature verification.
Setting thresholds in infrared images for the detection of concealed weapons
Mohamed-Adel Slamani, Mark G. Alford, David D. Ferris Jr.
This paper addresses the problem of finding the important thresholds in a scene for the detection of concealed weapon(s). Whenever the weapon's temperature is very close to that of the human body, the intensities in the area of the weapon are close to those of the human body. Thus, in a histogram, the intensities of the weapon area overlap with those of the human body, causing (1) the weapon's intensities not to be identifiable in the histogram of the overall scene, and (2) the weapon not to be visually distinguishable in the scene. This problem is addressed by the mapping procedure of A'SCAPE. The procedure automatically detects all important thresholds in the scene, including those separating regions with overlapping histograms. Real data from an IR scene are used to illustrate the procedure.
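A'SCAPE's mapping procedure is not reproduced in the abstract. As a baseline illustration of histogram-based threshold selection, the classical Otsu criterion can be sketched as follows; note that plain Otsu cannot separate the overlapping modes the paper targets, which is exactly the gap the mapping procedure addresses.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Pick the threshold maximizing between-class variance (Otsu)."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = edges[:-1]
    mu_total = (p * centers).sum()
    best_t, best_var = centers[0], -1.0
    w0, mu0 = 0.0, 0.0          # cumulative weight and first moment
    for i in range(bins):
        w0 += p[i]
        mu0 += p[i] * centers[i]
        w1 = 1.0 - w0
        if w0 < 1e-12 or w1 < 1e-12:
            continue
        var_between = (mu_total * w0 - mu0) ** 2 / (w0 * w1)
        if var_between > best_var:
            best_var, best_t = var_between, centers[i]
    return best_t
```

On a scene with well-separated body and background modes this recovers the obvious threshold; the harder weapon-versus-body case requires the overlap-aware procedure described above.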
Restoration and Information Extraction II
Charting image artifacts in digital image sequences using velocital information content
This paper introduces a metric called Velocital Information Content (VIC), which is used to chart quality variations in digital image sequences. Both spatially based and temporally based artifacts are charted using this single metric. VIC is based on the velocital information in each image. A mathematical formulation for VIC is shown, along with its relation to the spatial and temporal information content. Some strengths and weaknesses of the VIC formulation are discussed. VIC is tested on some standard image sequences with various spatio-temporal attributes. VIC is also tested on a standard image sequence with various degrees of blurring using a linear blurring algorithm. Additionally, VIC is tested using standard sequences that have been processed through a digital transmission algorithm. The transmission algorithm is based on the discrete cosine transform, and thus introduces many of the known digital artifacts such as blocking. Finally, the ability of VIC to chart image artifacts is compared to a few other traditional quality metrics. VIC plays a different role from traditional transmission-based quality metrics, which require two images, the original input image and the degraded output image, to calculate the metric. VIC can detect artifacts from a single image sequence by charting variations from the norm. Therefore, VIC offers a metric for judging the quality of image frames prior to transmission, without a transmission system and without any knowledge of the higher-quality input image. These differences from transmission-oriented quality metrics give VIC a distinct role in analysis and image-sequence processing.
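The VIC formula itself is not given in the abstract, but the spatial and temporal information content it relates to can be sketched with ITU-T P.910-style SI/TI measures. This is an illustration of those components only, not the VIC metric; a simple finite-difference gradient is used here in place of the usual Sobel operator.

```python
import numpy as np

def spatial_info(frame):
    """Spatial information: std of a gradient-magnitude image
    (P.910 uses a Sobel operator; plain differences here)."""
    gy, gx = np.gradient(frame.astype(float))
    return float(np.std(np.hypot(gx, gy)))

def temporal_info(prev, curr):
    """Temporal information: std of the inter-frame difference."""
    return float(np.std(curr.astype(float) - prev.astype(float)))
```

A single-ended metric built on such per-frame quantities can flag blur (SI drops) or frame repetition (TI collapses) without access to the pristine input, which is the no-reference role claimed for VIC above.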
Motion-distorted composite frame restoration
Several imaging systems produce pictures by superimposing the two fields of frames of interlaced sequences. The pictures obtained in this way, termed composite frames, are severely degraded if there is relative motion between the camera and the scene. In the presence of motion, the composite frame is affected by two types of distortion: the edge `staircase effect', due to the fact that objects appear at different positions in successive fields, and motion blur, due to scene motion during each field exposure. Motion-deinterlacing methods previously proposed to correct the `staircase effect' neglect the motion blur. However, the motion blur may be significant, especially in systems designed for low-intensity radiometric imaging, which use long exposures, or even in short-exposure systems mounted on moving vehicles such as tanks, planes, and ships. In this paper we introduce an algorithm for the restoration of both types of distortion in a composite frame degraded by linear uniform motion.
Model-based restoration of vibrated images with geometry-free tomography: in-vessel viewing with the ITER European Home Team's probe
Peter Jakubik
A novel approach to the restoration of strongly vibrated images is described. The technique uses a geometry-free tomography method to formulate the task and the theory of convex projections to restore the image. No information is used for the restoration other than the vibrated image itself and a model of the vibration. It is then shown that the parameters of the vibration model can be easily estimated. The paper emphasizes insight into the problem and describes numerical experiments showing the excellent performance of the method applied to an international project for the design of fusion reactors. The approach can also be applied to other complex tasks where the need for acceptable visual observation exceeds what is technically available or economically justified, as in space robotics, machine hand-eye coordination, vehicle guidance, mid-flight aircraft refueling, etc.
Content-based indexing for medical image databases
Kin-man Cheung, Vincent T. Y. Ng
In large medical image database systems, a content-based indexing structure is often built from image feature vectors to allow fast retrieval of medical images. However, these vectors generally have a high number of dimensions, which results in poor indexing performance. In this paper, we investigate how to improve the search performance of the packed R-tree when its indices are of high dimension. Two new algorithms are designed, differing in how they apply the idea of principal component analysis. The first algorithm performs a dominant-dimension analysis globally and selects the first few dominant dimensions for the packing steps. The same set of dominant dimensions is then used in calculating image similarities. The second algorithm differs from the first by re-applying the analysis at each tree node, thereby obtaining a better set of dominant dimensions for the image data under the sub-tree headed by that node. In developing the second algorithm, we have also considered how to reduce the calculations by utilizing the results of the tree nodes at lower levels. This paper reports the performance of the two algorithms on different data sets. The algorithms are tested with a set of randomly generated images and a real medical image database of about 2,000 MRI scans. In the experiments, we observe better retrieval performance with the second algorithm. Similar results are obtained even when the data are highly randomized.
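The first algorithm's global dominant-dimension selection can be sketched with a small PCA-based routine. This is our own simplified variant (the R-tree packing step itself is omitted, and the function name is an assumption):

```python
import numpy as np

def dominant_dimensions(vectors, k):
    """Rank the original feature dimensions by the variance they carry
    along the leading principal components, and return the top k."""
    X = vectors - vectors.mean(0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(evals)[::-1]
    # weight each dimension by its squared loading on the top components
    loadings = (evecs[:, order[:k]] ** 2 * evals[order[:k]]).sum(1)
    return np.argsort(loadings)[::-1][:k]
```

The second algorithm would re-run this routine on the subset of vectors under each sub-tree, so each node's index uses the dimensions that matter locally.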
Unsupervised segmentation of 3D brain MR images
Chulhee Lee, Shin Huh
In this paper, we propose an algorithm for unsupervised segmentation of 3D sagittal brain MR images. 3D images consist of sequences of 2D images. We start the 3D segmentation from mid-sagittal brain MR images. Once these mid-sagittal images are successfully segmented, we use the resulting images to simplify the processing of the more lateral sagittal slices. In order to segment mid-sagittal brain MR images, we first apply thresholding to obtain binary images. Then we find some landmarks in the binary images. The landmarks and anatomical information are used to preprocess the binary images. The preprocessing includes eliminating small regions and removing the skull, which substantially simplifies the subsequent operations. The strategy is to perform segmentation in the binary image as much as possible and then return to the original gray scale image to solve problematic areas. Once we accomplish the segmentation of the mid-sagittal brain MR image, the segmented brain area is used as a mask for adjacent slices. Experiments show promising results.
Separability measures for error estimation of two normally distributed classes
Joonyong Hong, Chulhee Lee
In pattern classification and remote sensing, the Gaussian ML classifier is most widely used because of its speed and robustness. In this paper, we propose to use two separability measures, the Bhattacharyya distance and divergence, to estimate the classification error of the Gaussian ML classifier. In the proposed method, we try to find an empirical relationship between the separability measures and the classification error. In order to find such a relationship, we generate two classes with normal distributions and compute the separability measures and the classification error between the classes. Although there is an infinite number of possibilities for the two classes, we systematically search the whole mean-covariance space. From this exhaustive search, we are able to estimate the classification error accurately using the Bhattacharyya distance and divergence. It is observed that error estimation using both the Bhattacharyya distance and divergence does not give a significant improvement over error estimation using the Bhattacharyya distance alone.
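For reference, the Bhattacharyya distance between two normal classes, and the associated upper bound on the Bayes error, can be computed directly from the standard formulas (sketched in numpy; the paper's fitted empirical relationship is not reproduced here):

```python
import numpy as np

def bhattacharyya(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two multivariate normals."""
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    c = (cov1 + cov2) / 2.0
    d = mu2 - mu1
    term1 = d @ np.linalg.solve(c, d) / 8.0
    term2 = 0.5 * np.log(np.linalg.det(c) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return term1 + term2

def bayes_error_bound(b, p1=0.5):
    """Bhattacharyya upper bound on the Bayes error:
    P_e <= sqrt(P1*P2) * exp(-B)."""
    return np.sqrt(p1 * (1.0 - p1)) * np.exp(-b)
```

The bound is what makes the distance a natural predictor of classification error; the empirical mapping sought in the paper refines this loose bound into an accurate estimate.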
Superresolution techniques with application to guided munitions
David S. Flynn, Breck A. Sieglinger, Bernard P. Asner Jr., et al.
This paper discusses the implementation and evaluation of several different algorithms for image superresolution (SR). Such processing is of interest in many imaging situations where resolution is limited by range, wavelength, aperture size, detector size, or other physical or practical constraints. A relevant example is the application of improved resolution to passive millimeter wave imaging sensors for munitions systems. In this paper, we refer to superresolution as processing which recovers spatial frequency components of a measured image that are completely suppressed by the image formation process. We demonstrate performance of several iterative algorithms, and discuss several aspects of the implementation and evaluation of SR processing.
Poster Session
Algorithms of digital image processing in the ground aperture synthesis array
Konstantin N. Sviridov, Nicolay D. Belkin, Galina Yu. Sviridova
Image processing algorithms for a ground-based aperture-synthesis array are proposed and studied. Consideration of the `island' nature of the array OTF shows that our modified version of the triple-correlation method is optimal for phase-locking the object's spatial spectrum between the `islands', and an iterative algorithm for unwrapping the locked phases is proposed for phase restoration. Experiments on these technologies for image formation and processing in a ground-based aperture-synthesis array were carried out by physical and computer simulation. The results of the model experiment confirm the efficiency of the proposed technologies for reaching high angular resolution.
Optical/structural machine analysis of a material's microstructure as a basis for formation of constructional strength of details
Eduard I. Ulianov, A. V. Liasnikov, Alexey A. Lavrov, et al.
The major stage in the computer design of the constructional strength of items is an adequate mathematical description of the processes of elastic-plastic deformation and destruction of the material under active loading. The developed method (a method of microstructural measurements with a structural-phenomenological criterion of destruction) is based on a synthesis of the mechanics of solid media with structure, modern metallography, and the physics of plasticity and destruction of materials (synergetics and the theory of fractal structures), making use of integral models. It has enabled the computer design of optimal constructional strength for a series of parts under various loading conditions by controlling the process engineering of their manufacture.
Improved scheme of run-graph encoding for efficient base presentation of line-drawing images
Zao Jiang, Jun'an Hu, Jiren Liu, et al.
The run-graph is an efficient base representation of line-drawing images, first proposed by Monagan and Roosli. It saves a large amount of memory when storing a line-drawing image while preserving all raster information of the original image, and the mapping of the image into a run-graph is bijective. It conveys the topological information of the image by constructing node areas and edge areas. In this paper, we present an improved run-graph scheme, based on practical experience with run-graph representation, that preserves all the efficiencies of the original run-graph. The improvements are mainly in two respects: (1) introducing an incline coefficient for judging short run-lengths, which makes it possible to adjust the skew of the image or change the position of a turning point; the behavior of run-graph generation with the incline coefficient is studied, yielding a simple method for deleting redundant turning points; and (2) defining protruding noise and concave noise, together with methods for deleting them. An experimental example confirms that the given scheme improves the accuracy of the mapping of the topological structure of a line-drawing image into the node and edge areas of the run-graph representation. It provides an efficient information representation for further processing and recognition.
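Run-graph construction starts from the foreground runs on each scanline; extracting them can be sketched as follows (illustrative only — the incline coefficient and the noise-deletion steps of the improved scheme are not shown):

```python
def horizontal_runs(row):
    """Extract (start, length) of foreground runs from one binary row.
    The run-graph's node and edge areas are built from such runs and
    their vertical overlaps between adjacent scanlines."""
    runs, start = [], None
    for i, v in enumerate(row):
        if v and start is None:
            start = i                       # a run begins
        elif not v and start is not None:
            runs.append((start, i - start)) # a run ends
            start = None
    if start is not None:                   # run touching the right edge
        runs.append((start, len(row) - start))
    return runs
```

Short runs, as classified by the incline coefficient described above, are the ones that mark potential turning points or noise.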
Segmentation of textured images using local spatial-frequency representation
Yue Tao
This paper presents an algorithm for the segmentation of textured images. The algorithm uses an unsupervised neural network and the K-means method to classify image pixels based on local spatial-frequency information. The short-time Fourier transform employs a large window function to extract more neighborhood information. The problems introduced by large windows when classifying larger transient regions are also investigated. High segmentation resolution is obtained by a novel image-extrapolation approach and a re-classification procedure for transient areas.
Skeletonization with hollow detection on gray image by gray weighted distance transform
Prabir Bhattacharya, Kai Qian, Siqi Cao, et al.
A skeletonization algorithm is presented that can process non-uniformly distributed gray-scale images with hollows. The algorithm is based on the gray-weighted distance transform. The process includes a preliminary phase that investigates the hollows in the gray-scale image; whether these hollows are treated as topological constraints on the skeleton structure depends on their statistically significant depth. We then extract a skeleton that carries meaningful information for understanding the object in the image. The improved algorithm can overcome possible misinterpretations of complicated images in the extracted skeleton, especially in images with asymmetric hollows and asymmetric features. The algorithm can be executed on a parallel machine, as all operations are local. Some examples are discussed to illustrate the algorithm.
Real-time object matching
Aiming Huang, Zheng Gao, Bin Dai, et al.
Abstract not available.
Comparison of classical and multiscale spatially adaptive filters for the restoration of images degraded by atmospheric turbulence
Christine Bondeau, El-Bay Bourennane
This paper deals with the restoration of images degraded by atmospheric turbulence. Atmospheric turbulence imposes a strong limit on observation over long propagation paths. For standard video frequencies, the image of a distant object observed through turbulence is blurred. The extent of degradation depends on the value of the Fried parameter r0, which characterizes the turbulence strength. Knowing r0 allows us to estimate the turbulence transfer function. The image can then be processed by means of a classical deconvolution filter. We compare the results obtained with the standard Wiener filter and an improved Wiener filter to those obtained with a multiscale spatially adaptive filter. As expected, this last type of filtering gives better results. The degraded test image was obtained by turbulent wavefront simulation.
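The standard Wiener filter referred to above can be sketched in one line of Fourier-domain algebra. Here the noise-to-signal power ratio is folded into a single constant `k`, and an arbitrary kernel `h` stands in for the r0-derived turbulence transfer function:

```python
import numpy as np

def wiener_deconvolve(g, h, k=0.01):
    """Standard Wiener filter in the Fourier domain:
    F = conj(H) / (|H|^2 + k) * G, with k standing in for the
    noise-to-signal power ratio (assumed constant here)."""
    n = len(g)
    H = np.fft.fft(h, n)
    G = np.fft.fft(g)
    F = np.conj(H) / (np.abs(H) ** 2 + k) * G
    return np.real(np.fft.ifft(F))
```

The multiscale spatially adaptive filter compared in the paper effectively lets `k` vary with local image content, which is why it outperforms this globally tuned version.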
Image retrieval based on wavelet vector quantization
Tao Xia, Jingli Zhou, Shengsheng Yu, et al.
In this paper, we describe a new image indexing and retrieval algorithm for large image databases based on wavelet decomposition and vector quantization (VQ). The algorithm characterizes the color variations over the spatial extent of the image in a manner that provides semantically meaningful image comparisons. To speed up retrieval, a two-step procedure is used that first makes a rough comparison based on coarse features, and then refines the search by matching fine feature vectors between the selected images and the query. By adopting these wavelet VQ coding features, images can be compressed and indexed simultaneously, thus decreasing the complexity of database management. To demonstrate the feasibility and practicality of the approach, a prototype system has been developed and tested. Promising results have been obtained in experiments using a database of 15,000 general-purpose images.
Scanning image binarization algorithm for full 3D shape
Abstract not available.
Discriminating a specified digital image from noise process
Nicholas A. Nechval, Konstantin N. Nechval
In this paper, the problem of discriminating a specified signal from a noise process is considered, where the signal is associated with a digital image. In the univariate case it is well known that the one-sided t-test is uniformly most powerful for the null hypothesis against all one-sided alternatives. Such a property does not easily extend to the multivariate case. In the present paper, a test is derived for the hypothesis that the mean of a vector random variable is zero against specified alternatives, when the covariance matrix is unknown. This test depends on the given alternatives and is more powerful than Hotelling's T. The test is invariant to intensity changes in a background of Gaussian noise and achieves a fixed probability of false alarm. Thus, operating in accordance with the local noise situation, the test is adaptive. The properties of the proposed test are investigated when a single alternative is specified.
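The Hotelling statistic that the proposed test is compared against can be computed directly; the paper's alternative-specific test itself is not reproduced here.

```python
import numpy as np

def hotelling_t2(X, mu0):
    """Hotelling's T^2 statistic for H0: mean == mu0, with unknown
    covariance estimated from the sample:
    T^2 = n * (xbar - mu0)' S^{-1} (xbar - mu0)."""
    X = np.asarray(X, float)
    n = X.shape[0]
    xbar = X.mean(0)
    S = np.cov(X, rowvar=False)      # sample covariance
    d = xbar - np.asarray(mu0, float)
    return float(n * d @ np.linalg.solve(S, d))
```

Because T^2 weighs all departures from the null equally, a test tuned to the specified alternatives (as in the paper) can achieve higher power in those directions at the same false-alarm rate.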
Multimodule method for detection of a human face from complex backgrounds
Ming Xu, Takao Akatsuka
In this paper, we propose a multi-module method for detecting human faces against complex backgrounds. Each module utilizes one cue of the human face, such as color, geometric properties, motion, or depth. The facial region is obtained by fusing the parameters detected by each module, and the result is more robust than that of any single module. As a preliminary implementation, we constructed a face detection system using a color module and a geometric module. Although only two modules are realized in our system, the conceptual structure of the proposed multi-module method is thought to be an efficient solution to the face detection problem. The multi-module method not only improves the robustness of detection, but also makes the development of new applications, especially face detection systems, an easy task, since one need only combine extra modules with pre-existing ones. Experimental results show that faces against complex backgrounds can be extracted efficiently with the multi-module method.
Complete automatic target recognition system for real-time human face images
Haisong Liu, Minxian Wu, Guofan Jin, et al.
In this paper, a complete automatic target recognition system for human faces is presented. Input images of human faces are grabbed at standard video rates and recognized in real-time. The database contains 200 face images stored in computer memory in advance. The system includes a feature-point extractor, a normalization algorithm, an encoding module, an optical recognizer based on the hit/miss transform and an incoherent correlator, and a post-processing module. When a real-time face image is picked up by a CCD camera, three reference points, the two internal eye corners and the mouth center, are extracted automatically for normalization. The face image is then normalized to a fixed scale and posture using an affine transform. The normalized image is encoded into an original-complementary composite encoded image and sent to a liquid crystal display panel, used here as a real-time spatial light modulator, to be recognized by the optical recognizer. Experimental results have shown that the system has an accuracy of over 94%. Its tolerance to head turning, nodding, and tilting is 20, 5, and 15 degrees, respectively. It is also tolerant of noise disturbance up to 30% and image loss up to 40%, and even of different expressions such as seriousness, smiling, and astonishment.
Iterative denoising and simultaneously reconstructing algorithm: a new approach for images degraded by multiplicative signal-dependent noise
Images of neutron distributions provided by penumbral imaging are degraded by highly signal-dependent noise, so it is difficult to separate object information from noise and to solve the inverse problem. The Richardson-Lucy restoration algorithm is a well-known method for deconvolving images given the point-spread function of the aperture, but it is very sensitive to noise fluctuations. We present a way to denoise images by a multiresolution method, and then apply a modified version of the Richardson-Lucy algorithm twice, to simultaneously deconvolve and denoise the neutron images.
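As background, the basic Richardson-Lucy iteration that the paper modifies can be sketched in a few lines; this is a plain 1-D version, not the authors' modified variant:

```python
import numpy as np

def richardson_lucy(observed, psf, n_iter=50):
    """Basic 1-D Richardson-Lucy deconvolution.

    observed: blurred signal; psf: known point-spread function
    (normalized to sum 1). Returns the deconvolved estimate.
    """
    psf = psf / psf.sum()
    psf_flipped = psf[::-1]
    estimate = np.full_like(observed, observed.mean(), dtype=float)
    for _ in range(n_iter):
        # Predicted blurred signal under the current estimate.
        predicted = np.convolve(estimate, psf, mode="same")
        ratio = observed / np.maximum(predicted, 1e-12)
        # Multiplicative update preserves non-negativity.
        estimate *= np.convolve(ratio, psf_flipped, mode="same")
    return estimate

# A point source blurred by a 3-tap box PSF is recovered as a sharp peak.
psf = np.array([1.0, 1.0, 1.0])
truth = np.zeros(21); truth[10] = 9.0
blurred = np.convolve(truth, psf / psf.sum(), mode="same")
restored = richardson_lucy(blurred, psf, n_iter=200)
print(int(np.argmax(restored)))   # the peak returns to index 10
```

The sensitivity the abstract mentions is visible in the update: the ratio term amplifies noise fluctuations in `observed`, which is why the authors denoise first.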
Auto-recognition and positioning in the first frame of the target image with an improved projection algorithm
Shizhou Yang, QingYao Luo, LiangZheng Xia, et al.
Automatic recognition and positioning of the target in the first image frame is one of the key components of ATR. An improved projection-positioning and projection-moment recognition method is proposed. It can be used to realize automatic acquisition, positioning, and recognition in the first frame of the target image. The threshold range of image segmentation for positioning and recognition is significantly enlarged compared with the traditional projection positioning method.
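The abstract does not detail the improvement, but the underlying projection positioning idea can be sketched as follows (a minimal version; the threshold and test image are illustrative):

```python
import numpy as np

def projection_locate(img, thresh):
    """Locate a bright target by thresholding row/column projections.

    img: 2-D grayscale array. Intensities are summed along rows and
    columns; the target's bounding box is where the projections
    exceed `thresh`. Returns (top, bottom, left, right), inclusive.
    """
    rows = img.sum(axis=1)
    cols = img.sum(axis=0)
    r = np.where(rows > thresh)[0]
    c = np.where(cols > thresh)[0]
    return int(r[0]), int(r[-1]), int(c[0]), int(c[-1])

img = np.zeros((12, 16))
img[3:7, 5:11] = 1.0                 # synthetic bright target
print(projection_locate(img, 0.5))   # -> (3, 6, 5, 10)
```

The method's sensitivity to the projection threshold is exactly what the proposed improvement addresses: widening the range of thresholds for which the box stays correct.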
Motion stereo based on adaptive correlation matching
Vitaly Kober, Mikhail G. Mozerov, Minsik Park, et al.
A new algorithm to compute precise depth estimates for motion stereo is described. Input data are obtained from a single CCD camera and a moving belt. It is shown that matching among multiple motion stereo images can be carried out effectively by adaptive correlation matching. Experimental results with real stereo images are presented to demonstrate the performance of the algorithm.
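A minimal illustration of correlation matching along a scanline (the fixed-window version, not the authors' adaptive variant) might look like:

```python
import numpy as np

def match_disparity(left, right, x, half=2, max_d=4):
    """Estimate disparity at pixel x by normalized correlation matching.

    The window around x in `left` is compared with windows shifted by
    d pixels in `right`; the shift with the highest normalized
    correlation score wins.
    """
    ref = left[x - half : x + half + 1].astype(float)
    ref -= ref.mean()
    best_d, best_score = 0, -np.inf
    for d in range(max_d + 1):
        lo = x - d - half
        if lo < 0:
            break                      # window would leave the image
        cand = right[lo : lo + 2 * half + 1].astype(float)
        cand -= cand.mean()
        denom = np.linalg.norm(ref) * np.linalg.norm(cand)
        if denom == 0:
            continue
        score = float(ref @ cand) / denom
        if score > best_score:
            best_d, best_score = d, score
    return best_d

# Synthetic scanlines: the right view is the left one shifted by 3 px.
left = np.array([0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0], dtype=float)
right = np.roll(left, -3)
print(match_disparity(left, right, x=5))   # recovers a disparity of 3
```

An adaptive scheme would vary the window shape or weights per pixel; with multiple motion stereo frames, scores from several image pairs can be accumulated before picking the best disparity.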
Vegetation classification method with spectral, spatial, and temporal variability for Landsat/TM imagery
Dikdik Setia Permana, Takanori Nakajima, Tetsuya Yuasa, et al.
A vegetation classification model that takes into account not only the spectral information of the data but also spatial and temporal information is proposed for high-spatial-resolution multispectral scanner data such as Landsat/TM (Thematic Mapper) images. For this purpose, a Markov random field (MRF) model is introduced for the spectral, spatial, and temporal information. The MRF exploits spatial class dependencies between neighboring pixels in an image and temporal class dependencies between temporal sequences. By integrating spectral, spatial, and temporal information in the classification model, classification accuracy is expected to improve. The performance of the proposed model is investigated using actual Landsat/TM temporal images. The experimental results show that the classification accuracy of the proposed model is about 5.09% higher than that of the maximum likelihood method used as a reference. From this experiment, we conclude that the proposed model is useful for the classification of Landsat/TM images.
Fast algorithm of Hoteling basis construction
Vitalij N. Kurashov, Olexandr M. Soloveyko, Yurij S. Musatenko
This paper suggests a new fast algorithm for approximate Hotelling basis construction, which is known to be important in pattern recognition. In contrast to traditional methods, the algorithm permits the classification of high-dimensional signals. The presented method is based on the wavelet transform and is fast. The efficiency of the constructed approximate Hotelling filter is demonstrated. Experimental results are obtained for a texture classification problem. A comparison with recognition using features obtained via the approximate Karhunen-Loeve transform is conducted.
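For reference, the exact Hotelling (Karhunen-Loeve) basis that such fast methods approximate is the set of leading eigenvectors of the sample covariance matrix; the wavelet-based approximation itself is not specified in the abstract. A minimal sketch of the exact construction:

```python
import numpy as np

def hotelling_basis(X, k):
    """Exact Hotelling (Karhunen-Loeve) basis, for reference.

    X: data matrix, one sample per row. Returns the k eigenvectors of
    the sample covariance with the largest eigenvalues, i.e. the
    directions of maximum variance. The cost of this eigendecomposition
    is what fast approximate methods avoid.
    """
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order]

rng = np.random.default_rng(0)
# Samples vary mostly along the first coordinate axis.
X = rng.normal(size=(500, 4)) * np.array([5.0, 1.0, 0.5, 0.1])
B = hotelling_basis(X, 1)
print(int(np.argmax(np.abs(B[:, 0]))))   # dominant direction is axis 0
```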
Classification of wooden plates by visual inspection based on the characteristics of the entropic threshold method
Evandro Luis Linhari Rodrigues, Valentin Obac Roda
This paper presents a new visual inspection method to classify wooden plates used in pencil manufacturing. Wooden plates with darker regions are likely to contain growth rings. Pencils manufactured from these plates are more difficult to sharpen and have a tendency to bend and crack; such plates are therefore classified as inadequate for pencil manufacturing. The proposed method is based on extracting and analyzing features of the wooden plates from gray-level images. The method classifies the plates using the results of an automatic threshold determination based on Shannon's entropy. The method was designed for low computational complexity: the algorithm involves only simple operations such as addition, subtraction, multiplication, and division, which could be implemented in VLSI technology. The wooden plate is mapped into an optimal number of regions. Each region is pre-classified using as features the total entropy, the asymmetry of the total entropy curve, the threshold level found for the region, the ratio between the entropy of the shapes and the background entropy, and the deviation between the shapes' maximum-entropy point and the background's maximum-entropy point. All region information is combined using heuristic decision rules, arriving at a pre-classification stage where the regions are labeled into four classes (A, B, C, and X), with class A being the best. Two decision algorithms were investigated for the final classification: the first is based on a co-occurrence matrix considering only the unidirectional horizontal neighborhood of the regions, and the second on a heuristic information-reduction method considering combinations of the pre-classified regions. The final results obtained by the two algorithms are compared with the classification made by a human expert, demonstrating that the proposed method performs very well.
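The entropic threshold idea can be sketched with a Kapur-style criterion; this is one common entropic formulation, and the paper's exact variant may differ:

```python
import numpy as np

def _class_entropy(q):
    """Shannon entropy of a (renormalized) slice of the histogram."""
    q = q[q > 0]
    q = q / q.sum()
    return -(q * np.log(q)).sum()

def entropy_threshold(gray):
    """Kapur-style maximum-entropy threshold for an 8-bit image.

    Chooses the level t that maximizes the sum of the entropies of
    the background (<= t) and foreground (> t) gray-level
    distributions.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(255):
        if p[: t + 1].sum() == 0 or p[t + 1 :].sum() == 0:
            continue
        total = _class_entropy(p[: t + 1]) + _class_entropy(p[t + 1 :])
        if total > best_h:
            best_t, best_h = t, total
    return best_t

# Synthetic plate: light wood around level 180, a darker region around 60.
rng = np.random.default_rng(1)
plate = rng.normal(180, 6, (64, 64))
plate[20:40, 10:50] = rng.normal(60, 6, (20, 40))
t = entropy_threshold(np.clip(plate, 0, 255).astype(np.uint8))
print(60 < t < 180)   # the threshold falls between the two populations
```

Per-region features such as the total entropy and the position of the maximum-entropy point all derive from the same histogram `p`, which is why the method stays within simple add/multiply operations.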
Application of LUM filters with automatic parameter selection to edge detection
M. Miloslavski, Tae-Sun Choi
This paper examines the use of a nonlinear order-statistic filter, the LUM filter, as a prefilter for gradient edge detectors. The output of this filter is an order statistic from an observation vector. A wide range of characteristics can be achieved with the same filter structure by changing its parameters. Its ability to suppress impulse noise and enhance edges leads to significant improvement in the edge map. An algorithm for automatic filter parameter adjustment is proposed, based on identification of the impulse noise level. The algorithm's efficiency in setting optimal parameters is demonstrated, its accuracy in determining the noise level is evaluated, and its drawbacks are discussed.
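A minimal 1-D LUM smoother illustrates the structure; the window half-width and the order parameter k are the quantities the automatic adjustment would tune, and the values below are illustrative:

```python
import numpy as np

def lum_smooth(signal, half=2, k=2):
    """LUM smoothing filter, 1-D sketch.

    For each window of length N = 2*half + 1, the center sample is
    clipped to the range [k-th smallest, k-th largest] order
    statistic: output = median(x_(k), center, x_(N-k+1)). With k = 1
    the signal passes through unchanged; with k = (N+1)/2 the filter
    becomes the median filter.
    """
    n = 2 * half + 1
    out = signal.astype(float)
    for i in range(half, len(signal) - half):
        w = np.sort(signal[i - half : i + half + 1])
        lower, upper = w[k - 1], w[n - k]
        out[i] = min(max(signal[i], lower), upper)
    return out

# An impulse sitting on a smooth ramp is rejected; the ramp survives.
x = np.arange(12, dtype=float)
x[6] = 100.0
y = lum_smooth(x, half=2, k=2)
print(y[6])   # 8.0: the impulse is clipped toward its neighbors
```

Raising k makes the filter more aggressive against impulses at the cost of edge detail, which is why tying k to the identified impulse noise level is attractive.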
PCA-based active contour model for detection and tracking of the left ventricle in apical echocardiographic sequences
Mehdi Halit, Jean-Paul Dubus
Detecting and tracking the left ventricle in echocardiographic images is a very hard task due to the presence of noise. It is generally done by an expert cardiologist who manually traces the contour for all frames of the sequence representing the whole cardiac cycle. Our aim is to build a computer-aided system for contour detection in order to minimize human intervention. To do this, we make use of deformable templates, which deform to conform to salient image features; a special case is the active contour model (snake). We are interested in applying such models to this kind of image, in particular to apical views. We propose a new model based on a combination of the active contour model and principal component analysis (PCA). Specifically, we add to the energy associated with the model a new term derived from a PCA performed on a training set of contours. This term helps the snake avoid being trapped in wrong positions due to noise. The results obtained are very satisfying, and the snake converges in almost all cases. Comparisons are made with contours manually traced by an expert cardiologist and with the original snake.
Methods for objective evaluation and improvement of text document images
Valery S. Kot, Alexander V. Bondarenko
Any optical character recognition (OCR) system contains a preprocessing unit responsible for image binarization, and the overall recognition rate depends dramatically on the accuracy of this unit. In the case of poor image quality, the user must spend much time finding the best parameters for this unit, while the recognition rate may still remain unsatisfactory. Methods for the objective evaluation and context-sensitive improvement of text document images are therefore required. In this paper, a parameter set is proposed as a tool for integral image description. This compact set allows the optimal image processing sequence to be selected, automatically or semiautomatically, from the basic IP functions. For all tested commercial OCR systems, the proposed methods decrease recognition errors by about 50-60% for text document images of average and poor quality, while requiring less than one minute of additional processing time per page.
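The abstract does not specify the parameter set itself; as an illustration of the kind of binarization unit whose tuning it addresses, a standard Otsu global threshold can be sketched:

```python
import numpy as np

def otsu_threshold(gray):
    """Standard Otsu global threshold, a common binarization unit.

    Picks the level t maximizing the between-class variance of the
    two populations it separates (ink vs. paper).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_v = 0, -1.0
    for t in range(255):
        w0, w1 = p[: t + 1].sum(), p[t + 1 :].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (levels[: t + 1] * p[: t + 1]).sum() / w0
        m1 = (levels[t + 1 :] * p[t + 1 :]).sum() / w1
        v = w0 * w1 * (m0 - m1) ** 2
        if v > best_v:
            best_t, best_v = t, v
    return best_t

# Dark text (level 30) on a light page (level 220).
page = np.full((50, 50), 220, dtype=np.uint8)
page[10:12, 5:45] = 30
t = otsu_threshold(page)
print(t)   # 30: any level between the populations maximizes the
           # criterion, and the first such level is returned
```

On degraded scans the chosen level becomes unreliable, which is where an image-quality parameter set guiding the choice of preprocessing sequence pays off.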
Structure-adaptive evaluation of additive noise level in images
Iryna B. Ivasenko, Roman M. Palenichka
In this paper, the problem of noise evaluation is considered with application to image filtering and segmentation. An underlying structural model of the original image is considered, which describes the shape of image objects or their parts. The distinctive feature of the presented model is the separate modeling of the objects' planar shape and of the image intensity function. For the intensity function of the original image, a piecewise polynomial model of low degree (up to the second) is considered. Noise to be evaluated is then treated in a broad sense, namely as the intensity residuals of the piecewise polynomial modeling. It is also assumed that most pixels satisfy the polynomial model, except for a relatively small number of edge points between homogeneous regions and fine details. A robust noise variance estimator is proposed for images corrupted by outliers, i.e. impulsive noise.
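A sketch of robust noise estimation from model residuals, using a trivial locally constant model and the median absolute deviation; the paper's piecewise-polynomial estimator is more refined, but the robustness mechanism is the same:

```python
import numpy as np

def robust_noise_sigma(img):
    """Robust additive-noise sigma estimate from local model residuals.

    Fits a locally constant model (3x3 mean) and takes the median
    absolute deviation (MAD) of its residuals; 1.4826 * MAD is a
    consistent sigma estimate for Gaussian noise, and the median is
    barely moved by edge pixels or impulsive outliers.
    """
    acc = np.zeros(img.shape)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            acc += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    residual = img - acc / 9.0
    mad = np.median(np.abs(residual - np.median(residual)))
    # The residual of white noise after subtracting the 3x3 mean has
    # variance (8/9) * sigma^2; correct for that factor.
    return 1.4826 * mad * np.sqrt(9.0 / 8.0)

rng = np.random.default_rng(2)
img = rng.normal(0.0, 2.0, (128, 128))
img[::16, ::16] = 100.0                      # impulsive outliers
print(round(robust_noise_sigma(img), 1))     # close to the true sigma of 2
```

A plain standard deviation of the same residuals would be dragged far above 2 by the outliers, which is the failure mode the robust estimator avoids.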
Image compression improvement by prefiltering
An algorithm that improves the image compression ratio by applying low-pass filtering before the compression process is presented. Pre-filtering images prior to encoding removes high frequencies from the original image and thus improves the overall performance of the coder. The image degradation caused by the filter, combined with the nonlinear transformation of a typical compression algorithm, reduces the entropy of the original image, so higher compression ratios can be achieved. The image at the decoder side is reconstructed using a priori knowledge of the degradation filter, by applying decompression followed by inverse filtering. The results of this work show an improvement in compression ratio compared with the original Joint Photographic Experts Group (JPEG) algorithm, with only a small penalty in mean square error. Our algorithm also reduces the blocking effect that exists in the original JPEG algorithm.
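The entropy-reduction principle can be demonstrated without a full JPEG codec; a simple box prefilter (an illustrative stand-in for the paper's low-pass filter) lowers the first-order entropy of a noisy image:

```python
import numpy as np

def entropy_bits(img):
    """First-order Shannon entropy of an 8-bit image, in bits/pixel."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def box_prefilter(img, half=1):
    """Low-pass box prefilter applied before encoding."""
    imgf = img.astype(float)
    acc = np.zeros(img.shape)
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            acc += np.roll(np.roll(imgf, dy, axis=0), dx, axis=1)
    n = (2 * half + 1) ** 2
    return np.clip(acc / n, 0, 255).astype(np.uint8)

rng = np.random.default_rng(3)
noisy = rng.integers(0, 256, (256, 256)).astype(np.uint8)
print(entropy_bits(noisy) > entropy_bits(box_prefilter(noisy)))  # True
```

A lower entropy bounds the achievable bitrate from below, so the coder that follows the prefilter can reach higher compression ratios; the decoder then inverts the known filter to recover sharpness.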
Intelligent data elimination for a rare event application
Rare event applications are characterized by the event of interest being hidden in a large volume of routine data. The key to success in such situations is the development of a cascade of data elimination strategies, such that each stage enriches the probability of finding the event amidst the data retained for further processing. Automated detection of aberrant cells in cervical smear slides is an example of a rare event problem: each slide can amount to 2.5 gigabytes of raw data, and only 1 in 20 slides is abnormal. In this paper we examine the use of template matching, artificial neural networks, integrated optical density, and morphological processing as algorithms for the first data elimination stage. Based on the experience gained, we develop a successful strategy that improves the overall event probability in the retained data from 0.01 initially to 0.87 after the second stage of processing.
Simulated fluid flow in feature enhancement
David Liebowitz, Farzin Aghdasi
This paper describes a technique for enhancing certain features in grayscale images. Of particular interest is the class of objects that are reasonably large in extent but only faintly darker or lighter than the background. An example of such an object is the mandibular canal, which appears in panoramic dental X-ray images; identification of this canal is required in some dental and orthodontal investigations. Traditional image segmentation techniques often fail to detect the full extent of the canal due to the large amount of structural noise in the image. We propose a new method for the enhancement of this class of objects and for the subsequent segmentation task. The flow of a fluid is simulated over the image topology, allowing fluid to settle in local minima and, by application of a difference image, enhancing the visibility of features characterized by a significant spatially distributed local minimum. The procedure is similar in concept to the watershed algorithm, visualizing the movement of fluid over the image surface to draw conclusions about significant local minima. Our approach differs, however, in that it is aimed not at segmenting the image but at enhancing distributed local minima. We consider the flow of fluid from higher-lying to lower-lying areas under gravity, analogous to the rainfall method of filling the catchment basins in watershed segmentation. Various models of flow, based on cooperative networks, are presented and discussed. Postprocessing is applied to reduce the number of false outputs. We demonstrate that our proposed method is more suitable than simple edge detection or the watershed algorithm for the enhancement and segmentation of the mandibular canal.
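The rainfall idea can be sketched in minimal 1-D form; the paper's cooperative-network flow models are more elaborate, but the key property is the same: wide, shallow basins collect more fluid than sharp, narrow pits:

```python
import numpy as np

def rainfall_accumulation(surface):
    """Rainfall simulation: one unit of fluid per pixel flows downhill.

    Each drop follows the lower neighbor (steepest descent in 1-D)
    until it reaches a local minimum; the accumulated volume per
    pixel is returned. Spatially wide minima collect more fluid than
    narrow pits, which is the property used to enhance faint but
    extended features such as the mandibular canal.
    """
    n = len(surface)
    acc = np.zeros(n)
    for start in range(n):
        i = start
        while True:
            best = i
            for j in (i - 1, i + 1):
                if 0 <= j < n and surface[j] < surface[best]:
                    best = j
            if best == i:          # local minimum reached
                acc[i] += 1.0
                break
            i = best
    return acc

# Wide shallow valley (indices 2-6) vs. a sharp one-pixel pit (index 9).
s = np.array([9, 8, 6, 5, 4, 5, 6, 8, 9, 2, 9], dtype=float)
acc = rainfall_accumulation(s)
print(acc)   # 8 units settle in the wide valley, only 3 in the pit
```

Thresholding or scaling the accumulation map then yields the difference-image style enhancement: the broad faint valley responds strongly even though the narrow pit is deeper.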