Proceedings Volume 1605

Visual Communications and Image Processing '91: Visual Communication

View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 1 November 1991
Contents: 14 Sessions, 90 Papers, 0 Presentations
Conference: Visual Communications, '91 1991
Volume Number: 1605

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Hierarchical Image Coding
  • Super High Definition Image Systems
  • Still Image Coding
  • Video Sequence Coding II
  • Visual Communication Hardware
  • Motion Estimation and Motion Analysis
  • Hierarchical Image Decomposition
  • Vector Quantization
  • Entropy Coding
  • 3-D Motion Analysis
  • Video Sequence Coding I
  • Image Transmission and Communication Systems
  • Model Based Image Coding
  • Additional Papers
Hierarchical Image Coding
Some fundamental experiments in subband coding of images
Sven Ole Aase, Tor A. Ramstad
The paper describes some basic experiments for analyzing the origin and character of the noise in subband coders. A new tool introduced in this paper is an ideal filterbank which guarantees that no interband noise contamination will occur. The coding results are compared with a subband coder based on a standard FIR filterbank. It is found that interband contamination does not play any significant role in the subjectively perceived noise in an image subband coder. Further experiments show that the noise in the decoded image, which subjectively manifests itself as ringing noise in smooth areas close to higher activity regions, is due to the thresholding to zero of a large number of small amplitude subband samples, which takes place in the quantization procedure. It can be thoroughly removed by the dithering technique. The article presents some image material demonstrating the cited results.
Super High Definition Image Systems
Acquisition of very high resolution images using stereo cameras
The method of image acquisition is one of the principal problems for handling very high resolution images. This paper presents a new scheme for acquiring high resolution pictures by processing stereo images. The method integrates low resolution images into a high resolution image. Analysis of the achievable passband is given from the point of view of the imager photodetector structure. Compared to an imaging device with equivalent resolution, it may be less sensitive to shot noise which becomes more dominant as the pixel size of an imager is further reduced. Preliminary experiments have shown clear improvements in the high frequencies and image details. In addition, a scheme for further increasing the resolution is mentioned.
Visual factors and image analysis in the encoding of high-quality still images
V. Ralph Algazi, Todd Randall Reed, Gary E. Ford, et al.
In the encoding of high quality images beyond current standards, a reexamination of issues in the representation, processing and encoding problems is needed. The fundamental reason for that change of emphasis is that the image representation, sampling density, color and motion parameters are no longer given by accepted practices or standards and, thus, require study. Some basic issues that should be reconsidered are as follows:
Hierarchical Image Coding
Multiresponse imaging system design for improved resolution
Rachel Alter-Gartenberg, Carl L. Fales, Friedrich O. Huck, et al.
Multiresponse imaging is a process that acquires A images, each with a different optical response, and reassembles them into a single image with an improved resolution that can approach 1/√A times the photodetector-array sampling lattice. Our goals are to optimize the performance of this process in terms of the resolution and fidelity of the restored image and to assess the amount of information required to do so. The theoretical approach is based on the extension of both image-restoration and rate-distortion theories from their traditional realm of signal processing to image processing which includes image gathering and display.
Comparison of directionally based and nondirectionally based subband image coders
Roberto H. Bamberger, Mark J. T. Smith
Many different types of analysis/synthesis decompositions have been investigated for image coding. Some, like those used in traditional transform, pyramid, and subband coders, partition the spatial-frequency spectrum into symmetric bands, i.e., the bands occupy a disjoint region in each of the four frequency plane quadrants [1, 2, 3, 4]. Others have examined related systems that use directionally-based decompositions [5, 6, 7, 8] in which diagonally oriented spatial frequency components are coded. However, the directional resolution in these schemes is often very limited. In this paper, some new exactly reconstructing subband decompositions are presented with directional selectivity. They are examined in the context of subband image coding and compared to directionally invariant subband decompositions in terms of subjective quality.
Still Image Coding
Use of a human visual model in subband coding of color video signal with adaptive chrominance signal vector quantization
Dominique Barba, Jose Hanen
This paper deals with vector quantization of sub-band video color images with a high visual quality. Although many papers have already been published about the use of both techniques for monochrome still images, few of them work with color images. Our work concerns the use of some important human visual system properties to derive an optimized vector quantization of sub-band decomposed TV images. Two major properties are used. First, spatial frequency dependence in the fovea filtering is pushed into a distortion measure for vector quantization. Second, and perhaps most important, the codebook design is performed in the splitting algorithm by incorporating two complementary thresholds in the global evaluation of the distortion for each class of the codebook: a visibility threshold and an annoyance threshold. With these techniques we designed optimized codebooks (with or without classification) which allowed visual vector quantizing of the chrominance component of color video digital signal without any visible impairment and with a high compression ratio.
Video Sequence Coding II
Motion-compensated subsampling of HDTV
Ricardo A. F. Belfor, Reginald L. Lagendijk, Jan Biemond
For an economical introduction of HDTV a substantial data reduction is necessary, while maintaining the high image quality. To this end Sub-Nyquist techniques can be used (MUSE, HD-MAC) for stationary parts of the image, spreading the sampling of one complete frame over several frames and combining these different frames at the receiver. Without special precautions Sub-Nyquist sampling is not possible for moving areas of the image. In this paper a new algorithm will be described for the subsampling of moving parts of a video sequence. The advantage of this new method is that the full spatial resolution can be preserved, while also maintaining the full temporal resolution. To prevent aliasing at certain velocities (critical velocities), the image is divided into a high-pass and a low-pass part prior to subsampling. At the receiver a motion compensated interpolation filter is used to reconstruct the original image.
Hierarchical Image Coding
Three-dimensional subband decompositions for hierarchical video coding
Frank Bosveld, Reginald L. Lagendijk, Jan Biemond
With the introduction of a multitude of video and multimedia services in the context of broadband communication networks, compatibility between various different data compression systems is becoming increasingly important. Significant research efforts have been recently directed towards so-called hierarchical coding schemes that provide compatibility by splitting the source signal into several hierarchical layers. Compatibility is achieved in this way since a receiver selects and decodes only those layers which are relevant for its (fixed resolution) display monitor. In this paper we introduce and investigate two hierarchical spatio-temporal subband decompositions: the so-called ‘full’ and ‘reduced temporal hierarchy’ decompositions. The former supports progressive-scan video signals only while the latter is capable of handling interlaced signals as well. The better frequency discrimination of the ‘full temporal hierarchy’ decomposition is expected to lead to higher coding performances for the interlaced signals but has a higher complexity.
Visual Communication Hardware
Cheops: a modular processor for scalable video coding
V. Michael Bove Jr., John A. Watlington
We describe the Cheops Imaging System, a small, parallel video processing and display system for real time experiments in video coding and interactive image-based applications. The modular structure of the system allows it to evolve to meet new application requirements or to employ new processing chips as they are developed.
Subband video-coding algorithm and its feasibility on a transputer video coder
Sergio C. Brofferio, Elena Marcozzi, Luigi Mori, et al.
This work presents current results of applied research aimed at defining a fixed-quality variable bit-rate coding algorithm suitable to run on workstations used for different visual communication applications in ATM networks. Subband coding has been chosen due to its suitability and effectiveness in coding both fixed and moving images in such an environment: both FIR and IIR filter banks have been designed and tested considering various issues such as their space and frequency characteristics and computational complexity. Adaptive quantization based on statistical properties and subjective relevance of the different subbands has been realized with the use of a marginal analysis procedure. Finally, the feasibility of a Transputer based implementation of the algorithm in a reduced-complexity form is considered.
Motion Estimation and Motion Analysis
Estimation and prediction of object-oriented segmentation for video predictive coding
Sergio C. Brofferio, Domenico Comunale, Stefano Tubaro
The research proposes a simple finite state image model for estimating the video sequence motion field, background memory updating and luminance prediction. The model allows object-oriented segmentation utilizing a fuzzy state (New-Scene) so that state transition diagrams modeling the physical constraints can be defined both for image segmentation and for luminance prediction. Preliminary results for integrated image segmentation and displacement field estimation are presented. The proposed algorithm allows the implementation of video coding schemes where only structural prediction errors have to be transmitted, when physical motion coherence is plausible, and which is suited for completely parallel distributed processing.
Hierarchical Image Decomposition
Statistically optimized PR-QMF design
Hakan Caglar, Yipeng Liu, Ali Naci Akansu
A multivariable optimization problem is set to design 2-band PR-QMFs in this paper. The energy compaction, aliasing energy, step response, zero-mean high-pass filter, uncorrelated subband signals, constrained nonlinearity of the phase-response, and the given input statistics are simultaneously considered in the proposed optimal filter design technique. A set of optimal PR-QMF solutions and their optimization criteria along with their energy compaction performance are given for comparison. This approach of PR-QMF design leads to an input driven adaptive subband filter bank structure. It is expected that these optimal filters outperform the well-known fixed PR-QMFs in the literature for image and video coding applications.
Vector Quantization
Fast finite-state codebook design algorithm for vector quantization
Ruey-Feng Chang, Wen-Tsuen Chen
The Linde-Buzo-Gray (LBG) algorithm is usually used to design a codebook for encoding images in the vector quantization. In each iteration of this algorithm, we must search the full codebook in order to assign the training vectors to their corresponding codewords. Therefore, the LBG algorithm needs large computation effort to obtain a good codebook from the training set. In this paper, we propose a finite-state LBG (FSLBG) algorithm for reducing the computation time. Instead of searching the whole codebook, we search only those codewords that are close to the codeword for a training vector in its previous iteration. In general, the number of these possible codewords can be very small without sacrificing performance. Because of searching only a small part of the codebook, the computation time is reduced. In our experiment, the performance of the FSLBG algorithm in terms of the signal-to-noise ratio is very close to that of the LBG algorithm. However, the computation time of the FSLBG algorithm is only about 10 percent of the time required by the LBG algorithm.
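The restricted search described in this abstract can be illustrated with a minimal sketch. It is not the authors' FSLBG implementation: the function names, the neighbour-table construction, and the parameter `k` (how many nearby codewords are searched) are all assumptions made for illustration. Only the first assignment uses a full search; later iterations search the `k` codewords nearest to each vector's previous codeword.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def neighbor_table(codebook, k):
    """For each codeword, the indices of its k nearest codewords (hypothetical helper)."""
    table = []
    for i, c in enumerate(codebook):
        order = sorted(range(len(codebook)), key=lambda j: sq_dist(c, codebook[j]))
        # Always keep the codeword itself among the candidates.
        table.append(sorted(set(order[:k]) | {i}))
    return table


def restricted_lbg(training, codebook, k=4, iterations=10):
    """LBG-style codebook design with a restricted codeword search (illustrative only)."""
    codebook = [list(c) for c in codebook]
    # Initial assignment: ordinary full search over the codebook.
    assign = [min(range(len(codebook)), key=lambda j: sq_dist(v, codebook[j]))
              for v in training]
    for _ in range(iterations):
        # Centroid update, as in the standard LBG iteration.
        for j in range(len(codebook)):
            members = [v for v, a in zip(training, assign) if a == j]
            if members:
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
        # Restricted search: only codewords close to the previous codeword.
        table = neighbor_table(codebook, k)
        assign = [min(table[a], key=lambda j: sq_dist(v, codebook[j]))
                  for v, a in zip(training, assign)]
    return codebook, assign


training = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
codebook, assign = restricted_lbg(training, [(0, 0), (10, 10)], k=2, iterations=5)
```

With `k` equal to the codebook size this reduces to the ordinary full-search iteration; smaller values trade search effort against the risk of missing a closer codeword.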
Entropy Coding
Coding of motion vectors for motion-compensated predictive/interpolative video coder
Cheng-Tie Chen, Fure-Ching Jeng
Motion vectors are side information vital to motion-compensated predictive/interpolative interframe video coding algorithms. This paper is concerned with the coding efficiency for this motion information. It is demonstrated that coding differential motion vectors is not always advantageous. An adaptive strategy is thus considered to automatically select the most favorable mode to code the motion information. The effects of video noise to motion estimation and coding are also investigated. A simple linear filtering scheme is used to reduce the noise effects. Simulation results show a significant performance improvement for noisy video by using this simple filtering scheme.
Highly efficient entropy coding of multilevel images using a modified arithmetic code
Yan-Ping Chen, Yasuhiko Yasuda
In this paper, we first present a modified multiplication-free arithmetic code that can improve the efficiency of the conventional multiplication-free arithmetic code by increasing the numbers of the estimated probability from 2 to 3 in a power unit range. Then, we describe the application of this modified algorithm to the entropy coding of multi-level images. In our entropy coding scheme, the multi-alphabet symbol string, which is input from the sequence of the zig-zag scanning data obtained by applying the ADCT compression algorithm to the original image, can be directly sent to the encoder, i.e., there is no need to decompose the symbols into a series of binary decisions. Finally, some simulation results will be given to show that good performance can be achieved with our coding scheme.
Probabilistic model for quadtree representation of binary images
Chun-Hsien Chou, Chih-Peng Chu
A quadtree is a compact data structure widely used in many areas for representing binary region data. In this paper, a probabilistic model for quadtree coding of binary images is presented. The binary image to be encoded can be modeled as a first-order Markov process, and the quadtree representation of binary images can be modeled as a branching process. Based on these two mathematical models and the Huffman code, a recursive equation is obtained to estimate the code length for the quadtree representation of a binary image. The simulation results show that, with the measured statistical parameters and a proper value assigned to a dependent parameter, the differences between the bit rates of the theoretical estimation and the experimental results are on average within 5%.
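For readers unfamiliar with the data structure, the following minimal sketch builds a quadtree for a small binary image and counts its nodes. It illustrates only the representation itself, not the probabilistic code-length model of the paper; the function names and the assumption of a square image whose side is a power of two are choices made here for illustration.

```python
def build_quadtree(img, r=0, c=0, size=None):
    """Return 0 or 1 for a uniform block, else a list of four child subtrees.

    Assumes img is a square list of lists of 0/1 whose side is a power of two.
    Children are ordered NW, NE, SW, SE.
    """
    if size is None:
        size = len(img)
    first = img[r][c]
    if all(img[i][j] == first
           for i in range(r, r + size)
           for j in range(c, c + size)):
        return first  # uniform block becomes a leaf
    h = size // 2
    return [build_quadtree(img, r, c, h),
            build_quadtree(img, r, c + h, h),
            build_quadtree(img, r + h, c, h),
            build_quadtree(img, r + h, c + h, h)]


def count_nodes(tree):
    """Count internal nodes and leaves of a quadtree built above."""
    if isinstance(tree, list):
        return 1 + sum(count_nodes(t) for t in tree)
    return 1


image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
tree = build_quadtree(image)
print(tree, count_nodes(tree))
```

Three of the four quadrants of this image are uniform and become leaves; only the lower-right quadrant is split further.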
Visual Communication Hardware
High-speed programmable digitizer for real-time video compression experiments
Norman R. Cox
A flexible digital video simulator has been constructed so that the effects of image coding and compression can be observed and evaluated in real time. This paper describes the design of the simulator's digitizer which includes the analog-to-digital (A/D) converter, the digital-to-analog (D/A) converter, and the programmable clock synthesizer. Although this system is presently being used for processing National Television Systems Committee (NTSC) composite video signals, it can be reconfigured for processing other image transmission formats. Examples include red, green, and blue (RGB) component signals, luminance and chrominance (YIQ) signals, and enhanced-definition television (EDTV) signals.
Motion Estimation and Motion Analysis
Motion field estimation for complex scenes
Johannes Nicolaas Driessen, Jan Biemond
The estimation of motion fields from an image sequence is a difficult problem for scenes containing moving and overlapping objects. In these cases, the problem is to simultaneously estimate the motion field and detect both the motion discontinuity and the uncovered region. We propose a new model that is based on three simple first order coupled Markov Random Fields (MRF’s), where the coupling accounts for the interaction between the three processes. The simultaneous problem is then reformulated as the minimization of an energy function associated with the chosen MRF’s. This is a non-trivial minimization problem due to the binary nature of the motion discontinuities and the uncovered regions. We propose the use of a Deterministic Annealing algorithm, which can be interpreted as an interleaved iterative technique using continuation methods for the binary processes. Experimental results show that such an approach indeed leads to the detection of motion boundaries and uncovered regions, which in turn leads to a more accurate motion field estimate near the motion boundaries as well as the uncovered regions.
3-D Motion Analysis
Estimation of three-dimensional motion in a 3-DTV image sequence
The article deals with 3D scene analysis for coding purposes and is part of general research into 3D television. The method proposed here attempts, through dynamic monocular analysis, to estimate three-dimensional motion and determine the structure of the observed objects. Motion and structure estimation are achieved by means of a differential method. A multipredictor scheme is used to guarantee correct initialisation of the algorithm. Images are segmented according to spatio-temporal criteria, using a hierarchic method based on a quad-tree with overlapping. Segmentation and estimation are performed jointly.
Video Sequence Coding I
Digital video codec for medium bitrate transmission
Touradj Ebrahimi, Frederic Dufaux, Iole Moccagatta, et al.
A digital video codec is presented, using a fast Gabor-like wavelet transform to produce a multiresolution data set. Three different strategies are introduced to code different levels of the resolution in the pyramidal data, according to their visual importance. A hierarchical vector quantization is performed to exploit the inter/intra subbands correlations. A multiresolution block matching is proposed to generate the motion field in the scene. These motion vectors are then used to reduce the temporal redundancies. Simulations show good results in quality for medium bitrate transmission.
Super High Definition Image Systems
Superhigh-definition image processing on a parallel signal processing system
Tetsuro Fujii, Tomoko Sawabe, Naohisa Ohta, et al.
This paper describes a new parallel image processing system called "NOVI-II HiPIPE" that manipulates super high definition (SHD) images. This system consists of 128 processing nodes, 4 I/O nodes for image storage, and 4 I/O nodes for SHD still image display. It provides extremely high computational power and high throughput rates for SHD image processing. This system can transfer image data to the newly developed Super Frame Memory (Super FM) for SHD moving image display via the Ultra-network. A new Vector Processor (VP), which has a peak performance of 100 MFlops, is developed and the performance of an engineering sample version of this chip is checked. One VP will be installed in each processing node as the DSP engine and a total system performance of 12.8 Gflops (peak performance) is expected from NOVI-II HiPIPE. Using this system, various image coding schemes and computer graphics techniques can be easily explored in spite of the huge amount of data that must be treated.
Image Transmission and Communication Systems
Motion video coding for packet-switching networks: an integrated approach
Michael Gilge, Riccardo Gusella
The advantages of packet video, constant image quality, service integration and statistical multiplexing, are overshadowed by packet loss, delay and jitter. By integrating network-control into the image data compression algorithm, the strong interactions between the coder and the network can be exploited and the available network bandwidth can be used best. In order to enable video transmission over today’s networks without reservation or priorities and in the presence of high packet loss rates, congestion avoidance techniques need to be employed. This is achieved through rate and flow control, where feedback from the network is used to adapt coding parameters and vary the output rate. From the coding point of view the network is seen as a data buffer. Analogously to constant bit rate applications, where a controller measures buffer fullness, we attempt to avoid network congestion (eq. buffer overflow) by monitoring the network and adapting the coding parameters in real-time.
Vector Quantization
Image vector quantization with block-adaptive scalar prediction
Smita Gupta, Allen Gersho
A novel method for block coding of images is presented which combines 2-D scalar prediction with vector quantization. The input image is divided into spatially contiguous, non-overlapping square blocks and the inherent spatial nonstationarity is modeled by adaptation of the scalar predictor to each block. A new technique for the quantization of predictor coefficients is introduced which ensures the stability of the resultant inverse prediction error filter. Simulation results show that the predictor adaptation method leads to significantly improved performance compared to a coder with a fixed predictor for a nominal increase in the overall bit rate. When compared with predictive vector quantization, our coder provides higher coding gain and better perceptual quality.
Model Based Image Coding
Human facial motion modeling, analysis, and synthesis for video compression
Thomas S. Huang, Subhash C. Reddy, Kiyoharu Aizawa
We present some preliminary results of our work on model-based compression of video sequences of a person’s face, in the context of teleconferencing and videophone applications. The emphasis is on the difficult and challenging problem of analysis. Algorithms are presented for extracting and tracking key feature points on the face, and for estimating the global and local motion of the head/face.
Video Sequence Coding I
Motion-compensated priority discrete cosine transform coding of image sequences
Serafim N. Efstratiadis, Yunming George Huang, Z. Xiong, et al.
In this paper a new motion compensated (MC) predictive coding method for image sequences is presented. This method utilizes a prediction of the displacement vector field (DVF) in order to produce a MC prediction error which is coded and transmitted using the Discrete Cosine Transform (DCT) and the Partition Priority Coding (PPC) approach. Assuming that two previous frames are available both at the transmitter and the receiver, the DVF corresponding to the previous frame is estimated. Then, based on this estimate a temporal prediction of the DVF at the current frame is obtained. This way there is no need to transmit the DVF. Using the predicted DVF, the MC prediction error is obtained. It is then transformed and the irrelevancy reduction is carried out using the PPC method. According to this method, the transform coefficients are ordered based on their magnitude and, therefore, magnitude and location information is transmitted. Due to the properties of the MC prediction error, PPC proves to be very suitable. The proposed algorithm was experimentally tested on standard video-conferencing image sequences. Significantly improved results were obtained compared to previously reported DCT and Priority DCT based methods without MC in terms of reduced bit-rate and quality of the reconstructed image sequence.
Model Based Image Coding
Color/texture analysis and synthesis for model-based human image coding
Satoshi Ishibashi, Fumio Kishino
This paper proposes an efficient color/texture analysis and synthesis method for the model-based coding of human images. A shade and shadow elimination scheme is also introduced to enhance texture analysis. The input human image is segmented into uniform color regions; color/texture data of each region is analyzed through the first and second order statistics. The synthesis process generates the color/texture of each region from the analyzed data and shading data, which is created from a three dimensional human body model and the new virtual lighting source. The proposed method acquires and encodes the objects’ original color/textures independently of shade and can reconstruct the color/textures while applying any shade effect desired.
Entropy Coding
Compaction of color images with arithmetic coding
Masahiro Iwahashi, Shun-ichi Masuda
In this paper, we will introduce a new compaction method for color images with arithmetic coding and Differential Pulse Code Modulation (DPCM). In the field of lossless coding, entropy coding plays an important role. For example, arithmetic coding and Huffman coding are well-known entropy coding techniques. Arithmetic Coding (AC) is superior in many respects to Huffman coding.
Additional Papers
Hierarchical block motion estimation for video subband coding
Joon-Hyeon Jeon, Cheul-hee Hahm, Jae-Kyoon Kim
In this paper, a new hierarchical motion estimation scheme is proposed for progressive video coding based on subband divisions. It is based on the hierarchical block matching for a pass-band pyramid, where each layer of image except the top-most layer has the pass-band image reconstituted by use of only three highpass subbands. Using this scheme, since the quantization error at a given layer exists only in the image of that layer, the coding performance can be improved by exact motion estimation. We also present a method to encode the residue (error frame) in the pyramid structure. It is found that the hierarchical block motion estimation based on the pass-band layers is better than that based on the base-band layers, especially when the quantization is included through transmission.
Motion Estimation and Motion Analysis
Bayesian approach to segmentation of temporal dynamics in video data
Coleen T. Jones, Ken D. Sauer
We present an algorithm for Bayesian estimation of temporally active spatial regions of video sequences. The algorithm improves the effectiveness of conditional replenishment for video compression in many applications which feature a background/foreground format. For the sake of compatibility with prevalent block-type coders, the binary-valued segmentation is constrained to be constant on square blocks of 8 x 8 or 16 x 16 pixels. Our approach favors connectivity at two levels of scale. The first is at the individual pixel level, where a Gibbs distribution is used for the active pixels in the binary field of supra-threshold interframe differences. The final segmentation also assigns higher probability to patterns of active blocks which are connected, since in general, macroscopic entities are assumed to be many blocks in size. Demonstrations of the advantage of the Bayesian approach are given through simulations with standard sequences.
Hierarchical Image Decomposition
Performance evaluation of subband coding and optimization of its filter coefficients
Jiro Katto, Yasuhiko Yasuda
In this paper, two analytical methods to evaluate coding performance of subband coding are proposed, and optimization of its filter coefficients from the viewpoint of energy compaction property is considered. The first method is based on matrix representation of subband coding in time domain, where the coding gain given by Jayant and Noll is introduced as a performance measure for filter banks with orthogonal property. The second method is based on optimum bit allocation problem for subband coding (multirate filter bank), where the unified coding gain is derived as a new performance measure which can be applied to arbitrary transform techniques. We then try to find filter coefficients which maximize the unified coding gain according to input characteristics. This approach leads to optimization of filter coefficients from the viewpoint of energy compaction property.
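As a point of reference for the first method, the classical coding gain associated with Jayant and Noll for an orthonormal decomposition into equal-rate bands is the ratio of the arithmetic mean to the geometric mean of the band variances. The sketch below computes that classical measure only; it is not the unified coding gain derived in the paper, and the function name and sample variances are made up for illustration.

```python
import math


def coding_gain(variances):
    """Arithmetic mean over geometric mean of the band variances.

    Assumes an orthonormal decomposition into equal-size bands and
    strictly positive variances.
    """
    n = len(variances)
    arithmetic = sum(variances) / n
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return arithmetic / geometric


# Hypothetical band variances: energy concentrated in the first band.
print(coding_gain([4.0, 1.0]))        # arithmetic mean 2.5, geometric mean 2.0
print(coding_gain([1.0, 1.0, 1.0]))   # flat spectrum gives no gain
```

The more unevenly the energy is distributed across bands, the larger the ratio, which is why the measure is used as an indicator of energy compaction.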
Still Image Coding
Variable-blocksize transform coding of four-color printed images
Andre Kaup, Til Aach
In this paper we outline a new approach to transform coding of four-color printed images employing variable blocksizes in connection with quadtree data structures. A new statistical model based decision criterion for guidance of the quadtree decomposition is proposed. This decision criterion avoids drawbacks of simple deviation tests or detail measures and can easily be extended towards multi-color segmentation. To account for the greatly varying local image content found in printed images, a highly adaptive threshold coding concept using an activity/entropy block classification scheme is introduced. Segmentation results as well as average compression ratios for the proposed system will be given.
Hierarchical Image Decomposition
Method to convert image resolution using M-band-extended QMF banks
Masahisa Kawashima, Hideyoshi Tominaga
A new method to convert image resolution using M-band extended QMF is proposed. Due to the parallel structure of the filter banks, the conversion ratio of the proposed method is not constrained to be a power of 2. Moreover, since the proposed method filters signals by convolution, the resolution-converted image it produces is free from the aliasing that occurs with the DCT method. Judging from the experimental results, it is concluded that the resolution-converted image produced by the proposed scheme is superior to that produced by the DCT method.
Subband decomposition procedure for quincunx sampling grids
Chai Wook Kim, Rashid Ansari
Filters with diamond shaped passbands and stopbands are used in image and video processing for several different tasks, one of which is the subband decomposition of signals. This subband decomposition can be carried out in a tree-structured manner by using diamond prefilters followed by downsampling on quincunx grids at each stage. In order to use such a decomposition in hierarchical coding it is desirable to choose the lowpass filter to be a halfband filter so as to limit the amount of aliasing in the low frequency component of the signal at each stage. This paper addresses the design and implementation of a two-channel filter bank for such an application. A special class of one-dimensional (1-D) prototype filters is used to derive the filter banks. The procedure is based on obtaining a halfband filter using a pair of lower order halfband filters. The resulting solutions preserve the exact reconstruction property even when the filter coefficients are quantized, a property which is useful in implementation. Examples of such filter banks are presented.
Vector Quantization
Classified vector quantizer based on minimum-distance partitioning
Dongsik Kim, Sang Uk Lee
In this paper, we describe a new classified vector quantization (CVQ) technique employing the minimum-distance classifier to reduce the encoding complexity required in the full search vector quantization (VQ). The determination of the optimal subcodebook sizes for each class is an important task in the CVQ. However, we propose a CVQ technique, which, with an equal subcodebook size, suboptimally satisfies the optimal CVQ condition described in [4]. In addition, a cluster modifying algorithm, which alleviates the local minimum problem in the clustering algorithm, is proposed to ensure the optimal CVQ condition. The proposed CVQ is a kind of partial search VQ because it requires a search through only one subcodebook. However, simulation results reveal that the performance of the proposed CVQ is almost comparable to that of the full search VQ, while the encoding complexity is only 6.5% of that required in the full search VQ.
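Illustrative sketch (not taken from the paper): the partial search idea can be shown as a two-step lookup, assuming each class is represented by a single centroid and owns its own subcodebook. All names, and the toy codebooks in the usage note, are hypothetical; the paper's clustering and cluster modifying algorithms are not reproduced.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, candidates):
    """Index of the candidate closest to the vector (first one wins ties)."""
    return min(range(len(candidates)),
               key=lambda i: squared_distance(vector, candidates[i]))


def cvq_encode(vector, class_centroids, subcodebooks):
    """Classify by minimum distance, then search only that class's subcodebook."""
    class_id = nearest_index(vector, class_centroids)
    code_id = nearest_index(vector, subcodebooks[class_id])
    return class_id, code_id


def cvq_decode(class_id, code_id, subcodebooks):
    """Look up the reproduction vector for a (class, codeword) index pair."""
    return subcodebooks[class_id][code_id]
```

With centroids `[(0, 0), (10, 10)]` and subcodebooks `[[(0, 0), (1, 1)], [(9, 9), (11, 11)]]`, the vector `(10.6, 10.9)` is assigned to class 1 and then matched against two codewords instead of all four, which is where the reduction in encoding complexity comes from.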
Entropy Coding
Block arithmetic coding of contour images
Kyoil Kim, Jonglak Kim, Taejeong Kim
In this paper, we propose two new contour coding techniques based on the arithmetic code, called block arithmetic coding and pre-processed block arithmetic coding. These techniques are intended for use in region-based image coding, and efficiently take advantage of the characteristics of contour images. Blocks of pels in a contour image are treated as non-binary symbols. The efficiency of the binary arithmetic code is extended to coding of non-binary symbols via a tree-structured arrangement of binary arithmetic codes. The performance of the proposed techniques is compared with that of previously reported techniques, such as chain codes, READ codes, ordering techniques, and binary arithmetic codes. Computer simulations show that the proposed techniques outperform these existing techniques.
Model Based Image Coding
Transmission of the motion of a walker by model-based image coding
Tadahiko Kimoto, Yasuhiko Yasuda
A model-based image coding scheme is applied to a scene of a human walker in order to transmit human motion. In our scheme, the motion of a whole human body is represented as 3-D stick motion. A 3-D human stick model is assumed to walk in a limited way. The structure parameters that form the model and the motion parameters that represent the motion of a walker are defined for the model. These parameters are estimated from an input image sequence by motion analysis. Here, the image sequence is a monocular one obtained with one fixed video camera. To improve the range in which the motion of a walker is represented, three stick models with different degrees of coarseness are used. An algorithm for estimating the parameters of these models in order of coarseness is presented. The stick motion estimated from the experimental scene is shown. Transmission of the estimated parameters is also discussed. Because of the order in which parameters are estimated, the transmission delay in our scheme is at least the period of one step.
Image Transmission and Communication Systems
Variable-bit-rate HDTV coding algorithm for ATM environments for B-ISDN
Taizo Kinoshita, Tomoko Nakahashi, Masaaki Takizawa
A variable bit-rate HDTV coding algorithm based on motion-adaptive DCT is investigated for ATM environments in B-ISDN. Adaptive 2-layered coding, an ATM cell matrix for error correction, and a block interleave for error concealment are proposed to keep picture quality high by compensating for ATM cell loss. A new VLC and a congestion control scheme that restricts peak-rate and average-rate are also proposed for traffic control. The proposed algorithm is shown to reduce the coding bit-rate for HDTV conference applications in ATM environments to 10-30 Mb/s.
Super High Definition Image Systems
Superhigh-definition image communication: an application perspective
Jagdish C. Kohli
Digital imaging systems have been deployed in many scientific and military applications such as space research, aerial photography, weather forecasting and CAD/CAM. More recently, because of price/performance improvement in many technologies, digital imaging systems are finding more applications in many commercial/business environments such as banking, insurance, legal and the health care industry. Image capture, processing, compression, storage, display and transport all constitute important elements of emerging imaging systems. These elements can be assembled to build systems meeting appropriate end-user requirements. During the past three to four years, the investment in imaging systems in the U.S. has increased at an annual rate of over 30 percent, both in the public and private sectors. The business of imaging is relatively new; it can grow manifold in existing markets as well as create new market opportunities.
High-resolution color image coding scheme for office systems
Yutaka Koshi, Setsu Kunitake, Kazuhiro Suzuki, et al.
There is great interest in office systems that adopt image coding to enable high-resolution, high-quality document management. Network-based office systems require two classes of procedures. The first requires a new "internal coding" scheme for integrated image processing and editing operations. The second corresponds to image data interchange between system components through the network and can be implemented using conventional image coding schemes such as JPEG. This paper deals with a new image coding scheme, designed as an internal coding facility, that is based on adaptive block truncation coding. This scheme can perform block-by-block fixed-rate coding and editing within an encoded form. The simulation results show the feasibility of the proposed scheme at compression ratios of 4 to 8.
Characteristic analysis of color information based on (R,G,B)-> (H,V,C) color space transformation
Qing Gan, Makoto M. Miyahara, Kazunori Kotani
Most of the color image processing terminals and workstations available today do not support full color quality, and they describe colors, in the ordinary way, by red, green and blue signals. Colors cannot be reproduced exactly due to limitations imposed by the analog-to-digital conversion. In this paper we investigate the reproduction errors caused by quantization in a space where the coordinates represent the excitation of the three attributes involved in color vision. Further, a procedure to limit the color quantizing error under the condition of "just noticeable difference or very little difference in visual perception" is proposed. It is found that, to acquire a high quality color representation which can express subtle color variety, a quantizing accuracy of more than 42 bits per set of (R,G,B) signals is required.
3-D Motion Analysis
Compact motion representation based on global features for semantic image sequence coding
Claude Labit, Henri Nicolas
For dynamic scene analysis [HOR81],[ADI85],[HAR85], numerous studies have developed methods for deriving qualitative motion features from a dense apparent velocity vector field (optical flow). However, this qualitative information [FRA90] cannot be used for image sequence reconstruction because no explicit reconstruction criterion is introduced within the feature identification process itself.
Video Sequence Coding I
Motion compensation by block matching and vector postprocessing in subband coding of TV signals at 15 Mbit/s
Fabrice Lallauret, Dominique Barba
This work deals with a new coding method for transmitting TV signals at 15 Mbit/s. The coding scheme is based upon sub-band decomposition by Pseudo-QMF filtering and motion compensation. In that context we have studied the influence on the overall system of various strategies in the motion compensation block. More precisely, in the case of motion compensation by block matching, the influence of overlapping between the blocks is considered first. A new non-linear post-processing scheme for the displacement vectors is also presented, which smooths the vector field well and significantly reduces the vector coding bit rate. Finally, various methods of vector coding have been designed and compared.
Motion Estimation and Motion Analysis
Block-adaptive quantization of multiple-frame motion field
Fabio Lavagetto, Riccardo Leonardi
This paper presents a method for the adaptive quantization of the motion field obtained from multiple reference frames. The motion estimation is obtained through a block-matching technique, with the assumptions of pure translational motion and uniform motion within a block. The regular assumption of constant intensity along the motion trajectory in the spatio-temporal path is relaxed. On one hand, this allows for illumination changes between the reference frames and the frame to be motion-compensated. On the other hand, it allows for additional freedom to reduce the prediction error after motion-compensation when the translational and rigid body motion assumptions are violated. The matching function represents the difference for each block of the luminance signal in the frame to be encoded with respect to a linear combination of 2 displaced blocks of the same size in the reference frames. The adaptive quantization mechanism is based on evaluating, on a block basis, the local sensitivity of the displaced frame difference signal to a quantization of the motion field parameters. It is shown that this sensitivity depends only on the reference frame signals, which allows it to be kept below a desired threshold without additional information being sent to the receiver. Simulations are carried out on standard CIF (240x360) source material provided to the ISO/MPEG. Results are discussed to show the improvement with respect to the strategy suggested in the draft recommendation of the ISO MPEG for interactive video at 1.5 Mbps.
Entropy Coding
Construction of efficient variable-length codes with clear synchronizing codewords for digital video applications
Shawmin Lei
Variable-length codes are widely adopted for lossless data compaction in many digital video applications, e.g., videophone and high definition television (HDTV). However, error propagation is still a major concern for practical applications. In general, since there are no explicit word-boundaries in the variable-length coded data stream, a transmission error will cause the succeeding codewords to be decoded erroneously. One way to confine the error propagation is through the periodic use of synchronizing words that have a special bit pattern which can be recognized in the coded bit stream as long as there are no errors in the word itself. We call such codewords "clear" codewords. A basic property of the clear codewords is that they cannot be formed by any concatenation of other codewords. These clear codewords are useful not only for the detection and confinement of errors but also for the multiplexing and demultiplexing of multiple variable-length coded bit streams. The construction of efficient variable-length codes with clear codewords is an interesting and important issue for many practical applications.
Hierarchical Image Decomposition
Design of parallel multiresolution filter banks by simulated annealing
Wei Li, Andrea Basso, Ashok Chhabedia Popat, et al.
A new simulated annealing technique is described for designing multiresolution, parallel-structured filter banks with finite wordlength coefficients. The algorithm is based on a two stage optimization. In the first stage, a perturbation following uniform distribution is used; while in the second stage, perturbation is in Gaussian distribution, allowing a faster convergence. The filter banks have properties which make them well suited to image subband coding applications. In particular, they provide joint localization in the spatial and spectral domains, and allow no leakage of DC into higher subbands. This paper describes the algorithm and presents design examples.
Video Sequence Coding II
Lapped orthogonal transform for motion-compensated video compression
William E. Lynch, Amy R. Reibman
Two common techniques for compressing digital video are motion-compensated interframe coding and transform coding. In this paper, we combine the Lapped Orthogonal Transform (LOT) with motion compensated (MC) temporal prediction. The statistical structure of the MC frame differences is distinct from that of still images. Therefore, we introduce a new lapped transform (the straddle LOT (SLOT)) which considers the structure of the MC frame differences. The combination of the conventional LOT with the SLOT, when applied to the entire image, forms the motion-compensated LOT (MCLOT). We demonstrate, both theoretically and on actual images, that the MCLOT outperforms the conventional LOT when applied to MC frame differences.
Hierarchical Image Decomposition
Signal extension and noncausal filtering for subband coding of images
Stephen A. Martucci
This paper addresses two issues of concern when implementing filter banks for subband coding of images: filter causality and filtering of the finite-length rows and columns of the image. A new framework for analyzing the effect of noncausal analysis filtering on the aliasing cancellation properties of the two-band filter bank system is presented. A general solution is given to the problem of how to extend finite-length signals before filtering so that expansion of the subband signal size can be avoided but perfect reconstruction remains. It will be shown how the methods of circular convolution and symmetric extension provide this solution. When using the symmetric extension method, how the signal is extended and the amount of delay the analysis filters may impart are highly constrained. Both even-length and odd-length FIR filters may be used, but extension is different in the two cases.
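Illustrative sketch (not taken from the paper): symmetric extension mirrors a finite-length row or column at its ends before filtering. Two mirror conventions are in common use, shown below under hypothetical function names. The abstract states only that the extension differs for even-length and odd-length filters; which convention pairs with which filter length, and the constraints on filter delay, are not reproduced here.

```python
def extend_half_sample(signal, pad):
    """Mirror about the point between samples: ... b a | a b c ... (end sample repeated)."""
    n = len(signal)
    if not 0 <= pad <= n:
        raise ValueError("pad must be between 0 and len(signal)")
    left = signal[:pad][::-1]
    right = signal[n - pad:][::-1]
    return left + signal + right


def extend_whole_sample(signal, pad):
    """Mirror about the end sample itself: ... c b | a b c ... (end sample not repeated)."""
    n = len(signal)
    if not 0 <= pad <= n - 1:
        raise ValueError("pad must be between 0 and len(signal) - 1")
    left = signal[1:pad + 1][::-1]
    right = signal[n - pad - 1:n - 1][::-1]
    return left + signal + right
```

For the list `[1, 2, 3, 4]` with `pad=2`, the first function returns `[2, 1, 1, 2, 3, 4, 4, 3]` and the second returns `[3, 2, 1, 2, 3, 4, 3, 2]`. Because the extended samples are copies of existing ones, they carry no new information, which is why the subband signals need not grow in size.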
Visual Communication Hardware
High-speed hardware architecture for high-definition videotex system
Mitsuru Maruyama, Hiroaki Sakamoto, Yutaka Ishibashi, et al.
An experimental high-definition videotex system for broadband ISDN has been developed, and this paper introduces high-speed hardware architecture for this system. Key technologies required are high-speed protocol processing, high-speed data transfer, and high-speed picture readout. High-speed protocol processing (using a newly developed virtual memory copy, contents rearrangement memory, two-bus architecture, and simultaneous editing and analyzing) allows a requested 6-MB picture to be displayed within 3 seconds.
Video Sequence Coding I
45-Mbps multichannel TV coding system
Shuichi Matsumoto, Takahiro Hamada, Masahiro Saito, et al.
A new digital coding system is presented, which makes it possible to transmit up to 4 NTSC TV programs simultaneously over a single DS3 45 Mbps link including two high quality sound channels and one 64 kbps ancillary data channel. The principal bit-reduction technology employed is two-dimensional intraframe WHT (Walsh Hadamard Transform) coding with an advanced adaptive quantization reflecting human visual perception. The hardware has been made as compact as a home-use VTR.
Still Image Coding
Efficient odd max quantizer for use in transform image coding
Neal A. Hauser, Harvey B. Mitchell
When an image is compressed using the cosine transform algorithm it is traditional to separately quantize the coefficients with an (Even) Max Quantizer. The Odd Max Quantizer is generally used at high compression ratios since although it is less efficient it is visually more pleasing. Using a very simple coding scheme, we show how the intrinsic inefficiency of the Odd Max Quantizer may be virtually eliminated.
Image coding using adaptive-blocksize Princen-Bradley transform
Takashi Mochizuki, Mitsuharu Yano, Takao Nishitani
This paper describes the extension of the Princen-Bradley transform [1,2], or MDCT, to a variable blocksize case and its application to image coding. An adaptive window scheme is also introduced and the window shape selection mechanism is carefully considered. By introducing an adaptive blocksize and adaptive window scheme into MDCT image coding, the coding efficiency has been improved while maintaining the same high subjective image quality of MDCT coding. General conditions for perfect signal reconstruction of adaptive blocksize MDCT are mathematically derived in the appendix.
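Illustrative sketch (not taken from the paper): for a fixed blocksize, perfect reconstruction with the MDCT can be checked numerically. The code assumes a sine window, which satisfies the Princen-Bradley condition w[n]^2 + w[n+N]^2 = 1, 50% overlapped blocks, and a signal length that is a multiple of the hop size N. The function names are hypothetical, and the adaptive blocksize and adaptive window schemes of the paper are not reproduced.

```python
import math


def sine_window(n_half):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    size = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def mdct(block, window):
    """Windowed forward MDCT: 2N input samples -> N coefficients."""
    n_half = len(block) // 2
    shift = 0.5 + n_half / 2.0
    return [
        sum(window[n] * block[n]
            * math.cos(math.pi / n_half * (n + shift) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]


def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N time-aliased samples."""
    n_half = len(coeffs)
    shift = 0.5 + n_half / 2.0
    return [
        (2.0 / n_half) * window[n]
        * sum(coeffs[k] * math.cos(math.pi / n_half * (n + shift) * (k + 0.5))
              for k in range(n_half))
        for n in range(2 * n_half)
    ]


def analyze_synthesize(signal, n_half):
    """Transform overlapping blocks, invert them, and overlap-add the results."""
    if n_half < 1 or len(signal) % n_half != 0:
        raise ValueError("signal length must be a multiple of n_half")
    window = sine_window(n_half)
    # Zero-pad one hop on each side so every real sample lies in two blocks.
    padded = [0.0] * n_half + [float(v) for v in signal] + [0.0] * n_half
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - n_half, n_half):
        block = padded[start:start + 2 * n_half]
        restored = imdct(mdct(block, window), window)
        for offset, value in enumerate(restored):
            output[start + offset] += value  # aliasing cancels in the overlap
    return output[n_half:len(padded) - n_half]
```

Each block on its own yields an aliased result; the aliasing terms of neighboring blocks have opposite signs and cancel in the overlap-add, so `analyze_synthesize` returns the input up to floating-point rounding.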
Hierarchical Image Coding
Edge-based subband image coding technique for encoding the upper-frequency bands
Nader Mohsenian, Nasser M. Nasrabadi
A subband coding (SBC) scheme, exploiting the dependencies that exist across the bands, is developed which can effectively compress the upper frequency bands of the decomposed image. An edge-detecting filter is incorporated into the SBC model that extracts the edge-pixels in the base-band, and identifies the locations of their counterparts in the upper-bands accordingly without transmission of any overhead information. A memoryless vector quantizing (VQ) scheme was used to encode the base-band. Once the edge-detected reconstructed base-band became available, another VQ coding scheme was employed to encode the upper frequency components across the bands in conjunction with the error derived due to the decoded base-band, only at edge locations. High quality 512 x 512 images were reconstructed at approximately 0.7 bits/pixel.
Still Image Coding
Overlapping block transform for offset-sampled image compression
Yoshitaka Morikawa, Nobumoto Yamane, Hiroshi Hamada
In this paper, an overlapping block transform coding for offset-sampled image compression, called OS-OBT, is proposed. First, OS-OBT is developed from the theory of frequency analysis/synthesis. Second, a fast algorithm for OS-OBT is devised, which is twice as fast as the usual DCT. Finally, simulations demonstrate that the coding performance of OS-OBT is superior to that of the DCT and that block artifacts decrease with OS-OBT.
Image Transmission and Communication Systems
Experimental system using an interactive drawing input method
Yasutada Nagano, Hideaki Kanechika, Satoshi Tanaka, et al.
Entering paper-based drawings into a computer using digitizing tablets is laborious and costly, and many drawings cannot be recognized by conventional automatic drawing input systems. To cope with these drawings, we are developing an interactive drawing input system that assists manual digitizing with automatic recognition technology. The operator has only to specify the kind of figure and click some points on the drawing to indicate the approximate position and shape of the figure, and the system immediately recognizes the figure's exact position and shape.
3-D Motion Analysis
Three-dimensional motion analysis and structure recovering by multistage Hough transform
Shigeyoshi Nakajima, Mingyong Zhou, Hiromitsu Hama, et al.
A new approach for the detection of motions of three-dimensional rigid bodies from two-dimensional images is presented. The approach is based on two main stages. In the first stage, the positions and velocities of feature points are detected from two-dimensional images. We state the idea of this stage in this paper but omit the details. In the second stage, the rotation and the translation velocity of each body are detected from the positions and velocities of the set of feature points. We employ the Hough transform method in both stages. We describe the details of the second stage and the way computation is reduced in the Hough transform. The effectiveness of our method is confirmed through computer experiments.
Entropy Coding
Study of binary image compression using universal coding
Yasuhiko Nakano, Hirotaka Chiba, Yoshiyuki Okada, et al.
This paper presents a new approach to binary image compression by using universal coding for various kinds of binary images, such as line-drawings and half-tones. We studied two types of preprocessing of universal coding for binary images, and found that for various kinds of line-drawings and half-tone (screen-dot) images, both preprocessors outperformed conventional schemes.
Motion Estimation and Motion Analysis
Iterative motion estimation method using triangular patches for motion compensation
Yuichiro Nakaya, Hiroshi Harashima
In order to overcome the drawback of the conventional block-based motion compensation, a new triangle-based method which utilizes triangular patches instead of blocks has recently been proposed. Compared to conventional methods which represent the motion of scene objects by translational displacements of blocks, the new method can cope with a wider range of motions since it allows for rotation and deformation of the triangular patches. In the block-based motion compensation, a simple local minimization algorithm (i.e. block matching) is applied to obtain the displacement vector of each block. However it is inappropriate to apply this algorithm in the triangle-based motion compensation because of the complicated linkage between the deformation of the triangular patches and the displacements of the grid points (vertices of triangles). Consequently, the primary issue is to find an optimal way to estimate motion of the grid points. In this paper, we present a new motion estimation algorithm called hexagonal matching which iteratively refines the estimated displacement vectors. Simulation results show that the motion estimation algorithm produces less prediction error than the previously proposed triangle-based method or the block-based method. We also propose another algorithm with similar function but less computational complexity.
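Illustrative sketch (not taken from the paper): the conventional block matching that the abstract uses as its baseline can be written as an exhaustive local search. The code assumes frames stored as lists of rows, a sum-of-absolute-differences cost, and a block that lies inside the frame; the function names are hypothetical. The hexagonal matching algorithm proposed in the paper is not reproduced.

```python
def block_sad(current, reference, top, left, dy, dx, size):
    """Sum of absolute differences between a block and its displaced candidate."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(current[top + r][left + c]
                         - reference[top + dy + r][left + dx + c])
    return total


def block_match(current, reference, top, left, size, search):
    """Exhaustive search over displacements in [-search, search]; returns (dy, dx, cost)."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would fall outside the reference frame.
            if not (0 <= top + dy and top + dy + size <= height
                    and 0 <= left + dx and left + dx + size <= width):
                continue
            cost = block_sad(current, reference, top, left, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2], best[0]
```

Each block is matched independently here, which is exactly what stops working for triangular patches: moving one grid point deforms every triangle that shares it, so the displacements can no longer be optimized one at a time.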
Vector Quantization
Classified transform coding of images using two-channel conjugate vector quantization
Jae Yeal Nam, K. R. Rao
An adaptive image coding technique, called two channel conjugate classified discrete cosine transform / vector quantization (TCCCDCT/VQ), is proposed to efficiently exploit correlation in large image blocks by taking advantage of discrete cosine transform and vector quantization, while overcoming the suboptimalities of transform coding and reducing the complexity of vector quantization. In transform domain, a classified discrete cosine transform / vector quantization (CDCT/VQ) scheme is proposed and TCCCDCT/VQ is developed based on the CDCT/VQ scheme. These two techniques are applied to encode test images at about 0.51 BPP and 0.73 BPP.
Additional Papers
Hybrid coder for image sequences using detailed motion estimates
Michael Nickel, John Hakon Husoy
Motion estimation in hybrid image sequence coders is commonly done with block matching. In this paper we present another approach using detailed motion estimates for compensation. Such motion estimates have given good results when applied to frame interpolation [1]. In our case they enable us to reduce the prediction error significantly at the cost of a more complex motion estimate. We therefore code this motion estimate lossily, trading off the accuracy of the estimate against the number of bits needed to send it as side information to the receiver. Our simulation results show that the same performance range as with a block matching based coder is reached.
Still Image Coding
Entropy coding for wavelet transform of image and its application for motion picture coding
Mutsumi Ohta, Mitsuharu Yano, Takao Nishitani
Motion picture wavelet coding is investigated. Wavelet coding [5,6,7] is expected to be a next-generation transform coding, because it can reduce the blocking effect and mosquito noise, which are the major sources of image quality degradation in conventional transform coding methods. Entropy coding of wavelet coefficients is discussed first; then a blockless motion-compensated adaptive inter/intra-frame prediction method is proposed for motion picture wavelet coding. Coding efficiency is evaluated by simulation. The proposed techniques improve the subjective quality of decoded pictures.
Practical approach to fractal-based image compression
Alexander P. Pentland, Bradley Horowitz
Fractal techniques for image compression have recently attracted a great deal of attention. Unfortunately, little in the way of practical algorithms or techniques has been published. We present a technique for image compression that is based on a very simple type of iterative fractal. In our algorithm a wavelet transform (quadrature mirror filter pyramid) is used to decompose an image into bands containing information from different scales (spatial frequencies) and orientations. The conditional probabilities between these different scale bands are then determined, and used as the basis for a predictive coder. We find that the wavelet transform's various scale and orientation bands have a great deal of redundant, self-similar structure. This redundant structure is, however, in the form of multi-modal conditional probabilities, so that linear predictors perform poorly. Our algorithm uses a simple histogram method to determine the multi-modal conditional probabilities between scales. The resulting predictive coder is easily integrated into existing subband coding schemes. Comparison of this fractal-based scheme with our standard wavelet vector coder on 256 x 256 grey-level imagery shows up to a two-fold gain in coding efficiency with no loss in image quality, and up to a four-fold gain with small loss in image quality. Coding and decoding are implemented by small table lookups, making real-time application feasible.
Entropy Coding
Arithmetic coding model for compression of LANDSAT images
Arnulfo Perez, Sei-ichiro Kamata, Eiji Kawaguchi
The compression of LANDSAT images using Hilbert or Peano scanning and adaptive arithmetic coding is considered. The Hilbert scan is a general technique for continuous scanning of multidimensional data. Arithmetic coding has established itself as the superior method for lossless compression. This paper extends previous work on the integration of the arithmetic coding methodology and an n-dimensional Hilbert scanning algorithm developed by Perez, Kamata and Kawaguchi. Hilbert scanning preserves the spatial continuity of an image in both the x and y directions, and a higher correlation exists between consecutive points than in a raster scan. Therefore, a Hilbert adaptive scheme can better estimate the local probability distributions. Arithmetic coding is most efficient when the probabilities of the symbols are close to one. Therefore, by integrating both the spatial and spectral information into a unified context, a high rate of compression can be achieved.
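Illustrative sketch (not taken from the paper): the spatial continuity of a Hilbert scan can be seen with the widely used iterative index-to-coordinate mapping for a 2-D curve on a 2^k x 2^k grid. This is not the n-dimensional algorithm of Perez, Kamata and Kawaguchi; the function names are hypothetical and the image is assumed to be square with a power-of-two side.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve of the given order to (x, y)."""
    side = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the sub-square before transposing it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(image):
    """Return the pixels of a square 2^k x 2^k image in Hilbert-curve order."""
    side = len(image)
    if side < 1 or side & (side - 1) != 0 or any(len(row) != side for row in image):
        raise ValueError("image must be square with a power-of-two side")
    order = side.bit_length() - 1
    return [image[y][x]
            for x, y in (hilbert_d2xy(order, d) for d in range(side * side))]
```

Consecutive indices always map to horizontally or vertically adjacent pixels, whereas a raster scan jumps a full row width at the end of every line. That adjacency is what lets an adaptive model track local statistics along the scan.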
Additional Papers
Subband coding of video using energy-adaptive arithmetic coding and statistical feedback-free rate control
Ashok Chhabedia Popat, Andre Nicoulin, Andrea Basso, et al.
An improved video subband coding technique is presented. It is based on a spatially and spectrally localizing subband analysis, followed by scalar quantization and direct arithmetic coding. The quantization and coding parameters are adapted on the basis of local energy of the subband pixels. The proposed technique automatically achieves near-optimal rate allocation with respect to mean-square error (or alternatively, weighted mean-square error). In addition, because arithmetic coding requires no alphabet extension, the quantized subband pixels can be encoded in an arbitrary order without affecting bit rate. This makes it possible to obtain a good statistical estimate of the quantization resolution required to achieve a certain overall bit rate, based on a relatively small random sample of the subband pixels and their corresponding energies. The proposed technique requires that estimates of the local energy of the subband pixels be encoded and sent to the receiver as side-information. A means of accomplishing this, using vector quantization (VQ), is mentioned.
Video Sequence Coding II
Adaptive perceptual quantization for video compression
Atul Puri, R. Aravind
International standards bodies such as the ISO and CCITT have established various expert groups to standardize compressed digital representations of color image and video signals for different applications. Here we are concerned mainly with the work of the Motion Picture Experts Group (MPEG), whose charter is to specify a generic coded representation of video and audio. The MPEG work has been divided into three distinct phases, the first aimed at compressing low-resolution (1/4 CCIR-601) pictures at data-rates in 1-4 Mbits/sec, the second at compressing full CCIR-601 resolution pictures at data-rates in 4-10 Mbits/sec, and the third at compressing HDTV pictures at appropriately higher data-rates.
Still Image Coding
Fast piecewise-constant approximation of images
Hayder Radha, Martin Vetterli, Riccardo Leonardi
In this work, we present a Least-Square-Error (LSE), recursive method for generating piecewise-constant approximations of images. The method is developed using an optimization approach to minimize a cost function. The cost function, proposed here, is based on segmenting the image, recursively, using Binary Space Partitionings (BSPs) of the image domain. We derive a LSE necessary condition for the optimum piecewise-constant approximation, and use this condition to develop an algorithm for generating the LSE, BSP-based approximation. The proposed algorithm provides a significant reduction in the computational expense when compared with a brute force method. As shown in the paper, the LSE algorithm generates efficient segmentations of simple as well as complex images. This shows the potential of the LSE approximation approach for image coding applications. Moreover, the BSP-based segmentation provides a very simple (yet flexible) description of the regions resulting from the partitioning. This makes the proposed approximation method useful for performing image affine transformations (e.g., rotation and scaling) which are common in computer graphics applications.
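Illustrative sketch (not taken from the paper): a 1-D analogue of one partitioning step, relying on the standard fact that the least-square-error constant for a region is its mean. Running sums make each candidate split cost constant time instead of a rescan of the data. The function name is hypothetical, and the paper's 2-D BSP line search and its necessary condition are not reproduced.

```python
def best_two_piece_split(values):
    """Split a sequence into two constant pieces with minimum squared error.

    Returns (split, left_mean, right_mean, error), where the left piece is
    values[:split] and the right piece is values[split:].
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    total = float(sum(values))
    total_sq = float(sum(v * v for v in values))
    best_split, best_error = None, None
    left_sum = 0.0
    left_sq = 0.0
    for split in range(1, n):
        v = values[split - 1]
        left_sum += v
        left_sq += v * v
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        # Squared error of a piece about its mean: sum(v^2) - (sum v)^2 / count.
        error = ((left_sq - left_sum ** 2 / split)
                 + (right_sq - right_sum ** 2 / (n - split)))
        if best_error is None or error < best_error:
            best_split, best_error = split, error
    left_mean = sum(values[:best_split]) / best_split
    right_mean = sum(values[best_split:]) / (n - best_split)
    return best_split, left_mean, right_mean, best_error
```

For `[1, 1, 1, 5, 5]` the best split is after the third sample, with piece means 1.0 and 5.0 and zero error. Applying the same step recursively to each piece gives a binary tree of regions, which is the 1-D counterpart of a BSP description.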
Motion Estimation and Motion Analysis
Temporal projection for motion estimation and motion compensating interpolation
Philippe Robert
In this paper, the problems of motion field estimation and motion compensating interpolation are considered. It is shown that motion temporal projection increases the accuracy of the motion estimation when used as a temporal prediction. Moreover, this process carries information about the occluding, appearing and disappearing areas. A label field is introduced in order to emphasize and memorize this information, and a simple method leads to the detection of such areas. In a second step, the motion and label fields are used for motion compensating interpolation. Motion temporal projection is used to define the motion field of the image to be interpolated. The information carried by the label field is propagated along the motion vectors in the image to be interpolated. This procedure makes it possible to detect the foreground objects and correctly estimate their motion, and to detect the appearing and disappearing areas and adapt the interpolation to these particular areas.
Hierarchical Image Coding
High-speed two-dimensional pyramid image coding method and its implementation
Haralambos Sahinoglou, Sergio D. Cabrera
A non-separable pyramid scheme is developed for coding of images to achieve a high compression rate. The method allows a very efficient algorithmic implementation, using only 2 additions and a shift (division by 2) for each image pixel during coding or decoding. Since the operations needed are mostly independent of each other and have a high degree of regularity, it is also possible to design VLSI hardware to perform this operation using an array of simple basic cells. At the same time high coding efficiency results: for a typical 513x513 image we can achieve a peak signal-to-noise ratio of 30 dB with an encoding whose entropy is 0.133 bits per pixel. This information can be encoded using a method as simple as PCM, while still requiring only 0.385 bits per pixel, giving a very simple overall coding system.
Super High Definition Image Systems
High-fidelity subband coding for superhigh-resolution images
Takahiro Saito, Hirofumi Higuchi, Takashi Komatsu
Super high resolution images with more than 2,000 × 2,000 pixels will play a very important role in a wide variety of applications of future multimedia communications ranging from electronic publishing to broadcasting. To make communication of super high resolution images practicable, we need to develop image coding techniques that can compress super high resolution images by a factor of 1/10 to 1/20. Among existing image coding techniques, the sub-band coding technique is one of the most suitable techniques. With its applications to high-fidelity compression of super high resolution images, one of the major problems is how to encode high frequency sub-band signals. High frequency sub-band signals are well modeled as having an approximately memoryless probability distribution, and hence the best way to solve this problem is to improve the quantization of high frequency sub-band signals. From the standpoint stated above, the work herein first compares three different scalar quantization schemes and improved permutation codes, which the authors have previously developed extending the concept of permutation codes, from the aspect of quantization performance for a memoryless probability distribution that well approximates the real statistical properties of high frequency sub-band signals, and thus demonstrates that at low coding rates improved permutation codes outperform the other scalar quantization schemes and that their superiority decreases as the coding rate increases. Moreover, from the results stated above, the work herein develops rate-adaptive quantization techniques where the number of bits assigned to each subblock is determined according to the signal variance within the subblock and the proper quantization scheme is chosen from among different types of quantization schemes according to the allocated number of bits, and applies them to the high-fidelity encoding of high frequency sub-band signals of super high resolution images to demonstrate their usefulness.
Motion Estimation and Motion Analysis
Motion affine models identification and application to television image coding
Henri Sanson
This paper addresses two-dimensional motion analysis in the context of television image coding and presents a method achieving the identification of motion models per region in television-like image sequences. The proposed algorithm associates a downward hierarchical image analysis and a differential estimation of the model parameters, where the estimation of the dense velocity field and its approximation by a given model in the sense of weighted least squares are combined in an iterative process. The basic version of the algorithm performs the identification of affine models per block. However, the differential estimation of movement parameters extends straightforwardly to higher order models. Furthermore, the general scheme of the algorithm can be simply modified in order to achieve joint estimation and segmentation of the movement.
Image Transmission and Communication Systems
Two-layer pyramid image coding scheme for interworking of video services in ATM
Thomas Sikora, Thiow Keng Tan, Khok Khee Pang
A layered Pyramid image coding scheme suitable for interworking of videoconferencing and videophone services is presented in this paper. The scheme proposed takes advantage of inter-layer embedded motion prediction to compress image data on the upper Pyramid enhancement layer. As an important advantage of this scheme we have preserved independence of the coding in the two layers using embedded motion compensation. Thus no information about the coding procedure on the lower resolution layer is needed to encode the upper layer enhancement information. The results are encouraging, showing that the scheme reconstructs images of both CIF and QCIF resolution with good quality and that the scheme is more data efficient than a comparable Simulcast approach.
Model Based Image Coding
Model-based coding of facial images based on facial muscle motion through isodensity maps
Ikken So, Osamu Nakamura, Toshi Minami
A model-based coding system has come under serious consideration for the next generation of image coding schemes, aimed at greater efficiency in TV-telephone and TV-conference systems [1]-[5]. In this model-based coding system, the sender’s model image is transmitted and stored at the receiving side before the start of conversation. During the conversation, feature points are extracted from the facial image of the sender, and are transmitted to the receiver. The facial expression of the sender is reconstructed from the feature points received and a wireframe model constructed at the receiving side.
Video Sequence Coding I
Laplacian pyramid coding of prediction error images
Christoph Stiller, Dirk Lappe
Pyramid image coding is an important class of encoding for progressive transmission. Image data is represented in a hierarchical structure and is transmitted in the order of decreasing significance. This property can be exploited for image sequence coding. Pyramid encoding allows adaptation to the statistics of prediction error images and to the available residual data rate. In this paper a Laplacian image pyramid is employed for coding the prediction error image of a hybrid image sequence coder. A tree growing algorithm for bit assignment within the pyramid is introduced for maximization of a quality criterion in the overall reconstructed image. Experiments show reasonable image quality even for bit rates as low as 8 kbit/s.
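The hierarchical structure mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' coder: it assumes a 1-D signal, pairwise averaging as the reduction step and sample repetition as the expansion step (a real Laplacian pyramid normally uses a smoother kernel on 2-D images), and all function names are hypothetical. It only shows that the coarse level plus the per-level differences reproduce the input exactly.

```python
def reduce_level(signal):
    """Halve the resolution by averaging neighbouring pairs (simplifying assumption)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]


def expand_level(signal):
    """Double the resolution by repeating each sample (simplifying assumption)."""
    out = []
    for value in signal:
        out.extend([value, value])
    return out


def build_pyramid(signal, levels):
    """Return (coarsest level, list of difference levels from fine to coarse).

    The signal length must be divisible by 2 ** levels.
    """
    details = []
    current = list(signal)
    for _ in range(levels):
        coarse = reduce_level(current)
        predicted = expand_level(coarse)
        # Difference between a level and its prediction from the coarser level.
        details.append([c - p for c, p in zip(current, predicted)])
        current = coarse
    return current, details


def reconstruct(coarse, details):
    """Invert build_pyramid by adding the differences back, coarse to fine."""
    current = list(coarse)
    for detail in reversed(details):
        predicted = expand_level(current)
        current = [p + d for p, d in zip(predicted, detail)]
    return current
```

Sending the coarse level first and the difference levels afterwards gives the "decreasing significance" ordering; a lossy coder would quantize the difference levels before transmission.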
Vector Quantization
Subsampled vector quantization with nonlinear estimation using neural network approach
Huifang Sun, Constantine N. Manikopoulos, Hwei P. Hsu
An intraframe block coding scheme devoted to exploiting interblock redundancy and reducing the block effect is proposed. The scheme is based upon a combination of a nonlinear interpolation with transform and vector quantization techniques. Interblock redundancy is removed by extracting a pixel from each block as a common seed to interpolate the rest of the pixels of the neighboring blocks. The estimation error signals are reduced by nonlinear interpolation, which often results in no further processing requirements for some blocks. This saving compensates for the cost of transmitting the overhead of the estimation parameters.
Image Transmission and Communication Systems
Analysis of optimum-frame-rate in low-bit-rate video coding
Yasuhiro Takishima, Masahiro Wada, Hitomi Murakami
We analyze frame rates in low bit rate video coding and show that an optimal frame rate can be theoretically obtained. In order to achieve an optimum balance between coded picture quality and motion smoothness, a frame rate has to be selected which is appropriate to parameters such as coding scheme, property of the video signals and coding bit rate. In this paper, coding distortion measured by mean square error due to quantization is analyzed by modeling video signals, and it is shown that the frame rate can be uniquely optimized. The result of this analysis is compared with the results of computer simulation. In addition, the relation between this analysis and the subjective evaluation is described.
3-D Motion Analysis
3-D TV: joined identification of global motion parameters for stereoscopic sequence coding
Ahmed Tamtaoui, Claude Labit
Efficient compression techniques for stereoscopic sequences (3DTV) are required for storage and transmission purposes. One general approach for image sequence coding is based on motion compensation schemes. We propose a new method which enables a joined estimation of global motion descriptor vectors both for left and right sequences. The estimation of descriptors in stereoscopic sequences presented in this paper generalizes pel-recursive methods using steepest descent algorithms subject to some constraints of coherence. In the case of parallel estimations (on the left and right views), eight descriptors can characterize the motion of two matched regions. By the coherent estimation detailed here, only five descriptors can represent the motion of two matched regions. Results are evaluated according to reconstruction errors and qualitative interpretation of apparent motion fields.
Image Transmission and Communication Systems
Secret transmission method of character data in motion picture communication
Kiyoshi Tanaka, Yasuhiro Nakamura, Kineo Matsui
This paper presents a method for secretly transmitting character data while a series of motion pictures is communicated between two stations. Character data are embedded into the codes of image data output from the interframe DPCM encoder, so as to prevent a third party from reading the message. The received data are decoded normally into a high-quality picture, and the message is decoded at the same time if the receiver knows how it was embedded. Our experiments show that the transmission of character data has little effect on picture quality and that the amount of embedded character data is estimated at about 20-30% of the bits transmitted in each frame, for example, when the SNR is above 30 dB.
Super High Definition Image Systems
New subband scheme for super-HDTV coding
The subband scheme is one of the most promising schemes for super HDTV coding. In this paper, we propose two types of multidimensional multichannel subband scheme. Using the concept of the complementary subsampling, the properties of the analysis and synthesis filters are analyzed. Coding gain is calculated for the proposed scheme by using a picture model. The proposed scheme shows coding gain higher than the conventional scheme.
Image Transmission and Communication Systems
Model for packet image communication in a centralized distribution system
Habib H. Torbey, Zhensheng Zhang
The distribution of the compressed image data volume for a lossless image coder is derived from the distribution of the uncompressed images and that of the coder's compression ratio. This result is used to identify the parameters of an on-off statistical source model characterizing the still image source as seen by the network. The on-off source model gives rise to a batch arrival queueing model to represent image transmission over a network in the situation where there exists a mismatch in the speed of the different components of an imaging system. For a centralized distribution system in particular, the model arises when the transmission queueing delay is the bottleneck at the database and when the arrival rate at a network node is much faster than the transmission node. While the overall time delays can be kept very low, the average buffer sizes build up considerably prompting the need for flow control and for queueing the image requests rather than the image data at the database.
Visual Communication Hardware
VLSI implementation of a buffer, universal quantizer, and frame-rate-control processor
H. Uwabu, Eiji Kakii, R. Lacombe, et al.
The CCITT Recommendation H.261 [1] describes the video coding and decoding method for the moving picture component of audiovisual services at the rates of p × 64 kbit/s, where p is in the range 1 to 30. Accordingly, several chip sets realizing these methods have been announced [3,4,5,6]. In this paper, a new architecture and implementation of an adaptive coding controller chip is reported. The main features of the chip include: (A) Adaptive Weighted Image Quality Control Capability, (B) High Speed Operation and (C) External Control Capability for Coding Control. The chip is implemented with CMOS standard cells and contains approximately 38,000 gates.
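As a small worked example of the rate family quoted in this abstract, the sketch below computes the channel rate for a given multiplier p. The function name is hypothetical and the code only encodes the arithmetic stated above (p times 64 kbit/s, with p from 1 to 30); it says nothing about how the chip itself controls the rate.

```python
def h261_bit_rate_kbits(p):
    """Return the channel rate in kbit/s for multiplier p (1 to 30 inclusive)."""
    if not 1 <= p <= 30:
        raise ValueError("p must be in the range 1 to 30")
    return p * 64
```

The range therefore spans 64 kbit/s at p = 1 up to 1920 kbit/s at p = 30.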
Video Sequence Coding I
Video compression algorithm with adaptive bit allocation and quantization
Eric Viscito, Cesar A. Gonzales
The emerging ISO MPEG video compression standard is a hybrid algorithm which employs motion compensation, spatial discrete cosine transforms, quantization, and Huffman coding. The MPEG standard specifies the syntax of the compressed data stream and the method of decoding, but leaves considerable latitude in the design of the encoder. Although the algorithm is geared toward fixed-bit-rate storage media, the rules for bit rate control allow a good deal of variation in the number of bits allocated to each picture. In addition, the allocation of bits within a picture is subject to no rules whatsoever. One would like to design an encoder that optimizes visual quality of the decoded video sequence subject to these bit rate restrictions. However, this is difficult due to the elusive nature of a quantitative distortion measure for images and motion sequences that correlates well with human perception. This paper describes an MPEG encoder designed to produce good quality coded sequences for a wide range of video source characteristics and over a range of bit rates. The novel parts of the algorithm include a temporal bit allocation strategy, spatially adaptive quantization, and a bit rate control scheme.
Image Transmission and Communication Systems
Signal loss recovery in DCT-based image and video codecs
Yao Wang, Qin-Fan Zhu
This paper presents a new technique for image reconstruction from a partial set of transform coefficients in DCT image coders. By utilizing the correlation between adjacent blocks and imposing smoothing constraints on the reconstructed image the proposed algorithm produces an image which is maximally smooth among all the images with the same available coefficients. The optimal solution can be obtained either by a linear transformation or through an iterative process. The underlying principle of the algorithm is applicable to any unitary block-transform and is very effective for recovering the DC and other low frequency coefficients. Applications in DCT based still image codecs and video codecs using motion compensation have been considered. Simulation results with still images show that a very good reconstruction can be obtained even when DC and certain low frequencies are missing in many blocks.
Motion Estimation and Motion Analysis
Windowed motion compensation
Hiroshi Watanabe, Sharad Singhal
A new motion compensation technique using a window which satisfies the perfect reconstruction condition is proposed. The conventional motion compensation using rectangular blocks often gives discontinuities between neighboring motion compensation blocks in the predicted image. The proposed method is based on a window operation to the data which overlaps an area of the conventional motion compensation block. Computer simulation is carried out using MPEG video coding algorithm to evaluate the proposed method. The performance of the proposed method is better than the conventional method in terms of mean square error, and large improvement can be obtained at the block boundaries. This gives a smooth predicted image for a typical MC+DCT coding scheme.
Still Image Coding
Enhancement of transform coding by nonlinear interpolation
Siu-Wai Wu, Allen Gersho
In conventional transform coding, a linear transformation generates a set of transform coefficients for each image block, followed by quantization of the transform coefficients and inverse transformation. We show that for a given transform encoder, the conventional decoder employing inverse transformation is a special case of a nonlinear interpolative decoder that performs table lookups to reconstruct the image blocks from the code indexes. In the nonlinear interpolative decoder, each received code index of an image block addresses a particular codebook to fetch a component block. The image block is then reconstructed as the vector sum of the component blocks. Hence with a set of well designed codebooks, this new decoding technique will be superior to the conventional decoder. In this paper, we develop an iterative algorithm for designing a set of locally optimal codebooks. We also present a hybrid decoder which combines conventional decoding with nonlinear interpolative decoding to reduce the memory requirement for codebook storage. Computer simulations with the JPEG image compression algorithm demonstrate that this new decoding technique can decode enhanced quality pictures from the bit stream generated by the standard encoding scheme.
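The "special case" claim in this abstract can be checked numerically with a minimal sketch. Everything here is an illustrative assumption rather than the authors' design: a 1-D block of length 4, an orthonormal DCT-II, a uniform quantizer with step 8, and hypothetical function names. Each codebook entry is filled with the dequantized coefficient times its basis vector, which is the choice that reproduces the conventional decoder; the trained codebooks described in the paper would hold different entries.

```python
import math

N = 4        # block length (1-D for brevity; an assumption)
STEP = 8.0   # uniform quantizer step size (an assumption)


def dct_basis(n):
    """Rows are the orthonormal DCT-II basis vectors of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


BASIS = dct_basis(N)


def encode(block):
    """Forward transform followed by uniform scalar quantization."""
    coeffs = [sum(b * x for b, x in zip(row, block)) for row in BASIS]
    return [round(c / STEP) for c in coeffs]


def conventional_decode(indexes):
    """Dequantize the indexes, then apply the inverse transform."""
    coeffs = [i * STEP for i in indexes]
    return [sum(coeffs[k] * BASIS[k][n] for k in range(N)) for n in range(N)]


def codebook_entry(k, index):
    """Component block that codebook k holds for a given index.

    Computed on the fly here; a real decoder would store it in a table.
    """
    return [index * STEP * v for v in BASIS[k]]


def interpolative_decode(indexes):
    """Table-lookup style decoder: sum one component block per index."""
    out = [0.0] * N
    for k, index in enumerate(indexes):
        for n, v in enumerate(codebook_entry(k, index)):
            out[n] += v
    return out
```

With these particular codebook entries both decoders return the same block, so any improvement has to come from replacing the entries with better-designed ones while the encoder and bit stream stay unchanged.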
Video Sequence Coding I
HDTV compression with vector quantization of transform coefficients
Siu-Wai Wu, Allen Gersho
We present a video coding scheme that uses vector quantization (VQ) in conjunction with the discrete cosine transform (DCT) for compression of high definition television (HDTV) signals. Motion compensated field prediction is applied with a group-of-fields structure where the first field in a group is compressed by intrafield coding and consecutive fields in the group are coded with motion compensation. Instead of using a channel buffer to regulate the transmission rate of the bit stream, we set a target bit rate for each field and dynamically allocate bits in the field to code it efficiently. The motion compensated field difference (MCFD) is coded by an algorithm similar to that used in intrafield coding. An MCFD field (or original video signal in the intrafield mode) is first partitioned into non-overlapping 8x8 image blocks. Each block is transformed into a block of DCT coefficients, which are then partitioned into a set of low dimensional vectors. Vector quantization is used to code the vectors of DCT coefficients. Furthermore, in the intrafield mode, a second stage of vector quantization in the spatial domain can be used to code the error image that results from the DCT coding in the first stage. Under the fixed bit rate constraint for a field, a greedy algorithm with a perceptual distortion criterion is applied to select the DCT vectors and residual vectors to be coded. Simulation results demonstrated that the algorithm can be used for compression of HDTV at 45 Mb/s with distribution quality.
Image Transmission and Communication Systems
Effects of M-transform for bit-error resilement in the adaptive DCT coding
Nobumoto Yamane, Yoshitaka Morikawa, Hiroshi Hamada
Recently, image service systems using data compression techniques have been increasingly in demand. In such systems, the coding algorithm is expected to be insensitive to channel noise as well as to have good source coding performance. This paper examines the effects of channel errors on the quality of the reconstructed image in adaptive DCT coding, in the case of binary symmetric channels (BSC).
Hierarchical Image Decomposition
Design of M-band filter banks based on wavelet transform
Ming-Haw Yaou, WenThong Chang
Multiresolution representation of signal is effective for signal processing. In this paper, the relationship between the multirate filtering and Wavelet transform is described. It is indicated that the design of M-band perfect reconstruction multirate filter banks can be interpreted as the construction of the multiresolution Wavelet bases with resolution step M. With this observation, a set of formulations for the design of perfect reconstruction FIR multirate filter banks is derived. Also, the experimental results are presented and discussed.
Video Sequence Coding II
Motion-compensated wavelet transform coding for color video compression
Ya-Qin Zhang, Sohail Zafar
A video compression scheme based on the wavelet representation and multi-resolution motion estimation (MRME) is presented in this paper. The multiresolution/multifrequency nature of the discrete wavelet transform makes it an ideal tool for representing images and video signals. The wavelet transform decomposes a video frame into a set of sub-frames with different resolutions corresponding to different frequency bands. These multiresolution frames also provide a representation of the global motion structure of the video signals at different scales. The motion activities for a particular sub-frame in different resolutions are hence highly correlated since they actually specify the same motion structure at different scales. In the MRMC described in Section 4, motion vectors in higher resolution are predicted by the motion vectors in the lower resolution, and are refined at each step. In particular, we propose a variable block-size MRMC scheme in which the size of a block is adapted to its level in the pyramid. This scheme not only considerably reduces the searching and matching time but also provides a meaningful characterization of the intrinsic motion structure. The variable-size MRMC approach also avoids the drawback of the constant-size MRMC in describing small object motion activities. After wavelet decomposition, each scaled wavelet tends to have different statistical properties. An adaptive truncation process similar to [CHEN 84] was implemented and a bit allocation scheme similar to that in the transform coding is examined by adapting to the local variance distribution in each scaled wavelet. Based on the wavelet representation, variable-size MRMC approach and a uniform quantization scheme, four variations of the proposed motion-compensated wavelet video compression system are presented in Section 6.
It is shown that the motion-compensated wavelet transform coding approach out-performs the conventional transform coding scheme in terms of the signal-to-noise ratio as well as the subjective performance.
Vector Quantization
Vector quantization of image pyramids with the ECPNN algorithm
Diego Pinto de Garrido, William A. Pearlman, Weiler A. Finamore
A recent algorithm for single rate vector quantization [1] is used for coding image pyramids. The algorithm, called entropy-constrained pairwise-nearest-neighbor (ECPNN), designs codebooks by merging the pair of Voronoi regions which gives the most decrease in entropy for a given increase in distortion. In terms of performance in the mean-squared-error sense the algorithm produces codebooks with the same performance as the ECVQ design algorithm [1,2]. The main advantage over ECVQ is that the ECPNN algorithm enables much faster codebook design. A single pass through the ECPNN design algorithm, which progresses from larger to successively smaller rates, allows the storage of any desired number of optimal intermediate-rate codebooks. In the context of pyramid coding, this feature is especially desirable, since the ECPNN design algorithm must be run for each sub-band and storage of codebooks of different rates is required for each sub-band. Good results at 0.5 bpp, judged both visually and using peak-to-peak SNR criterion, have been obtained by coding image pyramids using ECPNN codebooks.
Tree-structured vector quantization with input-weighted distortion measures
Pamela C. Cosman, Karen Oehler, Amanda A. Heaton, et al.
A greedy tree-growing algorithm is used in conjunction with an input-dependent weighted distortion measure to develop a tree-structured vector quantizer. Vectors in the training set are classified, and weights are assigned to the classes. The resulting weighted distortion measure forces the tree to develop better representations for those classes that are considered important. Results on medical images and USC database images are presented. A tree-structured vector quantizer grown in a similar manner can be used for preliminary classification as well as compression.
Hierarchical Image Coding
Hierarchical motion-compensated deinterlacing
John W. Woods, Soo-Chul Han
This paper introduces a new method of converting interlaced video to a progressively scanned video. The missing pixel values of the interlaced sequence are interpolated from past fields according to motion vectors found between the present and past. Hierarchical block-matching motion estimation is used in finding these motion vectors. This approach gives improved results over conventional de-interlacing methods such as median filtering and linear spatial interpolation.
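For context, the linear spatial interpolation baseline named in this abstract can be sketched as follows. This is the conventional method the paper compares against, not the proposed motion-compensated approach, and the details are assumptions: the field is taken to hold the even-numbered lines of a frame, each missing line is the average of the lines above and below it, and the final missing line simply repeats the last available one. The function name is hypothetical.

```python
def line_average_deinterlace(field):
    """Fill in the missing lines of one field by vertical averaging.

    field: list of rows (lists of numbers) holding every other line of a frame.
    Returns a frame with twice as many rows.
    """
    frame = []
    for i, line in enumerate(field):
        frame.append(list(line))
        if i + 1 < len(field):
            below = field[i + 1]
            # Missing line: average of the known lines above and below.
            frame.append([(a + b) / 2.0 for a, b in zip(line, below)])
        else:
            # No line below the last one, so repeat it (boundary assumption).
            frame.append(list(line))
    return frame
```

Because this baseline uses only the current field, it cannot recover vertical detail that the other field carried, which is the gap that interpolation from past fields along motion vectors is meant to close.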