Proceedings Volume 2187

Digital Video Compression on Personal Computers: Algorithms and Technologies

Arturo A. Rodriguez
View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 2 May 1994
Contents: 9 Sessions, 33 Papers, 0 Presentations
Conference: IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology 1994
Volume Number: 2187

Table of Contents

  • Hardware Implementations
  • Scene Change Detection and Video Testing
  • Video Coding Methods and Techniques I
  • Video Coding for Software-Only Playback
  • Video Coding Methods and Techniques II
  • MPEG Implementations
  • MPEG Standards
  • Low-Bit-Rate Coding
  • Poster Session
Hardware Implementations
Design of a motion JPEG (M/JPEG) adapter card
In this paper we describe the design of a high-performance JPEG (Joint Photographic Experts Group) Micro Channel adapter card. The card, tested on a range of PS/2 platforms (models 50 to 95), can complete JPEG operations on a 640 by 240 pixel image within 1/60 of a second, thus enabling real-time capture and display of high-quality digital video. The card accepts digital pixels from either a YUV 4:2:2 or an RGB 4:4:4 pixel bus and has been shown to handle up to 2.05 MBytes/second of compressed data. The compressed data is transmitted to a host memory area by Direct Memory Access operations. The card uses a single C-Cube CL550 JPEG processor that complies with baseline JPEG. We give broad descriptions of the hardware that controls the video interface, the CL550, and the system interface. Some critical design points that enhance the overall performance of M/JPEG systems are pointed out. The adapter card is controlled by interrupt-driven software that runs under DOS. The software performs a variety of tasks, including changing the color space (RGB or YUV), changing the quantization and Huffman tables, odd and even field control, and some diagnostic operations.
JPEG image compression hardware implementation with extensions for fixed-rate and compressed-image editing applications
Martin P. Boliek, James D. Allen, Tadanori Ryu, et al.
This paper describes a new high-speed image compression hardware implementation which, in addition to implementing the JPEG baseline system, can be used in fixed-rate compression systems and in systems that allow editing of compressed images. High speed (CCIR 601 resolution at 30 frames/second) and small silicon size are achieved by using a unique parameterized orthogonal transform. This implementation is compatible with the DCT, yet requires no additional multiplications. Also described here is a system that uses a dynamic quantization scalar circuit to achieve fixed-rate compression and a system that utilizes the 'static coefficient DPCM' feature to compress and decompress regions of interest within an image for editing or viewing.
ASIC implementation of recursive scaled discrete cosine transform algorithm
Bill N. On, Sam Narasimhan, Victor K.L. Huang
A program to implement the Recursive Scaled Discrete Cosine Transform (DCT) algorithm proposed by H. S. Hou has been undertaken at the Institute of Microelectronics. The design was implemented using a top-down design methodology, with VHDL (VHSIC Hardware Description Language) for chip modeling. Once the VHDL simulation has been satisfactorily completed, the design is synthesized into gates using a synthesis tool. The architecture of the design consists of two processing units together with a memory module for data storage and transpose. Each processing unit is composed of four pipelined stages which allow the internal clock to run at one-eighth (1/8) the speed of the pixel clock. Each stage operates on eight pixels in parallel. As the data flows through each stage, various adders and multipliers transform it into the desired coefficients. The Scaled IDCT was implemented in a similar fashion, with the adders and multipliers rearranged to perform the inverse DCT algorithm. The chip has been verified using Field Programmable Gate Array devices. The design is operational. The combination of fewer required multiplications and a pipelined architecture gives Hou's Recursive Scaled DCT good potential for achieving high performance at low cost in a Very Large Scale Integration implementation.
Real-time MPEG video codec on a single-chip multiprocessor
Woobin Lee, Jeremiah Golston, Robert John Gove, et al.
We present a software implementation of a real-time MPEG video codec on the MediaStation 5000 multimedia system. Unlike other compression systems whose sole function is the encoding or decoding of video data, the MediaStation 5000 is capable of performing various real-time operations involving a wide range of multimedia data, including image, graphics, video, and even audio. This programmability is provided by the Texas Instruments TMS320C80, better known as the Multimedia Video Processor (MVP), which is a single-chip multiprocessing device with a highly parallel internal architecture. The MVP integrates a RISC processor, four DSP-like processors, an intelligent DMA controller, video controllers, and a large amount of SRAM onto a single chip. Since the MVP contains such a high degree of parallelism, developing the MPEG software and mapping it to the MVP requires a thorough study of the algorithms and a good understanding of the processor architecture. By exploiting the advanced features of the MVP, the MediaStation 5000 can achieve MPEG compression and decompression of video sequences in real time.
Scene Change Detection and Video Testing
Video scene decomposition with the motion picture parser
E. Deardorff, Thomas D.C. Little, J. D. Marshall, et al.
A motion picture can be modeled as a composition of many scenes, where each scene comprises multiple shots. Thus, a conventional movie is a sequential aggregation of a large number of disparate image sequences. Within each image sequence or shot, there is consistency in image content and dynamics. This consistency in dynamics can be used to identify scene changes for video segment decomposition and in techniques to improve data compression. We have developed an algorithm that uses these dynamics for scene change detection and the decomposition of video streams into constituent logical shots. The algorithm uses intraframe image complexity and identifies scene transitions by considering short-term temporal dynamics. The algorithm has been shown to be effective for detecting both abrupt scene changes (cuts) and smooth scene changes (fades and dissolves). This algorithm is used in an application we have developed called the Motion Picture Parser (MPP). The MPP automates the process of tagging segments of motion-JPEG-compressed movies. Segments are also tagged for subsequent semantic content-based retrieval in units of shots and scenes. The MPP application consists of a graphical user interface with various editing controls.
Method for extracting camera operations in order to describe subscenes in video sequences
Junji Maeda
Numerous attempts have been made to detect scene changes in video sequences and to treat scenes as units, to allow handling of very large amounts of video data. But less attention has been given to the structure of scenes themselves. In this study, we define a 'sub-scene' as a structural component of a scene in terms of camera operations. In other words, sub-scenes are subsets of scenes, and consist of successive frames in which the movements of the contents are almost identical. The advantage of sub-dividing scenes into sub-scenes is that it gives added power to the description of objects and motion information. In order to describe sub-scenes, the estimation of camera operations need not be very accurate, but it must be resistant to noise. The technique proposed in this paper for extracting camera operations, which processes more than two frames at a time by using a 2D spatio-temporal image, meets the above requirement. It is also faster than conventional frame-by-frame analysis. The results of experiments indicate that the technique is both feasible and useful.
Computer-based testing of digital video quality
Christopher P. Cressy, Guy W. Beakley
Conventional analog video test measurements are generally not adequate for digital video, especially compressed video. This is because digital video distortion and artifacts often are spatially and temporally discrete phenomena. Most analog measurements assume that errors are of a continuous, linear nature. The only alternative to date has been subjective testing. Formal subjective tests (e.g., CCIR 500) can provide reliable, relative measures of video quality. However, such testing is time-consuming and expensive. Objective testing methods are needed to provide efficient, repeatable measures of video quality. A further advantage of objective testing is that it can provide greater insights into the nature of impairments. Presently, no standardized objective measures exist for digital video. However, pioneering work has been done by NTIA, NASA, ACATS and others to quantify the quality of digital video codecs. We have implemented some of the published measurements and others of our own design on a low-cost workstation. These measures utilize complex digital image processing techniques to analyze differences between source and processed video sequences. This paper presents formulations of these measurements and describes our implementation of an automated system to capture and test digital video quality.
Video Coding Methods and Techniques I
Modulated lapped transforms in image coding
Ricardo L. de Queiroz, K. R. Rao
The class of modulated lapped transforms (MLT) with extended overlap is investigated in image coding. The finite-length-signals implementation using symmetric extensions is introduced and human visual sensitivity arrays are computed. Theoretical comparisons with other popular transforms are carried out and simulations are made using intraframe coders. Emphasis is placed on transmission over packet networks, assuming a high rate of data loss. The MLT with overlap factor 2 is shown to be superior in all our tests, with bonus features such as greater robustness against block losses.
Hybrid DCT/quadtree motion-compensated coding of video sequences
Sam J. Liu, Feng-Ming Wang
This paper proposes a hybrid DCT/Quadtree coding technique to compress motion-compensated difference images for video conferencing applications. The hybrid scheme is developed primarily to eliminate the visually annoying 'corona' effect caused by the traditional DCT coding of subimage blocks containing edge components. In this coding framework, the quadtree segmentation scheme is selected to compress edge blocks because of its ability to describe geometrical structures accurately. For quadtree-coded blocks, each segmented subregion is represented by its associated sample mean, which is quantized and transmitted along with the segmentation pattern information to the receiver for reconstruction. The remaining subimage blocks are coded using the DCT. Simulation results demonstrate that this hybrid coding system provides subjectively better video quality than traditional DCT coding systems: the corona effect is virtually eliminated and the visible difference is most noticeable at low bit rates.
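The quadtree step described in this abstract, where each segmented subregion is represented by its sample mean, can be illustrated with a minimal sketch. The split rule (pixel range above a threshold), the function names, and the parameters below are assumptions made for illustration; the abstract does not specify the authors' segmentation criterion, and quantization of the means is omitted.

```python
def quadtree_means(block, x, y, size, threshold, min_size=2):
    """Return leaves (x, y, size, mean) covering a square region of `block`.

    Hypothetical split rule: subdivide while the pixel range exceeds
    `threshold` and the region is larger than `min_size`.
    """
    pixels = [block[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    mean = sum(pixels) / len(pixels)
    if max(pixels) - min(pixels) <= threshold or size <= min_size:
        return [(x, y, size, mean)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_means(block, x + dx, y + dy, half, threshold, min_size)
    return leaves


def reconstruct(leaves, size):
    """Rebuild a size x size block by filling each leaf with its mean."""
    out = [[0.0] * size for _ in range(size)]
    for x, y, s, mean in leaves:
        for r in range(y, y + s):
            for c in range(x, x + s):
                out[r][c] = mean
    return out


# A 4x4 difference block containing a vertical edge.
edge_block = [[0, 0, 8, 8] for _ in range(4)]
leaves = quadtree_means(edge_block, 0, 0, 4, threshold=2)
# The block splits once into four 2x2 leaves with means 0, 8, 0, 8,
# so the edge is reproduced exactly without any ringing.
```

A transmitter following this idea would send the split pattern plus one (quantized) mean per leaf, which is why sharp edges survive where a coarsely quantized DCT would produce ringing around them.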
Rate- and resolution-scalable 3D subband coding of video
We propose a full-color video compression strategy, based on 3D subband coding with camera pan compensation, to generate a single embedded bit stream supporting multiple decoder display formats and a wide, finely gradated range of bit rates. An experimental implementation of our algorithm produces a single bit stream, from which suitable subsets are extracted to be compatible with many useful decoder frame sizes and frame rates and to satisfy transmission bandwidth constraints ranging from several tens of kilobits per second to several megabits per second. Reconstructed video quality from any of these bit stream subsets is often found to exceed that obtained from an MPEG-1 implementation, operated with equivalent bit rate constraints, in both perceptual quality and mean squared error. In addition, when restricted to two dimensions, the algorithm produces some of the best results available in still image compression.
Applying mid-level vision techniques for video data compression and manipulation
John Wang, Edward H. Adelson
Most image coding systems rely on signal processing concepts such as transforms, VQ, and motion compensation. In order to achieve significantly lower bit rates, it will be necessary to devise encoding schemes that involve mid-level and high-level computer vision. Model-based systems have been described, but these are usually restricted to some special class of images such as head-and-shoulders sequences. We propose to use mid-level vision concepts to achieve a decomposition that can be applied to a wider domain of image material. In particular, we describe a coding scheme based on a set of overlapping layers. The layers, which are ordered in depth and move over one another, are composited in a manner similar to traditional 'cel' animation. The decomposition (the vision problem) is challenging, but we have attained promising results on simple sequences. Once the decomposition has been achieved, the synthesis is straightforward.
Video Coding for Software-Only Playback
Feasibility of video codec algorithms for software-only playback
Arturo A. Rodriguez, Ken Morse
Software-only video codecs can provide good playback performance in desktop computers with a 486 or 68040 CPU running at 33 MHz without special hardware assistance. Typically, playback of compressed video can be categorized into three tasks: the actual decoding of the video stream, color conversion, and the transfer of decoded video data from system RAM to video RAM. By current standards, good playback performance is the decoding and display of video streams of 320 by 240 (or larger) compressed frames at 15 (or greater) frames per second. Software-only video codecs have evolved by modifying and tailoring existing compression methodologies to suit video playback in desktop computers. In this paper we examine the characteristics used to evaluate software-only video codec algorithms, namely: image fidelity (i.e., image quality), bandwidth (i.e., compression), ease-of-decoding (i.e., playback performance), memory consumption, compression to decompression asymmetry, scalability, and delay. We discuss the tradeoffs among these variables and the compromises that can be made to achieve low numerical complexity for software-only playback. Frame-differencing approaches are described since software-only video codecs typically employ them to enhance playback performance. To complement other papers that appear in this session of the Proceedings, we review methods derived from binary pattern image coding since these methods are amenable to software-only playback. In particular, we introduce a novel approach called pixel distribution image coding.
High-performance video codec for CD-ROM-based video playback
Katherine S. Wang, James O. Normile, Hsi-Jung Wu
This paper discusses video compression and decompression strategies for use on general purpose computer systems where no specialized hardware is available. We first examine the alternatives and describe the performance and limitations of the first generation of such methods. A brief description is given of the possible algorithmic approaches. We introduce requirements for the encoder/decoder, and show that a vector quantization based scheme with image preprocessing and classification can provide the required performance. The remainder of the paper deals with the algorithm design and the tradeoffs made to meet the goals of realtime decode capability, compression ratio and image quality. Finally, we present results for the optimized algorithm and indicate areas which appear to be most promising for further work.
Computationally fast wavelet-based video coding scheme
Ayan Sengupta, Michael L. Hilton, Bjorn D. Jawerth
We present a new technique for rapidly evaluating the inverse wavelet transform in video compression applications. The wavelet transform decomposes each pixel in an image into a linear combination of basis functions. Typically, very few pixels in a video sequence change from one image to the next; therefore, very few of the coefficients in the transformed video sequence change from one image to the next. We capitalize on this fact, and speed up the reconstruction of transformed image sequences by computing the inverse transform for only those coefficients that have changed since the previous image. Our prototype software-only video decompressor based on this idea is capable of reconstructing 256 by 256, 8-bits per pixel, greyscale images at a rate of 18 frames per second at a compression ratio of about 22:1.
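The idea in this abstract, recomputing the inverse transform only where coefficients have changed since the previous image, can be sketched as follows. This is an illustrative sketch only: it assumes a single-level 1D Haar transform, in which each coefficient pair affects exactly two pixels. The authors' actual wavelet, number of levels, and data layout are not given in the abstract, and the function names are hypothetical.

```python
def haar_forward(signal):
    """Single-level Haar analysis: pairwise averages and half-differences."""
    avgs = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    diffs = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return avgs, diffs


def haar_inverse(avgs, diffs):
    """Full inverse transform: every pixel pair is recomputed."""
    out = []
    for a, d in zip(avgs, diffs):
        out += [a + d, a - d]
    return out


def update_reconstruction(pixels, prev_coeffs, new_coeffs):
    """Patch `pixels` in place, touching only pairs whose coefficients changed.

    Returns the number of coefficient pairs that had to be inverted.
    """
    (prev_a, prev_d), (new_a, new_d) = prev_coeffs, new_coeffs
    touched = 0
    for i in range(len(new_a)):
        if new_a[i] != prev_a[i] or new_d[i] != prev_d[i]:
            pixels[2 * i] = new_a[i] + new_d[i]
            pixels[2 * i + 1] = new_a[i] - new_d[i]
            touched += 1
    return touched


frame0 = [10, 12, 20, 20, 30, 34, 40, 40]
frame1 = [10, 12, 20, 20, 30, 38, 40, 40]   # only one pixel differs
coeffs0, coeffs1 = haar_forward(frame0), haar_forward(frame1)
pixels = haar_inverse(*coeffs0)              # full reconstruction of frame 0
touched = update_reconstruction(pixels, coeffs0, coeffs1)
# Only one of the four coefficient pairs is inverted for frame 1.
```

With a multi-level transform each coefficient influences a wider support, so the saving depends on how localized the changes are; the sketch shows only the bookkeeping.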
Software codec for personal computers based on the discrete cosine transform
Chris Pitts, J. Mark Beaumont, Saul Cozens, et al.
This paper describes the development of a compatible encoder and decoder pair (codec) for motion video. A study has been undertaken of compression algorithms of low computational complexity, which may be candidates for running in real time on a personal computer (PC). From this study a DCT-based algorithm was selected for further development and implementation on a PC. The code has been implemented in C, with critical routines optimized in assembly language. The encoder/decoder is intended to facilitate video communication applications between personal computers without the need for dedicated compression hardware. A 33 MHz client PC connecting to a remote 66 MHz server PC via a 64 kbit/s dial-up digital link has been demonstrated. Live images grabbed by the server PC are compressed and transmitted to the client PC, requiring about 40 kbit/s of the available 64 kbit/s. For 64-level greyscale images of 144 x 120 pixels, a frame rate of about 10 frames/s is achieved.
Using 4x4 DCTs and moving 4x4 blocks for software-only video decompression
Roger J.F. Wilson
4 x 4 inverse DCTs are computationally twice as efficient as 8 x 8 inverse DCTs. In addition, they create less register pressure, making the implementation more efficient on most processors. This extra computational efficiency places more pressure on improving the other components of the video decompression system. This paper describes a coding scheme for the quantized components which has an efficient decompression algorithm. The implementation of the inverse DCT is examined in detail, including removing all multiplication operations and replacing them with single-CPU-cycle shift-and-add operations. The data flow through a complete 4 x 4 DCT is described such that the results of intermediate 1 x 4 DCTs are written to and read from memory as efficiently as possible. A set of quantizers for the 4 x 4 DCT is presented which allows the decompression process to use shift-and-add instead of multiply. Efficient DCTs by themselves are not enough: the paper discusses the effects of changing the search area (in particular to being non-square) and shows a coding scheme suitable for fast decompression. It also discusses how to organize the block matching and shows that good results can be obtained if some pixels are allowed to match worse than the nominal value.
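Replacing multiplications with shift-and-add operations, as this abstract describes, can be sketched generically. The constants and the power-of-two quantizer step below are assumptions chosen for illustration; the abstract does not list the paper's actual DCT constants or quantizer values, and both function names are hypothetical.

```python
def times_constant_shift_add(x, shifts):
    """Multiply integer x by sum(2**s for s in shifts) using only shifts and adds."""
    total = 0
    for s in shifts:
        total += x << s
    return total


def dequantize_pow2(level, log2_step):
    """Dequantize with a power-of-two step size: a single left shift."""
    return level << log2_step


# 5 = 2**2 + 2**0, so x * 5 becomes (x << 2) + x.
product = times_constant_shift_add(7, (2, 0))
# A quantizer step of 8 = 2**3 turns dequantization into a shift by 3.
value = dequantize_pow2(-3, 3)
```

On the 33 MHz class processors discussed in this session, an integer multiply typically cost several cycles, so choosing constants with few set bits was a practical way to trade a small amount of precision for speed.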
Video Coding Methods and Techniques II
Motion-compensated visual pattern image sequence coding
Barry S. Barnett, Alan Conrad Bovik
This paper presents an improved motion compensated version of the Visual Pattern Image Sequence Coding (VPISC) paradigm. It is a high performance video codec that is easily implemented in software. Software video codecs are not only cheaper, but are more flexible system solutions because they enable multi-vendor computers to exchange encoded video information without requiring on-board protocol compatible video codec hardware. The codec is intended for real-time desktop computer applications like multimedia delivery, and local area network (as well as point-to-point) televideo conferencing. We describe a version of motion compensated VPISC (MCVPISC) that has achieved bit rates of 0.025 bpp or better for MPEG test sequences (source coding has further reduced the bit rate by about half). The computational complexity of the encoder is less than 3 integer operations per pixel. The decoder is bounded between 0.016 and 0.125 logical and integer operations per pixel.
HDCC: a software-based compression algorithm for video conferencing
Henry P. Moreton, Jeannine Smith
In this paper we describe a low cost approach to the compression of sequences of video images. The target application of this algorithm is video conferencing.
Statistical inverse discrete cosine transforms for image compression
Andy C. Hung, Teresa H.-Y. Meng
The Discrete Cosine Transform (DCT) has been applied to image and image sequence compression to decorrelate the picture data before quantization. This decorrelation results in many of the quantized transform coefficients equaling zero, hence the compression gain. For the decoder, the very few, sparsely populated, non-zero transform coefficients can be utilized for great speed-up in the inverse DCT. This paper describes and compares two styles of implementations of fast inverse DCTs for sparse data. The first implementation, which we call the symmetric mapped inverse DCT, is based on the forward mapped inverse DCT, but our implementation is up to three times faster. The second implementation is based on a scaled inverse DCT, with detection of zero values. Both implementations are tested for speed against other algorithms, under varying degrees of DCT coefficient sparseness.
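The speed-up from sparse coefficients can be illustrated with a minimal 1D sketch of the general forward-mapped idea: each non-zero coefficient contributes a scaled copy of its basis vector, and zero coefficients are skipped entirely. This is not the authors' symmetric mapped variant, whose details are not given in the abstract; the 8-point orthonormal DCT-II basis and the function names are assumptions made for illustration.

```python
import math

N = 8
# Orthonormal 8-point DCT-II basis: BASIS[k][n] is basis function k at sample n.
BASIS = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
          * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]


def idct_dense(coeffs):
    """Reference inverse DCT: every coefficient is used for every sample."""
    return [sum(coeffs[k] * BASIS[k][n] for k in range(N)) for n in range(N)]


def idct_sparse(coeffs):
    """Forward-mapped inverse DCT: accumulate basis vectors of non-zero coefficients only."""
    out = [0.0] * N
    for k, c in enumerate(coeffs):
        if c == 0:
            continue  # zero coefficients cost nothing
        row = BASIS[k]
        for n in range(N):
            out[n] += c * row[n]
    return out


coeffs = [16, 0, -3, 0, 0, 0, 0, 0]   # typical sparse block after quantization
dense = idct_dense(coeffs)
sparse = idct_sparse(coeffs)           # same samples, 2 basis vectors instead of 8
```

The work in `idct_sparse` scales with the number of non-zero coefficients, which is why the benefit grows as quantization becomes coarser.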
Video compression by coefficient compensation
Hirohisa Yamaguchi
This paper discusses a new video compression algorithm called Coefficient Compensation that achieves high performance by optimizing the DCT encoding loop structure, and demonstrates its improvement over conventional JPEG or MPEG I/P picture encoding under the condition that motion compensation is not available. The first part of the paper analyzes the performance of the DCT encoding loop structure, the method of optimization and the concept of soft-decision. The second part reports on the simulation results. Encoding performance is tested on various MPEG test sequences compressed at 20 Mbit/s. The difference between JPEG and MPEG I is in the quantization table. For JPEG, the quantization table widely accepted as one producing good compression quality is used. All the encoded binary outputs are fully compatible with the MPEG2 syntax, but an additional 2 bits/block of information is coded as user data for the Coefficient Compensation.
MPEG Implementations
Fast motion estimation algorithm for an MPEG video coder
Eric Chan, Rakeshkumar Gandhi, Sethuraman Panchanathan
In this paper, we propose a reduced-complexity block-matching motion estimation algorithm for an MPEG video coder. This algorithm consists of a layered structure and hence does not converge to a local optimum. Most importantly, it employs a simple matching criterion, namely, the modified pixel difference classification (MPDC), and hence results in reduced computational complexity. The MPEG video coder has been simulated using the proposed layered structure MPDC algorithm (LSA-MPDC). Simulation results indicate that the LSA-MPDC algorithm achieves good performance for both slow and fast moving sequences. In addition, the hardware implementation of the LSA-MPDC algorithm is very simple because of the binary operations used in the matching criterion.
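As background for the matching criterion named in this abstract, here is a minimal sketch of plain pixel difference classification (PDC) with an exhaustive search: each pixel is classified as matching when its absolute difference is within a threshold, and the candidate with the most matching pixels wins. The authors' modifications (the "M" in MPDC) and their layered search structure are not described in the abstract and are not reproduced here; the function names, threshold, and search range are assumptions.

```python
def pdc_score(block, candidate, threshold):
    """Count pixels whose absolute difference is within `threshold`."""
    return sum(1
               for row_a, row_b in zip(block, candidate)
               for a, b in zip(row_a, row_b)
               if abs(a - b) <= threshold)


def best_match(block, frame, top, left, search, threshold):
    """Exhaustive search over displacements in [-search, search].

    Returns (dy, dx, score) for the candidate with the most matching pixels.
    """
    size = len(block)
    best = (0, 0, -1)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > len(frame) or x + size > len(frame[0]):
                continue  # candidate falls outside the reference frame
            cand = [row[x:x + size] for row in frame[y:y + size]]
            score = pdc_score(block, cand, threshold)
            if score > best[2]:
                best = (dy, dx, score)
    return best


# Reference frame with a distinct value at every position.
frame = [[r * 10 + c for c in range(6)] for r in range(6)]
# The current block equals the reference content at row 2, column 3.
block = [row[3:5] for row in frame[2:4]]
motion = best_match(block, frame, top=1, left=2, search=1, threshold=0)
```

Because the per-pixel decision is a single comparison producing one bit, the score is a count rather than a sum of differences, which is what makes this family of criteria attractive for simple hardware.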
Software implementation of MPEG-II video encoding using socket programming in LAN
Yanbin Yu, Dimitris Anastassiou
This contribution examines the parallel characteristics of MPEG video encoding and explores the feasibility of using a group of workstations on a LAN to perform MPEG-II encoding. An analysis of partitioning the MPEG algorithm into several tasks is presented. This is followed by the details of an implementation that uses socket programming as a means to coordinate the workstations. The following conclusions have been reached: (1) Since each slice header has a start code which is byte-aligned, each slice is a good basic unit for parallel processing. This is the case for I-, P-, and B-pictures. (2) All the most computationally expensive operations, such as Motion Estimation and DCT, can be conducted in parallel. (3) The quantization parameter is processed in a sequential manner since it should be derived from the history of the encoding. Computer simulations have been conducted, and the bitstreams generated are the same as those of a sequential encoder and can be decoded by an MPEG-II decoder.
High-performance cross-platform MPEG decoder
Hemant Bheda, Partha Srinivasan
We present a high-performance implementation of an MPEG decoder written entirely in a high-level language. The decoder implementation fully complies with the MPEG-I standard, decodes all (I, P, B) frame types in MPEG video bitstreams, and is portable. Versions of this decoder are implemented on Windows 3.1 and on Windows NT (X86, MIPS, ALPHA). A comparison of the performance of the decoder between the various platforms is made. We present a high-quality, fast dithering and interpolation algorithm used to convert YCbCr directly into 8-bit palettized images. We propose a new method, called Collaborative Compression, of dealing with compression and decompression tasks at a very low cost to achieve 30 fps SIF performance for desktop applications. Collaborative Compression is a systems approach to partitioning the functionality between CPU-centric (i.e. software) and hardware-assist (VLSI) components in order to achieve the optimal cost solution. The CPU provides glue programmability to tie the accelerated and non-accelerated parts of the algorithm together. The advent of high-bandwidth, low-latency buses (VL Bus and PCI) enables a high-speed data pathway between the distributed computational elements.
ISO/IEC software implementation of MPEG-1 video
Chad E. Fogg, Peter Au, Stefan Eckart, et al.
The MPEG-1 video standard, ISO/IEC 11172 Part 2, specifies the syntax and the semantic rules by which bitstreams are generated in the encoder, and the arithmetic rules by which pictures are reconstructed in the decoder. The actual encoder model is left open to the designer to choose among different cost and quality tradeoffs. An example encoder strategy is described in the informative annex D. A technical report giving a full encoder and decoder implementation expressed in the ANSI C programming language will become the fifth part of the ISO/IEC 11172 document. The encoder is based on a test model shaped by participants of the MPEG committee that produces good picture quality while exercising the full video syntax. The decoder employs full arithmetic accuracy at all stages and includes bitstream conformance checks. Finally, a companion systems codec demonstrates the temporal link between the systems and the video layers. To better serve as a learning tool for novices, the code is optimized for clarity rather than execution speed. In addition to an overview of the program, this paper provides a brief description of the encoder model.
MPEG Standards
Overview of the MPEG/audio compression algorithm
Davis Y. Pan
This paper gives a summary of the MPEG/audio compression algorithm. This algorithm was developed by the Moving Picture Experts Group (MPEG) as an International Organization for Standardization standard for the high-fidelity compression of digital audio. The MPEG/audio compression standard is one part of a multiple-part standard that addresses the compression of video (11172-2), the compression of audio (11172-3), and the synchronization of the audio, video, and related data streams (11172-1) at an aggregate bit rate of about 1.5 Mbit/sec. The MPEG/audio standard can also be used for audio-only applications to compress high-fidelity audio data at much lower bit rates. While the MPEG/audio compression algorithm is lossy, it can often provide 'transparent', perceptually lossless compression even with compression factors of 6-to-1 or more. The algorithm works by exploiting the perceptual weaknesses of the human ear. This paper also covers the basics of psychoacoustic modeling and the methods used by the MPEG/audio algorithm to compress audio data with the least perceptible degradation.
MPEG-2 systems
Alexander G. MacInnis
MPEG is a standard which defines data formats for coded (compressed) audio, video, and their combination. The Systems part of this work, which is Part 1, defines the coding and related requirements for providing the combination of audio and video, including key system-level functions. These functions include synchronization of audio and video, multiplexing, clock recovery, guaranteed buffer behavior, program time identification, and many other system-level functions which are required in practice and which are not part of the compression coding of audio and video. MPEG-2 is formally known as ISO 13818, and MPEG-2 Systems is ISO 13818-1. Currently MPEG-2 has the status of a Committee Draft, meaning that it has been published for review by participating national bodies, as part of the procedure for becoming an official International Standard.
Low-Bit-Rate Coding
Status of ITU and ISO/MPEG4 video coding standards at very low bit-rates
Richard Schaphorst, Cliff Reader
The goal of the ISO project, designated MPEG4, is to develop a generic video coding syntax suitable for a wide range of applications such as videophone via the PSTN and mobile radio, security systems, mobile experts, emergency monitoring, educational networks, and networked games. It is anticipated that the coding algorithm will be a significant advancement relative to the basic interframe predictive 8 X 8 DCT design which is used in most digital TV standards today. Examples of advanced coding techniques being considered include fractals, analysis/synthesis, knowledge-based, and semantic coding.
Optimized hybrid transform coding for very low bit rates: videotelephony communication on personal computer
Gerard Eude, Jean-Claude Schmitt
This paper describes a 'very low bitrate visual telephony application' demonstrator which was designed to be used on the Public Switched Telephone Network for many multimedia purposes. This development was done by CNET in coordination with the European COST211ter project with the aim of demonstrating videotelephony at very low bit rates. The main concern was to optimize a video coding algorithm based on the existing CCITT H.261 standard and directly derived from the COST211ter simulation model. The different signals needed for a videotelephony communication (video, speech, data and control) are modulated and transmitted at a bitrate between 9.6 kbit/s and 28.8 kbit/s. The description of the demonstrator is given, including video algorithm and system multiplex specifications. The reasons for the choice of the video format and algorithm are also discussed. A friendly software application has been developed to run videotelephony within a Macintosh computer environment. This program uses the QuickTime routines to record and to play the videophone pictures to or from the hard disk. Single pictures or large sequences can be grabbed to the hard disk. Data can also be transmitted by opening, through the audio/video multiplex, a data channel of a few kbit/s in the video channel, allowing minimal groupware applications.
Transform coding for low-bit-rate applications
Dulce B. Ponceleon, Katherine S. Wang, Hsi-Jung Wu, et al.
Low bit rate image coding at 10 kbit/s and less is a difficult problem and does not appear possible with the current generation of block transform based methods. Current research efforts center around the use of transforms with less objectionable artifacts such as wavelets or model based methods. We examine a method that is transform based but captures specific features of the image to be represented. The transform uses principal component analysis to generate a basis set specific to the particular class of images to be coded. We present results from a transform designed for use in a `talking head' sequence. Significant improvement in reconstructed quality is shown when perceptual weighting is used in generating the basis set. The appendix includes details of computationally efficient methods for deriving the basis set as well as a description of the weighting method.
Video coding with wavelet transform on the very-low-bit-rate communication channel
Seong-Whan Kim, Heung-Kyu Lee
In this paper, we present a moving image coding system which uses the wavelet transform for videophone on a very low bit rate communication channel (10 kbit/s PSTN). There are two requirements for our coding system: good subjective quality at low bit rates and suitability for progressive transmission. To satisfy these requirements, we use multifrequency motion estimation, which estimates motion for each frequency band. After the motion estimation, we can assign fewer bits to the motion estimation error in the high frequency bands because human viewers are insensitive to changes in high frequency components in moving images. The experimental results confirm that our approach outperforms discrete cosine transform coding schemes in terms of subjective quality because there are no blocking effects; moreover, the wavelet transform inherently supports progressive transmission.
Poster Session
Parallel butterfly algorithm and VLSI architectures for image decorrelation
Tinku Acharya, Amar Mukherjee
We present a new high speed parallel architecture and its VLSI implementation for special purpose hardware that performs real-time lossless image compression/decompression using a decorrelation scheme. The proposed architecture can easily be implemented using state-of-the-art VLSI technology. The hardware yields a high compression rate. A prototype 1-micron VLSI chip based on this architectural idea has been designed. The scheme compares favorably with the JPEG baseline lossless image compression schemes. We also discuss the parallelization issues of the JPEG baseline standard still compression schemes and their difficulties.
Motion estimation optimization in a MPEG-1-like video coding scheme for low-bit-rate applications
Miguel Roser, Paulo Villegas
In this paper we present work based on a coding algorithm for visual information that follows the International Standard ISO/IEC 11172, `Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s', widely known as MPEG-1. The main intention in the definition of the MPEG-1 standard was to provide a large degree of flexibility so that it could be used in many different applications. The aim of this paper is to adapt the MPEG-1 scheme for low bitrate operation and optimize it for special situations, for example a talking head with little movement, which is a common situation in videotelephony applications. A previously developed, adapted, and compatible MPEG-1 scheme able to operate at p × 8 kbit/s is used in this work. Because we seek a low complexity scheme, and because the most expensive step in terms of computer time is the motion estimation (ME) process (almost 80% of the total computer time is spent on ME), we present an improvement of the motion estimation module based on a new search pattern.
Video encoding using global block matching
Gary K. Arakaki
Since motion compensation is computationally expensive, the search for matching blocks is usually restricted to a small region. In this paper, an algorithm for searching all blocks in all previous frames in practical time is described.
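For orientation, the conventional restricted search that this abstract contrasts itself with can be sketched as follows. This is not the paper's global block matching algorithm, which the abstract does not detail; it is a minimal full-search block matcher over a small window, using the sum of absolute differences (SAD) as an assumed cost measure. The function names, the block size `n`, the search `radius`, and the synthetic frames are all hypothetical choices made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, n=4, radius=2):
    """Exhaustively test every displacement within +/- radius pixels and
    return ((dx, dy), cost) for the lowest-cost candidate."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Synthetic 12 x 12 frames (illustrative data only): the current frame shows
# the reference content displaced by 2 pixels horizontally and 1 vertically.
ref = [[7 * x + 13 * y for x in range(12)] for y in range(12)]
cur = [[7 * (x + 2) + 13 * (y + 1) for x in range(12)] for y in range(12)]

vector, cost = best_motion_vector(cur, ref, bx=4, by=4)
print(vector, cost)
```

The cost of this search grows with the square of `radius` for every block, which is why practical coders keep the window small.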