Image compression handles large three-dimensional datasets

Image compression using the bitstream properties of rate scalability, resolution scalability, and random access decodability enables rapid access, retrieval, and transmission of large 3D datasets.
18 April 2006
William Pearlman

The collection and use of three-dimensional data has increased tremendously due to rapid improvements in the precision and accuracy of measuring instruments and increases in the storage capacity and computational power of computers. Scientists, engineers and physicians are collecting larger, higher-resolution tomographic data than ever before and using it for increasingly sophisticated and complex analysis tasks. Volume computed tomography and magnetic resonance imaging in medicine are obvious examples. Another is the hyperspectral data collected by the Airborne Visible InfraRed Imaging Spectrometer (AVIRIS), which measures spectral reflectances and irradiances of a ground area 614 by 512 samples in 224 contiguous wavelength bands. Also, materials scientists analyze material properties from digital three-dimensional microstructures. These are essentially tomographic measurements of certain parameters in material samples of micrometer dimensions.

Despite their varied uses and applications, these datasets have several things in common: they are three-dimensional, very large, and very expensive to collect. The difficulties of storing this data have been alleviated to some degree by advances in storage technology and the consequent reduction in cost for such storage. However, the limitations on transmission rate imposed by available bandwidth and the large size of the datasets continue to cause severe problems in their transmission and retrieval. Hence there is a compelling need to compress the data and access only the portions that are of interest to the user.

Our research is focused on three-dimensional image compression methods in which the compressed bitstream supports resolution scalability, rate scalability, and random access capability. The data can be rewritten as integers of the required precision and treated as a 3D image, since each integer value is associated with a point in a bounded 3D coordinate space.
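
As a concrete illustration of this first step, the following minimal Python sketch (not taken from our coder) uniformly quantizes floating-point measurements to signed 16-bit integers; the quantization step size, chosen here arbitrarily, sets the precision that the subsequent coding preserves.

import numpy as np

def to_integer_volume(data, step=0.001, dtype=np.int16):
    """Uniformly quantize a 3D array of measurements to signed integers.

    `step` sets the retained precision; a smaller step preserves more detail
    but enlarges the integer range the coder must represent.
    """
    q = np.round(np.asarray(data, dtype=np.float64) / step)
    info = np.iinfo(dtype)
    if q.min() < info.min or q.max() > info.max:
        raise ValueError("chosen precision does not fit the target integer type")
    return q.astype(dtype)

# Example: a synthetic volume of physical measurements becomes a 3D integer image.
volume = np.random.default_rng(0).normal(size=(64, 64, 32))
integer_image = to_integer_volume(volume, step=0.001)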

Our goal is to be able to store a large dataset as a compressed 3D image on a server at Rensselaer Polytechnic Institute in Troy, NY. A remote user at the Naval Research Laboratory in Washington, DC, would then be able to access this image via the Internet and browse through the data, which is visualized in an adjustable-size window that moves at his command. He may view a selected volume region at various angles and resolutions, with quality increasing progressively until he decides to move to another region, or he may select that region for remote analysis or download.

Such a scenario is easily realized when an uncompressed image can be read into memory and transmitted in a reasonable time. However, when images are so large that they exceed the system's memory and transmission capacity, compression with special bitstream characteristics is a necessity. These characteristics are resolution scalability, fine-grain rate scalability, and random-access decodability. The server needs to keep in storage a compressed bitstream (codestream) that will decode to a lossless or nearly lossless reconstruction, one that recovers the original data exactly or with a predetermined acceptable level of error. Only a relatively small portion of the codestream must be read into memory and decoded for the user to accomplish the tasks described in the above scenario. The user will be given the decoder program; thus, for download of the selected image region, only the corresponding portion of the codestream must be transmitted.
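
To make this concrete, here is a minimal Python sketch, with a hypothetical segment layout rather than our actual codestream syntax, of how a server-side index over resolution levels and quality layers lets a client fetch only the bytes it needs for a chosen resolution and bit budget.

from dataclasses import dataclass

@dataclass
class Segment:
    resolution: int   # 0 = coarsest resolution level
    layer: int        # successive quality (rate) layers within a resolution
    offset: int       # byte offset into the stored codestream
    length: int       # segment length in bytes

def select_segments(index, max_resolution, byte_budget):
    """Return (offset, length) pairs to read for a target resolution and budget."""
    chosen, used = [], 0
    # Layer-major order gives progressively increasing quality across all
    # resolutions up to the target; a finer-grained embedded stream could also
    # be truncated inside the last segment.
    for seg in sorted(index, key=lambda s: (s.layer, s.resolution)):
        if seg.resolution > max_resolution:
            continue
        if used + seg.length > byte_budget:
            break
        chosen.append((seg.offset, seg.length))
        used += seg.length
    return chosen

# Example: a toy index of 4 resolutions x 3 layers, each segment 250 kB.
index, off = [], 0
for layer in range(3):
    for res in range(4):
        index.append(Segment(res, layer, off, 250_000))
        off += 250_000
ranges = select_segments(index, max_resolution=1, byte_budget=1_000_000)

Only the returned byte ranges need to be read from storage and sent to the user's decoder; the rest of the codestream stays on disk.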

Only a few compression methods simultaneously support resolution scalability and fine-grain rate scalability. For natural resolution scalability, we turn to algorithms that encode the wavelet transform of the image. JPEG2000 has the required properties in its Part I two-dimensional mode, but not in its Part II multi-component mode: the latter uses a three-dimensional transform, but encodes the spatial transform slices separately with the Part I encoder. The well-known SPIHT1 and SPECK2 algorithms also lack the desired functionality as originally presented, but the search path for so-called significant coefficients can be modified to achieve it. The aim of our recent and current research is to build simultaneous resolution scalability, fine-grain rate scalability, and random-access decodability into three-dimensional SPIHT and SPECK algorithms. These algorithms are true 3D coders, as they encode together groups of coefficients belonging to all three dimensions. Two such successful efforts, using 3D-SPECK3 and 3D-SBHP4 (a variant of SPECK that encodes the transform in small cubic units), were recently published. Here, we present examples of some results with 3D-SPECK on hyperspectral image data.
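
The significance search at the heart of SPECK-style coders can be pictured with a toy Python sketch: a set of coefficients is significant at bit plane n if any magnitude reaches 2^n; an insignificant set is coded with a single bit, while a significant set is split into its sub-blocks (octants in 3D). This shows only the set-partitioning idea; the published 3D-SPECK algorithm also maintains lists of sets and pixels and adds refinement passes.

import numpy as np

def significant(block, n):
    """A set is significant at bit plane n if any |coefficient| >= 2**n."""
    return bool(np.max(np.abs(block)) >= (1 << n))

def code_set(block, n, bits):
    """Emit sorting-pass bits for one set at bit plane n (octree splitting)."""
    bits.append(int(significant(block, n)))
    if not bits[-1]:
        return                                  # one '0' bit covers the whole set
    if block.size == 1:
        bits.append(int(block.flat[0] < 0))     # sign bit for a newly significant pixel
        return
    # Split each dimension in half and code the (up to eight) sub-blocks.
    zs, ys, xs = [np.array_split(np.arange(s), 2) for s in block.shape]
    for z in zs:
        for y in ys:
            for x in xs:
                sub = block[np.ix_(z, y, x)]
                if sub.size:
                    code_set(sub, n, bits)

coeffs = np.random.default_rng(1).integers(-100, 100, size=(8, 8, 4))
nmax = int(np.floor(np.log2(np.max(np.abs(coeffs)))))
bits = []
code_set(coeffs, nmax, bits)                    # bits for the top bit plane only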

We performed coding experiments on a signed 16-bit reflectance AVIRIS image volume, Jasper scene 1. In this experiment, we extracted the 512 × 512 upper left corner, so that the dimensions of the image volume were 512 × 512 × 224 pixels (samples). The volume was highly compressed at full scale. Every result reported was obtained by decoding part of the compressed file.

To quantify fidelity, coding performance is reported as bit rate in bits per pixel per band (bpppb) versus root mean square error (RMSE) calculated over the whole sequence. For reference, the RMS value of this scene is 1567.
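
Both reported quantities follow the standard definitions; the short sketch below (not our evaluation code) shows how they are computed.

import numpy as np

def rmse(original, reconstructed):
    """Root mean square error over the whole volume."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return float(np.sqrt(np.mean(diff * diff)))

def bpppb(compressed_bytes, rows, cols, bands):
    """Bit rate in bits per pixel per band."""
    return 8.0 * compressed_bytes / (rows * cols * bands)

# For the 512 x 512 x 224 Jasper volume, a rate of 1 bpppb corresponds to a
# codestream of 512 * 512 * 224 / 8 = 7,340,032 bytes.
print(bpppb(7_340_032, 512, 512, 224))   # -> 1.0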

The RMSE values for a variety of bit rates for the whole dataset are listed in Table 1. The values listed for low-resolution image sequences are calculated with respect to a reference image generated by the same analysis filter bank and synthesized to the same scale. For a given resolution, the RMSE decreases as more bits are decoded. The corresponding bit budgets, that is, the numbers of bits that must be accessed and decoded for each resolution and bit rate, are provided in Table 2.

Table 1. Using scalable 3D-SPECK, as more bits are decoded for a given resolution, the corresponding RMSE decreases.

Table 2. Corresponding bit budgets for Table 1. At the higher bit rates, the dynamic memory and computational cost of decoding decrease significantly from one resolution level to the one below it.

We can see that the dynamic memory and computational cost of decoding decrease significantly from one resolution level to the one below it at the higher bit rates. For lower bit rates, a preponderance of the bits resides at the lower resolutions, so the reductions in bit budget are much smaller and sometimes absent.
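
The effect can be pictured with a resolution-ordered codestream: the bit budget for a target resolution is the sum of the bits stored for that level and all coarser ones. The numbers below are invented for illustration and are not taken from Table 2.

# Hypothetical per-level allocations (coarsest -> finest), in bits.
bits_per_level = {0: 0.9e6, 1: 2.1e6, 2: 7.5e6, 3: 24.0e6}

def budget(max_level):
    """Bits that must be accessed and decoded to reach `max_level`."""
    return sum(b for lvl, b in bits_per_level.items() if lvl <= max_level)

for r in sorted(bits_per_level):
    print(r, budget(r))   # dropping the finest levels removes most of the bits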

Regions can also be selected from a display of the image at a user-specified resolution and quality. Similar results are obtained with smaller bit budgets appropriate to the size and variability of the region. Due to space limitations, we have not shown the random access capability here.
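
Region selection rests on the locality of the wavelet transform: the coefficients that influence a chosen spatial region shrink by a factor of two per decomposition level, so only a small, computable set of code blocks must be located and decoded. A rough Python sketch follows; the padding margin is an assumption that depends on the filter support, which is not specified here.

def region_coeff_ranges(region, level, margin=4):
    """Coefficient index ranges (per subband) covering `region` after `level` splits.

    `region` is ((z0, z1), (y0, y1), (x0, x1)); `margin` pads for the wavelet
    filter support.
    """
    s = 2 ** level
    return tuple((max(a // s - margin, 0), -(-b // s) + margin) for a, b in region)

# Ranges needed at three successive levels for a 16 x 64 x 64 voxel region.
for lvl in (1, 2, 3):
    print(lvl, region_coeff_ranges(((32, 48), (128, 192), (128, 192)), lvl))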

Much research remains to be done, particularly to obtain faster and more efficient compression and decompression of data. To date, we have chosen to explore algorithms like SPIHT, SPECK, and SBHP because of their low complexity and potential for fast execution. The wavelet transform is in fact the slowest part of these coding systems, so it could be implemented in hardware to improve the speed of execution. To achieve extremely fast encoding and decoding, we have created a low-complexity method called Progres5 that forgoes rate scalability, since rate scalability requires several coding passes through the bit planes of the (significant) wavelet coefficients and encoding of the size of the full compressed bitstream.
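
For reference, the transform front end that feeds all of these coders takes only a few lines; the sketch below uses PyWavelets with a biorthogonal filter ('bior4.4') as a stand-in for whatever filter bank a particular system employs (a lossless system would use an integer-to-integer transform instead).

import numpy as np
import pywt

# A synthetic stand-in for a 3D image volume of integer samples.
volume = np.random.default_rng(2).integers(-2000, 2000, size=(64, 64, 32)).astype(np.float64)

# Three levels of separable dyadic decomposition along all three axes.
coeffs = pywt.wavedecn(volume, wavelet='bior4.4', level=3)

# coeffs[0] is the coarsest approximation subband; coeffs[1:] are dictionaries
# of detail subbands ('aad', 'ada', ..., 'ddd') at each finer level.
reconstructed = pywt.waverecn(coeffs, wavelet='bior4.4')
print(np.max(np.abs(reconstructed - volume)))   # near machine precision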

Images and datasets continue to grow rapidly, so the need for efficient compression, resolution scalability, rate scalability, and random-access decodability will only become more pressing. We are continuing to investigate various approaches to solving the problems associated with large datasets.


Authors
William Pearlman
Electrical, Computer & Systems Engineering, Rensselaer Polytechnic Institute
Troy, NY
William A. Pearlman is Professor of Electrical, Computer and Systems Engineering and Director of the Center for Image Processing Research (CIPR) at Rensselaer Polytechnic Institute. He is a Fellow of the IEEE and a Fellow of SPIE. He has authored or co-authored about 200 publications in the field of image and video compression and information theory. In addition, he was general chair of SPIE's Visual Communications and Image Processing conference in 1989 and has served on the technical committee and chaired many sessions of that conference since its inception in 1986.
