Proceedings Paper

Spatial domain entertainment audio decompression/compression
Author(s): Y. K. Chan; Ka Him Kevin Tam

Paper Abstract

The ARMv7 NEON processor with a 128-bit SIMD hardware accelerator requires a peak performance of 13.99 mega cycles per second for MP3 stereo entertainment-quality decoding. For a similar compression bit rate, OGG and AAC are preferred over MP3. The Patent Cooperation Treaty (PCT) application dated 28 August 2012 describes an audio decompression scheme that produces a sequence of interleaved “min to Max” and “Max to min” rising and falling segments. The number of interior audio samples bounded by a “min to Max” or “Max to min” pair can be {0|1|…|N}. The sample magnitudes, including the bounding min and Max, are distributed as normalized constants between 0 and 1 relative to the bounding magnitudes. The decompressed audio is then a “sequence of static segments” on a frame-by-frame basis. Some of these frames need to be post-processed to boost high-frequency content. The post-processing is neutral with respect to compression efficiency, and the additional decoding complexity is only a small fraction of the overall decoding complexity, with no need for extra hardware. Compression efficiency is expected to be very high, because the source audio has been decimated and converted to a set of data with only "segment length and corresponding segment magnitude" attributes. The PCT application describes how these two attributes are efficiently coded by its coding scheme. The PCT decoding efficiency is very high and decoding latency is essentially zero. Both the hardware requirement and the run time are at least an order of magnitude better than those of MP3 variants. A side benefit is ultra-low power consumption on mobile devices. The acid test of whether such a simplistic waveform representation can reproduce authentic decompressed quality is a benchmark against OGG (aoTuV Beta 6.03) using three pairs of stereo audio frames and one broadcast-like voice audio frame, with each frame consisting of 2,028 samples at a 44,100 Hz sampling frequency.
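The segment representation described in the abstract can be illustrated with a minimal Python sketch. It is not the PCT coding scheme, whose details the abstract does not give. It assumes that each segment is defined by its two bounding extrema plus a count of interior samples, and that the normalized constants are evenly spaced between 0 and 1. The function names `linear_profile` and `reconstruct` and the evenly spaced profile are hypothetical choices made for illustration only.

```python
from typing import List, Sequence


def linear_profile(n_interior: int) -> List[float]:
    """Hypothetical normalized constants: evenly spaced values strictly
    between 0 and 1. The actual constants used by the PCT scheme are not
    stated in the abstract; this is an assumption."""
    return [(i + 1) / (n_interior + 1) for i in range(n_interior)]


def reconstruct(extrema: Sequence[float],
                interior_counts: Sequence[int]) -> List[float]:
    """Rebuild a waveform from alternating min/Max magnitudes.

    extrema         -- magnitudes of successive turning points
    interior_counts -- number of interior samples in each segment
                       (one entry per consecutive pair of extrema)
    """
    if len(interior_counts) != len(extrema) - 1:
        raise ValueError("need one interior count per segment")
    samples = [extrema[0]]
    for start, end, n in zip(extrema, extrema[1:], interior_counts):
        # Interior samples sit at normalized positions between the bounds.
        for w in linear_profile(n):
            samples.append(start + w * (end - start))
        samples.append(end)
    return samples


if __name__ == "__main__":
    # One rising segment with 1 interior sample, then one falling
    # segment with 3 interior samples.
    print(reconstruct([0.0, 1.0, -1.0], [1, 3]))
```

In this sketch, decoding costs one multiply-add per interior sample and needs only the "segment length and corresponding segment magnitude" attributes as input, which is consistent with the low decoding complexity the abstract claims.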

Paper Details

Date Published: 18 February 2014
PDF: 15 pages
Proc. SPIE 9030, Mobile Devices and Multimedia: Enabling Technologies, Algorithms, and Applications 2014, 90300C (18 February 2014); doi: 10.1117/12.2038142
Y. K. Chan, City Univ. of Hong Kong (Hong Kong, China)
Ka Him Kevin Tam, Univ. of Hong Kong (Hong Kong, China)

Published in SPIE Proceedings Vol. 9030:
Mobile Devices and Multimedia: Enabling Technologies, Algorithms, and Applications 2014
Reiner Creutzburg; David Akopian, Editor(s)

© SPIE.