Share Email Print

Proceedings Paper

Cepstral domain modification of audio signals for data embedding: preliminary results
Author(s): Kaliappan Gopalan
Format Member Price Non-Member Price
PDF $17.00 $21.00

Paper Abstract

A method of embedding data in an audio signal using cepstral domain modification is described. Based on successful embedding in the spectral points of perceptually masked regions in each frame of speech, first the technique was extended to embedding in the log spectral domain. This extension resulted at approximately 62 bits /s of embedding with less than 2 percent of bit error rate (BER) for a clean cover speech (from the TIMIT database), and about 2.5 percent for a noisy speech (from an air traffic controller database), when all frames - including silence and transition between voiced and unvoiced segments - were used. Bit error rate increased significantly when the log spectrum in the vicinity of a formant was modified. In the next procedure, embedding by altering the mean cepstral values of two ranges of indices was studied. Tests on both a noisy utterance and a clean utterance indicated barely noticeable perceptual change in speech quality when lower range of cepstral indices - corresponding to vocal tract region - was modified in accordance with data. With an embedding capacity of approximately 62 bits/s - using one bit per each frame regardless of frame energy or type of speech - initial results showed a BER of less than 1.5 percent for a payload capacity of 208 embedded bits using the clean cover speech. BER of less than 1.3 percent resulted for the noisy host with a capacity was 316 bits. When the cepstrum was modified in the region of excitation, BER increased to over 10 percent. With quantization causing no significant problem, the technique warrants further studies with different cepstral ranges and sizes. Pitch-synchronous cepstrum modification, for example, may be more robust to attacks. In addition, cepstrum modification in regions of speech that are perceptually masked - analogous to embedding in frequency masked regions - may yield imperceptible stego audio with low BER.

Paper Details

Date Published: 22 June 2004
PDF: 11 pages
Proc. SPIE 5306, Security, Steganography, and Watermarking of Multimedia Contents VI, (22 June 2004);
Show Author Affiliations
Kaliappan Gopalan, Purdue Univ./Calumet (United States)

Published in SPIE Proceedings Vol. 5306:
Security, Steganography, and Watermarking of Multimedia Contents VI
Edward J. Delp III; Ping W. Wong, Editor(s)

© SPIE. Terms of Use
Back to Top
Sign in to read the full article
Create a free SPIE account to get access to
premium articles and original research
Forgot your username?