Proving video stream watermarking viability

Information theory models show that robust watermarking can protect streaming video against piracy
08 July 2008
Mihai Mitrea, Sorin Duta, and Françoise Prêteux

From digital montage to DVD, video on demand, and direct television, video data needs protection against theft. During production, data is mainly protected by restricted access radio frequency identification, but during distribution it becomes much more vulnerable. Cryptography-based solutions quickly reach their limits: after decryption, which is necessary to view the movie, anyone may copy and redistribute it.

Therefore, many researchers have looked to watermarking,1 which aims to create imperceptible modifications to the video for content tracking. To be effective, the inserted information must be robust against any attack by a video pirate, that is, any change intended to render the tracking information undetectable. Traditionally, the watermark is inserted either into the pixels or into a transform of the frames. For compressed formats such as the Moving Picture Experts Group-4 Advanced Video Coding (MPEG-4 AVC) stream, however, such an approach is inappropriate because it requires decoding and re-encoding the data. These operations are time-consuming and do not suit the real-time constraints of applications like video on demand or direct television.

Thus, new research has focused on directly embedding the watermark into the compressed stream. However, direct embedding presents a challenge because compression aims to eliminate visual redundancy while watermarking exploits redundancy to hide tracking information.

We evaluated whether MPEG-4 AVC watermarking is viable. To do so, we determined the maximum amount of information that can be inserted into an original video for a given transparency and robustness. We then tested whether this watermarking capacity was large enough for property-rights applications.

Any watermarking technique can be represented as a noisy channel (see Figure 1).1 The mark, a sample from the information source, is encoded using a secret key. It is then transmitted through a channel with noise sources from both the original video and the attacks. A trusted entity should be able to recover the tracking information using only the video data and the key. In this framework, the watermarking capacity is the capacity of the corresponding noisy channel.

This general model can be applied to MPEG-4 AVC streaming. For compressed domain applications, the mark is likely to be inserted in the quantization indices of discrete cosine transform (DCT) coefficients. These coefficients are computed on the prediction errors in the intra-coded (I) frame macroblocks.2 Since these indices are integer values, a discrete noisy channel can model the watermarking process. We consider 15 such channels, one for each alternating current (AC) frequency.


Figure 1. Watermarking as an information channel. A mark is created using a secret key and copyright information (in this example, ARTEMIS, the visual logo of our department). The original video first undergoes a transformation T, then the signal is embedded in the original video. The resulting video is then transmitted to the recipient, where it may be retransmitted or suffer attacks. A trusted entity should be able to recover the logo from a marked video using only the secret key.

First, we assessed transparency3 using several objective measures: peak signal-to-noise ratio (PSNR), digital video quality, absolute average difference, peak mean square error, image fidelity, structural content, correlation quality, and normalized cross correlation. We determined the largest additive alteration of the quantization indices that still produces visually insignificant changes. In a 192×86 pixel frame, this corresponds to {-2,-1,+1,+2} on a maximum of 50 macroblocks (see Figure 2). Together with 0 (no alteration), these values give an input information source with five symbols.
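Of the transparency measures listed above, PSNR is the one plotted in Figure 2. The following is a minimal sketch of how it can be computed, not the authors' evaluation code. It assumes 8-bit frames (peak value 255) stored as lists of pixel rows; the function name `psnr` and the sample values are illustrative.

```python
import math

def psnr(original, modified, peak=255.0):
    """PSNR in dB between two equally sized frames given as lists of pixel rows.

    Assumption: 8-bit samples, so the peak value defaults to 255.
    """
    squared_error = 0.0
    count = 0
    for row_a, row_b in zip(original, modified):
        for a, b in zip(row_a, row_b):
            squared_error += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames are empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: no distortion at all
    return 10.0 * math.log10(peak ** 2 / mse)

# Illustrative 2x2 frames: one pixel differs by one grey level.
frame = [[100, 100], [100, 100]]
marked = [[101, 100], [100, 100]]
print(round(psnr(frame, marked), 2))
```

A higher PSNR means the marked frame is closer to the original, which is why it can serve as a transparency threshold.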

Next, we evaluated the robustness4 of the watermarking against several real-life attacks: Gaussian filtering, sharpening, StirMark random bending, and small rotations. Guided by information theory, we estimated the noise matrices for each attack and for each of the 15 AC coefficients.
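A noise matrix of this kind holds the probability that each quantization index value is turned into each other value by an attack. One simple way to estimate it is by relative frequency over paired observations, sketched below. The article does not describe the authors' estimator, so the function name, the handling of out-of-range values, and the uniform fallback row are all assumptions.

```python
def estimate_noise_matrix(sent, received, alphabet):
    """Estimate P(received = y | sent = x) by relative frequency.

    sent, received: paired index values before and after an attack.
    alphabet: ordered list of the index values to model.
    Returns a row-stochastic matrix as a list of lists (rows follow alphabet).
    """
    if len(sent) != len(received):
        raise ValueError("sent and received must have the same length")
    position = {value: i for i, value in enumerate(alphabet)}
    size = len(alphabet)
    counts = [[0] * size for _ in range(size)]
    for x, y in zip(sent, received):
        # Assumption: pairs with a value outside the alphabet are ignored.
        if x in position and y in position:
            counts[position[x]][position[y]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Assumption: an input value never observed gets a uniform row.
            matrix.append([1.0 / size] * size)
        else:
            matrix.append([c / total for c in row])
    return matrix

# Illustrative data: index 0 survives three times out of four.
print(estimate_noise_matrix([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1], [0, 1]))
```

In the setting described above, one such matrix would be estimated per attack and per AC coefficient.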


Figure 2. The peak signal-to-noise ratio (PSNR) of the modified video versus the spatial frequency provides one measure of watermark transparency. The following modifications are considered: 1 macroblock per frame (◊), 10 macroblocks per frame (▴), 50 macroblocks per frame (X), and 100 macroblocks per frame (Δ). AC: alternating current.

Finally, we computed the capacity of these channels.5 As Figure 1 shows, the original MPEG stream is completely known at insertion and unknown at detection.6,7,8 Thus, we chose to use the non-causal side information channel model. It is represented by a set of noise matrices, one for each state (a state corresponds to a value of a quantization index in the unmarked video). The number of input information source symbols (five in our application) determines the number of rows in these matrices. We obtain each matrix by selecting, from the attack matrix, the five rows centered on the corresponding state. Figure 3 shows the watermarking capacity for four different attacks.
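The row selection described above can be sketched as follows. For illustration, the sketch also computes the capacity of one resulting five-input channel with the Blahut-Arimoto algorithm. This is a deliberate simplification: Blahut-Arimoto gives the capacity of an ordinary discrete memoryless channel for a single fixed state, not the capacity of the non-causal side information channel that the authors use. All function names are hypothetical.

```python
import math

def rows_centered_on(attack_matrix, state_row, n_symbols=5):
    """Return the n_symbols rows of attack_matrix centered on state_row."""
    half = n_symbols // 2
    start, stop = state_row - half, state_row + half + 1
    if start < 0 or stop > len(attack_matrix):
        raise ValueError("state is too close to the edge of the attack matrix")
    return [list(row) for row in attack_matrix[start:stop]]

def blahut_arimoto(channel, iterations=500, tol=1e-12):
    """Capacity in bits per symbol of a discrete memoryless channel.

    channel[x][y] is P(y | x); every row must sum to 1.
    """
    n_in, n_out = len(channel), len(channel[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution
    capacity = 0.0
    for _ in range(iterations):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * channel[x][y] for x in range(n_in)) for y in range(n_out)]
        # Divergence (in nats) between each row and the output distribution.
        d = [sum(channel[x][y] * math.log(channel[x][y] / q[y])
                 for y in range(n_out) if channel[x][y] > 0 and q[y] > 0)
             for x in range(n_in)]
        weights = [p[x] * math.exp(d[x]) for x in range(n_in)]
        total = sum(weights)
        capacity = math.log(total)       # lower bound on the capacity
        gap = max(d) - capacity          # upper bound minus lower bound
        p = [w / total for w in weights]
        if gap < tol:
            break
    return capacity / math.log(2)        # convert nats to bits

# Illustrative 7x7 noiseless attack matrix; the state sits on row 3.
identity = [[1.0 if i == j else 0.0 for j in range(7)] for i in range(7)]
state_channel = rows_centered_on(identity, 3)
print(blahut_arimoto(state_channel))     # equals log2(5) for a noiseless channel
```

With a real attack matrix, the selected rows would no longer be noiseless and the computed value would fall below log2(5), the upper limit for a five-symbol input.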

The results demonstrate that 1–5 bits of information can be inserted into an I frame while keeping a signal-to-noise ratio larger than 30 dB and withstanding geometric attacks. As an example, it would take about 20–100 I frames to encode an International Standard Audiovisual Number (ISAN) that is robust to such attacks.
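The frame count follows from dividing the payload size by the capacity per I frame. The sketch below assumes a 96-bit ISAN payload. That figure is not stated in the article; it is an assumption chosen because it is consistent with the quoted range. The function name is illustrative.

```python
import math

def i_frames_needed(payload_bits, bits_per_i_frame):
    """Smallest number of I frames that can carry payload_bits."""
    if payload_bits <= 0 or bits_per_i_frame <= 0:
        raise ValueError("both arguments must be positive")
    return math.ceil(payload_bits / bits_per_i_frame)

ISAN_BITS = 96  # assumption, not taken from the article

print(i_frames_needed(ISAN_BITS, 5))  # best case: 5 bits per I frame
print(i_frames_needed(ISAN_BITS, 1))  # worst case: 1 bit per I frame
```

Under this assumption the two calls give 20 and 96 frames, in line with the "about 20–100" range in the text.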


Figure 3. Watermarking capacity versus frequency for four attacks. The capacity values are given in bits per symbol.

Our study shows that MPEG-4 AVC watermarking has the potential to address copyright protection problems. Our capacity computation will help establish the functional area of in-band enriched multimedia.9 Future work will focus on devising a system that watermarks at close to capacity. In addition, we may expand our research to consider generic robustness against multiple or unknown attacks.


Mihai Mitrea, Sorin Duta, Françoise Prêteux 
ARTEMIS
Institut TELECOM
Evry, France

Mihai Mitrea is an associate professor at Institut TELECOM/TELECOM & Management SudParis, in the Advanced Research and Techniques for Multidimensional Imaging Systems (ARTEMIS) department. He received his master's degree and doctorate in electronics, telecommunications, and information engineering from the Polytechnic University, Bucharest, in 1997 and 2003, respectively. His scientific interests include information theory, random processes, and statistics with applications in digital image and natural language modelling, protection, and transmission.

Sorin Duta received a bachelor's degree in communications from the Polytechnic University of Bucharest, Faculty of Electronics, Telecommunications and Information Technology, in July 2005. He is currently a doctoral student at the Institut TELECOM/TELECOM & Management SudParis, ARTEMIS Department. His research interests include information theory and random processes as applied to visual data protection, with a special emphasis on watermarking.

Françoise Prêteux is a professor at the Institut TELECOM/TELECOM & Management SudParis and head of the ARTEMIS Department. She is an international visual communication expert and the French representative of the ISO multimedia standardization organisation. Her scientific interests focus on multimedia coding, indexing, and protection. She holds a master's degree from the École Nationale Supérieure des Mines de Paris and a doctorate in mathematics from the Pierre and Marie Curie University in Paris.

