Defense & Security

Privacy-protection technology for video surveillance

Region-based video scrambling technology enables sensitive visual information to be concealed while retaining scene meaning.
3 August 2006, SPIE Newsroom. DOI: 10.1117/2.1200606.0279

Terrorist threats and high rates of criminal behavior in urban areas guarantee that security will remain a major public concern. Video surveillance is becoming ubiquitous, with systems widely deployed at strategic locations in airports, banks, public transportation, and city centers. However, their widespread use raises the specter of an invasive ‘Big Brother’ society. In addition, video surveillance is subject both to abuse by unscrupulous operators with criminal or voyeuristic aims and to institutional abuse for discriminatory purposes. These legitimate concerns frequently slow the deployment of surveillance systems.

Recent work has addressed privacy concerns.1–5 The system developed by Senior et al.1 renders a modified image, based on end-user access-control authorizations, with blanked-out areas and privacy-sensitive details removed. Similarly, Fidaleo et al.2 introduced a software architecture that utilizes privacy filters to prevent access to information or to transform incoming sensor data. Newton et al.3 address the threat associated with face-recognition techniques by ‘de-identifying’ faces in a way that preserves many facial characteristics, while Boult4 introduces a cryptographic technique to obscure faces, preserving the privacy of subjects under surveillance. The latter process is invertible for authorized personnel in possession of the necessary encryption keys. Finally, the technique developed by Martinez-Ponte et al.5 performs face detection and downshifts the corresponding regions to the lowest quality layer of the codestream, so that restricting the transmission bandwidth reduces the image quality of those regions.

In our work,6,7 we propose a region-based transform-domain scrambling technique. First, the video data are analyzed to identify regions of interest (ROI), such as faces or license plates, that are likely to contain privacy-sensitive information. These regions are then scrambled to conceal their content. The approach is generic and can be applied to any transform-coding technique, such as those based on the discrete cosine transform (DCT) or the discrete wavelet transform (DWT). More specifically, scrambling is performed in the transform domain by pseudo-randomly flipping the sign of transform coefficients during encoding. This flexible method enables the level of distortion to be adjusted, from mild fuzziness to complete noise. As a consequence, the scene can be understood even though individuals in it cannot be identified. Scrambling depends upon a private encryption key and is fully reversible. The key can be entrusted to law-enforcement authorities or to third parties, who thereby become the sole agents able to authorize unlocking and viewing the scene in clear.
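The sign-flipping idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name scramble_signs, the integer key, and the use of Python's random module as the pseudorandom generator are all assumptions made for illustration, and that generator is not cryptographically secure.

```python
import random

def scramble_signs(coeffs, key):
    """Flip the sign of each coefficient for which a key-seeded PRNG draws a 1.

    Hypothetical helper for illustration only. random.Random is a stand-in
    generator and is NOT cryptographically secure.
    """
    rng = random.Random(key)
    return [-c if rng.getrandbits(1) else c for c in coeffs]

# Toy list of quantized transform coefficients (made-up values).
coeffs = [12, -7, 3, 0, -1, 5, -2, 4]

scrambled = scramble_signs(coeffs, key=2006)
# Applying the same operation with the same key undoes the flips.
restored = scramble_signs(scrambled, key=2006)
```

Because the same key reproduces the same sequence of flips, the operation is its own inverse, and only signs change while coefficient magnitudes are preserved.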

ROI can either correspond to predefined zones or be automatically generated using video analysis. While automatic segmentation of objects in a video remains problematic, techniques such as face detection, change detection, skin detection, object segmentation and tracking, or any combination thereof, can all be successfully applied.
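For the simplest case, a predefined zone, the ROI can be expressed as a per-block mask. The sketch below is only an illustration under stated assumptions: the function name roi_block_mask, the rectangular zone given in pixel coordinates, and the 8 × 8 block size are hypothetical choices, not details taken from the system described here.

```python
def roi_block_mask(width, height, zone, block=8):
    """Return a row-major list of booleans, one per block.

    An entry is True where the block overlaps the pixel rectangle
    zone = (x0, y0, x1, y1), with x1 and y1 exclusive.
    Hypothetical helper for illustration only.
    """
    x0, y0, x1, y1 = zone
    cols = (width + block - 1) // block   # number of block columns
    rows = (height + block - 1) // block  # number of block rows
    mask = []
    for by in range(rows):
        for bx in range(cols):
            left, top = bx * block, by * block
            overlaps = (left < x1 and left + block > x0 and
                        top < y1 and top + block > y0)
            mask.append(overlaps)
    return mask

# A 32 x 16 frame (4 x 2 blocks) with a zone covering x in [10, 20), y in [0, 8).
mask = roi_block_mask(32, 16, zone=(10, 0, 20, 8))
# mask == [False, True, True, False, False, False, False, False]
```

A mask produced by automatic analysis (for example, face detection) could be substituted for the rectangle test without changing the rest of the processing.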

Studies of conditional access control have mainly considered traditional techniques used to encrypt the codestream resulting from compression. However, compared to other types of information (e.g., banking data, confidential documents), video data is characterized by its high bit rate and low commercial value. Because they entail a significant increase in complexity, conventional cryptographic techniques are therefore unsuitable.

Our scrambling approach efficiently copes with regions of arbitrary shape and enables adjustment of the amount of distortion, without entailing either lower coding performance or a significant increase in complexity. Consider Motion JPEG 2000 (MJP2) and MPEG-4 video coding. With the former, scrambling can be effectively applied after the DWT and quantization, but prior to the arithmetic coder. Scrambling should therefore have minimal impact on coding efficiency. In general, DC coefficients are strongly correlated and therefore are unsuitable for scrambling. Furthermore, whereas the amplitude of AC coefficients is correlated, their sign is not. As a result, quantized wavelet coefficients belonging to AC subbands and corresponding to ROIs are scrambled by pseudorandomly flipping their sign. In MPEG-4, the quantized AC coefficients of the 8 × 8 DCT blocks corresponding to the ROI are scrambled in the same way. In both cases, a pseudorandom-number generator initialized by a seed value is used to drive the scrambling process. Multiple seeds can be used to improve system security. The seed values are encrypted and inserted in the codestream so that they can be communicated to authorized users. The level of scrambling can be adjusted by restricting it to fewer coefficients. The process is fully reversible at the decoder, where authorized users have merely to perform the inverse operation. Examples of scrambling with varying strength are shown in Figures 1 and 2.
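The block-based variant described above can be sketched as follows. This is a hedged illustration, not the authors' code. It assumes that each 8 × 8 block is stored as a list of 64 quantized coefficients with the DC term at index 0, that the decoder knows the same ROI mask as the encoder, and that the scrambling level is set by a hypothetical parameter n_ac giving how many leading AC coefficients are eligible for flipping. Python's random module stands in for the seeded pseudorandom-number generator and is not cryptographically secure. Seed encryption and insertion in the codestream are not shown.

```python
import random

BLOCK = 64  # coefficients per 8 x 8 block; index 0 is the DC term

def scramble_blocks(blocks, roi, seed, n_ac=BLOCK - 1):
    """Flip signs of up to n_ac AC coefficients in each ROI block.

    blocks: list of 64-element lists of quantized coefficients
    roi:    list of booleans, True where the block belongs to the ROI
    seed:   value initializing the stand-in PRNG (hypothetical key)
    """
    rng = random.Random(seed)
    n_ac = max(0, min(n_ac, BLOCK - 1))
    out = []
    for block, in_roi in zip(blocks, roi):
        block = list(block)  # copy so the input is left unchanged
        if in_roi:
            for i in range(1, 1 + n_ac):  # start at 1: the DC term is skipped
                if rng.getrandbits(1):
                    block[i] = -block[i]
        out.append(block)
    return out

# Two toy blocks with made-up coefficient values; only the first is in the ROI.
blocks = [[50] + [((i * 7) % 11) - 5 for i in range(1, BLOCK)] for _ in range(2)]
roi = [True, False]

protected = scramble_blocks(blocks, roi, seed=1234)            # encoder side
authorized = scramble_blocks(protected, roi, seed=1234)        # decoder with the seed
mild = scramble_blocks(blocks, roi, seed=1234, n_ac=5)         # weaker scrambling
```

Running the same function with the same seed restores the original blocks, which mirrors the inverse operation performed by authorized users. Blocks outside the ROI and all DC terms pass through unchanged, and a smaller n_ac leaves more coefficients untouched.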

Figure 1. Scrambling images encoded as Motion JPEG 2000 files does not mean that coding performance must be lower.

Figure 2. Levels of scrambling can be adjusted in MPEG-4.

We believe our approach offers a number of comparative advantages in terms of minimizing loss of privacy and preventing abuse. It involves transmitting a single protected codestream to all clients regardless of their access-control credentials. Unauthorized users do not possess the private key required for unscrambling content, while authorized users can unscramble and recover the complete, undistorted scene. The technique we propose is also flexible: it can be restricted to arbitrary-shape ROI and the level of distortion is adjustable. In addition, it has only a minor impact on coding performance and requires low computational complexity. Finally, the method can be used with most existing video coding standards.

Frederic Dufaux and Touradj Ebrahimi
Ecole Polytechnique Federale de Lausanne (EPFL)
Lausanne, Switzerland
Emitall Surveillance SA
Montreux, Switzerland 
Frederic Dufaux received his MSc in Physics and PhD in Electrical Engineering from the EPFL. He has more than 15 years of experience in digital image and video processing, holding various research positions at EPFL, AT&T Bell Laboratories, MIT, and Compaq. He is currently a senior researcher at EPFL and Chief Scientist of Emitall Surveillance SA. The author or co-author of more than 50 research publications, he holds 10 patents. He has published numerous papers in SPIE proceedings and teaches the short course on Information Processing for Video Surveillance at the SPIE Defense & Security Symposium.
Touradj Ebrahimi received his MSc and PhD, both in Electrical Engineering, from the EPFL. He has been with Sony Corporation and AT&T Bell Laboratories, and is currently a professor at the Signal Processing Institute of EPFL, where he is involved in various aspects of digital video and multimedia applications. He is the founder and Chairman of Emitall Surveillance SA, author or coauthor of over 150 papers, and holds 10 patents. In addition, he is a Fellow of SPIE, served as General Chair of SPIE's Visual Communications and Image Processing 2003, and has published numerous papers in SPIE proceedings. He teaches the short course on Information Processing for Video Surveillance at SPIE's Defense & Security Symposium.