Viewing data for US television audiences is collected from a statistically sampled set of Nielsen panel homes and influences over $60B of advertising revenue. An electronic box called ‘the meter’ is installed in these homes. The meter typically takes its input from the television and other audio/video devices in the home and reports to a central processing server over either a phone line or a broadband connection (see Figure 1). It measures two key metrics: content identification (what channel is being watched) and demographics (who is watching). Data on the two metrics are sent back from all panel homes daily and processed to produce the overnight TV ratings.
With analog TV, it was sufficient to measure the RF frequency that carried the program being received. The unique mapping between frequency and programming allowed for easy content identification. The digital age has, however, created a number of challenges. First, the digital TV standard¹ allows for transmission of multiple programs on the same carrier frequency. Second, the concept of frequency disappears once content is stored on a digital video recorder. Finally, new consumer devices allow users to watch content whenever and wherever they like. A new measurement technique was thus needed that is agnostic to the transmission method, user behavior, and consumer end devices. One way to address this is to have an identifier that accompanies the content and remains with it no matter when, where, and how it is consumed. Audio watermarking and fingerprinting provide the solution.
Figure 1. The meter as installed in a panel home. The audio from the TV and the three audio/video media devices (A/V) connect to the meter. The meter is then connected to a central processing server via phone line or a broadband connection.
Digital watermarking is by now a well-researched field² and there are a number of ways to design digital watermarks for both forensic and broadcast applications. The watermark (or code) is external data that is hidden in a host audio signal. Most broadcast applications require that the watermarks be imperceptible and meet certain minimum robustness and resolution criteria. Audio fingerprinting,³ on the other hand, is a mapping of the audio signal to a size-reduced set of bits called the fingerprint. The mapping is designed to be as unique as possible in order to reduce false positives and increase true positives. Design considerations include speed of matching, size of fingerprints, and resolution.
The Nielsen audio watermark algorithm modifies select frequencies in short blocks of audio to represent the code. The code frequencies lie in the range between 4.5 kHz and 6 kHz and are determined by a pseudo-random sequence. Codes are rendered virtually inaudible by taking advantage of the psychoacoustic masking effect of neighboring frequencies. The watermark is robust enough to survive broadcast audio compression and other artifacts such as gain changes and format conversions. Spanning 2 s, the message includes 50 bits of data: a 16-bit source identifier, a 32-bit timestamp, and a 2-bit level identifier. Together, the source identifier and the timestamp provide a framework for identifying both the content and its time of broadcast. The payload also includes start-of-message and error correction bits. The watermark is inserted by a watermarking encoder that is installed at all major broadcast/cable networks, syndicators, and local stations. The watermarks are multiplexed appropriately to identify the owner and distributor. Figure 2 shows the flow of the signal through the broadcast chain.
Figure 2. Content flow between advertisers, content owners, and distributors. The paths (A, B, C, D) are all watermarked.
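The 50-bit message layout described above can be illustrated as a simple bit-packing exercise. The following is a minimal Python sketch, not Nielsen's encoder: the field order (source identifier in the most significant bits, then timestamp, then level), the treatment of each field as an unsigned integer, and the names `WatermarkMessage`, `pack`, and `unpack` are all assumptions made for illustration. The start-of-message and error correction bits are left out.

```python
from dataclasses import dataclass

# Field widths taken from the text; the field ORDER below is an assumption.
SOURCE_BITS = 16
TIMESTAMP_BITS = 32
LEVEL_BITS = 2
PAYLOAD_BITS = SOURCE_BITS + TIMESTAMP_BITS + LEVEL_BITS  # 50 data bits


@dataclass(frozen=True)
class WatermarkMessage:
    """Hypothetical container for the three data fields of one message."""
    source_id: int   # 16-bit source identifier
    timestamp: int   # 32-bit timestamp (units not specified in the text)
    level: int       # 2-bit level identifier


def _check(value: int, bits: int, name: str) -> None:
    """Reject values that do not fit in an unsigned field of `bits` bits."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name} does not fit in {bits} bits: {value}")


def pack(msg: WatermarkMessage) -> int:
    """Pack the fields into one 50-bit integer (assumed order: id, time, level)."""
    _check(msg.source_id, SOURCE_BITS, "source_id")
    _check(msg.timestamp, TIMESTAMP_BITS, "timestamp")
    _check(msg.level, LEVEL_BITS, "level")
    return (
        (msg.source_id << (TIMESTAMP_BITS + LEVEL_BITS))
        | (msg.timestamp << LEVEL_BITS)
        | msg.level
    )


def unpack(word: int) -> WatermarkMessage:
    """Inverse of pack(): split a 50-bit integer back into its fields."""
    _check(word, PAYLOAD_BITS, "word")
    level = word & ((1 << LEVEL_BITS) - 1)
    timestamp = (word >> LEVEL_BITS) & ((1 << TIMESTAMP_BITS) - 1)
    source_id = word >> (TIMESTAMP_BITS + LEVEL_BITS)
    return WatermarkMessage(source_id, timestamp, level)
```

If the 50 data bits are counted separately from the start-of-message and error correction bits, one message every 2 s corresponds to a data rate of 25 bits per second before that overhead.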
In the home, the meter has access to the audio played on the TV and extracts the watermark directly, yielding the content identification information. Sometimes, however, the lack of masking energy in certain segments of the broadcast audio may prevent the encoder from inserting the watermark. In those cases, the content identification challenge is solved with audio fingerprints. Under this model, the in-home meter collects fingerprints for segments of audio when it is unable to extract a watermark. These fingerprints are transmitted to the central processing server. Reference fingerprints for all US broadcast content are also computed and stored in a central database on a continual basis. The in-home generated fingerprints are matched against the reference fingerprints to identify the content and time of broadcast. Figure 3 shows the fingerprint matching architecture. The Nielsen fingerprints are derived from a low-pass-filtered version of the energy curve of the audio signal. They are lightweight in terms of computational complexity and size. The matcher uses hash-based indexing⁴ to speed up the matching of unknown query fingerprints against a reference of over 200 channels per market.
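The general idea of an energy-curve fingerprint with hash-based lookup can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the Nielsen algorithm: the frame length, the moving average used as the low-pass filter, the one-bit-per-frame "energy rising" rule, the 16-bit hash keys, and the offset-voting matcher are all choices made here for the example, as are the function names.

```python
from collections import Counter, defaultdict

FRAME_LEN = 256     # samples per energy frame (assumed value)
SMOOTH_FRAMES = 4   # moving-average length, a crude low-pass filter (assumed)
KEY_BITS = 16       # fingerprint bits per hash key (assumed)


def energy_curve(samples, frame_len=FRAME_LEN):
    """Sum of squared samples over consecutive non-overlapping frames."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]


def low_pass(values, width=SMOOTH_FRAMES):
    """Moving average over full windows only, so there are no edge effects."""
    return [
        sum(values[i:i + width]) / width
        for i in range(len(values) - width + 1)
    ]


def fingerprint(samples):
    """One bit per frame step: 1 if the smoothed energy rises, else 0."""
    curve = low_pass(energy_curve(samples))
    return [1 if b > a else 0 for a, b in zip(curve, curve[1:])]


def hash_keys(bits, key_bits=KEY_BITS):
    """Yield (offset, key) for every run of key_bits consecutive bits."""
    for offset in range(len(bits) - key_bits + 1):
        key = 0
        for bit in bits[offset:offset + key_bits]:
            key = (key << 1) | bit
        yield offset, key


def build_index(reference):
    """Map each hash key to the (channel, offset) positions where it occurs."""
    index = defaultdict(list)
    for channel, samples in reference.items():
        for offset, key in hash_keys(fingerprint(samples)):
            index[key].append((channel, offset))
    return index


def match(query_samples, index):
    """Vote for (channel, frame offset) alignments; return the winner or None."""
    votes = Counter()
    for q_offset, key in hash_keys(fingerprint(query_samples)):
        for channel, r_offset in index.get(key, ()):
            votes[(channel, r_offset - q_offset)] += 1
    if not votes:
        return None
    (channel, offset), count = votes.most_common(1)[0]
    return channel, offset, count
```

Here `build_index` takes a dictionary mapping channel names to lists of audio samples, and `match` returns the best channel, the frame offset of the query within that channel's reference audio, and the number of agreeing hash keys. Because the bits depend only on whether the smoothed energy rises or falls, a uniform gain change leaves the fingerprint unchanged. The sketch assumes the query is aligned to the reference frame grid; a practical matcher would also need to tolerate bit errors and misalignment, which this toy does not attempt.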
Between watermarking and fingerprinting, there is enough accuracy and redundancy in the system to identify the content and meet the challenges posed by digital convergence. Recent improvements to the watermarking algorithm allow the watermark to be inserted directly in the compressed audio without converting to baseband. Decoding improvements allow for enhanced code detection in noisy environments. Similar improvements to the fingerprinting algorithm allow for faster matches and more noise immunity.
Figure 3. The fingerprinting architecture. The in-home unknown query fingerprint is compared against the reference database to find the match.
Technology R&D, The Nielsen Company
Arun Ramaswamy is the vice president for Technology and R&D at the Nielsen Company, where he leads the research group focusing on audio and video watermarking and fingerprinting, digital television, internet and mobile streaming, wireless sensors, and location-based technologies. He has been granted a US patent, with several more pending, and has several publications in IEEE journals and conferences.
4. E. Wold, T. Blum, D. Keislar, J. Wheaton, Content-based classification, search, and retrieval of audio, IEEE Multimedia 3, no. 3, pp. 27-36, 1996.