'Multimedia' has become one of the major buzzwords of the modern era. The combination of text, imagery, video, audio, and other facets of communication such as managing, compressing, and decompressing data promises key advantages to corporations and consumers. Start-up companies and new technologies are continually coming forward with fresh ideas that improve and expand the use of multimedia.
Looming in the background, however, is the nagging problem of compatibility. The industry wants to avoid the type of struggle among different proprietary systems that held back the development of high-technology industries in the past. The battle between the incompatible beta and 8-track sound systems, for example, significantly slowed the growth of the audio business. Manufacturers' insistence on their own unique technologies dogged the personal computer business in its early years.
Figure 1. Video browsing: extracting the 10 most "active" video segments in a news program.
The solution: internationally agreed-upon standards that manufacturers can use to ensure their products are compatible with those of all other manufacturers in the same field. Along with supporting compatibility, standards also stimulate fresh technology. "In recent years standardization has moved to another level," said Andy Tescher, technical consultant at Lockheed Martin Co. (Bethesda, MD) and chair of the scientific advisory board of the Integrated Media Systems Center at the Univ. of Southern California. "It's not just agreeing to interoperability; now, under the standards umbrella, you're developing new technology." Michael Bove of M.I.T.'s Media Lab (Cambridge, MA) agreed: "We have an increasing interest in smarter, richer, more useful applications."
MPEG standards
For multimedia technology, as well as its individual components, the standard-setter is the Moving Picture Experts Group (MPEG). This committee of the International Organization for Standardization (ISO) consists of roughly 300 individuals from several countries who meet regularly to discuss and develop new standards. Other scientists and technicians from members' institutions support their efforts with technical contributions.
The committee was honored with two Emmy Awards for its first two standards. MPEG-1 effectively made interactive video on CD-ROMs possible, while MPEG-2 applied fundamental standards to digital television. Now, the experts group is focusing on MPEG-4, a standard that went into practice early this year, and MPEG-7, whose details should be finalized late next year. Both new standards deal specifically with multimedia, and cover technology that permits consumers to interact with the content they see on their television screens or computer monitors.
"With MPEG-4 you can interact with scenes, build two-dimensional and three-dimensional scenes, and place video and audio items in the scenes," said Rob Koenen of KPN Research (Groningen, The Netherlands), who chairs the MPEG Requirements Group. "MPEG-4 also includes a technical framework for the management and protection of intellectual property that allows you to protect content."
In addition, Alexandros Eleftheriadis of the Image Technology Media Center at Columbia Univ. (New York, NY) said MPEG-4 "has new and improved audio and video standards suitable for medium bit rates, and scalability features. Those make it very attractive for the Internet and wireless."
MPEG-7, meanwhile, focuses on methods of searching for items in multimedia presentations that go beyond the simple and often frustrating use of keywords. Tescher said, "It addresses the standardized way to describe multimedia components. In order to do searches you have to define what you're searching for."
Koenen summarized the difference between the two new standards: "MPEG-4 is about the decoding of content," he said. "MPEG-7 is about describing content."
Figure 2. The new MPEG-4 and JPEG 2000 standards have launched many new compression products. The left image is based on the conventional JPEG algorithm. The image on the right is the basis of a new product development from MotionTV, Inc. of Campbell, California. Both images represent a compression of 100:1.
Basically, MPEG-4 applies to the way in which multiple images on a screen are built up. "It is fundamentally an object-oriented transmission mechanism," Tescher said. "You actually transmit the objects that go into the video frame. You don't have to compose the bits of a picture -- such as a talking head, a remote scene, and text -- in the transmitter. Under MPEG-4 that's done in the receiver."
Joern Ostermann of AT&T points to a very simple application of MPEG-4. Local television stations could use the technology to place their own logos over a sports event broadcast by a national network. "MPEG-4 allows you to overlay text over images much better than MPEG-2," he said.
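The receiver-side composition that Tescher describes, and the logo overlay that Ostermann mentions, can be illustrated with a toy sketch. It is not MPEG-4 itself: the `SceneObject` class, the `compose` function, and the character-grid "frame" are hypothetical stand-ins chosen only to show the idea that objects arrive separately and are layered at the receiving end.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    """A hypothetical scene object: a small character 'image' plus placement."""
    name: str
    x: int      # column of the object's left edge in the frame
    y: int      # row of the object's top edge in the frame
    z: int      # layer order; higher values are drawn on top
    rows: list  # list of strings; a space means "transparent"


def compose(objects, width, height, background="."):
    """Build the final frame at the receiver by layering objects in z order."""
    frame = [[background] * width for _ in range(height)]
    for obj in sorted(objects, key=lambda o: o.z):
        for dy, row in enumerate(obj.rows):
            for dx, ch in enumerate(row):
                px, py = obj.x + dx, obj.y + dy
                # Skip transparent cells and anything outside the frame.
                if ch != " " and 0 <= px < width and 0 <= py < height:
                    frame[py][px] = ch
    return ["".join(line) for line in frame]


# The national feed and the local logo are sent as separate objects;
# only the receiver combines them.
network_feed = SceneObject("feed", x=0, y=0, z=0, rows=["########"] * 3)
station_logo = SceneObject("logo", x=5, y=0, z=1, rows=["TV"])

frame = compose([station_logo, network_feed], width=8, height=3)
for line in frame:
    print(line)
```

Because the logo is its own object, a different station could swap in a different `station_logo` without touching the network's feed.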
The new standard offers other futuristic possibilities. "The cable industry is looking into MPEG-4 because of rumors that it has a substantially better bit rate than MPEG-2," Ostermann said. "That would permit extra channels." However, no hardware is yet available for this high-end application.
Low-end applications are available. "Microsoft has offered a system for video recording," Eleftheriadis said. "Packet Video is producing MPEG-4 video systems. Consumer electronic companies have produced consumer-grade MPEG-4 video screens. And, especially in markets outside the U.S., you'll see activity on the wireless front." Eleftheriadis plans to take advantage of the nascent market. He has founded Flavor Software Inc. (New York, NY), a company that will create products based on MPEG-4.
Another start-up, face2face (Summit, NJ), sprang from research at Lucent Technologies' Bell Labs (Murray Hill, NJ). The company, in which Lucent has an equity stake, uses MPEG-4 technology to create models of faces for television animation, computer games, and streaming over the Internet. "We've defined a complete set of parameters and descriptors for virtual humans, right down to the nitty gritty of lip movements," said chief scientist and Bell Labs veteran Eric Petajan. At present, face2face's major customers come from the animation community. But eventually, said Petajan, "we'll be able to do enhanced speech recognition. For example, face and body parameters will have possibilities as an alternative to closed captioning for the hearing-impaired."
Searching for objects
One significant component of MPEG-4 is MPEG-J. The J refers to Java, the lingua franca of the World Wide Web. "MPEG-J is a set of Java application program interfaces," said Viswanathan Swaminathan of Sun Microsystems. "It also sets the rules for delivering Java into a bitstream and it specifies what happens at the receiving end." Practically, MPEG-J will permit a television viewer or a Web surfer to control the image that he or she sees.
MPEG-7 provides consumers with a different kind of control. It helps users browse through multimedia databases to find specific objects they want, such as action shots in a one-hour video. "Just as text can be searched on the World Wide Web using search engines, the idea of MPEG-7 is to give users the content they want when and where they want it," said Ajay Divakaran of Mitsubishi Electric ITA's Advanced Television Lab. "That's predicated on being able to reach remotely located content with ease. Since the amount of data out there is huge, you want a mechanism that will let you browse through media."
The key here is what is known as "metadata" -- the data that describes objects in databases. MPEG-7 provides the technical standard for describing, managing, and identifying metadata for all components of multimedia, particularly still and moving images. Thus, the standard makes it possible to search for images unaccompanied by text. "Say you want to see President Clinton's face in an image bank without looking for captions. You can use an MPEG-7 descriptor to search the unannotated images," Divakaran said. Mohamed Abdel-Mottaleb of the Image Processing Department in Philips Research Labs (Eindhoven, The Netherlands) added, "We can allow a user to browse through collections of images automatically instead of relying on keywords."
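A rough sketch can make the idea of searching by descriptor rather than by caption more concrete. The code below is an illustration only and does not use any descriptor defined by MPEG-7: it assumes a simple color histogram as the stored metadata, and the names `color_histogram`, `l1_distance`, and `search`, along with the tiny image bank, are all hypothetical.

```python
def color_histogram(pixels, bins=4):
    """Describe an image by the share of pixels in each coarse RGB bin."""
    hist = [0.0] * (bins ** 3)
    if not pixels:
        return hist
    for r, g, b in pixels:
        idx = ((r * bins // 256) * bins * bins
               + (g * bins // 256) * bins
               + (b * bins // 256))
        hist[idx] += 1
    return [count / len(pixels) for count in hist]


def l1_distance(a, b):
    """Sum of absolute differences between two descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def search(query_pixels, index, top_k=1):
    """Rank stored descriptors by closeness to the query's descriptor."""
    query = color_histogram(query_pixels)
    ranked = sorted(index, key=lambda name: l1_distance(query, index[name]))
    return ranked[:top_k]


# A made-up, caption-free image bank: each image is a list of RGB pixels.
image_bank = {
    "grass": [(10, 200, 10)] * 4,
    "sky": [(10, 10, 200)] * 4,
    "sunset": [(220, 80, 10)] * 4,
}

# Descriptors are computed once and stored as metadata alongside the content.
index = {name: color_histogram(pixels) for name, pixels in image_bank.items()}

# A mostly green query image with a patch of blue.
query_image = [(20, 210, 30)] * 3 + [(10, 10, 200)]
results = search(query_image, index, top_k=2)
print(results)
```

The point of the sketch is that the search never reads a caption or keyword; it compares stored descriptions of the content itself.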
Abdel-Mottaleb presents as one example of the standard's value a soccer fan who wants to browse through just the highlights of a recent game. "We've built algorithms for automatic detection of interesting events, such as goals and red cards," he said.
Divakaran takes a similar approach in his work. "Say you look at some programming and ask for high-action pictures," he said. "I can do that for you. You could also ask for high-action events with a lot of green in them. The work basically uses motion descriptors. That is now part of the MPEG-7 description."
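The kind of query Divakaran describes might be sketched as follows. This is a simplified stand-in, not the motion descriptor in MPEG-7 and not Divakaran's method: it assumes "activity" is the average pixel change between consecutive frames and "green" is the fraction of pixels whose green channel dominates. The functions `motion_activity`, `green_fraction`, and `most_active`, and the sample segments, are hypothetical.

```python
def motion_activity(frames):
    """Average per-pixel color change between consecutive frames."""
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        for (r1, g1, b1), (r2, g2, b2) in zip(prev, cur):
            diffs.append(abs(r1 - r2) + abs(g1 - g2) + abs(b1 - b2))
    return sum(diffs) / len(diffs) if diffs else 0.0


def green_fraction(frames):
    """Share of pixels in the segment whose green channel is the largest."""
    pixels = [p for frame in frames for p in frame]
    if not pixels:
        return 0.0
    green = sum(1 for r, g, b in pixels if g > r and g > b)
    return green / len(pixels)


def most_active(segments, k=10, min_green=0.0):
    """Return the names of the k highest-activity segments that are green enough."""
    scored = [
        (motion_activity(frames), name)
        for name, frames in segments.items()
        if green_fraction(frames) >= min_green
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Made-up segments: each is a list of frames, each frame a list of RGB pixels.
segments = {
    "anchor_desk": [[(90, 90, 90)] * 4, [(92, 90, 90)] * 4],
    "soccer_goal": [[(0, 200, 0)] * 4, [(200, 250, 200)] * 4],
    "car_chase": [[(200, 0, 0)] * 4, [(0, 0, 200)] * 4],
    "weather_map": [[(0, 180, 0)] * 4, [(0, 182, 0)] * 4],
}

high_action = most_active(segments, k=2)
high_action_green = most_active(segments, k=2, min_green=0.5)
print(high_action)
print(high_action_green)
```

With `k=10` and no color filter, the same ranking would yield the ten most active segments of a program.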
As those developing systems show, MPEG standards are exerting a stimulating effect on technology. "The idea of these standards is not to stifle innovation," Divakaran said, "it is to leave enough room for people to try what they want."
Perhaps the key point of the current MPEG standard-setting is that it is designed to avoid the mistakes made in the past by segments of the communications industry and encourage development of an entire range of completely compatible technologies. "It's almost inconceivable today that someone would come in and propose a proprietary video standard," Tescher said. "Nobody would care."
Indeed, the MPEG group has recently started work on another new standard. MPEG-21, Tescher said, "is supposed to provide a coherent multimedia framework." Its main purpose is to specify technical areas that require further standardization to enable an open infrastructure for multimedia content.
A former science editor of Newsweek, Peter Gwynne is a free-lance science writer based in Sandwich, MA.