Proceedings Volume 6071

Multimedia Computing and Networking 2006

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 15 January 2006
Contents: 5 Sessions, 23 Papers, 0 Presentations
Conference: Electronic Imaging 2006
Volume Number: 6071

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Application-dependent Transfer
  • Streaming
  • Distribution
  • Short Papers: Multimedia Systems
  • Peer-to-Peer
Application-dependent Transfer
The effects of frame rate and resolution on users playing first person shooter games
Mark Claypool, Kajal Claypool, Feissal Damaa
The rates and resolutions of frames rendered in a computer game directly impact player performance, influencing both the game's overall playability and its enjoyability. Insights into the effects of frame rates and resolutions can guide users in their choice of game settings and new hardware purchases, and inform system designers in their development of new hardware, especially for embedded devices that often must make tradeoffs between resolution and frame rate. While there have been studies detailing the effects of frame rate and resolution on streaming video and other multimedia applications, to the best of our knowledge there have been no studies quantifying their effects on user performance in computer games. This paper presents the results of a carefully designed user study that measures the impact of frame rate and frame resolution on user performance in a first person shooter game. Contrary to previous results for streaming video, frame rate has a marked impact on both player performance and game enjoyment, while resolution has little impact on performance and some impact on enjoyment.
Real-time 3D video compression for tele-immersive environments
Tele-immersive systems can improve productivity and aid communication by allowing distributed parties to exchange information via a shared immersive experience. The TEEVE research project at the University of Illinois at Urbana-Champaign and the University of California at Berkeley seeks to foster the development and use of tele-immersive environments by a holistic integration of existing components that capture, transmit, and render three-dimensional (3D) scenes in real time to convey a sense of immersive space. However, the transmission of 3D video poses significant challenges. First, it is bandwidth-intensive, as it requires the transmission of multiple large-volume 3D video streams. Second, existing schemes for 2D color video compression such as MPEG, JPEG, and H.263 cannot be applied directly because the 3D video data contains depth as well as color information. Our goal is to explore a different region of the 3D compression design space along the dimensions of complexity, compression ratio, quality, and real-time performance. To investigate these trade-offs, we present and evaluate two simple 3D compression schemes. For the first scheme, we use color reduction to compress the color information, which we then compress along with the depth information using zlib. For the second scheme, we use motion JPEG to compress the color information and run-length encoding followed by Huffman coding to compress the depth information. We apply both schemes to 3D videos captured from a real tele-immersive environment. Our experimental results show that: (1) the compressed data preserves enough information to communicate the 3D images effectively (min. PSNR > 40) and (2) even without inter-frame motion estimation, very high compression ratios (avg. > 15) are achievable at speeds sufficient to allow real-time communication (avg. ≈ 13 ms per 3D video frame).
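The second scheme's depth channel (run-length encoding followed by Huffman coding) can be sketched as follows. The depth row, its 16-bit serialization, and the run lengths are illustrative assumptions, not the paper's actual parameters, and zlib's DEFLATE stage (which includes Huffman coding) stands in for the paper's dedicated Huffman coder.

```python
import zlib

def rle_encode(values):
    """Run-length encode a sequence as (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]

def rle_decode(runs):
    """Invert rle_encode, expanding each (value, count) pair."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

# A depth row dominated by long constant runs (background and flat surfaces),
# which is what makes RLE effective on depth maps.
row = [0] * 50 + [812] * 20 + [810] * 30
runs = rle_encode(row)
assert rle_decode(runs) == row

# Serialize runs as 16-bit value/count pairs, then entropy-code them.
payload = b"".join(v.to_bytes(2, "big") + c.to_bytes(2, "big") for v, c in runs)
compressed = zlib.compress(payload)
raw_size = 2 * len(row)  # 16 bits per depth sample
print(raw_size, len(payload), len(compressed))
```

The 100-sample row shrinks from 200 raw bytes to three 4-byte runs before entropy coding, illustrating why run-length coding is a natural fit for the piecewise-constant structure of depth images.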
An integrated visual approach for music indexing and dynamic playlist composition
M. Crampes, S. Ranwez, F. Velickovski, et al.
This paper presents an innovative integrated visual approach for indexing music and automatically composing personalized playlists for radios or chain stores. To efficiently index hundreds of music titles by hand with artistic descriptors, the user only needs to drag and drop them onto a dynamic music landscape. To help the user we propose different dynamic visualization tools, such as the semantic spectrum and semantic field lenses. An algorithm then propagates artistic values that are hidden in the landscape into the titles being indexed. Different propagation algorithms are tested and compared. The dynamic composition methodology is then described with its class n-gram algorithm and its means for personalization based on the same music map as the visual indexing method. The new tools and techniques presented in this paper enable us to turn musical experience into an integrated visual experience that may generate new music knowledge and emotion.
Efficient rate-distortion optimized media streaming for tree-reducible packet dependencies
Martin Röder, Jean Cardinal, Raouf Hamzaoui
In packetized media streaming systems, packet dependencies are often modeled as a directed acyclic graph called the dependency graph. We consider the situation where the dependency graph is reducible to a tree. This occurs, for instance, in MPEG-1 video streams that are packetized at the frame level. Other video coding standards such as H.264 also allow tree-reducible dependencies. We propose in this context efficient dynamic programming algorithms for finding rate-distortion optimal transmission policies. The proposed algorithms are much faster than previous exact algorithms developed for arbitrary dependency graphs.
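The tree-reducibility condition can be made concrete with a small check (a sketch, not the paper's algorithm): compute the transitive reduction of the dependency DAG and verify that every packet keeps at most one immediate parent. The IPPP-style GOP below is a hypothetical frame-level packetization.

```python
def transitive_reduction_parents(deps):
    """deps maps each packet to the set of packets it depends on (ancestors
    may be listed redundantly). Returns each packet's immediate parents,
    i.e. the transitive reduction of the dependency DAG."""
    closure = {}
    def ancestors(p):
        if p not in closure:
            closure[p] = set(deps[p])
            for d in deps[p]:
                closure[p] |= ancestors(d)
        return closure[p]
    reduced = {}
    for p in deps:
        direct = set(deps[p])
        # Drop any dependency already implied transitively by another one.
        reduced[p] = {d for d in direct
                      if not any(d in ancestors(other) for other in direct - {d})}
    return reduced

def is_tree_reducible(deps):
    """True iff the transitive reduction leaves every packet <= 1 parent."""
    return all(len(ps) <= 1 for ps in transitive_reduction_parents(deps).values())

# Hypothetical IPPP packetization: each P frame depends on all earlier frames,
# yet the reduction is a chain, so the graph is tree-reducible.
gop = {"I": set(), "P1": {"I"}, "P2": {"I", "P1"}, "P3": {"I", "P1", "P2"}}
assert is_tree_reducible(gop)

# Two mutually independent references break tree-reducibility.
assert not is_tree_reducible({"a": set(), "b": set(), "c": {"a", "b"}})
```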
Popular song and lyrics synchronization and its application to music information retrieval
Kai Chen, Sheng Gao, Yongwei Zhu, et al.
An automatic system for synchronizing a popular song with its lyrics is presented in this paper. The system includes two main components: a) automatically detecting vocal/non-vocal segments in the audio signal and b) automatically aligning the acoustic signal of the song with its lyrics using speech recognition techniques, positioning the boundaries of the lyrics in their acoustic realization at multiple levels simultaneously (e.g., the word/syllable level and the phrase level). GMM models and a set of HMM-based acoustic model units are carefully designed and trained for the detection and alignment. To eliminate the severe mismatch due to the diversity of musical signals and the sparse training data available, unsupervised adaptation techniques such as maximum likelihood linear regression (MLLR) are exploited to tailor the models to the real environment, which improves the robustness of the synchronization system. To further reduce the effect of missed non-vocal music on alignment, a novel grammar net is built to direct the alignment. To the best of our knowledge, this is the first automatic synchronization system based only on low-level acoustic features such as MFCCs. We evaluate the system on a Chinese song dataset collected from 3 popular singers. We obtain 76.1% boundary accuracy at the syllable level (BAS) and 81.5% boundary accuracy at the phrase level (BAP) using fully automatic vocal/non-vocal detection and alignment. The synchronization system has many applications, such as multi-modality (audio and textual) content-based popular song browsing and retrieval. Through this study, we would like to open up the discussion of some challenging problems in developing a robust synchronization system for a large-scale database.
Streaming
MMS: a multihome-aware media streaming system
Ahsan Habib, John Chuang
Multihoming provides highly diverse redundant paths in terms of average hop count, latency, loss ratio, and jitter. In this paper, we first explore topological path diversity and show that multihoming can significantly reduce the path overlap when a multihomed receiver conducts media streaming from a set of suppliers. We then design a multihome-aware media streaming system (MMS) that exploits topological path diversity by splitting a streaming session over the available physical links to reduce path overlap among the suppliers, and migrating a connection from one path to another if the current path is congested. A network tomography-based monitoring mechanism is developed to identify congested path segments. Through a series of experiments in the wide area Internet, we show that multihoming provides streaming at a higher rate compared to a single service provider. On average the quality of streaming sessions is improved by 30% or more.
Streamline: a scheduling heuristic for streaming applications on the grid
Bikash Agarwalla, Nova Ahmed, David Hilley, et al.
Scheduling a streaming application on high-performance computing (HPC) resources has to be sensitive to the computation and communication needs of each stage of the application dataflow graph to ensure QoS criteria such as latency and throughput. Since the grid has evolved out of traditional high-performance computing, the tools available for scheduling are more appropriate for batch-oriented applications. Our scheduler, called Streamline, considers the dynamic nature of the grid and runs periodically to adapt scheduling decisions using application requirements (per-stage computation and communication needs), application constraints (such as co-location of stages), and resource availability. The performance of Streamline is compared with an Optimal placement, Simulated Annealing (SA) approximations, and E-Condor, a streaming grid scheduler built using Condor. For kernels of streaming applications, we show that Streamline performs close to the Optimal and SA algorithms, and an order of magnitude better than E-Condor under non-uniform load conditions. We also conduct scalability studies showing the advantage of Streamline over other approaches.
A novel unbalanced multiple description coder for robust video transmission over ad hoc wireless networks
Feng Huang, Lifeng Sun, Yuzhuo Zhong
Robust transmission of live video over ad hoc wireless networks presents new challenges: high bandwidth requirements are coupled with delay constraints; even a single packet loss causes error propagation until a complete video frame is coded in the intra-mode; ad hoc wireless networks suffer from bursty packet losses that drastically degrade the viewing experience. Accordingly, we propose a novel UMD coder capable of quickly recovering from losses and ensuring continuous playout. It uses 'peg' frames to prevent error propagation in the High-Resolution (HR) description and improve the robustness of key frames. The Low-Resolution (LR) coder works independently of the HR one, but the two can also help each other recover from losses. Like many UMD coders, our UMD coder is drift-free, disruption-tolerant and able to make good use of the asymmetric available bandwidths of multiple paths. The simulation results under different conditions show that the proposed UMD coder has the highest decoded quality and lowest probability of pause when compared with concurrent UMDC techniques. The coder also achieves decoded quality comparable to a state-of-the-art FEC-based scheme, with lower startup delay and a lower probability of pause. To provide robustness for video multicast applications, we propose non-end-to-end UMDC-based video distribution over a multi-tree multicast network. The multiplicity of parents decorrelates losses and the non-end-to-end feature increases the throughput of UMDC video data. We deploy an application-level service of LR description reconstruction in some intermediate nodes of the LR multicast tree. The principle behind this is to reconstruct the disrupted LR frames from the correctly received HR frames. As a result, the viewing experience at the downstream nodes benefits from the protection reconstruction at the upstream nodes.
A transform for network calculus and its application to multimedia networking
Krishna Pandit, Jens Schmitt, Claus Kirchner, et al.
The rapid increase of multimedia traffic has to be accounted for when designing IP networks. A key characteristic of multimedia traffic is that it has strict Quality of Service (QoS) requirements in a heterogeneous manner. In such a setting, scheduling by service curves is a useful method as it allows for assigning each flow exactly the service it requires. When hosting heterogeneous multimedia traffic, the utilization of packet-switched networks can be increased by using bandwidth/delay decoupled scheduling disciplines. It has been shown in previous work how optimal network service curves are obtained with them. A basic result from Network Calculus is that the network service curve is obtained by the min-plus convolution of the node service curves. We state a theorem on the min-plus convolution in this work, which simplifies the computation of the min-plus convolution of service curves. The theorem follows from the continuous ΓΔ-transform, which we develop. With this theorem, we derive the optimal service curves for the nodes along a path. Further, we show how the admission control can be improved when networks are designed based on service curves. Considering one node, reallocating the service curves leads to admitting more flows. Then we point out scenarios where sub-optimal allocation of service curves in a node can increase the number of admitted flows to the network. The key results are accompanied by numerical examples. On a broader scale, this paper advances the research in analytically modeling packet-switched networks by pointing out novel properties and a new application of Network Calculus.
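The min-plus convolution underlying the network service curve result can be sketched on sampled curves, where (f conv g)(t) = inf over 0 <= s <= t of f(s) + g(t - s). The rate-latency parameters below are illustrative assumptions; the standard closed form for rate-latency curves (rate min(R1, R2), latency T1 + T2) serves as a sanity check.

```python
def min_plus_convolution(f, g):
    """Min-plus convolution of two curves sampled at t = 0, 1, ..., n-1:
    (f conv g)(t) = min over 0 <= s <= t of f(s) + g(t - s)."""
    n = min(len(f), len(g))
    return [min(f[s] + g[t - s] for s in range(t + 1)) for t in range(n)]

def rate_latency(rate, latency, n):
    """Sampled rate-latency service curve beta(t) = max(0, rate * (t - latency))."""
    return [max(0, rate * (t - latency)) for t in range(n)]

# Two nodes along a path, each offering a rate-latency service curve
# (values are illustrative, not from the paper's numerical examples).
f = rate_latency(rate=3, latency=2, n=10)
g = rate_latency(rate=5, latency=1, n=10)
h = min_plus_convolution(f, g)

# For rate-latency curves the network service curve is again rate-latency,
# with rate min(3, 5) = 3 and latency 2 + 1 = 3.
assert h == rate_latency(rate=3, latency=3, n=10)
```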
Distribution
A method to deliver multi-object content in a ubiquitous environment
Takanori Mori, Michiaki Katsumoto
We propose a multi-object content delivery method for use in a ubiquitous environment where many nodes are connected. In our target applications, a node receives multiple objects from multiple nodes to display content. We assume that each object is encoded by multiple description coding (MDC) and copies of the descriptions are stored in multiple nodes. We determine the quality of each object based on user requirements and the network bandwidth available for use by nodes. In addition, to reduce the impact of node removal, sending nodes are selected using a heuristic algorithm that reduces the number of overlapping nodes in the delivery path. Experimental results show that our method could determine the quality of objects, the sending nodes, and efficient delivery paths within a reasonable time.
Correlation-aware multimedia content distribution in overlay networks
We address the question: What is the best way to construct a mesh overlay topology for multimedia content distribution, such that the highest streaming rate can be achieved? We model overlay capacity correlations as linear capacity constraints (LCC) and propose a distributed algorithm that constructs an overlay mesh which incorporates heuristically inferred linear capacity constraints. Our simulation results confirm the accuracy of representing overlays using our LCC model and show the LCC-overlay achieving substantial improvement in achievable flow rate.
QBIX-G: a transcoding multimedia proxy
Peter Schojer, Laszlo Böszörmenyi, Hermann Hellwagner
An adaptive multimedia proxy is presented which provides (1) caching, (2) filtering, and (3) media gateway functionalities. The proxy can perform media adaptation on its own, either relying on layered coding or using transcoding in the decompressed domain. A cost model is presented which incorporates user requirements, terminal capabilities, and video variations in one formula. Based on this model, the proxy acts as a general broker of different user requirements and of different video variations. This is a first step towards What You Need is What You Get (WYNIWYG) video services, which deliver videos to users in exactly the quality they need and are willing to pay for. The MPEG-7 and MPEG-21 standards enable this in an interoperable way. A detailed evaluation based on a series of simulation runs is provided.
Preventing DoS attacks in peer-to-peer media streaming systems
William Conner, Klara Nahrstedt, Indranil Gupta
This paper presents a framework for preventing both selfishness and denial-of-service attacks in peer-to-peer media streaming systems. Our framework, called Oversight, achieves prevention of these undesirable activities by running a separate peer-to-peer download rate enforcement protocol along with the underlying peer-to-peer media streaming protocol. This separate Oversight protocol enforces download rate limitations on each participating peer. These limitations prevent selfish or malicious nodes from downloading an overwhelming amount of media stream data that could potentially exhaust the entire system. Since Oversight is based on a peer-to-peer architecture, it can accomplish this enforcement functionality in a scalable, efficient, and decentralized way that fits better with peer-to-peer media streaming systems than solutions based on central server architectures. As peer-to-peer media streaming systems continue to grow in popularity, the threat of selfish and malicious peers participating in such large peer-to-peer networks will continue to grow as well. For example, since peer-to-peer media streaming systems allow users to send small request messages that result in the streaming of large media objects, these systems provide an opportunity for malicious users to exhaust resources in the system with little effort expended on their part. Oversight addresses these threats associated with selfish or malicious peers who cause such disruptions with excessive download requests. We evaluated our Oversight solution through simulations and our results show that applying Oversight to peer-to-peer media streaming systems can prevent both selfishness and denial-of-service attacks by effectively limiting the download rates of all nodes in the system.
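Oversight's enforcement protocol itself is distributed; as a much simpler illustrative stand-in (not the paper's actual protocol), the effect of a per-peer download cap can be modeled with a token bucket, where the rates, burst size, and timestamps below are hypothetical.

```python
class TokenBucket:
    """Token-bucket limiter: a peer may download `amount` units at time `now`
    only if enough tokens have accumulated; tokens refill at `rate` per second
    up to a maximum of `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate      # tokens (e.g., KB) added per second
        self.burst = burst    # bucket capacity
        self.tokens = burst   # start full
        self.last = 0.0       # time of the last request

    def allow(self, now, amount):
        # Refill tokens for the time elapsed since the last request.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False

# Hypothetical peer limited to 100 KB/s with a 200 KB burst allowance.
bucket = TokenBucket(rate=100.0, burst=200.0)
assert bucket.allow(0.0, 150)        # within the initial burst
assert not bucket.allow(0.0, 100)    # only 50 tokens remain: rejected
assert bucket.allow(1.0, 100)        # one second later, refilled to 150
```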
Short Papers: Multimedia Systems
Investigating a stream synchronization middleware for the NEES MAST system
James C. Beyer, Srikanth K. Chirravuri, David H. C. Du
This paper describes the stream synchronization middleware research conducted while investigating how to provide a collaborative experimentation system for the NEES Multi-Axial Sub-Assemblage Testing (MAST) experimental setup at the University of Minnesota. Continuous multimedia streams such as those produced by MAST experiments are characterized by well-defined temporal relationships between subsequent media units (MUs). The information present in these streams can only be presented correctly when these time-dependent relationships are maintained during presentation time. Even if these relationships change during transport (e.g., due to network delays), they need to be reconstructed at the client (sink) before playout. Whereas most previous work addresses synchronization at the application level by modifying the end system, our goal is to leave the end system largely unchanged and simply add a new synchronization middleware control layer. This paper presents our three proposed algorithms that ensure the continuous and synchronous playback of distributed stored multimedia streams across a communications network via a middleware-controlled commercial media player.
A performance model of effective memory management in HYDRA: a large scale data stream recording system
Presently, digital continuous media (CM) are well established as an integral part of many applications. Scant attention has been paid to servers that can record such streams in real time. However, more and more devices produce direct digital output streams. Hence, the need arises to capture and store these streams with an efficient recorder that can handle both recording and playback of many streams simultaneously and provide a central repository for all data. Because of the continuously decreasing cost of memory, more and more memory is available on a large scale recording system. Unlike most previous work that focuses on how to minimize the server buffer size, this paper investigates how to effectively utilize the additional available memory resources in a recording system. We propose an effective resource management framework that has two parts: (1) a dynamic memory allocation strategy, and (2) a deadline setting policy (DSP) that can be applied consistently to both playback and recording streams, satisfying the timing requirements of CM, and also ensuring fairness among different streams. Furthermore, to find the optimal memory configuration, we construct a probability model based on the classic M/G/1 queueing model and the recently developed Real Time Queueing Theory (RTQT). Our model can predict (a) the missed deadline probability of a playback stream, and (b) the blocking probability of recording streams. The model is applicable to admission control and capacity planning in a recording system.
Sender-driven bandwidth differentiation for transmitting multimedia flows over TCP
K. H. Lau, Jack Y. B. Lee
Over the years the Internet has shown extraordinary scalability and robustness in spite of the explosive growth in geographical reach, user population size, as well as network traffic volume. This scalability and robustness is, in no small part, supported by the Internet's transport protocols, the Transmission Control Protocol (TCP) in particular. Nevertheless, with the rapid growth of multimedia-rich contents in the Internet, such as audio and video, the many strengths of TCP in data delivery are slowly imposing bottlenecks in multimedia data delivery where different media data flows often have different needs for bandwidth. As TCP's congestion control algorithm enforces fair bandwidth sharing among traffic flows sharing the same network bottleneck, different media data flows will receive the same bandwidth irrespective of the actual needs of the multimedia data being delivered. This work addresses this limitation by proposing a new algorithm to achieve non-uniform bandwidth allocation among TCP flows originating from the same sender passing through the same network bottleneck to multiple receivers. The proposed algorithm, called Virtual Packet Substitution (VPS), has four desirable features: (a) it allows the allocation of bottleneck bandwidth between a group of TCP flows; (b) the resultant traffic flows as a whole, maintain the same fair bandwidth sharing property with other competing TCP flows; (c) it can be implemented entirely in the sender's TCP protocol stack; and (d) it is compatible with and does not require modification to existing TCP protocol stack at the clients. Simulation results show that the proposed VPS algorithm can achieve accurate bandwidth allocation while still maintaining fair bandwidth sharing with competing TCP flows.
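The allocation objective of VPS, preserving the group's aggregate TCP-fair share while redistributing it non-uniformly among member flows, can be illustrated with a minimal sketch. The flow names, weights, and rates are hypothetical, and this does not model VPS's in-stack virtual packet substitution mechanism itself.

```python
def allocate_group_bandwidth(aggregate_fair_share, weights):
    """Split the aggregate fair-share bandwidth earned by a group of TCP
    flows among its members in proportion to per-flow weights. The group
    total is preserved, so fairness toward competing flows is unchanged."""
    total = sum(weights.values())
    return {flow: aggregate_fair_share * w / total
            for flow, w in weights.items()}

# Hypothetical: a 6 Mbit/s aggregate share split 3:2:1 among media flows.
shares = allocate_group_bandwidth(6.0, {"video_hd": 3, "video_sd": 2, "audio": 1})
assert abs(sum(shares.values()) - 6.0) < 1e-9   # aggregate preserved
assert shares["video_hd"] == 3.0                # weighted, not equal, shares
```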
FlexSplit: a workload-aware adaptive load balancing strategy for media cluster
Qi Zhang, Ludmila Cherkasova, Evgenia Smirni
A number of technology and workload trends motivate us to consider a new request distribution and load balancing strategy for streaming media clusters. First, in emerging media workloads, a significant portion of the content is short and encoded at low bit rates. Additionally, media workloads display a strong temporal and spatial locality. This makes modern servers with gigabytes of main memory well suited to deliver a large fraction of accesses to popular files from memory. Second, a specific characteristic of streaming media workloads is that many clients do not finish playing an entire media file, which results from the browsing nature of a large fraction of client accesses. In this paper, we propose and evaluate two new load-balancing strategies for media server clusters. The proposed strategies, FlexSplit and FlexSplitLard, aim to efficiently utilize the combined cluster memory by exploiting specific media workload properties and "tuning" their behavior to changes in media file popularity. The ability of the proposed policies to self-adapt to changing workloads across time while maintaining high performance makes these strategies an attractive choice for load balancing in media server clusters.
Cascades: scalable, flexible, and composable middleware for multi-modal sensor networking applications
Jie Huang, Wu-chi Feng, Nirupama Bulusu, et al.
This paper describes the design and implementation of Cascades, a scalable, flexible and composable middleware platform for multi-modal sensor networking applications. The middleware is designed to provide a way for application writers to use pre-packaged routines as well as incorporate their own application-tailored code when necessary. As sensor systems become more diverse in both hardware and sensing modalities, such systems-level support will become critical. Furthermore, the systems software must not only be flexible, but also be efficient and provide high performance. Experimentation in this paper compares and contrasts several possible implementations based upon testbed measurements on embedded devices. Our experimentation shows that such a system can indeed be constructed.
Compression by indexing: an improvement over MPEG-4 body animation parameter compression
Siddhartha Chattopadhyay, Suchendra M. Bhandarkar, Kang Li
Body Animation Parameters (BAPs) are used to animate MPEG-4 compliant virtual human-like characters. In order to stream BAPs in real time interactive environments, the BAPs are compressed for low bitrate representation using a standard MPEG-4 compression pipeline. However, the standard MPEG-4 compression is inefficient for streaming to power-constrained devices, since the streamed data requires extra power in terms of CPU cycles for decompression. In this paper, we have proposed and implemented an indexing technique for a BAP data stream, resulting in a compressed representation of the motion data. The resulting compressed representation of the BAPs is superior to the MPEG-4-based BAP compression in terms of both required network throughput and power consumption at the client end to receive the compressed data stream and extract the original BAP data from the compressed representation. Although the resulting motion after de-compression at the client end is lossy, the motion distortion is minimized by intelligent use of the hierarchical structure of the skeletal avatar model. Consequently, the proposed indexing method is ideal for streaming of motion data to power- and network-constrained devices such as PDAs, Pocket PCs and Laptop PCs operating in battery mode and other devices in a mobile network environment.
Peer-to-Peer
DagStream: locality aware and failure resilient peer-to-peer streaming
Live peer to peer (P2P) media streaming faces many challenges such as peer unreliability and bandwidth heterogeneity. To effectively address these challenges, general "mesh" based P2P streaming architectures have recently been adopted. Mesh-based systems allow peers to aggregate bandwidth from multiple neighbors, and dynamically adapt to changing network conditions and neighbor failures. However, a drawback of mesh-based overlays is that it is difficult to guarantee network connectivity in a distributed fashion, especially when network locality needs to be optimized. This paper introduces a new P2P streaming framework called DagStream, which (1) organizes peers into a directed acyclic graph (DAG) where each node maintains at least k parents, and thus has provable network connectivity (and hence failure resilience), and (2) enables peers to quickly achieve locality awareness in a distributed fashion, and thus ensures efficient network resource usage. Our experimental results in both simulation and wide area environments show that with our DagStream protocol, peers can quickly self-organize into a locality aware DAG. Further, by selecting additional parents as needed, peers can achieve good streaming quality commensurate with their downlink bandwidth.
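The two structural invariants named in the abstract (at least k parents per node, acyclicity of the parent relation) can be sketched with a centralized check. The overlay data and the value of k below are illustrative; real DagStream builds and repairs the DAG distributedly, and peers near the source may have fewer candidate parents, which this sketch does not capture.

```python
def is_valid_dagstream_overlay(parents, source, k):
    """Centralized sketch of two invariants from the DagStream description:
    (1) every non-source peer keeps at least k parents, and
    (2) the parent relation forms a directed acyclic graph."""
    # Invariant (1): minimum parent count for failure resilience.
    for node, ps in parents.items():
        if node != source and len(ps) < k:
            return False
    # Invariant (2): cycle detection via DFS over parent edges.
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in parents}
    def acyclic_from(n):
        color[n] = GRAY
        for p in parents[n]:
            if color[p] == GRAY:                      # back edge => cycle
                return False
            if color[p] == WHITE and not acyclic_from(p):
                return False
        color[n] = BLACK
        return True
    return all(acyclic_from(n) for n in parents if color[n] == WHITE)

# Hypothetical 5-peer overlay with k = 1 (peer names are illustrative).
overlay = {"src": [], "a": ["src"], "b": ["src", "a"],
           "c": ["a", "b"], "d": ["b", "c"]}
assert is_valid_dagstream_overlay(overlay, "src", 1)

# Two peers adopting each other as parents creates a cycle and is rejected.
cyclic = {"src": [], "a": ["src", "b"], "b": ["src", "a"]}
assert not is_valid_dagstream_overlay(cyclic, "src", 1)
```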
Characterizing files in the modern Gnutella network: a measurement study
Shanyu Zhao, Daniel Stutzbach, Reza Rejaie
The Internet has witnessed an explosive increase in the popularity of Peer-to-Peer (P2P) file-sharing applications during the past few years. As these applications become more popular, it becomes increasingly important to characterize their behavior in order to improve their performance and quantify their impact on the network. In this paper, we present a measurement study on characteristics of available files in the modern Gnutella system. We developed a new methodology to capture accurate "snapshots" of available files in a large scale P2P system. This methodology was implemented in a parallel crawler that captures the entire overlay topology of the system where each peer in the overlay is annotated with its available files. We have captured tens of snapshots of the Gnutella system and conducted three types of analysis on available files: (i) Static analysis, (ii) Topological analysis and (iii) Dynamic analysis. Our results reveal several interesting properties of available files in Gnutella that can be leveraged to improve the design and evaluations of P2P file-sharing applications.
Sampling cluster endurance for peer-to-peer based content distribution networks
Several types of Content Distribution Networks are being deployed over the Internet today, based on different architectures to meet their requirements (e.g., scalability, efficiency and resiliency). Peer-to-Peer (P2P) based Content Distribution Networks are promising approaches that have several advantages. Structured P2P networks, for instance, take a proactive approach and provide efficient routing mechanisms. Nevertheless, their maintenance can increase considerably in highly dynamic P2P environments. In order to address this issue, a two-tier architecture that combines a structured overlay network with a clustering mechanism is suggested in a hybrid scheme. In this paper, we examine several sampling algorithms utilized in the aforementioned hybrid network that collect local information in order to apply a selective join procedure. The algorithms are based mostly on random walks inside the overlay network. The aim of the selective join procedure is to provide a well balanced and stable overlay infrastructure that can easily overcome the unreliable behavior of the autonomous peers that constitute the network. The sampling algorithms are evaluated using simulation experiments where several properties related to the graph structure are revealed.
How efficient is BitTorrent?
BitTorrent is arguably the most popular media file distribution protocol used on the Internet today. Even though empirically BitTorrent seems to be both efficient and scalable, there has been very little research on the detailed dynamics of its built-in control mechanisms, and their effectiveness across a wide variety of network configurations and protocol parameters. The main goal of this paper is to answer the question of how close BitTorrent is to the optimum, and indirectly how much room there is for further performance optimizations. We develop a centrally scheduled file distribution (CSFD) protocol that can provably minimize the total elapsed time of a one-sender-multiple-receiver file distribution task, and perform a comprehensive comparison between BitTorrent and CSFD. In addition, we compare several peer selection algorithms and analyze the applicability of BitTorrent to real-time streaming applications.