
Proceedings Paper

Understanding the practical limits of the Gnutella P2P system: an analysis of query terms and object name distributions

Paper Abstract

A number of prior efforts analyzed the behavior of popular peer-to-peer (P2P) systems and proposed ways for maintaining the overlays as well as methods for searching for content using these overlays. However, little was known about how successful users could be in locating the shared objects in these systems. There might be a mismatch between the way content creators named objects and the way such objects were queried by the consumers. Our aim was to examine the terms used in the queries and shared object names in the Gnutella file-sharing system. We analyzed the object names of over 20 million objects collected from 40,000 peers as well as terms from over 230,000 queries. We observed that almost half (44.4%) of the queries had no matching objects in the system regardless of the overlay or search mechanism used to locate the objects. We also evaluated the query success rates against random peer groups of various sizes (200, 1K, 2K, 3K, 4K, 5K, 10K and 20K peers sampled from the full 40,000 peers). We showed that the success rates increased rapidly from 200 to 5,000 peers, but only exhibited modest improvements when increasing the number of peers beyond 5,000. Finally, we observed Zipf-like distributions for the query terms and the object names. However, the relative popularity of a term in the object names did not correlate with the term's popularity in the query workload. This observation affected the ability of hybrid P2P systems to guide searches by creating a synopsis of the peer object names. A synopsis created by using the distribution of terms in the object names need not represent relevant terms for the query. Our results can be used to guide the design of future P2P systems that are optimized for the observed object names and user query behavior.
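
The success-rate measurement described in the abstract can be illustrated with a small Python sketch. It is not the authors' code. It assumes a query succeeds when every one of its terms appears in at least one object name held by a peer in the sampled group; the abstract does not state the matching rule. The names `tokenize` and `success_rate` and the toy data are hypothetical.

```python
import random
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}


def success_rate(queries, peers, group_size, seed=0):
    """Fraction of queries with at least one matching object name
    in a random group of `group_size` peers.

    Assumed matching rule: all query terms occur in a single object name.
    """
    if not queries:
        return 0.0
    rng = random.Random(seed)
    # Sample peers without replacement, capped at the number available.
    group = rng.sample(sorted(peers), min(group_size, len(peers)))
    names = [tokenize(name) for peer in group for name in peers[peer]]
    hits = 0
    for query in queries:
        terms = tokenize(query)
        if terms and any(terms <= name_terms for name_terms in names):
            hits += 1
    return hits / len(queries)


# Toy data, purely illustrative.
peers = {
    "p1": ["Beatles - Yesterday.mp3"],
    "p2": ["holiday_photos.zip"],
    "p3": ["linux kernel 2.6.tar.gz"],
}
queries = ["beatles yesterday", "madonna", "linux kernel"]

rate_all = success_rate(queries, peers, group_size=3)
print(f"success rate over all peers: {rate_all:.3f}")
```

Repeating the call with growing `group_size` values (and several seeds) would give a curve of success rate against peer-group size of the kind the abstract summarizes.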

Paper Details

Date Published: 28 January 2008
PDF: 12 pages
Proc. SPIE 6818, Multimedia Computing and Networking 2008, 681807 (28 January 2008); doi: 10.1117/12.775128
William Acosta, Univ. of Notre Dame (United States)
Surendar Chandra, Univ. of Notre Dame (United States)


Published in SPIE Proceedings Vol. 6818:
Multimedia Computing and Networking 2008
Reza Rejaie; Roger Zimmermann, Editor(s)

© SPIE.