Title: A Measurement Study of Peer-to-Peer File Sharing Systems
1 A Measurement Study of Peer-to-Peer File Sharing Systems
- Stefan Saroiu, P. Krishna Gummadi, Steven D. Gribble
- Presented by Zhengxiang Pan
- March 18th, 2003
2 Introduction
- Napster and Gnutella
- Population of users
- Bottleneck bandwidth of hosts and latencies
- Duration that peers remain connected
- Number of files shared and downloaded
3 Methodology-architecture
- Napster's architecture
- A cluster of central servers
- Each peer connects to one server
- Servers cooperate to process query
- Gnutella's architecture
- No centralized servers
- Peers form an overlay network
- Queries are sent by controlled flooding
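The controlled flood can be illustrated with a small sketch. It assumes the flood is bounded by a time-to-live (TTL) counter that shrinks by one per hop and that peers drop queries they have already seen. The `overlay` dictionary and the `flood_query` function are hypothetical names used only for this illustration, not part of any Gnutella client.

```python
from collections import deque


def flood_query(overlay, source, ttl):
    """Return the set of peers reached by a TTL-limited flood from `source`.

    `overlay` maps each peer to the peers it is directly connected to.
    """
    reached = {source}
    frontier = deque([(source, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward the query
        for neighbor in overlay.get(peer, ()):
            if neighbor not in reached:  # duplicates are dropped, not re-forwarded
                reached.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return reached


# Toy overlay (illustrative data only).
overlay = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

for ttl in (1, 2, 3):
    print(ttl, sorted(flood_query(overlay, "a", ttl)))
```

A larger TTL reaches more of the overlay at the cost of more messages, which is the trade-off the "controlled" part of the flood is meant to manage.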
4 Methodology-crawler
- Napster crawler
- A large number of connections to a single server
- Issue popular queries in parallel
- Captured 40-60% of local users
- Gnutella crawler
- Iteratively send ping messages with large TTLs
- Discover new hosts by receiving pong messages
- Captured 25-50% of the total population
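The iterative ping/pong discovery can be sketched as a simple worklist loop. This is a simulation under stated assumptions, not the crawler used in the study: `send_ping` is a hypothetical callback standing in for the real network exchange and returns the hosts named in the pong replies, and `fake_network` is made-up data.

```python
def crawl(seed_hosts, send_ping, max_hosts=10_000):
    """Discover hosts by repeatedly pinging every newly learned host.

    `send_ping(host)` is assumed to return an iterable of hosts that
    answered with pong messages. `max_hosts` caps the crawl: no further
    pings are issued once that many hosts are known.
    """
    discovered = set(seed_hosts)
    pending = list(seed_hosts)
    while pending and len(discovered) < max_hosts:
        host = pending.pop()
        for peer in send_ping(host):
            if peer not in discovered:
                discovered.add(peer)
                pending.append(peer)  # ping the new host in a later round
    return discovered


# Simulated pong replies (illustrative data only).
fake_network = {
    "h1": ["h2", "h3"],
    "h2": ["h1", "h4"],
    "h3": [],
    "h4": ["h5"],
    "h5": [],
    "h6": ["h1"],  # never named in any pong, so the crawl cannot find it
}


def fake_ping(host):
    return fake_network.get(host, [])


print(sorted(crawl(["h1"], fake_ping)))
```

Hosts such as `h6` that no reachable peer advertises stay invisible, which is one reason a crawl of this kind captures only part of the population.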
5 Methodology-directly measured characteristics
- Latency
- Measure the time spent exchanging a 40-byte TCP packet
- Lifetime
- Offline: does not respond to TCP SYN packets
- Inactive: responds with TCP RST
- Active: accepts the connection
- Bottleneck bandwidth
- Used as an approximation of available bandwidth
- Actively measure upstream and downstream using a few TCP packets
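The three lifetime states listed above map naturally onto the outcome of a TCP connection attempt. The following is a minimal sketch, assuming a refused connection corresponds to a TCP RST and that a timeout or any other socket error means no usable response. `classify_peer` is a hypothetical helper; the `connect` parameter exists only so the function can be exercised without touching the network.

```python
import socket


def classify_peer(host, port, timeout=2.0, connect=socket.create_connection):
    """Classify a peer as 'active', 'inactive', or 'offline'.

    active:   the TCP connection is accepted
    inactive: the host answers with RST (connection refused)
    offline:  no answer to the SYN (timeout or other socket error)
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "inactive"
    except OSError:  # includes socket.timeout and unreachable-host errors
        return "offline"
    conn.close()
    return "active"
```

One caveat with this mapping: a firewall that silently drops SYN packets makes a live host look offline, so the classification reflects what the probe observes rather than the host's true state.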
6 Results-bandwidth
Downstream and upstream bottleneck bandwidth
- 50% in Napster and 60% in Gnutella use broadband connections
- 25% in Napster and 8% in Gnutella use modems
- 20% in Napster and 30% in Gnutella have high bandwidth (>3 Mbps)
7 Result-reported bandwidth
22% of Napster users report their bandwidth as unknown
8 Result-latency
Latencies for Gnutella users
- Unstructured, ad-hoc overlay: a substantial fraction of peers suffer from high latency
- Difference in trans-oceanic peers
9 Result-availability
- Only 20% of peers had an IP-level uptime of 93% or more
- Median session duration: 60 minutes
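A figure such as "20% of peers had an uptime of 93% or more" can be derived from periodic probe logs. The sketch below assumes IP-level uptime is the fraction of probes to which a host responded at all. The function names and the sample data are hypothetical, chosen only to show the arithmetic.

```python
def ip_uptime(probe_results):
    """Fraction of probes that got any response (True = host responded)."""
    if not probe_results:
        raise ValueError("no probes recorded for this peer")
    return sum(probe_results) / len(probe_results)


def fraction_with_uptime_at_least(peers, threshold):
    """Share of peers whose IP-level uptime is at least `threshold`."""
    qualifying = sum(
        1 for results in peers.values() if ip_uptime(results) >= threshold
    )
    return qualifying / len(peers)


# Made-up probe logs: ten probes per peer.
peers = {
    "p1": [True] * 10,
    "p2": [True] * 9 + [False],
    "p3": [True] * 5 + [False] * 5,
    "p4": [False] * 10,
    "p5": [True] * 10,
}

print(fraction_with_uptime_at_least(peers, 0.93))
```

In this toy data only `p1` and `p5` clear the 93% threshold, so two of the five peers qualify.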
10 Result-files
- 25% of Gnutella peers do not share any files
- 40-60% of peers share 5-20% of the shared files
11 Result-download and upload
The percentage of peers in each bandwidth class
is roughly the same as the percentage of files
shared by that bandwidth class.
12 Result-cooperation
- 30% of the users that report their bandwidth as 64 Kbps or less actually have a significantly greater bandwidth.
- 10% of the users reporting high bandwidth (3 Mbps or higher) in reality have significantly lower bandwidth.
13 Result-resilience of Gnutella overlay
Although highly resilient in the face of random
breakdowns, Gnutella is nevertheless highly
vulnerable in the face of well-orchestrated,
targeted attacks.
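The contrast between random breakdowns and targeted attacks can be seen on a toy overlay with a few highly connected hubs. This is an illustrative sketch only: the hub-and-leaf topology, `build_toy_overlay`, and `largest_component` are assumptions made for the example and do not reproduce the measured Gnutella graph.

```python
import random


def largest_component(overlay, removed):
    """Size of the largest connected group of peers after removing `removed`."""
    alive = set(overlay) - set(removed)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:  # depth-first walk over surviving peers
            node = stack.pop()
            size += 1
            for nbr in overlay[node]:
                if nbr in alive and nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        best = max(best, size)
    return best


def build_toy_overlay(hubs=3, leaves_per_hub=5):
    """Hubs are fully connected to each other; each leaf attaches to one hub."""
    overlay = {f"hub{i}": set() for i in range(hubs)}
    for i in range(hubs):
        for j in range(hubs):
            if i != j:
                overlay[f"hub{i}"].add(f"hub{j}")
        for k in range(leaves_per_hub):
            leaf = f"leaf{i}_{k}"
            overlay[leaf] = {f"hub{i}"}
            overlay[f"hub{i}"].add(leaf)
    return overlay


overlay = build_toy_overlay()

# Targeted attack: remove the three best-connected peers.
targeted = sorted(overlay, key=lambda n: len(overlay[n]), reverse=True)[:3]
print("targeted:", largest_component(overlay, targeted))

# Random breakdown: remove three peers chosen at random (seeded for repeatability).
rng = random.Random(0)
failed = rng.sample(sorted(overlay), 3)
print("random:  ", largest_component(overlay, failed))
```

Removing the three hubs leaves every surviving peer isolated, whereas a random failure most often hits leaves and leaves the rest of the overlay connected.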
14 Conclusion
- Heterogeneity of hosts
- Carefully delegate responsibilities
- Clear evidence of client-like and server-like behaviors
- Peers tend to misreport information if there is an incentive to do so
- Built-in incentive for telling the truth
- Verify reported information