Analyzing Peer-to-Peer Traffic Across Large Networks

1
Analyzing Peer-to-Peer Traffic Across Large
Networks
  • Jia Wang
  • Joint work with Subhabrata Sen
  • AT&T Labs - Research

2
P2P applications
  • Distributed file sharing
  • Napster, Gnutella, FastTrack, eDonkey,
    DirectConnect
  • Searching vs. data-fetching phases
  • All the communications occur over default ports
  • SuperNodes and Hubs
  • Why is this interesting?
  • Large and growing traffic volume

3
Outline
  • Methodology
  • Data collection
  • Characterization metrics
  • Analysis results
  • Traffic volume and overlay topology
  • System dynamics
  • Traffic characterization
  • P2P vs Web

4
Methodology
  • Challenges
  • Decentralized system
  • Transient peer membership
  • Some popular protocols are closed and proprietary
  • Large-scale passive measurement
  • Flow-level data from routers across a large
    tier-1 ISP backbone
  • Analyze both signaling and data fetching traffic
  • 3 levels of granularity: IP, prefix, AS
  • P2P protocols (default ports)
  • FastTrack: 1214 (including Morpheus)
  • Gnutella: 6346/6347
  • DirectConnect: 411/412
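
A minimal sketch of port-based flow classification, using only the default ports listed on this slide. The names `P2P_PORTS` and `classify_flow` are hypothetical, and matching on either the source or the destination port is an assumption; the slides do not say which side the authors matched.

```python
from typing import Optional

# Default ports taken from the slide; the mapping name is illustrative.
P2P_PORTS = {
    1214: "FastTrack",      # includes Morpheus
    6346: "Gnutella",
    6347: "Gnutella",
    411: "DirectConnect",
    412: "DirectConnect",
}


def classify_flow(src_port: int, dst_port: int) -> Optional[str]:
    """Return the P2P system name if either port is a known default port.

    Assumption: a flow counts as P2P when either endpoint uses a default port.
    """
    for port in (src_port, dst_port):
        if port in P2P_PORTS:
            return P2P_PORTS[port]
    return None


if __name__ == "__main__":
    print(classify_flow(1214, 40000))
    print(classify_flow(51000, 6347))
    print(classify_flow(80, 51000))
```

A lookup table keeps the classifier independent of protocol internals, which matches the "minimal knowledge" advantage claimed on the next slide, but it will miss peers running on non-default ports.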

5
Methodology Discussion
  • Advantages
  • Requires minimal knowledge of P2P protocols
    (port number)
  • Large scale non-intrusive measurement
  • More complete view of P2P traffic
  • Allows localized analysis
  • Limitations
  • Flow-level data: no application-level details
  • Incomplete traffic flows
  • Other issues
  • DHCP, NAT, proxy
  • Host ≠ IP
  • Asymmetric IP routing

6
Measurements
  • Characterization
  • Overlay network topology
  • Traffic distribution
  • Dynamic behavior
  • Metrics
  • Host distribution
  • Host connectivity
  • Traffic volume
  • Mean bandwidth usage
  • Traffic pattern over time
  • Connection duration and on-time
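
The per-host metrics above could be computed from flow records roughly as follows. This is a sketch under stated assumptions: the `Flow` record layout and the `host_metrics` helper are hypothetical, and "connectivity" is taken to mean the number of distinct peers a host exchanges flows with, which the slides do not define explicitly.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Flow(NamedTuple):
    """Hypothetical flow record: source host, destination host, byte count."""
    src: str
    dst: str
    nbytes: int


def host_metrics(
    flows: Iterable[Flow],
) -> Tuple[Dict[str, int], Dict[str, int], Dict[str, int]]:
    """Return (connectivity, bytes sent, bytes received), each keyed by host."""
    peers = defaultdict(set)
    sent = defaultdict(int)
    received = defaultdict(int)
    for f in flows:
        # Count a peer relationship in both directions.
        peers[f.src].add(f.dst)
        peers[f.dst].add(f.src)
        sent[f.src] += f.nbytes
        received[f.dst] += f.nbytes
    connectivity = {host: len(p) for host, p in peers.items()}
    return connectivity, dict(sent), dict(received)


if __name__ == "__main__":
    sample = [Flow("a", "b", 100), Flow("a", "c", 50), Flow("b", "a", 10)]
    print(host_metrics(sample))
```

The same function works at prefix or AS granularity if `src` and `dst` are replaced by the prefix or AS that each address maps to.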

7
Data cleaning
  • Invalid IPs
  • 10.0.0.0-10.255.255.255
  • 172.16.0.0-172.31.255.255
  • 192.168.0.0-192.168.255.255
  • No matched prefixes in routing tables
  • Invalid AS numbers
  • > 64512
  • Removed 4% of flows
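
The IP and AS filters on this slide can be sketched with Python's standard `ipaddress` module. The three private ranges are written in CIDR form (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). The function names are hypothetical, and the routing-table prefix match is left out because it depends on data not shown in the slides.

```python
import ipaddress

# Private address ranges listed on the slide, in CIDR notation.
PRIVATE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# The slide treats AS numbers greater than this value as invalid.
MAX_VALID_ASN = 64512


def is_valid_ip(addr: str) -> bool:
    """True if the address is outside the private ranges above."""
    ip = ipaddress.ip_address(addr)
    return not any(ip in net for net in PRIVATE_NETS)


def is_valid_asn(asn: int) -> bool:
    """True if the AS number does not exceed the slide's cutoff."""
    return asn <= MAX_VALID_ASN


def keep_flow(src_ip: str, dst_ip: str, src_asn: int, dst_asn: int) -> bool:
    """Keep a flow only if both endpoints pass the IP and AS checks."""
    return (
        is_valid_ip(src_ip)
        and is_valid_ip(dst_ip)
        and is_valid_asn(src_asn)
        and is_valid_asn(dst_asn)
    )


if __name__ == "__main__":
    print(keep_flow("192.0.2.10", "198.51.100.7", 7018, 3356))
    print(keep_flow("10.1.2.3", "198.51.100.7", 7018, 3356))
```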

8
Overview of P2P traffic
  • Total: 800 million flow records
  • FastTrack is the most popular system

9
Host distribution

10
Host connectivity
FastTrack (9/14/2001)
  • Connectivity is very small for most hosts, very
    high for a few hosts
  • Distribution is less skewed at the prefix and AS
    levels

11
Traffic volume distribution
FastTrack (9/14/2001)
  • Significant skews in traffic volume across
    granularities
  • Few entities source most of the traffic
  • Few entities receive most of the traffic
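
One way to quantify this kind of skew is the share of total volume contributed by the top fraction of entities (hosts, prefixes, or ASes). The sketch below is illustrative only; `top_share` is a hypothetical helper, and the sample volumes are made up, not taken from the measurements.

```python
import math
from typing import Sequence


def top_share(volumes: Sequence[float], fraction: float) -> float:
    """Share of total volume contributed by the top `fraction` of entities.

    At least one entity is always included, so very small fractions
    still return the share of the single largest contributor.
    """
    ordered = sorted(volumes, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    k = max(1, math.ceil(fraction * len(ordered)))
    return sum(ordered[:k]) / total


if __name__ == "__main__":
    # Made-up per-host byte counts, purely for illustration.
    volumes = [70, 10, 5, 5, 4, 3, 1, 1, 1, 0]
    print(top_share(volumes, 0.1))
    print(top_share(volumes, 0.5))
```

Running the same function on bytes sent and on bytes received gives the "source" and "receive" views separately.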

12
Mean bandwidth usage
FastTrack (9/14/2001)
  • Upstream usage < downstream usage. Possible
    causes are:
  • Asymmetric available BW, e.g., DSL, cable
  • Users/ISPs rate-limiting upstream data
    transfers

13
Time of day effect
FastTrack (9/14/2001 GMT)
  • Traffic volume exhibits very strong time-of-day
    effect
  • Milder time-of-day variation for hosts in the
    system

14
Host connection duration and on-time
FastTrack (9/14/2001), thd = 30 min
  • Substantial transience: most hosts stay in the
    system for a short time
  • Distribution less skewed at the prefix and AS
    levels
  • Using per-cluster or per-AS indexing/caching
    nodes may help
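
The slides do not spell out how on-time is computed. One plausible reading of the 30-minute threshold in the slide heading, shown here as an assumption rather than the authors' definition, is that gaps between a host's flows of up to 30 minutes are bridged into a single session, and on-time is the summed length of those sessions. The name `on_time` and the `(start, end)` interval layout are hypothetical.

```python
from typing import Iterable, Tuple

THRESHOLD_S = 30 * 60  # 30-minute idle threshold, in seconds


def on_time(
    intervals: Iterable[Tuple[float, float]],
    threshold: float = THRESHOLD_S,
) -> float:
    """Total active time for one host, in seconds.

    `intervals` holds (start, end) times of the host's flows. Flows whose
    gap to the current session is at most `threshold` extend that session;
    a larger gap starts a new one.
    """
    total = 0.0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is not None and start - cur_end <= threshold:
            cur_end = max(cur_end, end)
        else:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
    if cur_end is not None:
        total += cur_end - cur_start
    return total


if __name__ == "__main__":
    flows = [(0, 600), (1200, 1800), (10000, 10300)]
    print(on_time(flows))
```

In this example the first two flows merge into one 1800-second session, and the third forms a separate 300-second session.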

15
Traffic characterization
  • The power law
  • May not be a suitable model for P2P traffic
  • Relationship between metrics
  • Traffic volume
  • Number of IPs
  • On-time
  • Mean bandwidth usage

16
Traffic volume vs. on-time
FastTrack (9/14/2001), top 1% hosts (73% volume)
  • Volume heavy hitters tend to have long on-times
  • Hosts with short on-times contribute small
    traffic volumes

17
Connectivity vs. on-time
FastTrack (9/14/2001), top 1% hosts (73% volume)
  • Hosts with high connectivity have long on-times
  • Hosts with short on-times communicate with few
    other hosts

18
P2P vs Web
  • Observations
  • 97% of prefixes contributing P2P traffic also
    contribute Web traffic
  • Heavy hitter prefixes for P2P traffic tend to be
    heavy hitters for Web traffic
  • Prefix stability: the daily traffic volume (in
    %) from the prefix does not change over days
  • Experiments: the top 0.01%, 0.1%, 1%, 10% heavy
    hitters contribute > 10%, 30%, 50%, 90% of the
    traffic volume
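
The stability notion above can be sketched as follows: compute each prefix's percentage share of a day's traffic, then look at how much that share moves from one day to the next. The slides give no formula, so the largest day-to-day change in percentage points is used here as an illustrative measure; `daily_share` and `max_share_change` are hypothetical names, and the sample data is made up.

```python
from typing import Dict, Sequence


def daily_share(volumes_by_prefix: Dict[str, float]) -> Dict[str, float]:
    """Each prefix's share of the day's total volume, in percent."""
    total = sum(volumes_by_prefix.values())
    if total == 0:
        return {prefix: 0.0 for prefix in volumes_by_prefix}
    return {p: 100.0 * v / total for p, v in volumes_by_prefix.items()}


def max_share_change(days: Sequence[Dict[str, float]], prefix: str) -> float:
    """Largest day-to-day change in a prefix's share, in percentage points.

    A small value means the prefix's contribution is stable over the period.
    """
    shares = [daily_share(day).get(prefix, 0.0) for day in days]
    return max((abs(b - a) for a, b in zip(shares, shares[1:])), default=0.0)


if __name__ == "__main__":
    # Made-up daily byte counts for two documentation prefixes.
    days = [
        {"192.0.2.0/24": 50, "198.51.100.0/24": 50},
        {"192.0.2.0/24": 60, "198.51.100.0/24": 40},
        {"192.0.2.0/24": 110, "198.51.100.0/24": 90},
    ]
    print(max_share_change(days, "192.0.2.0/24"))
```

Using shares rather than raw bytes separates a prefix's relative contribution from overall growth in traffic.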

19
Traffic stability
March 2002 (panels: top 0.01% prefixes, top 1% prefixes)
  • P2P traffic contributed by the top heavy-hitter
    prefixes is more stable than either Web or total
    traffic

20
Summary
  • Measure and characterize P2P traffic across a
    large network
  • Three popular P2P systems
  • Significant increase in both number of users and
    traffic volume
  • Traffic distributions are highly skewed
  • High level system dynamics
  • P2P is a significant but stable component of
    Internet traffic

21
Acknowledgement
  • AT&T Labs
  • Matt Grossglauser, Carsten Lund, Jennifer
    Rexford, Matt Roughan, Fred True
  • External
  • Steve Gribble