Title: Measurements, Analysis, and Modeling of BitTorrent-like Systems
1Measurements, Analysis, and Modeling of
BitTorrent-like Systems
- Lei Guo1, Songqing Chen2, Zhen Xiao3,
- Enhua Tan1, Xiaoning Ding1, and Xiaodong Zhang1
- 1College of William and Mary
- 2George Mason University, 3AT&T Labs - Research
2Basic Model of P2P Systems
- Peers sharing different files self-organize into a P2P network
- Exchange files they desire
- Limitations
- Free riding
- Large file downloading
Examples: Gnutella, KaZaa, eDonkey/eMule/Overnet
3BitTorrent Fast Delivery with Incentive
- A large file is divided into chunks
- Peers interested in the same file self-organize into a torrent
- Peers exchange file chunks with each other
- Incentive is established by tit-for-tat
- Very simple and effective; scales fairly well during flash crowds
Torrent of Bits
4BitTorrent Traffic
- Online users
- 6.8 million in August 2004, 9.6 million in August 2005 (BigChampagne)
- Traffic volume
- 53% of all P2P traffic on the Internet in June 2004 (CacheLogic)
P2P traffic: 60-80%; other traffic: 20-30% (Source: CacheLogic, 2004)
5Limited Understanding of BitTorrent
- Existing studies on BitTorrent systems (INFOCOM04, SIGCOMM04)
- Unrealistic assumptions in the system model: no evolution considered
- Single-torrent based: more than 85% of BT users join multiple torrents
- What we are not clear about in BitTorrent systems
- Service availability
- Service stability
- Service fairness
- Our objective in this work
- Evolution of a single-torrent system, and limitations of BT
- Multi-torrent model for inter-torrent relation and collaboration
during the entire lifetime
6Outline
- BitTorrent mechanism and our methodology
- Modeling and characterization of single-torrent system
- Modeling and characterization of multi-torrent system
- Inter-torrent collaboration
- Conclusion
7How BitTorrent Works Publishing
announce: tracker URL for bootstrap
creation date: epoch time of file creation
length: file size
name: file name
piece length: chunk size
pieces: SHA1 hash key of each chunk
peer list
- The publisher
- Create a meta file
- Publish on a Web site
- Start the tracker site
- Start a BT client as the initial seed
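The "create a meta file" step can be illustrated with a minimal Python sketch. It builds the fields listed on this slide and computes one SHA1 digest per chunk. The function name `build_metainfo`, the default chunk size, and the example tracker URL are assumptions made for illustration. The sketch keeps the flat layout shown on the slide; a real .torrent file nests the file fields under an `info` key and is bencoded, which is omitted here.

```python
import hashlib
import time


def build_metainfo(data: bytes, name: str, tracker_url: str,
                   piece_length: int = 256 * 1024) -> dict:
    """Build a flat meta-file dictionary for a single file (illustrative only)."""
    # Concatenate the 20-byte SHA1 digest of every chunk of the file.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": tracker_url,            # tracker URL for bootstrap
        "creation date": int(time.time()),  # epoch time of file creation
        "length": len(data),                # file size in bytes
        "name": name,                       # file name
        "piece length": piece_length,       # chunk size in bytes
        "pieces": pieces,                   # SHA1 hash of each chunk
    }


if __name__ == "__main__":
    meta = build_metainfo(b"x" * 1000, "demo.bin",
                          "http://tracker.example/announce", piece_length=256)
    print(len(meta["pieces"]) // 20, "chunks hashed")
```

A downloader can verify each received chunk by recomputing its SHA1 digest and comparing it with the corresponding 20-byte slice of `pieces`.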
10How BitTorrent Works Downloading
- The downloader
- Download the meta file
- Start a BT client, connect to the tracker site
- Get peer list from tracker
- Get first chunk from other peers (seeds)
- Exchange file chunks with other peers
- Download complete: become a new seed
- Initial seed leaves
Future performance depends on the arrival and departure of new downloaders and seeds
peer list
seed
11Our Methodology of this Study
- Measurement
- BitTorrent traffic pattern
- Meta file downloading and tracker statistics
- Analysis
- BitTorrent user behavior and performance limitations
- Curve fitting, parameter estimation, and validation of mathematical models
- Modeling
- Torrent evolution and inter-torrent relation
- Fluid model, probability model, and graph model
12Meta File Downloading
- The first HTTP packets of .torrent file downloading
- Cable network: 3,000 downloads, 1,000 torrent meta files
- Server farm: 50 tracker sites hosting hundreds of torrents
- Gigascope: fast Internet traffic monitoring tool by AT&T
- What information does it contain?
- Torrent birth time
- Peer arrival time to the torrent (packet capture time of downloading)
- About 10 days
13Torrent Statistics on Trackers
- Professional/dedicated tracker sites
- Each may host thousands of torrents at the same time
- http://www.alluvion.org/ and http://www.crapness.com/, collected by University of Massachusetts, Amherst
- Example: alluvion -- 1,500 torrents, 550 are fully traced
- What information does it contain?
- Torrents: torrent birth time, file size, number of peers/seeds
- Peers: request time, downloading/uploading bytes, downloading/uploading bandwidth
- Sampled every 0.5 hour for 48 days
14Outline
- BitTorrent mechanism and our methodology
- Modeling and characterization of single-torrent system
- The evolution of torrent over time
- Limitations of current BitTorrent systems
- Modeling and characterization of multi-torrent system
- Inter-torrent collaboration
- Conclusion
15Torrent Popularity
[Figure: torrent popularity; x-axis: time after torrent birth (day); y-axis: derivative of CCDF; annotation: 6 in average]
16Torrent Death
Peer n arrives at time tn
When tn → ∞, what will happen?
inter-arrival time > seed service time
torrent dead
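The death condition above can be made concrete with a small Python sketch. It assumes, purely for illustration, that the peer arrival rate decays as λ(t) = λ0·exp(-t/τ) and that seeds stay for a mean service time of 1/γ; the names `lam0`, `tau`, and `gamma` are labels chosen here. Under those assumptions the mean inter-arrival time 1/λ(t) exceeds the seed service time once t > τ·ln(λ0/γ). This is a rough heuristic for when a torrent is at risk of dying, not a lifespan formula taken from the slides.

```python
import math


def arrival_rate(t: float, lam0: float, tau: float) -> float:
    """Assumed peer arrival rate lam0 * exp(-t / tau)."""
    return lam0 * math.exp(-t / tau)


def critical_time(lam0: float, tau: float, gamma: float) -> float:
    """Time after which the mean inter-arrival time 1/lambda(t)
    exceeds the mean seed service time 1/gamma."""
    if lam0 <= gamma:
        # Arrivals are already sparser than seed departures at birth.
        return 0.0
    return tau * math.log(lam0 / gamma)


if __name__ == "__main__":
    # Hypothetical numbers: 100 arrivals/day at birth, tau = 2 days,
    # seeds stay 1 day on average (gamma = 1 per day).
    t_star = critical_time(lam0=100.0, tau=2.0, gamma=1.0)
    print(f"inter-arrival time exceeds seed service time after {t_star:.2f} days")
```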
17Torrent Population and Lifespan
Most torrents are small (avg 102)
Most torrents are short-lived (avg 8 days)
18Downloading Failure Ratio
- Define
- Avg downloading failure ratio
- about 10%
- Different evolution patterns
- Small population ⇒ large Rfail
- Reminder: most torrents have a small population!
- Altruistic peers make torrents long-lived
19Torrent Evolution Fluid Model
- Existing model (SIGCOMM 04)
- Constant arrival rate: λ = const
- Torrent reaches equilibrium
- The correct model
- Exponentially decreasing arrival rate
- Torrent eventually dies
- Verified by our measurements
- Two completely different pictures
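The contrast between the two pictures can be explored numerically. The following Python sketch integrates a simple fluid model with forward Euler steps. The equations are an assumption made for illustration (the slides name the parameters but do not show the equation set here): dx/dt = λ(t) - D and dy/dt = D - γ·y, where D = min(c·x, μ·(η·x + y)) is the download completion rate, x the number of downloaders, and y the number of seeds. The function name and all parameter values are hypothetical.

```python
import math


def simulate_torrent(lam0, tau, mu, c, gamma, eta, t_end,
                     dt=0.01, constant_arrival=False):
    """Forward-Euler integration of an assumed fluid model.

    Returns a list of (t, downloaders, seeds) tuples.
    """
    x, y, t = 0.0, 0.0, 0.0
    history = [(t, x, y)]
    for _ in range(int(round(t_end / dt))):
        # Arrival rate: constant, or exponentially decreasing over time.
        lam = lam0 if constant_arrival else lam0 * math.exp(-t / tau)
        # Completion rate, limited by download or by upload capacity.
        done = min(c * x, mu * (eta * x + y))
        x = max(x + dt * (lam - done), 0.0)
        y = max(y + dt * (done - gamma * y), 0.0)
        t += dt
        history.append((t, x, y))
    return history


if __name__ == "__main__":
    params = dict(lam0=10.0, tau=5.0, mu=1.0, c=10.0, gamma=0.5, eta=0.5, t_end=100.0)
    for label, flag in (("constant arrival", True), ("decreasing arrival", False)):
        _, x, y = simulate_torrent(constant_arrival=flag, **params)[-1]
        print(f"{label}: downloaders={x:.3f}, seeds={y:.3f}")
```

With these hypothetical parameters the constant-arrival run settles at a non-zero equilibrium, while the run with a decreasing arrival rate drains to an empty torrent.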
20Torrent Evolution Modeling Results
- Flash crowd
- Downloader exponentially ↑
- Seed exponentially ↑
- Peak time
- A very short duration
- Constant arrival model: flat peak
- Attenuation: a long tail
- Downloader exponentially ↓
- Seed exponentially ↓
- Constant arrival model is far from reality: no attenuation
- Torrent death
[Figure: number of downloaders and number of seeds over time, each compared with the constant arrival model]
21Performance Stability
Evolution over time
avg download speed (byte/sec)
Only stable when the torrent is large; fluctuates significantly after peak time
Larger torrents have higher and more stable performance
22Service Unfairness
- Unfairness: ↑ download speed, ↓ uploading contribution
- Seeds serve high-speed downloaders first
- Peers not willing to serve after downloading
- Not due to new file downloading: selfish
23Single-torrent Model Summary
- Torrent evolution over time
- Exponentially decreasing arrival rate
- Flash crowd, short peak, long-tailed attenuation
- BitTorrent limitations
- Content availability: torrent death
- Performance stability
- Service fairness
24Outline
- BitTorrent mechanism and our methodology
- Modeling and characterization of single-torrent system
- Modeling and characterization of multi-torrent system
- Traffic pattern and user behavior
- Graph based model of inter-torrent relation
- Inter-torrent collaboration
- Conclusion
25Multi-torrent Environment Dynamics
[Figure: three panels. Torrent birth: CDF of torrents (raw data, linear fit). Request arrival: CDF of requests (raw data, asymptotic fit). Peer birth: CDF of peers (raw data, linear fit). x-axes: torrent birth time, request arrival time, and peer birth time (hour)]
- Considering peers and torrents on the Internet as an open system
- Torrent birth rate, torrent request rate, and peer birth rate are constant
- Implication
- The lifecycle of a BT peer: downloading, seeding, sleeping, ..., dead
26Peer Request Pattern Request Rate
- Peer request rate
- Number of requests by a peer to different torrents per unit time
[Figure: peer request rate; axes: peers (0 to 4000), number of torrents; annotations: "Assume x torrents", "77 years!"]
- Peer request process seems Poisson-like
- Request a new torrent with probability p (participation probability)
- Dead with probability 1-p
27Peer Request Pattern Participation Probability
Probability model
Number of peers that request at least m torrents
p = 0.8551
Another estimation of p
Probability model confirmed
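One way to read the participation probability is as a geometric model: after each download a peer requests another torrent with probability p and leaves for good with probability 1-p. The Python sketch below works out the consequences under that assumption, taking every counted peer to have requested at least one torrent and successive decisions to be independent. The function names are hypothetical, and the estimator shown is a generic one for this geometric reading, not necessarily the estimation used in the study.

```python
def prob_at_least(m: int, p: float) -> float:
    """Fraction of peers expected to request at least m torrents (m >= 1)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return p ** (m - 1)


def expected_torrents(p: float) -> float:
    """Mean number of torrents requested per peer: 1 / (1 - p)."""
    return 1.0 / (1.0 - p)


def estimate_p(torrents_per_peer) -> float:
    """Estimate p from per-peer torrent counts as 1 - peers / total requests."""
    counts = list(torrents_per_peer)
    return 1.0 - len(counts) / sum(counts)


if __name__ == "__main__":
    p = 0.8551  # value reported on the slide
    print(f"share of peers requesting >= 5 torrents: {prob_at_least(5, p):.3f}")
    print(f"mean torrents per peer: {expected_torrents(p):.2f}")
```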
29Inter-torrent Relation Graph How Torrents Can
Help with Each Other?
[Figure: two torrents i and j joined by directed edges. Edge 1: some peers in torrent i have downloaded j. Edge 2: some peers in torrent j have downloaded i]
- Edge weight Wi,j: number of such peers
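The relation graph can be built from per-peer records, as in the following Python sketch. The input layout (`current` mapping each peer to the torrent it is in, `history` mapping each peer to the torrents it has already downloaded) and the function name are assumptions for illustration. The edge direction is also a convention chosen here: the key (i, j) counts peers in torrent i that have downloaded j.

```python
from collections import defaultdict


def relation_graph(current: dict, history: dict) -> dict:
    """Return edge weights {(i, j): number of peers in i that have downloaded j}."""
    weights = defaultdict(int)
    for peer, torrent_i in current.items():
        for torrent_j in history.get(peer, ()):
            if torrent_j != torrent_i:  # ignore self-loops
                weights[(torrent_i, torrent_j)] += 1
    return dict(weights)


if __name__ == "__main__":
    current = {"alice": "T1", "bob": "T1", "carol": "T2"}
    history = {"alice": {"T2"}, "bob": {"T2", "T3"}, "carol": {"T1"}}
    for (i, j), w in sorted(relation_graph(current, history).items()):
        print(f"W[{i},{j}] = {w}")
```

A non-zero weight in both directions between two torrents indicates that peers in each could serve the other.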
30Single-torrent vs. Multi-torrent Model
- Single-torrent model
- ↑ seed service time, ↓ download failure rate
- Limited: seed service time ↑, but inter-arrival time ↑ exponentially
- Small improvement
- Multi-torrent model
- Old peers come back multiple times
- ↑ peer arrival rate, ↓ peer inter-arrival time
- Significant improvement
31Single-torrent vs. Multi-torrent Model
[Figure: single-torrent model vs. multi-torrent model; annotations: seeds stay 10 times longer, torrent death]
Inter-torrent collaboration is much more
effective than stimulating seeds to serve longer
32Outline
- BitTorrent mechanism and our methodology
- Modeling and characterization of single-torrent system
- Modeling and characterization of multi-torrent system
- Inter-torrent collaboration
- Tracker site overlay
- Instant incentive for collaboration
- Conclusion
33Tracker Site Overlay
[Figure: tracker site overlay among torrents A, B, C, D. Neighbor-in: torrents that can serve me. Neighbor-out: torrents that I can serve (peer list)]
- Self-organized P2P network (a logical structure)
- An instance of inter-torrent relation graph
- A built-in mechanism for content search, covers 99% of torrents
- Trackerless BitTorrent uses DHT to store meta file
34Incentive for Inter-Torrent Collaboration
[Figure: nodes A, B, C, D and Tom]
- Instant incentive similar to tit-for-tat principle
- Neighboring cycle detection
- Neighboring cycle construction
- Bandwidth trading: get one chunk, serve multiple peers
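The slide names neighboring cycle detection without giving an algorithm. As a hedged illustration only, the Python sketch below uses a breadth-first search to find the shortest directed cycle through a given torrent in a neighbor-out graph, which is one plausible way to locate a chain of torrents that can serve each other in a loop. The graph layout, the function name, and the choice of BFS are all assumptions, not the authors' design.

```python
from collections import deque


def find_cycle(neighbor_out: dict, start):
    """Return the shortest directed cycle through `start` as a node list
    (first and last entries equal `start`), or None if there is none."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbor_out.get(node, ()):
            if nxt == start:
                # Walk the parent pointers back to rebuild the path.
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                path.reverse()
                return path + [start]
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


if __name__ == "__main__":
    overlay = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}
    print(find_cycle(overlay, "A"))
    print(find_cycle(overlay, "D"))
```

In such a cycle every torrent both serves and is served by a neighbor, which is the property an instant, tit-for-tat-like incentive would rely on.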
35Conclusion
- Extensive analysis and modeling to study the behaviors of BT-like systems
- Tracker trace and .torrent downloading trace
- Mathematical model
- BitTorrent system has its limitations due to exponentially decreasing peer arrival rate
- Service availability, performance stability, and fairness
- Graph based multi-torrent model
- System design for inter-torrent collaboration
36Thank you!
37Backup for Questions
38Torrent Lifespan
- Extract λt and t from trace
- Get λ0 and τ using linear regression
- Lifespan model verified by measurement
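The regression step can be sketched in Python as follows. The sketch assumes the arrival rate has the form λ(t) = λ0·exp(-t/τ), so that ln λ(t) is linear in t with intercept ln λ0 and slope -1/τ; whether the study regressed on exactly this quantity is an assumption. The function name and the synthetic data are hypothetical, and all rate samples must be positive.

```python
import math


def fit_arrival_rate(times, rates):
    """Least-squares fit of ln(rate) = ln(lam0) - t / tau; returns (lam0, tau)."""
    logs = [math.log(r) for r in rates]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


if __name__ == "__main__":
    # Synthetic, noise-free samples generated from lam0 = 50, tau = 4.
    ts = [float(d) for d in range(10)]
    rs = [50.0 * math.exp(-t / 4.0) for t in ts]
    lam0, tau = fit_arrival_rate(ts, rs)
    print(f"lam0 = {lam0:.2f}, tau = {tau:.2f}")
```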
39Torrent Population
Total population
- Model verified by measurement
- Observations
- The population of most torrents is small (102 on average)
- Downloading failure ratio
- Small population ⇒ large Rfail
40Torrent Evolution Fluid Model
Basic equation set
Parameters
x(t) number of downloaders
y(t) number of seeds
λ0 initial peer arrival rate
τ attenuation parameter of λ
μ uploading bandwidth
c downloading bandwidth (c ≫ μ)
γ seed leaving rate
η file sharing efficiency
?1,?2 eigen values of the equation set
a,b,c1, c2,d1,d2 constants
Resolution
41Peer Request Pattern Summary
- Multi-torrent environment: an open model
- Torrent birth rate: 0.9454 per hour (nearly constant)
- Peer birth rate: 19.37 per hour (nearly constant)
- Torrent request rate (for all peers over all torrents): 133.39 per hour (nearly constant)
- Actually increases slowly according to BigChampagne
- Peer request pattern
- Lifecycle: downloading, seeding, sleeping, ..., next request with prob. p
- Peer participation probability: 0.85
- Request rate (for different torrents by a peer): Poisson-like
42Tracker Site Overlay
- Table size
- Node degree distribution
- Similar to unstructured P2P networks
- Many content search and msg routing algorithms
- Flooding
- Random walk
- ...
- Trackerless BitTorrent uses DHT to store meta file
43Simulation Experiments
[Figure: simulation results without inter-collaboration vs. with inter-collaboration]
With inter-collaboration:
- Performance stability (downloading speed): more stable
- Service fairness (contribution ratio): more balanced
- Content availability (downloading failure ratio): Rfail → 0
Inter-torrent collaboration can improve
BitTorrent performance