Title: Peer-to-Peer (P2P) Computing
1. Peer-to-Peer (P2P) Computing
2. Centralized Architectures
- In the previous set of lectures we talked about systems that have massive scale and a centralized architecture
- They have been shown to work well in the area of volunteer computing
- Centralized systems are easy to develop, deploy, and maintain
  - Server-side control, updates, etc.
- Problems with centralized architectures
  - The server can be a performance bottleneck
    - e.g., SETI@home pays a lot of money each year to buy network bandwidth
    - e.g., SETI@home must purchase and maintain decent servers
  - The server can be a central point of failure
    - e.g., if there is a network outage in the SETI@home building, then nothing works for a while
- An alternative: peer-to-peer systems
- Some content here is adapted from material by Michael Welzl, University of Innsbruck, Austria
3. A Peer?
- Peer: "one that is of equal standing with another"
- P2P builds on the capacity of the end-nodes that participate in the system, treating them all as equals
  - end-nodes = the participants' computers
- Made popular by file-sharing applications
- But the same idea is used in other domains
  - P2P ad-hoc networks (sensor networks)
  - Content distribution (BitTorrent)
  - Communication (Skype)
  - Network monitoring
  - etc.
4. P2P Essential Principles
- Self-organizing, no central management
- A peer is autonomous
- Sharing of resources (storage, CPU, content)
  - resources sit at the edges of the network
- Peers are equal (more or less)
- Large numbers of peers
- Churn is the common case
  - Intermittent connectivity; peers come and go
- To be contrasted with the standard client-server architecture
5. P2P Principles
- The big question is how to do something useful, and that works, with a bunch of uncoordinated peers that need to be autonomous
- The pay-offs are multiple
  - No need for any infrastructure, just a piece of code that people hopefully install and run on whatever machine they have
  - Better resilience to attacks
    - No peer is special, so any one of them can go down without compromising anything
  - The system relies on many heterogeneous computers
    - Different OSs and/or OS versions should make it difficult for a virus to take the whole thing down
- Some have a vision of almost everything being P2P
  - No more Web servers, mail servers, etc.
6. Napster
- The term P2P was coined in 1999 by Shawn Fanning, the original Napster developer
- The success of Napster brought the P2P idea to everybody's attention and made it very popular
- A large fraction of all network traffic today is due to P2P applications
  - A dropping fraction, due to increasing video streaming
  - In 2007, YouTube traffic overtook BitTorrent's
- Ironically, Napster wasn't fully P2P
  - Files were downloaded directly from participants' computers, without a central data repository
  - But there was a centralized server that held the catalog of files (which computer stores what right now)
  - which was really the way in which Napster was brought down, from a legal point of view
7. P2P Trend?
- P2P has become very popular, and there is a bit of a "centralized systems are not cool" feeling around
- However, it's clear that centralized can work (look at Google)
- So when should one decide to build a P2P system?
- Things to consider
  - Budget
  - Resource relevance
  - Trust
  - Rate of system change
  - Criticality
8. P2P or not P2P?
- Budget
  - If you have enough money, build a centralized system
    - Again, look at Google
  - Note that centralized doesn't mean that there aren't multiple servers
    - It's just not about peers, but about clients and servers
  - P2P is useful when the budget isn't unlimited
- Resource relevance
  - If many users care about the resources, then P2P is viable
  - Otherwise it won't work, as there will never be enough of a core number of active systems
- Trust
  - It's difficult to build a P2P system with many untrusted participants (an active research problem)
9. P2P or not P2P?
- Rate of system change
  - peers joining/leaving, content being updated
  - Tolerating high change rates is a difficult research challenge for P2P systems
- Criticality
  - If you can't live without the service provided by the system, P2P is a bit iffy
10. Structured vs. Unstructured
- P2P systems are typically classified into two kinds
  - In unstructured systems, content may be stored on any peer
  - In structured systems, content has to be stored by specific peers
- Let's first look at a few important unstructured systems and discuss their strengths and weaknesses
11. Napster
- Napster was the first widely popular P2P system
  - Don't mistake the new Napster store for the old Napster P2P system
- Only sharing of MP3 files was possible
- How it worked
  - A user registers with a central index server
    - Gives the list of files to be shared
  - The central server knows all the peers and files in the network
  - Searching is based on keywords
  - Search results were a list of files, with information about the file and the peer sharing it
    - e.g., encoding rate, size of file, peer's bandwidth
    - some information is entered by the user, hence unreliable
12. Napster
13. Napster
- Strengths
  - Consistent view of the network
  - Some answers guaranteed to be correct (e.g., "nothing found")
  - Fast and efficient searches
- Weaknesses
  - The usual problems with a centralized server
    - Money can be thrown at it (e.g., Google)
    - The central server is susceptible to attacks
      - viruses and legal attacks
  - Results unreliable
    - True of all P2P systems to some degree
14. Gnutella
- Gnutella came soon after Napster
- Originally developed by AOL, but the code got out on the net by mistake; by the time it was pulled, it was too late and the code was out there
- Fully decentralized
  - No index server
- Had an open protocol
  - Which was great for research
- It was never a huge network
  - Because it was quickly surpassed by better systems
- No longer in use
15. Gnutella
- There are only peers
- Peers are connected in an overlay network
- To join the network, a new peer only needs to
know of one existing peer that is currently a
member - Done via some out-of-band mechanism, like a Web
site - Once a peer joins the network, it learns about
other peers and about the topology of the overlay
network - Queries are flooded over the network
- Downloads happen directly between peers
16. Gnutella
- Queries are sent to neighbors
- Neighbors forward queries to their neighbors, and so on
  - Until some threshold is reached (a time-to-live, or TTL)
- If a reply is found, it is routed back to the query originator following the path in reverse (see the sketch below)
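As a rough illustration, here is a minimal Python sketch of this flooding scheme. The `Peer` class, its fields, and the exact-name match standing in for keyword search are all illustrative assumptions, not the real Gnutella protocol:

```python
# Sketch of Gnutella-style query flooding over an overlay network.
class Peer:
    def __init__(self, name, files):
        self.name = name
        self.files = set(files)   # content shared by this peer
        self.neighbors = []       # overlay links to other peers
        self.seen = set()         # query IDs already processed (cycle guard)

    def query(self, query_id, keyword, ttl, path):
        if query_id in self.seen:
            return                # already saw this query via another path
        self.seen.add(query_id)
        if keyword in self.files:
            # a hit: the reply is routed back along the recorded path
            print(f"{self.name} has {keyword!r}; reply routed back via {path}")
        if ttl > 0:               # forward to all neighbors until the TTL expires
            for n in self.neighbors:
                n.query(query_id, keyword, ttl - 1, path + [self.name])

# a, b, c = Peer("A", []), Peer("B", ["song.mp3"]), Peer("C", [])
# a.neighbors, c.neighbors = [c], [b]
# a.query(query_id=1, keyword="song.mp3", ttl=3, path=[])
```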
17. Gnutella
- Strengths
  - Fully distributed, no central point of failure
  - Open protocol (easy to write clients)
  - Very robust against random node failures
- Weaknesses
  - Flooding is very inefficient, and quite often fails to find what's being looked for
  - The question "how do we pick the best query radius?" is pretty much impossible to answer
18. KaZaA
- KaZaA proposed a very different architecture, one that has influenced most file-sharing systems after it
- On a typical day KaZaA has over 3 million active users and over 500 terabytes of content
- Based on a super-node architecture
  - Some peers are "better" and thus special
  - Introducing some hierarchy in the system helps
19. KaZaA
- Each super-node (SN) keeps track of a subset of the peers
- A new peer registers with one SN only
20. KaZaA Search
- The KaZaA query
  - A peer sends a query to its SN
  - The SN answers for all its peers and then forwards the query to other SNs via flooding
    - Note that the SNs are not fully connected in the peer-to-peer network of SNs
  - The other SNs reply
- Finding SuperNodes?
  - A normal peer can be promoted if it demonstrates that it has enough resources
  - A user can always refuse to become a SN
  - About 30,000 SNs at a given time
21. KaZaA
- Strengths
  - Combines the strengths of Napster and Gnutella
- Weaknesses
  - Queries are still not comprehensive due to limited flooding
    - But a much better reach than Gnutella
  - The lawsuit against KaZaA was eventually successful
    - the software comes with a list of well-known supernodes
22. Content Distribution
- BitTorrent provided a new approach for file sharing
- Widely used for fully legal content
  - Linux distributions, software patches, etc.
- Has had its share of litigation
- Goal: quickly replicate a file to a large number of clients
- A new overlay network is built for every file that's being distributed
- You have to know the file reference, or "torrent"
  - it contains metadata on the content
  - You can send a torrent to people, or publish it
- There is no real searching in BitTorrent itself
  - Although out-of-band catalogs exist, of course
23. BitTorrent
- For each new BitTorrent file, one server hosts the original copy
- The file is broken into chunks
- There is also a torrent file, which is typically kept on some web server(s)
- Clients download the torrent file
  - whose metadata identifies a tracker
- The tracker is a server that keeps track of the currently active clients for a file
  - The tracker does not participate in the download and never holds any data
  - Note that lawsuits have been successful against people running trackers!
24. BitTorrent
25. BitTorrent
- Terminology
  - Seed: client with a complete copy of the file
  - Leecher: client still downloading the file
- A client contacts the tracker and gets a list of other clients
  - Gets a list of 50 peers
  - The client maintains connections to 20-40 peers
    - Contacts the tracker if the number of connections drops below 20
  - This set of peers is called the peer set
- The client downloads chunks from peers in its peer set and provides them with its own chunks
  - Chunks are typically 256 KB
  - Chunks make it possible to use parallel downloads
26. BitTorrent Tit-for-Tat
- A peer serves the peers that serve it
  - Encourages cooperation, discourages free-riding
- Peers use a "rarest first" policy for chunk downloads (sketched after this list)
  - Having a rare chunk makes a peer attractive to others
  - Others want to download it; the peer can then download the chunks it wants
  - The goal of chunk selection is to maximize the entropy of each chunk
  - For the first chunk, just pick something at random, so that the peer has something to share
- Endgame mode
  - Send requests for the last sub-chunks to all known peers
  - So the end of the download is not stalled by slow peers
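Below is a small sketch of rarest-first selection, assuming the client knows which chunks each peer in its peer set holds (the function name and data layout are hypothetical):

```python
from collections import Counter

def pick_next_chunk(my_chunks, peer_chunk_sets):
    """Rarest-first chunk selection (sketch): among the chunks we still
    need, return the one held by the fewest peers in our peer set."""
    availability = Counter()
    for chunks in peer_chunk_sets:          # one set of chunk IDs per peer
        availability.update(chunks)
    candidates = [(count, chunk) for chunk, count in availability.items()
                  if chunk not in my_chunks]
    return min(candidates)[1] if candidates else None

# pick_next_chunk({0, 1}, [{0, 1, 2}, {1, 2, 3}, {2, 3}]) -> 3
# (chunk 3 is held by 2 peers, chunk 2 by 3, so 3 is the rarer one we need)
```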
27. BitTorrent Choke/Unchoke
- A peer serves e.g. 4 (the default value) peers in its peer set simultaneously
  - Seeks the best (fastest) downloaders if it's a seed
  - Seeks the best uploaders if it's a leecher
- A "choke" is a temporary refusal to upload to a peer
- A leecher serves its 4 best uploaders and chokes all others
  - Every 10 seconds, it evaluates the transfer speeds
  - If there is a better peer, it chokes the worst of the current 4
- Every 30 seconds the peer makes an "optimistic unchoke"
  - Randomly unchoke a peer from the peer set
  - Idea: maybe it offers better service
- Seeds behave exactly the same way, except they look at download speed instead of upload speed (see the sketch below)
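A sketch of the two timers above; the peer names and the `rate` bookkeeping are assumptions made for illustration:

```python
import random

def regular_unchoke(peer_set, rate, slots=4):
    """Every 10 s (sketch): keep the `slots` peers with the best observed
    transfer rate unchoked (upload rate to us for a leecher, download
    rate from us for a seed); everyone else stays choked."""
    return set(sorted(peer_set, key=lambda p: rate[p], reverse=True)[:slots])

def optimistic_unchoke(peer_set, unchoked):
    """Every 30 s: unchoke one randomly chosen choked peer, in case it
    turns out to offer better service than the current best four."""
    choked = [p for p in peer_set if p not in unchoked]
    return random.choice(choked) if choked else None

# unchoked = regular_unchoke({"p1", "p2", "p3", "p4", "p5"},
#                            rate={"p1": 50, "p2": 10, "p3": 40, "p4": 30, "p5": 20})
# extra = optimistic_unchoke({"p1", "p2", "p3", "p4", "p5"}, unchoked)
```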
28. Searching vs. Addressing
- In the peer-to-peer networks we've discussed so far, one searches for content
  - The content could be on any peer, so we need to look for it somehow, e.g., using keywords
  - When the system answers "didn't find it", that doesn't mean the content isn't there
- This is not at all the way in which a storage system works
  - e.g., a file system on your machine
- Storage systems work based on an addressing scheme
  - Content (e.g., a file) is known by a unique name
  - There is a way to know (not find) where that unique name is stored
  - Searching by keyword can be implemented, but as a separate feature (e.g., Spotlight on Mac OS X)
- Such a storage system is typically more efficient
  - But perhaps less user friendly
- Some P2P systems attempt to implement content addressing rather than content searching
29. Structured vs. Unstructured
- Unstructured networks/systems
  - Based on searching
  - Unstructured does NOT mean a complete lack of structure
    - The network has a graph structure
    - But peers are free to join anywhere and objects can be stored anywhere
- Structured networks/systems
  - Based on addressing
  - The network structure determines where peers belong in the network and where objects are stored
  - Should be efficient for locating objects
- Big question: how do we build structured networks?
30. Addressing in a Network
- To enable addressing, we must have a scheme to figure out on which peer a particular file is stored
- This is typically done via some hashing
  - Hash the file name (e.g., a fully qualified path) using some hash function to create a unique fileID
- Using a good hash function is crucial
  - A large hash, so that there are no collisions
  - A hash that balances the load across the hash space
- A useful abstraction (i.e., abstract data type) to implement addressing is a Distributed Hash Table (DHT)
  - put(key, value): stores something in the network
    - e.g., key = hash of the file name
    - e.g., value = the file content
  - lookup(key): locates something in the network
    - returns the value (see the sketch below)
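As a sketch of how an application might use this abstraction, with SHA-1 standing in for the "good hash function" and a hypothetical `dht` client object:

```python
import hashlib

def key_for(name: str) -> int:
    """Map a file name to a key in the DHT's ID space: a 160-bit SHA-1
    digest, large enough that collisions are essentially impossible."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

# Hypothetical usage against some DHT client object `dht`:
# dht.put(key_for("/music/song.mp3"), file_bytes)      # store
# file_bytes = dht.lookup(key_for("/music/song.mp3"))  # locate and fetch
```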
31. HT and DHT
(Diagram: an ordinary hash table (HT) with buckets 0 through 9, and the same buckets partitioned across peers A, B, and C to form a DHT.)
32. Using the Abstraction
(Diagram: a Distributed Application sits on top of a DHT implementation, calling put(key, value) and lookup(key); lookup returns the value.)
33. Implementing a DHT
- Question: which network structure do we use to support the DHT abstraction?
  - How do we identify peers?
  - Which other peers does one peer know about?
  - How do we route queries?
  - Which peer stores what?
34. Network Topologies
- Network topologies were a hot topic in the area of supercomputers
- Goal: organize the nodes of a supercomputer as the vertices of a graph, such that
  - The graph scales well
    - i.e., not too many links per node
    - links cost a lot of money when they are physical
  - The graph has good performance
    - i.e., its diameter is small
    - diameter = max number of hops between two nodes
- Let's see a few examples
35. Fully Connected Graph
- Diameter: 1
- Number of connections per node: N - 1
- Great performance, poor scalability
36. Ring
- Diameter: N/2
- Number of connections per node: 2
- Poor performance, great scalability
37. Torus/Grid
- Diameter: on the order of sqrt(N), for a sqrt(N) x sqrt(N) torus
- Number of connections per node: 4
- Better performance than a ring
- Poorer scalability than a ring
38. Hypercube
- Diameter: log N
- Number of connections per node: log N
- Considered a good compromise by many (and used to build real machines)
- Defined by its dimension d (N = 2^d)
39. Hypercube Routing
- Each node is identified by a d-bit name
- Routing from xxxx to yyyy: just keep forwarding to a neighbor whose name is at a smaller Hamming distance from the destination! (see the sketch below)
- We will see this idea again
(Diagram: a 4-dimensional hypercube with 16 nodes labeled 0000 through 1111.)
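A minimal sketch of this greedy routing rule; flipping any differing bit works, and this version always flips the lowest one:

```python
def next_hop(current: int, dest: int) -> int:
    """Forward to the neighbor that differs from `current` in one of the
    bits where `current` and `dest` disagree (here: the lowest such bit),
    reducing the Hamming distance by one per hop."""
    differing = current ^ dest
    return current ^ (differing & -differing)   # flip the lowest differing bit

def route(src: int, dest: int) -> list:
    """Full path from src to dest; its length equals the Hamming distance
    between the two names, hence at most d = log N hops."""
    path = [src]
    while path[-1] != dest:
        path.append(next_hop(path[-1], dest))
    return path

# route(0b0000, 0b1011) -> [0b0000, 0b0001, 0b0011, 0b1011], i.e., 3 hops
```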
40. Overlay network topologies
- Here we're building a P2P system, not a supercomputer, so maintaining 10 network connections to 10 neighbor peers doesn't require 10 network cards/links!
- Still, we can't go fully connected, due to the size of the routing tables
  - Let's say we want a P2P network with 10^7 peers (10 million)
  - Each peer must maintain a routing table that lists the peers, along with some information on them
    - at a minimum: IP address, port number, peerID
  - This could represent quite a bit of memory
  - Going through the routing table to fix/repair it due to churn would take far too much time (and most of its content would be erroneous)
  - Therefore, it doesn't scale well
- How about a hypercube?
  - diameter = log(N) and number of connections = log(N) is g-r-e-a-t
  - the easy routing is g-r-e-a-t
  - The problem is that it's easily broken by churn, and it's difficult to accommodate new nodes (the number of nodes must be a power of 2)
  - Could work, with many tweaks
- Question: what's a good structure that has some of the nice properties of a hypercube and is robust to churn?
41. Chord
- Let's look at Chord, a famous DHT project
  - Developed at MIT in 2001
  - Fairly simple to understand (unlike other DHTs)
- File names and node names are hashed to the same space, i.e., numbers between 0 and 2^m - 1
  - where m is large enough, and the hash function good enough, that collisions happen only with infinitesimal probability
- Each file has a unique fileID
  - e.g., the hash of its name
- Each peer has a unique peerID
  - e.g., the hash of its IP address
- Important: there is no difference between a fileID and a peerID
  - They're just numbers that can be sorted and compared easily
42. The Chord Ring
- Peers are organized as a sorted ring
  - Peers sit along the ring in increasing order of peerID
  - Remember, peerIDs are just numbers
  - Called a Chord ring
- Each peer knows its successor and predecessor in the ring
- For now let's assume no churn whatsoever
  - No peer arrives, no peer departs
- Main Chord idea: a peer stores the keys that are immediately lower than (or equal to) its peerID
- Let's look at an example Chord ring and see which peer stores what
43. A Chord Ring
(Diagram: a Chord ring with 10 peers: P1, P8, P14, P21, P32, P38, P42, P48, P51, and P56. For example, P56 stores keys in (51, 56], P14 stores keys in (8, 14], P32 stores keys in (21, 32], and P38 stores keys in (32, 38].)
A peer stores the (key, value) pairs whose keys are at most the peer's peerID and strictly higher than the peerID of the peer's predecessor
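A small sketch of this storage rule on the example ring above, with m = 6 (IDs 0 to 63); the helper name is an illustrative choice:

```python
import bisect

M = 6                                            # ID space is 0 .. 2^6 - 1
peers = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # sorted peerIDs from the example

def responsible_peer(key: int) -> int:
    """The first peerID at or after `key`, going clockwise: that peer
    stores all keys between its predecessor's ID (exclusive) and its
    own ID (inclusive)."""
    i = bisect.bisect_left(peers, key % (2 ** M))
    return peers[i % len(peers)]                 # wrap around past the last peer

# responsible_peer(49) -> 51   (matching the lookup on the next slide)
# responsible_peer(57) -> 1    (wraps around the end of the ring)
```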
44. Put() and Lookup()
The principle is the same for put() and lookup()
(Diagram: the same Chord ring; a query "find key 49" is issued.)
45. Put() and Lookup()
(Diagram: the same ring, with the "find key 49" query traveling along successor links.)
Go around the ring, following the successor links. Stop at the first peerID that is larger than 49 (peer 51 here). If key 49 is stored in the network, peer 51 has it.
46. Scalability and Performance
- The Chord ring as we have shown it is very scalable
  - Each peer only needs to know about two other peers
  - Very small routing table!
- The problem is that the performance is very poor
  - The worst-case complexity for a lookup is O(N) hops, where N is the number of peers
  - Since N can be on the order of millions, clearly that's not even remotely acceptable
  - Each hop will take hundreds of milliseconds
- Question: how can we make the number of hops O(log N)?
- Answer: by adding more edges to the network
47. Chord Fingers
- Each peer maintains a "finger table" that contains m entries
  - There are 2^m potential peers in the system
  - So the finger table has at most O(log N) distinct entries
- The i-th entry in the finger table of peer A stores the peerID of the first peer B that succeeds A by at least 2^(i-1) on the Chord ring
  - B = successor(A + 2^(i-1))
  - B is called the i-th finger of peer A
- Let's see an example
48. Chord Fingers
(Diagram: the example Chord ring with the finger pointers of one peer drawn in.)
finger_i = first peer that succeeds (p + 2^(i-1)) mod 2^m
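Continuing the sketch from slide 43 (reusing `peers`, `M`, and `responsible_peer`), a peer's finger table can be computed as follows:

```python
def finger_table(p: int) -> list:
    """finger[i] = successor((p + 2^(i-1)) mod 2^m) for i = 1 .. m;
    responsible_peer() plays the role of successor() here."""
    return [responsible_peer((p + 2 ** (i - 1)) % (2 ** M))
            for i in range(1, M + 1)]

# finger_table(8) -> [14, 14, 14, 21, 32, 42] on the example ring:
# targets 9, 10, 12 fall to P14; 16 to P21; 24 to P32; 40 to P42.
```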
49. Using Chord Fingers
(Diagram: a lookup for key 54 on the example ring; the query jumps along finger pointers, covering at least half the remaining distance at each hop.)
- With the finger table, a peer can forward a query at least halfway to its destination in one hop
- One can easily prove that the worst-case number of hops is O(log N) (see the sketch below)
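Continuing the same sketch, here is one way to express the finger-based lookup; `between` handles intervals that wrap around the ring:

```python
def between(x: int, a: int, b: int) -> bool:
    """True if x lies strictly inside the clockwise ring interval (a, b)."""
    a, b, x = a % (2 ** M), b % (2 ** M), x % (2 ** M)
    return (a < x < b) if a < b else (x > a or x < b)

def find_successor(p: int, key: int, hops: int = 0):
    """Route a lookup for `key` starting at peer `p`: jump to the finger
    closest to (but before) the key, which covers at least half the
    remaining distance, giving O(log N) hops. Sketch only."""
    succ = responsible_peer((p + 1) % (2 ** M))       # p's immediate successor
    if between(key, p, succ) or key % (2 ** M) == succ:
        return succ, hops                             # succ owns the key
    for f in reversed(finger_table(p)):
        if between(f, p, key):
            return find_successor(f, key, hops + 1)
    return find_successor(succ, key, hops + 1)

# find_successor(8, 54) -> (56, 2): the query hops P8 -> P42 -> P51,
# and P51 sees that its successor P56 owns key 54.
```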
50. Peers joining and leaving
- We now have the nice hypercube properties
  - routing table: O(log N)
  - number of hops: O(log N)
- Question: what happens when a peer joins/leaves the system?
  - gracefully, not due to crashes
- Leaving is straightforward
  - give the (key, value) pairs to the successor
- Joining is a bit more complicated, but still simple (sketched below)
  - insert oneself in the ring
  - take over part of the key space of one's successor
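Continuing the running sketch (reusing `peers`, `responsible_peer`, and the `bisect` import), a graceful join might look like this; pointer and finger updates are only hinted at in comments:

```python
def join(new_id: int) -> tuple:
    """Graceful join (sketch): locate the successor by looking up new_id
    as if it were a key, splice the new peer into the sorted ring, and
    report the key range it takes over from that successor."""
    succ = responsible_peer(new_id)
    pred = peers[(peers.index(succ) - 1) % len(peers)]
    bisect.insort(peers, new_id)
    # In a real system: update successor/predecessor pointers, recompute
    # fingers (O(log N) messages), then transfer the (key, value) pairs.
    return pred, new_id          # new peer now stores keys in (pred, new_id]

# join(40) -> (38, 40): P40 takes over the keys in (38, 40] from P42
```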
51. Peer Joining
(Diagram: a new peer announces "I am a new peer and my peerID is 40" to the example ring.)
52. Peer Joining
(Diagram: the new peer P40 is spliced into the ring between P38 and P42.)
- Do a lookup for key 40 (pretending it's a fileID), identifying along the way the first node with ID < 40 and the first node with ID > 40
- Then insert the new peer (in this case between P38 and P42)
  - Requires a few successor and predecessor pointer updates
  - Requires computing/updating fingers all over the place (O(log N) messages)
- Then take over the (key, value) pairs in the range (38, 40] from P42
53. What about crashes?
- Crashes are difficult to handle
  - Yet they happen all the time
- Chord uses a stabilization protocol
  - Each node periodically engages in some communication that repairs successor and predecessor pointers and finger tables (a rough sketch follows)
  - It uses a simple mechanism: each peer stores pointers to log(N) successors, rather than just one
  - Therefore it's possible to detect missing nodes, and to repair all connections
- There are many theoretical and practical results that show that this works well in practice
  - e.g., the lookup failure rate is proportional to the peer departure rate
- In fact, even graceful peer departures are treated as crashes, because the stabilization protocol works so well
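A rough sketch of one stabilization round, assuming peer objects with `id`, `successor_list`, and `predecessor` fields plus an `is_alive` reachability test (all hypothetical; the real protocol is specified in the Chord paper), and reusing `between` from the lookup sketch:

```python
def stabilize(peer):
    """One periodic repair round (sketch): find the first live entry in
    the successor list, ask it for its predecessor, and adopt that
    predecessor as successor if it slipped in between us and it."""
    for s in peer.successor_list:          # log(N) successors survive crashes
        if is_alive(s):                    # hypothetical reachability check
            x = s.predecessor
            if x is not None and is_alive(x) and between(x.id, peer.id, s.id):
                s = x                      # a closer peer joined in between
            peer.successor = s
            s.notify(peer)                 # s may adopt `peer` as its predecessor
            return
```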
54. Lookup failures?
- Lookup failures will happen when nodes crash
  - The data they stored is no longer there!
- One solution: use replication at a higher level
  - e.g., use individual Chord rings so that 10 copies of each value are stored, each under a different (key, value) pair
  - When a lookup fails for one of the keys, try another one
  - Then restore the copy that disappeared
- Chord has been used as the basis for several projects
  - shared storage
  - digital libraries
  - Downloadable at http://pdos.csail.mit.edu/chord/
55. Conclusion
- P2P systems have been successfully used in several domains
- Two classes
  - unstructured: successful file-sharing and content-distribution systems
    - based on searching
  - structured: more on the research side, but much more powerful
    - based on addressing within DHTs
- Although it's difficult to forecast, the future of P2P systems should be pretty cool