Title: Chapter 1: Introduction
1Peer-to-Peer Networks
Prashant Dewan
Thursday, 16 September 2004
2Introduction
- Client-Server Networks
- Peer-to-Peer Networks
- Hybrid Networks
3Client-Server
4Client-Server Networks
- Most common architecture
- Centralised model for data storage, security, running applications and network administration
- Examples: mail servers, web servers, news servers
5Client-Server Networks (2)
- Based on a scalable model
- Uses network servers
- Servers provide services such as printing, email, etc.
- Allow a high level of security to be implemented
- Can be centrally managed
6Client-Server Pros and Cons
- Advantages
- Networked web of computers
- Inexpensive but powerful array of processors
- Open systems
- Grows easily
- Individual client operating systems
- Disadvantages
- Maintenance nightmares
- Support tools lacking
- Retraining required
7Peer-to-Peer Networks
[Figure: stations connected through a hub. No server; all stations have equal priority (democratic)]
8Peer-to-Peer Networks (2)
- Resources shared in a decentralised manner
- Shared resources include files and printers
- Should be used where there are fewer than ten nodes
- Files are not stored centrally, e.g. Kazaa
- Allows easy node-to-node communication
9Peer-to-Peer Networks (3)
- Support is usually part of OS
- Sharing of files is the responsibility of each participant
- Participants form a Workgroup (Windows)
- Each Workgroup is assigned a name, which is important where there are multiple workgroups (e.g. departments)
10Advantages
- Easy to configure
- No requirement for server hardware/software
- Users can manage their own resources
- No need for a network administrator
- Reduces total cost
11Disadvantages
- Provide a limited number of connections
- May slow performance of nodes
- Do not allow central management
- Do not have a central store of files
- Users responsible for managing own resources
- Offer very poor security
12P2P Issues
- Organize, maintain overlay network
- node arrivals
- node failures
- Resource allocation/load balancing
- Resource location
- Network proximity routing
13System Architecture
- How to organize participating nodes?
- Pure P2P
- Simple unstructured approach, e.g., Gnutella
- Structured DHTs, e.g., Pastry, Tapestry
- Hybrid P2P
- Hybrid unstructured approach, e.g., Napster
- Structured DHTs with some centralization
14Routing Strategies
- Given a query, how to route it to its destination?
- Broadcast: each query is broadcast to all of a node's neighbors (e.g. Gnutella)
- Random walk: the query is routed to one randomly chosen neighbor (directed random walk)
- DHT: the query is routed by looking up a routing table (Pastry, Tapestry, Chord)
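The random-walk strategy can be sketched in a few lines. This is a minimal illustration, not any particular system's protocol: the function name `random_walk_query`, the dictionary-based overlay `graph`, and the `holders` set are all assumptions made for the example.

```python
import random

def random_walk_query(graph, start, holders, max_hops, rng=None):
    """Forward a query to one randomly chosen neighbor per hop.

    graph   : dict mapping node -> list of neighbor nodes (hypothetical overlay)
    holders : set of nodes that can answer the query
    Returns the path walked if a holder is reached within max_hops, else None.
    """
    rng = rng or random.Random()
    path = [start]
    node = start
    for _ in range(max_hops):
        if node in holders:
            return path
        neighbors = graph.get(node, [])
        if not neighbors:
            return None  # dead end: nobody to forward the query to
        node = rng.choice(neighbors)  # exactly one copy of the query travels on
        path.append(node)
    return path if node in holders else None

# Usage on a tiny chain a -> b -> c, where only c holds the item.
chain = {"a": ["b"], "b": ["c"], "c": []}
print(random_walk_query(chain, "a", {"c"}, max_hops=2))
```

Only one message is in flight per hop, which is the trade-off against broadcast: far less traffic, but no guarantee of reaching a holder within the hop limit.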
15Typical Systems--Napster
[Figure: central index]
16Typical Systems--Napster
- Central index server
- Hybrid approach: C/S + P2P
- Pros
- Simple yet efficient approach
- Bounded query lookup time
- Cons
- Scalability
- Reliability
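The central-index idea can be illustrated with a toy in-memory index. This is a sketch under stated assumptions, not Napster's actual protocol: the class `CentralIndex` and its methods `register`, `unregister` and `lookup` are hypothetical names chosen for the example.

```python
from collections import defaultdict

class CentralIndex:
    """Hypothetical central index: maps file names to the peers sharing them."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def register(self, peer, files):
        # A peer announces the files it is willing to share.
        for name in files:
            self._peers_by_file[name].add(peer)

    def unregister(self, peer):
        # Called when a peer leaves; its entries disappear from the index.
        for peers in self._peers_by_file.values():
            peers.discard(peer)

    def lookup(self, name):
        # One dictionary lookup, however many peers are registered.
        return sorted(self._peers_by_file.get(name, ()))

index = CentralIndex()
index.register("peer1", ["a.mp3", "b.mp3"])
index.register("peer2", ["b.mp3"])
print(index.lookup("b.mp3"))  # the download itself would then go peer to peer
```

The lookup cost is bounded because every query goes to one place, and for the same reason the index is both the scalability bottleneck and the single point of failure listed under Cons.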
17Typical Systems--Gnutella
18Typical Systems--Gnutella
- No central server, purely decentralized P2P
- Constrained broadcast mechanism (TTL)
- Pros
- Simple yet efficient
- Fault-tolerant
- Cons
- High bandwidth consumption
- Scalability
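A TTL-constrained broadcast can be sketched as a breadth-first flood that counts the messages it sends. This is a simplified illustration rather than the Gnutella wire protocol: the function `flood_query`, the dictionary-based `graph` and the `holders` set are assumptions, and for brevity a node also sends a copy back to the neighbor it heard the query from.

```python
from collections import deque

def flood_query(graph, start, holders, ttl):
    """Flood a query outwards, decrementing the TTL at every hop.

    Returns (hits, messages): the holders reached and the copies sent.
    """
    seen = {start}
    queue = deque([(start, ttl)])
    hits = set()
    messages = 0
    while queue:
        node, remaining = queue.popleft()
        if node in holders:
            hits.add(node)
        if remaining == 0:
            continue  # TTL exhausted: the query is not forwarded further
        for neighbor in graph.get(node, []):
            messages += 1  # every forwarded copy costs bandwidth
            if neighbor not in seen:  # duplicate copies are dropped on arrival
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, messages

# A line of four peers, a - b - c - d, where only d holds the item.
line = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(flood_query(line, "a", {"d"}, ttl=2))  # d is 3 hops away: no hit
print(flood_query(line, "a", {"d"}, ttl=3))
```

Raising the TTL widens the search but multiplies the number of copies in flight, which is the bandwidth and scalability cost noted above.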
19Typical systems--DHTs
[Figure: a distributed application calls put(key, data) and get(key) on the distributed hash table; get returns the data]
- DHT provides the information lookup service for P2P applications.
- Nodes uniformly distributed across key space
- Nodes form an overlay network
- Nodes maintain list of neighbors in routing table
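The put/get interface can be illustrated with a toy, single-process hash table spread over several "nodes". This is only a sketch: `ToyDHT`, `to_id` and the 32-bit id space are assumptions made for brevity, and the sketch picks the owner by scanning all nodes instead of routing through an overlay as a real DHT would.

```python
import hashlib

ID_BITS = 32  # toy id space (assumption); real systems use much larger ids

def to_id(text):
    """Hash a name into the shared id space used by both nodes and keys."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:ID_BITS // 8], "big")

class ToyDHT:
    def __init__(self, node_names):
        # Hashing node names spreads the nodes roughly uniformly over the key space.
        self.stores = {to_id(name): {} for name in node_names}

    def owner(self, key):
        # The node whose id is numerically closest to the key's id stores the key.
        key_id = to_id(key)
        return min(self.stores, key=lambda node_id: (abs(node_id - key_id), node_id))

    def put(self, key, data):
        self.stores[self.owner(key)][key] = data

    def get(self, key):
        return self.stores[self.owner(key)].get(key)

dht = ToyDHT(["node1", "node2", "node3", "node4"])
dht.put("song.mp3", "held by peer A")
print(dht.get("song.mp3"))
```

Because every participant applies the same hash and the same closeness rule, any node can work out where a key lives without consulting a central index.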
20Typical Systems--Pastry
- 128-bit node identifier
- Pastry routes a message to the node with the nodeId numerically closest to the key value, like a distributed hash table.
- Routing is based on id prefixes
- Nodes forward messages to a node closer to keyID
- Closer in terms of length of common prefix
- If the current nodeID shares the first n digits with keyID
- Forward to a node that shares more than n prefix digits with keyID
- If no node with a longer common prefix is in the routing table
- Then forward to a node that is numerically closer to keyID.
- N nodes, at most O(log_{2^b} N) messages along the routing path
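To get a feel for the bound, read it as a logarithm to base 2^b, where each id digit carries b bits. The helper below is a hypothetical illustration: `hop_bound` is not a Pastry function, and the default b = 4 (hexadecimal digits) is an assumption, since the slides do not state a value.

```python
def hop_bound(n_nodes, b=4):
    """Smallest h with (2**b)**h >= n_nodes, i.e. ceil(log_{2^b} N).

    Integer arithmetic is used to avoid floating-point rounding in the logarithm.
    """
    base = 2 ** b
    hops, reach = 0, 1
    while reach < n_nodes:
        reach *= base  # each hop fixes one more digit of the key
        hops += 1
    return hops

print(hop_bound(1_000_000))       # base 16: 16**5 = 1,048,576 >= 10**6, so 5
print(hop_bound(1_000_000, b=2))  # base 4:  4**10 = 1,048,576 >= 10**6, so 10
```

A larger b means fewer hops, at the price of more columns in each routing table row.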
21Pastry
- One primitive
- route(M, X): route message M to the live node with nodeId closest to key X
- nodeIds and keys are from a large, sparse id space
22Distributed Hash Tables (DHT)
[Figure: nodes in a P2P overlay network storing key-value pairs (k1,v1) through (k6,v6). Operations: insert(k,v), lookup(k)]
- p2p overlay maps keys to nodes
- completely decentralized and self-organizing
- robust, scalable
23Pastry Leaf sets
- Each node maintains IP addresses of the nodes with the L/2 numerically closest larger and smaller nodeIds, respectively.
- routing efficiency/robustness
- fault detection (keep-alive)
- application-specific local coordination
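A leaf set can be computed from a sorted list of nodeIds as sketched below. This is an illustration under assumptions: `leaf_set` is a hypothetical helper, ids are plain integers, and the id space is treated as a ring, so the lowest id's "smaller" neighbors wrap around to the highest ids.

```python
def leaf_set(node_ids, own_id, size):
    """Return (smaller, larger): the size/2 nearest nodeIds on each side of own_id."""
    ring = sorted(node_ids)
    i = ring.index(own_id)
    # Never take more than the other nodes can supply without overlap.
    half = min(size // 2, (len(ring) - 1) // 2)
    smaller = [ring[(i - k) % len(ring)] for k in range(1, half + 1)]
    larger = [ring[(i + k) % len(ring)] for k in range(1, half + 1)]
    return smaller, larger

ids = [10, 20, 30, 40, 50, 60]
print(leaf_set(ids, 30, size=4))  # ([20, 10], [40, 50])
print(leaf_set(ids, 10, size=4))  # ([60, 50], [20, 30]) because of the wrap-around
```

Since these are the nodes a key near own_id would otherwise fall to, they are natural targets for keep-alive messages.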
24Pastry Routing procedure
if (destination D is within range of our leaf set)
    forward to numerically closest member
else
    let l = length of shared prefix
    let d = value of l-th digit in D's address
    if (R_l^d exists)    (routing table entry in row l, column d)
        forward to R_l^d
    else
        forward to a known node that
        (a) shares at least as long a prefix
        (b) is numerically closer than this node
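The procedure above can be turned into a small runnable sketch. Everything here is an assumption made for illustration: ids are equal-length strings of base-4 digits, the routing table is a dictionary keyed by (row, digit), the function `next_hop` is hypothetical, and the leaf-set and table entries in the usage example are made-up values.

```python
def shared_prefix_len(a, b):
    """Number of leading digits that two ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def distance(a, b, base=4):
    return abs(int(a, base) - int(b, base))

def next_hop(own_id, key, leaf_set, routing_table, base=4):
    """Return the id to forward to, or own_id if this node is the destination."""
    # Case 1: the key falls within the range covered by the leaf set.
    members = leaf_set + [own_id]
    values = [int(m, base) for m in members]
    if min(values) <= int(key, base) <= max(values):
        return min(members, key=lambda m: (distance(m, key, base), m))
    # Case 2: use the routing table entry in row l, column d.
    l = shared_prefix_len(own_id, key)
    d = key[l]
    entry = routing_table.get((l, d))
    if entry is not None:
        return entry
    # Case 3: fall back to any known node that shares at least as long a
    # prefix and is numerically closer to the key than this node.
    known = leaf_set + list(routing_table.values())
    own_distance = distance(own_id, key, base)
    candidates = [n for n in known
                  if shared_prefix_len(n, key) >= l
                  and distance(n, key, base) < own_distance]
    if not candidates:
        return own_id
    return min(candidates, key=lambda n: (distance(n, key, base), n))

leaf = ["10233021", "10233033", "10233120", "10233122"]  # made-up leaf set
table = {(5, "2"): "10233232", (3, "1"): "10211302"}     # made-up entries
print(next_hop("10233102", "10233230", leaf, table))  # table hit: row 5, column 2
print(next_hop("10233102", "10233113", leaf, table))  # inside the leaf set range
```

Each case either lengthens the shared prefix or reduces the numeric distance to the key, which is why the message keeps making progress.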
25Typical Systems--Pastry
- Example in base 4
- Each row down the table, the shared prefix is one digit longer. The next digit after the shared prefix is the column number
- nodeId = 10233102
- E.g., suppose:
- 1. key = 10233-2-30
- 2. key = 102-1-2300
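The two keys can be checked mechanically. The hyphens in the example appear to set off the digit that follows the shared prefix; the sketch below assumes that reading and also assumes rows are numbered from 0, so the row index equals the shared prefix length. The helper `table_slot` is a hypothetical name.

```python
def table_slot(node_id, key):
    """Return (row, column) of the routing table entry consulted for this key.

    row    = number of leading digits shared by node_id and key
    column = the key's first digit after the shared prefix
    """
    row = 0
    while row < len(key) and node_id[row] == key[row]:
        row += 1
    if row == len(key):
        raise ValueError("key equals nodeId: no routing table lookup is needed")
    return row, int(key[row])

print(table_slot("10233102", "10233230"))  # (5, 2): shares 10233, next digit 2
print(table_slot("10233102", "10212300"))  # (3, 1): shares 102, next digit 1
```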
26Typical Systems--Pastry
- Designed by Microsoft Research and Rice University
- Pros
- Scalable: distributed, modest-sized routing table
- Robust: purely distributed P2P
- Cons
- Expensive self-organization: O(log n) messages
- Inadequate for keyword searching: exact file names are needed for a query
27Hybrid Networks Advantages
- Client-server apps are still centrally managed
- Users can assign local access to their resources
- Workgroups can manage resources without need for
assistance from network administrator
28Hybrid Networks
- Incorporates the best features of workgroups in Peer-to-Peer with the performance, reliability and security of server-based systems.
- Allows access to central resources, but also allows users to function at the Peer-to-Peer level
- Users do not have to log in to a central server
29P2P Security
30Failure vs. Attack
- Failure
- Random failure of nodes and/or infrastructure elements
- Attack
- Systematic failure of nodes and/or infrastructure elements
- Scale-free networks are failure-tolerant
- Why?
- Most P2P systems give priority to failure-tolerance over attack-tolerance
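One common explanation for the "Why?" is that a scale-free network has a few highly connected hubs and many low-degree nodes, so a random failure most likely hits a low-degree node, while an attacker can aim at a hub. The sketch below uses a single hub with spokes as a deliberately crude stand-in for such a topology; the functions `hub_and_spoke` and `largest_component` are hypothetical helpers.

```python
def hub_and_spoke(n_leaves):
    """Build an undirected graph with one hub connected to n_leaves leaves."""
    graph = {"hub": set()}
    for i in range(n_leaves):
        leaf = f"leaf{i}"
        graph[leaf] = {"hub"}
        graph["hub"].add(leaf)
    return graph

def largest_component(graph, removed):
    """Size of the largest connected group of nodes once `removed` are gone."""
    alive = set(graph) - set(removed)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first search over surviving nodes
            node = stack.pop()
            size += 1
            for neighbor in graph[node]:
                if neighbor in alive and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best

net = hub_and_spoke(9)
print(largest_component(net, []))         # 10: everything is connected
print(largest_component(net, ["leaf0"]))  # 9: a typical random failure
print(largest_component(net, ["hub"]))    # 1: a targeted attack on the hub
```

Nine of the ten nodes are leaves, so a uniformly random failure removes a leaf 90% of the time and barely changes connectivity, whereas removing the one hub isolates every node.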
31Possible Targets
- Underlying protocol layers
- P2P routing mechanism
- Nodes themselves
- Trust system
- Homeostasis (of the system)
- Applications/Application Protocols
- Users
32
- Still not sure what to write here?
- Anyway
- Questions?