Routing Overview - PowerPoint PPT Presentation

About This Presentation

Title:

Routing Overview

Description:

The napster server knows the music titles and their sites ... Napster is a hybrid P2P system since a central server is required to coordinate file sharing ... – PowerPoint PPT presentation

Slides: 26

Provided by: abhayp

Transcript and Presenter's Notes

1
Peer-to-Peer Computing CS587x Lecture
Department of Computer Science Iowa State
University
2
What to Cover

Review of some P2P applications
Napster
Gnutella
Freenet
Discussion and summary

3
Resource Sharing

Questions to answer in order to design a
resource-sharing network
How to add new nodes to the network
How can one node know about others
How can a node find and retrieve data
How to manage the shared data

4
Client/Server Architecture

Create a server to store the information that
these nodes want to share
The server is the only data source
Clients request data from server
Example: mp3.com
A client registers with mp3.com and uploads its
music files to the server
The songs are then stored and indexed on a server
that is part of the web site
Other users can connect to the web site and
download the songs they are interested in
Limitation of C/S model
Scalability is hard to achieve
Presents a single point of failure
Requires administration
Unused resource at the network edge

(Figure: a central server connected to clients Client-1 through Client-n)
5
Peer-to-Peer Models

Napster
Gnutella
Freenet

6
Napster

Each node registers with napster.com and provides a
list of its song titles
The napster server knows the music titles and
their sites
The songs themselves are still stored locally
For a node to download a song:
The node contacts the server
The server returns a list of nodes that have the
song
The requesting node selects one of the nodes in
the list and downloads the file directly from that
node
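
The directory role described above can be sketched as follows. This is a minimal illustration, not Napster's actual protocol: the class name DirectoryServer, the peer address strings, and the song titles are all hypothetical.

```python
from collections import defaultdict


class DirectoryServer:
    """Central index mapping song titles to peers; it stores no song data."""

    def __init__(self):
        self.index = defaultdict(set)  # title -> set of peer addresses

    def register(self, peer, titles):
        # A peer registers and provides the list of its song titles.
        for title in titles:
            self.index[title].add(peer)

    def lookup(self, title):
        # Return the peers that claim to hold the song (sorted for determinism).
        return sorted(self.index.get(title, set()))


server = DirectoryServer()
server.register("peer-a:6699", ["song1", "song2"])
server.register("peer-b:6699", ["song2"])

providers = server.lookup("song2")
chosen = providers[0]  # the requester would now download directly from this peer
```

The server only answers "who has it"; the file transfer itself would happen between the requester and the chosen peer.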

7
Highlights of Napster

Main innovation: a client downloads music
directly from another client, i.e., P2P
communication
After a client downloads a music file, it can serve
other clients
Napster server itself does not have any music
files
It acts as a directory or broker
Advantages
Each consumer contributes its resource (disk and
bandwidth) and content to the community
Contents are more reliable because the same file
is stored in many nodes, which are geographically
distributed
Administration and service cost are minimal
Drawback
Napster is a hybrid P2P system since a central
server is required to coordinate file sharing
The central server presents a single point of
failure

8
Gnutella

Creating a Gnutella network
A node joins the network by sending a PING to announce
itself
(IP address, port, number/size of shared files)
Receivers forward the PING to their neighbors
Receivers back-propagate a PONG to announce themselves
Each PONG includes the sender's IP address and
number/size of shared files
Maintaining a Gnutella network
PING neighbors periodically
PING well-known root nodes if starting from
scratch

9
Search Protocol

For node A to request a file (of any kind), it
creates a query (A, S, N, T), where S is the search
string, N a unique request ID, and T the Time-to-Live
A checks its local system; if the file is not found,
A sends (A, S, N, T) to all of its Gnutella neighbors
B receives a query (A, S, N, T)
If B has already received query N or T = 0, B drops
the query
Otherwise, B looks up S locally and sends (N,
Result) to A if anything is found
Any kind of lookup works (could simply grep, or
construct some SQL command)
If not found locally,
B sends (B, S, N, T-1) to all of its Gnutella
neighbors
B records the fact that A has made the request N
When B receives a response of the form (N,
Result) from one of its neighbors, it forwards
the response to A
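
The search steps above can be sketched as an in-memory simulation. This is an illustrative sketch only: the class GnutellaNode, the helper link, and the file names are hypothetical, lookup is reduced to exact string matching, and real Gnutella messages travel over sockets rather than method calls.

```python
class GnutellaNode:
    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.seen = {}     # request ID -> node the query came from (None if we originated it)
        self.results = []  # (request ID, result) pairs delivered to this node

    def search(self, s, n, ttl):
        # Originator: check the local system first, then send the query to all neighbors.
        self.seen[n] = None
        if s in self.files:
            self.results.append((n, (self.name, s)))
            return
        for nb in self.neighbors:
            nb.on_query(self, s, n, ttl)

    def on_query(self, sender, s, n, ttl):
        if n in self.seen or ttl <= 0:
            return                     # already received or expired: drop the query
        self.seen[n] = sender          # record who made request n
        if s in self.files:
            sender.on_response(n, (self.name, s))
            return
        for nb in self.neighbors:      # not found locally: forward with T-1
            nb.on_query(self, s, n, ttl - 1)

    def on_response(self, n, result):
        upstream = self.seen.get(n)
        if upstream is None:
            self.results.append((n, result))  # this node originated the request
        else:
            upstream.on_response(n, result)   # forward the response back toward the requester


def link(x, y):
    x.neighbors.append(y)
    y.neighbors.append(x)


a, b, c, d = (GnutellaNode("A"), GnutellaNode("B"),
              GnutellaNode("C", ["song.mp3"]), GnutellaNode("D"))
link(a, b)
link(b, c)
link(a, d)
a.search("song.mp3", n=1, ttl=3)  # the answer from C is relayed back through B
```

Because each node remembers only the neighbor that sent it request N, the response retraces the query path hop by hop.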

10
(No Transcript)
11
Gnutella Messages

PING
request the transitive closure of connected nodes
to identify them, essentially asking the question
"Are you there?"
PONG
response by a node upon receiving a PING; the
responding node provides its IP address and the
number of sharable files it contains. This gives
the answer "Yes, I am here."
QUERY
request to locate a set of files matching some
filter criteria. These are messages stating, "I
am looking for x".
HITS
response to a query giving a list of files
matching the filter criteria and the IP address
of the provider, can be many in number.
GET/PUSH
request a file provider to contact the requester.
This provides a simple mechanism trying to get
through firewalls

12
Partial Map of a Gnutella Network
13
Highlights of Gnutella

Pure P2P
Unlike Napster, fully decentralized, with no single
point of failure
Limitations
Scalability: if you send out a request with a TTL
of 7, and each site contacts six other sites, up
to 6^1 + 6^2 + 6^3 + 6^4 + 6^5 + 6^6 + 6^7 messages could be
exchanged
Not anonymous: since the result contains the URL
string, the source provider can be tracked; this
is addressed in Freenet
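
As a quick check of the scalability bound, the worst case for a TTL of 7 with six contacts per site can be computed directly. The function name max_messages is hypothetical, and the sketch assumes every site forwards to exactly six new sites at every hop.

```python
def max_messages(fanout, ttl):
    # Upper bound: fanout^1 + fanout^2 + ... + fanout^ttl messages.
    return sum(fanout ** hop for hop in range(1, ttl + 1))


total = max_messages(6, 7)
print(total)  # 335922
```

A single query can therefore generate roughly a third of a million messages in the worst case.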

14
Freenet

Freenet is a pure P2P system mainly designed to
support
distributed information storage and retrieval
anonymity for producers, consumers and holders of
information
adaptive response to usage patterns
Freenet differs from Gnutella mainly in
Retrieving data
Storing data
Managing data

15
Architecture

Each file is identified by a binary key
The key is generated using some hash function
Every file is stored, retrieved, and maintained
with its file key
Each node maintains a local data store and a
routing table
data store maintains a set of files
routing table keeps information about neighboring
nodes and the keys that they are thought to hold
A sequence of (file key, node address)
Used for file retrieval

16
Retrieving data

A user first obtains or calculates a key
The user sends a search request message (key, TTL)
to the local node
When a node receives a request, it checks its own
data store
If the specified data is found, returns it
Otherwise, the node looks up its routing table
and forwards the request to the node that has the
nearest key
Why do this, when the similarity of two keys actually
has nothing to do with that of their
corresponding files?
If the request is successful, the node that has
the target data
returns the data along the search path;
each node on the path caches the file in its own data store and
creates a new entry in its routing table
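
The retrieval steps above amount to a depth-first search ordered by key closeness. The sketch below is illustrative only: keys are small integers, "nearest" is taken to mean smallest absolute difference, loops are avoided with a shared visited set, and the class name RetrievalNode is hypothetical. None of these details are specified in the slides.

```python
class RetrievalNode:
    def __init__(self, name):
        self.name = name
        self.store = {}   # key -> data
        self.routes = {}  # key -> node thought to hold that key

    def request(self, key, ttl, visited=None):
        visited = set() if visited is None else visited
        visited.add(self.name)
        if key in self.store:
            return self.store[key], self      # found in the local data store
        if ttl <= 0:
            return None                       # hops to live exhausted
        # Try the node with the nearest key first, then the second nearest, and so on.
        for k in sorted(self.routes, key=lambda k: abs(k - key)):
            nxt = self.routes[k]
            if nxt.name in visited:
                continue
            found = nxt.request(key, ttl - 1, visited)
            if found is not None:
                data, source = found
                self.store[key] = data        # cache the file on the return path
                self.routes[key] = source     # create a new routing table entry
                return data, source
        return None                           # failure: the caller tries its next candidate


a, b, c = RetrievalNode("A"), RetrievalNode("B"), RetrievalNode("C")
a.routes = {10: b, 50: c}
c.store[12] = b"file"
result = a.request(12, ttl=5)  # B (nearest key 10) fails, then C succeeds
```

After the call, A holds its own copy of the file, which is how popular data ends up replicated along retrieval paths.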

17
Example
(Figure: a file request (key, hops to live) routed among nodes A, B, C, D, and E)
Requester: 1. Calculate binary file key 2. Check routing
table for node with nearest key
Each node on the path:
1. Check datastore for file
2. If NOT FOUND, check routing table for node with nearest key
to requested one
3. On FAILURE, try the node with second nearest key
When the file is FOUND, each node on the return path:
Cache file in datastore, create new entry in
routing table
Legend: file request (key, hops to live), data reply, actual data source,
failure message
18
Effect of Retrieving Mechanism

Anonymity
Uncontrolled replication allows one to deny
responsibility for having the file
Quality of routing improves over time
Nodes specialize in locating sets of similar keys
Files with similar keys are stored in clusters
(why?)
Files are clustered by key rather than
by subject
Transparent replication of popular data
Improved data availability
Replication degree depends on data popularity
Increasing connectivity
The graph becomes more and more connected

19
Effect of Retrieving Mechanism

Major difference from Gnutella searching
Breadth-first search vs. Depth-first search
Replication over the retrieval path
Limitation
Searching for a document that does not exist?

20
Storing data

Calculate the binary file key and send an insert message,
similar to a request (key, hops to live)
When a node receives an insert proposal, it first
checks its own data store
If the key already exists, the user needs to try
again using a different key
Otherwise, the node looks up the nearest key in
its routing table and forwards the insert to the
corresponding node
If a key collision occurs at the adjacent node, that
node notifies the inserter to try another key
If the TTL expires without a key collision, an
all-clear result is propagated back to the original
inserter
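
The collision check performed by an insert proposal can be sketched as follows. This covers only the proposal phase described above; what happens to the data after the all-clear is not modeled. Integer keys, absolute-difference closeness, and the class name InsertNode are assumptions for illustration.

```python
class InsertNode:
    def __init__(self, name):
        self.name = name
        self.store = {}   # key -> data
        self.routes = {}  # key -> node thought to hold that key

    def propose(self, key, ttl):
        """Return True (all clear) if no collision is found within ttl hops."""
        if key in self.store:
            return False                  # key collision: inserter must try another key
        if ttl <= 0 or not self.routes:
            return True                   # TTL expired (or dead end) without a collision
        nearest = min(self.routes, key=lambda k: abs(k - key))
        return self.routes[nearest].propose(key, ttl - 1)


a, b, c = InsertNode("A"), InsertNode("B"), InsertNode("C")
a.routes = {20: b}
b.routes = {30: c}
c.store[25] = b"existing file"

collision = a.propose(25, ttl=5)   # reaches C, which already holds key 25
too_short = a.propose(25, ttl=1)   # TTL expires at B before the collision is seen
fresh_key = a.propose(26, ttl=5)   # no node on the path holds key 26
```

The second call shows why the all-clear is not a global guarantee: a small TTL can miss an existing file with the same key.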

21
Storing data

Effects of insert mechanism
New files are placed on nodes possessing files
with similar keys
Limitation
How long does it take to insert a file?
How about version management?
Two different files could have the same key and
both may exist in the network
Different users must have different name spaces
The same user must use different file descriptions
(e.g., keywords) for different files
Security is a concern

22
Managing data

File replacement is done using LRU
Data items are sorted in decreasing order by time of
most recent request/insert
Outdated documents fade away naturally, although their routing
table entries will remain for a time
File lifetime
How long a file is kept is unknown
You cannot delete a file from Freenet; a file
disappears only if it is not accessed for
a while
No guarantee that a document you submit today
will exist tomorrow
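
LRU replacement of a node's data store can be sketched with an ordered dictionary. This is a minimal sketch: the class name LRUDataStore and the capacity counted in number of files (rather than bytes) are assumptions.

```python
from collections import OrderedDict


class LRUDataStore:
    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # ordered from most to least recently requested/inserted

    def insert(self, key, data):
        self.files[key] = data
        self.files.move_to_end(key, last=False)  # newest item goes to the front
        while len(self.files) > self.capacity:
            self.files.popitem(last=True)        # evict the least recently used file

    def request(self, key):
        if key not in self.files:
            return None
        self.files.move_to_end(key, last=False)  # a request refreshes the file
        return self.files[key]


store = LRUDataStore(capacity=2)
store.insert(1, b"a")
store.insert(2, b"b")
store.request(1)       # file 1 becomes the most recently used
store.insert(3, b"c")  # file 2 is now the least recently used and is evicted
```

No explicit delete operation exists; a file leaves the store only by falling off the end of this ordering.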

23
Highlights of Freenet

Pure P2P, similar to Gnutella
Provides anonymity
Neither the data producer nor the retriever can be
identified
Searching, storing, and managing all differ from Gnutella,
for anonymity and performance purposes

24
P2P Advantages

Efficient use of resources
Client/Server architecture cannot take advantage
of the unused bandwidth, storage, processing
power at the edge of network
Scalability
Each user contributes its resources to the entire
community, instead of just being a burden
Reliability
Replicas
Geographic distribution
No single point of failure
Ease of Administration
Nodes self organize
No need to deploy servers to satisfy demand
Built-in fault tolerance, replication, and load
balancing

25
P2P Computing Summary

P2P computing is the sharing of computer
resources by direct exchange between systems
Such resources include information, processing
cycles, storage, etc.
A P2P network has the following characteristics
Each node behaves as client, server, and router
Nodes are autonomous (no administrative
authority)
Network is dynamic: nodes enter and leave the
network frequently
Nodes collaborate directly with each other (not
through well-known servers)
Nodes have widely varying capabilities