Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

Slides: 19

Provided by: roulada

Transcript and Presenter's Notes

1

Chord: A Scalable Peer-to-peer Lookup Service
for Internet Applications
Dariotaki Roula
th.dariotaki_at_di.uoa.gr
2

Peer-to-peer system: a distributed system without
any centralized control or hierarchical
organization, where the software running at each
node is equivalent in functionality
The goal: locate the node that stores a
particular data item
General idea of Chord: a distributed lookup
protocol that, given a key, maps the key onto
a node
Benefits
- Adapts as nodes join or leave the system
- Scalable: communication cost grows
  logarithmically with the number of nodes

Chord features
- Load balance: a distributed hash function
  spreads keys evenly over the nodes
- Decentralization: fully distributed; no node is
  more important than any other
- Scalability: the cost of a lookup grows as O(log N)
- Availability: adjusts its internal tables to
  reflect node joins and failures
- Flexible naming: no constraints on the structure
  of the keys it looks up

Examples of Chord applications
- Cooperative mirroring: software developers publish
  demonstrations for which demand varies dramatically
  > balance load
  > replication and caching
  > ensure authenticity (by storing the data under a
    Chord key derived from a cryptographic hash of
    the data)
- Time-shared storage: store data on other machines
  to ensure that it is always available. The
  data's name can serve as a key to identify the
  node responsible for the data at any time
- Distributed index: support for Gnutella- or
  Napster-style keyword search. Keys can be derived
  from the desired keywords, and values can be lists
  of machines offering documents with those keywords

5
The base Chord protocol
Consistent hashing
- Assigns each node and key an m-bit identifier,
  chosen by hashing the node's IP address or the key
- The identifier length m must be large enough to
  make the probability of two nodes or keys hashing
  to the same identifier negligible
Assignment
- Identifiers are ordered on a circle modulo 2^m
- Key k is assigned to the first node whose
  identifier is equal to or follows the identifier
  of k (the successor node of k)
- When node n joins the network, certain keys
  previously assigned to n's successor become
  assigned to n
- When node n leaves the network, all of its
  assigned keys are reassigned to n's successor
Example: m = 3
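
The assignment rule can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the node identifiers 0, 1 and 3, the keys 1, 2 and 6, the use of SHA-1 in `chord_id`, and all function names are assumptions chosen to fit a ring with m = 3.

```python
import hashlib
from bisect import bisect_left
from typing import List

M = 3  # identifier length in bits (assumed, to keep the ring tiny)


def chord_id(name: str, m: int = M) -> int:
    """Hash a node address or a key name to an m-bit identifier.

    SHA-1 is an assumption here; any well-mixing hash would do for the sketch.
    """
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(key_id: int, node_ids: List[int]) -> int:
    """Return the first node whose identifier is equal to or follows key_id."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]  # wrap around past the largest identifier


nodes = [0, 1, 3]  # hypothetical node identifiers
assignment = {k: successor(k, nodes) for k in (1, 2, 6)}
print(assignment)  # {1: 1, 2: 3, 6: 0}
```

Key 6 wraps around the circle and lands on node 0, which is the "equal to or follows" rule applied modulo 2^m.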
6

Theorem 1
For any set of N nodes and K keys, with high
probability:
- Each node is responsible for at most (1+ε)K/N
  keys, where ε = O(log N)
- When an (N+1)st node joins or leaves the network,
  responsibility for O(K/N) keys changes hands (and
  only to or from the joining or leaving node)
Scalable key location
- To implement consistent hashing, each node need
  only be aware of its successor node on the circle
- This may be inefficient: a lookup may have to
  traverse all nodes to find the appropriate one
- Each node therefore maintains a routing table
  with m entries (the finger table)
- The ith finger of n is the ith entry in the finger
  table of n: the identity of the first node s that
  succeeds n by at least 2^(i-1) on the identifier
  circle
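
The finger definition translates directly into code. The sketch below assumes a hypothetical ring with m = 3 and nodes 0, 1 and 3; the helper names are illustrative.

```python
from bisect import bisect_left
from typing import List


def successor(ident: int, ring: List[int]) -> int:
    """First node whose identifier is equal to or follows ident (ring is sorted)."""
    return ring[bisect_left(ring, ident) % len(ring)]


def finger_table(n: int, ring: List[int], m: int) -> List[int]:
    """The ith finger (i = 1..m) is successor((n + 2^(i-1)) mod 2^m)."""
    return [successor((n + 2 ** (i - 1)) % 2 ** m, ring) for i in range(1, m + 1)]


ring = [0, 1, 3]  # hypothetical node identifiers, m = 3
tables = {n: finger_table(n, ring, 3) for n in ring}
print(tables)  # {0: [1, 3, 0], 1: [3, 3, 0], 3: [0, 0, 0]}
```

The first finger of every node is its immediate successor, and the finger targets double in distance, so a node knows more about the part of the circle just ahead of it.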

Two important characteristics
- Each node stores information about only a small
  number of nodes, and knows more about nodes
  closely following it than about nodes farther away
- A node's finger table generally does not contain
  enough information to determine the successor of
  an arbitrary key
What happens if a node does not know the
successor of a key?
- It asks a node whose ID is closer than its own
  to the key whose successor is requested
Theorem 2
With high probability, the number of nodes that
must be contacted to find a successor in an N-node
network is O(log N)
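
A lookup that forwards the query to the closest preceding finger can be sketched as follows. This is a single-process simulation under stated assumptions: the 10-node ring, m = 6, and every function name are hypothetical, and remote calls are replaced by dictionary lookups.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

M = 6  # identifier length in bits (assumed)


def ring_successor(ident: int, ring: List[int]) -> int:
    """First node at or after ident on the circle; ring must be sorted."""
    return ring[bisect_left(ring, ident % 2 ** M) % len(ring)]


def build_fingers(ring: List[int]) -> Dict[int, List[int]]:
    """fingers[n][i-1] = successor(n + 2^(i-1)); entry 0 is n's successor."""
    return {n: [ring_successor(n + 2 ** (i - 1), ring) for i in range(1, M + 1)]
            for n in ring}


def in_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular open interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def find_successor(start: int, key: int,
                   fingers: Dict[int, List[int]]) -> Tuple[int, int]:
    """Return (successor of key, number of other nodes asked along the way)."""
    n, hops = start, 0
    # Stop once the key lies in (n, successor(n)].
    while not (key == fingers[n][0] or in_open(key, n, fingers[n][0])):
        # Forward to the farthest finger that still precedes the key.
        n = next(f for f in reversed(fingers[n]) if in_open(f, n, key))
        hops += 1
    return fingers[n][0], hops


ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # hypothetical node identifiers
fingers = build_fingers(ring)
print(find_successor(8, 54, fingers))  # (56, 2): the query goes 8 -> 42 -> 51
```

Each forwarding step at least halves the remaining distance to the key, which is the intuition behind the O(log N) bound.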

8
Joining the network
Requires:
1. Each node's successor is correctly maintained
2. For every key k, node successor(k) is
   responsible for k
(For fast lookups, finger tables must also be
correct)
Theorem 3
With high probability, any node joining or leaving
an N-node Chord network will use O(log^2 N)
messages to re-establish the Chord routing
invariants and finger tables
For simplicity, each node also maintains a
predecessor pointer
Step 1: Initialize the predecessor and fingers of
node n
- Ask an existing node n' to look them up
- Done naively, this costs O(m log N) for the m
  entries
- The number of entries that must be looked up can
  be reduced to O(log N) (with an additional check
  for empty intervals)
- As a result, the overall time for lookups is
  O(log^2 N)
9

Step 2: Update the fingers and predecessors of
existing nodes to reflect the addition of n
- Node n will become the ith finger of a node p iff
  p precedes n by at least 2^(i-1) and
  the ith finger of node p succeeds n (see the
  example)
- Finding and updating these nodes takes
  O(log^2 N), but this can be reduced to O(log N)
Step 3: Notify the higher-layer software so that it
can transfer the state associated with keys (e.g.
values) that node n is now responsible for
- What is transferred depends on the higher-level
  software using Chord
- Node n can become the successor only for keys
  that the immediately following node was
  previously responsible for

10

Example of a node join: node 6 joins the network
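
Since the slide's figure is not reproduced here, the effect of the join on key ownership can be sketched instead. The ring before the join (nodes 0, 1 and 3, with m = 3) and the stored keys 1, 2 and 6 are assumptions made for illustration; only "node 6 joins" comes from the slide.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple


def successor(ident: int, nodes: List[int]) -> int:
    """First node whose identifier is equal to or follows ident on the circle."""
    ring = sorted(nodes)
    return ring[bisect_left(ring, ident) % len(ring)]


def keys_moved_on_join(keys: List[int], nodes: List[int],
                       new_node: int) -> Dict[int, Tuple[int, int]]:
    """Map each key that changes owner to (old owner, new owner)."""
    before = {k: successor(k, nodes) for k in keys}
    after = {k: successor(k, nodes + [new_node]) for k in keys}
    return {k: (before[k], after[k]) for k in keys if before[k] != after[k]}


nodes = [0, 1, 3]  # hypothetical ring before the join
keys = [1, 2, 6]   # hypothetical stored keys
moved = keys_moved_on_join(keys, nodes, new_node=6)
print(moved)  # {6: (0, 6)}
```

Only key 6 moves, and it moves from node 0, the node that immediately follows the new node on the circle. Keys 1 and 2 keep their owners.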
11
Concurrent joins
- A stabilization protocol keeps successors up to
  date
- Every node runs stabilization periodically
What happens if a lookup occurs before
stabilization has finished? Three cases:
1. All finger tables are current
   > lookup needs O(log N) steps
2. Successors correct, fingers inaccurate
   > lookups are correct but slow
3. Successors inaccurate
   > query failure; retry after a short pause
Example of stabilization
Existing nodes: np and ns
Joining node: n, with ID between np and ns
- n acquires ns as successor
-> n notifies ns; ns acquires n as predecessor
-> when np runs stabilization, it asks ns for its
   predecessor (n)
-> np acquires n as successor
-> np notifies n; n acquires np as predecessor
Correct state
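
The stabilization example can be replayed with a small in-memory model. This is a sketch under assumptions: the identifiers 10, 15 and 20 are made up, the joining node is handed its successor directly instead of looking it up, and periodic execution is replaced by explicit calls.

```python
from typing import Optional


def between(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular open interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


class Node:
    def __init__(self, ident: int) -> None:
        self.id = ident
        self.successor: "Node" = self
        self.predecessor: Optional["Node"] = None

    def join(self, successor: "Node") -> None:
        """Join knowing only the successor (assumed to be found by a lookup)."""
        self.predecessor = None
        self.successor = successor

    def stabilize(self) -> None:
        """Adopt the successor's predecessor if it sits between us, then notify."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate: "Node") -> None:
        """candidate believes it might be our predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate


# Stable two-node ring: n_p (10) and n_s (20); identifiers are hypothetical.
n_p, n_s = Node(10), Node(20)
n_p.successor, n_p.predecessor = n_s, n_s
n_s.successor, n_s.predecessor = n_p, n_p

n = Node(15)
n.join(n_s)       # n acquires n_s as successor
n.stabilize()     # n notifies n_s; n_s acquires n as predecessor
n_p.stabilize()   # n_p learns about n from n_s and notifies n

print(n_p.successor.id, n.predecessor.id, n.successor.id, n_s.predecessor.id)
# 15 10 20 15
```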
12
Theorems 4-5
- Once a node can successfully resolve a given
  query, it will always be able to do so in the
  future
- At some time after the last join, all successor
  pointers will be correct
Joins don't substantially damage the performance
of fingers: a lookup takes O(log N) hops to reach
the interval close to the target node, then a
linear search to find the exact node
Theorem 6
If we take a stable network with N nodes, and
another set of up to N nodes joins the network with
no finger pointers (but with correct successor
pointers), then lookups will still take O(log N)
time with high probability
13

Failures
- Each node keeps a successor list of its r
  nearest successors. If a node notices that its
  successor has failed, it replaces it with the
  first live entry in its successor list.
- When stabilization runs, finger tables will be
  updated.
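
Replacing a failed successor from the successor list can be sketched as below. The ring, the list length r = 3, the set of failed nodes, and the function names are all hypothetical; failure detection itself (for example by timeouts) is outside the sketch.

```python
from typing import List, Optional, Set


def successor_list(n: int, ring: List[int], r: int) -> List[int]:
    """The r nodes that follow n on the circle (assumes r < len(ring))."""
    ordered = sorted(ring)
    i = ordered.index(n)
    return [ordered[(i + j) % len(ordered)] for j in range(1, r + 1)]


def first_live_successor(candidates: List[int], failed: Set[int]) -> Optional[int]:
    """First entry of the successor list that has not failed, or None."""
    for s in candidates:
        if s not in failed:
            return s
    return None


ring = [1, 8, 14, 21, 32]  # hypothetical node identifiers
candidates = successor_list(8, ring, r=3)
print(candidates)                                    # [14, 21, 32]
print(first_live_successor(candidates, {14, 21}))    # 32
```

If all r entries fail at once the function returns None, which is why the list length matters for the guarantees that follow.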
Theorems 7-8
If we use a successor list of length r = O(log N)
in a network that is initially stable, and then
every node fails with probability ½, then with
high probability find_successor returns the
closest living successor to the query key, and the
expected time to execute find_successor in the
failed network is O(log N)
Chord can store replicas of the data associated
with a key at the k nodes succeeding the key
> It can inform the higher layer when successors
  come and go, and so when the software should
  propagate new replicas

14
Simulation and experimental results
Load balance
- Number of nodes: 10^4
- Total number of keys: 10 x 10^4 to 100 x 10^4
- PDF of keys per node shown for 50 x 10^4 keys
- There are nodes with no keys; the maximum number
  of keys on a node is 9.1x the mean
Notes
- Node IDs do not cover the identifier space
  uniformly
- The number of keys per node increases linearly
  with the total number of keys
15

Adding virtual nodes to improve load balance
- Number of nodes: 10^4
- Total number of keys: 100 x 10^4
- Keys are distributed more evenly as the number
  of virtual nodes increases
Notes
- The tradeoff is that routing table usage
  increases, as each actual node needs r times as
  much space to store the finger tables for its r
  virtual nodes. In practice this is not a problem.
- The number of virtual nodes needed is at most
  O(log N), so we must know the total number of
  nodes. Solution: use an upper bound on the number
  of nodes.
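
The effect of virtual nodes can be explored with a small experiment. This sketch is far smaller than the simulation on the slide and is not meant to reproduce its numbers: the 32-bit identifier space, SHA-1, the naming scheme for nodes and keys, and the sizes (100 nodes, 10,000 keys) are all assumptions.

```python
import hashlib
from bisect import bisect_left
from typing import Dict


def ident(name: str, bits: int = 32) -> int:
    """Hash a name to an identifier (SHA-1 truncated to `bits` bits, assumed)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big") % 2 ** bits


def keys_per_node(num_nodes: int, num_keys: int, vnodes: int) -> Dict[int, int]:
    """Count keys per physical node when each node runs `vnodes` virtual nodes."""
    # Every virtual node gets its own point on the circle, tagged with its owner.
    points = sorted((ident(f"node-{p}/vnode-{v}"), p)
                    for p in range(num_nodes) for v in range(vnodes))
    ids = [i for i, _ in points]
    load = {p: 0 for p in range(num_nodes)}
    for k in range(num_keys):
        owner = points[bisect_left(ids, ident(f"key-{k}")) % len(points)][1]
        load[owner] += 1
    return load


for v in (1, 10):
    load = keys_per_node(num_nodes=100, num_keys=10_000, vnodes=v)
    mean = sum(load.values()) / len(load)
    print(f"vnodes={v}: min={min(load.values())} max={max(load.values())} mean={mean}")
```

Comparing the printed minimum and maximum for the two settings shows how the spread around the mean changes as virtual nodes are added.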

16
Path length
The number of nodes traversed during a lookup
operation: O(log N)
- Number of nodes: 2^k
- Number of keys: 100 x 2^k
- PDF shown for 2^12 nodes
Notes
- The mean path length increases logarithmically
  with the number of nodes
- Path length is about ½ log_2 N
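
One way to see where the factor ½ comes from is a simplified model, stated here as an assumption: write the distance from the querying node to the key in binary, and let each finger hop clear one of its 1 bits. A random distance has about half of its log_2 N bits set, so the mean hop count is about ½ log_2 N. The function name below is illustrative.

```python
def mean_set_bits(k: int) -> float:
    """Mean number of 1 bits over all distances 0 .. 2^k - 1.

    In the simplified model above this stands in for the mean path length
    in a ring with N = 2^k nodes.
    """
    n = 2 ** k
    return sum(bin(d).count("1") for d in range(n)) / n


print(mean_set_bits(12))  # 6.0, i.e. half of log_2(2^12) = 12
```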
17
Simultaneous node failures
- Number of nodes: 10^4
- Number of keys: 100 x 10^4
Note: the fraction of lookups that fail is almost
equal to the fraction of nodes that fail
18
Failed lookups and stabilization
Causes
1. The node responsible for the key has failed
2. Some finger tables and predecessor pointers are
   inconsistent
Note: it may take more than one stabilization
round to completely clear out a failed node
