A Scalable Content Addressable Network

1
A Scalable Content Addressable Network
  • Sylvia Ratnasamy, Paul Francis, Mark Handley, and
    Richard Karp
  • SIGCOMM 2001

2
What is a Scalable Content Addressable Network?
  • Network: a distributed system
  • Content-addressable
      • Hash table maps keys onto values
      • Indexing mechanism that maps file names to their
        location in the system
  • Scalable
      • Internet-scale hash table
      • Challenge: find a scalable indexing mechanism

3
Design of CAN
  • d-dimensional Cartesian coordinate space
    (d-torus)
  • Each node owns a zone on the torus
  • To store key value pair (K1, V1),
  • K1 mapped to point P1 using uniform hash
    function
  • (K1, V1) stored at the node N that owns the zone
    containing P1
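
The storage rule on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: SHA-256 stands in for the unspecified "uniform hash function", and the names `key_to_point`, `owner`, `zones`, and `store` are hypothetical.

```python
import hashlib

def key_to_point(key: str, d: int = 2) -> tuple:
    """Hash a key to a point in [0, 1)^d (assumed scheme, not from the slides)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    coords = []
    for i in range(d):
        # Use 4 digest bytes per coordinate, so d must be at most 8 here.
        chunk = int.from_bytes(digest[4 * i: 4 * i + 4], "big")
        coords.append(chunk / 2**32)
    return tuple(coords)

def owner(zones: dict, point: tuple) -> str:
    """Return the node whose zone, given as (lo, hi) corners, contains the point."""
    for node, (lo, hi) in zones.items():
        if all(l <= x < h for l, x, h in zip(lo, point, hi)):
            return node
    raise ValueError("zones do not cover the point")

# Four equal zones partitioning the 2-d coordinate space.
zones = {
    "A": ((0.0, 0.0), (0.5, 0.5)),
    "B": ((0.5, 0.0), (1.0, 0.5)),
    "C": ((0.0, 0.5), (0.5, 1.0)),
    "D": ((0.5, 0.5), (1.0, 1.0)),
}
store = {node: {} for node in zones}

p1 = key_to_point("K1")              # K1 is mapped to point P1
store[owner(zones, p1)]["K1"] = "V1"  # (K1, V1) is stored at the owner of P1
```

A lookup for K1 repeats the same hash and asks the same owner, which is why any node can find the value without a central index.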

4
Design of CAN
  • Each node stores the IP address and coordinate zone
    of each node that owns an adjoining zone
  • This set of neighbors is the node's routing table

5
Routing
  • How to route from node A to point P1 at (0.7,
    0.6)?
  • Draw a straight line from a point in A's zone to P1
  • Follow the straight line using neighbor pointers
  • For a d-dimensional space partitioned into n equal
    zones, each node maintains 2d neighbors
  • Average routing path length: (d/4)(n^(1/d)) hops
6
Node Joins
  • New node finds a node already in CAN
  • New node chooses random point P and sends JOIN
    message to node whose zone contains P, say node N
  • N splits its zone and allocates half to new
    node
  • New node learns neighbor set from N
  • N updates its neighbor set to include new node
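
The zone split at join time can be sketched as below. The split rule (halve the longest side, first dimension on ties) and the choice of which half goes to the new node are assumptions; the slides only say that N gives away half of its zone. The helper names are hypothetical, and neighbor-set updates are left out.

```python
def split_zone(lo: tuple, hi: tuple):
    """Halve a zone along its longest side; return (kept_zone, given_zone)."""
    sides = [h - l for l, h in zip(lo, hi)]
    dim = sides.index(max(sides))
    mid = (lo[dim] + hi[dim]) / 2
    kept_hi = hi[:dim] + (mid,) + hi[dim + 1:]
    given_lo = lo[:dim] + (mid,) + lo[dim + 1:]
    return (lo, kept_hi), (given_lo, hi)

def contains(zone, point) -> bool:
    lo, hi = zone
    return all(l <= x < h for l, x, h in zip(lo, point, hi))

def join(zones: dict, data: dict, n: str, new: str) -> None:
    """Node `n` splits its zone and hands half, with its stored pairs, to `new`."""
    kept, given = split_zone(*zones[n])
    zones[n], zones[new] = kept, given
    old = data[n]
    data[new] = {p: v for p, v in old.items() if contains(given, p)}
    data[n] = {p: v for p, v in old.items() if contains(kept, p)}

zones = {"N": ((0.0, 0.0), (1.0, 1.0))}
data = {"N": {(0.2, 0.3): "a", (0.7, 0.6): "b"}}  # values keyed by hashed point
join(zones, data, "N", "new")
```

After the call, "N" keeps the left half of the space and "new" owns the right half together with the pair stored at (0.7, 0.6).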

7
Node Joins - New Node 7

8
Graceful Node Departure
  • Node explicitly hands over its zone to one of its
    neighbors
  • Zones are merged to form a valid zone if possible
  • If not, the smallest neighbor temporarily handles
    both zones

9
Failures
  • Each node periodically sends messages to each of
    its neighbors
  • Absence of a message signals node failure
  • A node that detects a failure initiates the takeover
    mechanism
  • Each node initializes a takeover timer with a
    value proportional to its zone volume
  • When the timer expires, the node sends a TAKEOVER
    message containing its zone volume to the failed
    node's neighbors
  • When a node receives a TAKEOVER message, it either
      • replies with its own TAKEOVER message if its
        volume is smaller, or
      • cancels its takeover timer
  • The takeover mechanism ensures that the node with
    the smallest volume takes over the zone
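
The outcome of the takeover timers can be sketched with a small simulation. It assumes instantaneous message delivery, a proportionality constant of 1, and alphabetical tie-breaking, none of which come from the slides; `takeover_winner` is a hypothetical name.

```python
def takeover_winner(volumes: dict) -> str:
    """Return the neighbor of the failed node that ends up owning its zone.

    `volumes` maps each neighbor's name to its zone volume.
    """
    # Timers are proportional to zone volume, so they expire smallest first.
    expiry_order = sorted(volumes, key=lambda n: (volumes[n], n))
    cancelled = set()
    winner = None
    for node in expiry_order:
        if node in cancelled:
            continue  # this timer was cancelled by an earlier TAKEOVER
        winner = node  # node sends TAKEOVER carrying its volume
        for other in volumes:
            # A receiver with a larger or equal volume cancels its timer.
            if other != node and volumes[other] >= volumes[node]:
                cancelled.add(other)
    return winner

neighbors = {"B": 0.25, "C": 0.125, "D": 0.5}
new_owner = takeover_winner(neighbors)
```

With real network delay, two timers can fire before either TAKEOVER arrives; the reply rule on the slide (answer with your own TAKEOVER if your volume is smaller) resolves that case in favor of the smaller zone.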

10
Design Improvements
  • Increase dimension
      • Reduces routing path length
      • Slightly increases size of routing table
  • Multiple independent coordinate spaces
    (realities)
      • Each node assigned to a different zone in each
        reality
      • Can route in any reality
      • Shorter paths and higher fault tolerance
  • Overloading zones
      • More than one peer per zone
      • Can replicate key at each peer in zone: improved
        fault tolerance
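
The trade-off behind "increase dimension" can be made concrete with the per-node costs from the CAN paper: 2d neighbors and an average path of (d/4)(n^(1/d)) hops for n equal zones. The function name `can_costs` and the sample size n = 4096 are chosen only for illustration.

```python
def can_costs(n: int, d: int) -> tuple:
    """Return (neighbors per node, average path length in hops)."""
    neighbors = 2 * d
    avg_hops = (d / 4) * n ** (1 / d)
    return neighbors, avg_hops

# Same network size, increasing dimension.
table = {d: can_costs(4096, d) for d in (2, 3, 4, 6)}
for d, (neighbors, avg_hops) in table.items():
    print(f"d={d}: {neighbors} neighbors, about {avg_hops:.1f} hops")
```

For n = 4096, going from d = 2 to d = 6 triples the routing table (4 to 12 entries) while the average path drops from 32 hops to roughly 6.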

11
Summary of CAN
  • Node arrivals and departures affect a small
    number of neighbors
  • Useful for multi-attribute data

12
Tapestry: A Resilient Global-scale Overlay for
Service Deployment
  • Ben Zhao, Ling Huang, Jeremy Stribling, Sean
    Rhea, Anthony Joseph, and John Kubiatowicz

13
What is Tapestry?
  • A dynamic, scalable, fault-tolerant location and
    routing infrastructure
  • Maps each object ID to unique Root Node
  • Each node has local Neighbor Map to facilitate
    routing to any Root Node
  • Publish by reference
  • Prefix-based routing
  • Core API
  • publishObject(ObjectID, ServerID)
  • routeToObject(ObjectID)
  • routeToNode(NodeID)

14
Prefix Routing
  • Node IDs and keys from randomized namespace
    (SHA-1)
  • incremental routing towards destination ID
  • each node has small set of outgoing routes
  • log (n) neighbors per node, log (n) hops
    between any node pair
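
Incremental prefix routing can be sketched as below. For brevity every node consults one shared list of known nodes, which is a simplifying assumption; in Tapestry each node has its own neighbor map. The function names are hypothetical, and the node IDs are the ones shown on the routing-details slide.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current: str, dest: str, known: list) -> str:
    """Pick a node that matches more leading digits of dest than current does."""
    have = shared_prefix_len(current, dest)
    better = [n for n in known if shared_prefix_len(n, dest) > have]
    if not better:
        return current
    # Resolve as few extra digits as possible, i.e. one level at a time.
    return min(better, key=lambda n: shared_prefix_len(n, dest))

def route(src: str, dest: str, known: list) -> list:
    path = [src]
    while path[-1] != dest:
        hop = next_hop(path[-1], dest, known)
        if hop == path[-1]:
            break  # no closer node is known
        path.append(hop)
    return path

path = route("2175", "0157", ["0880", "0123", "0154", "0157"])
```

Each hop fixes at least one more digit of the destination ID, so a route takes at most as many hops as the ID has digits.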

[Figure: prefix routing toward ID ABCE; nodes shown: A930, AB5F, ABC0]

15
Routing Details
  • Example: octal digits, 2^12 namespace, 2175 → 0157

[Figure: nodes 2175, 0880, 0123, 0154, 0157 along the route]

16
Publication / Location
  • Every object has associated Root Node
  • Root keeps a pointer to the object's location
  • Object O stored at server S
  • S routes to Root(O)
  • Each hop keeps <O,S> in its index database
  • Client C routes toward Root(O), then routes to S when
    <O,S> is found
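
Publication and location can be sketched with plain dictionaries standing in for each node's index database. The routes are passed in as precomputed lists of node names, and all names here (`publish`, `locate`, `n1`, `n2`, `n3`, `root`) are hypothetical.

```python
def publish(pointers: dict, path_to_root: list, obj: str, server: str) -> None:
    """Leave an <obj, server> pointer at every hop on the way to Root(obj)."""
    for node in path_to_root:
        pointers.setdefault(node, {})[obj] = server

def locate(pointers: dict, path_to_root: list, obj: str):
    """Walk toward Root(obj); stop at the first node holding a pointer."""
    for hops, node in enumerate(path_to_root):
        server = pointers.get(node, {}).get(obj)
        if server is not None:
            return server, node, hops
    return None

pointers = {}
# Server S publishes object O; its route to Root(O) passes n1 and n2.
publish(pointers, ["S", "n1", "n2", "root"], "O", "S")
# Client C routes toward Root(O) and meets the published path at n2.
found = locate(pointers, ["C", "n3", "n2", "root"], "O")
```

Here the client learns the object's server at `n2`, two hops into its route, without travelling all the way to the root.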

17
Handling Failures
  • Routing
  • Store multiple pointers for each routing entry,
    primary and backups
  • When primary fails, promote backups
  • Replication
  • Hash object ID multiple times and publish object
    to multiple roots
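
Both failure-handling ideas can be sketched briefly. The salting scheme (appending ":0", ":1", ... before hashing with SHA-1) and the 4-digit truncation are assumptions for illustration, as are the names `root_ids` and `live_next_hop`.

```python
import hashlib

def root_ids(object_id: str, replicas: int = 3, digits: int = 4) -> list:
    """Derive several root IDs for one object by hashing it with different salts."""
    ids = []
    for salt in range(replicas):
        h = hashlib.sha1(f"{object_id}:{salt}".encode("utf-8")).hexdigest()
        ids.append(h[:digits])
    return ids

def live_next_hop(entry: list, alive: set):
    """A routing entry lists the primary first, then backups; use the first live one."""
    for node in entry:
        if node in alive:
            return node
    return None

roots = root_ids("4388")
hop = live_next_hop(["0154", "0123", "0880"], alive={"0123", "0880"})
```

With the primary `0154` down, the first backup `0123` is used instead; a client that cannot reach one root can retry with another salted root ID.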

18
Surrogate Routing
  • How to determine unique root node for an object
  • Route toward object ID as if node with that ID
    exists
  • Adapt where routing process fails
  • Route around holes
  • When there is no match for next digit, route to
    the next filled entry in the same level of the
    table
  • E.g. if there is no entry for 3, try 4 and then
    5, etc
  • When routing can go no further (the only node
    left at and above the current level is the
    current node), that node is the root
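
The "route around holes" rule for one level of the routing table can be sketched as follows. Wrapping around to digit 0 after the highest digit is an assumption, as are the octal base and the name `surrogate_digit`; a row is modeled as a dict from digit to node ID, with missing digits meaning empty entries.

```python
def surrogate_digit(row: dict, wanted: int, base: int = 8):
    """Return the first filled digit at or after `wanted`, wrapping around.

    Returns None only if the whole row is empty.
    """
    for offset in range(base):
        digit = (wanted + offset) % base
        if row.get(digit) is not None:
            return digit
    return None

# One level of a node's table: entries exist only for digits 0 and 4.
row = {0: "0157", 4: "4388"}
chosen = surrogate_digit(row, 3)   # no entry for 3, so the next filled one is used
```

Because every node applies the same deterministic rule, all routes toward a given object ID converge on the same surrogate root.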

19
Surrogate Routing Example
  • Two copies of object 4388 are published to their
    root node

20
Node Joins
  • Inserting node (0123) into the network
      1. Route to own ID, find 012X nodes, fill last
         column
      2. Request backpointers to 01XX nodes
      3. Measure distance, add to rTable
      4. Prune to nearest K nodes
      5. Repeat steps 2-4

[Figure: new node joining the existing Tapestry network]

21
Summary of Tapestry
  • Proximity routing
      • Leverage flexibility in routing rules
      • For each routing table entry, choose the node that
        satisfies the prefix requirement and is closest
        in network latency
      • Result: end-to-end latency proportional to
        actual IP latency
  • Publish-by-reference
      • Applications choose where to place objects
      • Use application-level knowledge to optimize
        access time
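
The proximity rule for filling a routing table entry can be sketched as below. The candidate set, the latency values, and the name `fill_entry` are hypothetical; `k` plays the role of keeping the nearest K nodes per entry.

```python
def fill_entry(candidates: dict, required_prefix: str, k: int = 1) -> list:
    """Among nodes matching the prefix, keep the k with the lowest latency.

    `candidates` maps node ID to measured latency in milliseconds.
    """
    matching = [n for n in candidates if n.startswith(required_prefix)]
    return sorted(matching, key=lambda n: candidates[n])[:k]

measured = {"0123": 40.0, "0154": 12.5, "2175": 3.0}
entry = fill_entry(measured, "01")        # closest node with prefix 01
entry_k2 = fill_entry(measured, "01", 2)  # primary plus one backup
```

Node `2175` is the closest overall but does not satisfy the prefix, so it is never chosen for this entry; proximity only breaks ties among valid candidates.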