1
A Scalable Content-Addressable Network
  • Sylvia Ratnasamy, Paul Francis, Mark Handley,
    Richard Karp, Scott Shenker
  • UC Berkeley, AT&T

SIGCOMM 2001
2
Introduction
  • CAN (Content-Addressable Network)
  • A distributed infrastructure that provides hash
    table-like functionality on Internet-like scales
  • System Goals
  • Scalable
  • Fault-tolerant
  • Self-organizing

3
Basic Design
  • Basic Idea
  • A virtual d-dimensional coordinate space
    (d-torus)
  • Each node owns a Zone in the virtual space
  • Data is stored as (key, value) pair
  • Hash(key) → a point P in the virtual space
  • The (key, value) pair is stored on the node
    whose zone contains the point P
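
The mapping from key to owning node can be sketched as follows. This is a minimal illustration, not the scheme from the paper: the use of SHA-256, the unit-torus coordinates, and the names `key_to_point`, `zone_contains`, and `owner_of` are all assumptions made for the example.

```python
import hashlib

def key_to_point(key: str, d: int = 2) -> tuple:
    """Map a key to a point in the unit d-torus [0, 1)^d (hypothetical hash scheme)."""
    if d > 8:
        raise ValueError("this sketch takes 4 digest bytes per dimension, so d <= 8")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    coords = []
    for i in range(d):
        chunk = int.from_bytes(digest[4 * i: 4 * i + 4], "big")
        coords.append(chunk / 2**32)  # scale the 32-bit integer into [0, 1)
    return tuple(coords)

def zone_contains(zone, point) -> bool:
    """zone is a list of (low, high) intervals, one per dimension; high is exclusive."""
    return all(lo <= x < hi for (lo, hi), x in zip(zone, point))

def owner_of(zones: dict, point):
    """Return the name of the node whose zone contains the point."""
    for node, zone in zones.items():
        if zone_contains(zone, point):
            return node
    raise LookupError("zones do not cover the space")
```

With two nodes that each own half of a 2-d space, `owner_of(zones, key_to_point("some-key"))` names the node that would store the pair.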

4
Basic Design
  • Each node needs to maintain information only
    about its neighbors (for routing purposes)
  • Individual nodes maintain 2d neighbors

5
Routing in a CAN
  • Greedy algorithm
  • If P is within the Zone of current node, return
    (key, value) or failure (if no such key)
  • Else forward the query to the neighbor with
    coordinates closest to P
  • Average routing path length
  • (d/4)(n^(1/d)) hops
  • for a d-dimensional space
  • partitioned into n equal zones
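
A small simulation can make the greedy rule and the path-length figure concrete. The sketch below assumes an idealized CAN in which the space is a k × ... × k grid of equal zones (so n = k^d) and each zone is identified by integer grid coordinates; the function names are hypothetical.

```python
from itertools import product

def torus_dist(a, b, k):
    """Sum over dimensions of the wrap-around distance on a ring of size k."""
    return sum(min((x - y) % k, (y - x) % k) for x, y in zip(a, b))

def neighbors(cell, k):
    """The 2d cells that differ by +/-1 (mod k) in exactly one dimension."""
    result = []
    for i in range(len(cell)):
        for step in (-1, 1):
            nxt = list(cell)
            nxt[i] = (nxt[i] + step) % k
            result.append(tuple(nxt))
    return result

def greedy_route(src, dst, k):
    """Forward hop by hop to the neighbor closest to dst; return the full path."""
    path = [src]
    while path[-1] != dst:
        here = path[-1]
        path.append(min(neighbors(here, k), key=lambda c: torus_dist(c, dst, k)))
    return path

def average_hops(d, k):
    """Average hop count from a fixed origin to every cell of the k^d grid."""
    cells = list(product(range(k), repeat=d))
    origin = (0,) * d
    return sum(len(greedy_route(origin, c, k)) - 1 for c in cells) / len(cells)
```

For d = 2 and k = 8 (n = 64), the formula gives (2/4)(64^(1/2)) = 4 hops, which `average_hops(2, 8)` reproduces on this idealized grid.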

6
CAN Construction
  • 1. Bootstrap
  • The new node must find a node already in the CAN
  • The bootstrap node, located via DNS, supplies the
    IP addresses of nodes currently in the CAN
  • 2. Finding a Zone
  • Randomly choose one point P in the space
  • Send a JOIN request destined for P

7
CAN Construction
  • 3. Split Zone
  • Assign half of the zone to the new node (split
    along the X dimension first)
  • (key, value) pairs from the half zone are
    transferred to the new node
  • The new node gets the information of neighbors
    from the previous occupant
  • 4. Joining the Routing
  • Notify the neighbors of the reallocation of space
  • Affects only O(d) existing nodes
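
The split step can be sketched as below. The `depth` parameter (how many splits produced the zone, used to cycle through dimensions starting with X) and the `point_of` callback are assumptions introduced for this example.

```python
def in_zone(zone, point):
    """zone is a list of (low, high) intervals, one per dimension; high is exclusive."""
    return all(lo <= x < hi for (lo, hi), x in zip(zone, point))

def split_zone(zone, depth):
    """Halve a zone along dimension (depth mod d): X first, then Y, and so on.

    Returns (kept_half, new_half); the new node receives new_half.
    """
    axis = depth % len(zone)
    lo, hi = zone[axis]
    mid = (lo + hi) / 2
    kept, new = list(zone), list(zone)
    kept[axis] = (lo, mid)
    new[axis] = (mid, hi)
    return kept, new

def partition_pairs(store, new_zone, point_of):
    """Separate the (key, value) pairs whose point falls inside new_zone.

    point_of(key) is assumed to return the key's point in the space.
    Returns (pairs_kept, pairs_moved_to_new_node).
    """
    moved = {k: v for k, v in store.items() if in_zone(new_zone, point_of(k))}
    kept = {k: v for k, v in store.items() if k not in moved}
    return kept, moved
```

Only the previous occupant and its neighbors need to learn about the two resulting zones, which is why a join touches O(d) nodes.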

9
Node Departure
  • Explicit departure
  • Hand over its zone to another node to produce a
    valid single zone
  • Node failure
  • Periodic update messages between neighbors
  • Prolonged absence of an update message from a
    neighbor indicates its failure
  • A takeover mechanism merges the failed node's
    zone into a neighbor's zone
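
Failure detection by missed updates can be sketched as follows. The timeout value, the data layout, and the policy of letting the neighbor with the smallest zone take over are assumptions made for this illustration; the slide does not spell out how the takeover node is chosen.

```python
def failed_neighbors(last_update, now, timeout):
    """Return neighbors whose last update is older than `timeout` seconds.

    last_update maps neighbor name -> timestamp of its most recent update.
    """
    return sorted(n for n, t in last_update.items() if now - t > timeout)

def pick_takeover(volumes):
    """Choose which surviving neighbor takes over the failed zone.

    volumes maps neighbor name -> volume of its current zone.
    Assumption: the neighbor with the smallest zone volume takes over.
    """
    return min(volumes, key=volumes.get)
```

A node would call `failed_neighbors` periodically and start the takeover for each neighbor it reports.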

11
Design Improvements
  • Reduce path length
  • Reduce path latency
  • Increase fault tolerance
  • Increase data availability
  • Trade-off: improved routing performance and
    system robustness vs. per-node state and system
    complexity

12
Design Improvements
  • Multi-dimensional coordinate spaces
  • Increase the dimensions of the virtual space
  • Reduce the routing path length: O(d n^(1/d))
  • Increase the size of the routing table: O(d)
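
The trade-off between path length and routing-table size can be tabulated from the formulas on the earlier slides: (d/4)(n^(1/d)) hops on average and 2d neighbors per node. The function names below are hypothetical, and the figures assume n equal zones.

```python
def avg_path_length(n, d):
    """Average routing path length (d/4) * n^(1/d) for n equal zones."""
    return (d / 4) * n ** (1 / d)

def tradeoff_table(n, dims):
    """Rows of (dimensions, neighbors per node, average path length)."""
    return [(d, 2 * d, avg_path_length(n, d)) for d in dims]

if __name__ == "__main__":
    for d, table_size, hops in tradeoff_table(4096, [2, 3, 4]):
        print(f"d={d}  neighbors={table_size}  avg hops={hops:.1f}")
```

For n = 4096, the average path length falls from about 32 hops at d = 2 to about 12 at d = 3 and 8 at d = 4, while the neighbor count grows only from 4 to 8.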

13
Design Improvements
  • Multiple realities: multiple coordinate spaces
  • Improve data availability
  • Improve routing fault tolerance
  • Reduce the average path length
  • Increase the size of the routing table: O(r)

14
Design Improvements
  • Multiple Dimensions vs. Multiple Realities
  • Multiple dimensions: shorter path length
  • Multiple realities: improve data availability

15
Design Improvements
  • Better CAN routing metrics
  • When there is more than one choice for
    forwarding, choose the neighbor with the lowest
    RTT (RTT-weighted routing)
  • Reduce the per-hop latency (by 24-40%)
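
The selection rule described on this slide can be sketched as follows: among the neighbors that make progress toward the destination, pick the one with the lowest measured RTT. The tuple layout and the function name are assumptions made for this example.

```python
def next_hop_by_rtt(current_dist, neighbors):
    """Pick the forwarding neighbor with the lowest RTT.

    neighbors: list of (name, distance_to_target, rtt_ms) tuples.
    Only neighbors closer to the target than the current node qualify,
    so every hop still makes progress. Returns None if none qualifies.
    """
    progressing = [n for n in neighbors if n[1] < current_dist]
    if not progressing:
        return None
    return min(progressing, key=lambda n: n[2])[0]
```

Plain greedy routing would instead always pick the neighbor with the smallest remaining distance, regardless of how slow the link to it is.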

16
Design Improvements
  • Overload coordinate zones
  • Assign more than one node to share the same zone
  • Reduce average path length
  • Reduce the per-hop latency
  • Improve fault tolerance

17
Design Improvements
  • Multiple hash functions
  • Replicate a single (key, value) pair at k
    distinct nodes in the system (use k different
    hash functions)
  • Increase data availability
  • Reduce the query latency
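
Replication with k hash functions can be sketched as below. Deriving the k functions by salting SHA-256 with an index is an assumption of this example, as are the names `replica_points` and `closest_replica`; the slide states only that k different hash functions are used.

```python
import hashlib

def replica_points(key, k, d=2):
    """Map one key to k points in the unit d-torus, one per hash function.

    Assumption: hash function i is SHA-256 applied to "i:key" (d <= 8).
    """
    points = []
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        point = tuple(
            int.from_bytes(digest[4 * j: 4 * j + 4], "big") / 2**32
            for j in range(d)
        )
        points.append(point)
    return points

def closest_replica(origin, points):
    """Return the replica point nearest to origin, using wrap-around distance."""
    def dist(a, b):
        return sum(min(abs(x - y), 1 - abs(x - y)) ** 2 for x, y in zip(a, b))
    return min(points, key=lambda p: dist(origin, p))
```

A querying node can send its request toward the nearest of the k points, which is one way the scheme can reduce query latency.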

18
Design Improvements
  • Uniform Partitioning
  • Using 1-hop volume check
  • Achieve load balancing
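
A 1-hop volume check can be sketched as follows: when a JOIN request arrives, the zone's owner compares its own volume with those of its immediate neighbors, and the largest zone is the one that gets split. The data layout and function names are assumptions made for this example.

```python
from math import prod

def volume(zone):
    """Volume of a zone given as a list of (low, high) intervals."""
    return prod(hi - lo for lo, hi in zone)

def zone_to_split(owner, neighbor_names, zones):
    """Return the node with the largest zone among the owner and its neighbors.

    zones maps node name -> zone. On a tie the owner (listed first) is kept.
    """
    candidates = [owner] + list(neighbor_names)
    return max(candidates, key=lambda name: volume(zones[name]))
```

Splitting the largest nearby zone instead of the one that happened to receive the request keeps zone volumes, and therefore load, closer to uniform.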

19
Design Improvements
  • Caching and Replication
  • Caching: a node maintains a cache of the data
    keys it recently accessed
  • Replication: a popular data item is replicated
    within a region surrounding the original storage
    node
  • Increase data availability
  • Reduce query latency
  • Achieve load balancing

20
Design Review
  • d: dimensionality
  • r: number of realities
  • p: number of peer nodes per zone
  • k: number of hash functions
  • System size: n = 2^18
  • Topology: Transit-Stub
  • Latency: (100, 10, 1)
21
Conclusions
  • CAN provides scalable routing and efficient
    indexing
  • CAN is completely self-organizing and
    fault-tolerant
  • Open problem: DoS attacks