CS 268: Lecture 22 DHT Applications

1
CS 268: Lecture 22 DHT Applications
Ion Stoica
Computer Science Division, Department of Electrical Engineering and Computer Sciences
University of California, Berkeley
Berkeley, CA 94720-1776
(Presentation based on slides from Robert Morris and Sean Rhea)
2
Outline

Cooperative File System (CFS)
Open DHT

3
Target CFS Uses
[Diagram: several nodes connected through the Internet]

Serving data with inexpensive hosts
open-source distributions
off-site backups
tech report archive
efficient sharing of music

4
How to mirror open-source distributions?

Multiple independent distributions
Each has high peak load, low average
Individual servers are wasteful
Solution: aggregate
Option 1: single powerful server
Option 2: distributed service
But how do you find the data?

5
Design Challenges

Avoid hot spots
Spread storage burden evenly
Tolerate unreliable participants
Fetch speed comparable to whole-file TCP
Avoid O(participants) algorithms
Centralized mechanisms [Napster], broadcasts [Gnutella]
CFS solves these challenges

6
CFS Architecture
[Diagram: two nodes connected through the Internet, each with a client and a server]

Each node is a client and a server
Clients can support different interfaces
File system interface
Music key-word search

7
Client-server interface
[Diagram: an FS client on one node issues "insert file f" and "lookup file f"; these become "insert block" and "lookup block" requests to servers on the nodes]

Files have unique names
Files are read-only (single writer, many readers)
Publishers split files into blocks
Clients check files for authenticity
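
The publisher and client roles listed above can be sketched as follows. This is only an illustration: the slides do not say how authenticity is checked, so the sketch assumes each block is named by the SHA-1 hash of its content. `BLOCK_SIZE`, `split_into_blocks`, and `verify_block` are hypothetical names, not part of CFS as presented here.

```python
import hashlib

BLOCK_SIZE = 8192  # assumed size in bytes; the slides do not give one


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Publisher side: return [(block_id, chunk), ...] in file order."""
    blocks = []
    for offset in range(0, len(data), block_size):
        chunk = data[offset:offset + block_size]
        # Assumption: the block ID is the SHA-1 of the block's content.
        blocks.append((hashlib.sha1(chunk).hexdigest(), chunk))
    return blocks


def verify_block(block_id, chunk):
    """Client side: recompute the hash and compare it with the block ID."""
    return hashlib.sha1(chunk).hexdigest() == block_id
```

Under this assumption a client needs no trust in the server that returned a block: a block that does not hash to its own ID is rejected.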

8
Server Structure
[Diagram: Node 1 and Node 2, each running DHash layered on Chord]

DHash stores, balances, replicates, caches blocks
DHash uses Chord [SIGCOMM 2001] to locate blocks

9
Chord Hashes a Block ID to its Successor
[Diagram: circular ID space; each node ID is shown with the block IDs it stores]
N10: B112, B120, ..., B10
N32: B11, B30
N60: B33, B40, B52
N80: B65, B70
N100: B100

Nodes and blocks have randomly distributed IDs
Successor: the node with the next highest ID
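
The block-to-node mapping on this slide can be sketched in a few lines. The node IDs are the ones shown in the slide's ring; the function name `successor` and the use of a sorted list are choices made for this sketch only.

```python
import bisect

NODE_IDS = [10, 32, 60, 80, 100]  # node IDs from the slide's ring


def successor(block_id, node_ids=NODE_IDS):
    """Return the first node ID >= block_id, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, block_id)
    return ring[i % len(ring)]  # past the last node, wrap to the first
```

With these IDs, blocks 33, 40, and 52 map to N60, and blocks 112 and 120 wrap around to N10, matching the slide.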

10
DHash/Chord Interface
[Diagram: the DHash server calls Lookup(blockID) on Chord and receives a list of <node ID, IP address>; Chord keeps a finger table with <node ID, IP address> entries]

lookup() returns a list of node IDs closer in ID space to the block ID
Sorted, closest first

11
DHash Uses Other Nodes to Locate Blocks
[Diagram: Chord ring with nodes N5, N10, N20, N40, N50, N60, N68, N80, N99, N110; the lookup of block ID 45 proceeds in steps numbered 1 to 3]
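
A minimal sketch of a finger-table lookup on the ring from this slide. Several things here are assumptions, not taken from the slides: a 7-bit ID space (chosen because the largest ID shown is 110), fingers at distances that are powers of two as in the Chord paper, N80 as the starting node, and all function names.

```python
import bisect

M = 7  # assumed number of ID bits, so IDs live in [0, 128)
NODES = sorted([5, 10, 20, 40, 50, 60, 68, 80, 99, 110])


def successor(ident):
    """First node at or after ident, wrapping around the ring."""
    i = bisect.bisect_left(NODES, ident % 2**M)
    return NODES[i % len(NODES)]


def fingers(n):
    """Finger i of node n points at successor(n + 2**i)."""
    return [successor(n + 2**i) for i in range(M)]


def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero


def lookup(start, block_id):
    """Return (node responsible for block_id, nodes visited on the way)."""
    n, path = start, [start]
    while True:
        succ = successor(n + 1)  # n's immediate successor
        if in_open(block_id, n, succ) or block_id == succ:
            return succ, path
        # Forward to the farthest finger that still precedes block_id.
        nxt = next((f for f in reversed(fingers(n))
                    if in_open(f, n, block_id)), succ)
        n = nxt
        path.append(n)
```

Starting at N80, `lookup(80, 45)` visits N80, N20, and N40 and returns N50, the successor of block 45 on this ring.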
12
Storing Blocks
[Diagram: a server's disk, holding a cache and long-term block storage]

Long-term blocks are stored for a fixed time
Publishers need to refresh periodically
Cache uses LRU
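
The two storage areas described above can be sketched as follows. The class names, the explicit `now` argument (used instead of a wall clock so the example is deterministic), and the capacity and lifetime values are all assumptions made for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Cache that evicts the least recently used block when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, block_id):
        if block_id not in self.blocks:
            return None
        self.blocks.move_to_end(block_id)  # mark as most recently used
        return self.blocks[block_id]

    def put(self, block_id, data):
        self.blocks[block_id] = data
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # drop least recently used


class LongTermStore:
    """Store whose blocks expire unless the publisher refreshes them."""

    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.blocks = {}  # block_id -> (data, expiry time)

    def insert(self, block_id, data, now):
        # A refresh is simply a repeated insert that extends the expiry.
        self.blocks[block_id] = (data, now + self.lifetime)

    def get(self, block_id, now):
        entry = self.blocks.get(block_id)
        if entry is None or entry[1] <= now:
            self.blocks.pop(block_id, None)  # expired: reclaim the space
            return None
        return entry[0]
```

A publisher that stops refreshing a block lets it age out, so no explicit delete operation is needed in this sketch.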

13
Replicate blocks at r successors
[Diagram: Chord ring with nodes N5, N10, N20, N40, N50, N60, N68, N80, N99, N110; Block 17 is shown on the ring]

Node IDs are SHA-1 of IP Address
Ensures independent replica failure
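
A sketch of the two ideas on this slide: deriving a node ID from an IP address with SHA-1, and picking the r nodes that follow a block on the ring. The string encoding of the IP address, the function names, and the choice to count the block's immediate successor as the first of the r replicas are assumptions of this sketch.

```python
import bisect
import hashlib


def node_id(ip):
    """160-bit node ID: SHA-1 of the IP address string (encoding assumed)."""
    return int.from_bytes(hashlib.sha1(ip.encode("ascii")).digest(), "big")


def replica_nodes(block_id, node_ids, r):
    """Return the first r nodes at or after block_id, wrapping the ring."""
    ring = sorted(node_ids)
    start = bisect.bisect_left(ring, block_id)
    return [ring[(start + k) % len(ring)] for k in range(min(r, len(ring)))]
```

On the slide's ring, `replica_nodes(17, [5, 10, 20, 40, 50, 60, 68, 80, 99, 110], 3)` gives N20, N40, and N50. Because hashed IDs scatter hosts around the ring, ring neighbors are unlikely to share a network or site, which is the reason the slide gives for independent replica failure.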

14
Lookups find replicas
[Diagram: Chord ring with nodes N5, N10, N20, N40, N50, N60, N68, N80, N99, N110; the lookup of block ID 17 is shown as RPCs numbered 1 to 4]
RPCs: 1. Lookup step, 2. Get successor list, 3. Failed block fetch, 4. Block fetch
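
The last two RPCs on this slide (a failed block fetch followed by a successful one) can be sketched as a loop over the successor list. The data structures are hypothetical: `alive` is a set of reachable node IDs and `stores` maps a node ID to the blocks it holds. Which node fails in the slide's diagram is not recoverable from the transcript, so N20 is picked arbitrarily in the usage note.

```python
def fetch_block(block_id, successors, alive, stores):
    """Try each replica holder in ring order, skipping nodes that fail.

    Returns (data, nodes tried); data is None if every attempt failed.
    """
    tried = []
    for node in successors:
        tried.append(node)
        if node in alive and block_id in stores.get(node, {}):
            return stores[node][block_id], tried
    return None, tried
```

For example, with successors `[20, 40, 50]` all holding block 17 and N20 down, the fetch is attempted at N20, fails, and succeeds at N40.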
15
First Live Successor Manages Replicas
[Diagram: Chord ring with nodes N5, N10, N20, N40, N50, N60, N68, N80, N99, N110; Block 17 and a copy of block 17 are shown on the ring]

A node can locally determine that it is the first live successor

16
DHash Copies to Caches Along Lookup Path
[Diagram: Chord ring with nodes N5, N10, N20, N40, N50, N60, N68, N80, N99, N110; the lookup of block ID 45 is shown as RPCs numbered 1 to 4]
RPCs: 1. Chord lookup, 2. Chord lookup, 3. Block fetch, 4. Send to cache
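
A minimal sketch of caching along the lookup path, following the four RPCs on this slide. The names are hypothetical, plain dictionaries stand in for the per-node LRU caches, and the path `[20, 40]` with owner N50 is an assumed example rather than the route drawn in the diagram.

```python
def fetch_and_cache(block_id, path, owner, stores, caches):
    """Fetch a block, then leave copies on the nodes the lookup visited.

    Returns (data, node that served the block).
    """
    # A cached copy on the lookup path ends the lookup early.
    for node in path:
        if block_id in caches.get(node, {}):
            return caches[node][block_id], node
    data = stores[owner][block_id]  # block fetch from the responsible node
    for node in path:               # send to cache on each visited node
        caches.setdefault(node, {})[block_id] = data
    return data, owner
```

After one fetch through N20 and N40, a later lookup whose path crosses either node is served from that node's cache instead of reaching the owner.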
17
Caching at Fingers Limits Load
[Diagram: node N32]

Only O(log N) nodes have fingers pointing to N32
This limits the single-block load on N32

18
Virtual Nodes Allow Heterogeneity
[Diagram: virtual node IDs N5, N10, N60, N101 belonging to two hosts, Node A and Node B]

Hosts may differ in disk/net capacity
Hosts may advertise multiple IDs
Chosen as SHA-1(IP Address, index)
Each ID represents a virtual node
Host load is proportional to its number of virtual nodes
Manually controlled
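
The ID rule above, SHA-1(IP address, index), can be sketched as follows. How the address and the index are combined before hashing is not stated in the slides; joining them with a comma is an assumption, as is the function name.

```python
import hashlib


def virtual_node_ids(ip, count):
    """Return one 160-bit ID per virtual node hosted at this IP address."""
    ids = []
    for index in range(count):
        # Assumed encoding of the (IP address, index) pair.
        material = f"{ip},{index}".encode("ascii")
        ids.append(int.from_bytes(hashlib.sha1(material).digest(), "big"))
    return ids
```

A host with more disk or network capacity would be configured with a larger `count`, so it owns more points on the ring and receives a proportionally larger share of blocks.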

19
Why Blocks Instead of Files?

Cost: one lookup per block
Can tailor cost by choosing good block size
Benefit: load balance is simple
For large files
Storage cost of large files is spread out
Popular files are served in parallel
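
The cost side of this trade-off is simple arithmetic: one lookup per block means the number of lookups is the file size divided by the block size, rounded up. The function name and the sizes in the usage note are illustrative assumptions, not figures from the lecture.

```python
def lookups_needed(file_size, block_size):
    """Number of block lookups needed to fetch a file of file_size bytes."""
    return -(-file_size // block_size)  # ceiling division on integers
```

For a 1 MiB file, 8 KiB blocks cost 128 lookups while 64 KiB blocks cost 16. Larger blocks mean fewer lookups but coarser load spreading.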

20
Outline

Cooperative File System (CFS)
Open DHT

21
Questions

How many DHTs will there be?
Can all applications share one DHT?

22
Benefits of Sharing a DHT

Amortizes costs across applications
Maintenance bandwidth, connection state, etc.
Facilitates bootstrapping of new applications
Working infrastructure already in place
Allows for statistical multiplexing of resources
Takes advantage of spare storage and bandwidth
Facilitates upgrading existing applications
Share DHT between application versions

23
The DHT as a Service
[Diagram, built up over slides 23 to 27: OpenDHT clients using OpenDHT as a shared service]
What is this interface?
28
It's not lookup()
lookup(k)