Title: A Locality Preserving Decentralized File System
Slide 1: A Locality Preserving Decentralized File System
- Jeffrey Pang
- Haifeng Yu
- Phil Gibbons
- Michael Kaminsky
- Srini Seshan
Slide 2: Project Intro
- Defragmenting the DHT data layout for:
  - Improved availability for entire tasks
  - Amortized data lookup latency
- [Figure: current DHT data layout (random placement) vs. defragmented DHT data layout (sequential placement)]
- Typical task/operation sizes:
  - 30-65% access >10 8KB blocks
  - 8-30% access >100 8KB blocks
Slide 3: Background
- Existing DHT storage systems (see the sketch below):
  - Each server is responsible for a pseudo-random range of the ID space
  - Objects are given pseudo-random IDs
- [Figure: DHT ring with servers owning ID ranges (e.g., 150-210, 211-400, 401-513, 800-999) and objects assigned pseudo-random IDs (e.g., 160, 324, 987)]
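The slide summarizes how existing DHT storage systems place data; below is a minimal sketch of that pseudo-random placement, assuming a Chord-style ring with SHA-1 keys and successor lookup. The server names and the example blocks are made up for illustration.

```python
# Minimal sketch of pseudo-random placement in existing DHT storage systems,
# assuming a Chord-style ring: object ID = SHA-1(name), and the object lives
# on the first server whose ID follows it clockwise. Names are illustrative.
import hashlib
from bisect import bisect_left

def object_id(name: str) -> int:
    """Pseudo-random 160-bit ID: the SHA-1 hash of the object name."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def responsible_server(obj_id: int, server_ids: list[int]) -> int:
    """Successor lookup: the first server ID at or after the object ID."""
    servers = sorted(server_ids)
    i = bisect_left(servers, obj_id)
    return servers[i % len(servers)]  # wrap around the ring

# Blocks of the same file land on unrelated servers under random placement:
servers = [object_id(f"server-{n}") for n in range(8)]
for block in ("/home/bob/docs/paper.txt#0", "/home/bob/docs/paper.txt#1"):
    print(block, "->", hex(responsible_server(object_id(block), servers))[:12])
```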
Slide 4: Project Overview
- Goal: produce a decentralized, read-mostly filesystem with the following properties:
  - Sequential layout of related data
  - Amortized lookup latency
  - Improved availability
- Some challenges:
  - Load balancing
  - Download throughput
- Project focus:
  - System design and implementation
- [Figure: current DHT data layout (random placement) vs. defragmented DHT data layout (sequential placement)]
Slide 5: Overview
- Background & Motivation
- Preserving Object Locality
- Dynamic Load Balancing
- Results
- Future Work
Slide 6: Preserving Object Locality
- Motivation:
  - Fate sharing: all objects in a single operation are more likely to be available at once
  - Effective caching/prefetching: servers I've contacted recently are more likely to have what I want next
- Design options:
  - Namespace locality (e.g., filesystem hierarchy)
  - Dynamic clustering (e.g., based on observed access patterns)
Slide 7: Is Namespace Locality Good Enough?
- Initial trace evaluation
- Workloads:
  - HP block-level disk trace (1999)
  - Harvard research NFS trace (2003)
  - NLANR webcache trace (2003)
- Setup:
  - Order files alphabetically by filepath
  - 10,000 data blocks/server
  - Calculate the failure probability of each operation (see the sketch below)
  - Node failure probability of 5%
  - 3 replicas
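The slides do not spell out the availability calculation, so the sketch below shows one plausible reading of it: an operation fails if any block it touches has all 3 replicas on failed servers, with independent 5% node failures. The grouping of blocks into replica groups is an assumption, not necessarily the authors' exact method.

```python
# Hedged sketch of the per-operation availability estimate, assuming an
# operation fails if ANY block it touches has all 3 replicas on failed
# servers, with independent node failures (p = 0.05). One plausible reading
# of the slide, not necessarily the authors' exact calculation.
P_FAIL = 0.05      # node failure probability (5%)
REPLICAS = 3

def op_failure_prob(num_replica_groups: int) -> float:
    """Failure probability of an operation touching `num_replica_groups`
    distinct replica groups: it succeeds only if every group has at least
    one live replica, and a group is fully down with probability p^r."""
    p_group_down = P_FAIL ** REPLICAS                 # 0.05^3 = 1.25e-4
    return 1.0 - (1.0 - p_group_down) ** num_replica_groups

# Sequential layout packs an operation's blocks into few replica groups,
# while random placement scatters them over many:
print(op_failure_prob(1))    # ~1.3e-4  (all blocks co-located)
print(op_failure_prob(100))  # ~1.2e-2  (blocks spread over 100 groups)
```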
Slide 8: Estimated Availability Across Workloads
Slide 9: Encoding Object Names
- [Figure: traditional DHT key encoding: the 160-bit key is SHA1(data), i.e., the SHA-1 hash of the data]
- Leverage:
  - Large key space (amortized cost over the wide area is minimal)
  - Workload properties (e.g., 99% of the time, directory depth < 12)
- Corner cases:
  - Depth or width overflow: use 1 bit to signal the overflow region and just use SHA1(filepath)
Slide 10: Encoding Object Names
- [Figure: example encoding in which each key is built from a userid field, a path-encoding field, and a blockid field. User "Bill" (userid 6) and user "Bob" (userid 7) get distinct key prefixes, so Bill's files (e.g., under his "Docs" directory) map to contiguous ID ranges such as 570-600, 601-660, and 661-700. A sketch of such an encoding follows below.]
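The slides give only the field layout (userid, path encoding, blockid) and the overflow rule from slide 9; the sketch below shows one way such a locality-preserving key might be packed into a 160-bit ID. All field widths, the depth/width limits, and the function names are illustrative assumptions, not the authors' actual format.

```python
# Hedged sketch of a locality-preserving key: pack a userid, per-level
# directory indices, and a block id into a fixed-width integer so that blocks
# of a file, and files in a directory, get adjacent keys. Field widths,
# depth/width limits, and names are assumptions, not the authors' format.
import hashlib

KEY_BITS, LEVEL_BITS, MAX_DEPTH, BLOCK_BITS = 160, 8, 12, 32
OVERFLOW_FLAG = 1 << (KEY_BITS - 1)   # 1 bit signals the overflow region

def encode_key(userid: int, dir_indices: list[int], block_id: int, path: str) -> int:
    """Return a key that sorts by (user, directory path, block id)."""
    too_deep = len(dir_indices) > MAX_DEPTH
    too_wide = any(i >= (1 << LEVEL_BITS) for i in dir_indices)
    if too_deep or too_wide:
        # Depth or width overflow: fall back to SHA1(filepath) in the overflow region.
        h = int.from_bytes(hashlib.sha1(path.encode()).digest(), "big")
        return OVERFLOW_FLAG | (h >> 1)
    key = userid
    for level in range(MAX_DEPTH):                    # fixed depth keeps keys comparable
        idx = dir_indices[level] if level < len(dir_indices) else 0
        key = (key << LEVEL_BITS) | idx
    return (key << BLOCK_BITS) | block_id             # adjacent blocks -> adjacent keys

# Blocks of the same file (and files in the same directory) get nearby keys:
print(encode_key(6, [1, 0], 1, "/Bill/Docs/a.txt"))
print(encode_key(6, [1, 0], 2, "/Bill/Docs/a.txt"))
```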
Slide 11: Dynamic Load Balancing
- Motivation:
  - The hash function is no longer uniform
  - Uniform ID assignment to nodes leads to load imbalance
- Design options:
  - Simple item balancing (MIT)
  - Mercury (CMU)
- [Figure: storage load vs. node number, showing load balance with 1024 nodes using the Harvard trace]
Slide 12: Load Balancing Algorithm
- Basic idea (see the sketch below):
  - Contact a random node in the ring
  - If myLoad > delta * hisLoad (or vice versa), the lighter node changes its ID to move just before the heavier node; the heavy node's load splits in two
  - Node loads are within a factor of 4 of each other after O(log n) steps
- Mercury optimizations:
  - Continuous sampling of load around the ring
  - Use the estimated load histogram to do informed probes
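A minimal sketch of one item-balancing step as described above, assuming node load is the number of stored items and a node's ID equals the largest key it owns; DELTA and the data structures are illustrative, not the authors' implementation.

```python
# Hedged sketch of one item-balancing step as described on the slide: a light
# node re-joins just before a heavy node and takes half of its items. Load is
# taken to be the item count, and a node's ID is assumed to equal the largest
# key it owns; DELTA and the data structures are illustrative.
import random

DELTA = 2.0  # imbalance threshold (assumed value)

def balance_step(nodes: dict[int, list[int]]) -> None:
    """nodes maps node ID -> sorted list of item keys it currently stores."""
    a, b = random.sample(sorted(nodes), 2)            # contact a random peer
    heavy, light = (a, b) if len(nodes[a]) >= len(nodes[b]) else (b, a)
    if len(nodes[heavy]) <= DELTA * max(len(nodes[light]), 1):
        return                                        # loads already within threshold
    ring = sorted(nodes)
    successor = ring[(ring.index(light) + 1) % len(ring)]
    old_items = nodes.pop(light)
    nodes[successor] = sorted(nodes[successor] + old_items)  # hand off old items
    items = nodes[heavy]
    mid = len(items) // 2
    # The light node changes its ID to sit just before the heavy node,
    # taking the first half of its items: the heavy node's load splits in two.
    nodes[heavy] = items[mid:]
    nodes[items[mid - 1]] = items[:mid]
```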
Slide 13: Handling Temporary Resource Constraints
- Drastic storage distribution changes can cause frequent data movement
- Node storage can be temporarily constrained (e.g., no more disk space)
- Solution: lazy data movement (see the sketch below)
  - The node responsible for a key keeps a pointer to the actual data blocks
  - Data blocks can be stored anywhere in the system
Slide 14: Handling Temporary Resource Constraints
- [Figure: a WRITE reaches the node responsible for the key, which records a pointer while the data blocks themselves are stored on other nodes]
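The slides state only that the responsible node keeps a pointer to the actual data blocks; the sketch below illustrates that indirection with an assumed two-step lookup. The class, method, and host names are hypothetical.

```python
# Hedged sketch of lazy data movement via indirection: the node responsible
# for a key stores only a small pointer record, so the bulky data block can
# live on any node with spare disk and be migrated later. The class, method,
# and host names are hypothetical.
from dataclasses import dataclass

@dataclass
class Pointer:
    block_key: int
    holder: str            # address of the node actually storing the block

class OwnerNode:
    """Node responsible for a key range; its own disk may be full."""
    def __init__(self) -> None:
        self.pointers: dict[int, Pointer] = {}

    def write(self, key: int, holder: str) -> None:
        # Record only the pointer; the block itself was written to `holder`.
        self.pointers[key] = Pointer(key, holder)

    def lookup(self, key: int) -> str:
        # Clients resolve key -> holder here, then fetch the block directly.
        return self.pointers[key].holder

owner = OwnerNode()
owner.write(601, holder="node-17.example.net")   # block stored elsewhere
print(owner.lookup(601))                         # -> node-17.example.net
```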
Slide 15: Results
- How much improvement in availability and lookup latency can we expect?
- Setup:
  - Trace-based simulation with the Harvard trace
  - File blocks named using our encoding scheme
  - Same availability calculation as before
  - Clients keep open connections to the 1-100 most recently contacted data servers (see the sketch below)
  - 1024 servers
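As a rough illustration of how keeping connections to recently contacted servers amortizes lookups, the sketch below counts DHT lookups for a trace of per-block responsible servers using an LRU set of K open connections. K and the accounting are assumptions about the simulation.

```python
# Hedged sketch of how "potential reduction in lookups" could be measured:
# a client keeps open connections to the K most recently contacted servers,
# and a DHT lookup is only needed when the responsible server isn't cached.
# K and the accounting below are assumptions about the simulation.
from collections import OrderedDict

def count_lookups(server_sequence: list[str], k: int) -> int:
    """Number of DHT lookups for a trace of per-block responsible servers."""
    recent: OrderedDict[str, None] = OrderedDict()   # LRU set of open connections
    lookups = 0
    for server in server_sequence:
        if server in recent:
            recent.move_to_end(server)               # connection reused, no lookup
        else:
            lookups += 1                             # must look up and connect
            recent[server] = None
            if len(recent) > k:
                recent.popitem(last=False)           # close least recently used
    return lookups

# Sequential layout keeps consecutive blocks on the same few servers:
print(count_lookups(["s1"] * 50 + ["s2"] * 50, k=10))            # 2 lookups
print(count_lookups([f"s{i % 40}" for i in range(100)], k=10))   # 100 lookups
```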
Slide 16: Potential Reduction in Lookups
Slide 17: Potential Availability Improvement
- [Figure: failure probability under random (expected), ordered (uniform), and optimal placement]
- Our encoding has nearly the same failure probability as the alphabetical ordering (differs by 0.0002)
Slide 18: Results
- What is the overhead of load balancing?
- Setup:
  - Simulated load balancing with the Harvard trace
  - 1024 servers
  - Each load balancing step uses a histogram estimated from 4 random samples (see the sketch below)
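The slide mentions a load histogram estimated from 4 random samples (the Mercury-style informed probes from slide 12); below is a rough sketch of such an estimate. The bucketing of the ID space and the averaging are assumptions.

```python
# Rough sketch of estimating the ring-wide load distribution from a few random
# probes (the slide says 4), in the spirit of Mercury's informed probes.
# The ID-space bucketing and averaging are assumptions.
import random

def sampled_load_histogram(node_loads: dict[int, int], id_space: int,
                           probes: int = 4, buckets: int = 8) -> list[float]:
    """Average sampled load per ID-space bucket; heavy buckets tell a light
    node where re-joining would relieve the most load."""
    sums, counts = [0.0] * buckets, [0] * buckets
    for node_id in random.sample(sorted(node_loads), probes):
        b = min(node_id * buckets // id_space, buckets - 1)   # region of the ring
        sums[b] += node_loads[node_id]
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```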
Slide 19: Load Balance Over Time
Slide 20: Data Migration Overhead
Slide 21: Related Work
- Namespace locality:
  - Cylinder group allocation (FFS)
  - Co-locating data and meta-data (C-FFS)
  - Isolating user data in clusters (Archipelago)
  - Namespace flattening in object-based storage (Self-*)
- Load balancing / data indirection:
  - DHT item balancing (SkipNets, Mercury)
  - Data indirection (Total Recall)
Slide 22: (Near) Future Work
- Finish up the implementation
  - Currently finished: data block storage/retrieval, data indirection, some load balancing
  - Still requires:
    - Some debugging
    - Interfacing with NFS via filesystem loopback
    - Evaluation on real testbeds (Emulab and PlanetLab)
- Targeting submission to NSDI 2006
  - Mid-October deadline