Transcript and Presenter's Notes

Title: PETAL: DISTRIBUTED VIRTUAL DISKS


1
PETAL: DISTRIBUTED VIRTUAL DISKS
  • E. K. Lee, C. A. Thekkath
  • DEC SRC

2
Highlights
  • Paper presents a distributed storage management
    system
  • Petal consists of a collection of
    network-connected servers that cooperatively
    manage a pool of physical disks
  • Clients see Petal as highly available
    block-level storage partitioned into virtual
    disks

3
Introduction
  • Petal is a distributed storage system that
  • Tolerates single component failures
  • Can be geographically distributed to tolerate
    site failures
  • Transparently reconfigures to expand in
    performance or capacity
  • Uniformly balances load and capacity
  • Provides fast efficient support for backup and
    recovery

4
Petal User Interface
  • Petal appears to its clients as a collection of
    virtual disks
  • Block-level interface
  • Lower-level service than a DFS
  • Makes the system easier to model, design,
    implement, and tune
  • Can support heterogeneous clients and applications
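
  A minimal sketch (in Python) of what such a block-level virtual disk
  interface can look like; the class name, method names, block size, and
  the in-memory backing store are illustrative assumptions, not Petal's
  actual RPC interface:

    # Illustrative block-level interface; an in-memory dict stands in
    # for the pool of physical disks managed by the Petal servers.
    class VirtualDisk:
        BLOCK_SIZE = 512                      # assumed block size

        def __init__(self, vdisk_id):
            self.vdisk_id = vdisk_id          # virtual disk identifier
            self.blocks = {}                  # block number -> bytes

        def read(self, block, nblocks=1):
            empty = b"\x00" * self.BLOCK_SIZE
            return b"".join(self.blocks.get(block + i, empty)
                            for i in range(nblocks))

        def write(self, block, data):
            # Store the data one BLOCK_SIZE chunk at a time.
            for i in range(0, len(data), self.BLOCK_SIZE):
                self.blocks[block + i // self.BLOCK_SIZE] = data[i:i + self.BLOCK_SIZE]

    # A file system (NTFS, ext2, ...) would sit on top of this interface,
    # just as it would on a locally attached disk.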

5
Client view
[Diagram: client machines running NTFS and EXT2 FS access Petal virtual disks over a scalable network]
6
Physical view
[Diagram: the same clients connect over the scalable network to the Petal servers and the physical disks they manage]
7
Petal Server Modules
  • Global State Module
  • Recovery Module
  • Liveness Module
  • Virtual-to-Physical Translation Module
  • Data Access Module
8
Overall design (I)
  • All state information is maintained on servers
  • Clients maintain only hints
  • Liveness module ensures that all servers
    agree on the system's operational status
  • Uses majority consensus and periodic exchanges
    of "I'm alive" / "You're alive?" messages
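
  A minimal sketch of the heartbeat and majority-consensus idea above; the
  timeout value, class name, and message handling are assumptions for
  illustration, not Petal's actual liveness protocol:

    import time

    HEARTBEAT_TIMEOUT = 5.0                   # seconds; illustrative value

    class LivenessModule:
        def __init__(self, my_id, all_servers):
            self.my_id = my_id
            self.all_servers = set(all_servers)
            self.last_heard = {s: 0.0 for s in all_servers}

        def on_heartbeat(self, sender):
            # Called when an "I'm alive" / "You're alive?" message arrives.
            self.last_heard[sender] = time.time()

        def live_servers(self):
            now = time.time()
            return {s for s in self.all_servers
                    if s == self.my_id
                    or now - self.last_heard[s] < HEARTBEAT_TIMEOUT}

        def has_majority(self):
            # Servers agree the system is operational only while a
            # majority of them are reachable.
            return len(self.live_servers()) > len(self.all_servers) // 2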

9
Overall design (II)
  • Information describing
  • the current members of the storage system and
  • the currently supported virtual disks
  • is replicated across all servers
  • Global state module keeps this information
    consistent
  • Uses Lamport's Paxos algorithm
  • Assumes fail-silent failures of servers

10
Overall design (III)
  • Data access and recovery modules
  • Control how client data are distributed and
    stored
  • Support
  • Simple data striping w/o redundancy
  • Chained declustering
  • It distributes mirrored data in a way that
    balances load in the event of a failure

11
Address translation (I)
  • Must translate virtual addresses
  • <virtual-disk ID, offset>
  • into physical addresses
  • <server ID, disk ID, offset>
  • Mechanism should be fast and fault-tolerant

12
Address translation (II)
  • Uses three replicated data structures
  • Virtual disk directory (VDir): translates a
    virtual disk ID into a global map ID
  • Global map (GMap): locates the server responsible
    for translating the given offset (block number)
  • Physical map (PMap): locates the physical disk and
    computes the physical offset within that disk

13
Virtual to physical mapping
[Diagram: a (vdiskID, offset) pair is translated via the VDir, GMap, and PMap into a diskID and diskOffset on the responsible server]
14
Address translation (III)
  • Three step process
  • VDir translates virtual disk ID given by client
    into a GMap ID
  • The specified GMap finds the server that can
    translate the given offset
  • That server's PMap maps the GMap ID and offset to
    a physical disk and a disk offset
  • Last two steps are almost always performed by
    same server
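
  A minimal sketch of the three-step VDir -> GMap -> PMap lookup using plain
  Python dictionaries; the table layouts, field names, and striping rule are
  assumptions for illustration, not Petal's actual data structures:

    # Illustrative replicated structures (contents are made up).
    VDIR = {"vdisk7": "gmap3"}                        # virtual disk ID -> GMap ID
    GMAP = {"gmap3": {"servers": ["srv0", "srv1"]}}   # GMap ID -> server tuple
    # One PMap per server: (GMap ID, offset) -> (disk ID, disk offset).
    PMAP = {
        "srv0": {("gmap3", 0): ("disk2", 4096)},
        "srv1": {("gmap3", 1): ("disk5", 8192)},
    }

    def translate(vdisk_id, offset):
        # Step 1: the VDir maps the virtual disk ID to a GMap ID.
        gmap_id = VDIR[vdisk_id]
        # Step 2: the GMap picks the server responsible for this offset
        # (here: simple striping across the server tuple).
        servers = GMAP[gmap_id]["servers"]
        server = servers[offset % len(servers)]
        # Step 3: that server's PMap yields the physical disk and offset.
        disk_id, disk_offset = PMAP[server][(gmap_id, offset)]
        return server, disk_id, disk_offset

    # translate("vdisk7", 1) -> ("srv1", "disk5", 8192)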

15
Address translation (IV)
  • There is one GMap per virtual disk
  • That GMap specifies
  • Tuple of servers spanned by the virtual disk
  • Redundancy scheme used to protect data
  • GMaps are immutable
  • Cannot be modified
  • Must create a new GMap

16
Address translation (V)
  • PMaps are similar to page tables
  • Each PMap entry maps 64 KB of physical disk
    space
  • Server that performs the translation will usually
    perform the disk I/O
  • Keeping GMaps and PMaps separate minimizes amount
    of global information that must be replicated
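
  For example, with 64 KB PMap entries a byte offset within a virtual disk
  splits into a PMap entry index and an offset within the mapped extent;
  a simple worked calculation, not Petal code:

    PMAP_ENTRY_SIZE = 64 * 1024               # each PMap entry maps 64 KB

    def split_offset(byte_offset):
        # Which PMap entry covers this byte, and where inside its 64 KB?
        return byte_offset // PMAP_ENTRY_SIZE, byte_offset % PMAP_ENTRY_SIZE

    # split_offset(200_000) -> (3, 3392): entry 3, byte 3392 within it.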

17
Support for backups
  • Petal supports snapshots of virtual disks
  • Snapshots are immutable copies of virtual disks
  • Created using copy-on-write
  • The VDir maps <virtual-disk ID, epoch> into
    <GMap ID, epoch>
  • Epoch identifies current version of virtual disks
    and snapshots of past versions
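
  A minimal sketch of epoch-based, copy-on-write snapshots; the data layout
  and method names are illustrative assumptions, not Petal's implementation:

    class SnapshottingDisk:
        def __init__(self):
            self.epoch = 0
            self.versions = {0: {}}           # epoch -> {block: data}

        def snapshot(self):
            # Freeze the current epoch; new writes go to a fresh, empty map.
            self.epoch += 1
            self.versions[self.epoch] = {}
            return self.epoch - 1             # epoch ID of the snapshot

        def write(self, block, data):
            self.versions[self.epoch][block] = data

        def read(self, block, epoch=None):
            # Copy-on-write: blocks not modified since a snapshot are
            # shared with older epochs, so fall back until one is found.
            e = self.epoch if epoch is None else epoch
            while e >= 0:
                if block in self.versions[e]:
                    return self.versions[e][block]
                e -= 1
            return None                       # block was never written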

18
Incremental reconfiguration (I)
  • Used to add or remove servers and disks
  • Three simple steps
  • Create new GMap
  • Update VDir entries
  • Redistribute the data
  • Challenge is to perform the reconfiguration
    concurrently with normal client requests

19
Incremental reconfiguration (II)
  • To solve this problem
  • Read requests will
  • first try the new GMap
  • fall back to the old GMap if the new GMap has no
    appropriate translation
  • Write requests will always use the new GMap
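
  A minimal sketch of these read and write rules during reconfiguration;
  old_map and new_map stand in for the translation paths through the old
  and new GMaps:

    # During reconfiguration two GMaps coexist for a virtual disk.
    def reconfig_read(block, new_map, old_map):
        data = new_map.get(block)             # try the new GMap first
        if data is None:
            data = old_map.get(block)         # not moved yet: use the old GMap
        return data

    def reconfig_write(block, data, new_map):
        new_map[block] = data                 # writes always use the new GMap

    # A background task moves the remaining blocks from old_map into
    # new_map, after which the old GMap can be retired.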

20
Incremental reconfiguration (III)
  • Observe that the new GMap must be created before
    any data are moved
  • If the whole disk were remapped at once, too many
    read requests would have to consult both GMaps
  • This would seriously degrade system performance
  • Instead, Petal makes incremental changes over a
    fenced region of the virtual disk

21
Chained declustering (I)
[Diagram: the blocks of a virtual disk are chained-declustered across the servers, with each block's primary copy on one server and its secondary copy on the next server in the chain]
22
Chained declustering (II)
  • If one server fails, its workload will be almost
    equally distributed among remaining servers
  • Petal uses a primary/secondary scheme for
    managing copies
  • Read requests can go to either the primary or the
    secondary copy
  • Write requests must go first to the primary copy
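
  A minimal sketch of chained-declustered placement and the read/write
  rules above; the placement function follows the standard
  chained-declustering layout, and the names are chosen for illustration:

    NUM_SERVERS = 4

    def replicas(block):
        # The primary copy of block i lives on server i mod N and the
        # secondary copy on the next server in the chain.
        primary = block % NUM_SERVERS
        secondary = (primary + 1) % NUM_SERVERS
        return primary, secondary

    def read_server(block, failed=frozenset()):
        # Reads may go to either copy; use the secondary if the primary is down.
        primary, secondary = replicas(block)
        return secondary if primary in failed else primary

    def write_servers(block):
        # Writes go to the primary copy first, then to the secondary.
        return replicas(block)

    # If one server fails, its neighbors hold the surviving copies of its
    # blocks, and shifting some reads along the chain spreads the extra
    # load across all remaining servers.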

23
Petal prototype
  • Four servers
  • Each has fourteen 4.3 GB disks
  • Four clients
  • Links are 155 Mb/s ATM links
  • Petal RPC interface has 24 calls

24
Latency of a virtual disk
25
Throughput of a virtual disk
Throughput is mostly limited by CPU overhead
(233 MHz CPUs!)
26
File system performance
(Modified Andrew Benchmark)
27
Conclusion
  • A block-level interface is simpler and more
    flexible than a file system interface
  • Use of distributed software solutions allows
    geographic distribution
  • Petal performance is acceptable except for write
    requests
  • Must wait for primary and secondary copies to be
    successfully updated

28
Paxos: the main idea
  • Proposers propose decision values from an
    arbitrary input set and try to collect
    acceptances from a majority of the accepters
  • Learners observe this ratification process and
    attempt to detect that ratification has occurred
  • Agreement is enforced because only one proposal
    can get the votes of a majority of accepters
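
  A minimal single-decree Paxos sketch in these terms (a proposer and
  in-process accepters); messaging, failure detection, and retries with
  higher ballot numbers are omitted, and all names are illustrative:

    class Accepter:
        def __init__(self):
            self.promised = -1                # highest ballot promised
            self.accepted = (-1, None)        # (ballot, value) last accepted

        def prepare(self, ballot):
            # Phase 1: promise to ignore lower-numbered proposals.
            if ballot > self.promised:
                self.promised = ballot
                return True, self.accepted
            return False, None

        def accept(self, ballot, value):
            # Phase 2: accept unless a higher ballot has been promised.
            if ballot >= self.promised:
                self.promised = ballot
                self.accepted = (ballot, value)
                return True
            return False

    def propose(ballot, value, accepters):
        majority = len(accepters) // 2 + 1
        # Phase 1: collect promises from a majority of accepters.
        replies = [a.prepare(ballot) for a in accepters]
        granted = [acc for ok, acc in replies if ok]
        if len(granted) < majority:
            return None                       # retry later with a higher ballot
        # If some accepter already accepted a value, adopt the value with
        # the highest ballot instead of our own (the key safety rule).
        prev_ballot, prev_value = max(granted, key=lambda bv: bv[0])
        if prev_ballot >= 0:
            value = prev_value
        # Phase 2: ask the accepters to accept (ballot, value).
        votes = sum(a.accept(ballot, value) for a in accepters)
        return value if votes >= majority else None

    # Learners detect that a value is chosen once a majority of accepters
    # report accepting the same ballot and value.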

29
Paxos: the assumptions
  • Algorithm for consensus in a message-passing
    system
  • Assumes the existence of Failure Detectors that
    let processes give up on stalled processes after
    some amount of time
  • Processes can act as proposers, accepters, and
    learners
  • A process may combine all three roles

30
Paxos: the tricky part
  • The tricky part is to avoid deadlocks when
  • There are more than two proposals
  • Some of the processes fail
  • Paxos lets
  • Proposers make new proposals
  • Accepters release their earlier votes for losing
    proposals