
Transcript and Presenter's Notes

Title: Arda: Architecting Storage for Disaster


1
Arda: Architecting Storage for Disaster
  • Ganesh M. Narayan
  • CSA, IISc

2
Agenda
  • Storage Abstractions: Motivation
  • Boxwood: State of the Art
  • Failures: Nature and Extent
  • Failures: Boxwood, and Beyond Boxwood
  • Rationale: Architecture of Arda
  • Arda on Xen
  • Conclusions

3
Storage Abstractions
  • Implementing distributed storage systems is hard
  • Block-oriented interfaces do not help much in
    tackling the complexity
  • Right storage abstraction(s) always help!
  • Can provide useful functionality
  • fault-tolerance, scalability, caching, logging
  • Can lessen the burden of storage software
  • block pool maintenance, data placement,
    prefetching
  • Can provide better interfaces with richer
    semantics
  • locking, atomicity, operation ordering, I/O
    fencing

4
State of the Art
  • Boxwood: Abstractions as the Foundation for
    Storage Infrastructure (OSDI '04)
  • No single universal storage abstraction will
    serve the needs of all clients
  • Chunk Store
  • malloc()-like: allocate, free, read, write
  • Distributed B-Tree
  • insert, delete, lookup, enumerate (both
    interfaces sketched below)
  • Failures
  • Assumed to be independent; tackled with
    replication groups
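
A minimal sketch, in Python, of what client-facing interfaces for these two abstractions could look like; the names and signatures below are illustrative assumptions, not Boxwood's actual API:

    # Illustrative sketch only; not Boxwood's actual API.
    from abc import ABC, abstractmethod
    from typing import Iterator, Optional, Tuple

    class ChunkStore(ABC):
        """malloc()-like chunk abstraction: allocate, free, read, write."""
        @abstractmethod
        def allocate(self, nbytes: int) -> int: ...   # returns a chunk handle
        @abstractmethod
        def free(self, handle: int) -> None: ...
        @abstractmethod
        def read(self, handle: int, offset: int, nbytes: int) -> bytes: ...
        @abstractmethod
        def write(self, handle: int, offset: int, data: bytes) -> None: ...

    class DistributedBTree(ABC):
        """Distributed B-tree abstraction: insert, delete, lookup, enumerate."""
        @abstractmethod
        def insert(self, key: bytes, value: bytes) -> None: ...
        @abstractmethod
        def delete(self, key: bytes) -> None: ...
        @abstractmethod
        def lookup(self, key: bytes) -> Optional[bytes]: ...
        @abstractmethod
        def enumerate(self, start: bytes = b"") -> Iterator[Tuple[bytes, bytes]]: ...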

5
Failures: Nature and Extent
  • Systems do exhibit large-scale failures, albeit
    less often
  • network failures, power failures, bugs, hardware
    defects
  • The k-independent-failures assumption may not
    always hold; systems should be architected from
    the ground up to handle massive failures
  • State machine replication, quorums, RAID?
  • Storage abstractions should handle such failures
    gracefully
  • mask failures, provide limited functionality
    depending on the severity of the failure, and
    possibly allow autonomous, local recovery

6
Failures: Boxwood
  • Does not fail gracefully
  • Trees are inherently minimally connected, and the
    placement algorithm might introduce anomalies
  • Co-locate data and metadata (D-GRAID)
  • Tracking and co-locating dynamic dependencies in
    distributed settings is tough: diminishing
    returns

7
Failures: State of the Art
  • Present systems: 0 or 1 availability
  • If a mere 3% of total storage were to fail, file
    systems like Frangipani and xFS would suffer 60%
    data loss! (Archipelago)

(Graph: availability vs. failures, dropping sharply between normal failures and minor catastrophe)
8
Failures: What We Want
  • Graceful degradation: non-binary availability
  • Graceful data degradation is the issue, not mere
    performance degradation

(Graph: availability vs. failures, showing recovery and limited availability from normal failures through minor catastrophe to catastrophe)
9
Failures: Why is it so difficult?
  • Replication
  • Necessary, but not sufficient
  • Diminishing returns: storage overhead,
    maintenance overhead, synchronization overhead
  • Assumes k independent, benign storage failures
  • Nature and disks cannot be fooled!
  • Whole batches of disks have been found faulty
  • Disks lie: the checksum is not strong enough
  • Disks fail in a myriad of ways
  • Slowest and least reliable component in the
    ecosystem: the Achilles' heel

10
Distributed Dependencies considered Harmful!
  • Dependencies
  • Application-, file-system-, and abstraction-induced
  • Introducing behind-the-back system dependencies
    makes systems hard to reason about and predict
  • Say NO to abstraction induced dependencies!
  • But abstractions must maintain metadata; we cannot
    do away with dependencies
  • should choose an abstraction close to the FS/DB
  • should introduce dependencies judiciously

11
Arda: ARchitecting for DisAster
  • Variable-length, typed, abstract objects
  • directory, inode, attribute inode, bag-of-bits,
    bag-of-records, table
  • Closely follows the upper layer's form and function
  • Abstraction-induced dependencies
  • Flat space: local object id <-> global object id
  • Object id: content hash, GUID, other hashes...
    cacheable, can co-exist
  • Flat translations can be handled more efficiently
    than hierarchical ones (process groups); these
    maps can be partitioned to handle failures (see
    the sketch below)
  • Services
  • Replication, Security, Atomicity (granularity ?),
    Searching, Data Placement, Load Balancing,
    Logging, Exclusive Caching
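
A rough sketch, in Python, of the typed-object and flat-translation ideas above; ObjType, ArdaObject, IdMap, and the partition scheme are names and assumptions invented for illustration, not Arda's actual design:

    # Hedged sketch; class and field names are assumptions for illustration.
    import hashlib
    from dataclasses import dataclass
    from enum import Enum
    from typing import Dict, List, Optional

    class ObjType(Enum):
        DIRECTORY = 1
        INODE = 2
        ATTRIBUTE_INODE = 3
        BAG_OF_BITS = 4
        BAG_OF_RECORDS = 5
        TABLE = 6

    @dataclass
    class ArdaObject:
        """Variable-length, typed object."""
        otype: ObjType
        data: bytes

        def global_id(self) -> str:
            # One possible global-id scheme: a content hash (GUIDs or other
            # hashes could coexist, as the slide notes).
            return hashlib.sha256(self.data).hexdigest()

    class IdMap:
        """Flat local-id <-> global-id translation map, split into partitions
        so that losing one partition loses only part of the translations."""
        def __init__(self, num_partitions: int = 4):
            self.partitions: List[Dict[int, str]] = [{} for _ in range(num_partitions)]

        def _part(self, local_id: int) -> Dict[int, str]:
            return self.partitions[local_id % len(self.partitions)]

        def bind(self, local_id: int, global_id: str) -> None:
            self._part(local_id)[local_id] = global_id

        def resolve(self, local_id: int) -> Optional[str]:
            return self._part(local_id).get(local_id)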

12
Arda-NFS
  • NFS over Arda
  • Almost like pNFS
  • No intra-file parallel transfers
  • Metadata updates involve the translation machinery
  • Objects for files and directories
  • But distribution method could differ
  • ACLs and locking
  • Failover is transparent (stateless servers, shared
    objects, lock state in objects; see the sketch
    below)
  • Object-level eager replication
  • Error handling is simpler: transaction-like
    operations
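
A toy sketch of why failover can be transparent when lock state lives in the shared objects rather than in server memory; the classes and fields below are invented for illustration, not the actual Arda-NFS design:

    # Toy illustration of lock state stored in the shared object itself.
    from dataclasses import dataclass
    from typing import Dict, Optional

    @dataclass
    class FileObject:
        data: bytes = b""
        lock_holder: Optional[str] = None    # lock state kept in the object

    class NfsFrontEnd:
        """Stateless server front end: all state is read from shared objects."""
        def __init__(self, objects: Dict[str, FileObject]):
            self.objects = objects           # the shared object store

        def lock(self, path: str, client: str) -> bool:
            obj = self.objects[path]
            if obj.lock_holder in (None, client):
                obj.lock_holder = client
                return True
            return False

    # Server A grants a lock and then crashes; server B sees the same lock
    # state in the shared object, so the client's lock survives the failover.
    shared = {"/mbox": FileObject()}
    server_a, server_b = NfsFrontEnd(shared), NfsFrontEnd(shared)
    assert server_a.lock("/mbox", "client-1")
    assert server_b.lock("/mbox", "client-1")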

13
What could we do exclusively in Arda?
  • Many things!
  • Fine grained storage access
  • A mailer could spread mail records from one or
    more mailboxes across multiple machines for
    performance or availability
  • Traditional mailboxes are file-based, hence
    coarse-grained
  • Multiple indices can co-exist
  • Searching becomes simpler and context comes for
    free
  • Single, transparent interface for both indices
    and object management (see the sketch below)
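
An illustrative sketch of record-granularity mail storage with multiple coexisting indices over the same objects; all names below are assumptions, not an actual Arda interface:

    # Illustration only: per-message objects plus two independent indices.
    from collections import defaultdict

    messages = {}                      # object id -> message body
    by_sender = defaultdict(list)      # index 1: sender -> [object ids]
    by_date = defaultdict(list)        # index 2: date   -> [object ids]

    def store_message(oid: str, sender: str, date: str, body: bytes) -> None:
        messages[oid] = body
        by_sender[sender].append(oid)  # both indices point at the same object
        by_date[date].append(oid)

    store_message("m1", "alice", "2024-01-01", b"hello")
    store_message("m2", "bob",   "2024-01-01", b"hi")
    print(by_date["2024-01-01"])       # searching by date finds both messages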

14
What could we do exclusively in Arda?
  • Controllable degradation slope
  • Partitioning and Distributing the Maps
  • Partition configuration and size determine the
    slope, but distribution is the key!
  • Flexible and multiple means to distribute objects
  • Different objects are likely to have different
    access patterns and criticality
  • Since Arda knows the object types, load
    distribution can be tuned to suit a particular
    object's access pattern (see the placement sketch
    below)
  • Load balancing is more flexible in a flat object
    space; if needed, one could even co-locate
    dependent objects
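
A small sketch of type-aware placement: because Arda knows each object's type, the replication degree and spread can differ per type. The place() helper and its policy are hypothetical, invented only to illustrate the idea:

    # Hypothetical placement policy; not Arda's actual algorithm.
    from typing import List

    def place(obj_type: str, obj_id: str, nodes: List[str],
              replicas: int = 2) -> List[str]:
        """Pick target nodes for an object based on its type."""
        if obj_type in ("directory", "inode"):
            replicas = 3               # metadata: replicate more aggressively
        start = hash(obj_id) % len(nodes)
        # Simple spread from a hashed start point; a real system could be
        # failure-domain aware and co-locate dependent objects when useful.
        return [nodes[(start + i) % len(nodes)] for i in range(replicas)]

    print(place("inode", "obj-42", ["n0", "n1", "n2", "n3", "n4"]))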

15
Arda on Xen
  • Why Xen?
  • Storage utilization
  • Xen storage is captive; VM images have lots of
    redundancy
  • Xen introduces correlated failures
  • The k has to be chosen appropriately, or the data
    placement needs to be Xen-aware
  • Xen clusters
  • We don't have a CFS that is Xen-aware; XenoLinux
    CFSs might be inefficient
  • Cross Domain calls are expensive
  • VBD introduces a virtualization point: Parallax
    (HotOS '05)

16
Why Xen?
(Diagram: write path from client to server over a physical VBD, 4 steps)
17
Why Xen?
(Diagram: write path from client to server over a virtual VBD via Dom 0, 5 steps)
18
Xen with Arda
(Diagram: write path from client to server with Arda, 2 steps)
19
Arda on Xen
  • Why Xen?
  • VBD
  • Point of data indirection: we can enrich the VBD
    interface exposed to domains to develop a simple,
    efficient file system for XenoLinux; this should
    be simpler with proper object interfaces
  • Dom0
  • Point of control indirection: acts as a local
    object maintainer; lays out objects on disk,
    manages object properties, provides object
    atomicity, maintains a local searchable index
  • We need a shared store for the VM images: lots
    of redundant, read-only data in multiple VBDs
  • Content hash as object id would solve the problem
    neatly (see the sketch below)
  • Asynchronous computation, and hashing over a
    Galois field
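
A short sketch of why a content hash as the object id deduplicates the shared, read-only VM-image data neatly; block granularity and the hash choice here are assumptions for illustration:

    # Identical blocks from different VBDs map to the same object id.
    import hashlib

    store = {}                                   # global object id -> block

    def put_block(block: bytes) -> str:
        oid = hashlib.sha256(block).hexdigest()  # content hash as object id
        store.setdefault(oid, block)             # identical content stored once
        return oid

    vm1 = [put_block(b"shared-kernel-page"), put_block(b"vm1-private-data")]
    vm2 = [put_block(b"shared-kernel-page"), put_block(b"vm2-private-data")]
    assert vm1[0] == vm2[0]                      # both images share one object
    print(len(store), "objects stored for 4 block writes")   # prints 3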

20
Conclusions
  • Arda appears promising
  • Performance and Availability
  • No failures, limited failures, limited
    catastrophe, catastrophe: degrees of
    availability
  • Xen would be a perfect vehicle to drive Arda
  • May or may not break existing programs
  • For any program to take complete advantage of
    Arda, it needs to be ported to Arda; the rest
    can reside blithely on an FS that is Arda-aware
  • NFS over Arda