
Transcript and Presenter's Notes

Title: Arda: Architecting Storage for Disaster


1
Arda: Architecting Storage for Disaster
  • Ganesh M. Narayan
  • CSA, IISc

2
Agenda
  • Storage Abstractions: Motivation
  • Boxwood: State of the Art
  • Failures: Nature and Extent
  • Failures: Boxwood, and Beyond Boxwood
  • Rationale: Architecture of Arda
  • Arda on Xen
  • Conclusions

3
Storage Abstractions
  • Implementing distributed storage systems is hard
  • Block-oriented interfaces do not help much in
    tackling the complexity
  • Right storage abstraction(s) always help!
  • Can provide useful functionality
  • fault-tolerance, scalability, caching, logging
  • Can lessen the burden of storage software
  • block pool maintenance, data placement,
    prefetching
  • Can provide better interfaces with richer
    semantics
  • locking, atomicity, operation ordering, I/O
    fencing

4
State of the Art
  • Boxwood: Abstractions as the Foundation for
    Storage Infrastructure (OSDI '04)
  • No single universal storage abstraction will
    serve the needs of all clients
  • Chunk Store
  • malloc()-like: allocate, free, read, write
  • Distributed B-Tree
  • insert, delete, lookup, enumerate (both
    interfaces sketched below)
  • Failures
  • Assumed to be independent; tackled with
    replication groups
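
A minimal sketch, in Python, of what client-facing interfaces for these two abstractions could look like; the names and signatures below are illustrative assumptions, not Boxwood's actual API:

    # Illustrative sketch only; not Boxwood's actual API.
    from abc import ABC, abstractmethod
    from typing import Iterator, Optional, Tuple

    class ChunkStore(ABC):
        """malloc()-like chunk abstraction: allocate, free, read, write."""
        @abstractmethod
        def allocate(self, nbytes: int) -> int: ...   # returns a chunk handle
        @abstractmethod
        def free(self, handle: int) -> None: ...
        @abstractmethod
        def read(self, handle: int, offset: int, nbytes: int) -> bytes: ...
        @abstractmethod
        def write(self, handle: int, offset: int, data: bytes) -> None: ...

    class DistributedBTree(ABC):
        """Distributed B-tree abstraction: insert, delete, lookup, enumerate."""
        @abstractmethod
        def insert(self, key: bytes, value: bytes) -> None: ...
        @abstractmethod
        def delete(self, key: bytes) -> None: ...
        @abstractmethod
        def lookup(self, key: bytes) -> Optional[bytes]: ...
        @abstractmethod
        def enumerate(self, start: bytes = b"") -> Iterator[Tuple[bytes, bytes]]: ...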

5
Failures: Nature and Extent
  • Systems do exhibit large-scale failures, albeit
    less often
  • network failures, power failures, bugs, hardware
    defects
  • The k-independent-failures assumption may not
    always hold; systems should be architected from
    the ground up to handle massive failures
  • State machine replication, quorums, RAID?
  • Storage abstractions should handle such failures
    gracefully
  • mask failures, provide limited functionality
    depending on the severity of the failure, and
    possibly allow autonomous, local recovery

6
Failures: Boxwood
  • Does not fail gracefully
  • Trees are inherently minimally connected, and the
    placement algorithm might introduce anomalies
  • Co-locate data and metadata (D-GRAID)
  • Tracking and co-locating dynamic dependencies in
    distributed settings is tough: diminishing
    returns

7
Failures: State of the Art
  • Present systems: 0 or 1 availability
  • If a mere 3% of total storage were to fail, file
    systems like Frangipani and xFS would suffer 60%
    data loss! (Archipelago)

(Graph: availability vs. failures, dropping sharply between normal failures and minor catastrophe)
8
Failures: What We Want
  • Graceful degradation: non-binary availability
  • Graceful data degradation is the issue, not mere
    performance degradation

(Graph: availability vs. failures, showing recovery and limited availability from normal failures through minor catastrophe to catastrophe)
9
Failures: Why is it so difficult?
  • Replication
  • Necessary, but not sufficient
  • Diminishing returns: storage overhead,
    maintenance overhead, synchronization overhead
  • Assumes k independent, benign storage failures
  • Nature and disks cannot be fooled!
  • Whole batches of disks have been found faulty
  • Disks lie: the checksum is not strong enough
  • Disks fail in a myriad of ways
  • Slowest and least reliable component in the
    ecosystem: the Achilles' heel

10
Distributed Dependencies considered Harmful!
  • Dependencies
  • Application-, file-system-, and abstraction-induced
  • Introducing behind-the-back system dependencies
    makes systems hard to reason about and predict
  • Say NO to abstraction induced dependencies!
  • But abstractions must maintain metadata; we cannot
    do away with dependencies
  • should choose an abstraction close to the FS/DB
  • should introduce dependencies judiciously

11
Arda: ARchitecting for DisAster
  • Variable-length, typed, abstract objects
  • directory, inode, attribute inode, bag-of-bits,
    bag-of-records, table
  • Closely follows the upper layer's form and function
  • Abstraction-induced dependencies
  • Flat space: local object id <-> global object id
  • Object id: content hash, GUID, other hashes...
    cacheable, can co-exist
  • Flat translations can be handled more efficiently
    than hierarchical ones (process groups); these
    maps can be partitioned to handle failures (see
    the sketch below)
  • Services
  • Replication, Security, Atomicity (granularity ?),
    Searching, Data Placement, Load Balancing,
    Logging, Exclusive Caching
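
A rough sketch, in Python, of the typed-object and flat-translation ideas above; ObjType, ArdaObject, IdMap, and the partition scheme are names and assumptions invented for illustration, not Arda's actual design:

    # Hedged sketch; class and field names are assumptions for illustration.
    import hashlib
    from dataclasses import dataclass
    from enum import Enum
    from typing import Dict, List, Optional

    class ObjType(Enum):
        DIRECTORY = 1
        INODE = 2
        ATTRIBUTE_INODE = 3
        BAG_OF_BITS = 4
        BAG_OF_RECORDS = 5
        TABLE = 6

    @dataclass
    class ArdaObject:
        """Variable-length, typed object."""
        otype: ObjType
        data: bytes

        def global_id(self) -> str:
            # One possible global-id scheme: a content hash (GUIDs or other
            # hashes could coexist, as the slide notes).
            return hashlib.sha256(self.data).hexdigest()

    class IdMap:
        """Flat local-id <-> global-id translation map, split into partitions
        so that losing one partition loses only part of the translations."""
        def __init__(self, num_partitions: int = 4):
            self.partitions: List[Dict[int, str]] = [{} for _ in range(num_partitions)]

        def _part(self, local_id: int) -> Dict[int, str]:
            return self.partitions[local_id % len(self.partitions)]

        def bind(self, local_id: int, global_id: str) -> None:
            self._part(local_id)[local_id] = global_id

        def resolve(self, local_id: int) -> Optional[str]:
            return self._part(local_id).get(local_id)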

12
Arda-NFS
  • NFS over Arda
  • Almost like pNFS
  • No intra-file parallel transfers
  • Metadata updates involve the translation machinery
  • Objects for files and directories
  • But distribution method could differ
  • ACLs and locking
  • Failover is transparent (stateless servers, shared
    objects, lock state in objects; see the sketch
    below)
  • Object-level eager replication
  • Error handling is simpler: transaction-like
    operations
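
A toy sketch of why failover can be transparent when lock state lives in the shared objects rather than in server memory; the classes and fields below are invented for illustration, not the actual Arda-NFS design:

    # Toy illustration of lock state stored in the shared object itself.
    from dataclasses import dataclass
    from typing import Dict, Optional

    @dataclass
    class FileObject:
        data: bytes = b""
        lock_holder: Optional[str] = None    # lock state kept in the object

    class NfsFrontEnd:
        """Stateless server front end: all state is read from shared objects."""
        def __init__(self, objects: Dict[str, FileObject]):
            self.objects = objects           # the shared object store

        def lock(self, path: str, client: str) -> bool:
            obj = self.objects[path]
            if obj.lock_holder in (None, client):
                obj.lock_holder = client
                return True
            return False

    # Server A grants a lock and then crashes; server B sees the same lock
    # state in the shared object, so the client's lock survives the failover.
    shared = {"/mbox": FileObject()}
    server_a, server_b = NfsFrontEnd(shared), NfsFrontEnd(shared)
    assert server_a.lock("/mbox", "client-1")
    assert server_b.lock("/mbox", "client-1")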

13
What could we do exclusively in Arda?
  • Many things!
  • Fine grained storage access
  • A mailer could spread mail records from one or
    more mailboxes across multiple machines for
    performance or availability
  • Traditional mailboxes are file-based, hence
    coarse-grained
  • Multiple indices can co-exist
  • Searching becomes simpler and context comes for
    free
  • Single, transparent interface for both indices
    and object management (see the sketch below)
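
An illustrative sketch of record-granularity mail storage with multiple coexisting indices over the same objects; all names below are assumptions, not an actual Arda interface:

    # Illustration only: per-message objects plus two independent indices.
    from collections import defaultdict

    messages = {}                      # object id -> message body
    by_sender = defaultdict(list)      # index 1: sender -> [object ids]
    by_date = defaultdict(list)        # index 2: date   -> [object ids]

    def store_message(oid: str, sender: str, date: str, body: bytes) -> None:
        messages[oid] = body
        by_sender[sender].append(oid)  # both indices point at the same object
        by_date[date].append(oid)

    store_message("m1", "alice", "2024-01-01", b"hello")
    store_message("m2", "bob",   "2024-01-01", b"hi")
    print(by_date["2024-01-01"])       # searching by date finds both messages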

14
What could we do exclusively in Arda?
  • Controllable degradation slope
  • Partitioning and Distributing the Maps
  • Partition configuration and size determine the
    slope, but distribution is the key!
  • Flexible and multiple means to distribute objects
  • Different objects are likely to have different
    access patterns and criticality
  • Since Arda knows the object types, load
    distribution can be tuned to suit a particular
    object's access pattern (see the placement sketch
    below)
  • Load balancing is more flexible in a flat object
    space; if needed, one could even co-locate
    dependent objects
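
A small sketch of type-aware placement: because Arda knows each object's type, the replication degree and spread can differ per type. The place() helper and its policy are hypothetical, invented only to illustrate the idea:

    # Hypothetical placement policy; not Arda's actual algorithm.
    from typing import List

    def place(obj_type: str, obj_id: str, nodes: List[str],
              replicas: int = 2) -> List[str]:
        """Pick target nodes for an object based on its type."""
        if obj_type in ("directory", "inode"):
            replicas = 3               # metadata: replicate more aggressively
        start = hash(obj_id) % len(nodes)
        # Simple spread from a hashed start point; a real system could be
        # failure-domain aware and co-locate dependent objects when useful.
        return [nodes[(start + i) % len(nodes)] for i in range(replicas)]

    print(place("inode", "obj-42", ["n0", "n1", "n2", "n3", "n4"]))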

15
Arda on Xen
  • Why Xen?
  • Storage utilization
  • Xen storage is captive; VM images have lots of
    redundancy
  • Xen introduces correlated failures
  • The k has to be chosen appropriately, or the data
    placement needs to be Xen-aware
  • Xen clusters
  • We don't have a CFS that is Xen-aware; XenoLinux
    CFSs might be inefficient
  • Cross Domain calls are expensive
  • VBD introduces a virtualization point: Parallax
    (HotOS '05)

16
Why Xen?
(Diagram: write path from client to server over a physical VBD, 4 steps)
17
Why Xen?
(Diagram: write path from client to server over a virtual VBD via Dom 0, 5 steps)
18
Xen with Arda
(Diagram: write path from client to server with Arda, 2 steps)
19
Arda on Xen
  • Why Xen?
  • VBD
  • Point of data indirection: we can enrich the VBD
    interface exposed to domains to develop a simple,
    efficient file system for XenoLinux; this should
    be simpler with proper object interfaces
  • Dom0
  • Point of control indirection: acts as a local
    object maintainer; lays out objects on disk,
    manages object properties, provides object
    atomicity, maintains a local searchable index
  • We need a shared store for the VM images: lots
    of redundant, read-only data in multiple VBDs
  • Content hash as object id would solve the problem
    neatly (see the sketch below)
  • Asynchronous computation, and hashing over a
    Galois field
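
A short sketch of why a content hash as the object id deduplicates the shared, read-only VM-image data neatly; block granularity and the hash choice here are assumptions for illustration:

    # Identical blocks from different VBDs map to the same object id.
    import hashlib

    store = {}                                   # global object id -> block

    def put_block(block: bytes) -> str:
        oid = hashlib.sha256(block).hexdigest()  # content hash as object id
        store.setdefault(oid, block)             # identical content stored once
        return oid

    vm1 = [put_block(b"shared-kernel-page"), put_block(b"vm1-private-data")]
    vm2 = [put_block(b"shared-kernel-page"), put_block(b"vm2-private-data")]
    assert vm1[0] == vm2[0]                      # both images share one object
    print(len(store), "objects stored for 4 block writes")   # prints 3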

20
Conclusions
  • Arda appears promising
  • Performance and Availability
  • No failures, limited failures, limited
    catastrophe, catastrophe: degrees of
    availability
  • Xen would be a perfect vehicle to drive Arda
  • May or may not break existing programs
  • For any program to take complete advantage of
    Arda, it needs to be ported to Arda; the rest
    can reside blithely on an FS that is Arda-aware
  • NFS over Arda