OceanStore: Global-Scale Persistent Storage

Slides: 32
Provided by: Yin65
Transcript and Presenter's Notes


1
OceanStore: Global-Scale Persistent Storage
  • Ying Lu
  • CSCE496/896 Spring 2006

2
Credits
  • Many slides are from John Kubiatowicz, University
    of California at Berkeley
  • I have modified them and added new slides

3
Endeavour Maxims
  • Personal Information Mgmt is the Killer App
  • Not corporate processing but management,
    analysis, aggregation, dissemination, filtering
    for the individual
  • Automated extraction and organization of daily
    activities to assist people
  • Information Technology as a Utility
  • Continuous service delivery, on a
    planetary-scale, on top of a highly dynamic
    information base

4
OceanStore Context: Ubiquitous Computing
  • Computing everywhere
  • Desktop, Laptop, Palmtop, Cars, Cellphones
  • Shoes? Clothing? Walls?
  • Connectivity everywhere
  • Rapid growth of bandwidth in the interior of the
    net
  • Broadband to the home and office
  • Wireless technologies such as CDMA, satellite,
    laser
  • Rise of the thin-client metaphor
  • Services provided by interior of network
  • Incredibly thin clients on the leaves
  • MEMS devices -- sensors + CPU + wireless net in 1 mm^3
  • Mobile society: people move and devices are
    disposable

5
Questions about information
  • Where is persistent information stored?
  • 20th-century tie between location and content
    outdated
  • (we all survived the Feb 29th bug -- let's move
    on!)
  • How is it protected?
  • Can a disgruntled employee of an ISP sell your
    secrets?
  • Can't trust anyone (how paranoid are you?)
  • Can we make it indestructible?
  • Want our data to survive the big one!
  • Highly resistant to hackers (denial of service)
  • Wide-scale disaster recovery
  • Is it hard to manage?
  • Worst failures are human-related
  • Want automatic (introspective) diagnosis and
    repair

6
First Observation: Want Utility Infrastructure
  • Mark Weiser from Xerox: transparent computing is
    the ultimate goal
  • Computers should disappear into the background
  • In storage context:
  • Don't want to worry about backup, obsolescence
  • Need lots of resources to make data secure and
    highly available, BUT don't want to own them
  • Outsourcing of storage already becoming popular
  • Pay monthly fee and your data is out there
  • Simple payment interface => one bill from one
    company

7
Second Observation: Need Wide-Scale Deployment
  • Many components with geographic separation
  • System not disabled by natural disasters
  • Can adapt to changes in demand and regional
    outages
  • Wide-scale use and sharing also requires
    wide-scale deployment
  • Bandwidth increasing rapidly, but latency bounded
    by speed of light
  • Handling many people with same system leads to
    economies of scale

8
OceanStore: Everyone's Data, One Big Utility
  • The data is just out there
  • Separate information from location
  • Locality is only an optimization (an important
    one!)
  • Wide-scale coding and replication for durability
  • All information is globally identified
  • Unique identifiers are hashes over names and keys
  • Single uniform lookup interface replaces DNS,
    server location, data location
  • No centralized namespace required (such as SDSI)

9
Amusing back-of-the-envelope calculation (courtesy
Bill Bolotsky, Microsoft)
  • How many files in the OceanStore?
  • Assume 10^10 people in world
  • Say 10,000 files/person (very conservative?)
  • So 10^14 files in OceanStore!
  • If each file is 1 gig (not likely), get 10^23
    bytes: roughly 1 mole of bytes!
  • Truly impressive number of elements
  • but small relative to physical constants
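
The arithmetic on this slide can be checked directly. This is a minimal sketch, assuming that "1 gig" means 10^9 bytes; the constant names are illustrative and not from the slides.

```python
# Back-of-the-envelope check of the slide's numbers.
PEOPLE = 10**10             # the slide's assumed world population
FILES_PER_PERSON = 10_000   # the slide's per-person guess
BYTES_PER_FILE = 10**9      # assumption: "1 gig" means 10^9 bytes
AVOGADRO = 6.022e23         # elements per mole

total_files = PEOPLE * FILES_PER_PERSON
total_bytes = total_files * BYTES_PER_FILE

print(f"files: {total_files:.0e}")
print(f"bytes: {total_bytes:.0e}")
print(f"moles of bytes: {total_bytes / AVOGADRO:.2f}")
```

Under these assumptions the total is 10^23 bytes, about 0.17 mole, so the "mole" figure holds to within an order of magnitude.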

10
Utility-based Infrastructure
[Figure: confederation of providers, labeled Canadian OceanStore, Sprint, AT&T, IBM, Pac Bell, IBM]
  • Service provided by confederation of companies
  • Monthly fee paid to one service provider
  • Companies buy and sell capacity from each other

11
Outline
  • Motivation
  • Properties of the OceanStore
  • Specific Technologies and approaches
  • Naming and Data Location
  • Conflict resolution on encrypted data
  • Replication and Deep archival storage
  • Introspective computing for optimization and
    repair
  • Economic models
  • Conclusion

12
Ubiquitous Devices => Ubiquitous Storage
  • Consumers of data move, change from one device to
    another, work in cafes, cars, airplanes, the
    office, etc.
  • Properties REQUIRED for OceanStore storage
    substrate
  • Strong security: data encrypted in the
    infrastructure; resistance to monitoring and
    denial of service attacks
  • Coherence: too much data for naïve users to keep
    coherent by hand
  • Automatic replica management and optimization:
    huge quantities of data cannot be managed
    manually
  • Simple and automatic recovery from disasters:
    probability of failure increases with size of
    system
  • Utility model: world-scale system requires
    cooperation across administrative boundaries

13
OceanStore Technologies I: Naming and Data
Location
  • Requirements
  • System-level names should help to authenticate
    data
  • Route to nearby data without global communication
  • Don't inhibit rapid relocation of data
  • OceanStore approach: two-level search with
    embedded routing
  • Underlying namespace is flat and built from
    secure cryptographic hashes (160-bit SHA-1)
  • Search process combines quick, probabilistic
    search with slower guaranteed search
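
A flat, self-authenticating name of this kind can be sketched with a standard SHA-1 call. The slides say only that identifiers are secure hashes over names and keys, so the key-then-name concatenation and the function name `make_guid` are assumptions for illustration.

```python
import hashlib

def make_guid(owner_public_key: bytes, name: str) -> bytes:
    """Return a 160-bit identifier: SHA-1 over the owner key and object name.

    The key-then-name concatenation is an assumption for illustration;
    the slides do not give the exact encoding.
    """
    h = hashlib.sha1()
    h.update(owner_public_key)
    h.update(name.encode("utf-8"))
    return h.digest()

guid = make_guid(b"example-public-key-bytes", "/home/alice/notes.txt")
print(guid.hex(), len(guid) * 8)  # 40 hex digits, 160 bits
```

Because the identifier depends on the owner's key, a reader who knows the key and the name can recompute the GUID and check that a returned object was filed under the expected name.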

14
Universal Location Facility
  • Takes a 160-bit unique identifier (GUID) and
    returns the nearest object that matches

15
Routing: Two-Tiered Approach
  • Fast probabilistic routing algorithm
  • Entities that are accessed frequently are likely
    to reside close to where they are being used
    (ensured by introspection)
  • Slower, guaranteed hierarchical routing method

Self-optimizing
16
Probabilistic Routing Algorithm
  • Bloom filter on each node; attenuated Bloom
    filter on each directed edge
  • Self-optimizing on the depth of the attenuated
    Bloom filter array
  • Self-protecting
  • [Figure: example network of nodes n1, n2, n3, n4]
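
The idea of an attenuated Bloom filter on each edge can be sketched as an array of ordinary Bloom filters, one per hop distance. This is a simplified illustration, not OceanStore's implementation: the filter size, the hash construction, and the names `BloomFilter`, `AttenuatedBloomFilter`, and `choose_edge` are all assumptions.

```python
import hashlib

class BloomFilter:
    """Fixed-size Bloom filter: false positives possible, no false negatives."""

    def __init__(self, num_bits: int = 256, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector stored in one Python int

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-1 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:4], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class AttenuatedBloomFilter:
    """One filter per hop count: levels[i] summarizes objects i+1 hops away."""

    def __init__(self, depth: int):
        self.levels = [BloomFilter() for _ in range(depth)]

    def first_match(self, guid: bytes):
        """Return the smallest hop distance whose filter matches, or None."""
        for distance, level in enumerate(self.levels, start=1):
            if guid in level:
                return distance
        return None


def choose_edge(edges: dict, guid: bytes):
    """Pick the outgoing edge that claims the object at the fewest hops."""
    hits = {name: f.first_match(guid) for name, f in edges.items()}
    hits = {name: d for name, d in hits.items() if d is not None}
    return min(hits, key=hits.get) if hits else None


# Node n1 has edges to n2 and n3; the object is 3 hops away via n2, 1 via n3.
edges = {"n2": AttenuatedBloomFilter(3), "n3": AttenuatedBloomFilter(3)}
edges["n2"].levels[2].add(b"guid-42")
edges["n3"].levels[0].add(b"guid-42")
print(choose_edge(edges, b"guid-42"))  # n3
```

A query that matches no level on any edge returns `None`, which is the point at which a search would fall back to the slower, guaranteed routing tier.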
17
Hierarchical Routing Algorithm
  • Based on Plaxton scheme
  • Every server in the system is assigned a random
    node-ID
  • Object's root
  • Each object is mapped to a single node whose
    node-ID matches the object's GUID in the most
    bits (starting from the least significant)
  • Information about the GUID (such as location)
    is stored at its root
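
Root selection by longest matching suffix can be sketched as follows. For readability the sketch compares digits of short string IDs rather than bits of 160-bit values, and the tie-break rule is an assumption, since the slides do not say how ties are resolved.

```python
def shared_suffix_len(a: str, b: str) -> int:
    """Count matching digits, starting from the least significant (rightmost)."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n

def find_root(guid: str, node_ids: list[str]) -> str:
    """Return the node-ID sharing the longest suffix with the GUID.

    Ties go to the smallest ID, which is an illustrative assumption.
    """
    return min(node_ids, key=lambda nid: (-shared_suffix_len(guid, nid), nid))

nodes = ["0324", "1324", "4598", "2118", "9098"]
print(find_root("7598", nodes))  # 4598, which shares the 3-digit suffix "598"
```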

18
Construct Plaxton Mesh
[Figure: mesh construction, with example node-IDs 0324 and 1324]

19
Basic Plaxton Mesh: Incremental Suffix-Based
Routing
[Figure: Plaxton mesh, with labels a through e]
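
Incremental suffix-based routing resolves the destination ID one digit at a time, starting from the rightmost digit. The sketch below assumes a global list of node-IDs in place of the per-node neighbor tables a real mesh would keep, and it ignores network distance; the IDs are made-up examples.

```python
def suffix_len(a: str, b: str) -> int:
    """Number of matching digits counted from the right."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n

def route(start: str, target: str, node_ids: list[str]) -> list[str]:
    """Walk toward target, extending the matched suffix at every hop."""
    path = [start]
    current = start
    while current != target:
        matched = suffix_len(current, target)
        # Candidates resolve at least one more digit than the current node.
        better = [n for n in node_ids if suffix_len(n, target) > matched]
        if not better:
            break  # nobody matches more digits; current acts as the root
        # Take the smallest improvement to mimic one-digit-per-hop routing.
        current = min(better, key=lambda n: (suffix_len(n, target), n))
        path.append(current)
    return path

nodes = ["0325", "B4F8", "9098", "7598", "4598"]
print(route("0325", "4598", nodes))
# ['0325', 'B4F8', '9098', '7598', '4598']
```

Each hop in the printed path matches one more trailing digit of 4598 than the hop before it: 8, then 98, then 598, then the full ID.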
20
Use of Plaxton Mesh: Randomization and Locality
21
OceanStore Enhancements of the Plaxton Mesh
  • Documents have multiple roots (Salted hash of
    GUID)
  • Each node has multiple neighbor links
  • Searches proceed along multiple paths
  • Tradeoff between reliability, performance and
    bandwidth?
  • Dynamic node insertion and deletion algorithms
  • Continuous repair and incremental optimization of
    links
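
Multiple roots per document can be derived by hashing the GUID together with a small salt, giving several independent identifiers to route toward. This is a sketch under assumptions: the single-byte salt and the name `salted_root_ids` are illustrative, and the slides do not specify the salting scheme.

```python
import hashlib

def salted_root_ids(guid: bytes, num_roots: int = 3) -> list[str]:
    """Derive one root identifier per salt by hashing salt and GUID together."""
    return [hashlib.sha1(bytes([salt]) + guid).hexdigest()
            for salt in range(num_roots)]

guid = hashlib.sha1(b"example object").digest()
for root_id in salted_root_ids(guid):
    print(root_id)  # each ID would then be mapped to its own root node
```

Since each salted ID lands on a different root, the loss of one root node does not make the document unreachable.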

self-healing
self-optimizing
self-managing
22
OceanStore Technologies II: Rapid Update in an
Untrusted Infrastructure
  • Requirements
  • Scalable coherence mechanism which can operate
    directly on encrypted data without revealing
    information
  • Handle Byzantine failures
  • Rapid dissemination of committed information
  • OceanStore Approach
  • Operations-based interface using conflict
    resolution
  • Modeled after Xerox Bayou => update packets
    include predicate/update pairs which operate on
    encrypted data
  • User signs updates and principal party signs
    commits
  • Committed data multicast to clients
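
A Bayou-style update made of predicate/update pairs can be sketched as follows. The object layout, the version predicate, and the helper names are assumptions for illustration; signatures and encryption are not modeled, and the blocks are treated as opaque ciphertext.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class DataObject:
    version: int = 0
    blocks: list = field(default_factory=list)  # opaque ciphertext blocks

@dataclass
class Update:
    # Ordered (predicate, action) pairs; the first true predicate wins.
    pairs: list[tuple[Callable, Callable]]

def apply_update(obj: DataObject, update: Update) -> bool:
    """Run the action of the first predicate that holds; True means commit."""
    for predicate, action in update.pairs:
        if predicate(obj):
            action(obj)
            obj.version += 1
            return True
    return False  # no predicate held, so the update aborts

def version_is(v: int) -> Callable:
    return lambda obj: obj.version == v

def append_block(block: bytes) -> Callable:
    return lambda obj: obj.blocks.append(block)

obj = DataObject()
first = Update([(version_is(0), append_block(b"\x8a\x11"))])
stale = Update([(version_is(0), append_block(b"\x03\xfe"))])
print(apply_update(obj, first))  # True: the object was still at version 0
print(apply_update(obj, stale))  # False: the version moved on, so it aborts
```

Neither the predicate nor the action needs to read plaintext, which is what lets untrusted servers apply such updates.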

23
Update Model
  • Concurrent updates w/o wide-area locking
  • Conflict resolution
  • Update serialization
  • A master replica?
  • Role of primary tier of replicas
  • All updates submitted to primary tier of replicas
    which chooses a final total order by following
    Byzantine agreement protocol
  • A secondary tier of replicas
  • The result of the updates is multicast down the
    dissemination tree to all the secondary replicas

24
Tentative Updates: Epidemic Dissemination
25
Committed Updates: Multicast Dissemination
26
Data Coding Model
  • Two distinct forms of data: active and archival
  • Active Data in Floating Replicas
  • Latest version of the object
  • Archival Data in Erasure Coded Fragments
  • A permanent, read-only version of the object
  • During commit, previous version coded with
    erasure-code and spread over 100s or 1000s of
    nodes
  • Advantage: any 1/2 or 1/4 of the fragments
    regenerates the data
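
The property that any sufficiently large subset of fragments regenerates the data can be demonstrated with a toy polynomial code in the Reed-Solomon style. The slides do not say which erasure code OceanStore uses, so this is a swapped-in illustration; the field size 257 and the function names are assumptions.

```python
P = 257  # smallest prime above 255, so every byte value is a field element

def _interpolate(points, x):
    """Evaluate at x the unique polynomial through points, modulo P (Lagrange)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data: bytes, n: int):
    """Turn k = len(data) symbols into n fragments (x, y); the first k are the data."""
    points = list(enumerate(data))
    return [(x, _interpolate(points, x)) for x in range(n)]

def decode(fragments, k: int) -> bytes:
    """Rebuild the k data bytes from any k distinct fragments."""
    points = fragments[:k]
    return bytes(_interpolate(points, x) for x in range(k))

data = b"deep"                      # k = 4 symbols
fragments = encode(data, n=8)       # rate 1/2: any 4 of the 8 suffice
survivors = [fragments[1], fragments[4], fragments[6], fragments[7]]
print(decode(survivors, k=4))       # b'deep'
```

With k = 4 and n = 8, half of the fragments can be lost in any pattern; choosing n = 16 for the same data would give the rate 1/4 case. Redundant fragment values can reach 256, so a real encoding would need more than one byte per symbol or a different field.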

27
Floating Replica and Deep Archival Coding
28
Proactive Self-Maintenance
  • Continuous testing and repair of information
  • Slow sweep through all information to make sure
    there are sufficient erasure-coded fragments
  • Continuous reevaluation of risk and
    redistribution of data
  • Slow sweep and repair of metadata/search trees
  • Continuous online self-testing of HW and SW
  • Detects flaky, failing, or buggy components via:
  • fault injection: triggering hardware and software
    error handling paths to verify their
    integrity/existence
  • stress testing: pushing HW/SW components past
    normal operating parameters
  • scrubbing: periodic restoration of potentially
    decaying hardware or software state
  • Automates preventive maintenance
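
The slow sweep over erasure-coded fragments amounts to comparing each object's surviving fragment count against two thresholds. A minimal sketch, assuming the counts have already been gathered; the function name `sweep` and the thresholds `k` and `target` are hypothetical.

```python
def sweep(fragment_counts: dict, k: int, target: int) -> dict:
    """Classify objects by surviving fragments: lost, needs repair, or healthy.

    k is the number of fragments needed to rebuild an object and target is
    the desired redundancy level; both are illustrative assumptions.
    """
    report = {"lost": [], "repair": [], "healthy": []}
    for guid, count in sorted(fragment_counts.items()):
        if count < k:
            report["lost"].append(guid)      # too few fragments to rebuild
        elif count < target:
            report["repair"].append(guid)    # rebuild and re-disseminate
        else:
            report["healthy"].append(guid)
    return report

report = sweep({"a": 32, "b": 20, "c": 7}, k=16, target=32)
print(report)  # {'lost': ['c'], 'repair': ['b'], 'healthy': ['a']}
```

Running the sweep often enough that objects are caught in the repair band, before they drop below k, is what makes the maintenance proactive.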

29
OceanStore Technologies IV: Introspective
Optimization
  • Requirements
  • Reasonable job on global-scale optimization
    problem
  • Take advantage of locality whenever possible
  • Sensitivity to limited storage and bandwidth at
    endpoints
  • Repair of data structures, increasing of
    redundancy
  • Stability in chaotic environment => active
    feedback
  • OceanStore Approach
  • Introspective Monitoring and analysis of
    relationships to cluster information by
    relatedness
  • Time-series analysis of user and data motion
  • Rearrangement and replication in response to
    monitoring
  • Clustered prefetching: fetch related objects
  • Proactive prefetching: get data there before
    needed
  • Rearrangement in response to overload and attack

30
Example: Client Introspection
  • Client observer and optimizer components
  • Greedy agents working on behalf of the client
  • Watches client activity/combines with historical
    info
  • Performs clustering and time-series analysis
  • forwards results to infrastructure (privacy
    issues!)
  • Monitoring of state of network to adapt behaviour
  • Typical Actions
  • cluster related files together
  • prefetch files that will be needed soon
  • Create/destroy floating replicas
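
A client observer that watches activity and suggests prefetches can be sketched with simple successor counting. This is a stand-in for the clustering and time-series analysis named above, not the method OceanStore uses; the function names and the sample access log are made up.

```python
from collections import Counter, defaultdict

def build_successors(access_log):
    """Count how often each object is accessed right after another."""
    followers = defaultdict(Counter)
    for prev, nxt in zip(access_log, access_log[1:]):
        if prev != nxt:
            followers[prev][nxt] += 1
    return followers

def prefetch_candidates(followers, current, limit=2):
    """Return up to `limit` objects most often seen right after `current`."""
    return [obj for obj, _ in followers[current].most_common(limit)]

log = ["inbox", "calendar", "inbox", "calendar",
       "inbox", "todo", "inbox", "calendar"]
followers = build_successors(log)
print(prefetch_candidates(followers, "inbox"))  # ['calendar', 'todo']
```

The same counts could drive clustering, since objects that frequently follow one another are candidates for placement on the same floating replica.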

31
OceanStore: Conclusion
  • The time is now for a Universal Data Utility
  • Ubiquitous computing and connectivity are (almost)
    here!
  • Confederation of utility providers is the right model
  • OceanStore holds all data, everywhere
  • Local storage is a cache on global storage
  • Provides security in an untrusted infrastructure
  • Large scale system has good statistical
    properties
  • Use of introspection for performance and
    stability
  • Quality of individual servers enhances
    reliability
  • Exploits economies of scale to:
  • Provide high-availability and extreme
    survivability
  • Lower maintenance cost
  • self-diagnosis and repair
  • Insensitivity to technology changes: just unplug
    one set of servers, plug in others