Erasure Code Replication - PowerPoint PPT Presentation


1
Erasure Code Replication
  • Presenter: W.K. Lin
  • (The Chinese University of Hong Kong)

2
Why do we need replication?
  • Storage devices can fail.
  • Use replication to increase data availability,
    e.g. RAID.
  • The basic idea of replication:
  • Place copies of the data in different places to
    increase the chance of finding the data.
  • P2P systems often provide replication.

3
Server-less VoD Architecture
  • There is no centralized video server to provide the
    video streaming.
  • Each client in the system stores some of the video
    blocks.
  • The video blocks are stored using an erasure code.
  • It is not necessary to stream from all peers for
    complete video playback.
  • The clients can stream the video from other
    clients.

4
Some Terminology
  • Peers are the computers/storage devices that
    store the data.
  • Peer availability µ is the fraction of time
    that the peer is up/online.
  • File availability A is the probability of recovering
    the file from the replicated data.
  • Storage overhead S is the ratio of the storage
    required after replication to the storage required
    before replication.

5
Whole File Replication
  • Whole file replication replicates the complete
    file.
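  • If the storage overhead is S, then there are S
    copies of the data in the system.
  • File availability (the file is reachable whenever at
    least one of the S peers holding a copy is up):
    $A_w = 1 - (1-\mu)^S$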
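
The whole-file case can be checked numerically with a minimal sketch. It assumes peers are up independently with the same availability µ, so the file is lost only when all S copies are offline; the function names `whole_file_availability` and `copies_needed` are illustrative and do not come from the slides.

```python
def whole_file_availability(mu: float, copies: int) -> float:
    """Probability that at least one of `copies` independent peers is online."""
    return 1.0 - (1.0 - mu) ** copies


def copies_needed(mu: float, target: float) -> int:
    """Smallest number of whole copies whose availability reaches `target`.

    Assumes 0 < mu <= 1 and 0 <= target < 1 so that the loop terminates.
    """
    copies = 1
    while whole_file_availability(mu, copies) < target:
        copies += 1
    return copies


if __name__ == "__main__":
    # How many whole copies are needed for 99% file availability?
    for mu in (0.3, 0.5, 0.7):
        s = copies_needed(mu, 0.99)
        print(f"mu={mu}: S={s}, A_w={whole_file_availability(mu, s):.4f}")
```

With low peer availability the required number of copies grows quickly, which is the sense in which whole file replication costs a lot of storage.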

6
Whole File Replication
  • It is not storage-efficient.

Adapted from "Replication Strategies for Highly
Available Peer to Peer Networks," Ranjita Bhagwan
et al.

7
Erasure Code Replication
  • Instead of replicating the whole file, replicate
    a portion of the file.
  • Principle:
  • A file is divided into b blocks.
  • Use an erasure code to add redundancy to these b
    blocks. We then have n blocks in total.
  • The n file blocks are made dependent on each other:
    each file block carries partial information about
    the other blocks.
  • Any b out of the n blocks are enough to recover
    the original file.
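
The "any b out of n" property can be illustrated with a toy code. This is a minimal sketch, not the code used in the presentation: it assumes a Reed-Solomon-style construction by polynomial interpolation over the prime field GF(257), treats each "block" as a single integer symbol in 0..255, and uses hypothetical helper names (`encode`, `decode`, `_interpolate_at`).

```python
from itertools import combinations

P = 257  # prime modulus; every byte value 0..255 fits in the field


def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through `points`, using Lagrange interpolation mod P."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x - xm) % P
                den = den * (xj - xm) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Turn b data symbols into n coded blocks (x, y), with n <= 256.
    Blocks 1..b carry the data itself; blocks b+1..n are redundancy."""
    points = [(i + 1, d) for i, d in enumerate(data)]
    return [(x, _interpolate_at(points, x)) for x in range(1, n + 1)]


def decode(blocks, b):
    """Recover the b data symbols from any b distinct coded blocks."""
    points = list(blocks)[:b]
    return [_interpolate_at(points, x) for x in range(1, b + 1)]


if __name__ == "__main__":
    data = [72, 105, 33]           # b = 3 symbols
    blocks = encode(data, 6)       # n = 6, so S = n / b = 2
    for subset in combinations(blocks, 3):
        assert decode(subset, 3) == data
    print("every 3-block subset recovers", data)
```

Each coded block is a point on one shared polynomial, which is one concrete way for every block to carry partial information about the others.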

8
Erasure Code Replication
  • Storage overhead $S = n/b$, or $n = Sb$.
  • Since we need any b out of the Sb blocks to
    recover the file, the file availability $A_b$ is
    $A_b = \sum_{i=b}^{Sb} \binom{Sb}{i} \mu^i (1-\mu)^{Sb-i}$
  • Notice that whole file replication is a special
    case of erasure code replication with $b = 1$.
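
The availability of an erasure-coded file can be sketched as a binomial tail: the probability that at least b of the Sb blocks sit on peers that are up. The code below assumes independent peers with a common availability µ and an integer S; `erasure_availability` is an illustrative name.

```python
from math import comb


def erasure_availability(mu: float, S: int, b: int) -> float:
    """P(at least b of the n = S*b blocks are on online peers)."""
    n = S * b
    return sum(
        comb(n, i) * mu**i * (1.0 - mu) ** (n - i)
        for i in range(b, n + 1)
    )


if __name__ == "__main__":
    mu, S = 0.7, 2
    # b = 1 is whole file replication: 1 - (1 - mu)**S
    for b in (1, 2, 5, 10):
        print(f"b={b:2d}: A_b={erasure_availability(mu, S, b):.4f}")
```

This direct form is adequate for small b; for very large b the binomial coefficients exceed floating-point range, and a log-space sum is the safer choice.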

9
Erasure Code Replication
  • Erasure code replication is more storage-efficient.

Adapted from "Replication Strategies for Highly
Available Peer to Peer Networks," Ranjita Bhagwan
et al.

10
Effectiveness of Erasure Code Replication
  • The effectiveness of erasure code replication is
    determined by two factors:
  • the combinatorial effect, i.e. $\binom{Sb}{b} \gg \binom{S}{1}$
  • the peer availability factor $\mu^b (1-\mu)^{Sb-b}$
  • Erasure code replication depends on S, b, and µ.

11
Effectiveness of Erasure Code Replication

12
How Does Erasure Code Replication Perform?
  • File availability A ($A_w$ or $A_b$) for varying µ and S

13
A Related Problem
  • The Lee and Liew paper "Parallel Communications for
    ATM Network Control and Management" points out a
    similar problem:
  • An information string is divided into b parts,
    then encoded into n parts.
  • Any b out of the n parts are enough to recover the
    original information.
  • Very similar to our problem!
  • They prove a necessary bound, $S\mu > 1$, for reliable
    communication.
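
The role of the bound can be seen numerically. The sketch below assumes independent peers and evaluates the binomial tail in log space (via `math.lgamma`) so that large b does not overflow; the parameter values are chosen only for illustration.

```python
import math


def erasure_availability(mu: float, S: int, b: int) -> float:
    """P(at least b of n = S*b blocks are online), for 0 < mu < 1.
    Each binomial term is computed in log space to avoid overflow."""
    n = S * b
    log_mu, log_q = math.log(mu), math.log1p(-mu)
    terms = []
    for i in range(b, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_mu + (n - i) * log_q
        )
        terms.append(math.exp(log_term))
    return math.fsum(terms)


if __name__ == "__main__":
    S = 2
    for mu in (0.6, 0.4):  # S*mu = 1.2 (above the bound) and 0.8 (below)
        for b in (1, 10, 100, 1000):
            a = erasure_availability(mu, S, b)
            print(f"S*mu={S * mu:.1f}, b={b:4d}: A_b={a:.6f}")
```

Above the bound, availability climbs toward 1 as b grows; below it, splitting the file into more blocks drives availability toward 0.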

14
Erasure Code Bound ($S\mu > 1$)
  • The area above the curve defines the region in which
    erasure code replication is preferred for large b.

15
Erasure Code Replication Sensitivity Analysis
  • We need to use a large b in order to benefit from
    erasure code replication.
  • If the system is operating at a level $S\mu = 1$, a
    small fluctuation in a system parameter will harm
    the system.

16
Erasure Code Replication Sensitivity Analysis
  • The system is targeted to operate at $S = 3$,
    $\mu = 0.35$.
  • $S\mu = 1.05 > 1$
  • 10% measurement error of µ.
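
The sensitivity can be sketched by recomputing the file availability when the true peer availability differs from the assumed 0.35 by 10%. The code assumes independent peers and an illustrative block count b = 1000, which is not a value taken from the slides.

```python
import math


def erasure_availability(mu: float, S: int, b: int) -> float:
    """P(at least b of n = S*b blocks are online), for 0 < mu < 1,
    summed in log space so that large b does not overflow."""
    n = S * b
    log_mu, log_q = math.log(mu), math.log1p(-mu)
    return math.fsum(
        math.exp(
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_mu + (n - i) * log_q
        )
        for i in range(b, n + 1)
    )


if __name__ == "__main__":
    S, b, assumed_mu = 3, 1000, 0.35  # b is an illustrative choice
    cases = (
        ("assumed", assumed_mu),
        ("10% lower", assumed_mu * 0.9),
        ("10% higher", assumed_mu * 1.1),
    )
    for label, mu in cases:
        a = erasure_availability(mu, S, b)
        print(f"{label:>10}: mu={mu:.3f}, S*mu={S * mu:.3f}, A_b={a:.4f}")
```

A 10% overestimate of µ moves Sµ from 1.05 to 0.945, across the bound, and for this b the availability drops from above 0.9 to below 0.1.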

17
Related Work I
  • Markov chain model for a simple birth/death
    model

Adapted from "Design and Analysis of a
Fault-Tolerant Mechanism for a Server-Less
Video-On-Demand System," Lee and Yeung

18
Related Work I
  • Mean time to failure of the model
  • Result

19
Related Work II
  • Another Markov model

c: connected state, mean time to stay ?
u: disconnected state, mean time to stay µ
d: dead state
a: the probability of going to disconnected state d

Adapted from "Data Durability in Peer to Peer
Storage Systems," Gil Utard, Antoine Vernois

20
Related Work II
Storage overhead $S = 3$

21
Conclusion
  • Traditionally, erasure code replication has been
    very successful, e.g. RAID.
  • A strict bound, $S\mu > 1$, has to be satisfied for
    a system to gain from erasure code replication.
  • Erasure code replication is sensitive to system
    measurement errors.
  • This partly explains why erasure code replication is
    not seen in P2P systems.

22
Future Directions
  • Most analyses are based on the assumption that
    all peers have the same availability level.
  • In a real system, peers might have different
    failure and recovery rates.
  • Replica distribution and discovery are open
    for research:
  • How should replicas be placed and located if the
    peers have different availabilities?
  • If the system fails, how can the lost replicas be
    recovered?

23
End of presentation

24
Appendix
  • Proof

Let X be a binomial random variable having mean
$E[X] = Sb\mu$ and variance $\sigma^2 = Sb\mu(1-\mu)$.
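
The rest of the proof is not preserved in this transcript. One standard way to continue from this mean and variance, which may differ from the presenter's route, is Chebyshev's inequality. The sketch assumes X counts the online blocks, so the file is recoverable exactly when $X \ge b$, and that $S\mu > 1$.

```latex
% Assumption: X ~ Binomial(Sb, \mu) counts online blocks; S\mu > 1.
\begin{align*}
E[X] - b &= b\,(S\mu - 1) > 0, \\
P(X < b) &\le P\bigl(\lvert X - Sb\mu \rvert \ge b\,(S\mu - 1)\bigr) \\
         &\le \frac{Sb\mu(1-\mu)}{b^{2}(S\mu - 1)^{2}}
          = \frac{S\mu(1-\mu)}{b\,(S\mu - 1)^{2}}
          \;\longrightarrow\; 0 \quad (b \to \infty).
\end{align*}
```

Under these assumptions the failure probability vanishes as b grows, so $A_b \to 1$ whenever $S\mu > 1$.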
25
Appendix
  • Similarly,