Solid State Storage (SSS) System Error Recovery

Slides: 26

Provided by: peopleVc48

Learn more at: https://www.vcu.edu

Transcript and Presenter's Notes

1
Solid State Storage (SSS) System Error Recovery

LHO 08

For NASA Langley Research Center
2
Background

NASA Langley Research Center is building a system
to record streaming video and other data when the
Space Shuttle docks with the Space Station.
This data will be used to develop algorithms that
will enable the next generation of the space
station to perform autonomous docking.
Due to the harsh environment in space the data
will be stored in a RAID array of solid state
SATA drives with the capability of recovering
data even if two drives fail.
This Solid State Storage (SSS) system is being
developed at VCU.
We will look at the portion of the system that
deals with drive error recovery.

3
Proposed SSS System Overview
[Figure: proposed SSS system overview; surviving label: "To data recorder"]
4
SSS Data Recovery

The Solid State Storage (SSS) system will consist
of six solid state data drives. The discussion
will be directed to this specific configuration.
The data will be sector striped across these six
drives.
A modified RAID 6 system capable of recovering
data from two corrupted sectors in a stripe is
proposed.
Optimized for long single-thread transfers that
are multiples of the entire stripe.

5
RAID 5

To illustrate concepts and implications consider
a RAID 5 implementation.
RAID 5 uses a striped array with rotating parity.
Optimized for short, multithreaded transfers.
Capable of recovering from a single drive
failure.

6
RAID 5 system consisting of three data drives and
rotating parity. Four stripes for sectors A, B,
C, and D are shown.
7
Rotating Parity

Why rotating parity?
The following steps are necessary to update a
single data sector in a stripe.
The old data sector and the parity sector for the
stripe must be read.
Compute the new parity using the new data sector,
old data sector, and old parity.
Write new data sector and new parity sector.
Thus, to write to a data sector both the data
sector and parity sector must be read and
written.
Since there are many data drives, a fixed parity
drive would be accessed much more frequently than
a data drive.
This excessive access of a single parity drive is
avoided by rotating parity across all drives.
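
The read-modify-write steps above can be sketched in Python. This is only an illustration under stated assumptions, not the presenters' implementation: the helper names `xor_bytes` and `update_parity` and the tiny 4-byte sectors are made up for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Parity after one data sector changes: P_new = P_old ^ D_old ^ D_new."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# Hypothetical stripe: three data sectors (4 bytes each, for illustration only).
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])

# Updating sector 1 costs two reads (old data, old parity)
# and two writes (new data, new parity).
new_sector = b"\xff\x00\xff\x00"
parity = update_parity(stripe[1], parity, new_sector)
stripe[1] = new_sector

# The incrementally updated parity matches a full recomputation.
full_parity = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])
```

Every small write touches the parity sector, which is why a fixed parity drive becomes the bottleneck for random single-sector updates.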

8
Rotating parity not needed in SSS

The SSS is required to store long data streams,
not random sectors.
Make the size of these streams a multiple of the
stripe size.
An entire stripe with parity will be buffered.
The entire stripe with parity will be
simultaneously written to all drives.
It is not necessary to first read the drives.
The SSS will always read and write entire
stripes.
Easier to implement.
Faster access.
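
A full-stripe write of this kind might look like the following Python sketch. The 512-byte sector size, the zero padding of a short final stripe, and the function name `split_into_stripes` are assumptions for illustration; only the single XOR parity sector is computed here.

```python
SECTOR_SIZE = 512                        # assumed sector size in bytes
DATA_DRIVES = 6                          # six data drives, as in the SSS
STRIPE_SIZE = SECTOR_SIZE * DATA_DRIVES  # data bytes per stripe


def split_into_stripes(stream: bytes):
    """Cut a stream into full stripes: six data sectors plus one parity sector.

    The stream is zero-padded up to a multiple of the stripe size (an
    assumption; the slides only say stream sizes are stripe multiples).
    """
    stream += bytes(-len(stream) % STRIPE_SIZE)
    stripes = []
    for start in range(0, len(stream), STRIPE_SIZE):
        sectors = [
            stream[start + d * SECTOR_SIZE:start + (d + 1) * SECTOR_SIZE]
            for d in range(DATA_DRIVES)
        ]
        # Parity is computed from the buffered stripe alone: no drive reads.
        parity = bytearray(SECTOR_SIZE)
        for sector in sectors:
            for j, byte in enumerate(sector):
                parity[j] ^= byte
        stripes.append(sectors + [bytes(parity)])
    return stripes


# A 3328-byte stream pads out to two full stripes.
stripes = split_into_stripes(bytes(range(256)) * 13)
```

Each returned stripe can then be handed to all drives at once, one sector per drive.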

9
Parity

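Parity encoding is given by
$$P = \bigoplus_{i} D_i$$
where $D_i$ represents a data byte in a sector on
drive $i$ and $\oplus$ is exclusive OR.
If both sides of the above equation are exclusive
ORed with $P$, then
$$0 = P \oplus \bigoplus_{i} D_i$$
$D_5$, for example, can be recovered by
$$D_5 = P \oplus \bigoplus_{i \neq 5} D_i$$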
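
As a small worked illustration of the parity relations above, the Python sketch below encodes P over six data sectors and rebuilds the sector on drive 5. Numbering the drives 0 through 5, the 4-byte sectors, and the helper name `xor_parity` are assumptions for the example.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length byte strings together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Six hypothetical data sectors D0..D5 (4 bytes each for brevity).
data = [bytes([i, i + 1, i + 2, i + 3]) for i in range(0, 60, 10)]
p = xor_parity(data)

# Drive 5 is known to be bad: XOR P with the five surviving sectors.
survivors = data[:5]
recovered_d5 = xor_parity([p] + survivors)
```

The same call recovers any one sector, provided we know which one is bad.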

10
Parity problem

Using parity it is easy to recover data on a
single drive if we know that drive is bad.
We may have data corruption on a drive without
the entire drive failing.
Undetectable based on parity alone.
Propose to include a 32-bit CRC in each sector.
Simple to implement.
Less than 1% overhead.
In RAID 6, this will ensure that as long as a
stripe has no more than two bad sectors, the data
in that stripe can be recovered.
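
A per-sector CRC check could be sketched as follows. The slides do not say which CRC-32 polynomial is used, where the CRC sits in the sector, or what the sector size is; this example assumes 512-byte sectors, the CRC in the last four bytes, and the CRC-32 provided by Python's `zlib`. The function names are hypothetical.

```python
import struct
import zlib

SECTOR_SIZE = 512                # assumed sector size in bytes
PAYLOAD_SIZE = SECTOR_SIZE - 4   # assumed layout: last 4 bytes hold the CRC


def seal_sector(payload: bytes) -> bytes:
    """Append a 32-bit CRC to the payload to form a full sector."""
    if len(payload) != PAYLOAD_SIZE:
        raise ValueError("payload must fill the sector exactly")
    return payload + struct.pack("<I", zlib.crc32(payload))


def sector_is_good(sector: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored value."""
    payload, stored = sector[:PAYLOAD_SIZE], sector[PAYLOAD_SIZE:]
    return struct.pack("<I", zlib.crc32(payload)) == stored


good = seal_sector(bytes(i % 256 for i in range(PAYLOAD_SIZE)))

# Flip a single bit to model silent corruption on an otherwise healthy drive.
corrupted = bytearray(good)
corrupted[100] ^= 0x01
corrupted = bytes(corrupted)

overhead_percent = 100 * 4 / SECTOR_SIZE  # 4 CRC bytes per 512-byte sector
```

With the bad sectors identified by their CRCs, the parity equations only have to solve for sectors at known positions.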

11
Key Conclusions

Write data as entire stripes.
Use a fixed parity drive.
Include sector CRC.

12
RAID 6 (modified)

Use two fixed parity drives (P and Q).
Data can be recovered if two sectors in a stripe
are corrupted.
P parity is the same as RAID 5 (simple XOR).
Easy to encode and easy to recover data.
Q parity is more complicated.

13
Q parity encoding
The Q parity is a Reed-Solomon code given by
$$Q = \bigoplus_{i} g_i \otimes D_i$$
where $\otimes$ is Galois Field (GF) multiplication
and $g_i$ is a constant. For $i < 8$ it turns out
that $g_i = 2^i$. For larger $i$ it is not as
simple; for example, $g_8 = 29$. But for the SSS
application Q simplifies to
$$Q = \bigoplus_{i} 2^i \otimes D_i$$
The problem is how to compute the GF
multiplication.
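
The constants and the Q encoding can be illustrated with a short Python sketch. It assumes the reduction constant 0x11D (which is consistent with the stated value of g8, 29), drive numbering from 0, and a bit-serial shift-and-XOR multiply; all function names are hypothetical.

```python
def gf_double(a: int) -> int:
    """Multiply a GF(2^8) element by 2: shift left, reduce on overflow."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11D  # assumed field polynomial x^8 + x^4 + x^3 + x^2 + 1
    return a


def gf_mul(a: int, b: int) -> int:
    """GF(2^8) multiply by shift-and-XOR (no lookup tables)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a = gf_double(a)
        b >>= 1
    return product


# Constants g_i generated by repeated doubling, starting from g_0 = 1.
# g[0..7] are the ordinary powers of two; g[8] wraps around to 29.
g = [1]
for _ in range(8):
    g.append(gf_double(g[-1]))


def q_parity(data_bytes):
    """Q for one byte position: XOR of g_i (x) D_i over the data drives."""
    q = 0
    for i, d in enumerate(data_bytes):
        q ^= gf_mul(g[i], d)
    return q


stripe_bytes = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66]  # hypothetical D0..D5
q = q_parity(stripe_bytes)
```

With only six data drives every constant is a plain power of two, so each product reduces to a few shifts and conditional XORs.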
14
GF multiplication

In ordinary arithmetic, multiplication can be
accomplished by summing the logs and taking the
inverse log.
GF multiplication is typically accomplished using
lookup tables to find the GF log and inverse log.
The addition is modulo 255.
See Xilinx application note XAPP731, Hardware
Accelerator for RAID 6 Parity Generation / Data
Recovery Controller.

15
(No Transcript)
16
(No Transcript)
17
Examples
18
Examples
Note: $A \otimes B = 0$ if $A = 0$ or $B = 0$. This
is a special case and cannot be computed using
logs. It is also worth noting that
$A \otimes 1 = A$. This does follow from using
logs, since $\log_{GF}(\text{0x01}) = 0$.
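
A table-based multiply along these lines might be written as below. This is a sketch rather than the XAPP731 design: the tables are built by repeated multiplication by 2 with the assumed reduction constant 0x11D, and the names `GF_EXP`, `GF_LOG`, and `gf_mul` are hypothetical.

```python
# Build GF(2^8) log and inverse-log tables by repeatedly multiplying by 2.
GF_EXP = [0] * 255   # GF_EXP[k] = 2^k in GF(2^8)  (inverse log)
GF_LOG = [0] * 256   # GF_LOG[a] = k with 2^k == a (GF_LOG[0] is unused)

value = 1
for k in range(255):
    GF_EXP[k] = value
    GF_LOG[value] = k
    value <<= 1
    if value & 0x100:
        value ^= 0x11D  # assumed field polynomial


def gf_mul(a: int, b: int) -> int:
    """Multiply via logs: add the logs modulo 255, then take the inverse log."""
    if a == 0 or b == 0:
        return 0  # zero has no log, so it is handled as a special case
    return GF_EXP[(GF_LOG[a] + GF_LOG[b]) % 255]


product = gf_mul(0x80, 0x02)   # overflows past bit 7 and is reduced
unchanged = gf_mul(0x57, 0x01) # log of 0x01 is 0, so the operand is unchanged
```

The explicit zero test is the only branch; every other product is two table reads, one addition modulo 255, and a third table read.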
19
Elaboration on Galois Field Mathematics

Évariste Galois (1832)
Established many of the ideas of group theory.
Left only sixty pages of mathematical writings.
Mortally wounded in a duel at age 20.
Most of his major contributions stem from a
letter written the night before the duel.
His work has had great impact.
Provides powerful tool for investigating
fundamental mathematical problems.
Roots of algebraic equations.
GF theory provides simple proof that an angle
cannot be trisected using only compass and
unmarked straightedge.
This had baffled mathematicians since the time of
Euclid.
Recently applied to computer design and
data-communication systems.

20
Galois Field Mathematics

A Galois Field is an algebraic structure
$\langle G, \oplus, \otimes \rangle$ where $G$ is a
set consisting of $2^n$ elements, $\oplus$ is
addition mod 2 (bitwise XOR), and $\otimes$ is GF
multiplication. The math is similar to ordinary
arithmetic.
$\oplus$ and $\otimes$ are commutative and
associative.
Distributive such that
$$A \otimes (B \oplus C) = (A \otimes B) \oplus (A \otimes C)$$
We are only concerned with GF($2^8$), where the set
$G$ has 256 elements. We will use a hex byte to
specify the elements.
Then $A \oplus A = \text{0x00}$,
$A \otimes \text{0x00} = \text{0x00}$,
$A \otimes \text{0x01} = A$.
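
These properties can be spot-checked numerically. The Python sketch below uses a shift-and-XOR multiply with the assumed reduction constant 0x11D and tests the laws on a small random sample of bytes; it is a sanity check, not a proof, and the names are hypothetical.

```python
import itertools
import random


def gf_mul(a: int, b: int) -> int:
    """GF(2^8) multiply by shift-and-XOR, reducing with 0x11D (assumed)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return product


random.seed(0)
samples = [random.randrange(256) for _ in range(12)]
pairs = list(itertools.product(samples, repeat=2))
triples = list(itertools.product(samples, repeat=3))

# Addition is XOR, so it is trivially commutative; check multiplication.
commutative = all(gf_mul(a, b) == gf_mul(b, a) for a, b in pairs)
associative = all(
    gf_mul(gf_mul(a, b), c) == gf_mul(a, gf_mul(b, c)) for a, b, c in triples
)
distributive = all(
    gf_mul(a, b ^ c) == gf_mul(a, b) ^ gf_mul(a, c) for a, b, c in triples
)
# The three identities, checked for every byte value.
identities = all(
    a ^ a == 0x00 and gf_mul(a, 0x00) == 0x00 and gf_mul(a, 0x01) == a
    for a in range(256)
)
```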

21
GF($2^8$)

The GF log lookup tables are generated based on
what in GF theory is called a primitive
polynomial. Primitive polynomials have certain
properties that lead to the error correction
techniques.
GF($2^8$) is generated using the primitive
polynomial
$$x^8 + x^4 + x^3 + x^2 + 1$$
This is the same primitive polynomial used to
determine the feedback path for an 8-bit maximum
count linear feedback shift register (LFBSR).
The LFBSR can be used to perform GF
multiplication.

22
The 8-bit LFBSR
[Figure: 8-bit LFBSR, stages Q0 to Q7, also drawn in reverse order so that the most significant bit is at the left]
A shift has the same effect as $\otimes\, 2$. In VHDL:
Q <= Q(6) & Q(5) & Q(4) & (Q(3) XOR Q(7)) &
     (Q(2) XOR Q(7)) & (Q(1) XOR Q(7)) & Q(0) & Q(7);
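
The shift can be modeled in Python to check it against a multiply by 2. This sketch copies the bit assignments from the VHDL line; the comparison function `times_two` and its reduction constant 0x1D are assumptions derived from those feedback taps, and the function names are hypothetical.

```python
def lfsr_shift(q: int) -> int:
    """One shift of the 8-bit register, bit for bit as in the VHDL assignment."""
    def bit(n: int) -> int:
        return (q >> n) & 1

    fb = bit(7)  # feedback from the most significant bit
    new_bits = [
        bit(6),       # new Q(7)
        bit(5),       # new Q(6)
        bit(4),       # new Q(5)
        bit(3) ^ fb,  # new Q(4)
        bit(2) ^ fb,  # new Q(3)
        bit(1) ^ fb,  # new Q(2)
        bit(0),       # new Q(1)
        fb,           # new Q(0)
    ]
    value = 0
    for b in new_bits:  # most significant bit first
        value = (value << 1) | b
    return value


def times_two(a: int) -> int:
    """Multiply by 2 in GF(2^8): shift left, XOR 0x1D if bit 7 was set."""
    shifted = (a << 1) & 0xFF
    return shifted ^ 0x1D if a & 0x80 else shifted


# Count the states visited before the register returns to its start value.
state, period = lfsr_shift(0x01), 1
while state != 0x01:
    state = lfsr_shift(state)
    period += 1
```

A period of 255 means the register visits every nonzero byte once, which is the maximum-count property that lets successive shifts serve as a log table.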
23
(No Transcript)
24
(No Transcript)
25
Galois Field Division
