Lecture 4: A Case for RAID (Part 2) - PowerPoint PPT Presentation

Title: Lecture 4: A Case for RAID (Part 2)


1
Lecture 4: A Case for RAID (Part 2)
  • Prof. Shahram Ghandeharizadeh
  • Computer Science Department
  • University of Southern California

2
Smaller Inexpensive Disks
  • 25% annual reduction in size; 40% annual drop in
    price

1 GB, Year 2008: IBM Microdrive @ $125; 1 inch in height, weighs 16 grams (about half an ounce)
1 GB, Year 1980: IBM 3380 @ $40,000; size of a refrigerator, 550 pounds (250 Kg)
3
Inexpensive Disks
  • Less than 9 Cents / Gigabyte of storage

4
Challenge: Managing Data is Expensive
  • Cost of managing data is $100K/TB/year.
  • High availability: downtime is estimated at
    thousands of dollars per minute.
  • Data loss results in lost productivity.
  • 20 Megabytes of accounting data requires 21 days
    and costs $19K to reproduce.
  • 50% of companies that lose their data due to a
    disaster never re-open; 90% go out of business
    within 2 years!

6
MTTF, MTBF, MTTR, AFR
  • MTBF: Mean Time Between Failures
  • Designed for repairable devices.
  • Number of hours from when the system is started
    until it fails.
  • MTTF: Mean Time To Failure
  • Designed for non-repairable devices such as
    magnetic disk drives.
  • Disks of 2008 are more than 40 times more
    reliable than disks of 1988.
  • MTTR: Mean Time To Repair
  • Number of hours required to replace a disk drive
    AND reconstruct the data stored on the failed
    disk drive.
  • AFR: Annualized Failure Rate
  • Computed by assuming a case temperature
    (40 degrees centigrade), power-on hours per year
    (say 8,760, i.e., 24x7), and an average of 250
    motor start/stop cycles per year.

7
Focus on MTTF & MTTR
  • MTTF: Mean Time To Failure
  • Designed for non-repairable devices such as
    magnetic disk drives.
  • Disks of 2008 are more than 40 times more
    reliable than disks of 1988.
  • MTTR: Mean Time To Repair
  • Number of hours required to replace a disk drive
    AND reconstruct the data stored on the failed
    disk drive.

8
Assumptions
  • MTTF of a disk is independent of other disks in a
    RAID.
  • Assume
  • The MTTF of a disk is 100 years, and
  • An array of 1000 such disks.
  • Then some disk in the array is expected to fail
    about once every 37 days (100 years / 1000).
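
The 37-day figure follows from dividing the per-disk MTTF by the number of disks, under the independence assumption stated on this slide. A minimal Python sketch of that arithmetic (the function name is illustrative, and a 365-day year is assumed):

```python
DAYS_PER_YEAR = 365  # assumption: leap years are ignored

def days_between_array_failures(disk_mttf_years, num_disks):
    """Expected days between disk failures anywhere in the array,
    assuming disks fail independently at a constant rate."""
    return disk_mttf_years * DAYS_PER_YEAR / num_disks

days = days_between_array_failures(disk_mttf_years=100, num_disks=1000)
print(f"{days:.1f} days")  # 36.5 days, which the slide rounds to 37
```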

9
RAID
  • RAID organizes D data disks into nG groups where
    each group consists of G data disks and C parity
    disks. Example:
  • D = 8
  • G = 4
  • C = 1
  • nG = 8/4 = 2

Parity Group 1: Disk 1, Disk 2, Disk 3, Disk 4, Parity 1
Parity Group 2: Disk 5, Disk 6, Disk 7, Disk 8, Parity 2
11
RAID With 1 Group
  • With G disks in a group and C check disks, a
    failure is encountered when
  • A disk in the group fails, AND
  • A second disk fails before the failed disk of
    step 1 is repaired.
  • MTTF of a group of disks with RAID is
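
The formula on this slide was an image and is missing from the transcript. For reference, the RAID paper by Patterson, Gibson, and Katz, which this lecture follows, gives the group MTTF in the form below. The subscripted names are that paper's notation, not text recovered from the slide.

```latex
\mathrm{MTTF}_{\mathrm{Group}}
  = \frac{\mathrm{MTTF}_{\mathrm{Disk}}}{G + C}
    \times
    \frac{1}{P(\text{another failure in the group before repair})}
```

The first factor is the expected time until any one of the G + C disks in the group fails; the second scales it by how unlikely a second failure is during the repair window.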

12
RAID With 1 Group (Cont)
  • Probability of another failure
  • MTTR includes the time required to
  • Step 1: Replace the failed disk drive,
  • Step 2: Reconstruct the content of the failed
    disk.
  • Performing step 2 in a lazy manner increases the
    MTTR, and with it the probability of another
    failure.
  • What happens if we increase the number of data
    disks in a group?
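
The probability expression on this slide was also an image. In the RAID paper by Patterson, Gibson, and Katz, when MTTR is much smaller than the disk MTTF, it is approximated as MTTR / (MTTF_Disk / (G + C - 1)). The Python sketch below assumes that approximation; the function names and the repair times are illustrative, not from the lecture.

```python
HOURS_PER_YEAR = 8760

def prob_second_failure(mttf_disk_hours, mttr_hours, G, C):
    """Approximate probability that one of the remaining G + C - 1 disks
    fails during the repair window (valid when MTTR << disk MTTF)."""
    return mttr_hours * (G + C - 1) / mttf_disk_hours

def mttf_group_hours(mttf_disk_hours, mttr_hours, G, C):
    """Group MTTF: expected time to the first failure in the group,
    divided by the probability of a second failure before repair."""
    first_failure = mttf_disk_hours / (G + C)
    return first_failure / prob_second_failure(mttf_disk_hours, mttr_hours, G, C)

MTTF_DISK = 100 * HOURS_PER_YEAR  # 100-year disk MTTF, as assumed earlier
for G in (4, 10, 25):
    for mttr in (1, 24):  # hypothetical repair times in hours
        years = mttf_group_hours(MTTF_DISK, mttr, G, C=1) / HOURS_PER_YEAR
        print(f"G={G:2d} MTTR={mttr:2d}h -> group MTTF ~ {years:,.0f} years")
```

Under these assumptions the group MTTF shrinks roughly with the square of the group size and in proportion to MTTR, which is one way to approach the question about adding data disks to a group.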

13
RAID with nG Groups
  • With nG groups, the Mean Time To Failure of the
    RAID is computed in a similar manner
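
The slide's formula is missing from the transcript. In the notation of the RAID paper by Patterson, Gibson, and Katz (an assumption about what the slide showed), the array fails when any one of its nG groups fails, so the MTTF of a single group is divided by nG:

```latex
\mathrm{MTTF}_{\mathrm{RAID}}
  = \frac{\mathrm{MTTF}_{\mathrm{Group}}}{n_G}
  = \frac{(\mathrm{MTTF}_{\mathrm{Disk}})^2}
         {(D + C \cdot n_G)\,(G + C - 1)\,\mathrm{MTTR}}
```

Here D + C·nG is the total number of disks in the array, data plus check.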

14
Review
  • RAID 1 and 3 were presented in the previous
    lecture.
  • Here is a quick review.

15
RAID 1: Disk Mirroring
  • Contents of disks 1 and 2 are identical.
  • Redundant paths keep data available in the
    presence of either a controller or disk failure.
  • A write operation by a CPU is directed to both
    disks.
  • A read operation is directed to one of the disks.
  • Each disk might be reading different sectors
    simultaneously.
  • Tandem's architecture

CPU 1
Controller 1
Controller 2
Disk 1
Disk 2
16
RAID 3: Small Block Reads
  • Bit-interleaved.
  • Bad news: a small read of less than the group
    size requires reading the whole group.
  • E.g., a read of one sector requires a read of 4
    sectors.
  • One parity group has a read rate identical to
    that of one disk.

Bit stream: 01011110101010000001101001111

Disk 1    Disk 2    Disk 3    Disk 4    Parity
0 1       0 1       1 1       0 1       1 0
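
Bit interleaving and the parity column can be sketched in a few lines of Python. The 8-bit input is a hypothetical value chosen so that the per-disk bits match the ones in the slide's figure, and the function names are illustrative.

```python
from functools import reduce
from operator import xor

def bit_interleave(bits, num_data_disks):
    """Deal bits round-robin onto the data disks: bit i goes to disk i % G."""
    disks = [[] for _ in range(num_data_disks)]
    for i, bit in enumerate(bits):
        disks[i % num_data_disks].append(bit)
    return disks

def parity_bits(disks):
    """Parity bit j is the XOR of bit j on every data disk."""
    return [reduce(xor, column) for column in zip(*disks)]

data = [0, 0, 1, 0, 1, 1, 1, 1]  # hypothetical 8-bit unit of data
disks = bit_interleave(data, num_data_disks=4)
print(disks)               # [[0, 1], [0, 1], [1, 1], [0, 1]]
print(parity_bits(disks))  # [1, 0]
```

Because consecutive bits live on different disks, even a small read needs every data disk in the group, which is why one group delivers the read rate of one disk.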
17
RAID 3: Small Block Reads
  • Given a large number of disks, say D = 12, enhance
    performance by constructing several parity
    groups, say 3.
  • With G = 4 disks per group and D = 8, the number
    of small read requests RAID 3 supports, relative
    to one disk, equals the number of groups (2).
    Number of groups is D/G.

Parity Group 1: Disk 1, Disk 2, Disk 3, Disk 4, Parity 1
Parity Group 2: Disk 5, Disk 6, Disk 7, Disk 8, Parity 2
18
Any Questions?
19
A Few Questions?
  • Assume one instance of RAID-1 organization. What
    are the values for
  • D
  • G
  • C
  • nG

20
A Few Questions?
  • Assume one instance of RAID-1 organization. What
    are the values for
  • D = 1
  • G = 1
  • C = 1
  • nG = 1

21
A Few Questions?
  • Assume one instance of RAID-1 organization. What
    are the values for
  • D = 1
  • G = 1
  • C = 1
  • nG = 1
  • Are the availability characteristics of the
    following Level 3 RAID better than those of
    RAID 1?

Parity Group: Disk 1, Disk 2, Disk 3, Disk 4, Parity 1
22
RAID 4
  • Enhances performance of small
    reads/writes/read-modify-write. How?
  • Interleave data across disks at the granularity
    of a transfer unit. Minimum size is a sector.
  • Parity block ECC 1 is the exclusive-or (XOR) of
    the bits in blocks a, b, c, and d.

Disk 1    Disk 2    Disk 3    Disk 4    Parity
Block a   Block b   Block c   Block d   ECC 1
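
A minimal Python sketch of how a parity block such as ECC 1 is formed from the data blocks of a stripe. The 2-byte block contents are hypothetical stand-ins for sectors, and the helper names are illustrative.

```python
from functools import reduce

def xor_blocks(x, y):
    """Bytewise XOR of two equally sized blocks."""
    return bytes(p ^ q for p, q in zip(x, y))

def parity_of(blocks):
    """Parity block: XOR of all data blocks in the stripe."""
    return reduce(xor_blocks, blocks)

# Hypothetical 2-byte stand-ins for Blocks a, b, c, and d
a, b, c, d = b"\x01\x02", b"\x10\x20", b"\xaa\xbb", b"\x0f\x0f"
ecc1 = parity_of([a, b, c, d])
print(ecc1.hex())  # b496
```

XORing the parity block with all four data blocks yields all zeros, which is the property that recovery from a single disk failure relies on.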
23
RAID 4
  • Small read retrieves its block from one disk.
  • Now, 4 requests referencing blocks on different
    data disks may proceed in parallel.
  • When compared with 1 disk, throughput of a D disk
    system is D times higher.

Disk 1    Disk 2    Disk 3    Disk 4    Parity
Block a   Block b   Block c   Block d   ECC 1
24
RAID 4 Failures (Cont)
  • If Disk 2 fails, a small read for Block b
    retrieves blocks a, c, d, and ECC 1 from disks 1,
    3, 4, and the Parity disk to compute the missing
    block. What is the throughput relative to one
    disk now?
  • Once Disk 2 is replaced with a new one, its
    content is reconstructed either eagerly or in a
    lazy manner. The system cannot be too lazy
    because we want to minimize MTTR.

Disk 1    Disk 2    Disk 3    Disk 4    Parity
Block a   Block b   Block c   Block d   ECC 1
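
The degraded read described on this slide can be sketched as follows: XORing the surviving data blocks with the parity block recovers the block on the failed disk. Block contents and helper names are hypothetical.

```python
from functools import reduce

def xor_blocks(x, y):
    """Bytewise XOR of two equally sized blocks."""
    return bytes(p ^ q for p, q in zip(x, y))

# Hypothetical stripe contents before the failure
a, b, c, d = b"\x01\x02", b"\x10\x20", b"\xaa\xbb", b"\x0f\x0f"
ecc1 = reduce(xor_blocks, [a, b, c, d])

# Disk 2 has failed: Block b is unavailable, so read the other four blocks
surviving = [a, c, d, ecc1]
recovered_b = reduce(xor_blocks, surviving)
print(recovered_b == b)  # True
```

Note that the recovery reads one block from every surviving disk in the group, so a single small read occupies the whole group.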
25
RAID 4 Failures (Cont)
  • If the Parity disk fails, read of data blocks may
    proceed as in normal mode of operation.
  • Once the Parity disk is replaced, content of new
    Parity disk is constructed either eagerly or
    lazily.

Disk 1    Disk 2    Disk 3    Disk 4    Parity
Block a   Block b   Block c   Block d   ECC 1
26
RAID 4: Small Writes
  • Performance of small writes is improved.
  • To write Block b:
  • Read the old Block b and the old parity block
    ECC 1,
  • Compute the new parity using the old Block b, the
    new Block b, and the old parity:
  • New parity = (old block XOR new block) XOR old
    parity ECC 1
  • Write the new Block b and the new parity block.
  • A write requires 4 accesses: 2 reads and 2
    writes.

Disk 1    Disk 2    Disk 3    Disk 4    Parity
Block a   Block b   Block c   Block d   ECC 1
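
The small-write steps on this slide can be sketched in Python as below. Disks are modeled as plain lists of blocks, and the block contents and function names are hypothetical; the point is that the parity is updated from the old block, the new block, and the old parity alone, without reading Blocks a, c, or d.

```python
from functools import reduce

def xor_blocks(x, y):
    """Bytewise XOR of two equally sized blocks."""
    return bytes(p ^ q for p, q in zip(x, y))

def small_write(data_disk, parity_disk, stripe, new_block):
    """Update one block and its parity; returns the number of disk accesses."""
    old_block = data_disk[stripe]      # read 1: old data block
    old_parity = parity_disk[stripe]   # read 2: old parity block
    new_parity = xor_blocks(xor_blocks(old_block, new_block), old_parity)
    data_disk[stripe] = new_block      # write 1: new data block
    parity_disk[stripe] = new_parity   # write 2: new parity block
    return 4

# Hypothetical one-stripe array: four data disks and one parity disk
disks = [[b"\x01\x02"], [b"\x10\x20"], [b"\xaa\xbb"], [b"\x0f\x0f"]]
parity = [reduce(xor_blocks, (disk[0] for disk in disks))]

accesses = small_write(disks[1], parity, stripe=0, new_block=b"\xff\x00")
# The incrementally updated parity matches a full recomputation
print(parity[0] == reduce(xor_blocks, (disk[0] for disk in disks)))  # True
```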
27
RAID 4: Bottlenecks
  • For writes, the parity disk is a bottleneck.
  • Two different writes, to Block b and Block g,
    must read ECC 1 and ECC 2 from the Parity disk.
    A queue will form on the Parity disk.
  • Performance of small writes is the same as
    RAID 3, D/2G.

Disk 1    Disk 2    Disk 3    Disk 4    Parity
Block a   Block b   Block c   Block d   ECC 1
Block e   Block f   Block g   Block h   ECC 2
28
RAID 4 Summary
29
RAID 5: Resolve the Bottleneck
  • Distribute data and check blocks across all disks.

Disk 1    Disk 2    Disk 3    Disk 4    Disk 5
Block a   Block b   Block c   Block d   ECC 1
Block e   Block f   Block g   ECC 2     Block h
Block i   Block j   ECC 3     Block k   Block l
Block m   ECC 4     Block n   Block o   Block p
ECC 5     Block q   Block r   Block s   Block t
30
RAID 5: Resolve the Bottleneck
  • Writes of Blocks a and j may proceed in parallel
    now.

Disk 1    Disk 2    Disk 3    Disk 4    Disk 5
Block a   Block b   Block c   Block d   ECC 1
Block e   Block f   Block g   ECC 2     Block h
Block i   Block j   ECC 3     Block k   Block l
Block m   ECC 4     Block n   Block o   Block p
ECC 5     Block q   Block r   Block s   Block t
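
A small Python sketch of why the two writes no longer collide. It assumes the parity block rotates one disk to the left per stripe (ECC 1 on Disk 5, ECC 2 on Disk 4, ECC 3 on Disk 3, and so on), which is a reading of the slide's figure rather than something the transcript states; the function names are illustrative.

```python
NUM_DISKS = 5

def parity_disk(stripe):
    """Disk (1-based) holding the parity of a stripe (0-based), assuming
    parity rotates right to left: stripe 0 -> Disk 5, stripe 1 -> Disk 4."""
    return NUM_DISKS - (stripe % NUM_DISKS)

def disks_touched_by_write(stripe, data_disk):
    """A small write touches the block's data disk and the stripe's parity disk."""
    return {data_disk, parity_disk(stripe)}

write_a = disks_touched_by_write(stripe=0, data_disk=1)  # Block a, parity ECC 1
write_j = disks_touched_by_write(stripe=2, data_disk=2)  # Block j, parity ECC 3
print(write_a.isdisjoint(write_j))  # True
```

With a single dedicated parity disk, as in RAID 4, both writes would include that disk and would have to queue on it.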
31
RAID 5: Read Performance
  • Check disks service read requests.
  • With D disks broken into nG groups, the number of
    parity disks is nG × C, where nG = D/G.
  • When compared with one disk, the throughput of a
    D disk system is D + CD/G times higher.

Disk 1    Disk 2    Disk 3    Disk 4    Disk 5
Block a   Block b   Block c   Block d   ECC 1
Block e   Block f   Block g   ECC 2     Block h
Block i   Block j   ECC 3     Block k   Block l
32
RAID 5: Write Performance
  • For writes, read the referenced block and its
    parity block. Compute the new parity block.
    Write the new data block and its parity block.
  • Continue to use the parity disk.
  • With D disks broken into nG groups, the number of
    parity disks is nG × C, where nG = D/G.
  • When compared with one disk, the throughput of a
    D disk system is D/4 + (CD/G)/4 times higher.

Disk 1    Disk 2    Disk 3    Disk 4    Disk 5
Block a   Block b   Block c   Block d   ECC 1
Block e   Block f   Block g   ECC 2     Block h
Block i   Block j   ECC 3     Block k   Block l
33
RAID 5: R-M-W Performance
  • For R-M-W, the read and write of the data block
    come for free.
  • The referenced block is already retrieved. Must
    perform one extra disk I/O to read the parity
    block. Compute the new parity block. Write the
    new data block and its parity block.
  • Continue to use the parity disk.
  • With D disks broken into nG groups, the number of
    parity disks is nG × C, where nG = D/G.
  • When compared with one disk, the throughput of a
    D disk system is D/2 + (CD/G)/2 times higher.

Disk 1    Disk 2    Disk 3    Disk 4    Disk 5
Block a   Block b   Block c   Block d   ECC 1
Block e   Block f   Block g   ECC 2     Block h
Block i   Block j   ECC 3     Block k   Block l
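
The three RAID 5 throughput expressions from the read, write, and R-M-W slides can be compared side by side with a short Python sketch. The function name and the example values D = 8, G = 4, C = 1 are illustrative; the formulas are the ones stated on those slides.

```python
def raid5_relative_throughput(D, G, C=1):
    """Small-request throughput relative to one disk, per the slide formulas."""
    total_disks = D + C * D / G  # data disks plus check disks, all serving I/O
    return {
        "small_read": total_disks,             # D + CD/G
        "small_write": total_disks / 4,        # 2 reads + 2 writes per write
        "read_modify_write": total_disks / 2,  # only the parity read and write are extra
    }

result = raid5_relative_throughput(D=8, G=4)
print(result)
# {'small_read': 10.0, 'small_write': 2.5, 'read_modify_write': 5.0}
```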
34
RAID 5 Summary
35
RAID 5 Summary
  • Significant improvement in the performance of
    small writes/R-M-W

36
RAID Summary
  • If your workload consists of small R-M-W
    operations, which RAID would you choose?