6.5 Cache Memory - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: 6.5 Cache Memory


1
6.5 Cache Memory
  • more effective, but expensive
  • modern disk drives include a small amount of
    internal cache
  • Much smaller in size than MM
  • Operates at or near the speed of the processor
  • Sits between MM and the CPU
  • Contains copies of sections of MM

6-17
2
  • A portion of RAM used to speed up access to data
    on a disk
  • Memory that the microprocessor can access more
    quickly than it can access regular RAM
  • L1 and L2 are levels of cache memory in a
    computer
  • L1 cache is usually built onto the microprocessor
    chip itself
  • L2 is usually a separate static RAM (SRAM) chip.

6-17
3
  • If the computer processor can find the data it
    needs for its next operation in cache memory, it
    will save time compared to having to get it from
    RAM
  • Although caching improves performance, there is
    some risk involved. If the computer crashes (due
    to a power failure, for example), the system may
    not have time to copy the cache back to the disk.
    In this case, whatever changes you made to the
    data will be lost.

6-17
4
Cache-MM Interface
  • Assume an access to MM causes a block of K words
    to be transferred to the CM
  • The block transferred is stored in CM as a single
    unit called a slot/line/page
  • Once copied, individual words within a line can
    be accessed by the CPU
  • Data transfer and storage in the cache is done in
    hardware (i.e. the OS doesn't know about the cache)

6-18
5
Typical Cache Organisation
6-19
6
Cache Operation
  • CPU requests content of memory location
  • Check CM for this data
  • If present, get from CM
  • Otherwise, read the required block from MM to CM
  • Deliver from CM to CPU
  • CM includes tags to identify which block of MM is
    in each CM slot

6-20
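The lookup flow above can be sketched in a few lines of Python. This is a toy model, not the slides' design: block size, the dictionary-based store, and the function names are all assumptions for illustration.

```python
# Toy model of the cache operation above: check CM, and on a miss
# fetch the whole block from MM before delivering the word to the CPU.
BLOCK_SIZE = 4          # K words per block (assumed)
cache = {}              # block number -> list of K words

def read(address, main_memory):
    block = address // BLOCK_SIZE
    word = address % BLOCK_SIZE
    if block in cache:                      # hit: deliver from CM
        return cache[block][word]
    start = block * BLOCK_SIZE              # miss: copy block from MM to CM
    cache[block] = main_memory[start:start + BLOCK_SIZE]
    return cache[block][word]

mm = list(range(100))
assert read(10, mm) == 10   # miss: block 2 loaded into CM
assert read(11, mm) == 11   # hit: same block, no MM access
```

Real caches do this entirely in hardware, as the slides note; the dictionary here stands in for the tag-match logic.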
7
  • Since MM >> CM, blocks are mapped to specific lines
    in CM through the use of a mapping function
  • 3 mapping functions
  • Direct
  • Associative
  • Set-associative

6-21
8
Direct Mapping
  • Each MM block is assigned to a specific line in
    the CM
  • If M = 64, C = 4:
  • Line 0 can hold blocks 0, 4, 8, 12, ...
  • Line 1 can hold blocks 1, 5, 9, 13, ...
  • Line 2 can hold blocks 2, 6, 10, 14, ...
  • Line 3 can hold blocks 3, 7, 11, 15, ...
  • Direct mapping cache treats a MM address as 3
    distinct fields
  • Tag identifier
  • Line number identifier
  • Word identifier

6-22
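The three-field split can be sketched with bit masks. The field widths below are assumptions chosen to match the M = 64, C = 4 example (2 line bits for 4 lines, and an assumed 2 word bits, i.e. 4 words per block):

```python
# Direct mapping: low bits select the word, middle bits select the
# cache line, and the remaining high bits are the tag.
WORD_BITS = 2   # 4 words per block (assumption)
LINE_BITS = 2   # C = 4 lines

def split(address):
    word = address & ((1 << WORD_BITS) - 1)
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (WORD_BITS + LINE_BITS)
    return tag, line, word

# Blocks 0, 4, 8, ... (addresses 0, 16, 32, ...) all map to line 0.
assert split(0)[1] == 0
assert split(16)[1] == 0
assert split(4)[1] == 1   # block 1 -> line 1
```

The tag is what gets stored alongside the line so a later reference can verify that the resident block is the right one.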
9
Direct Mapping Cache Organisation
6-23
10
  • Word identifier specifies the specific word in a
    cache line that is to be read
  • Line identifier specifies the physical line in
    the cache that will hold the referenced address
  • The tag is stored in a cache along with the data
    words of the line
  • For every memory reference that the CPU makes,
    the specific line that would hold the reference
    is determined
  • The tag held in the line is checked to see if the
    correct block is in the cache

6-24
11
6-25
12
Associative Mapping
  • Lets a block be stored in any cache line that
    is not in use
  • Must examine each line in the cache (through its
    tag) to find the right memory block
  • Address has 2 fields: word and tag
  • Implement cache in 2 parts:
  • The lines themselves in SRAM
  • The tag storage in associative memory

6-26
13
Associative Mapping Cache Organisation
6-27
14
Set Associative Mapping
  • Compromise between direct and fully associative
    mappings that builds on the strength of both
  • Divide cache into a number of sets (v), each set
    holding a number of lines (k)
  • A MM block can be stored in any one of the k lines
    in a set
  • If a set can hold X lines, the cache is
    referred to as an X-way set-associative cache

Commonly 2- or 4-way
6-28
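Set selection itself is just a modulo, as this small sketch shows (the set count and block numbers are illustrative, not from the slides):

```python
# Set-associative mapping: a block maps to set (block mod v) and may
# then occupy any of that set's k lines.
V_SETS = 2   # v sets (assumed)
K_WAYS = 2   # k lines per set (assumed), i.e. a 2-way cache

def set_index(block):
    return block % V_SETS

assert set_index(0) == 0 and set_index(5) == 1
# Blocks 0, 2, 4, ... all compete for the K_WAYS lines of set 0,
# so two of them can be resident at once -- unlike direct mapping.
```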
15
Set-Associative Mapping Cache Organisation
6-29
16
Line Replacement Algorithms
Algorithms to determine which line to replace
when an associative or set-associative cache is full
  • LRU (Least Recently Used)
  • FIFO (First In First Out)
  • LFU (Least Frequently Used)
  • Random

6-30
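Of the four policies listed, LRU is the most common; here is a minimal sketch of it using an ordered dictionary to track recency (the class and its interface are illustrative, not from the slides):

```python
# LRU replacement: on a hit, mark the line most recently used;
# on a miss with a full cache, evict the least recently used line.
from collections import OrderedDict

class LRUCache:
    def __init__(self, lines):
        self.lines = lines
        self.store = OrderedDict()   # block -> data, oldest first

    def access(self, block, data=None):
        if block in self.store:
            self.store.move_to_end(block)    # now most recently used
            return True                      # hit
        if len(self.store) >= self.lines:
            self.store.popitem(last=False)   # evict the LRU line
        self.store[block] = data
        return False                         # miss

c = LRUCache(2)
c.access(1); c.access(2); c.access(1)   # block 1 is most recent
c.access(3)                             # evicts block 2, not block 1
assert 1 in c.store and 2 not in c.store
```

FIFO would differ only in skipping the `move_to_end` step: eviction order would then depend on load time, not last use.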
17
Write policy: how to update the original copy of
the line in MM
  • Write through
  • Anytime a word in the cache is changed, it is also
    changed in MM
  • Both copies always agree
  • Generates lots of memory writes to MM
  • Write back
  • During a write, only change the contents of the
    cache
  • Update MM only when the cache line is to be
    replaced
  • Causes cache coherency problems
  • Complex circuitry to avoid this problem

6-31
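The two policies can be contrasted in a toy model. The class and names below are illustrative assumptions, not the slides' design:

```python
# Write-through updates MM on every write; write-back only marks the
# line dirty and updates MM when the line is evicted.
class Cache:
    def __init__(self, mm, write_through):
        self.mm, self.wt = mm, write_through
        self.lines, self.dirty = {}, set()

    def write(self, addr, value):
        self.lines[addr] = value
        if self.wt:
            self.mm[addr] = value    # write-through: both copies agree
        else:
            self.dirty.add(addr)     # write-back: defer the MM update

    def evict(self, addr):
        if addr in self.dirty:       # flush a dirty line before eviction
            self.mm[addr] = self.lines[addr]
            self.dirty.discard(addr)
        self.lines.pop(addr, None)

mm = {0: 1}
wb = Cache(mm, write_through=False)
wb.write(0, 42)
assert mm[0] == 1    # MM is stale until the line is replaced
wb.evict(0)
assert mm[0] == 42
```

The stale-MM window in the write-back case is exactly the coherency risk the slide mentions.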
18
Number of Caches
  • Single vs. 2-level
  • On chip cache
  • Modern CPU chips have an onboard cache (L1), e.g.
    Pentium 16KB, PowerPC up to 64KB
  • L1 provides best performance gains
  • Secondary, off chip cache (L2) provides higher
    speed access to MM
  • Generally 512KB or less otherwise not
    cost-effective

6-32
19
  • Unified vs. split
  • Unified cache stores data and instructions in one
    cache
  • Only 1 cache to design and operate
  • Cache is flexible and can balance allocation of
    space to instructions or data to best fit the
    execution of the program, i.e. higher hit ratio
  • Split cache uses 2 caches (1 for instructions
    and 1 for data)
  • Must build and manage 2 caches
  • Static allocation of cache sizes
  • Can outperform unified cache in systems that
    support parallel execution and pipelining
    (reduces cache contention)
  • The trend favors split caches

6-33
21
6.6 External Memory
  • Magnetic Disks
  • Optical Disks
  • Magnetic Tape
  • RAID

6-34
22
Magnetic Disks
  • The disk is a metal or plastic platter coated with
    the magnetizable material
  • Data is recorded onto and later read from the
    disk using a conducting coil, the head
  • Data is organized into concentric rings, called
    tracks, on the platter
  • Tracks are separated by gaps
  • Disk rotates at a constant speed

6-35
23
6-36
24
Disk characteristics
  • Single vs. multiple platters per drive (each
    platter has its own R/W head)
  • Fixed vs. movable head
  • Fixed head has a head per track
  • Movable head uses one head per platter
  • Removable vs. non-removable platters
  • Data access times:
  • Seek time: time to position the head over the
    correct track
  • Rotational latency: time for the desired sector to
    come under the head
  • Access time = seek time + rotational latency
  • Block transfer time: time to read a block (sector)
    off the disk and transfer it to MM

6-37
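A quick back-of-envelope calculation ties the timing terms together. The rotation speed and seek time below are assumed example values, not figures from the slides; average rotational latency is half a revolution:

```python
# Access time = seek time + rotational latency (before transfer starts).
rpm = 7200                                       # assumed rotation speed
ms_per_revolution = 60_000 / rpm                 # ~8.33 ms
avg_rotational_latency_ms = ms_per_revolution / 2  # half a revolution
avg_seek_ms = 9.0                                # assumed average seek

access_ms = avg_seek_ms + avg_rotational_latency_ms
# ~9 + 4.17 = ~13.17 ms before a single byte is transferred
assert abs(avg_rotational_latency_ms - 4.1667) < 0.001
```

The block transfer time is then added on top, proportional to the sector size and the rate at which bits pass under the head.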
25
Optical Disks
6-38
26
  • WORM: Write Once, Read Many
  • Users can produce CD-ROMs in limited quantities
  • A specially prepared disk is written to using a
    medium-power laser
  • Can be read many times, just like normal
    CD-ROMs
  • Permits archival storage

6-39
27
  • Erasable optical disk
  • Combine laser and magnetic technology to permit
    information storage
  • Laser heats an area that can then have e-field
    orientation changed to alter information storage
  • Can be detected using polarized light during reads

6-40
28
Magnetic Tapes
  • The first kind of secondary memory
  • Still widely used
  • Popular for back ups
  • Very cheap but very slow
  • Sequential access
  • Data is organized as records, with physical
    gaps between records
  • One word is stored across the width of the tape
    and read using multiple read/write heads

6-41
29
RAID Technology
  • RAID (Redundant Array of Independent Disks),
    developed at Berkeley
  • Several parallel disks operating as a single unit
  • 6 levels: 0 to 5

6-42
30
6-43
31
RAID 0
  • No redundancy techniques are used
  • Data is distributed over all disks in the array
  • Data is divided into strips for actual storage
  • Can be used to support high data transfer rates
    by having the block transfer size be a multiple
    of the strip size
  • Can support low response time by having block
    transfer size equal a strip (support multiple
    strip transfers in parallel)

6-44
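The strip-to-disk layout can be sketched as simple arithmetic. The disk count and strip numbering below are illustrative assumptions:

```python
# RAID 0 striping: logical strip i lands on disk (i mod N) at strip
# offset (i // N) on that disk -- consecutive strips hit different
# disks, which is what enables parallel transfers.
N_DISKS = 4   # assumed array size

def locate(strip):
    return strip % N_DISKS, strip // N_DISKS   # (disk, offset on disk)

assert locate(0) == (0, 0)
assert locate(5) == (1, 1)   # strip 5 -> disk 1, second strip there
```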
32
RAID 1
  • All disks are mirrored (duplicated)
  • Data is stored on a disk and its mirror
  • Read from either the disk or its mirror
  • Write must be done to both the disk and mirror
  • Fault recovery is easy, i.e. use the data on the
    mirror
  • Expensive

6-45
33
RAID 2
  • All disks are used for every access; disks are
    synchronized together
  • Data strips are small (a byte)
  • An error-correcting code is computed across all
    disks and stored on additional disks
  • Uses fewer disks than RAID 1 but still expensive

6-46
34
RAID 3
  • Like RAID 2 but only a single redundant disk is
    used
  • Parity bit is computed for the set of individual
    bits in the same position on the disks
  • If a drive fails, parity information on the
    redundant disk can be used to calculate the data
    from the failed disk

6-47
35
RAID 4
  • Access to individual strips rather than to all
    disks at once like RAID 3
  • Bit-by-bit parity is calculated across
    corresponding strips on each disk
  • Parity bits stored in the redundant disk
  • Write penalty
  • For every write to a strip, the parity strip must
    also be recalculated and written
  • Thus 1 logical write equals 2 physical disk
    accesses

6-48
36
RAID 5
  • Parity information is distributed on the data
    disks in a round robin scheme
  • No parity disk needed

6-49
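The parity mechanism shared by RAID 3, 4, and 5 can be demonstrated directly: parity is the XOR of the data strips, so any one lost strip equals the XOR of the survivors and the parity. The strip values below are made-up examples:

```python
# XOR parity: losing any single strip is recoverable, because
# a ^ b ^ c ^ (a ^ b ^ c) == 0, so the missing term "falls out".
from functools import reduce

strips = [0b1010, 0b0110, 0b1100]            # example data strips
parity = reduce(lambda a, b: a ^ b, strips)  # stored on the parity disk

lost = strips[1]                             # pretend disk 1 fails
recovered = strips[0] ^ strips[2] ^ parity   # XOR of survivors + parity
assert recovered == lost
```

RAID 5's only change is where the parity strip lives: it rotates round-robin across the data disks instead of occupying a dedicated disk.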
37
6.7 Error Correction
  • Semiconductor memories are subject to errors
  • Hard (permanent) errors
  • Soft (transient) errors
  • Memory systems include logic to detect and / or
    correct errors
  • Width of memory word is increased
  • Number of parity bits required depends on the
    level of detection and correction needed

6-50
38
General Error Detection and Correction
  • A single error is a single bit flip; multiple
    bit flips can occur in a word
  • 2^M valid data words, where M is the data word
    length
  • 2^(M+K) codeword combinations in the memory, where
    K is the number of code (parity) bits
  • Distribute the 2^M valid data words among the
    2^(M+K) codeword combinations such that the
    distance between valid words is sufficient to
    distinguish the error

6-51
39
Single Error Detection and Correction (SEC)
  • For each valid codeword, there will be 2^K - 1
    invalid codewords
  • 2^K - 1 must be large enough to identify which of
    the M + K bit positions is in error
  • Therefore 2^K - 1 >= M + K
  • 8-bit data: 4 check bits
  • 32-bit data: 6 check bits
  • Bit position n is checked by those bits Ci whose
    subscripts i sum to n, e.g. position 10 (data bit
    M6) is checked by bits C2 and C8.

6-52
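The 2^K - 1 >= M + K rule can be checked mechanically; this small sketch finds the smallest K for a given data width M and reproduces the two cases quoted above:

```python
# Smallest number of check bits K such that 2^K - 1 >= M + K,
# i.e. the 2^K - 1 syndromes can name any of the M + K positions.
def check_bits(m):
    k = 1
    while (1 << k) - 1 < m + k:
        k += 1
    return k

assert check_bits(8) == 4    # 8-bit data needs 4 check bits
assert check_bits(32) == 6   # 32-bit data needs 6 check bits
```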
40
(Table: codeword layout by bit position — check bits C1, C2, C4, C8 occupy
the power-of-two positions 1, 2, 4, 8, and data bits M1-M8 occupy the
remaining position numbers 3, 5, 6, 7, 9, 10, 11, 12)
6-53
41
Example: 8-bit input word 00111001
C1 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
C2 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
C4 = 0 ⊕ 0 ⊕ 1 ⊕ 0 = 1
C8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0
Thus, the check bits (C8 C4 C2 C1) in this case are 0111
(an odd number of 1s gives 1; an even number of 1s gives 0)
6-54
42
Say data bit 3 is in error (i.e. changed from 0 to 1), so the input data
is now 00111101:
C1 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
C2 = 1 ⊕ 1 ⊕ 1 ⊕ 1 ⊕ 0 = 0
C4 = 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
C8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0
The new check bits generated are 0001. Comparing the two sets of check
bits gives the syndrome word:
  C8 C4 C2 C1
   0  1  1  1
 ⊕ 0  0  0  1
 = 0  1  1  0
The result is 0110, indicating that bit position 6, which contains
data bit 3, is in error.
6-55
43
  • To detect errors, compare the check bits read
    from memory to those computed during the read
    operation by using XOR
  • If the result of the XOR is 0000, no error
  • If non-zero, the numerical value of the result
    indicates the bit position in error
  • If the XOR result was 0110, bit position 6 (M3) is
    in error
  • Double error detection can be added by adding
    another check bit that implements a parity check
    for the whole word of M + K bits

6-56
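The worked example above can be reproduced end to end in code. This is a sketch under the layout from the earlier slide (data bits at the non-power-of-two positions 3, 5, 6, 7, 9, 10, 11, 12, with M1 the least significant data bit); the function name is illustrative:

```python
# Hamming SEC sketch: compute check bits for the 8-bit word 00111001,
# flip data bit 3 (M3), and recover the error position via the syndrome.
def check_bits_for(data_bits):
    # data_bits[0] is M1; data bits occupy non-power-of-two positions.
    positions = [p for p in range(1, 13) if p & (p - 1) != 0]
    code = {p: b for p, b in zip(positions, data_bits)}
    checks = {}
    for c in (1, 2, 4, 8):
        checks[c] = 0
        for p, b in code.items():
            if p & c:            # Ci covers positions whose bit i is set
                checks[c] ^= b
    return checks

data = [1, 0, 0, 1, 1, 1, 0, 0]    # 00111001, M1 first (LSB)
good = check_bits_for(data)        # C8 C4 C2 C1 = 0 1 1 1

data[2] ^= 1                       # flip data bit 3 (M3, position 6)
bad = check_bits_for(data)         # C8 C4 C2 C1 = 0 0 0 1

syndrome = sum(c for c in (1, 2, 4, 8) if good[c] ^ bad[c])
assert syndrome == 6               # position 6 holds data bit M3
```

The syndrome is just the XOR of stored and recomputed check bits, read as a binary position number, matching the 0111 ⊕ 0001 = 0110 comparison on the slide.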
44
Chapter Exercises
  • Suggest reasons why RAMs traditionally have been
    organized as only one bit per chip whereas ROMs
    are usually organized with multiple bits per
    chip.
  • Suppose an 8-bit data word stored in memory is
    11000010. Using the Hamming algorithm, determine
    what check bits would be stored in memory with
    the data word.

45
  • Berita Harian, 29/1/2005
  • A data-processing memory chip claimed to be the
    fastest in the world for multimedia applications.
  • According to its manufacturer, Samsung Electronics
    Co. (Samsung), this 256-megabit XDR DRAM
    (eXtreme-Data-Rate Dynamic Random Access Memory)
    chip (pictured) is 10 times faster than the memory
    chips used in today's video equipment, game
    consoles, digital TVs, servers, and workstations.
    Samsung, the world's second-largest maker of
    computer memory chips, has begun manufacturing the
    chip, demand for which is expected to reach
    800 million units by 2009.