Title: 6.5 Cache Memory

6.5 Cache Memory
- More effective, but expensive
- Modern disk drives include a small amount of internal cache
- Relatively smaller in size than MM
- Operates at or near the speed of the processor
- Sits between MM and the CPU
- Contains copies of sections of MM
- A portion of RAM used to speed up access to data on a disk
- Cache memory is memory that the computer microprocessor can access more quickly than it can access regular RAM
- L1 and L2 are levels of cache memory in a computer
- L1 cache is usually built onto the microprocessor chip itself
- L2 is usually a separate static RAM (SRAM) chip
- If the computer processor can find the data it needs for its next operation in cache memory, it will save time compared to having to get it from RAM
- Although caching improves performance, there is some risk involved. If the computer crashes (due to a power failure, for example), the system may not have time to copy the cache back to the disk. In this case, whatever changes you made to the data will be lost.

Cache-MM Interface
- Assume an access to MM causes a block of K words to be transferred to the CM
- The block transferred is stored in CM as a single unit called a slot/line/page
- Once copied, individual words within a line can be accessed by the CPU
- Data transfer and storage in the cache is done in h/w (i.e. the OS doesn't know about the cache)

Typical Cache Organisation (figure)

Cache Operation
- CPU requests the content of a memory location
- Check CM for this data
- If present, get from CM
- Otherwise, read the required block from MM into CM
- Deliver from CM to the CPU
- CM includes tags to identify which block of MM is in each CM slot (see the sketch below)
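
This read path can be summarised in code. The sketch below is illustrative only: the sizes, names, and the use of the full block number as the tag are assumptions made for brevity (the direct placement of blocks into slots anticipates the mapping functions introduced later), not details from the slides.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define NUM_LINES   4              /* CM slots (assumed size)           */
    #define BLOCK_WORDS 4              /* words per block (assumed size)    */
    #define MM_WORDS    64

    static uint32_t mm[MM_WORDS];      /* main memory (MM)                  */

    struct line {
        int      valid;                /* does this slot hold a block?      */
        uint32_t block;                /* which MM block is in the slot     */
        uint32_t data[BLOCK_WORDS];    /* copy of the block's words         */
    };
    static struct line cm[NUM_LINES];  /* cache memory (CM)                 */

    /* CPU requests the content of one memory word. */
    uint32_t cache_read(uint32_t addr)
    {
        uint32_t word  = addr % BLOCK_WORDS;   /* word within the block     */
        uint32_t block = addr / BLOCK_WORDS;   /* MM block number           */
        uint32_t slot  = block % NUM_LINES;    /* candidate CM slot         */

        /* Check CM: a valid slot whose tag matches the requested block.    */
        if (!cm[slot].valid || cm[slot].block != block) {
            /* Miss: read the required block from MM into CM.               */
            memcpy(cm[slot].data, &mm[block * BLOCK_WORDS], sizeof cm[slot].data);
            cm[slot].block = block;
            cm[slot].valid = 1;
        }
        return cm[slot].data[word];            /* deliver from CM to the CPU */
    }

    int main(void)
    {
        mm[42] = 0xBEEF;                                      /* put something in MM   */
        printf("first read:  0x%X\n", (unsigned)cache_read(42)); /* miss: fetched from MM */
        printf("second read: 0x%X\n", (unsigned)cache_read(42)); /* hit: served from CM   */
        return 0;
    }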

- Since MM >> CM, blocks are mapped to specific lines in CM through the use of a mapping function
- 3 mapping functions
- Direct
- Associative
- Set-associative

Direct Mapping
- Each MM block is assigned to a specific line in the CM
- If M = 64 and C = 4 (64 MM blocks, 4 cache lines):
- Line 0 can hold blocks 0, 4, 8, 12, ...
- Line 1 can hold blocks 1, 5, 9, 13, ...
- Line 2 can hold blocks 2, 6, 10, 14, ...
- Line 3 can hold blocks 3, 7, 11, 15, ...
- A direct mapping cache treats a MM address as 3 distinct fields (decoded in the sketch below)
- Tag identifier
- Line number identifier
- Word identifier
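
For a concrete picture of the three fields, the fragment below decodes a word address under assumed geometry matching the example above (64 MM blocks, 4 cache lines) plus an assumed block size of 4 words; the shift/mask constants follow from those assumptions, they are not given on the slides.

    #include <stdio.h>

    /* Assumed geometry: 4 words/block, 4 lines, 64 blocks -> 8-bit word address:
     *   bits 1..0  word identifier
     *   bits 3..2  line number identifier
     *   bits 7..4  tag identifier
     */
    int main(void)
    {
        unsigned addr = 0xB6;                 /* example word address                  */
        unsigned word = addr        & 0x3;    /* word within the block                 */
        unsigned line = (addr >> 2) & 0x3;    /* cache line the block maps to          */
        unsigned tag  =  addr >> 4;           /* distinguishes blocks sharing the line */

        printf("addr=0x%02X -> tag=%u line=%u word=%u\n", addr, tag, line, word);
        return 0;
    }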

Direct Mapping Cache Organisation (figure)
- The word identifier specifies the specific word in a cache line that is to be read
- The line identifier specifies the physical line in the cache that will hold the referenced address
- The tag is stored in the cache along with the data words of the line
- For every memory reference that the CPU makes, the specific line that would hold the reference is determined
- The tag held in that line is checked to see if the correct block is in the cache

Associative Mapping
- Lets a block be stored in any cache line that is not in use
- Must examine each line in the cache (through the tag id) to find the right memory block (see the sketch below)
- The address has 2 fields: word and tag
- Implement the cache in 2 parts
- The lines themselves in SRAM
- The tag storage in associative memory
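
A software analogue of the tag search is shown below. In hardware the comparison is done in parallel by the associative (content-addressable) memory; the sequential loop, the sizes, and the names here are illustrative assumptions.

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_LINES   8              /* assumed number of cache lines       */
    #define BLOCK_WORDS 4              /* assumed words per block             */

    struct line {
        int      valid;
        uint32_t tag;                  /* full block number serves as the tag */
        uint32_t data[BLOCK_WORDS];
    };
    static struct line cm[NUM_LINES];

    /* Return the index of the line holding `block`, or -1 on a miss.
     * Hardware checks every tag at once; software can only loop.            */
    int assoc_lookup(uint32_t block)
    {
        for (int i = 0; i < NUM_LINES; i++)
            if (cm[i].valid && cm[i].tag == block)
                return i;
        return -1;                     /* miss: any unused line may be filled */
    }

    int main(void)
    {
        cm[5].valid = 1; cm[5].tag = 37;                    /* pretend block 37 was loaded */
        printf("block 37 -> line %d\n", assoc_lookup(37));  /* 5        */
        printf("block  9 -> line %d\n", assoc_lookup(9));   /* -1: miss */
        return 0;
    }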

Associative Mapping Cache Organisation (figure)

Set-Associative Mapping
- Compromise between direct and fully associative mappings that builds on the strengths of both
- Divide the cache into a number of sets (v), each set holding a number of lines (k)
- A MM block can be stored in any one of the k lines in a set (see the sketch below)
- If a set can hold X lines, the cache is referred to as an X-way set-associative cache; commonly 2- or 4-way
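
The lookup combines both ideas: the block number selects a set directly, and only the k lines of that set are searched associatively. The sketch below uses assumed sizes (v = 4 sets, k = 2 ways) and illustrative names.

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_SETS 4                 /* v, assumed                        */
    #define WAYS     2                 /* k, assumed: 2-way set associative */

    struct line {
        int      valid;
        uint32_t tag;                  /* block number / NUM_SETS           */
    };
    static struct line cm[NUM_SETS][WAYS];

    /* Return the way holding `block` within its set, or -1 on a miss. */
    int set_assoc_lookup(uint32_t block)
    {
        uint32_t set = block % NUM_SETS;      /* direct part: pick the set        */
        uint32_t tag = block / NUM_SETS;      /* tag distinguishes blocks in it   */

        for (int way = 0; way < WAYS; way++)  /* associative part: search k lines */
            if (cm[set][way].valid && cm[set][way].tag == tag)
                return way;
        return -1;
    }

    int main(void)
    {
        unsigned block = 13;                  /* maps to set 13 % 4 = 1    */
        cm[1][1].valid = 1;
        cm[1][1].tag   = block / NUM_SETS;    /* = 3                       */
        printf("block %u -> way %d of set %u\n",
               block, set_assoc_lookup(block), block % NUM_SETS);
        return 0;
    }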

Set-Associative Mapping Cache Organisation (figure)

Line Replacement Algorithms
Algorithms to determine which line to replace when an associative or set-associative cache is full (an LRU sketch follows below):
- LRU (Least Recently Used)
- FIFO (First In First Out)
- LFU (Least Frequently Used)
- Random
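
As an example of the first policy, the sketch below keeps a use counter per line and evicts the least recently used one. Real hardware tends to use cheaper approximations (for a 2-way set a single use bit is enough); the counter scheme, sizes, and names here are assumptions for illustration.

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_LINES 4                        /* assumed cache size */

    struct line {
        int      valid;
        uint32_t tag;
        uint64_t last_used;                    /* timestamp of last access       */
    };
    static struct line cm[NUM_LINES];
    static uint64_t now;                       /* monotonically increasing clock */

    /* Pick a victim line: an empty line if one exists, else the LRU line. */
    int lru_victim(void)
    {
        int victim = 0;
        for (int i = 0; i < NUM_LINES; i++) {
            if (!cm[i].valid)
                return i;                      /* free line, no eviction needed */
            if (cm[i].last_used < cm[victim].last_used)
                victim = i;                    /* older access -> better victim */
        }
        return victim;
    }

    /* Call on every hit or fill to mark the line as most recently used. */
    void lru_touch(int i) { cm[i].last_used = ++now; }

    int main(void)
    {
        for (int i = 0; i < NUM_LINES; i++) {  /* fill the cache            */
            cm[i].valid = 1;
            lru_touch(i);
        }
        lru_touch(0);                          /* re-use line 0             */
        printf("victim = line %d\n", lru_victim()); /* line 1 is now the LRU one */
        return 0;
    }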

Write Policy
- To update the original copy of the line in MM, a write policy is needed (a sketch contrasting the two follows below)
- Write through
- Any time a word in CM is changed, it is also changed in MM
- Both copies always agree
- Generates lots of memory writes to MM
- Write back
- During a write, only change the contents of the cache
- Update MM only when the cache line is to be replaced
- Causes cache coherency problems
- Complex circuitry to avoid this problem
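
The difference between the two policies comes down to when MM is updated. The sketch below is illustrative (names and sizes assumed); a single dirty bit per line is the usual write-back bookkeeping.

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_LINES   4
    #define BLOCK_WORDS 4

    struct line {
        int      valid, dirty;             /* dirty: CM newer than MM (write back) */
        uint32_t block;
        uint32_t data[BLOCK_WORDS];
    };
    static struct line cm[NUM_LINES];
    static uint32_t mm[64];

    /* Write through: update both copies so they always agree. */
    void write_through(struct line *l, uint32_t word, uint32_t value)
    {
        l->data[word] = value;
        mm[l->block * BLOCK_WORDS + word] = value;   /* extra MM write every time */
    }

    /* Write back: update only the cache and remember that MM is stale. */
    void write_back(struct line *l, uint32_t word, uint32_t value)
    {
        l->data[word] = value;
        l->dirty = 1;
    }

    /* On replacement, a dirty line must be copied back to MM first. */
    void evict(struct line *l)
    {
        if (l->valid && l->dirty)
            for (uint32_t w = 0; w < BLOCK_WORDS; w++)
                mm[l->block * BLOCK_WORDS + w] = l->data[w];
        l->valid = l->dirty = 0;
    }

    int main(void)
    {
        struct line *l = &cm[0];
        l->valid = 1; l->block = 3;
        write_through(l, 0, 111);    /* MM word 12 updated immediately      */
        write_back(l, 1, 222);       /* only CM changes; line marked dirty  */
        evict(l);                    /* dirty line copied back to MM now    */
        printf("mm[12]=%u mm[13]=%u\n", (unsigned)mm[12], (unsigned)mm[13]);
        return 0;
    }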

Number of Caches
- Single vs. 2-level
- On-chip cache
- Modern CPU chips have an onboard cache (L1), e.g. Pentium 16KB, PowerPC up to 64KB
- L1 provides the best performance gains
- A secondary, off-chip cache (L2) provides higher-speed access to MM
- Generally 512KB or less, otherwise not cost-effective
- Unified vs. split
- A unified cache stores data and instructions in one cache
- Only one cache to design and operate
- The cache is flexible and can balance the allocation of space to instructions or data to best fit the execution of the program, i.e. a higher hit ratio
- A split cache uses 2 caches (1 for instructions and 1 for data)
- Must build and manage 2 caches
- Static allocation of cache sizes
- Can outperform a unified cache in systems that support parallel execution and pipelining (reduced cache contention)
- Does the trend favour split caches?

6.6 External Memory
- Magnetic Disks
- Optical Disks
- Magnetic Tape
- RAID

Magnetic Disks
- The disk is a metal or plastic platter coated with a magnetizable material
- Data is recorded onto and later read from the disk using a conducting coil, the head
- Data is organized into concentric rings, called tracks, on the platter
- Tracks are separated by gaps
- The disk rotates at a constant speed

Disk Characteristics
- Single vs. multiple platters per drive (each platter has its own R/W head)
- Fixed vs. movable head
- A fixed head has one head per track
- A movable head uses one head per platter
- Removable vs. non-removable platters
- Data access times (a worked example follows below)
- Seek time: position the head over the correct track
- Rotational latency: time for the desired sector to come under the head
- Access time = seek time + rotational latency
- Block transfer time: time to read the block (sector) off the disk and transfer it to MM
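
As a worked example of how these terms combine, the small program below uses assumed figures (4 ms average seek, 7200 rpm, 500 sectors of 512 bytes per track); none of these numbers come from the slides.

    #include <stdio.h>

    int main(void)
    {
        double seek_ms     = 4.0;                    /* assumed average seek time   */
        double rpm         = 7200.0;                 /* assumed rotation speed      */
        double rotation_ms = 60000.0 / rpm;          /* one full rotation: ~8.33 ms */
        double latency_ms  = rotation_ms / 2.0;      /* on average, half a turn     */

        double track_bytes  = 512.0 * 500;           /* assumed 500 sectors/track   */
        double sector_bytes = 512.0;
        double transfer_ms  = rotation_ms * sector_bytes / track_bytes;

        double access_ms = seek_ms + latency_ms;     /* access time as defined above */
        printf("access time      = %.2f ms\n", access_ms);
        printf("+ block transfer = %.3f ms\n", transfer_ms);
        printf("total            = %.2f ms\n", access_ms + transfer_ms);
        return 0;
    }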

Optical Disks
- WORM: Write Once, Read Many
- Users can produce CD-ROMs in limited quantities
- A specially prepared disk is written to using a medium-power laser
- Can be read many times, just like normal CD-ROMs
- Permits archival storage
- Erasable optical disks
- Combine laser and magnetic technology to permit information storage
- A laser heats an area whose magnetic field orientation can then be changed to alter the stored information
- The changes can be detected using polarized light during reads

Magnetic Tapes
- The first kind of secondary memory
- Still widely used
- Popular for backups
- Very cheap but very slow
- Sequential access
- Data is organized as records, with physical gaps between records
- One word is stored across the width of the tape and read using multiple read/write heads

RAID Technology
- RAID (Redundant Array of Independent Disks), developed at Berkeley
- Several parallel disks operating as a single unit
- 6 levels, 0 through 5

RAID 0
- No redundancy techniques are used
- Data is distributed over all disks in the array
- Data is divided into strips for actual storage (see the mapping sketch below)
- Can be used to support high data transfer rates by making the block transfer size a multiple of the strip
- Can support low response time by making the block transfer size equal to a strip (supports multiple strip transfers in parallel)
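
The striping itself is just a modulo mapping of logical strip numbers onto disks. The fragment below (disk count and names assumed) shows where each logical strip lands.

    #include <stdio.h>

    #define NUM_DISKS 4                           /* assumed array size */

    /* RAID 0: logical strip i goes to disk i % N, at strip offset i / N. */
    int main(void)
    {
        for (int strip = 0; strip < 8; strip++)
            printf("logical strip %d -> disk %d, offset %d\n",
                   strip, strip % NUM_DISKS, strip / NUM_DISKS);
        return 0;
    }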

RAID 1
- All disks are mirrored (duplicated)
- Data is stored on a disk and on its mirror
- Reads can come from either the disk or its mirror
- Writes must be done to both the disk and the mirror
- Fault recovery is easy, i.e. use the data on the mirror
- Expensive

RAID 2
- All disks are used for every access; the disks are synchronized together
- Data strips are small (a byte)
- An error-correcting code is computed across all disks and stored on additional disks
- Uses fewer disks than RAID 1 but still expensive

RAID 3
- Like RAID 2, but only a single redundant disk is used
- A parity bit is computed for the set of individual bits in the same position on the disks
- If a drive fails, the parity information on the redundant disk can be used to calculate the data from the failed disk (see the sketch below)
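
The parity here is a bitwise XOR across the data disks, which is exactly what makes the reconstruction possible: XOR-ing the surviving disks with the parity gives back the missing data. The sketch below (4 data disks, byte-wide strips, example values all assumed) demonstrates both steps.

    #include <stdint.h>
    #include <stdio.h>

    #define DATA_DISKS 4                      /* assumed array size */

    int main(void)
    {
        uint8_t strip[DATA_DISKS] = {0x5A, 0x3C, 0xF0, 0x99};  /* same position on each disk */

        /* Parity bit per position = XOR of the bits in that position. */
        uint8_t parity = 0;
        for (int d = 0; d < DATA_DISKS; d++)
            parity ^= strip[d];

        /* Suppose disk 2 fails: rebuild it from the survivors plus parity. */
        uint8_t rebuilt = parity;
        for (int d = 0; d < DATA_DISKS; d++)
            if (d != 2)
                rebuilt ^= strip[d];

        printf("parity=0x%02X rebuilt=0x%02X original=0x%02X\n",
               (unsigned)parity, (unsigned)rebuilt, (unsigned)strip[2]);
        return 0;
    }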

RAID 4
- Access to individual strips rather than to all disks at once as in RAID 3
- Bit-by-bit parity is calculated across corresponding strips on each disk
- Parity strips are stored on the redundant disk
- Write penalty (see the update rule below)
- For every write to a strip, the parity strip must also be recalculated and written
- Thus 1 logical write equals 2 physical disk accesses
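
Because the parity is an XOR, it can be patched from the old data, the new data, and the old parity without rereading every disk. A one-line illustration (all values assumed):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint8_t old_data = 0x5A, new_data = 0x7E, old_parity = 0x0F;

        /* XOR out the old strip contents and XOR in the new ones. */
        uint8_t new_parity = old_parity ^ old_data ^ new_data;

        printf("new parity = 0x%02X\n", (unsigned)new_parity);
        return 0;
    }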

RAID 5
- Parity information is distributed over the data disks in a round-robin scheme
- No dedicated parity disk is needed

6.7 Error Correction
- Semiconductor memories are subject to errors
- Hard (permanent) errors
- Soft (transient) errors
- Memory systems include logic to detect and/or correct errors
- The width of the memory word is increased
- The number of parity bits required depends on the level of detection and correction needed

General Error Detection and Correction
- A single error is a single bit flip; multiple bit flips can also occur in a word
- 2^M valid data words, where M is the data word length
- 2^(M+K) codeword combinations in the memory, where K is the number of code/parity bits
- Distribute the 2^M valid data words among the 2^(M+K) codeword combinations such that the distance between valid words is sufficient to distinguish the error

Single Error Detection and Correction (SEC)
- For each valid codeword, there are 2^K - 1 invalid codewords
- 2^K - 1 must be large enough to identify which of the M + K bit positions is in error
- Therefore 2^K - 1 >= M + K
- 8-bit data: 4 check bits
- 32-bit data: 6 check bits
- Bit position n is checked by those check bits Ci whose subscripts i sum to n, e.g. position 10, which holds data bit M6, is checked by bits C2 and C8 (10 = 2 + 8). A routine for finding K appears below.
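
The inequality gives the minimum number of check bits directly; the small routine below simply searches for the smallest K that satisfies it, reproducing the 8-bit and 32-bit figures quoted above.

    #include <stdio.h>

    /* Smallest K such that 2^K - 1 >= M + K (single error correction). */
    static int check_bits(int m)
    {
        int k = 1;
        while ((1 << k) - 1 < m + k)
            k++;
        return k;
    }

    int main(void)
    {
        printf("M =  8 -> K = %d\n", check_bits(8));    /* 4 */
        printf("M = 32 -> K = %d\n", check_bits(32));   /* 6 */
        printf("M = 64 -> K = %d\n", check_bits(64));   /* 7 */
        return 0;
    }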

Bit Position Layout

    Bit position:     12    11    10     9     8     7     6     5     4     3     2     1
    Position number:  1100  1011  1010  1001  1000  0111  0110  0101  0100  0011  0010  0001
    Data bit:         M8    M7    M6    M5          M4    M3    M2          M1
    Check bit:                                C8                      C4          C2    C1

Example: the 8-bit input word is 00111001 (M8...M1)

C1 = M1 ⊕ M2 ⊕ M4 ⊕ M5 ⊕ M7 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
C2 = M1 ⊕ M3 ⊕ M4 ⊕ M6 ⊕ M7 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
C4 = M2 ⊕ M3 ⊕ M4 ⊕ M8 = 0 ⊕ 0 ⊕ 1 ⊕ 0 = 1
C8 = M5 ⊕ M6 ⊕ M7 ⊕ M8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0

Thus the check bits (C8 C4 C2 C1) in this case are 0111.
(An odd number of 1s gives 1, an even number gives 0.)

Say data bit 3 is in error (i.e. changed from 0 to 1), so the input data is now 00111101.

C1 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
C2 = 1 ⊕ 1 ⊕ 1 ⊕ 1 ⊕ 0 = 0
C4 = 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
C8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0

The newly generated check bits are 0001. Comparing the two sets of check bits (by XOR) gives the syndrome word:

         C8 C4 C2 C1
stored    0  1  1  1
new       0  0  0  1
XOR       0  1  1  0

The result is 0110, indicating that bit position 6, which contains data bit 3, is in error.
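
The same computation can be checked in code. The sketch below hard-codes the 12-bit layout used on the slides (C1, C2, C4, C8 at positions 1, 2, 4, 8, data bits elsewhere) and reproduces the example: check bits 0111 for 00111001, syndrome 0110 after data bit 3 is flipped. Everything beyond that layout is an illustrative assumption.

    #include <stdio.h>

    /* Data bit Mi sits at 12-bit position pos[i-1]; C1,C2,C4,C8 sit at 1,2,4,8. */
    static const int pos[8] = {3, 5, 6, 7, 9, 10, 11, 12};

    /* Compute the check bits C8 C4 C2 C1 for an 8-bit data word (M8..M1). */
    static unsigned check_bits(unsigned data)
    {
        unsigned c = 0;
        for (int k = 0; k < 4; k++) {               /* k selects C1,C2,C4,C8       */
            int cpos = 1 << k, parity = 0;
            for (int i = 0; i < 8; i++)             /* Ci covers every position    */
                if (pos[i] & cpos)                  /* whose number has bit k set  */
                    parity ^= (data >> i) & 1;      /* data bit M(i+1)             */
            c |= (unsigned)parity << k;
        }
        return c;                                   /* bit k of c is C(2^k)        */
    }

    int main(void)
    {
        unsigned word = 0x39;                       /* 00111001                    */
        unsigned stored = check_bits(word);         /* expect 0111                 */

        unsigned corrupted = word ^ (1u << 2);      /* flip data bit 3 -> 00111101 */
        unsigned fresh = check_bits(corrupted);     /* expect 0001                 */

        unsigned syndrome = stored ^ fresh;         /* expect 0110 = position 6    */
        printf("stored=%X fresh=%X syndrome=%X (bit position %u in error)\n",
               stored, fresh, syndrome, syndrome);
        return 0;
    }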

- To detect errors, compare the check bits read from memory with those recomputed during the read operation, using XOR
- If the result of the XOR is 0000, there is no error
- If it is non-zero, the numerical value of the result indicates the bit position in error
- If the XOR result was 0110, bit position 6 (M3) is in error
- Double error detection can be added with another check bit that implements a parity check over the whole word of M + K bits

Chapter Exercises
- Suggest reasons why RAMs traditionally have been organized as only one bit per chip whereas ROMs are usually organized with multiple bits per chip.
- Suppose an 8-bit data word stored in memory is 11000010. Using the Hamming algorithm, determine what check bits would be stored in memory with the data word.

- Berita Harian, 29/1/2005
- A data-processing memory chip claimed to be the fastest in the world for multimedia applications.
- According to its manufacturer, Samsung Electronics Co. (Samsung), the 256-megabit XDR DRAM (eXtreme Data Rate Dynamic Random Access Memory) chip (pictured) is 10 times faster than the memory chips used today in video equipment, game consoles, digital TVs, servers and workstations. Samsung, the world's second-largest maker of computer memory chips, has begun manufacturing the chip, demand for which is expected to reach 800 million units by 2009.