Internal Memory - PowerPoint PPT Presentation

About This Presentation

Title:

Internal Memory

Description:

The larger the memory, the slower the access. Rev. 3 (2005-06) by Enrico Nardelli.

Slides: 71

Provided by: adria230

Transcript and Presenter's Notes

1
William Stallings Computer Organization and
Architecture
Chapter 4 Internal Memory
2
Memory

How much?
As much as possible
How fast?
As fast as possible
How expensive?
As cheap as possible
Fast memory is expensive
Large memory is expensive
The larger the memory, the slower the access

3
Memory Hierarchy

CPU Registers
L1 cache (on chip)
L2 cache (on board)
Main memory
Disk cache
Disk
Optical
Tape

Access time
Size
Access Frequency
Cost per bit
4
Characteristics

Location
Capacity
Unit of transfer
Access method
Performance
Physical type
Physical characteristics
Organisation

5
Location

CPU
Registers
Internal access directly from CPU
Cache
RAM
External access through I/O module
Disks
CD-ROM,

6
Capacity

Word size
The natural unit of organisation
Usually, it is equal to the number of bits used
for representing numbers or instructions
Typical word sizes: 8 bits, 16 bits, 32 bits
Number of words (or Bytes)
1 Byte = 8 bits = $2^3$ bits
1 KByte (Kilo) = $2^{10}$ Bytes = $2^{10} \times 2^3$ bits = 1024 Bytes
1 MByte (Mega) = $2^{10}$ KBytes = 1024 KBytes
1 GByte (Giga) = $2^{10}$ MBytes = $2^{30}$ Bytes
1 TByte (Tera) = $2^{10}$ GBytes = 1024 GBytes

7
Unit of Transfer

Number of bits that can be read/written at the same
time
Internal
Usually governed by data bus width
Bus width may be equal to word size or (often)
larger
Typical bus widths: 64, 128, 256 bits
External
Usually a block, which is much larger than a word
A related concept: addressable unit
Smallest location which can be uniquely addressed
Word internally
Cluster on Microsoft disks

8
Access Methods (1)

Sequential
Start at the beginning and read through in order
Access time depends on location of data and
previous location
e.g. tape
Direct
Individual blocks have unique address
Access is by jumping to vicinity plus sequential
search
Access time depends on location and previous
location
e.g. disk

9
Access Methods (2)

Random
Individual addresses identify locations exactly
Access time is independent of location or
previous access
e.g. RAM
Associative
Data is located by a comparison with contents of
a portion of the store
Access time is independent of location or
previous access
e.g. cache

10
Performance

Access time
Time between presenting the address and getting
the valid data
Memory Cycle time
Time may be required for the memory to recover
before next access
Cycle time = access time + recovery time
Transfer Rate
Rate at which data can be moved
$T_N = T_A + N/R$

$N$ = number of bits, $T_A$ = access time,
$T_N$ = time needed to read $N$ bits, $R$ = transfer rate
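
As a rough illustration of the relation $T_N = T_A + N/R$, the sketch below computes the time needed to read N bits. The function name and the sample device figures are illustrative assumptions, not values from the slides.

```python
def transfer_time(access_time_s: float, n_bits: int, rate_bps: float) -> float:
    """Time to read n_bits: T_N = T_A + N / R."""
    if rate_bps <= 0:
        raise ValueError("transfer rate must be positive")
    return access_time_s + n_bits / rate_bps


# Hypothetical device: 10 ms access time, 1 Mbit/s transfer rate, 8000 bits.
print(transfer_time(0.010, 8000, 1_000_000))
```

For small N the access time dominates; for large N the N/R term does.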
11
Physical Types

Semiconductor
RAM, ROM, EPROM, Cache
Magnetic
Disk, Tape
Optical
CD, DVD
Others
Bubble
Hologram

12
Semiconductor Memory

RAM (Random Access Memory)
Misnamed, as all semiconductor memories are random
access
Read/Write
Volatile
Temporary storage
Static or dynamic
ROM (Read only memory)
Permanent storage
Read only

13
Dynamic RAM

Bits stored as charge in capacitors
Charges leak
Need refreshing even when powered
Simpler construction
Smaller per bit
Less expensive
Need refresh circuits
Slower
Main memory (static RAM would be too expensive)

14
Static RAM

Bits stored as on/off switches
No charges to leak
No refreshing needed when powered
More complex construction
Larger per bit
More expensive
Does not need refresh circuits
Faster
Cache (here the faster the better)

15
Read Only Memory (ROM)

Permanent storage
Microprogramming (see later)
Library subroutines
Systems programs (BIOS)
Function tables

16
Types of ROM

Written during manufacture
Very expensive for small runs
Programmable (once)
PROM
Needs special equipment to program
Read mostly
Erasable Programmable (EPROM)
Erased by UV (it can take up to 20 minutes)
Electrically Erasable (EEPROM)
Takes much longer to write than read
a single byte can be erased
Flash memory
Erase memory electrically block-at-a-time

17
Physical Characteristics

Decay (refresh time)
Volatility (needs power source)
Erasable
Power consumption

18
Organisation

Physical arrangement of bits into words
Not always obvious
e.g. interleaved

19
Basic Organization (1)

Basic element: memory cell
has 2 stable states: one represents 0, the other 1
can be written at least once
can be read

[Figure: memory cell operation. Write: cell with Select, R/W Control and Input Data lines. Read: cell with Select, R/W Control and Output Data lines]
20
Basic Organization (2)

Basic organization of a 512x512 bits chip

[Figure: block diagram. Timing and control; Row Address Decoder (9 address lines, A0-A8); Column Address Decoder (9 address lines, A9-A17); Array of Memory Cells (512x512); Sense Amplifier and I/O Gate (1 data line, D0)]
21
Module Organisation

Basic organization of a 256KB chip
8 times a 512x512 bits chip
For a 1 MB chip replicate 4 times this
organization

22
Module Organisation (1 MByte)
23
Organisation for larger sizes

The larger the size the higher the number of
address pins
For $2^k$ words, k pins are needed
A solution to reduce the number of address pins
Multiplex row address and column address
k/2 pins to address $2^k$ Bytes
Adding one more pin doubles the range of both row and
column addresses, so capacity is multiplied by 4
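
The pin-count argument can be checked with a few lines of arithmetic. This is a minimal sketch; the function names `address_pins` and `max_words` are hypothetical, and the chip is assumed to have a power-of-two number of locations.

```python
def address_pins(num_words: int, multiplexed: bool = False) -> int:
    """Address pins needed for num_words locations (must be a power of two)."""
    k = num_words.bit_length() - 1
    if num_words <= 0 or 1 << k != num_words:
        raise ValueError("num_words must be a power of two")
    # With multiplexing, row and column halves share the same pins.
    return (k + 1) // 2 if multiplexed else k


def max_words(pins: int, multiplexed: bool = False) -> int:
    """Locations addressable with the given number of address pins."""
    return 1 << (2 * pins if multiplexed else pins)


print(address_pins(2**22))                    # plain addressing
print(address_pins(2**22, multiplexed=True))  # row/column multiplexed
print(max_words(12, True) // max_words(11, True))
```

The last line shows the factor gained by one extra multiplexed pin: the pin is used twice (row and column), so the address grows by two bits.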

24
Typical 16 Mb DRAM (4M x 4)
25
Refreshing (Dynamic RAM)

Refresh circuit included on chip
Disable chip
Count through rows
Read and write back
Takes time
Slows down apparent performance

26
Packaging
27
Error Correction

Hard Failure
Permanent defect
Soft Error
Random, non-destructive
No permanent damage to memory
Detected using Hamming error correcting code
it is able to detect and correct 1-bit errors

28
Error Correcting Code Function
29
A simple example of correction (1)

Correcting errors in 4-bit words
3 control groups (A, B, C)
In each control group add 1 parity bit

[Figure: Venn diagrams of control groups A, B and C, first with the 4 data bits only, then with the 3 parity bits added]
30
A simple example of correction (2)

One of the bits changes value
Using the control bits, the right value is restored

[Figure: Venn diagrams of control groups A, B and C, first with one bit in error, then with the bit corrected]
31
Compare Circuit

It takes two K-length binary strings X, Y as
input
$X = X_K \ldots X_1$
$Y = Y_K \ldots Y_1$
It returns a K-length binary string Z (syndrome)
$Z = Z_K \ldots Z_1$
$Z_i = X_i \oplus Y_i$ for each $i = 1, \ldots, K$
$Z = 0 \ldots 0$ means no error

32
Relation between M and K

Z may assume $2^K$ values
the value $Z = 0 \ldots 0$ means no error
the error may be in any bit among the M+K bits
it must be

$2^K - 1 \geq M + K$

Data bits (M)   Control bits (K)   Additional memory (%)
4               3                  75
8               4                  50
16              5                  31.25
32              6                  18.75
64              7                  10.94
128             8                  6.25
256             9                  3.52
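
The table can be reproduced by searching for the smallest K that satisfies $2^K - 1 \geq M + K$. The sketch below is only a check of that inequality; the function name `control_bits` is a hypothetical choice.

```python
def control_bits(m: int) -> int:
    """Smallest K such that 2**K - 1 >= M + K."""
    k = 1
    while 2 ** k - 1 < m + k:
        k += 1
    return k


for m in (4, 8, 16, 32, 64, 128, 256):
    k = control_bits(m)
    print(f"M={m:3d}  K={k}  additional memory={100 * k / m:.2f}%")
```

The relative overhead shrinks as the word gets wider, because K grows only logarithmically with M.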
33
How to arrange the M+K bits

the M+K bits are arranged so that
if Z contains a single bit equal to 1
an error occurred in the corresponding control bit
if Z contains more than one bit equal to 1
an error occurred in the i-th bit, where i is the
value (in binary) of Z

34
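The case M = 4

bit position      7     6     5     4     3     2     1
position number   111   110   101   100   011   010   001
data bits         D4    D3    D2          D1
control bits                        C4          C2    C1

$C_1 = D_1 \oplus D_2 \oplus D_4$
$C_2 = D_1 \oplus D_3 \oplus D_4$
$C_4 = D_2 \oplus D_3 \oplus D_4$

[Figure: Venn diagram of the three control groups containing D1, D2, D3, D4 and C1, C2, C4]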
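
A minimal sketch of this 7-bit code follows, assuming the layout in which positions 7..1 hold D4 D3 D2 C4 D1 C2 C1 and each control bit is the XOR of the data bits whose position number has the corresponding binary digit set. The function names and the dict representation are illustrative choices, not part of the slides.

```python
def encode(d1: int, d2: int, d3: int, d4: int) -> dict:
    """Return the stored 7-bit word as {position: bit}, positions 1..7."""
    c1 = d1 ^ d2 ^ d4
    c2 = d1 ^ d3 ^ d4
    c4 = d2 ^ d3 ^ d4
    return {1: c1, 2: c2, 3: d1, 4: c4, 5: d2, 6: d3, 7: d4}


def syndrome(word: dict) -> int:
    """Compare stored and recomputed control bits; 0 means no error."""
    z1 = word[1] ^ word[3] ^ word[5] ^ word[7]
    z2 = word[2] ^ word[3] ^ word[6] ^ word[7]
    z4 = word[4] ^ word[5] ^ word[6] ^ word[7]
    return 4 * z4 + 2 * z2 + z1


def correct(word: dict) -> dict:
    """Flip the single bit whose position is given by the syndrome."""
    z = syndrome(word)
    fixed = dict(word)
    if z:
        fixed[z] ^= 1
    return fixed


stored = encode(1, 0, 1, 1)
damaged = dict(stored)
damaged[6] ^= 1                      # soft error in D3 (position 6)
print(syndrome(damaged), correct(damaged) == stored)
```

A syndrome with a single 1 (values 1, 2 or 4) points at a control bit; any other non-zero value points at a data bit.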
35
Exercise

Design a Hamming error correcting code for
8-bit words
See the textbook for the solution

36
Cache

Small amount of fast memory
Sits between normal main memory and CPU
May be located on CPU chip or module

37
Cache operation - overview

CPU requests contents of memory location
Check cache for this data
If present (hit), get from cache (fast)
If not present (miss), read required block from
main memory to cache
Then deliver from cache to CPU

38
Cache Performance

Cache access time: $t = 1$
Memory access time: $T = 10$
Hit probability: $H$
$T_{\text{average access}} = tH + (T + t)(1 - H) = t + (1 - H)T$

[Figure: plot of average access time T versus hit probability H]
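
To see how the average access time $t + (1 - H)T$ falls as the hit probability H rises, the sketch below evaluates it for cache access time 1 and memory access time 10, the values given on this slide. The function name and the sampled values of H are illustrative assumptions.

```python
def average_access_time(t: float, T: float, H: float) -> float:
    """t*H + (T + t)*(1 - H), simplified to t + (1 - H)*T."""
    if not 0.0 <= H <= 1.0:
        raise ValueError("hit probability must be in [0, 1]")
    return t + (1.0 - H) * T


for H in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(f"H={H:.2f}  average access time={average_access_time(1, 10, H):.2f}")
```

On a miss the cache is checked first and the block is then fetched from memory, which is why a miss costs T + t rather than T.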
39
Locality of Reference (Denning, 1968)

Spatial Locality
Memory cells physically close to those just
accessed tend to be accessed
Temporal Locality
During the course of the execution of a program,
accesses to the same memory cells tend to be
close in time
e.g. loops, arrays

40
Typical Cache Organization
41
Cache Design

Size
Mapping Function
Replacement Algorithm
Write Policy
Block Size
Number of Caches

42
Size does matter

Cost
More cache is expensive
Speed
More cache is faster (up to a point)
Checking cache for data takes time

43
Cache-memory mapping

There are $M = 2^n / K$ blocks in main memory
$C \ll M$ (C = number of cache lines)
Each block is mapped to a cache line

44
Mapping Function

Word size: 1 Byte
Cache of 64 KBytes ($2^{16}$ Bytes)
Cache block of 4 bytes
64 KB/4 = 16K ($2^{14}$) lines of 4 bytes
16 MBytes ($2^{24}$) main memory
$2^{24}/4$ = 4M ($2^{22}$) blocks in main memory
Map $2^{22}$ blocks to $2^{14}$ lines of cache

45
A simple example of Direct Mapping
Address fields: s-r (tag), r (line), w (word)
Addresses: 00000 00001 00010 00011 00100 00101 00110 00111
01000 01001 01010 .. .. .. 11110 11111

Block 0 -> Line 0
Block 1 -> Line 1
Block 2 -> Line 2
Block 3 -> Line 3
Block 4 -> Line 0
...
Block 15 -> Line 3
46
Direct Mapping (1)

Each block of main memory is mapped to a specific
cache line
i.e. if a block is in cache, it must be in one
specific place
In a cache of C lines, block j is stored into line
i, where i = j mod C
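
The rule i = j mod C can be tried on a small case with 16 memory blocks and a cache of 4 lines. This is only an illustrative sketch; the function name is hypothetical.

```python
def direct_mapped_line(block: int, num_lines: int) -> int:
    """Cache line that memory block j must occupy: i = j mod C."""
    return block % num_lines


for block in range(16):
    print(f"Block {block:2d} -> Line {direct_mapped_line(block, 4)}")
```

Blocks 0, 4, 8 and 12 all compete for line 0, which is where the tag is needed to tell them apart.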

47
Direct Mapping (2)

Address is in two parts
w Least Significant Bits (LSB) identify unique
word
s Most Significant Bits (MSB) specify one memory
block
The MSBs are split into
a cache line field r (least significant)
a tag of s-r (most significant)

48
Direct Mapping Summarizing

address length n = s + w bits
number of addressable units (words) = $2^{s+w}$
block size = cache line size = $2^w$ words
number of memory blocks = $2^{s+w} / 2^w = 2^s$
number of cache lines C = $2^r$
tag length = (s - r) bits

49
Cache Line Mapping Table

Cache line   Main memory blocks held
0            0, C, 2C, ..., $2^s - C$
1            1, C+1, 2C+1, ..., $2^s - C + 1$
...          ...
C-1          C-1, 2C-1, 3C-1, ..., $2^s - 1$

50
Direct Mapping: Address Structure

Field:   Tag (s-r)   Line or Slot (r)   Word (w)
Bits:    8           14                 2

24-bit address: 16 MBytes ($2^{24}$) main memory
2-bit word identifier (4-byte block)
Cache: 64 KB/4 = 16K ($2^{14}$) lines of 4 bytes
22-bit block identifier
8-bit tag (= 22 - 14)
14-bit slot or line
No two blocks mapping to the same line have the
same Tag field
Check contents of cache by finding line and
checking Tag
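
The 8/14/2 split of a 24-bit address can be sketched with shifts and masks. The constant names, the function names and the sample address below are illustrative assumptions; `cache_tags` is a hypothetical dict mapping each line number to the tag currently stored there.

```python
WORD_BITS = 2    # 4-byte block
LINE_BITS = 14   # 16K cache lines
TAG_BITS = 8     # 22 - 14


def split_direct(address: int) -> tuple:
    """Split a 24-bit address into (tag, line, word)."""
    if not 0 <= address < 1 << (TAG_BITS + LINE_BITS + WORD_BITS):
        raise ValueError("address must fit in 24 bits")
    word = address & ((1 << WORD_BITS) - 1)
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (WORD_BITS + LINE_BITS)
    return tag, line, word


def is_hit(cache_tags: dict, address: int) -> bool:
    """Hit if the line selected by the address currently holds its tag."""
    tag, line, _ = split_direct(address)
    return cache_tags.get(line) == tag


tag, line, word = split_direct(0x16339C)
print(f"tag={tag:02X} line={line:04X} word={word}")
```

Only one line has to be inspected per access, which is what keeps direct mapping simple and inexpensive.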

51
Direct Mapping Cache Organization
52
Direct Mapping pros and cons

Simple
Inexpensive
Fixed location for given block
If a program repeatedly accesses 2 distinct
blocks that are mapped to the same line, cache
misses are very high (thrashing)

53
Associative Mapping

A main memory block can load into any line of
cache
Memory address is interpreted as tag and word
Tag uniquely identifies block of memory
Every line's tag is examined for a match
Cache searching gets expensive

54
A simple example of Associative Mapping
Address fields: s (tag), w (word)
Addresses: 00000 00001 00010 00011 00100 00101 00110 00111
01000 01001 01010 .. .. .. 11110 11111

Main memory: Block 0, Block 1, Block 2, Block 3, Block 4, ..., Block 15 (two words per block, w0 and w1)
Cache: Line 0   Line 1   Line 2   Line 3
Tags:  0011     0001     0000     0100
Note: a replacement algorithm is needed (see
later)
55
Associative Mapping Summarizing

address length n = s + w bits
number of addressable units (words) = $2^{s+w}$
block size = cache line size = $2^w$ words
number of memory blocks = $2^{s+w} / 2^w = 2^s$
number of cache lines: not specified
tag length = s bits

56
Associative Mapping: Address Structure

Field:   Tag   Word
Bits:    22    2

22 bit tag stored with each 4 byte block of data
Compare tag field with tag entry in cache to
check for hit
Least significant 2 bits of address identify
which byte is required from the 4 byte data block

57
Fully Associative Cache Organization
58
Set Associative Mapping

Cache is divided into v sets
Each set contains k lines
number of cache lines C = v × k
A given block maps to any line in a given set
Block j can be in any line of set i, where i = j
mod v
There are k lines in a set (k-way set associative
mapping)
k = 1: direct mapping; k = C: associative mapping
The best choice in practice is 2 lines per set
2 way associative mapping
A given block can be in only one set, but in any
of its 2 lines

59
A simple example of Set Associative Mapping
Address fields: s-d (tag), d (set), w (word)
Addresses: 00000 00001 00010 00011 00100 00101 00110 00111
01000 01001 01010 .. .. .. 11110 11111

Block 0 -> Set 0
Block 1 -> Set 1
Block 2 -> Set 0
Block 3 -> Set 1
Block 4 -> Set 0
...
Block 15 -> Set 1

Each block holds two words (w0, w1)
Cache: Line 0   Line 1   Line 2   Line 3 (two lines per set, Set 0 and Set 1)
Tags:  010      000      111      000
Note: a replacement algorithm is needed (see
later)
60
Set Associative Mapping

Address is in two parts
w Least Significant Bits (LSB) identify unique
word
s Most Significant Bits (MSB) specify one memory
block
The MSBs are split into
a cache set field d (least significant)
a tag of s-d (most significant)

61
Set Associative Mapping Summarizing

address length n = s + w bits
number of addressable units (words) = $2^{s+w}$
block size = cache line size = $2^w$ words
number of memory blocks = $2^{s+w} / 2^w = 2^s$
number of lines for each cache set = k
number of sets v = $2^d$
number of cache lines C = k × v = k × $2^d$
tag length = (s - d) bits

62
Set Associative Mapping: Address Structure

Field:   Tag   Set   Word
Bits:    9     13    2

number of cache lines = $2^{14}$
number of cache sets = $2^{13}$
each cache set has two lines: 2-way set
associative mapping
Use set field to determine cache set to look in
Compare Tag field with all lines in the set to
see if we have a hit
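
With a 9-bit tag, a 13-bit set field and a 2-bit word field, the same 24-bit address is split differently than in direct mapping. The sketch below shows the split only; the names and the sample address are illustrative assumptions.

```python
WORD_BITS = 2   # 4-byte block
SET_BITS = 13   # 8K sets of 2 lines each
TAG_BITS = 9    # 22 - 13


def split_set_associative(address: int) -> tuple:
    """Split a 24-bit address into (tag, set index, word)."""
    if not 0 <= address < 1 << (TAG_BITS + SET_BITS + WORD_BITS):
        raise ValueError("address must fit in 24 bits")
    word = address & ((1 << WORD_BITS) - 1)
    set_index = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_index, word


tag, set_index, word = split_set_associative(0x16339C)
print(f"tag={tag:03X} set={set_index:04X} word={word}")
```

Halving the number of index values (sets instead of lines) moves one bit from the index into the tag, and the tag must then be compared against both lines of the selected set.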

63
Two Way Set Associative Cache Organization
64
Replacement Algorithms (1): Direct mapping

No choice
Each block only maps to one line
Replace that line

65
Replacement Algorithms (2): Associative and Set
Associative

Hardware implemented algorithm (to obtain speed)
Least Recently used (LRU)
e.g. in 2 way set associative
Which of the 2 blocks is LRU?
First in first out (FIFO)
replace block that has been in cache longest
Least frequently used
replace block which has had fewest hits
Random
Almost as good as LRU
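
LRU replacement in a k-way set associative cache can be simulated in software as follows. This is a sketch under stated assumptions: the class name is hypothetical, blocks are identified by their block number, and an ordered dictionary stands in for the hardware bookkeeping (a real 2-way cache would typically need only one bit per set to record which line was used last).

```python
from collections import OrderedDict


class SetAssociativeCache:
    """k-way set associative cache of block numbers with LRU replacement."""

    def __init__(self, num_sets: int, ways: int):
        self.num_sets = num_sets
        self.ways = ways
        # One OrderedDict per set: keys are tags, ordered oldest to newest use.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, block: int) -> bool:
        """Access a memory block; return True on a hit, False on a miss."""
        index = block % self.num_sets      # set i = j mod v
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)         # mark as most recently used
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)      # evict the least recently used tag
        lines[tag] = None
        return False


cache = SetAssociativeCache(num_sets=2, ways=2)
print([cache.access(b) for b in (0, 2, 0, 4, 2)])
```

In the sample sequence, blocks 0, 2 and 4 all map to set 0; when block 4 arrives, block 2 is evicted because block 0 was used more recently.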

66
Write Policy

Must not overwrite a cache block unless main
memory is up to date
Multiple CPUs may have individual caches
I/O may address main memory directly

67
Write through

All writes go to main memory as well as cache
Multiple CPUs can monitor main memory traffic to
keep local (to CPU) cache up to date
Lots of traffic
Slows down writes
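
The cost of write through can be made visible with a toy model that counts main memory writes. This is a deliberately simplified sketch: the class name is hypothetical, main memory is a plain dict, and the cache is assumed to have unlimited capacity so that no replacement is modelled.

```python
class WriteThroughCache:
    """Toy write-through cache in front of a dict used as main memory."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.lines = {}
        self.memory_writes = 0

    def read(self, address: int) -> int:
        if address not in self.lines:        # miss: fetch from main memory
            self.lines[address] = self.memory[address]
        return self.lines[address]

    def write(self, address: int, value: int) -> None:
        self.lines[address] = value          # update the cached copy
        self.memory[address] = value         # and main memory, on every write
        self.memory_writes += 1


memory = {0: 1}
cache = WriteThroughCache(memory)
for value in (5, 6, 7):
    cache.write(0, value)
print(memory[0], cache.memory_writes)
```

Main memory is always up to date, but three writes to the same location produce three memory writes.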

68
Write back