Internal Memory - PowerPoint PPT Presentation

About This Presentation
Title: Internal Memory
Description: William Stallings, Computer Organization and Architecture, Chapters 4 & 5: Cache Memory and Internal Memory

Slides: 73
Provided by: adria230
Transcript and Presenter's Notes

1
William Stallings
Computer Organization and Architecture
Chapters 4 & 5: Cache Memory and Internal Memory
2
Computer Components: Top Level View
[Figure: top-level view of computer components; surviving label: registers]
3
Memory
  • How much?
  • As much as possible
  • How fast?
  • As fast as possible
  • How expensive?
  • As cheap as possible
  • Fast memory is expensive
  • Large memory is expensive
  • The larger the memory, the slower the access

4
Memory Hierarchy
  • CPU Registers
  • L1 cache (on chip)
  • L2 cache (on board)
  • Main memory
  • Disk cache
  • Disk
  • Optical
  • Tape

Going down the hierarchy:
  • Access time increases
  • Size increases
  • Access frequency decreases
  • Cost per bit decreases
5
Characteristics
  • Location
  • Capacity
  • Unit of transfer
  • Access method
  • Performance
  • Physical type
  • Physical characteristics
  • Organisation

6
Location
  • CPU: registers
  • Internal (accessed directly from CPU): cache, RAM
  • External (accessed through I/O module): disks, CD-ROM, ...

7
Capacity
  • Word size
  • The natural unit of organisation
  • Usually, it is equal to the number of bits used
    for representing numbers or instructions
  • Typical word sizes: 8 bits, 16 bits, 32 bits
  • Number of words (or Bytes)
  • 1 Byte = 8 bits = 2^3 bits
  • 1 KByte = 2^10 Bytes = 2^10 x 2^3 bits = 1024
    Bytes (Kilo)
  • 1 MByte = 2^10 KBytes = 1024 KBytes (Mega)
  • 1 GByte = 2^10 MBytes = 2^30 Bytes (Giga)
  • 1 TByte = 2^10 GBytes = 1024 GBytes (Tera)

8
Unit of Transfer
  • Number of bits that can be read/written at the same
    time
  • Internal: usually governed by data bus width
  • Bus width may be equal to word size or (often)
    larger
  • Typical bus widths: 64, 128, 256 bits
  • External: usually a block, which is much larger than a word
  • A related concept: addressable unit
  • Smallest location which can be uniquely addressed
  • Word internally
  • Cluster on disks

9
Access Methods (1)
  • Sequential
  • Start at the beginning and read through in order
  • Access time depends on location of data and
    previous location
  • e.g. tape
  • Direct
  • Individual blocks have unique address
  • Access is by jumping to vicinity plus sequential
    search
  • Access time depends on location and previous
    location
  • e.g. disk

10
Access Methods (2)
  • Random
  • Individual addresses identify locations exactly
  • Access time is independent of location or
    previous access
  • e.g. RAM
  • Associative
  • Data is located by a comparison with contents of
    a portion of the store
  • Access time is independent of location or
    previous access
  • e.g. cache

11
Performance
  • Access time
  • Time between presenting the address and getting
    the valid data
  • Memory Cycle time
  • Time may be required for the memory to recover
    before next access
  • Cycle time = access time + recovery time
  • Transfer Rate
  • Rate at which data can be moved
  • T_N = T_A + N/R

where N = number of bits, T_A = access time, T_N = time needed to read N bits, R = transfer rate
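The transfer-time relation on this slide can be tried out with a few lines of Python. This is only a sketch: the function name `transfer_time` and the device figures are made up for illustration, not taken from the slides.

```python
def transfer_time(access_time_s: float, n_bits: int, rate_bps: float) -> float:
    """Return T_N = T_A + N / R, the time needed to read N bits."""
    return access_time_s + n_bits / rate_bps


# Hypothetical device: 0.5 s access time, 2,000,000 bit/s transfer rate.
t_n = transfer_time(access_time_s=0.5, n_bits=1_000_000, rate_bps=2_000_000)
print(t_n)  # 0.5 + 1_000_000 / 2_000_000 = 1.0
```

For small N the access time dominates; for large N the N/R term does.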
12
Physical Types
  • Semiconductor
  • RAM, ROM, EPROM, Cache
  • Magnetic
  • Disk, tape
  • Optical
  • CD, DVD
  • Others

13
Semiconductor Memory
  • RAM (Random Access Memory)
  • Misnamed, as all semiconductor memories are random
    access
  • Read/Write
  • Volatile
  • Temporary storage
  • Static or dynamic
  • ROM (Read only memory)
  • Permanent storage
  • Read only

14
Dynamic RAM
  • Bits stored as charge in capacitors
  • Charges leak
  • Need refreshing even when powered
  • Simpler construction
  • Smaller per bit
  • Less expensive
  • Need refresh circuits
  • Slower
  • Main memory (static RAM would be too expensive)

15
Static RAM
  • Bits stored as on/off switches
  • No charges to leak
  • No refreshing needed when powered
  • More complex construction
  • Larger per bit
  • More expensive
  • Does not need refresh circuits
  • Faster
  • Cache (here the faster the better)

16
Read Only Memory (ROM)
  • Permanent storage
  • Microprogramming (see later)
  • Library subroutines
  • Systems programs (BIOS)
  • Function tables

17
Types of ROM
  • Written during manufacture
  • Very expensive for small runs
  • Programmable (once)
  • PROM
  • Needs special equipment to program
  • Read mostly
  • Erasable Programmable (EPROM)
  • Erased by UV (it can take up to 20 minutes)
  • Electrically Erasable (EEPROM)
  • Takes much longer to write than read
  • A single byte can be erased
  • Flash memory
  • Erase memory electrically block-at-a-time

18
Physical Characteristics
  • Decay (refresh time)
  • Volatility (needs power source)
  • Erasable
  • Power consumption

19
Organisation
  • Physical arrangement of bits into words
  • Not always obvious
  • e.g. interleaved

20
Basic Organization (1)
  • Basic element: memory cell
  • Has 2 stable states: one represents 0, the other 1
  • Can be written at least once
  • Can be read

[Figure: memory cell in two modes. Write: Select and R/W Control lines plus Input Data. Read: Select and R/W Control lines plus Output Data.]
21
Basic Organization (2)
  • Basic organization of a 512x512-bit chip

[Figure: array of memory cells (512x512) with timing and control; row address decoder fed by A0-A8 (9 lines); column address decoder fed by A9-A17 (9 lines); sense amplifier and I/O gate with data line D0 (1 line)]
22
Module Organisation
  • Basic organization of a 256 KB memory
  • 8 chips of 512x512 bits each
  • For 1 MB, replicate this organization 4 times

23
Module Organisation (1 MByte)
24
Organisation for larger sizes
  • The larger the size, the higher the number of
    address pins
  • For 2^k words, k pins are needed
  • A solution to reduce the number of address pins
  • Multiplex row address and column address
  • k/2 pins to address 2^k Bytes
  • Adding one more pin doubles the range of both row and
    column values, so x4 capacity
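The pin-count argument above can be checked numerically. The sketch below assumes the address is split evenly into a row half and a column half that share the same pins; the function name `addressable_locations` is hypothetical.

```python
def addressable_locations(pins: int, multiplexed: bool) -> int:
    """Number of locations reachable with the given number of address pins.

    With multiplexing, the pins carry the row address first and the
    column address second, so they contribute twice as many address bits.
    """
    address_bits = 2 * pins if multiplexed else pins
    return 2 ** address_bits


# 11 multiplexed pins give 22 address bits, i.e. 4M locations.
print(addressable_locations(11, multiplexed=True))  # 4194304

# One extra multiplexed pin adds one row bit and one column bit: x4 capacity.
ratio = addressable_locations(12, True) // addressable_locations(11, True)
print(ratio)  # 4
```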

25
Typical 16 Mb DRAM (4M x 4)
26
Refreshing (Dynamic RAM)
  • Refresh circuit included on chip
  • Disable chip
  • Count through rows
  • Read and write back
  • Takes time
  • Slows down apparent performance

27
Packaging
28
Error Correction
  • Hard Failure
  • Permanent defect
  • Soft Error
  • Random, non-destructive
  • No permanent damage to memory
  • Detected using a Hamming error-correcting code
  • It can detect and correct 1-bit errors

29
Error Correcting Code Function
30
A simple example of correction (1)
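  • Correcting errors in 4-bit words
  • 3 control groups
  • In each control group add 1 parity bit

[Figure: Venn diagram with three overlapping circles A, B, C; the four data bits 1, 1, 1, 0 sit in the intersections, and one parity bit is added in each circle]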
31
A simple example of correction (2)
  • One of the bits changes value
  • Using the control bits, the right value is restored

[Figure: the same Venn diagram with circles A, B, C, shown once with a corrupted bit and once with the bit restored]
32
Compare Circuit
  • It takes two K-length binary strings X, Y as
    input
  • X = X_K ... X_1
  • Y = Y_K ... Y_1
  • It returns a K-length binary string Z (syndrome)
  • Z = Z_K ... Z_1
  • Z_i = X_i ⊕ Y_i for each i = 1, ..., K
  • Z = 0...0 means no error
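As a rough illustration of the compare circuit, the bitwise comparison can be written as a short Python function. Representing the K-length strings as Python `str` values of '0' and '1' characters is an assumption made for readability; the name `syndrome` is hypothetical.

```python
def syndrome(x: str, y: str) -> str:
    """Return Z where Z_i = X_i XOR Y_i for two equal-length bit strings."""
    if len(x) != len(y):
        raise ValueError("X and Y must have the same length K")
    return "".join("1" if a != b else "0" for a, b in zip(x, y))


print(syndrome("101", "101"))  # 000: no error
print(syndrome("101", "111"))  # 010: the two strings differ in one position
```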

33
Relation between M and K
  • Z may assume 2^K values
  • The value Z = 0...0 means no error
  • The error may be in any bit among the M+K bits
  • It must be

2^K - 1 ≥ M + K

Data bits (M)   Control bits (K)   Additional memory (%)
4               3                  75
8               4                  50
16              5                  31.25
32              6                  18.75
64              7                  10.94
128             8                  6.25
256             9                  3.52
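The table can be regenerated by searching for the smallest K that satisfies the inequality for each M. This is a minimal sketch; the function name `control_bits` is an assumption, not something from the slides.

```python
def control_bits(m: int) -> int:
    """Smallest K such that 2**K - 1 >= M + K."""
    k = 1
    while 2 ** k - 1 < m + k:
        k += 1
    return k


for m in (4, 8, 16, 32, 64, 128, 256):
    k = control_bits(m)
    # Overhead is the share of extra memory needed for the control bits.
    print(m, k, f"{100 * k / m:.2f}%")
```

The relative overhead shrinks as the word gets wider, which is why wider words make single-error correction comparatively cheap.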
34
How to arrange the M+K bits
  • The M+K bits are arranged so that
  • If Z ≠ 0, the error occurred in the i-th bit, where i is
    the value (in binary) of Z

35
The case M = 4
Bit position      7     6     5     4     3     2     1
Position number   111   110   101   100   011   010   001
Data bits         D4    D3    D2          D1
Control bits                        C4          C2    C1

C1 = D1 ⊕ D2 ⊕ D4
C2 = D1 ⊕ D3 ⊕ D4
C4 = D2 ⊕ D3 ⊕ D4

[Figure: Venn diagram showing which of the data bits D1-D4 each control bit C1, C2, C4 covers]
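The M = 4 layout can be exercised end to end in Python. The sketch below follows the bit positions and the three parity equations given on this slide; the function names and the list representation (index = bit position, index 0 unused) are assumptions made for illustration.

```python
def encode(d1: int, d2: int, d3: int, d4: int) -> list:
    """Build the 7-bit word; list index equals bit position (index 0 unused)."""
    c1 = d1 ^ d2 ^ d4
    c2 = d1 ^ d3 ^ d4
    c4 = d2 ^ d3 ^ d4
    return [0, c1, c2, d1, c4, d2, d3, d4]


def correct(word: list) -> tuple:
    """Return (corrected word, Z), where Z is the error position or 0."""
    w = list(word)
    # Recompute the control bits from the stored data bits.
    c1 = w[3] ^ w[5] ^ w[7]
    c2 = w[3] ^ w[6] ^ w[7]
    c4 = w[5] ^ w[6] ^ w[7]
    # Compare with the stored control bits; Z read in binary is the position.
    z = ((c4 ^ w[4]) << 2) | ((c2 ^ w[2]) << 1) | (c1 ^ w[1])
    if z != 0:
        w[z] ^= 1  # flip the faulty bit back
    return w, z


stored = encode(1, 0, 1, 1)
damaged = list(stored)
damaged[6] ^= 1  # simulate a soft error in bit position 6 (D3)
fixed, position = correct(damaged)
print(position, fixed == stored)  # 6 True
```

Because each position number is covered by exactly the control bits whose weights sum to it, the syndrome directly names the faulty position, including the case where a control bit itself is hit.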
36
Exercise
  • Design a Hamming error correcting code for
    8-bit words
  • See the textbook for the solution

37
Cache
  • Small amount of fast memory
  • Sits between normal main memory and CPU
  • May be located on CPU chip or module

38
Cache operation - overview
  • CPU requests contents of memory location
  • Check cache for this data
  • If present (hit), get from cache (fast)
  • If not present (miss), read required block from
    main memory to cache
  • Then deliver from cache to CPU

39
Cache Performance
  • Cache access time: t = 1
  • Memory access time: T = 10
  • Hit probability: H
  • T_average = t·H + (T + t)·(1 - H) = t + (1 - H)·T

[Figure: plot of average access time T_average versus hit probability H]
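The average access time formula on this slide can be evaluated for a few hit probabilities. This is a sketch using the slide's values t = 1 and T = 10 as defaults; the function name `average_access_time` is hypothetical.

```python
def average_access_time(hit_probability: float, t: float = 1.0, big_t: float = 10.0) -> float:
    """T_average = t*H + (T + t)*(1 - H), which simplifies to t + (1 - H)*T."""
    return t + (1.0 - hit_probability) * big_t


for h in (0.0, 0.5, 1.0):
    print(h, average_access_time(h))
```

With H = 0 every access pays for both the cache check and the memory access (t + T); with H = 1 every access costs only t.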
40
Locality of Reference (Denning68)
  • Spatial Locality
  • Memory cells physically close to those just
    accessed tend to be accessed
  • Temporal Locality
  • During the course of the execution of a program,
    accesses to the same memory cells tend to be
    close in time
  • e.g. loops, arrays

41
An example
Address   Instruction
200
201
202       SUB X, Y
203       BRZ 211            (conditional branch)
...
210       BRA 202            (unconditional branch)
211
...
225       BRE R1, R2, 235    (conditional branch)
...
235
42
Typical Cache Organization
43
Cache Design
  • Size
  • Mapping Function
  • Replacement Algorithm
  • Write Policy
  • Block Size
  • Number of Caches

44
Size does matter
  • Cost
  • More cache is expensive
  • Speed
  • More cache is faster (up to a point)
  • Checking cache for data takes time

45
Cache-memory mapping
  • There are M = 2^n / K blocks
  • C << M
  • Each block is mapped to a cache line

46
A simple example of Direct Mapping
Address fields: tag (s-r) | line (r) | word (w)
Addresses: 00000 00001 00010 00011 00100 00101 00110 00111
01000 01001 01010 .. .. .. 11110 11111

Block   Line
0       0
1       1
2       2
3       3
4       0
...
15      3
47
Direct Mapping (1)
  • Each block of main memory is mapped to a specific
    cache line
  • i.e. if a block is in cache, it must be in one
    specific place
  • In a cache of C lines, block j is stored into
    line i, where i = j mod C
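The placement rule can be checked against the small example with 4 cache lines and 16 memory blocks. This is a minimal sketch; the helper names are hypothetical.

```python
def line_for_block(block: int, num_lines: int) -> int:
    """Direct mapping: block j goes to line i = j mod C."""
    return block % num_lines


def blocks_for_line(line: int, num_lines: int, num_blocks: int) -> list:
    """All memory blocks that compete for the same cache line."""
    return list(range(line, num_blocks, num_lines))


print([line_for_block(j, 4) for j in (0, 1, 2, 3, 4, 15)])  # [0, 1, 2, 3, 0, 3]
print(blocks_for_line(0, num_lines=4, num_blocks=16))       # [0, 4, 8, 12]
```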

48
Direct Mapping (2)
  • Address is in two parts
  • w Least Significant Bits (LSB) identify unique
    word
  • s Most Significant Bits (MSB) specify one memory
    block
  • The MSBs are split into
  • a cache line field r (least significant)
  • a tag of s-r (most significant)

49
Direct Mapping: Summary
  • Address length: n = s + w bits
  • Number of addressable units (words): 2^(s+w)
  • Block size = cache line size = 2^w words
  • Number of memory blocks: 2^(s+w) / 2^w = 2^s
  • Number of cache lines: C = 2^r
  • Tag length: (s - r) bits

50
Cache Line Mapping Table
Cache line   Main memory blocks held
0            0, C, 2C, ..., 2^s - C
1            1, C+1, 2C+1, ..., 2^s - C + 1
...
C-1          C-1, 2C-1, 3C-1, ..., 2^s - 1

51
Mapping Function
  • Word size 1 Byte
  • Cache of 64 KBytes (2^16 Bytes)
  • Cache block of 4 bytes
  • 64 KB / 4 = 16K (2^14) lines of 4 bytes
  • 16 MBytes (2^24) main memory
  • 2^24 / 4 = 4M (2^22) blocks in main memory
  • Map 2^22 blocks to 2^14 lines of cache

52
Direct Mapping: Address Structure
Tag (s-r): 8 bits | Line or Slot (r): 14 bits | Word (w): 2 bits
  • 24-bit address: 16 MBytes (2^24) main memory
  • 2-bit word identifier (4-byte block)
  • Cache: 64 KB / 4 = 16K (2^14) lines of 4 bytes
  • 22-bit block identifier
  • 8-bit tag (22 - 14)
  • 14-bit slot or line
  • No two blocks mapping to the same line have the
    same Tag field
  • Check contents of cache by finding line and
    checking Tag
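The 8/14/2 split of a 24-bit address can be reproduced with shifts and masks. This is a sketch only: the function name `split_direct` and the two sample addresses are chosen for illustration.

```python
LINE_BITS = 14  # r: selects one of 2**14 cache lines
WORD_BITS = 2   # w: selects one of 4 bytes in the block


def split_direct(address: int) -> tuple:
    """Split a 24-bit address into (tag, line, word) fields."""
    word = address & ((1 << WORD_BITS) - 1)
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (WORD_BITS + LINE_BITS)
    return tag, line, word


tag, line, word = split_direct(0x16339C)
print(hex(tag), hex(line), word)  # 0x16 0xce7 0

# A second address that lands on the same line but carries a different tag.
print(split_direct(0xFF339C) == (0xFF, 0x0CE7, 0))  # True
```

Two blocks that share a line always differ in the tag, so comparing the stored tag with the address tag is enough to decide hit or miss.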

53
Direct Mapping Cache Organization
54
Direct Mapping: pros & cons
  • Simple
  • Inexpensive
  • Fixed location for given block
  • If a program repeatedly accesses 2 distinct
    blocks that are mapped to the same line, cache
    misses are very high (thrashing)
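The thrashing case can be made concrete with a toy simulation of a direct-mapped cache that only tracks which block each line holds. The reference patterns and the function name `count_misses` are invented for illustration.

```python
def count_misses(block_refs: list, num_lines: int) -> int:
    """Count misses for a sequence of block references in a direct-mapped cache."""
    lines = [None] * num_lines  # block currently held by each line
    misses = 0
    for block in block_refs:
        i = block % num_lines
        if lines[i] != block:
            misses += 1
            lines[i] = block  # the new block evicts whatever was there
    return misses


# Blocks 0 and 4 both map to line 0 of a 4-line cache: every access misses.
print(count_misses([0, 4] * 5, num_lines=4))  # 10

# Blocks 0 and 1 map to different lines: only the first two accesses miss.
print(count_misses([0, 1] * 5, num_lines=4))  # 2
```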

55
Associative Mapping
  • A main memory block can load into any line of
    cache
  • Memory address is interpreted as tag and word
  • Tag uniquely identifies block of memory
  • Every line's tag is examined for a match
  • Cache searching gets expensive

56
A simple example of Associative Mapping
Address fields: tag (s) | word (w)
Addresses: 00000 00001 00010 00011 00100 00101 00110 00111
01000 01001 01010 .. .. .. 11110 11111
Main memory: Block 0, Block 1, ..., Block 15 (each block holds words w0, w1)

Cache line   Tag
Line 0       0011
Line 1       0001
Line 2       0000
Line 3       0100

Note: a replacement algorithm is needed (see later)
57
Associative Mapping: Summary
  • Address length: n = s + w bits
  • Number of addressable units (words): 2^(s+w)
  • Block size = cache line size = 2^w words
  • Number of memory blocks: 2^(s+w) / 2^w = 2^s
  • Number of cache lines: not specified
  • Tag length: s bits

58
Associative Mapping: Address Structure
Tag: 22 bits | Word: 2 bits
  • 22 bit tag stored with each 4 byte block of data
  • Compare tag field with tag entry in cache to
    check for hit
  • Least significant 2 bits of address identify
    which byte is required from the 4 byte data block

59
Fully Associative Cache Organization
60
Set Associative Mapping
  • Cache is divided into v sets
  • Each set contains k lines
  • Number of cache lines: C = v x k
  • A given block maps to any line in a given set
  • Block j can be in any line of set i, where i = j
    mod v
  • There are k lines in a set (k-way set associative
    mapping)
  • k = 1: direct mapping; k = C: associative mapping
  • The most common choice in practice is 2 lines per set
  • 2-way set associative mapping
  • A given block can be in only one set, but in any
    of its 2 lines

61
A simple example of Set Associative Mapping
Address fields: tag (s-d) | set (d) | word (w)
Addresses: 00000 00001 00010 00011 00100 00101 00110 00111
01000 01001 01010 .. .. .. 11110 11111
Main memory (each block holds words w0, w1):

Block   Set
0       0
1       1
2       0
3       1
4       0
...
15      1

Cache line   Set     Tag
Line 0       Set 0   010
Line 1       Set 0   000
Line 2       Set 1   111
Line 3       Set 1   000

Note: a replacement algorithm is needed (see later)
62
Set Associative Mapping
  • Address is in two parts
  • w Least Significant Bits (LSB) identify unique
    word
  • s Most Significant Bits (MSB) specify one memory
    block
  • The MSBs are split into
  • a cache set field d (least significant)
  • a tag of s-d (most significant)

63
Set Associative Mapping: Summary
  • Address length: n = s + w bits
  • Number of addressable units (words): 2^(s+w)
  • Block size = cache line size = 2^w words
  • Number of memory blocks: 2^(s+w) / 2^w = 2^s
  • Number of lines for each cache set: k
  • Number of sets: v = 2^d
  • Number of cache lines: C = k x v = k x 2^d
  • Tag length: (s - d) bits

64
Set Associative Mapping: Address Structure
Tag: 9 bits | Set: 13 bits | Word: 2 bits
  • Number of cache lines: 2^14
  • Number of cache sets: 2^13
  • Each cache set has two lines: 2-way set
    associative mapping
  • Use set field to determine cache set to look in
  • Compare Tag field with all lines in the set to
    see if we have a hit
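The two lookup steps (select the set, then compare tags within it) can be sketched as follows for the 9/13/2 address layout. The dictionary-based cache model and the function names are assumptions made to keep the example short.

```python
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2


def split_set_assoc(address: int) -> tuple:
    """Split a 24-bit address into (tag, set index, word) fields."""
    word = address & ((1 << WORD_BITS) - 1)
    set_index = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_index, word


def is_hit(cache_sets: dict, address: int) -> bool:
    """cache_sets maps a set index to the list of tags held by its lines."""
    tag, set_index, _ = split_set_assoc(address)
    return tag in cache_sets.get(set_index, [])


print(split_set_assoc(0x16339C) == (0x02C, 0x0CE7, 0))  # True

# Hypothetical cache contents: set 0x0CE7 holds two lines with these tags.
cache_sets = {0x0CE7: [0x02C, 0x1FF]}
print(is_hit(cache_sets, 0x16339C))  # True: tag 0x02C is in the set
print(is_hit(cache_sets, 0x00339C))  # False: same set, tag 0x000 not present
```

In hardware the tag comparisons within a set happen in parallel; the `in` test here only models the outcome, not the timing.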

65
Two Way Set Associative Cache Organization
66
Replacement Algorithms (1): Direct mapping
  • No choice
  • Each block only maps to one line
  • Replace that line

67
Replacement Algorithms (2): Associative & Set
Associative
  • Hardware implemented algorithm (to obtain speed)
  • Least Recently used (LRU)
  • e.g. in 2 way set associative
  • Which of the 2 blocks is LRU?
  • First in first out (FIFO)
  • replace block that has been in cache longest
  • Least frequently used
  • replace block which has had fewest hits
  • Random
  • Almost as good as LRU
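LRU replacement in a set associative cache can be modelled in software with one ordered dictionary per set. Real caches implement this in hardware, so the class below is only a behavioural sketch; its name and interface are hypothetical.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Tracks which blocks are cached; each set evicts its LRU block."""

    def __init__(self, num_sets: int, ways: int):
        self.num_sets = num_sets
        self.ways = ways
        # Each set keeps its blocks ordered from least to most recently used.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, block: int) -> bool:
        """Return True on a hit, False on a miss (the block is then loaded)."""
        cache_set = self.sets[block % self.num_sets]
        if block in cache_set:
            cache_set.move_to_end(block)    # now the most recently used
            return True
        if len(cache_set) == self.ways:
            cache_set.popitem(last=False)   # evict the least recently used
        cache_set[block] = None
        return False


cache = SetAssociativeCache(num_sets=2, ways=2)
# Blocks 0, 2 and 4 all map to set 0, which has room for only two of them.
print([cache.access(b) for b in (0, 2, 0, 4, 0, 2)])
# [False, False, True, False, True, False]
```

In the 2-way case a single bit per set is enough to record which of the two lines was used least recently.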

68
Write Policy
  • Multiple CPUs may have individual caches
  • I/O may address main memory directly

Cache(s) and main memory may become
inconsistent
69
Write through
  • All writes go to main memory as well as cache
  • Multiple CPUs can monitor main memory traffic to
    keep local (to CPU) cache up to date
  • Lots of traffic
  • Slows down writes

70
Write back
  • Updates initially made in cache only
  • Update bit for cache slot is set when update
    occurs
  • If block has to be replaced, write to main memory
    only if update bit is set
  • I/O must access main memory through cache
  • N.B. 15% of memory references are writes
  • Caches of other devices get out of sync
  • Cache coherency problem (a general problem in
    distributed systems !)
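The role of the update bit can be illustrated with a toy direct-mapped write-back cache that stores one value per block. Everything here (class name, one-value-per-block model, the dictionary standing in for main memory) is a simplifying assumption.

```python
class WriteBackCache:
    """Toy direct-mapped cache; main memory is a dict of block -> value."""

    def __init__(self, memory: dict, num_lines: int):
        self.memory = memory
        self.lines = [None] * num_lines  # each entry: [block, value, update_bit]
        self.writes_to_memory = 0

    def _load(self, block: int) -> list:
        i = block % len(self.lines)
        line = self.lines[i]
        if line is None or line[0] != block:
            # Replacing a line: write it back only if its update bit is set.
            if line is not None and line[2]:
                self.memory[line[0]] = line[1]
                self.writes_to_memory += 1
            line = [block, self.memory.get(block, 0), False]
            self.lines[i] = line
        return line

    def read(self, block: int) -> int:
        return self._load(block)[1]

    def write(self, block: int, value: int) -> None:
        line = self._load(block)
        line[1] = value
        line[2] = True  # update made in cache only; memory is now stale


memory = {0: 10, 4: 40}
cache = WriteBackCache(memory, num_lines=4)
cache.write(0, 11)
cache.write(0, 12)
print(memory[0])               # 10: main memory is out of date
cache.read(4)                  # block 4 replaces block 0 in line 0
print(memory[0])               # 12: written back on replacement
print(cache.writes_to_memory)  # 1: two cache writes cost one memory write
```

Between the write and the replacement, anything that reads main memory directly sees the stale value, which is the consistency problem described on this slide.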

71
Block Size
  • Too small
  • Locality of reference is not used
  • Too large
  • Locality of reference is lost
  • Typical block size: 8 to 32 bytes

72
Number of Caches
  • 2 levels of cache
  • L1 on chip (since technology allows it)
  • L2 on board (to fill the speed gap)
  • 2 kinds of cache
  • Data cache
  • Instruction cache
  • To allow parallel instruction processing without
    data fetching interfering