Title: CS 162 Ch 7: Virtual Memory LECTURE 13
- Instructor L.N. Bhuyan
- www.cs.ucr.edu/bhuyan
2. Improving Cache Miss Latency: Reducing DRAM Latency
- Same as improving DRAM latency
- What is random access memory (RAM)? What are static RAM (SRAM) and dynamic RAM (DRAM)?
- What is the DRAM cell organization? How are the cells arranged internally? How is memory addressed? How are DRAMs refreshed? What is the difference between DRAM and SRAM?
- Access time of DRAM: row access time + column access time + refreshing
- What are page-mode and nibble-mode DRAMs?
- Synchronous SRAM or DRAM: the ability to transfer a burst of data given a starting address and a burst length; suitable for transferring a block of data from main memory to cache.
3. Main Memory Organizations (Fig. 7.13)
[Figure: three CPU-cache-bus-memory diagrams comparing (a) a one-word-wide memory organization, (b) a wide memory organization with a multiplexor between cache and memory, and (c) an interleaved memory organization with four memory banks (bank 0 through bank 3).]
- DRAM access time >> bus transfer time
4. Memory Access Time Example
- Assume it takes 1 cycle to send the address, 15 cycles for each DRAM access, and 1 cycle to send a word of data.
- With a cache block of 4 words and one-word-wide DRAM (Fig. 7.13a), miss penalty = 1 + 4x15 + 4x1 = 65 cycles.
- With main memory and a bus width of 2 words (Fig. 7.13b), miss penalty = 1 + 2x15 + 2x1 = 33 cycles. For a 4-word-wide memory, the miss penalty is 1 + 1x15 + 1x1 = 17 cycles. Expensive due to the wide bus and control circuits.
- With interleaved memory of 4 memory banks and the same one-word bus (Fig. 7.13c), miss penalty = 1 + 1x15 + 4x1 = 20 cycles. The memory controller must supply consecutive addresses to the different memory banks. Interleaving is universally adopted in high-performance computers.
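The three miss-penalty calculations above can be reproduced with a short sketch. The function names and parameterization are illustrative, not from the textbook; the cycle counts are the slide's assumptions.

```python
# Miss-penalty calculation for the three memory organizations of Fig. 7.13.
# Slide assumptions: 1 cycle to send the address, 15 cycles per DRAM access,
# 1 cycle per one-word bus transfer, 4-word cache block.

ADDR_CYCLES = 1
DRAM_CYCLES = 15
BUS_CYCLES = 1
BLOCK_WORDS = 4

def one_word_wide():
    # Each of the 4 words needs its own DRAM access and bus transfer.
    return ADDR_CYCLES + BLOCK_WORDS * DRAM_CYCLES + BLOCK_WORDS * BUS_CYCLES

def wide(width_words):
    # Memory and bus are width_words wide, so fewer (wider) transfers.
    transfers = BLOCK_WORDS // width_words
    return ADDR_CYCLES + transfers * DRAM_CYCLES + transfers * BUS_CYCLES

def interleaved(banks):
    # All banks start their access in parallel (one DRAM latency),
    # but the one-word bus still moves the 4 words one at a time.
    return ADDR_CYCLES + 1 * DRAM_CYCLES + BLOCK_WORDS * BUS_CYCLES

print(one_word_wide())  # 65 cycles
print(wide(2))          # 33 cycles
print(wide(4))          # 17 cycles
print(interleaved(4))   # 20 cycles
```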
5. Virtual Memory
- Idea 1: Many programs share DRAM memory so that context switches can occur.
- Idea 2: Allow a program to be written without memory constraints; the program can exceed the size of main memory.
- Idea 3: Relocation. Parts of the program can be placed at different locations in memory instead of in one big chunk.
- Virtual Memory:
- (1) DRAM memory holds many programs running at the same time (processes)
- (2) DRAM memory is used as a kind of cache for disk
6. Virtual Memory Has Its Own Terminology
- Each process has its own private virtual address space (e.g., 2^32 bytes); the CPU actually generates virtual addresses.
- Each computer has a physical address space (e.g., 128 megabytes of DRAM), also called real memory.
- Address translation: mapping virtual addresses to physical addresses
- Allows multiple programs to use (different chunks of physical) memory at the same time
- Also allows some chunks of virtual memory to be kept on disk, not in main memory (to exploit the memory hierarchy)
7. Mapping Virtual Memory to Physical Memory
- Divide memory into equal-sized chunks, or pages (say, 4 KB each)
- Any chunk of virtual memory can be assigned to any chunk of physical memory (page)
[Figure: the virtual address space of a single process (code, static data, heap, and stack regions) mapped page by page onto a 64 MB physical memory; both spaces start at address 0.]
8. Handling Page Faults
- A page fault is like a cache miss: the page must be found in a lower level of the hierarchy.
- If the valid bit is zero, the physical page number points to a page on disk.
- When the OS starts a new process, it creates space on disk for all the pages of the process, sets all valid bits in the page table to zero, and makes all physical page numbers point to disk.
- This is called demand paging: pages of the process are loaded from disk only as needed.
9. Comparing the 2 Levels of the Hierarchy
- Cache | Virtual Memory
- Block or line | Page
- Miss | Page fault
- Block size: 32-64 B | Page size: 4 KB-16 KB
- Placement: direct mapped, N-way set associative | Fully associative
- Replacement: Least Recently Used (LRU) or random | LRU approximation
- Write-through or write-back | Write-back
- How managed: hardware | Software/hardware (operating system)
10. How to Perform Address Translation?
- VM divides memory into equal-sized pages
- Address translation relocates entire pages
- Offsets within the pages do not change
- If the page size is made a power of two, the virtual address separates into two fields (like the cache index and offset fields):
  virtual address = Virtual Page Number | Page Offset
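Because the page size is a power of two, the split is pure bit slicing. A minimal sketch, assuming a 1 KB page (10-bit offset) as in the deck's worked example; the function name is illustrative:

```python
# Split a virtual address into virtual page number and page offset.
# Assumes a 1 KB page size, so the low 10 bits are the offset.

PAGE_SIZE = 1024                           # 1 KB
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 10 bits

def split(virtual_address):
    vpn = virtual_address >> OFFSET_BITS        # upper bits: virtual page number
    offset = virtual_address & (PAGE_SIZE - 1)  # lower bits: unchanged by translation
    return vpn, offset

vpn, offset = split(0x12345)
print(hex(vpn), hex(offset))  # 0x48 0x345
```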
11. Mapping a Virtual to a Physical Address (1 KB page size)
- Virtual address: bits 31..10 are the virtual page number; bits 9..0 are the page offset.
- Translation maps the virtual page number to a physical page number; the page offset passes through unchanged.
- Physical address: bits 29..10 are the physical page number; bits 9..0 are the page offset.
12. Address Translation
- We want fully associative page placement
- How do we locate the physical page? Searching is impractical (too many pages)
- A page table is a data structure that contains the mapping of virtual pages to physical pages
- There are several different ways, all up to the operating system, to keep this data around
- Each process running in the system has its own page table
13. Address Translation: Page Table
[Figure: the virtual address (VA) is split into a virtual page number and an offset. The virtual page number indexes the page table; each entry holds a valid bit, access rights (A.R.), and a physical page number (P.P.N.). The physical page number concatenated with the offset forms the physical memory address (PA); invalid entries refer to pages on disk.]
- The page table is located in physical memory
- Access rights: None, Read Only, Read/Write, Executable
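The translation step can be sketched as follows. This is a minimal illustration, not any real OS's structure: the entry fields mirror the slide (valid bit, access rights, physical page number), and a cleared valid bit raises a page fault.

```python
# Minimal page-table translation sketch; names are illustrative.
# Each entry holds a valid bit, access rights, and a physical page number.

PAGE_SIZE = 4096  # assume 4 KB pages for this example

class PageTableEntry:
    def __init__(self, valid=False, rights="None", ppn=0):
        self.valid = valid    # is the page in physical memory?
        self.rights = rights  # None, Read Only, Read/Write, Executable
        self.ppn = ppn        # physical page number (disk location if invalid)

def translate(page_table, virtual_address):
    vpn = virtual_address // PAGE_SIZE     # index into the page table
    offset = virtual_address % PAGE_SIZE   # passes through unchanged
    entry = page_table[vpn]
    if not entry.valid:
        # Valid bit zero: the page lives on disk -> page fault, OS loads it.
        raise RuntimeError("page fault: OS must load page %d from disk" % vpn)
    return entry.ppn * PAGE_SIZE + offset

table = [PageTableEntry() for _ in range(16)]
table[2] = PageTableEntry(valid=True, rights="Read/Write", ppn=7)
print(hex(translate(table, 2 * PAGE_SIZE + 0x10)))  # 0x7010
```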
15. Optimizing for Space
- The page table is too big!
- 4 GB virtual address space / 4 KB pages -> 2^20 (about 1 million) page table entries -> 4 MB just for the page table of a single process!
- There is a variety of solutions that trade page table size for slower performance when a miss occurs in the TLB:
- Use a limit register to restrict the page table size and let it grow with more pages; multilevel page tables; paging the page tables; etc.
- (Take an O/S class to learn more)
16. How to Translate Fast?
- Problem: virtual memory requires two memory accesses!
- one to translate the virtual address into a physical address (page table lookup)
- one to transfer the actual data (cache hit)
- But the page table is in physical memory!
- Observation: since there is locality in the pages of data, there must be locality in the virtual addresses of those pages!
- Why not create a cache of virtual-to-physical address translations to make translation fast? (smaller is faster)
- For historical reasons, such a page table cache is called a Translation Lookaside Buffer, or TLB
17. Typical TLB Format
- Entry fields: Virtual Page Nbr (the tag) | Physical Page Nbr (the data) | Valid | Ref | Dirty | Access Rights
- The TLB is just a cache of the page table mappings
- Dirty: since we use write-back, we need to know whether or not to write the page to disk when it is replaced
- Ref: used to calculate LRU on replacement
- TLB access time is comparable to cache access time (much less than main memory access time)
18. Translation Look-Aside Buffers
- The TLB is usually small, typically 32-4,096 entries
- Like any other cache, the TLB can be fully associative, set associative, or direct mapped
[Figure: the processor sends a virtual address to the TLB; on a TLB hit, the physical address goes to the cache (and, on a cache miss, to main memory); on a TLB miss, the page table is consulted; a page fault or protection violation invokes the OS fault handler, which brings the page in from disk memory.]
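The TLB-before-page-table flow in the diagram can be sketched as below. The class name and the dict-based TLB are illustrative simplifications (a real TLB is a fixed-size associative hardware structure with replacement):

```python
# Sketch of the translation flow: try the TLB first; on a miss, walk the
# page table and refill the TLB; a missing mapping is a page fault.

PAGE_SIZE = 4096  # assume 4 KB pages

class MMU:
    def __init__(self, page_table):
        self.page_table = page_table  # vpn -> ppn (valid entries only)
        self.tlb = {}                 # small cache of recent translations

    def translate(self, va):
        vpn, offset = va // PAGE_SIZE, va % PAGE_SIZE
        if vpn in self.tlb:                       # TLB hit: no extra memory access
            return self.tlb[vpn] * PAGE_SIZE + offset
        if vpn in self.page_table:                # TLB miss: page table lookup
            self.tlb[vpn] = self.page_table[vpn]  # refill the TLB
            return self.tlb[vpn] * PAGE_SIZE + offset
        raise RuntimeError("page fault: OS loads the page from disk")

mmu = MMU({0: 5, 1: 9})
print(hex(mmu.translate(0x123)))  # TLB miss, then refill -> 0x5123
print(hex(mmu.translate(0x123)))  # now a TLB hit -> 0x5123
```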
19. Real Stuff: DECStation 3100/MIPS R2000
[Figure: address translation datapath. The 32-bit virtual address splits into a 20-bit virtual page number (bits 31..12) and a 12-bit page offset (bits 11..0). The TLB (64 entries, fully associative) holds valid, dirty, tag, and physical page number fields; a TLB hit yields a 20-bit physical page number, which joins the page offset to form the physical address. The physical address then splits into a 16-bit physical address tag, a 14-bit cache index, and a 2-bit byte offset for the cache (16K entries, direct mapped), whose entries hold valid, tag, and 32-bit data fields and signal a cache hit.]
20. Real Stuff: Pentium Pro Memory Hierarchy
- Address size: 32 bits (VA, PA)
- VM page size: 4 KB or 4 MB
- TLB organization: separate i-TLB and d-TLB (i-TLB: 32 entries, d-TLB: 64 entries); 4-way set associative; LRU approximated; hardware handles misses
- L1 cache: 8 KB, separate i-cache and d-cache; 4-way set associative; LRU approximated; 32-byte block; write-back
- L2 cache: 256 or 512 KB