Title: Virtual Memory
1. Virtual Memory
2. Review: The memory hierarchy
- Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology, at the speed offered by the fastest technology
[Figure: memory hierarchy pyramid: Processor at the top, then L1, L2, Main Memory, Secondary Memory; access time increases with distance from the processor, and the (relative) size of the memory grows at each level]
3. Virtual memory
- Use main memory as a "cache" for secondary memory
- Allows efficient and safe sharing of memory among multiple programs
- Provides the ability to easily run programs larger than the size of physical memory
- Automatically manages the memory hierarchy (as one level)
- What makes it work? Again, the Principle of Locality: a program is likely to access a relatively small portion of its address space during any period of time
- Each program is compiled into its own address space, a "virtual" address space
- During run-time each virtual address must be translated to a physical address (an address in main memory)
4. IBM System/360 Model 67
5. VM simplifies loading and sharing
- Simplifies loading a program for execution by avoiding code relocation
- Address mapping allows programs to be loaded at any location in physical memory
- Simplifies shared libraries, since all sharing programs can use the same virtual addresses
- Relocation does not need special OS hardware support as in the past
6. Virtual memory motivation
- "Historically, there were two major motivations for virtual memory: to allow efficient and safe sharing of memory among multiple programs, and to remove the programming burden of a small, limited amount of main memory." (Patterson and Hennessy)
- "A system has been devised to make the core-drum combination appear to the programmer as a single level store, the requisite transfers taking place automatically." (Kilburn et al.)
7. Terminology
- Page: a fixed-size block of memory, 512-4096 bytes
- Segment: a contiguous block of memory of variable size
- Page fault: a page is referenced, but is not in memory
- Virtual address: the address seen by the program
- Physical address: the address seen by the cache or memory
- Memory mapping (or address translation): see next slide
8. Memory management unit
[Figure: the MMU sits between the processor and memory, taking addresses from the processor and sending translated addresses to memory]
9. Address translation
- A virtual address is translated to a physical address by a combination of hardware and software
[Figure: a 32-bit virtual address (VA) split into a virtual page number (bits 31..12) and a page offset (bits 11..0)]
- So each memory request first requires an address translation from the virtual space to the physical space
- A virtual memory miss (i.e., when the page is not in physical memory) is called a page fault
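The split above can be sketched in a few lines. This assumes the slide's 4 KB pages (12 offset bits) and a 32-bit address; the helper names and the sample page table are illustrative:

```python
# Splitting a virtual address into virtual page number (VPN) and page
# offset, then forming a physical address from a page-table mapping.
PAGE_OFFSET_BITS = 12
PAGE_SIZE = 1 << PAGE_OFFSET_BITS  # 4096 bytes

def split_virtual_address(va: int) -> tuple[int, int]:
    """Return (virtual page number, page offset) for a 32-bit VA."""
    vpn = va >> PAGE_OFFSET_BITS       # bits 31..12
    offset = va & (PAGE_SIZE - 1)      # bits 11..0
    return vpn, offset

def translate(vpn: int, page_table: dict, offset: int) -> int:
    """Form a physical address from the physical page number in the table."""
    ppn = page_table[vpn]              # a missing key would be a page fault
    return (ppn << PAGE_OFFSET_BITS) | offset

vpn, offset = split_virtual_address(0x00402ABC)
# vpn == 0x402, offset == 0xABC
pa = translate(vpn, {0x402: 0x7F}, offset)
# pa == 0x7FABC
```

Note that the page offset passes through translation unchanged; only the page number is mapped.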
10. Mapping virtual to physical space
[Figure: (a) a 64K virtual address space mapped onto a 32K main memory in 4K pages; (b) the corresponding virtual address to main memory address mapping]
11. A paging system
[Figure: virtual page numbers indexing a page table that points into physical memory]
The page table maps each page in virtual memory to either a page in physical memory or a page stored on disk, which is the next level in the hierarchy.
12. A virtual address cache (TLB)
The TLB acts as a cache on the page table for
the entries that map to physical pages only
13. Two Programs Sharing Physical Memory
- A program's address space is divided into pages (all one fixed size) or segments (variable sizes)
- The starting location of each page (either in main memory or in secondary memory) is contained in the program's page table
[Figure: the virtual address spaces of Program 1 and Program 2 both mapping into the same main memory]
14. Typical ranges of VM parameters
- These figures, contrasted with the values for
caches, represent increases of 10 to 100,000
times.
15. Some virtual memory design parameters
16. Technology
- Technology: access time / $ per GB in 2004
- SRAM: 0.5-5 ns / $4,000-$10,000
- DRAM: 50-70 ns / $100-$200
- Magnetic disk: 5-20 x 10^6 ns / $0.50-$2
17. Address Translation Considerations
- Direct mapping using register sets
- Indirect mapping using tables
- Associative mapping of frequently used pages
18. Fundamental considerations
- The Page Table (PT) must have one entry for each page in virtual memory!
- How many pages?
- How large is the PT?
19. 4 key design issues
- Pages should be large enough to amortize the high access time of disk. Sizes from 4 KB to 16 KB are typical, and some designers are considering sizes as large as 64 KB.
- Organizations that reduce the page fault rate are attractive. The primary technique used here is to allow flexible placement of pages (e.g., fully associative).
20. 4 key design issues (cont.)
- Page faults (misses) in a virtual memory system can be handled in software, because the overhead will be small compared to the access time to disk. Furthermore, the software can afford to use clever algorithms for choosing how to place pages, because even small reductions in the miss rate will pay for the cost of such algorithms.
- Using write-through to manage writes in virtual memory will not work, since writes take too long. Instead, we need a scheme that reduces the number of disk writes.
21. Page Size Selection Constraints
- Efficiency of the secondary memory device (slotted disk/drum)
- Page table size
- Page fragmentation: the last part of the last page is wasted
- Program logic structure: logical block size < 1K-4K
- Table fragmentation: a full PT can occupy a large, sparse space
- Uneven locality: text, globals, stack
- Miss ratio
22. An Example
- Case 1
- VM page size: 512
- VM address space: 64K
- Total virtual pages: 64K / 512 = 128 pages
23. An Example (cont.)
- Case 2
- VM page size: 512 = 2^9
- VM address space: 4G = 2^32
- Total virtual pages: 4G / 512 = 2^32 / 2^9 = 2^23 = 8M pages
- Each PTE has 32 bits, so total PT size = 8M x 4 = 32M bytes
- Note: assuming main memory holds a working set of 4M bytes, that is 4M / 512 = 2^22 / 2^9 = 2^13 = 8192 pages
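The arithmetic in both cases above can be checked mechanically; the function name here is illustrative:

```python
# Size of a flat (one-level) page table: one PTE per virtual page.
def flat_page_table_bytes(va_space: int, page_size: int, pte_bytes: int) -> int:
    num_pages = va_space // page_size   # total virtual pages
    return num_pages * pte_bytes

# Case 1: 64K address space, 512-byte pages -> 128 pages
assert (64 * 2**10) // 512 == 128

# Case 2: 4G address space, 512-byte pages, 4-byte PTEs -> 32 MB of PT
size = flat_page_table_bytes(4 * 2**30, 512, 4)
print(size // 2**20)  # 32 (MB), matching the slide's 8M pages x 4 bytes
```

The takeaway is that the flat table grows linearly with the address space, which motivates the size-reduction techniques on the later slide.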
24. An Example (cont.)
- How about:
- VM address space: 2^52 (R-6000), i.e., 4 petabytes
- Page size: 4K bytes
- So the total number of virtual pages is 2^52 / 2^12 = 2^40!
25. Techniques for Reducing PT Size
- Set a lower limit, and permit dynamic growth
- Permit growth from both directions (text, stack)
- Inverted page table (a hash table)
- Multi-level page table (segments and pages)
- The PT itself can be paged, i.e., put the PT itself in the virtual address space (note: some small portion of its pages should be in main memory and never paged out)
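A minimal sketch of the multi-level technique listed above, assuming an illustrative 10/10/12-bit split of a 32-bit address with 4 KB pages. Second-level tables are allocated only for regions actually in use, which is what shrinks the table:

```python
# Two-level page table: a sparse level-1 table of level-2 tables.
L1_BITS, L2_BITS, OFFSET_BITS = 10, 10, 12

class TwoLevelPageTable:
    def __init__(self):
        self.l1 = {}  # level-1 index -> level-2 table (dict)

    def map(self, vpn: int, ppn: int) -> None:
        i1, i2 = vpn >> L2_BITS, vpn & ((1 << L2_BITS) - 1)
        self.l1.setdefault(i1, {})[i2] = ppn   # allocate an L2 table on demand

    def lookup(self, va: int) -> int:
        vpn, offset = va >> OFFSET_BITS, va & ((1 << OFFSET_BITS) - 1)
        i1, i2 = vpn >> L2_BITS, vpn & ((1 << L2_BITS) - 1)
        l2 = self.l1.get(i1)
        if l2 is None or i2 not in l2:
            raise KeyError("page fault")       # no mapping present
        return (l2[i2] << OFFSET_BITS) | offset

pt = TwoLevelPageTable()
pt.map(vpn=0x401, ppn=0x12)
# Only one level-2 table exists, instead of 2**20 flat entries.
```

A flat table for this address space would need 2^20 entries up front; here each populated level-2 table covers only 2^10 pages of the space.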
26. LSI-11/73 Segment Registers
27. VM implementation issues
- Page fault handling: hardware, software, or both
- Efficient input/output: slotted drum/disk
- Queue management; a process can be linked on:
- CPU ready queue: waiting for the CPU
- Page-in queue: waiting for a page transfer from disk
- Page-out queue: waiting for a page transfer to disk
- Protection issues: read/write/execute
- Management bits: dirty, reference, valid
- Multiple program issues: context switch, timeslice end
28. Where to place pages
- Placement
- OS designers always pick lower miss rates over a simpler placement algorithm
- So, full associativity: VM pages can go anywhere in main memory (compare with a sector cache)
- Question: why not use associative hardware?
- (The number of PT entries is too big!)
29. How to handle protection and multiple users
- If s/u = 1: supervisor mode
- PME(x).C = 1: the page at the PFA has been modified
- PME(x).P = 1: the page is private to the process
- PME(x).pid: process identification number
- PME(x).PFA: page frame address
Virtual to real address translation using a page map
30. Page fault handling
- When a virtual page number is not in the TLB, the PT in memory is accessed (through the PTBR) to find the PTE
- Hopefully, the PTE is in the data cache
- If the PTE indicates that the page is missing, a page fault occurs
- If so, put the disk sector number and page number on the page-in queue and continue with the next process
- If all page frames in main memory are occupied, find a suitable one and put it on the page-out queue
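The sequence above can be sketched as follows. The PTE fields, queue shape, and function names are illustrative, not from any particular OS:

```python
# TLB lookup -> page-table walk -> page fault path, as on the slide.
from collections import deque

page_in_queue: deque = deque()   # (disk sector, virtual page number) pairs

def access(va, tlb, page_table, page_offset_bits=12):
    vpn = va >> page_offset_bits
    if vpn in tlb:                       # TLB hit: the fast path
        return tlb[vpn]
    pte = page_table.get(vpn)            # TLB miss: walk the PT in memory
    if pte is None or not pte["valid"]:  # page not resident: page fault
        page_in_queue.append((pte["disk_sector"] if pte else None, vpn))
        return None                      # the OS would run another process
    tlb[vpn] = pte["ppn"]                # refill the TLB with the translation
    return tlb[vpn]
```

A real handler would also evict a frame onto the page-out queue when memory is full, as the last bullet notes; that bookkeeping is omitted here.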
31. Fast address translation
- Translation through the PT requires at least two memory accesses for each memory fetch or store
- Improvements:
- Store the PT in fast registers (example: Xerox, 256 registers)
- Implement a VM address cache (TLB)
- Make maximal use of the instruction/data cache
32. Some typical values for a TLB might be
- Miss penalty: may sometimes be as high as 100 cycles
- TLB size: can be as small as 16 entries
33. TLB design issues
- Placement policy
- For small TLBs, fully associative placement can be used
- For large TLBs, fully associative placement may be too slow
- Replacement policy: a random policy is used for speed/simplicity
- The TLB miss rate is low (Clark-Emer data, 1985: 3-4 times smaller than the usual cache miss rate)
- The TLB miss penalty is relatively low: a miss usually results in a cache fetch
34. TLB design issues (cont.)
- A TLB miss implies a higher miss rate for the main cache
- TLB translation is process-dependent
- Strategies for context switching:
- 1. Tagging entries by context
- 2. Flushing: a complete purge by context (shared)
- There is no absolute answer
35. A Case Study: DECStation 3100
[Figure: DECStation 3100 address translation. The 32-bit virtual address splits into a 20-bit virtual page number (bits 31..12) and a 12-bit page offset (bits 11..0). The TLB translates the virtual page number into a 20-bit physical page number, asserting "TLB hit". The resulting physical address is split into a 16-bit tag, a 14-bit index, and a 2-bit byte offset for the cache (valid, tag, data fields), producing a "cache hit" signal and 32 bits of data]
36. DECStation 3100 TLB and cache
37. IBM System/360-67 memory management unit
- CPU cycle time: 200 ns; memory cycle time: 750 ns
38. IBM System/360-67 address translation
[Figure: the bus-out address from the CPU, Page (12 bits) + Offset (12 bits), is interpreted as part of a 32-bit virtual address of Segment (12) + Page (8) + Offset (12); Dynamic Address Translation (DAT) produces the bus-in address to memory, Page (12) + Offset (12)]
39. IBM System/360-67 associative registers
[Figure: associative registers map the VM page (12 bits) of the bus-out address from the CPU to the physical page (12 bits) of the bus-in address to memory; the offset (12 bits) passes through unchanged]
40. IBM System/360-67 segment/page mapping
[Figure: a 24-bit virtual address of Segment (4) + Page (8) + Offset (12). The Segment Table Register (32 bits) locates the segment table; each segment table entry points to a page table; each page table entry holds a physical page address plus V/R/W bits. V = valid bit, R = reference bit, W = write (dirty) bit]
41. Virtual addressing with a cache
- Thus it takes an extra memory access to translate a VA to a PA
- This makes memory (cache) accesses very expensive (if every access were really two accesses)
- The hardware fix is to use a Translation Lookaside Buffer (TLB): a small cache that keeps track of recently used address mappings, to avoid having to do a page table lookup
42. Making address translation fast
[Figure: a page table in physical memory with a valid bit per entry; valid entries hold a physical page base address in main memory, while invalid entries refer to disk storage]
43. Translation lookaside buffers (TLBs)
- Just like any other cache, the TLB can be organized as fully associative, set associative, or direct mapped
- A TLB entry holds: valid bit, virtual page, physical page, and dirty, reference, and access bits
- TLB access time is typically smaller than cache access time (because TLBs are much smaller than caches)
- TLBs are typically not more than 128 to 256 entries, even on high-end machines
44. A TLB in the memory hierarchy
- On a TLB miss: is it a page fault or merely a TLB miss?
- If the page is loaded into main memory, then the TLB miss can be handled (in hardware or software) by loading the translation information from the page table into the TLB
- Takes 10s of cycles to find and load the translation info into the TLB
- If the page is not in main memory, then it's a true page fault
- Takes 1,000,000s of cycles to service a page fault
- TLB misses are much more frequent than true page faults
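A back-of-the-envelope model makes the asymmetry above concrete. The cycle counts follow the slide's rough figures (10s of cycles for a TLB refill, around 10^6 for a page fault); the miss rates are illustrative:

```python
# Average cycles per memory access with TLB misses and page faults.
def avg_access_cycles(tlb_miss_rate, page_fault_rate,
                      hit=1, tlb_refill=20, fault=1_000_000):
    return (hit
            + tlb_miss_rate * tlb_refill
            + page_fault_rate * fault)

# Frequent-but-cheap TLB misses barely hurt; rare page faults dominate:
print(avg_access_cycles(0.01, 0.0))       # about 1.2 cycles
print(avg_access_cycles(0.01, 0.00001))   # about 11.2 cycles
```

Even a one-in-100,000 page fault rate costs far more than a 1% TLB miss rate, which is why page fault rates, not TLB miss rates, drive VM design.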
45. Two Machines' Cache Parameters
46. TLB Event Combinations
47. TLB Event Combinations (cont.)
TLB | Page Table | Cache | Possible?
Hit | Hit | Hit | Yes: what we want!
Hit | Hit | Miss | Yes: although the page table is not checked if the TLB hits
Miss | Hit | Hit | Yes: TLB miss, PA in page table
Miss | Hit | Miss | Yes: TLB miss, PA in page table, but data not in cache
Miss | Miss | Miss | Yes: page fault
Hit | Miss | Hit/Miss | Impossible: TLB translation not possible if page is not present in memory
Miss | Miss | Hit | Impossible: data not allowed in cache if page is not in memory
48. Reducing Translation Time
- Can overlap the cache access with the TLB access
- Works when the high-order bits of the VA are used to access the TLB while the low-order bits are used as the index into the cache
[Figure: a 2-way associative cache accessed in parallel with the TLB. The VA tag goes to the TLB while the index and block offset select a set in each way; the PA tag produced on a "TLB hit" is compared against the tags of both ways to generate "cache hit" and select the desired word]
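The overlap above is only feasible when the cache index plus block-offset bits fit inside the page offset, since those bits are identical in the VA and PA and need no translation. A quick feasibility check under that assumption (sizes illustrative):

```python
# Can a cache be indexed in parallel with TLB translation?
def can_overlap(cache_bytes: int, associativity: int, page_bytes: int) -> bool:
    # Bits used to index the cache = log2(bytes per way).
    index_plus_offset_bits = (cache_bytes // associativity).bit_length() - 1
    page_offset_bits = page_bytes.bit_length() - 1
    # Overlap works only if the index bits come from the untranslated offset.
    return index_plus_offset_bits <= page_offset_bits

print(can_overlap(8 * 1024, 2, 4096))    # True: 4 KB per way fits in a page
print(can_overlap(32 * 1024, 2, 4096))   # False: 16 KB per way needs VA bits
```

This is one reason L1 caches are often kept small or made more associative: raising associativity shrinks the bytes per way, keeping the index inside the page offset.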
49. Why Not a Virtually Addressed Cache?
- A virtually addressed cache would only require address translation on cache misses
- But: two different virtual addresses can map to the same physical address (when processes are sharing data), i.e., two different cache entries hold data for the same physical address: synonyms
- Must update all cache entries with the same physical address, or the memory becomes inconsistent
50. The Hardware/Software Boundary
- Which parts of the virtual-to-physical address translation are done by, or assisted by, the hardware?
- Translation Lookaside Buffer (TLB) that caches the recent translations
- TLB access time is part of the cache hit time
- May allot an extra stage in the pipeline for TLB access
- Page table storage, fault detection, and updating
- Page faults result in interrupts (precise) that are then handled by the OS
- Hardware must support (i.e., update appropriately) the Dirty and Reference bits (e.g., for LRU) in the page tables
- Disk placement
- Bootstrap (e.g., out of disk sector 0) so the system can service a limited number of page faults before the OS is even loaded
51. Very little hardware, with software assist
The TLB acts as a cache on the page table for the entries that map to physical pages only
52. Summary
- The Principle of Locality:
- A program is likely to access a relatively small portion of the address space at any instant of time
- Temporal Locality: locality in time
- Spatial Locality: locality in space
- Caches, TLBs, and Virtual Memory can all be understood by examining how they deal with the four questions:
- Where can a block be placed?
- How is a block found?
- Which block is replaced on a miss?
- How are writes handled?
- Page tables map virtual addresses to physical addresses
- TLBs are important for fast translation