Virtual Memory

About This Presentation

Slides: 47

Provided by: ranveer7

Learn more at: https://www.cs.cornell.edu

Transcript and Presenter's Notes

1
Virtual Memory
2
Announcements

Prelim coming up in one week
In 203 Thurston, Thursday October 16th, 10:10-11:25pm, 1½ hour
Topics: Everything up to (and including) Thursday, October 9th
Lectures 1-13, chapters 1-9, and 13 (8th ed)
Review Session will be this Thursday, October 9th
Time and Location TBD: Possibly 6:30pm - 7:30pm
Nazrul's office hours changed for today
12:30pm - 2:30pm in Upson 328
Homework 3 due today, October 7th
CS 4410 Homework 2 graded. (Solutions available via CMS).
Mean 45 (stddev 5), High 50 out of 50
Common problems
Q1: did not satisfy bounded waiting
mutual exclusion was not violated

3
Homework 2, Question 1

Dekker's Algorithm (1965)

// J denotes the index of the other process
CSEnter(int i) {
    inside[i] = true;
    while (inside[J]) {
        if (turn == J) {
            inside[i] = false;
            while (turn == J) continue;
            inside[i] = true;
        }
    }
}

CSEnter(int i) {
    inside[i] = true;
    while (inside[J]) {
        inside[i] = false;
        while (turn == J) continue;
        inside[i] = true;
    }
}

CSExit(int i) {
    turn = J;
    inside[i] = false;
}

4
Review: Multi-level Translation

Illusion of a contiguous address space
Physical reality:
address space broken into segments or fixed-size pages
Segments or pages spread throughout physical memory
Could have any number of levels. Example (top segment)
What must be saved/restored on context switch?
Contents of top-level segment registers (for this example)
Pointer to top-level table (page table)

5
Review: Two-Level Page Table

Tree of Page Tables
Tables fixed size (1024 entries)
On context-switch: save single PageTablePtr register
Sometimes, top-level page tables called directories (Intel)
Each entry called a (surprise!) Page Table Entry (PTE)
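
The two-level lookup can be sketched roughly as follows. This is only an illustration, assuming the 10/10/12 address split and 1024-entry tables mentioned on these slides; the nested-dict layout and the names `split_vaddr` and `translate` are hypothetical, not taken from the lecture.

```python
# Illustrative sketch only: 32-bit virtual address split as 10 / 10 / 12 bits.
PAGE_SHIFT = 12                      # 12-bit offset (4 KB pages)
INDEX_BITS = 10                      # 1024 entries per table
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def split_vaddr(vaddr):
    """Return (directory index, table index, offset) for a 32-bit address."""
    dir_idx = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    tbl_idx = (vaddr >> PAGE_SHIFT) & INDEX_MASK
    offset = vaddr & OFFSET_MASK
    return dir_idx, tbl_idx, offset


def translate(page_directory, vaddr):
    """Walk a two-level table modeled as nested dicts.

    A missing entry stands in for an invalid PTE; returning None here is
    where real hardware would raise a page fault.
    """
    dir_idx, tbl_idx, offset = split_vaddr(vaddr)
    page_table = page_directory.get(dir_idx)
    if page_table is None:
        return None
    frame = page_table.get(tbl_idx)
    if frame is None:
        return None
    return (frame << PAGE_SHIFT) | offset


# Hypothetical mapping: directory entry 1 -> table, table entry 2 -> frame 0x345
directory = {1: {2: 0x345}}
paddr = translate(directory, 0x00402678)
print(hex(paddr))  # 0x345678
```

Only the top-level `directory` object would need to be switched on a context switch, which mirrors the single PageTablePtr register above.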

6
What is in a PTE?

What is in a Page Table Entry (or PTE)?
Pointer to next-level page table or to actual page
Permission bits: valid, read-only, read-write, execute-only
Example: Intel x86 architecture PTE
Address: same format as previous slide (10, 10, 12-bit offset)
Intermediate page tables called Directories
P: Present (same as valid bit in other architectures)
W: Writeable
U: User accessible
PWT: Page write transparent: external cache write-through
PCD: Page cache disabled (page cannot be cached)
A: Accessed: page has been accessed recently
D: Dirty (PTE only): page has been modified recently
L: L=1 => 4MB page (directory only). Bottom 22 bits of virtual address serve as offset
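
A small sketch of decoding these fields from a 32-bit entry. The bit positions (P in bit 0 through L in bit 7, frame number in the top 20 bits) are an assumption based on the standard 32-bit x86 paging layout rather than on the slide itself, and `decode_pte` is a hypothetical helper name.

```python
# Assumed bit positions (32-bit x86 paging); the slide names the flags
# but does not give their positions.
FLAG_BITS = {"P": 0, "W": 1, "U": 2, "PWT": 3, "PCD": 4, "A": 5, "D": 6, "L": 7}


def decode_pte(pte):
    """Split a 32-bit entry into (frame number, dict of flag booleans)."""
    flags = {name: bool((pte >> bit) & 1) for name, bit in FLAG_BITS.items()}
    frame = pte >> 12                # top 20 bits: physical frame number
    return frame, flags


frame, flags = decode_pte(0x00345067)
print(hex(frame))                                  # 0x345
print([name for name, on in flags.items() if on])  # ['P', 'W', 'U', 'A', 'D']
```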

7
Examples of how to use a PTE

How do we use the PTE?
Invalid PTE can imply different things:
Region of address space is actually invalid, or
Page/directory is just somewhere else than memory
Validity checked first
OS can use other (say) 31 bits for location info
Usage Example: Demand Paging
Keep only active pages in memory
Place others on disk and mark their PTEs invalid
Usage Example: Copy on Write
UNIX fork gives copy of parent address space to child
Address spaces disconnected after child created
How to do this cheaply?
Make copy of parent's page tables (point at same memory)
Mark entries in both sets of page tables as read-only
Page fault on write creates two copies
Usage Example: Zero Fill On Demand
New data pages must carry no information (say, be zeroed)
Mark PTEs as invalid; page fault on use gets zeroed page
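
The copy-on-write steps can be mimicked with a toy model. This is a sketch under simplifying assumptions, not how any real kernel is structured: the classes `PhysicalMemory` and `Process`, the reference counts, and the dict-based page table are all hypothetical, and every page is assumed to be writable before the fork.

```python
class PhysicalMemory:
    """Toy frame store with per-frame reference counts (illustrative only)."""

    def __init__(self):
        self.frames = {}      # frame number -> bytearray contents
        self.refcount = {}    # frame number -> number of page tables mapping it
        self.next_frame = 0

    def alloc(self, data=b""):
        frame = self.next_frame
        self.next_frame += 1
        self.frames[frame] = bytearray(data)   # copies the data
        self.refcount[frame] = 1
        return frame


class Process:
    def __init__(self, mem):
        self.mem = mem
        self.page_table = {}  # vpn -> {"frame": int, "writable": bool}

    def map_new(self, vpn, data):
        self.page_table[vpn] = {"frame": self.mem.alloc(data), "writable": True}

    def fork(self):
        """Copy the page table only; mark both copies read-only."""
        child = Process(self.mem)
        for vpn, pte in self.page_table.items():
            pte["writable"] = False
            child.page_table[vpn] = {"frame": pte["frame"], "writable": False}
            self.mem.refcount[pte["frame"]] += 1
        return child

    def read(self, vpn):
        return bytes(self.mem.frames[self.page_table[vpn]["frame"]])

    def write(self, vpn, data):
        pte = self.page_table[vpn]
        if not pte["writable"]:                  # simulated write fault
            old = pte["frame"]
            if self.mem.refcount[old] > 1:       # still shared: make a private copy
                self.mem.refcount[old] -= 1
                pte["frame"] = self.mem.alloc(self.mem.frames[old])
            pte["writable"] = True
        self.mem.frames[pte["frame"]][:] = data


mem = PhysicalMemory()
parent = Process(mem)
parent.map_new(0, b"hello")
child = parent.fork()          # no page data copied yet
child.write(0, b"world")       # fault: child gets its own frame
print(parent.read(0), child.read(0))  # b'hello' b'world'
```

No page contents are copied at fork time; the copy happens only on the first write to a shared page.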

8
How is the translation accomplished?

What, exactly, happens inside the MMU?
One possibility: Hardware Tree Traversal
For each virtual address, takes page table base pointer and traverses the page table in hardware
Generates a Page Fault if it encounters invalid PTE
Fault handler will decide what to do
More on this next lecture
Pros: Relatively fast (but still many memory accesses!)
Cons: Inflexible, complex hardware
Another possibility: Software
Each traversal done in software
Pros: Very flexible
Cons: Every translation must invoke Fault!
In fact, need way to cache translations for either case!

9
Caching Concept

Cache: a repository for copies that can be accessed more quickly than the original
Make frequent case fast and infrequent case less dominant
Caching underlies many of the techniques that are used today to make computers fast
Can cache: memory locations, address translations, pages, file blocks, file names, network routes, etc.
Only good if:
Frequent case frequent enough, and
Infrequent case not too expensive
Important measure: Average Access Time = (Hit Rate x Hit Time) + (Miss Rate x Miss Time)
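
As a quick worked instance of the average access time measure, the sketch below plugs in made-up numbers (99% hit rate, 1 ns hit time, 100 ns miss time); these values are assumptions chosen for illustration and do not come from the lecture.

```python
def average_access_time(hit_rate, hit_time, miss_time):
    """Average Access Time = (Hit Rate x Hit Time) + (Miss Rate x Miss Time)."""
    miss_rate = 1.0 - hit_rate
    return hit_rate * hit_time + miss_rate * miss_time


# Hypothetical numbers: even a 1% miss rate roughly doubles the average time.
aat = average_access_time(hit_rate=0.99, hit_time=1.0, miss_time=100.0)
print(round(aat, 2))  # 1.99
```

With these numbers the rare case contributes about as much as the frequent one, which is why the infrequent case must not be too expensive.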

10
Why Bother with Caching?
[Figure: Processor-DRAM Memory Gap (latency). Performance (log scale, 1 to 1000) versus Time (1980 to 2000), with curves labeled "Moore's Law (really Joy's Law)" and "Less' Law?"]
11
Another Major Reason to Deal with Caching

Too expensive to translate on every access
At least two DRAM accesses per actual DRAM access
Or perhaps I/O if page table partially on disk!
Even worse problem: What if we are using caching to make memory access faster than DRAM access???
Solution? Cache translations!
Translation Cache: TLB (Translation Lookaside Buffer)

12
Why Does Caching Help? Locality!

Temporal Locality (Locality in Time)
Keep recently accessed data items closer to
processor
Spatial Locality (Locality in Space)
Move contiguous blocks to the upper levels

13
Review: Memory Hierarchy of a Modern Computer System

Take advantage of the principle of locality to:
Present as much memory as in the cheapest technology
Provide access at speed offered by the fastest technology

14
A Summary on Sources of Cache Misses

Compulsory (cold start): first reference to a block
Cold fact of life: not a whole lot you can do about it
Note: When running billions of instructions, Compulsory Misses are insignificant
Capacity:
Cache cannot contain all blocks accessed by the program
Solution: increase cache size
Conflict (collision):
Multiple memory locations mapped to same cache location
Solutions: increase cache size, or increase associativity
Two others:
Coherence (Invalidation): other process (e.g., I/O) updates memory
Policy: Due to non-optimal replacement policy

15
Review: Where does a Block Get Placed in a Cache?

Example: Block 12 placed in 8 block cache
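
The placement rule for this example can be worked out with a short sketch. It assumes the usual modulo mapping (set index = block address mod number of sets) and that the blocks of a set sit in consecutive cache slots; the function name `candidate_slots` and the 2-way configuration are illustrative assumptions, since the slide's figure is not in the transcript.

```python
def candidate_slots(block_addr, cache_blocks, ways):
    """Return the cache slots where a memory block may be placed.

    ways=1 is direct mapped, ways=cache_blocks is fully associative,
    anything in between is set associative.
    """
    num_sets = cache_blocks // ways
    set_index = block_addr % num_sets
    first = set_index * ways
    return list(range(first, first + ways))


print(candidate_slots(12, 8, 1))  # [4]  direct mapped: 12 mod 8 = 4
print(candidate_slots(12, 8, 2))  # [0, 1]  2-way: set 12 mod 4 = 0
print(candidate_slots(12, 8, 8))  # [0, 1, 2, 3, 4, 5, 6, 7]  fully associative
```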

16
Other Caching Questions

What line gets replaced on cache miss?
Easy for Direct Mapped: Only one possibility
Set Associative or Fully Associative:
Random
LRU (Least Recently Used)
What happens on a write?
Write through: The information is written to both the cache and to the block in the lower-level memory
Write back: The information is written only to the block in the cache
Modified cache block is written to main memory only when it is replaced
Question: is block clean or dirty?
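
To make the write-back policy and the clean/dirty question concrete, here is a minimal sketch of a fully associative cache with LRU replacement. The class name `WriteBackCache`, the dict standing in for main memory, and the `writebacks` counter are assumptions made for illustration.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy fully associative cache: LRU replacement, write-back policy."""

    def __init__(self, capacity, memory):
        self.capacity = capacity
        self.memory = memory          # dict: block address -> value
        self.lines = OrderedDict()    # block address -> [value, dirty]
        self.writebacks = 0

    def _load(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)          # mark most recently used
            return self.lines[addr]
        if len(self.lines) >= self.capacity:
            victim, (value, dirty) = self.lines.popitem(last=False)  # LRU line
            if dirty:                             # clean victims are just dropped
                self.memory[victim] = value
                self.writebacks += 1
        self.lines[addr] = [self.memory.get(addr, 0), False]
        return self.lines[addr]

    def read(self, addr):
        return self._load(addr)[0]

    def write(self, addr, value):
        line = self._load(addr)
        line[0] = value
        line[1] = True                # memory is now stale for this block


memory = {0: 10, 1: 11, 2: 12}
cache = WriteBackCache(capacity=2, memory=memory)
cache.write(0, 99)
print(memory[0])   # 10  (write stayed in the cache)
cache.read(1)
cache.read(2)      # evicts block 0, which is dirty
print(memory[0])   # 99
```

A write-through variant would instead update `memory` inside `write` and would never need the dirty flag.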

17
Caching Applied to Address Translation
[Figure: CPU, TLB (Cached?), Translate (MMU), Physical Memory]

Question is one of page locality: does it exist?
Instruction accesses spend a lot of time on the same page (since accesses sequential)
Stack accesses have definite locality of reference
Data accesses have less page locality, but still some
Can we have a TLB hierarchy?
Sure: multiple levels at different sizes/speeds

18
What Actually Happens on a TLB Miss?

Hardware traversed page tables:
On TLB miss, hardware in MMU looks at current page table to fill TLB (may walk multiple levels)
If PTE valid, hardware fills TLB and processor never knows
If PTE marked as invalid, causes Page Fault, after which kernel decides what to do
Software traversed page tables (like MIPS):
On TLB miss, processor receives TLB fault
Kernel traverses page table to find PTE
If PTE valid, fills TLB and returns from fault
If PTE marked as invalid, internally calls Page Fault handler
Most chip sets provide hardware traversal
Modern operating systems tend to have more TLB faults since they use translation for many things
Examples:
shared segments
user-level portions of an operating system
19
Goals for Today

Virtual memory
How does it work?
Page faults
Resuming after page faults
When to fetch?
What to replace?
Page replacement algorithms
FIFO, OPT, LRU (Clock)
Page Buffering
Allocating Pages to processes

20
What is virtual memory?

Each process has illusion of large address space
2^32 for 32-bit addressing
However, physical memory is much smaller
How do we give this illusion to multiple processes?
Virtual Memory: some addresses reside on disk

21
Virtual Memory

Separates user logical memory from physical memory.
Only part of the program needs to be in memory
for execution
Logical address space can therefore be much
larger than physical address space
Allows address spaces to be shared by several
processes
Allows for more efficient process creation

22
Virtual Memory

Load entire process in memory (swapping), run it, exit
Is slow (for big processes)
Wasteful (might not require everything)
Solutions: partial residency
Paging: only bring in pages, not all pages of process
Demand paging: bring only pages that are required
Where to fetch page from?
Have a contiguous space in disk: swap file (pagefile.sys)

23
How does VM work?

Modify Page Tables with another bit (valid)
If page in memory, valid = 1, else valid = 0
If page is in memory, translation works as before
If page is not in memory, translation causes a page fault

[Figure: Page Table with four entries (32, V=1), (4183, V=0), (177, V=1), (5721, V=0), labeled 0 to 3, shown next to Mem]
24
Page Faults

On a page fault:
OS finds a free frame, or evicts one from memory (which one?)
Want knowledge of the future?
Issues disk request to fetch data for page (what to fetch?)
Just the requested page, or more?
Block current process, context switch to new process (how?)
Process might be executing an instruction
When disk completes, set valid bit to 1, and put current process in ready queue
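
The fault-handling steps can be sketched as a toy demand pager. This is a simplification under stated assumptions: the disk read is a dict lookup, blocking and context switching are not modeled, and the victim is chosen FIFO only as a placeholder for "which one?". The name `DemandPager` is hypothetical.

```python
class DemandPager:
    """Toy model: pages are loaded from 'disk' only when first touched."""

    def __init__(self, num_frames, disk):
        self.disk = disk                        # vpn -> page contents
        self.free_frames = list(range(num_frames))
        self.frames = {}                        # frame -> page contents
        self.page_table = {}                    # vpn -> frame (valid pages only)
        self.load_order = []                    # vpns in load order, for FIFO
        self.faults = 0

    def access(self, vpn):
        if vpn not in self.page_table:          # valid bit is 0
            self._page_fault(vpn)
        return self.frames[self.page_table[vpn]]

    def _page_fault(self, vpn):
        self.faults += 1
        if self.free_frames:                    # find a free frame ...
            frame = self.free_frames.pop(0)
        else:                                   # ... or evict one (FIFO here)
            victim = self.load_order.pop(0)
            frame = self.page_table.pop(victim) # victim's valid bit becomes 0
        self.frames[frame] = self.disk[vpn]     # "disk request" completes
        self.page_table[vpn] = frame            # set valid bit to 1
        self.load_order.append(vpn)


pager = DemandPager(num_frames=2, disk={0: "a", 1: "b", 2: "c"})
print(pager.access(0), pager.access(1), pager.access(0))  # a b a
print(pager.faults)                                       # 2
pager.access(2)                                           # evicts page 0
print(pager.faults, sorted(pager.page_table))             # 3 [1, 2]
```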

25
Steps in Handling a Page Fault
26
Resuming after a page fault

Should be able to restart the instruction
For RISC processors this is simple
Instructions are idempotent until references are
done
More complicated for CISC
E.g. move 256 bytes from one location to another
Possible Solutions
Ensure pages are in memory before the instruction
executes

27
Page Fault (Cont.)

Restart instruction
block move
auto increment/decrement location

28
When to fetch?

Just before the page is used!
Need to know the future
Demand paging
Fetch a page when it faults
Prepaging
Get the page on fault + some of its neighbors, or
Get all pages in use last time process was swapped

29
Performance of Demand Paging

Page Fault Rate: 0 <= p <= 1.0
if p = 0, no page faults
if p = 1, every reference is a fault
Effective Access Time (EAT)
EAT = (1 - p) x memory access
      + p x (page fault overhead
             + swap page out
             + swap page in
             + restart overhead)

30
Demand Paging Example

Memory access time = 200 nanoseconds
Average page-fault service time = 8 milliseconds
EAT = (1 - p) x 200 + p x (8 milliseconds)
    = (1 - p) x 200 + p x 8,000,000
    = 200 + p x 7,999,800 (in nanoseconds)
If one access out of 1,000 causes a page fault, then EAT = 8.2 microseconds.
This is a slowdown by a factor of about 40 (8.2 microseconds / 200 nanoseconds = 41)!!
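
The arithmetic of this example can be checked with a few lines of code. The defaults below are the slide's numbers (200 ns memory access, 8 ms fault service time); the helper `max_fault_rate` is an added illustration, obtained by solving EAT <= slowdown x memory access for p, and is not part of the slide.

```python
def effective_access_time_ns(p, memory_access_ns=200.0, fault_service_ns=8_000_000.0):
    """EAT = (1 - p) x memory access + p x page-fault service time, in ns."""
    return (1 - p) * memory_access_ns + p * fault_service_ns


def max_fault_rate(max_slowdown, memory_access_ns=200.0, fault_service_ns=8_000_000.0):
    """Largest fault rate p for which EAT <= max_slowdown x memory access."""
    return (max_slowdown - 1) * memory_access_ns / (fault_service_ns - memory_access_ns)


eat = effective_access_time_ns(1 / 1000)
print(round(eat, 1))            # 8199.8  (ns, i.e. about 8.2 microseconds)
print(round(eat / 200.0, 1))    # 41.0    (slowdown factor)
print(max_fault_rate(1.10))     # roughly 2.5e-06
```

Under these numbers, keeping the slowdown below 10% would need fewer than about one fault per 400,000 accesses.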

31
What to replace?

What happens if there is no free frame?
find some page in memory, but not really in use, swap it out
Page Replacement
When process has used up all frames it is allowed to use
OS must select a page to eject from memory to allow new page
The page to eject is selected using the Page Replacement Algorithm
Goal: Select page that minimizes future page faults

32
Page Replacement

Prevent over-allocation of memory by modifying page-fault service routine to include page replacement
Use modify (dirty) bit to reduce overhead of page transfers: only modified pages are written to disk
Page replacement completes separation between logical memory and physical memory: large virtual memory can be provided on a smaller physical memory

33
Page Replacement
34
Page Replacement Algorithms

Random: Pick any page to eject at random
Used mainly for comparison
FIFO: The page brought in earliest is evicted
Ignores usage
Suffers from Belady's Anomaly
Fault rate could increase on increasing number of frames
E.g. 0 1 2 3 0 1 4 0 1 2 3 4 with 3 and 4 frames
OPT: Belady's algorithm
Select page that will not be used for the longest time
LRU: Evict page that hasn't been used for the longest time
Past could be a good predictor of the future
35
First-In-First-Out (FIFO) Algorithm

Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
3 frames (3 pages can be in memory at a time per process): 9 page faults
4 frames: 10 page faults
Belady's Anomaly: more frames => more page faults
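
Belady's Anomaly on this reference string can be reproduced with a small FIFO simulation. The function name `fifo_faults` is an assumption for illustration; the code simply counts faults for a given number of frames.

```python
from collections import deque


def fifo_faults(refs, num_frames):
    """Count page faults under FIFO replacement."""
    frames = deque()                 # oldest loaded page at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue                 # hit: FIFO ignores usage
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()         # evict the page brought in earliest
        frames.append(page)
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9
print(fifo_faults(refs, 4))  # 10
```

Adding a fourth frame raises the fault count from 9 to 10 for this string.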