Title: Virtual Memory
1. Virtual Memory
- Fred Kuhns
- (fredk_at_arl.wustl.edu, http://www.arl.wustl.edu/fredk)
- Department of Computer Science and Engineering
- Washington University in St. Louis
2. Virtual Memory - A Preview
- Each application is allocated a large virtual address space.
- Requires virtual-to-physical address translation.
- Machines offer either a contiguous virtual address space or a segmented one.
- Memory protection is enforced by hardware.
- Can be implemented in hardware (pages) or software (memory overlays).
3. An Example Paged Virtual Memory
[Diagram: P1 and P2 virtual address spaces, with resident pages mapped through address translation into the physical address space; the working set is resident, other pages are non-resident.]
4. Example Paging System
[Diagram: app1's address space runs from the low address (0x00000000), holding text and initialized data, through uninitialized data, to the stack and heap near the high address (0x7fffffff). Allocated virtual pages are backed by DRAM; non-resident pages are backed by swap or by UFS files on disk. The CPU accesses the pages through address translation.]
5. Typical UNIX VM Architecture
- File mapping is the fundamental organizational scheme.
- Types of mappings: shared and private.
- Has an object-oriented flavor: a memory object represents the mapping from physical pages to a data object.
- Integrated with vnodes; the file system provides the name space.
6. File Mapping - read/write Interface
[Diagram: in the traditional approach, process P1 uses read/write, and data is copied from disk into the buffer cache and copied again into P1's address space. In the VM approach, P1 uses mmap() to map the file's pages into its address space through the virtual memory system, avoiding the extra copy.]
7. Return to More General Material
8. Paging and Segmented Architectures
- Memory references are dynamically translated into physical addresses at run time.
- A process may be swapped in and out of main memory such that it occupies different regions.
- A process may be broken up into pieces that do not need to be located contiguously in main memory.
- Not all pieces of a process need to be loaded in main memory during execution.
9. Execution of a Program
- Resident set: the portions of the memory space that the operating system has brought into main memory.
- If the process attempts to access a non-resident page, the MMU raises an exception, causing the OS to block the process until the page is loaded.
- Steps for loading a process's memory pages:
- the operating system issues a disk I/O read request
- another process is dispatched to run while the disk I/O takes place
- an interrupt is issued when the disk I/O completes, causing the operating system to place the affected process in the Ready state
10. Advantages
- More processes may be maintained in main memory, since only some of the pieces of each process are loaded.
- With so many processes in main memory, it is very likely that some process will be in the Ready state at any particular time.
- A process may be larger than all of main memory.
11. Types of Memory
- Real memory: main memory.
- Virtual memory: main memory plus secondary storage, permitting more effective multiprogramming and relaxing the size constraints of main memory.
- Paging may result in thrashing:
- swapping out a piece of a process just before that piece is needed
- the processor spends most of its time swapping pieces rather than executing user instructions
12. Exploiting the Principle of Locality
- Programs tend to demonstrate temporal and spatial locality of reference:
- only a few pieces of a process will be needed over a short period of time
- it is possible to make intelligent guesses about which pieces will be needed in the future
- This suggests that virtual memory can work efficiently.
- Working set model:
- assumes a slowly changing locality of reference
- the set changes periodically
- resident set size versus fault rate
- may set high and low thresholds
13. Virtual Memory Support
- Hardware must support paging and/or segmentation.
- The operating system must be able to manage the movement of pages and/or segments between secondary memory and main memory.
14. Paging
- Virtual and physical memory are divided into fixed-size pages.
- Page tables translate a virtual page to a physical page.
- The entire page table may take up too much main memory:
- page tables are also stored in virtual memory
- when a process is running, only part of its page table is in main memory
- Problem: translating every address is expensive, possibly requiring several memory accesses.
- Solution: use a cache of recently mapped addresses, the translation lookaside buffer (TLB).
- Managing the pages in memory:
- pages are marked as resident or non-resident
- non-resident pages cause page faults
- Policies: fetch, placement, replacement.
15. General Requirements for Paging
- Prevent a process from changing its own memory maps.
- The CPU distinguishes between resident and non-resident pages.
- Load pages and restart interrupted program instructions.
- Determine whether pages have been modified.
16. Address Translation - General
[Diagram: the CPU issues a virtual address to the MMU, which produces a physical address; data moves between the cache and global memory.]
17. Memory Management Unit
- Translates virtual addresses using page tables and a Translation Lookaside Buffer.
- Page tables:
- one for kernel addresses
- one or more for user-space processes
- Page Table Entry (PTE): one per virtual page, typically 32 bits, containing the page frame number plus protection, valid, modified, and referenced bits.
18. Address Translation
- A virtual address consists of a virtual page number and an offset.
- The MMU finds the PTE for the virtual page, extracts the physical page, and adds the offset.
- On failure, the MMU raises an exception (page fault):
- bounds error: outside the address range
- validation error: non-resident page
- protection error: access not permitted
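The translation steps and the three fault kinds above can be sketched in Python. This is a minimal model, not any particular MMU; the 4 KB page size and the dict-based page table are assumptions for illustration.

```python
PAGE_SHIFT = 12                      # assume 4 KB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1

def translate(page_table, vaddr, write=False):
    """Model MMU translation: split the virtual address into a virtual
    page number and an offset, look up the PTE, and rebuild the
    physical address. Raises the three fault kinds listed above."""
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & PAGE_MASK
    if vpn not in page_table:                     # no PTE at all
        raise LookupError("bounds error: outside address range")
    frame, resident, writable = page_table[vpn]
    if not resident:
        raise LookupError("validation error: non-resident page")
    if write and not writable:
        raise PermissionError("protection error: access not permitted")
    return (frame << PAGE_SHIFT) | offset
```

For example, with virtual page 0x10 mapped to frame 0xABC, address 0x00010123 translates to 0x00ABC123: the frame number replaces the page number while the 12-bit offset passes through unchanged.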
19. MMU Details
- Limiting page table size:
- segments
- page the page table (multi-level page table)
- MMU registers point to the current page table(s); the kernel and MMU can modify page tables and registers.
- Problem: page table walks may require multiple memory accesses per instruction.
- Solution: the Translation Lookaside Buffer (TLB):
- rely on hardware caching (virtual address cache)
- cache the translations themselves in the TLB
20. TLB Details
- An associative cache of address translations.
- Entries may contain a tag identifying the process as well as the virtual address. Why is this important?
- The MMU typically manages the TLB.
- The kernel may need to invalidate entries. Would the kernel ever need to invalidate entries?
21. Translation Lookaside Buffer
- Contains the most recently used page table entries.
- Functions the same way as a memory cache.
- Given a virtual address, the processor examines the TLB:
- if present (a hit), the frame number is retrieved and the real address is formed
- if not found (a miss), the page number is used to index the process page table
22. Address Translation Overview
[Diagram: the CPU sends a virtual address to the MMU, which consults the TLB and the page tables to produce the physical address used by the cache.]
23. Page Table Entry
[Diagram: a virtual address of X + Y bits selects a Page Table Entry (PTE) holding a Z-bit frame number plus control bits, including the M (modified) and R (referenced) bits.]
- Resident bit: indicates whether the page is in memory.
- Modify bit: indicates whether the page has been altered since it was loaded into main memory.
- Other control bits, for example protection bits.
- Frame number: the physical frame address.
24. Example: 1-Level Address Translation
[Diagram: a virtual address is split into a 20-bit virtual page number X and a 12-bit offset. The current page table register locates the process page table; entry X is a PTE containing the frame number plus M, R, and other control bits. The frame number selects frame X in DRAM, and the offset is added to form the physical address.]
25. SuperSPARC Reference MMU
[Diagram: the context table pointer register and the context register select an entry in the context table (4096 entries). The virtual address is split into index 1 (8 bits), index 2 (6 bits), index 3 (6 bits), and a 12-bit offset; each index selects a PTD (page table descriptor) at the next level of the page table tree until a PTE is reached. The PTE's 24-bit physical page number is combined with the 12-bit offset to form the physical address.]
- 12-bit index for 4096 context table entries
- 8-bit index for 256 entries
- 6-bit index for 64 entries
- The virtual page number has 20 bits, for 1M pages.
- The physical frame number has 24 bits with a 12-bit offset, permitting 16M frames.
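The field split above can be checked with a short sketch. The field widths come from the slide; the function name and return convention are ours.

```python
def srmmu_split(vaddr):
    """Split a 32-bit virtual address per the slide's SuperSPARC layout:
    index 1 (8 bits), index 2 (6 bits), index 3 (6 bits), offset (12 bits)."""
    offset = vaddr & 0xFFF           # 12-bit page offset
    idx3 = (vaddr >> 12) & 0x3F      # 6 bits -> 64 level-3 entries
    idx2 = (vaddr >> 18) & 0x3F      # 6 bits -> 64 level-2 entries
    idx1 = (vaddr >> 24) & 0xFF      # 8 bits -> 256 level-1 entries
    return idx1, idx2, idx3, offset
```

The 20-bit virtual page number is the concatenation of the three indices (8 + 6 + 6 = 20 bits), matching the 1M pages noted above.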
26. Page Table Descriptor/Entry
[Diagram: a Page Table Descriptor holds a page table pointer plus a type field in bits 1-0. A Page Table Entry holds the physical page number plus fields C (bit 7), M (bit 6), R (bit 5), ACC (bits 4-2), and type (bits 1-0).]
- Type: PTD, PTE, or Invalid
- C: cacheable
- M: modified
- R: referenced
- ACC: access permissions
27. [Figure from William Stallings, Operating Systems, 4th edition, Prentice Hall]
28. (No transcript)
29. Page Size
- The smaller the page size, the less internal fragmentation.
- The smaller the page size, the more pages required per process:
- more pages per process means larger page tables
- larger page tables mean a large portion of the page tables resides in virtual memory
- Secondary memory is designed to efficiently transfer large blocks of data, so a larger page size is better.
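A quick worked example of the page-size/page-table-size tradeoff; a flat one-level table with 4-byte PTEs is assumed for illustration.

```python
def flat_table_bytes(va_bits, page_bytes, pte_bytes=4):
    """Bytes needed for a one-level page table covering a va_bits
    address space: one PTE per virtual page."""
    return (2 ** va_bits // page_bytes) * pte_bytes

# For a 32-bit address space: 4 KB pages need 2^20 PTEs, a 4 MB table
# per process, while 64 KB pages need only a 256 KB table.
```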
30. Page Size
- With a small page size, a large number of pages will be found in main memory.
- As execution proceeds, the pages in memory all contain portions of the process near recent references, so the page fault rate stays low.
- An increased page size causes pages to contain locations further from any recent reference, so the page fault rate rises.
31. Page Size - continued
- Multiple page sizes provide the flexibility needed to use a TLB effectively:
- large pages can be used for program instructions
- small pages can be used for thread stacks
- Most operating systems support only one page size.
32. Example Page Sizes
[Table of example page sizes not transcribed.]
33. Segmentation
- Segments may be of unequal, dynamic size.
- Simplifies handling of growing data structures.
- Allows programs to be altered and recompiled independently.
- Lends itself to sharing data among processes.
- Lends itself to protection.
34. Segment Tables
- Each entry contains the starting address of the corresponding segment in main memory and the length of the segment.
- A bit is needed to determine whether the segment is already in main memory.
- Another bit is needed to determine whether the segment has been modified since it was loaded into main memory.
35. Segment Table Entries
36. Combined Paging and Segmentation
- Paging is transparent to the programmer.
- Paging eliminates external fragmentation.
- Segmentation is visible to the programmer.
- Segmentation allows for growing data structures, modularity, and support for sharing and protection.
- Each segment is broken into fixed-size pages.
37. Combined Segmentation and Paging
38. (No transcript)
39. Basic Paging Policies
- Fetch policy: when to bring a page into memory.
- Demand paging: fetch a page when it is referenced.
- Prepaging: bring in more pages than needed.
- Replacement policy: which page to remove from memory.
- The page removed should be the page least likely to be referenced in the near future.
- Most policies predict future behavior on the basis of past behavior.
- Frame locking: if a frame is locked, it may not be replaced.
- Placement policy: where the fetched page is placed.
40. Demand Paging
- Static paging algorithms (a fixed number of pages allocated to each process):
- Optimal
- Not Recently Used
- First-In, First-Out and Second Chance
- Clock algorithm
- Least Recently Used
- Least Frequently Used
- Dynamic paging algorithms:
- working set algorithms
41. Static Replacement Algorithms
- Optimal:
- selects for replacement the page for which the time to the next reference is the longest
- impossible in practice, since it requires perfect knowledge of future events
- Not Recently Used (NRU):
- recall the M (modified) and R (referenced) bits
- when a page fault occurs, divide the pages into four classes: 0) R=0, M=0; 1) R=0, M=1; 2) R=1, M=0; 3) R=1, M=1; and replace a page from the lowest-numbered non-empty class
- disadvantage: all pages must be scanned when a fault occurs
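The NRU class computation can be sketched as follows; the encoding class = 2R + M and the tuple layout are our choices for the sketch.

```python
def nru_victim(pages):
    """pages: list of (name, R, M) tuples with R and M as 0 or 1.
    Class = 2*R + M, so class 0 is (R=0, M=0) up to class 3 (R=1, M=1).
    Returns the name of a page from the lowest non-empty class."""
    return min(pages, key=lambda p: 2 * p[1] + p[2])[0]
```

Note that min() scans every page, which is exactly the disadvantage listed above.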
42. Static Replacement Algorithms
- First-In, First-Out (FIFO):
- treats the page frames allocated to a process as a circular buffer
- pages are removed in round-robin style
- the simplest replacement policy to implement
- but it may remove pages that are still needed
- Second Chance:
- improves on FIFO by using the R bit
- if the R bit is set, clear it and move the page to the end of the list; otherwise replace the page
- may result in expensive list manipulations
43. Static Replacement Algorithms
- Least Recently Used (LRU):
- replaces the page that has not been referenced for the longest time
- by the principle of locality, this should be the page least likely to be referenced in the near future
- each page could be tagged with the time of its last reference, but this would require a great deal of overhead
- expensive to implement unless supported in hardware
44. Static Replacement Algorithms
- Least Frequently Used (LFU):
- a software approximation of Least Recently Used
- associate a counter with each page, initialized to 0 and incremented by R each clock tick
- problem: history - the counters are slow to react to a changing reference string
- Improved implementation:
- each tick, shift the counter right 1 bit and add R as the leftmost bit; this is known as aging
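One tick of the aging scheme can be sketched as below; the 8-bit counter width and the dict-based bookkeeping are assumptions for the sketch.

```python
COUNTER_BITS = 8                    # assumed counter width

def age_tick(counters, r_bits):
    """One clock tick of aging: shift each page's counter right one bit
    and move the page's R bit into the leftmost position, then clear R.
    The replacement victim is the page with the smallest counter."""
    for page in counters:
        counters[page] = (counters[page] >> 1) | (r_bits[page] << (COUNTER_BITS - 1))
        r_bits[page] = 0            # hardware R bits are cleared each tick
```

Because recent references land in the high-order bits, a page referenced this tick outweighs any page whose references are all older, which is how aging approximates LRU.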
45. Static Replacement Algorithms
- Clock policy:
- behavior is the same as Second Chance but avoids the extra list manipulations
- pages are kept on a circular list in the form of a clock, with a hand referencing the current page
- when a page fault occurs, if the current page has R = 0, replace it; otherwise set R = 0 and advance to the next page, continuing until a page with R = 0 is found
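The clock sweep above can be sketched directly; the list-plus-dict representation is our choice for the sketch.

```python
def clock_select(frames, r_bits, hand):
    """frames: circular list of page ids; hand: index of the current page.
    Advance the hand, clearing R bits, until a page with R == 0 is found.
    Returns (index of the victim frame, hand position afterwards)."""
    while r_bits[frames[hand]]:
        r_bits[frames[hand]] = 0          # give this page a second chance
        hand = (hand + 1) % len(frames)
    return hand, (hand + 1) % len(frames)
```

The loop always terminates: every page the hand passes has its R bit cleared, so after at most one full revolution some page has R = 0.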
46. [Figure from William Stallings, Operating Systems, 4th edition, Prentice Hall]
47. Dynamic Replacement Algorithms
- Working set algorithm (aka demand paging):
- working set: the set of pages currently used by the process
- the OS keeps track of the process's working set, modeled as w(k, t): the pages touched by the k most recent references at process virtual time t
- implementations use virtual time rather than k: track the pages referenced in the past x seconds of the process's virtual time
- requires scanning the entire page list, which is expensive
- a compromise is the WSClock algorithm, which stamps each page with the virtual time of its last use
48. WSClock Algorithm
- Similar to the clock algorithm, with a circular list and a clock hand.
- Scan the list: if R = 1, clear it and write the current virtual time into the entry.
- Advance the hand.
- If R = 0, check the timestamp: if the page's age exceeds the threshold T, replace it (if dirty, schedule it for write-back; otherwise put it on the free list).
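One sweep of WSClock can be sketched as below. This is a simplified model: write-back is only simulated by clearing the dirty flag, and the dict-based page records are our representation.

```python
def wsclock_scan(pages, hand, now, tau):
    """pages: circular list of dicts with keys 'r', 'time', 'dirty'.
    One sweep of WSClock from the hand: referenced pages are restamped
    with the current virtual time; the first clean page older than tau
    is returned for reclamation. Dirty old pages would be scheduled for
    write-back (modeled here by just clearing the dirty flag)."""
    for _ in range(len(pages)):
        p = pages[hand]
        if p['r']:
            p['r'] = 0
            p['time'] = now                  # in the working set: restamp
        elif now - p['time'] > tau and not p['dirty']:
            return hand                      # old and clean: reclaim this frame
        elif now - p['time'] > tau:
            p['dirty'] = False               # pretend the write-back completes
        hand = (hand + 1) % len(pages)
    return None                              # no victim found this sweep
```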
49. Some Details
- Stack algorithms:
- the pages loaded with an allocation of m frames are always a subset of those loaded with an allocation of m + 1
- Page buffering:
- a replaced page is added to one of two lists (modified or unmodified)
- Fixed process page allocation:
- gives a process a fixed number of pages within which to execute
- when a page fault occurs, one of the pages of that process must be replaced
- Variable process page allocation:
- the number of pages allocated to a process varies over the lifetime of the process
50. Variable Allocation, Global Scope
- Easiest to implement.
- Adopted by many operating systems.
- The operating system keeps a list of free frames.
- A free frame is added to the resident set of a process when a page fault occurs.
- If no frame is free, the OS replaces one from another process.
51. Variable Allocation, Local Scope
- When a new process is added, allocate it a number of page frames based on application type, program request, or other criteria.
- When a page fault occurs, select a page from among the resident set of the process that suffered the fault.
- Reevaluate the allocation from time to time.
52. Cleaning Policy
- Demand cleaning: a page is written out only when it has been selected for replacement.
- Precleaning: pages are written out in batches.
- The best approach uses page buffering:
- replaced pages are placed in two lists, modified and unmodified
- pages on the modified list are periodically written out in batches
- pages on the unmodified list are either reclaimed if referenced again or lost when their frame is assigned to another page
53. Load Control
- Determines the number of processes that will be resident in main memory.
- With too few processes, there will be many occasions when all processes are blocked, and much time will be spent swapping.
- Too many processes will lead to thrashing.
54. Process Suspension
- Lowest-priority process.
- Faulting process: it does not have its working set in main memory, so it will be blocked anyway.
- Last process activated: it is the least likely to have its working set resident.
- Process with the smallest resident set: it requires the least future effort to reload.
- Largest process: suspending it obtains the most free frames.
- Process with the largest remaining execution window.