Title: OMSE 510: Computing Foundations 8: The Address Space
1. OMSE 510: Computing Foundations 8: The Address Space
- Chris Gilmore <grimjack@cs.pdx.edu>
- Portland State University/OMSE
Material borrowed from Jon Walpole's lectures
2. Today
- Memory Management
- Virtual/Physical Address Translation
- Page Tables
- MMU, TLB
3. Memory management
- Memory: a linear array of bytes
- Holds the O.S. and programs (processes)
- Each memory cell is named by a unique memory address
- Recall: processes are defined by an address space, consisting of text, data, and stack regions
- Process execution
- The CPU fetches instructions from the text region according to the value of the program counter (PC)
- Each instruction may request additional operands from the data or stack region
4. Virtual memory management overview
- What do we know about memory management?
- Processes require memory to run
- We provide the appearance that the entire process is resident during execution
- We know some functions/code in processes never get invoked
- Error detection and recovery routines
- In a graphics package, functions like smooth, sharpen, brighten, etc. may not get invoked
- Virtual memory allows the execution of processes that may not be completely in memory (an extension of the paging technique from the last chapter)
5. Virtual memory overview
- Goals
- Hides physical memory from the user
- Allows a higher degree of multiprogramming (only bring in pages that are accessed)
- Allows large processes to be run on small amounts of physical memory
- Reduces the I/O required to swap processes in/out (makes the system faster)
- Requires
- A pager to page in/out pages as required
- Swap space to hold processes that are partially resident
- Hardware support to do address translation
6. Addressing memory
- We cannot know ahead of time where in memory a program will be loaded!
- The compiler produces code containing embedded addresses
- These addresses can't be absolute (physical addresses)
- The linker combines pieces of the program
- It assumes the program will be loaded at address 0
- We need to bind the compiler/linker-generated addresses to the actual memory locations
7. Relocatable address generation
[Diagram: a program P calling foo() passes through compilation (jmp _foo), assembly (jmp 75), linking with library routines (jmp 175), and loading at base 1000 (jmp 1175); addresses 0/100/175 become 1000/1100/1175 after loading.]
8. Address binding
- Address binding: fixing a physical address to the logical address of a process address space
- Compile-time binding
- Possible if the program's location is fixed and known ahead of time
- Load-time binding
- Used if the program's location in memory is unknown until run-time AND the location is then fixed
- Execution-time binding
- Needed if processes can be moved in memory during execution
- Requires hardware support!
9. Address binding
[Diagram: compile-time binding embeds jmp 1175 directly; load-time binding rewrites jmp 175 to jmp 1175 at load time; execution-time binding keeps jmp 175 and adds a base register holding 1000 at run time. Library routines occupy 0-100 in the linked image, 1000-1100 once loaded.]
10. Memory management architectures
- Fixed size allocation
- Memory is divided into fixed partitions
- Dynamically sized allocation
- Memory allocated to fit processes exactly
11. Runtime binding: base & limit registers
- A simple runtime relocation scheme
- Use 2 registers to describe a partition
- For every address generated, at runtime...
- Compare it to the limit register (abort if larger)
- Add it to the base register to give the physical memory address
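The base/limit check above can be sketched as a small simulation. This is a toy model, not real MMU hardware; the register values are hypothetical:

```python
def translate(logical_addr, base, limit):
    """Relocate a logical address using base/limit registers.

    Every address the CPU generates is first compared against the
    limit register; only if it is in range is the base added.
    """
    if logical_addr >= limit:          # offset outside the partition
        raise MemoryError("addressing error: abort process")
    return base + logical_addr         # physical address

# A partition starting at physical address 1000, 100 bytes long:
print(translate(75, base=1000, limit=100))   # 1075
```

A reference to offset 150 in the same partition would raise the addressing error instead of silently touching another process's memory.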
12. Dynamic relocation with a base register
- The Memory Management Unit (MMU) dynamically converts logical addresses into physical addresses
- The MMU contains the base address register for the running process
[Diagram: a program-generated address is added by the MMU to the relocation register for process i (e.g., 1000) to form the physical memory address; the operating system occupies the bottom of physical memory, process i sits above it.]
13. Protection using base & limit registers
- Memory protection
- The base register gives the starting address for the process
- The limit register limits the offset accessible from the relocation register
[Diagram: the logical address is compared with the limit register; if less, it is added to the base register to form the physical memory address; otherwise an addressing error is raised.]
14. Multiprogramming with base and limit registers
- Multiprogramming: a separate partition per process
- What happens on a context switch?
- Store process A's base and limit register values
- Load new values into the base and limit registers for process B
[Diagram: physical memory divided into partitions A through E above the OS; the base and limit registers describe the current partition.]
15-23. [Diagram sequence: memory allocation over time in a 1 MB machine. The O.S. occupies the bottom 128K, leaving 896K. P1 (320K), P2 (224K), and P3 (288K) are loaded in turn, leaving a 64K hole; P2 is swapped out, and P4 (128K) and P5 (224K) are loaded into the resulting holes, leaving 96K and 64K fragments. Finally a new process P6 (128K, marked "???") cannot fit into any single remaining hole even though enough total free memory exists.]
24. Swapping
- When a program is running...
- The entire program must be in memory
- Each program is put into a single partition
- When the program is not running...
- It may remain resident in memory
- It may get swapped out to disk
- Over time...
- Programs come into memory when they get swapped in
- Programs leave memory when they get swapped out
25. Basics - swapping
- Benefits of swapping
- Allows multiple programs to be run concurrently - more than will fit in memory at once
[Diagram: processes i, j, k reside in physical memory above the operating system; process m is swapped in from disk while another process is swapped out.]
26. Swapping can also lead to fragmentation
27. Dealing with fragmentation
- Compaction: from time to time, shift processes around to collect all free space into one contiguous block
- Placement algorithms: first-fit, best-fit, worst-fit
[Diagram: before compaction, free holes of 64K and 96K are scattered between P3 (288K), P4 (128K), and P5 (224K); after compaction the free space forms one 256K block and the pending process P6 ("???", 128K) fits.]
28. Influence of allocation policy
29. How big should partitions be?
- Programs may want to grow during execution
- More room for stack, heap allocation, etc.
- Problem
- If the partition is too small, programs must be moved
- This requires modification of the base and limit registers
- Why not make the partitions a little larger than necessary, to accommodate some growth?
- Fragmentation
- External fragmentation: unused space between partitions
- Internal fragmentation: unused space within partitions
30. Allocating extra space within partitions
31. Managing memory
- Each chunk of memory is either
- Used by some process, or unused (free)
- Operations
- Allocate a chunk of unused memory big enough to hold a new process
- Free a chunk of memory by returning it to the free pool after a process terminates or is swapped out
32. Managing memory with bit maps
- Problem: how do we keep track of used and unused memory?
- Technique 1 - bit maps
- A long bit string
- One bit for every chunk of memory
- 1 = in use
- 0 = free
- The size of the allocation unit influences the space required
- Example: unit size 32 bits
- Overhead for the bit map: 1/33, about 3%
- Example: unit size 4 Kbytes
- Overhead for the bit map: 1/32,769
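Allocation against a bit map means scanning for a run of zero bits long enough for the request. A minimal sketch (the bit map is a plain Python list here, purely for illustration):

```python
def bitmap_alloc(bitmap, n_units):
    """Find a run of n_units free (0) allocation units in the bit map,
    mark them used (1), and return the index of the first unit.
    Returns None if no large-enough run exists."""
    run_start, run_len = 0, 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == n_units:
                for j in range(run_start, run_start + n_units):
                    bitmap[j] = 1      # mark the run in use
                return run_start
        else:
            run_len = 0
    return None

mem = [1, 1, 0, 0, 0, 1, 0, 0]   # 8 units: 1 = in use, 0 = free
print(bitmap_alloc(mem, 3))       # 2  (units 2-4)
print(bitmap_alloc(mem, 3))       # None (only a 2-unit hole remains)
```

The linear scan is exactly why bit maps get slow for large memories: finding a k-unit hole is O(size of memory).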
33. Managing memory with bit maps
34. Managing memory with linked lists
- Technique 2 - linked list
- Keep a list of elements
- Each element describes one unit of memory
- Free/in-use bit (P = process, H = hole)
- Starting address
- Length
- Pointer to next element
35. Managing memory with linked lists
36. Merging holes
- Whenever a unit of memory is freed, we want to merge adjacent holes!
37-40. Merging holes
[Diagram sequence: freeing a unit that is adjacent to one or two existing holes collapses them into a single larger hole in the list.]
41. Managing memory with linked lists
- Searching the list for space for a new process
- First Fit
- Next Fit
- Starts from the current location in the list
- Not as good as first fit
- Best Fit
- Finds the smallest hole that will work
- Tends to create lots of little holes
- Worst Fit
- Finds the largest hole
- The remainder will be big
- Quick Fit
- Keeps separate lists for common sizes
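First fit and best fit differ only in which hole they pick from the list. A sketch over a hypothetical hole list of (start, length) pairs:

```python
def first_fit(holes, size):
    """Return the (start, length) of the first hole that fits, else None."""
    for start, length in holes:
        if length >= size:
            return (start, length)
    return None

def best_fit(holes, size):
    """Return the smallest hole that fits, else None."""
    fitting = [h for h in holes if h[1] >= size]
    return min(fitting, key=lambda h: h[1]) if fitting else None

holes = [(0, 5), (8, 12), (25, 6)]   # (start, length) pairs
print(first_fit(holes, 6))   # (8, 12) - first hole big enough
print(best_fit(holes, 6))    # (25, 6) - tightest fit
```

Note how best fit picks the 6-unit hole exactly, leaving no sliver, while first fit would carve 6 units out of the 12-unit hole and leave a 6-unit remainder.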
42. Fragmentation
- Memory is divided into partitions
- Each partition has a different size
- Processes are allocated space and later freed
- After a while, memory will be full of small holes!
- No free space is large enough for a new process, even though there is enough free memory in total
- This is external fragmentation
- If we allow free space within a partition, we have internal fragmentation
43. Solution to fragmentation?
- Allocate memory in equal fixed-size units?
- Reduces external fragmentation problems
- But what about wasted space inside a unit due to internal fragmentation?
- How big should the units be?
- The smaller, the better for internal fragmentation
- The larger, the better for management overhead
- Can we use a unit size smaller than the memory needed by a process?
- I.e., allocate non-contiguous units to the same process?
- But how would the base and limit registers work?
44. Using pages for non-contiguous allocation
- Memory is divided into fixed-size page frames
- Page frame size: 2^n bytes
- The lowest n bits of an address specify the byte offset in a page
- But how do we associate page frames with processes?
- And how do we map memory addresses within a process to the correct memory byte in a page frame?
- Solution
- Processes use virtual addresses
- Hardware uses physical addresses
- Hardware support for virtual-to-physical address translation
45. Virtual addresses
- Virtual memory addresses (what the process uses)
- Page number plus byte offset in page
- The low-order n bits are the byte offset
- The remaining high-order bits are the page number
[Layout: bits 31 down to n are the page number (20 bits), bits n-1 down to 0 are the offset (12 bits).]
Example: 32-bit virtual address, page size 2^12 = 4 KB, address space size 2^32 bytes = 4 GB
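The split described above is just a shift and a mask. A sketch for the 4 KB (n = 12) case:

```python
PAGE_SIZE = 4096        # 2**12 bytes -> n = 12 offset bits
OFFSET_BITS = 12

def split(vaddr):
    """Split a 32-bit virtual address into (page number, offset)."""
    page = vaddr >> OFFSET_BITS          # high-order 20 bits
    offset = vaddr & (PAGE_SIZE - 1)     # low-order 12 bits
    return page, offset

page, off = split(0x12345678)
print(hex(page), hex(off))   # 0x12345 0x678
```

Because the page size is a power of two, the mask `PAGE_SIZE - 1` is exactly the low 12 one-bits, so no division is needed.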
46. Physical addresses
- Physical memory addresses (what the CPU uses)
- Page frame number plus byte offset in page
- The low-order n bits are the byte offset
- The remaining high-order bits are the page frame number
[Layout: bits 23 down to n are the page frame number (12 bits), bits n-1 down to 0 are the offset (12 bits).]
Example: 24-bit physical address, page frame size 2^12 = 4 KB, max physical memory size 2^24 bytes = 16 MB
47. Address translation
- Hardware maps page numbers to page frame numbers
- The memory management unit (MMU) has multiple registers for multiple pages
- Like a base register, except its value is substituted for the page number rather than added to it
- Why don't we need a limit register for each page?
48. Memory Management Unit (MMU)
49. Virtual address spaces
- Here is the virtual address space
- (as seen by the process)
[Diagram: the virtual address space runs from the lowest address to the highest address.]
50. Virtual address spaces
- The address space is divided into pages
- In x86, the page size is 4K
[Diagram: pages 0, 1, ..., N of the virtual address space.]
51. Virtual address spaces
- In reality, only some of the pages are used
[Diagram: pages 0-N of the virtual address space, with some ranges marked unused.]
52. Physical memory
- Physical memory is divided into page frames
- (Page size = frame size)
[Diagram: virtual address space pages 0-N alongside the page frames of physical memory.]
53. Virtual and physical address spaces
- Some page frames are used to hold the pages of this process
[Diagram: selected frames in physical memory hold this process's pages.]
54. Virtual and physical address spaces
- Some page frames are used for other processes
[Diagram: other frames in physical memory are used by other processes.]
55. Virtual address spaces
- Address mappings say which frame has which page
[Diagram: arrows from virtual pages to the physical frames that hold them.]
56. Page tables
- Address mappings are stored in a page table in memory
- One page table entry per page...
- Is this page in memory? If so, which frame is it in?
[Diagram: a page table mapping virtual pages to physical frames.]
57. Address mappings and translation
- Address mappings are stored in a page table in memory
- Typically one page table for each process
- Address translation is done by hardware (i.e., the MMU)
- How does the MMU get the address mappings?
- Either the MMU holds the entire page table (too expensive)
- Or the MMU holds a portion of the page table
- The MMU caches page table entries
- This cache is called a translation look-aside buffer (TLB)
58. Address mappings and translation
- What if the TLB needs a mapping it doesn't have?
- Software-managed TLB
- It generates a TLB-miss fault, which is handled by the operating system (like interrupt or trap handling)
- The operating system looks in the page tables, gets the mapping from the right entry, and puts it in the TLB
- Hardware-managed TLB
- It looks in a pre-specified memory location for the appropriate entry in the page table
- The hardware architecture defines where page tables must be stored in memory
59. A Simple Architecture
- Page size
- 4 Kbytes
- Virtual addresses (logical addresses)
- 32 bits --> 4 GB virtual address space
- 2^20 pages --> 20 bits for the page number
60. A Simple Architecture
- Page size
- 4 Kbytes
- Virtual addresses (logical addresses)
- 32 bits --> 4 GB virtual address space
- 2^20 pages --> 20 bits for the page number
[Layout: bits 31-12 are the page number (20 bits); bits 11-0 are the offset (12 bits).]
61. A Simple Architecture
- Physical addresses
- 32 bits --> 4 Gbytes installed memory (max)
- 2^20 frames --> 20 bits for the frame number
- Hardware extensions
62. A Simple Architecture
- The page table mapping
- Page Directory -> Page Table --> Frame
- Virtual address
[Layout: a 20-bit page number and a 12-bit offset; the page table (1M entries) maps the page number to a page frame in physical memory.]
63. Quiz
- What is the difference between a virtual and a physical address?
- Why are programs not usually written using physical addresses?
64. Page tables
- When and why do we access a page table?
- On every instruction, to translate virtual to physical addresses?
65. Page tables
- When and why do we access a page table?
- On every instruction, to translate virtual to physical addresses? NO!
- On TLB miss faults, to refill the TLB
- During process creation and destruction
- When a process allocates or frees memory?
66. Translation Lookaside Buffer (TLB)
- Problem
- The MMU must go to the page table on every memory access!
67. Translation Lookaside Buffer (TLB)
- Problem
- The MMU must go to the page table on every memory access!
- Solution
- Cache the page table entries in a hardware cache
- Small number of entries (e.g., 64)
- Each entry contains
- Page number
- Other stuff from the page table entry
- Associatively indexed on page number
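The miss/refill behavior can be sketched as a toy simulation (the page-table contents and the 4-entry capacity are hypothetical; the eviction choice here is arbitrary, whereas real TLBs use hardware replacement policies):

```python
class TLB:
    """Toy TLB: a small associative cache of page -> frame mappings."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = {}                # page number -> frame number

    def lookup(self, page):
        return self.entries.get(page)    # None models a TLB miss

    def refill(self, page, frame):
        if len(self.entries) >= self.capacity:
            self.entries.pop(next(iter(self.entries)))  # evict an entry
        self.entries[page] = frame

page_table = {0: 7, 1: 3, 2: 9}          # hypothetical mappings
tlb = TLB()

def translate(page, offset):
    frame = tlb.lookup(page)
    if frame is None:                    # TLB miss: go to the page table
        frame = page_table[page]
        tlb.refill(page, frame)
    return frame * 4096 + offset

print(translate(1, 100))   # miss, then refill: 3*4096 + 100 = 12388
print(translate(1, 200))   # hit this time: 12488
```

The second access to page 1 never touches the page table, which is the entire point of the cache.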
68. Hardware operation of TLB
[Diagram: the virtual address is split into a page number (bits 31-13) and an offset (bits 12-0); the TLB maps the page number to a frame number, which is concatenated with the unchanged offset to form the physical address.]
69. Hardware operation of TLB
[Diagram: the page number of the virtual address is used as an associative key into the TLB. Each entry holds a page number, a frame number, and other bits (D R W V). Example entries (pairing reconstructed from the slide):

Page Number | Frame Number | Other
23          | 37           | D R W V
50          | 17           | D R W V
24          | 92           | D R W V
19          | 5            | D R W V
12          | 6            | D R W V

The matching entry's frame number replaces the page number to form the physical address; the offset passes through unchanged.]
70-73. Hardware operation of TLB (repeated animation frames of the same diagram)
74. Software operation of TLB
- What if the entry is not in the TLB?
- Go to the page table
- Find the right entry
- Move it into the TLB
- Which entry should it replace?
- Hardware TLB refill
- Page tables must be in a specific location and format
- Software refill
- Hardware generates a trap (TLB miss fault)
- Lets the OS deal with the problem
- Page tables become entirely an OS data structure!
- Want to do a context switch?
- Must empty the TLB
- Just clear the Valid bit
75. Software operation of TLB
- What should we do with the TLB on a context switch?
- How can we prevent the next process from using the last process's address mappings?
- Option 1: empty the TLB
- The new process will generate faults until it pulls enough of its own entries into the TLB
- Option 2: just clear the Valid bit
- The new process will generate faults until it pulls enough of its own entries into the TLB
- Option 3: the hardware maintains a process-id tag on each TLB entry
- The hardware compares this to a process id held in a specific register on every translation
76. Page tables
- Do we access a page table when a process allocates or frees memory?
77. Page tables
- Do we access a page table when a process allocates or frees memory?
- Not necessarily
- Library routines (malloc) can service small requests from a pool of free memory within a process
- When these routines run out of space, a new page must be allocated and its entry inserted into the page table
78. Page tables
- When and why do we access a page table?
- On every instruction, to translate virtual to physical addresses? NO!
- On TLB miss faults, to refill the TLB
- During process creation and destruction
- When a process allocates or frees memory?
- Library routines (malloc) can service small requests from a pool of free memory within a process
- When these routines run out of space, a new page must be allocated and its entry inserted into the page table
- During swapping/paging to disk
79. Page tables
- In a well-provisioned system, TLB miss faults will be the most frequently occurring event
- TLB miss fault
- Given a virtual page number, we must find the right page table entry
- Fastest approach: index the page table using virtual page numbers
80. Page table design
- Page table size depends on
- Page size
- Virtual address length
- Memory used for page tables is overhead!
- How can we save space?
- ... and still find entries quickly?
- Two main ideas
- Multi-level page tables
- Inverted page tables
81. Multi-level Page Tables
82-88. Multi-level Page Tables
[Diagram sequence: a virtual address is split into PT1 (10 bits), PT2 (10 bits), and an offset (12 bits). PT1 indexes the top-level page table, whose entry points to one of the 2nd-level tables; PT2 indexes that table, whose entry points to a page frame in memory; the offset selects the byte within the frame.]
89. Multi-level page tables
- OK, so how does this save space?
- Not all pages within a virtual address space are allocated
- Not only do such pages lack a page frame, but that range of virtual addresses is not being used at all
- So there is no need to maintain complete information about it
- Some intermediate page tables are empty and not needed
- We could also page the page table
- This saves space but slows access a lot!
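The two-level walk can be sketched with the 10/10/12 split from the diagrams. The mappings below are hypothetical; the point is that only allocated regions need a 2nd-level table at all:

```python
# Hypothetical two-level page table: top-level entries exist only for
# allocated regions, which is how the scheme saves space.
top_level = {0: {0: 5, 1: 9}, 3: {7: 2}}   # PT1 -> (PT2 -> frame)

def translate(vaddr):
    """Translate a 32-bit address using a 10/10/12 bit split."""
    pt1 = (vaddr >> 22) & 0x3FF        # top 10 bits: top-level index
    pt2 = (vaddr >> 12) & 0x3FF        # next 10 bits: 2nd-level index
    offset = vaddr & 0xFFF             # low 12 bits: byte in frame
    second = top_level.get(pt1)
    if second is None or pt2 not in second:
        raise KeyError("page fault: no mapping")
    return second[pt2] * 4096 + offset

print(translate((0 << 22) | (1 << 12) | 0x34))   # frame 9 -> 36916
```

Addresses falling in unallocated regions (missing top-level entries) fault immediately, without any 2nd-level table existing for them.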
90. The x86 architecture
- Page size
- 4 Kbytes
- Virtual addresses (logical addresses)
- 32 bits --> 4 GB virtual address space
- 2^20 pages --> 20 bits for the page number
91. The x86 architecture
- The page table mapping
- Page Directory -> Page Table --> Frame
- Virtual address
[Layout: 10 bits index the Page Directory (1024 entries), 10 bits index a Page Table (1024 entries), and 12 bits give the offset within the page frame in physical memory.]
92. Inverted page tables
- Problem
- Page table overhead increases with address space size
- Page tables get too big to fit in memory!
- Consider a computer with 64-bit addresses
- Assume 4 Kbyte pages (12 bits for the offset)
- Virtual address space: 2^52 pages!
- The page table needs 2^52 entries!
- This page table is much too large for memory!
- But we only need fast access to translations for those pages that are in memory!
- A 256 Mbyte memory can only hold 64K 4-Kbyte pages
- So we really only need 64K page table entries!
93. Inverted page tables
- An inverted page table
- Has one entry for every frame of memory
- Tells which page is in that frame
- Is indexed by frame number, not page number!
- So how can we search it?
- If we have a page number (from a faulting address) and want to find its page table entry, do we
- Do an exhaustive search of all entries?
94. Inverted page tables
- An inverted page table
- Has one entry for every frame of memory
- Tells which page is in that frame
- Is indexed by frame number, not page number!
- So how can we search it?
- If we have a page number (from a faulting address) and want to find its page table entry, do we
- Do an exhaustive search of all entries?
- No, that's too slow!
- Why not maintain a hash table to allow fast access given a page number?
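A minimal sketch of the hash-table idea (frame count, pids, and page numbers are all hypothetical; a Python dict stands in for the hash structure):

```python
# Inverted page table: one entry per frame, recording which
# (process, page) it holds; a hash table gives fast page -> frame lookup.
N_FRAMES = 4
inverted = [None] * N_FRAMES           # frame -> (pid, page)
lookup = {}                            # (pid, page) -> frame  (the hash)

def load(pid, page, frame):
    """Record that this frame now holds this process's page."""
    inverted[frame] = (pid, page)
    lookup[(pid, page)] = frame

def translate(pid, page, offset):
    frame = lookup.get((pid, page))    # hash probe instead of a scan
    if frame is None:
        raise KeyError("page fault")
    return frame * 4096 + offset

load(pid=1, page=42, frame=3)
print(translate(1, 42, 5))    # 3*4096 + 5 = 12293
```

The table size now tracks physical memory (one entry per frame) rather than the size of each virtual address space.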
95. Inverted Page Table
96. Which page table design is best?
- The best choice depends on CPU architecture
- 64-bit systems need inverted page tables
- Some systems use a combination of regular page tables together with segmentation (later)
97. Page tables
A typical page table entry
98. Performance of memory translation
- Why can't memory address translation be done in software?
- How often is translation done?
- What work is involved in translating a virtual address to a physical address?
- Indexing into page tables
- Interpreting page descriptors
- More memory references!
99. Memory hierarchy performance
- The memory hierarchy consists of several types of memory
- L1 cache (typically on die): ~0.5 ns, 1 cycle
- L2 cache (typically available): 0.5 ns - 20 ns, 1 - 40 cycles
- Memory (DRAM, SRAM, RDRAM, ...): 40 - 80 ns, 80 - 160 cycles
- Disk (lots of space available): 8 - 13 ms, 16M - 26M cycles
- Tape (even more space available): longer than you want! (~360 billion cycles)
100. Performance of memory translation (2)
- How can additional memory references be avoided?
- TLB - translation look-aside buffer
- An associative memory cache for page table entries
- If there is locality of reference, performance is good
101. Translation lookaside buffer
[Diagram: the CPU issues an address (p, o); the page number p is looked up in the TLB. On a TLB hit, the frame number f is combined with the offset o to access physical memory; on a miss, the page table is consulted.]
102. TLB entries
103. TLB implementation
- To be fast, TLBs must implement an associative search, where the cache is searched in parallel
- EXPENSIVE
- The number of entries varies (8 -> 2048)
- Because the TLB translates logical pages to physical pages, the TLB must be flushed on every context switch in order to work
- Performance can be improved by associating process bits with each TLB entry
- A TLB must implement an eviction policy that flushes old entries out of the TLB
- This occurs when the TLB is full
104. Page table organization
- How big should a virtual address space be?
- What factors influence its size?
- How big are page tables?
- What factors determine their size?
- Can page tables be held entirely in cache?
- Can they even be held entirely in memory?
- How big should page sizes be?
105. Page Size Issues
- Choose a large page size: more loss due to internal fragmentation
- Assume a process is using 5 regions of memory heavily ...
- It will need 5 pages, regardless of page size --> a large page size ties up more memory
- Choose a small page size: the page table will become very large
- Example: virtual address space 4 Gbytes, page size 4K (e.g., Pentium)
- Page table size: 1M entries! (4 Mbytes)
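The 1M-entry figure falls straight out of the arithmetic (assuming, as the slide does, 4-byte page table entries):

```python
# Page-table size arithmetic from the slide:
vas = 4 * 2**30          # 4 GB virtual address space
page = 4 * 2**10         # 4 KB pages
entries = vas // page    # one entry per virtual page
print(entries)            # 1048576 -> 1M entries
print(entries * 4)        # 4194304 bytes -> a 4 MB page table
```

Halving the page size doubles the number of entries, which is exactly the small-page penalty the slide describes.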
106. Address space organization
- How big should a virtual address space be?
- Which regions of the address space should be allocated for different purposes - stack, data, instructions?
- What if memory needs for a region increase dynamically?
- What are segments?
- What is the relationship between segments and pages?
- Can segmentation and paging be used together?
- If segments are used, how are segment selectors incorporated into addresses?
107. Memory protection
- At what granularity should protection be implemented?
- Page level?
- Segment level?
- How is protection checking implemented?
- Compare page protection bits with process capabilities and operation types on every access
- Sounds expensive!
- How can protection checking be done efficiently?
- Segment registers
- Protection look-aside buffers
108. Memory protection with paging
- Associate protection bits with each page table entry
- Read/Write access - can provide read-only access for re-entrant code
- Valid/Invalid bits - tell the MMU whether or not the page exists in the process address space
- Page Table Length Register (PTLR) - stores how long the page table is, to avoid an excessive number of unused page table entries
[Page table entry: Frame | R/W | V/I]
109. Handling accesses to invalid pages
- The page table is used to translate logical addresses to physical addresses
- Pages that are not in memory are marked invalid
- A page fault occurs when there is an access to an invalid page of a process
- Page faults require the operating system to
- Suspend the process
- Find a free frame in memory
- Swap in the page that caused the fault
- Update the page table entry (PTE)
- Restart the process
110. Page fault handling in more detail
- Hardware traps to the kernel
- General registers are saved
- The OS determines which virtual page is needed
- The OS checks the validity of the address and seeks a page frame
- If eviction is needed and the frame is dirty, write it to disk
111. Page fault handling in more detail
- The OS brings the new page in from disk
- Page tables are updated
- The faulting instruction is backed up to where it began
- The faulting process is scheduled
- Registers are restored
- The program continues
112. Anatomy of a page fault
[Diagram: an access to page A in logical memory causes a page fault via the page table; the OS finds a free frame, brings the page in from the backing store, updates the PTE, and restarts the process.]
113. Locking pages in memory
- An issue to be aware of
- Virtual memory and I/O occasionally interact
- A process issues a call to read from a device into a buffer
- While it waits for the I/O, another process starts up
- That process has a page fault
- The buffer for the first process may be chosen to be paged out
- We need a way to specify that some pages are locked (pinned)
- They are exempted from being target pages
114. Quiz
- Why is hardware support required for dynamic address translation?
- What is a page table used for?
- What is a TLB used for?
- How many address bits are used for the page offset in a system with a 2 KB page size?
115. Memory protection
- At what granularity should protection be implemented?
- Page level?
- A lot of overhead for storing protection information for non-resident pages
- Segment level?
- Coarser grain than pages
- Makes sense if contiguous groups of pages share the same protection status
116. Memory protection
- How is protection checking implemented?
- Compare page protection bits with process capabilities and operation types on every load/store
- Sounds expensive!
- Requires hardware support!
- How can protection checking be done efficiently?
- Use the TLB as a protection look-aside buffer
- Use special segment registers
117. Protection lookaside buffer
- A TLB is often used for more than just translation
- Memory accesses need to be checked for validity
- Does the address refer to an allocated segment of the address space?
- If not: segmentation fault!
- Is this process allowed to access this memory segment?
- If not: segmentation/protection fault!
- Is the type of access valid for this segment?
- Read, write, execute?
- If not: protection fault!
118. Page-grain protection checking with a TLB
119. Segment-grain protection
- All pages within a segment usually share the same protection status
- So we should be able to batch the protection information
- Why not just use segment-size pages?
- Segments vary in size
- Segments change size dynamically (stack, heap, etc.)
120. Segmentation in a single address space
Example: a compiler
121. Segmented address spaces
- Traditional virtual address space
- Flat address space (1-dimensional)
- Segmented address space
- The program is made of several pieces
- Each segment is like a mini address space
- Addresses within a segment start at zero
- The program must always say which segment it means
- Either embed a segment id in an address
- Or load a value into a segment register
- Addresses: (segment, offset)
- Each segment can grow independently of the others
122. Segmented memory
Each space grows and shrinks independently!
123. Separate instruction and data spaces
One address space vs. separate I and D spaces
124. Page sharing
- In a large multiprogramming system...
- Some users run the same program at the same time
- Why have more than one copy of the same page in memory???
- Goal
- Share pages among processes (not just threads!)
- Cannot share writable pages
- If writable pages were shared, processes would notice each other's effects
- The text segment can be shared
125. Page sharing
[Diagram: process 1 and process 2 each have their own address space and page table; the Stack (rw) and Data (rw) pages map to private frames, while the Instructions (rx) pages of both processes map to the same frames in physical memory.]
126. Page sharing
- The fork system call
- Copies the parent's virtual address space
- ... and the child typically immediately does an exec system call
- Exec overwrites the calling address space with the contents of an executable file (i.e., a new program)
- Desired semantics
- Pages are copied, not shared
- Observations
- Copying every page in an address space is expensive!
- Processes can't notice the difference between copying and sharing unless pages are modified!
127. Page sharing
- Idea: copy-on-write
- Initialize the new page table, but point its entries to the existing page frames of the parent
- Share pages
- Temporarily mark all pages read-only
- Share all pages until a protection fault occurs
- Protection fault (copy-on-write fault)
- Is this page really read-only, or is it writable but temporarily protected for copy-on-write?
- If it is writable
- Copy the page
- Mark both copies writable
- Resume execution as if no fault occurred
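The copy-on-write steps above can be sketched as a toy model. All class and function names here are hypothetical, and the "protection fault" is an ordinary Python branch rather than a hardware trap:

```python
class Page:
    """A frame's contents."""
    def __init__(self, data):
        self.data = data

class PTE:
    """Page table entry: a page pointer plus protection state."""
    def __init__(self, page, writable):
        self.page = page
        self.writable = writable        # logically writable?
        self.write_protected = True     # temporarily read-only for COW

def fork_table(parent):
    """Child entries point at the parent's existing frames; both sides
    stay write-protected so the first write from either one faults."""
    for pte in parent:
        pte.write_protected = True
    return [PTE(pte.page, pte.writable) for pte in parent]

def write(table, i, value):
    pte = table[i]
    if pte.write_protected:             # protection (COW) fault
        if not pte.writable:
            raise PermissionError("genuinely read-only page")
        pte.page = Page(pte.page.data)  # copy the page...
        pte.write_protected = False     # ...and mark this copy writable
    pte.page.data = value               # resume the write as if no fault

parent = [PTE(Page("hello"), writable=True)]
child = fork_table(parent)
write(child, 0, "bye")
print(parent[0].page.data, child[0].page.data)   # hello bye
```

Until the write, parent and child share one frame; only the faulting side pays for a copy, which is why fork followed by exec stays cheap.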
128. On page replacement...
- Paging performance
- Paging works best if there are plenty of free frames
- If all frames are full of dirty pages...
- We must perform 2 disk operations for each page fault
129. Page replacement
- Assume a normal page table
- The user program is executing
- A PageInvalidFault occurs!
- The page needed is not in memory
- Select some frame and remove the page in it
- If it has been modified, it must be written back to disk
- The dirty bit in its page table entry tells us if this is necessary
- Figure out which page was needed from the faulting address
- Read the needed page into this frame
- Restart the interrupted process by retrying the same instruction
130. Page replacement algorithms
- Which frame do we replace?
- Algorithms
- The Optimal Algorithm
- First In First Out (FIFO)
- Not Recently Used (NRU)
- Second Chance / Clock
- Least Recently Used (LRU)
- Not Frequently Used (NFU)
- Working Set (WS)
- WSClock
131. The optimal page replacement algorithm
- Idea
- Select the page that will not be needed for the longest time
132. Optimal page replacement
- Replace the page that will not be needed for the longest time
- Example
[Diagram: a reference string over pages a, b, c, d; an X marks the reference where a replacement occurs.]
133. Optimal page replacement
- Select the page that will not be needed for the longest time
- Example
[Diagram: continuing the reference string over a, b, c, d, page e is brought in, replacing the page whose next use is furthest in the future; X marks the replacements.]
134 The optimal page replacement algorithm
- Idea
- Select the page that will not be needed for the longest time
- Problem
- Can't know the future of a program
- Can't know when a given page will be needed next
- The optimal algorithm is unrealizable
135 The optimal page replacement algorithm
- However...
- We can use it as a control case for simulation studies
- Run the program once
- Generate a log of all memory references
- Use the log to simulate various page replacement algorithms
- Can compare the others to the optimal algorithm
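The simulation idea above can be sketched in Python: given a logged reference string, the optimal (Belady) policy evicts the resident page whose next use lies farthest in the future. This is an illustrative simulator, not something an OS could run online:

```python
def optimal_faults(refs, num_frames):
    """Simulate optimal replacement over a reference log; return the fault count."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue                       # hit: nothing to do
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)            # free frame available
            continue

        def next_use(p):
            """Index of p's next reference, or infinity if never used again."""
            try:
                return refs.index(p, i + 1)
            except ValueError:
                return float('inf')        # never needed again: perfect victim

        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults
```

On the classic reference string 7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1 with 3 frames, this yields 9 faults, the minimum any algorithm can achieve.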
136 FIFO page replacement algorithm
- Always replace the oldest page
- Replace the page that has been in memory for the longest time.
137 FIFO page replacement algorithm
- Replace the page that was first brought into memory
- Example: Memory system with 4 frames
[Worked example from the slide: frame contents over a page-reference string; X marks a page fault]
138 FIFO page replacement algorithm
- Replace the page that was first brought into memory
- Example: Memory system with 4 frames
[Worked example from the slide: frame contents over a page-reference string; X marks a page fault]
139 FIFO page replacement algorithm
- Replace the page that was first brought into memory
- Example: Memory system with 4 frames
[Worked example from the slide: frame contents over a page-reference string; X marks the page faults]
140 FIFO page replacement algorithm
- Always replace the oldest page.
- Replace the page that has been in memory for the longest time.
- Implementation
- Maintain a linked list of all pages in memory
- Keep it in order of when they came into memory
- The page at the front of the list is the oldest
- Add new pages to the end of the list
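The implementation above maps directly onto a queue. A minimal Python simulation (a deque stands in for the linked list; names are illustrative):

```python
from collections import deque

def fifo_faults(refs, num_frames):
    """FIFO replacement over a reference string; returns the fault count."""
    queue = deque()          # front = oldest page in memory
    faults = 0
    for page in refs:
        if page in queue:
            continue         # hit: FIFO order is unaffected by references
        faults += 1
        if len(queue) == num_frames:
            queue.popleft()  # evict the page that arrived first
        queue.append(page)   # new page goes to the end of the list
    return faults
```

On the same classic string 7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1 with 3 frames, FIFO takes 15 faults versus optimal's 9, illustrating the cost of ignoring recency of use.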
141 FIFO page replacement algorithm
- Disadvantage
- The oldest page may be needed again soon
- Some page may be important throughout execution
- It will get old, but replacing it will cause an immediate page fault
142 Page table referenced and dirty bits
- Each page table entry (and TLB entry) has a...
- Referenced bit - set by the TLB when the page is read/written
- Dirty/modified bit - set when the page is written
- If the TLB entry for this page is valid, it has the most up-to-date version of these bits for the page
- The OS must copy them into the page table entry during fault handling
- On some hardware...
- A ReadOnly bit but no dirty bit
143 Page table referenced and dirty bits
- Idea
- Software sets the ReadOnly bit for all pages
- When a program tries to update a page...
- A trap occurs
- Software sets the Dirty bit and clears the ReadOnly bit
- Resumes execution of the program
144 Not recently used page replacement alg.
- Use the Referenced bit and the Dirty bit
- Initially, all pages have
- Referenced bit = 0
- Dirty bit = 0
- Periodically... (e.g., whenever a timer interrupt occurs)
- Clear the Referenced bit
145 Not recently used page replacement alg.
- When a page fault occurs...
- Categorize each page...
- Class 1: Referenced = 0, Dirty = 0
- Class 2: Referenced = 0, Dirty = 1
- Class 3: Referenced = 1, Dirty = 0
- Class 4: Referenced = 1, Dirty = 1
- Choose a victim page from class 1. Why?
- If none, choose a page from class 2. Why?
- If none, choose a page from class 3. Why?
- If none, choose a page from class 4. Why?
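The class ordering answers its own "why": a class-1 page is cold and needs no write-back; class 2 is cold but must be written out first; classes 3 and 4 were referenced recently, so they are evicted last. A one-line Python sketch of the victim choice (illustrative; a real OS scans page table entries rather than tuples):

```python
def nru_victim(pages):
    """Pick an NRU victim. pages: list of (name, referenced, dirty) tuples.
    Sorting on (referenced, dirty) orders the pages class 1 < 2 < 3 < 4,
    so min() returns a page from the lowest non-empty class."""
    return min(pages, key=lambda p: (p[1], p[2]))[0]
```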
146 Second chance page replacement alg.
- A modification to FIFO
- Pages kept in a linked list
- Oldest is at the front of the list
- Look at the oldest page
- If its referenced bit is 0...
- Select it for replacement
- Else...
- It was used recently; we don't want to replace it
- Clear its referenced bit
- Move it to the end of the list
- Repeat
- What if every page was used in the last clock tick?
- Select a page at random
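The loop above can be sketched in Python with a deque standing in for the linked list (names are illustrative). Note that in this version, if every page was referenced, the first pass clears all the bits and the original oldest page is then evicted, i.e. it falls back to plain FIFO rather than picking at random:

```python
from collections import deque

def second_chance_evict(pages):
    """pages: deque of [name, referenced_bit] pairs, oldest at the front.
    Evicts one page, returns its name, and leaves the deque updated."""
    while True:
        name, ref = pages[0]
        if ref == 0:
            pages.popleft()      # oldest unreferenced page: evict it
            return name
        pages[0][1] = 0          # used recently: clear the bit...
        pages.rotate(-1)         # ...and move it to the end (second chance)
```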
147 Clock algorithm (same as second chance)
- Maintain a circular list of pages in memory
- Set a bit for the page when a page is referenced
- The clock hand sweeps over memory looking for a victim page that does not have the referenced bit set
- If the bit is set, clear it and move on to the next page
- Replaces pages that haven't been referenced for one complete clock revolution
148 Least recently used algorithm (LRU)
- Keep track of when a page is used.
- Replace the page that has been used least recently.
149 LRU page replacement
- Replace the page that hasn't been referenced in the longest time
150 LRU page replacement
- Replace the page that hasn't been referenced in the longest time
[Worked example from the slide: frame contents over a page-reference string; X marks the page faults]
151 Least recently used algorithm (LRU)
- But how can we implement this?
- Implementation 1
- Keep a linked list of all pages
- On every memory reference...
- Move that page to the front of the list.
- The page at the tail of the list is replaced.
- ...on every memory reference...
- Not feasible in software
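Implementation 1 can still be simulated offline. A sketch using Python's OrderedDict in place of the linked list (most recently used at the end, LRU victim at the front; names are illustrative):

```python
from collections import OrderedDict

class LRUPager:
    """Simulation of LRU Implementation 1 (list reordered on every reference)."""
    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.frames = OrderedDict()       # insertion order = recency order
        self.faults = 0

    def reference(self, page):
        if page in self.frames:
            self.frames.move_to_end(page)        # touched: now most recent
            return
        self.faults += 1
        if len(self.frames) == self.num_frames:
            self.frames.popitem(last=False)      # evict least recently used
        self.frames[page] = True
```

On the classic string 7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1 with 3 frames, LRU takes 12 faults, between FIFO's 15 and optimal's 9. The catch, as the slide says, is that doing this reordering on *every* memory reference is far too slow in software.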
152 LRU implementation
- Take the referenced page and put it at the head of the list
153 LRU implementation
- Take the referenced page and put it at the head of the list
[Worked example from the slide: list order after each reference]
154 LRU implementation
- Take the referenced page and put it at the head of the list
[Worked example from the slide: list order after each reference; X marks a page fault]
155 LRU implementation
- Take the referenced page and put it at the head of the list
[Worked example from the slide: list order after each reference; X marks the page faults]
156 Least recently used algorithm (LRU)
- But how can we implement this...
- without requiring every access to be recorded?
- Implementation 2
- The MMU (hardware) maintains a counter
- Incremented on every clock cycle
- Every time a page table entry is used...
- The MMU writes the counter value to the entry
- A timestamp / time-of-last-use
- When a page fault occurs...
- Software looks through the page table
- Identifies the entry with the oldest timestamp
157 Least recently used algorithm (LRU)
- What if we don't have hardware support?
- Implementation 3
- No hardware support
- Maintain a counter in software
- On every timer interrupt...
- Increment the counter
- Run through the page table
- For every entry that has ReferencedBit = 1
- Update its timestamp
- Clear the ReferencedBit
- Approximates LRU
- If several have the oldest time, choose one arbitrarily
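Implementation 3 can be sketched in a few lines of Python (illustrative; the dict stands in for the page table, and the software counter is passed in as the current tick number):

```python
def lru_approx_tick(page_table, clock):
    """One timer interrupt of the software-only LRU approximation.
    page_table: dict name -> {'referenced': bool, 'timestamp': int}."""
    for entry in page_table.values():
        if entry['referenced']:
            entry['timestamp'] = clock    # stamp pages used since the last tick
            entry['referenced'] = False

def lru_approx_victim(page_table):
    """Evict the entry with the oldest timestamp (ties broken arbitrarily)."""
    return min(page_table, key=lambda p: page_table[p]['timestamp'])
```

The approximation is coarse: all pages referenced within the same tick get the same timestamp, so their relative order is lost.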
158 Not frequently used algorithm (NFU)
- Associate a counter with each page
- On every clock interrupt, the OS looks at each page.
- If the Referenced bit is set...
- Increment that page's counter and clear the bit.
- The counter approximates how often the page is used.
- For replacement, choose the page with the lowest counter.
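A minimal Python sketch of NFU (illustrative; the counters dict stands in for per-page counters, and the list of referenced pages stands in for the Referenced bits seen at the interrupt):

```python
def nfu_tick(counters, referenced):
    """One clock interrupt: bump the counter of every page whose bit was set."""
    for page in referenced:
        counters[page] = counters.get(page, 0) + 1

def nfu_victim(counters):
    """Replace the page with the lowest usage counter."""
    return min(counters, key=counters.get)
```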
159 Not frequently used algorithm (NFU)
- Problem
- Some page may be heavily used
- --> Its counter is large
- The program's behavior changes
- Now, this page is not used ever again (or only rarely)
- This algorithm never forgets!
- This page will never be chosen for replacement!
160 Modified NFU with aging
- Associate a counter with each page
- On every clock tick, the OS looks at each page.
- Shift the counter right 1 bit (divide its value by 2)
- If the Referenced bit is set...
- Set the most-significant bit
- Clear the Referenced bit
- 100000 = 32 (referenced this tick)
- 010000 = 16 (one tick without a reference)
- 001000 = 8
- 000100 = 4
- 100010 = 34 (referenced again: shift right, then set the MSB)
- 111111 = 63 (referenced on six consecutive ticks)
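The per-tick counter update reduces to a shift and an OR. A sketch reproducing the 6-bit patterns on the slide:

```python
COUNTER_BITS = 6   # matches the 6-bit counters shown on the slide

def age_counter(counter, referenced):
    """One clock tick of the aging algorithm: halve the counter, then record
    this tick's reference in the most-significant bit."""
    counter >>= 1
    if referenced:
        counter |= 1 << (COUNTER_BITS - 1)
    return counter
```

Because recent ticks land in the high bits, a page referenced now always outranks one last referenced a few ticks ago, and old history decays away instead of accumulating forever as in plain NFU.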
161 Paged Memory Management
162 Working set page replacement
- Demand paging
- Pages are only loaded when accessed
- When a process begins, all pages are marked INVALID
- Locality of reference
- Processes tend to use only a small fraction of their pages
- Working set
- The set of pages a process needs
- If the working set is in memory, no page faults occur
- What if you can't get the working set into memory?
163 Working set page replacement
- Thrashing
- If you can't get the working set into memory, pages fault every few instructions
- No work gets done
164 Working set page replacement
- Prepaging (prefetching)
- Load pages before they are needed
- Main idea
- Identify the process's working set
- How big is the working set?
- Look at the last k memory references
- As k gets bigger, more pages are needed.
- In the limit, all pages are needed.
165 Working set page replacement
[Figure from the slide: the size of the working set plotted against k, the time interval]
166 Working set page replacement
- Idea
- Look back over the last T msec of time
- Which pages were referenced?
- This is the working set.
- Current virtual time
- Only consider how much CPU time this process has seen.
- Implementation
- On each clock tick, look at each page
- Was it referenced?
- Yes: make a note of the current virtual time
- If a page has not been used in the last T msec...
- It is not in the working set!
- Evict it; write it out if it is dirty.
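The per-tick scan above can be sketched in Python (illustrative: the dict stands in for the page table, T is an arbitrary example window, and virtual time is passed in as a number):

```python
T = 50   # working-set window in msec of virtual time (illustrative value)

def working_set_scan(pages, current_vt):
    """One clock-tick pass over the page table.
    pages: dict name -> {'referenced': bool, 'last_use': virtual time in msec}.
    Returns the pages outside the working set, i.e. the eviction candidates."""
    evict = []
    for name, page in pages.items():
        if page['referenced']:
            page['last_use'] = current_vt   # note the current virtual time
            page['referenced'] = False
        elif current_vt - page['last_use'] > T:
            evict.append(name)              # not used in the last T msec
    return evict
```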
167 Working set page replacement
168 WSClock page replacement algorithm
- All pages are kept in a circular list (ring)
- As pages are added, they go into the ring.
- The clock hand advances around the ring.
- Each entry contains the time of last use.
- Upon a page fault...
- If the Referenced bit = 1...
- The page is in use now. Do not evict.
- Clear the Referenced bit.
- Update the