OMSE 510: Computing Foundations 8: The Address Space
Chris Gilmore, Portland State University/OMSE
Slides: 187
Learn more at: http://web.cecs.pdx.edu

Transcript and Presenter's Notes

1
OMSE 510: Computing Foundations 8: The Address Space
  • Chris Gilmore <grimjack@cs.pdx.edu>
  • Portland State University/OMSE

Material borrowed from Jon Walpole's lectures
2
Today
  • Memory Management
  • Virtual/Physical Address Translation
  • Page Tables
  • MMU, TLB


3
Memory management
  • Memory: a linear array of bytes
  • Holds the O.S. and programs (processes)
  • Each memory cell is named by a unique memory
    address
  • Recall, processes are defined by an address
    space, consisting of text, data, and stack
    regions
  • Process execution
  • CPU fetches instructions from the text region
    according to the value of the program counter
    (PC)
  • Each instruction may request additional operands
    from the data or stack region

4
Virtual memory management overview
  • What do we know about memory management?
  • Processes require memory to run
  • We provide the appearance that the entire process
    is resident during execution
  • We know some functions/code in processes never
    get invoked
  • Error detection and recovery routines
  • In a graphics package, functions like smooth,
    sharpen, brighten, etc... may not get invoked
  • Virtual Memory - allows for the execution of
    processes that may not be completely in memory
    (extension of paging technique from the last
    chapter)

5
Virtual memory overview
  • Goals
  • Hides physical memory from user
  • Allows higher degree of multiprogramming (only
    bring in pages that are accessed)
  • Allows large processes to be run on small amounts
    of physical memory
  • Reduces I/O required to swap in/out processes
  • (makes the system faster)
  • Requires
  • Pager - page in /out pages as required
  • Swap space in order to hold processes that are
    partially complete
  • Hardware support to do address translation

6
Addressing memory
  • Cannot know ahead of time where in memory a
    program will be loaded!
  • Compiler produces code containing embedded
    addresses
  • these addresses can't be absolute (physical)
    addresses
  • Linker combines pieces of the program
  • Assumes the program will be loaded at address 0
  • We need to bind the compiler/linker generated
    addresses to the actual memory locations

7
Relocatable address generation
(Diagram: address generation through the toolchain.
Compilation: program P contains "push ... jmp _foo" with a
symbolic reference to foo. Assembly: foo sits at offset 75
in P's module, so the code becomes "jmp 75". Linking:
library routines occupy 0-100 and P occupies 100-175, so
the jump becomes "jmp 175". Loading at address 1000:
library routines occupy 1000-1100, P occupies 1100-1175,
and the jump becomes "jmp 1175".)
8
Address binding
  • Address binding
  • binding the logical addresses of a process
    address space to physical addresses
  • Compile time binding
  • if program location is fixed and known ahead of
    time
  • Load time binding
  • if program location in memory is unknown until
    run-time AND location is fixed
  • Execution time binding
  • if processes can be moved in memory during
    execution
  • Requires hardware support!

9
(Diagram: the same program under the three binding times.
Compile-time binding: the code is emitted with "jmp 1175"
for a load address of 1000 known in advance. Load-time
binding: the code is emitted with "jmp 175" and rewritten
to "jmp 1175" when the program is loaded at 1000.
Execution-time binding: the code keeps "jmp 175" and a
base register holding 1000 is added on every reference.)
10
Memory management architectures
  • Fixed size allocation
  • Memory is divided into fixed partitions
  • Dynamically sized allocation
  • Memory allocated to fit processes exactly

11
Runtime binding: base and limit registers
  • Simple runtime relocation scheme
  • Use 2 registers to describe a partition
  • For every address generated, at runtime...
  • Compare to the limit register (abort if larger)
  • Add to the base register to give the physical
    memory address

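The compare-then-add sequence above can be sketched as a toy model in Python. This is only an illustration of the check, not real MMU behavior; the register values are invented, and `MemoryError` stands in for the hardware abort:

```python
def relocate(base, limit, logical):
    """Translate a program-generated (logical) address with base/limit
    registers; raise on an addressing error, as the hardware would abort."""
    if logical >= limit:
        raise MemoryError("addressing error: offset beyond partition limit")
    return base + logical

# A process loaded at physical address 1000 in a 200-byte partition:
assert relocate(1000, 200, 175) == 1175   # a 'jmp 175' becomes 'jmp 1175'
```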
12
Dynamic relocation with a base register
  • Memory Management Unit (MMU) - dynamically
    converts logical addresses into physical
    addresses
  • The MMU contains the base address register for
    the running process

(Diagram: the program-generated address is added to the
relocation register for process i, whose value 1000 is the
start of process i's partition, producing the physical
memory address. Physical memory runs from 0, where the
operating system sits, up to Max Mem; process i occupies
1000 up to Max addr. The MMU performs the addition.)
13
Protection using base limit registers
  • Memory protection
  • Base register gives starting address for process
  • Limit register limits the offset accessible from
    the relocation register

(Diagram: the logical address is first compared with the
limit register; if it is less, it is added to the base
register to form the physical memory address; otherwise an
addressing error is raised.)
14
Multiprogramming with base and limit registers
  • Multiprogramming a separate partition per
    process
  • What happens on a context switch?
  • Store process A's base and limit register values
  • Load new values into base and limit registers for
    process B

(Diagram: physical memory divided into partitions A-E
above the OS; the base register points at the start of the
running process's partition and the limit register gives
its length.)
15
(Diagram: 1M of memory; the O.S. occupies the bottom 128K,
leaving 896K free.)
16
(Diagram: P1 (320K) is loaded above the O.S., leaving 576K
free.)
17
(Diagram: P2 (224K) is loaded above P1, leaving 352K
free.)
18
(Diagram: P3 (288K) is loaded above P2, leaving 64K free
at the top.)
19
(Diagram: P2 (224K) is swapped out, leaving a 224K hole
between P1 and P3.)
20
(Diagram: P4 (128K) is loaded into P2's old hole, leaving
a 96K hole below P3 and the 64K hole at the top.)
21
(Diagram: P1 (320K) is swapped out, leaving a 320K hole
above the O.S.)
22
(Diagram: P5 (224K) is loaded into P1's old hole, leaving
a 96K hole above the O.S.)
23
(Diagram: P6 (128K) arrives, but no single hole is large
enough - only 64K, 96K, and 96K holes remain - even though
256K is free in total.)
24
Swapping
  • When a program is running...
  • The entire program must be in memory
  • Each program is put into a single partition
  • When the program is not running...
  • May remain resident in memory
  • May get swapped out to disk
  • Over time...
  • Programs come into memory when they get swapped
    in
  • Programs leave memory when they get swapped out

25
Basics - swapping
  • Benefits of swapping
  • Allows multiple programs to be run concurrently
  • more than will fit in memory at once

(Diagram: memory between 0, where the operating system
sits, and Max mem holds processes i, j, and k; over time
processes are swapped out to disk and others, like process
m, are swapped in.)
26
Swapping can also lead to fragmentation
27
Dealing with fragmentation
  • Compaction: from time to time, shift processes
    around to collect all free space into one
    contiguous block
  • Placement algorithms: first-fit, best-fit,
    worst-fit

(Diagram: before compaction the holes are 64K, 96K, and
96K, so P6 (128K) does not fit anywhere. After shifting
P5, P4, and P3 down against the O.S., the free space
becomes one 256K block at the top and P6 fits.)
28
Influence of allocation policy
29
How big should partitions be?
  • Programs may want to grow during execution
  • More room for stack, heap allocation, etc
  • Problem
  • If the partition is too small, programs must be
    moved
  • Requires modification of base and limit registers
  • Why not make the partitions a little larger than
    necessary to accommodate some growth?
  • Fragmentation
  • External fragmentation: unused space between
    partitions
  • Internal fragmentation: unused space within
    partitions

30
Allocating extra space within partitions
31
Managing memory
  • Each chunk of memory is either
  • Used by some process or unused (free)
  • Operations
  • Allocate a chunk of unused memory big enough to
    hold a new process
  • Free a chunk of memory by returning it to the
    free pool after a process terminates or is
    swapped out

32
Managing memory with bit maps
  • Problem - how to keep track of used and unused
    memory?
  • Technique 1 - Bit Maps
  • A long bit string
  • One bit for every chunk of memory
  • 1 = in use
  • 0 = free
  • Size of allocation unit influences space required
  • Example: unit size = 32 bits
  • overhead for bit map = 1/33, about 3%
  • Example: unit size = 4 Kbytes
  • overhead for bit map = 1/32,769

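With one bit per allocation unit, finding room for a process means finding a run of 0 bits and flipping it to 1s. A minimal Python sketch (the bitmap contents and request sizes below are invented for illustration):

```python
def alloc_units(bitmap, n):
    """First-fit search for n consecutive free (0) allocation units;
    mark them used (1) and return the index of the first, or None."""
    run = 0
    for i, bit in enumerate(bitmap):
        run = run + 1 if bit == 0 else 0
        if run == n:                      # found a big-enough hole
            start = i - n + 1
            for j in range(start, i + 1):
                bitmap[j] = 1             # mark the units in use
            return start
    return None                           # no hole large enough

bitmap = [1, 1, 0, 0, 0, 1, 0, 0]
assert alloc_units(bitmap, 3) == 2        # units 2..4 were free
assert bitmap[2:5] == [1, 1, 1]
assert alloc_units(bitmap, 3) is None     # no 3-unit hole remains
```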
33
Managing memory with bit maps
34
Managing memory with linked lists
  • Technique 2 - Linked List
  • Keep a list of elements
  • Each element describes one unit of memory
  • Free / in-use bit (P = process, H = hole)
  • Starting address
  • Length
  • Pointer to next element

35
Managing memory with linked lists
(Diagram: the list elements laid out over memory starting
at address 0.)
36
Merging holes
  • Whenever a unit of memory is freed we want to
    merge adjacent holes!

37
Merging holes
38
Merging holes
39
Merging holes
40
Merging holes
41
Managing memory with linked lists
  • Searching the list for space for a new process
  • First Fit
  • Next Fit
  • Start from current location in the list
  • Not as good as first fit
  • Best Fit
  • Find the smallest hole that will work
  • Tends to create lots of little holes
  • Worst Fit
  • Find the largest hole
  • Remainder will be big
  • Quick Fit
  • Keep separate lists for common sizes

42
Fragmentation
  • Memory is divided into partitions
  • Each partition has a different size
  • Processes are allocated space and later freed
  • After a while memory will be full of small holes!
  • No free space large enough for a new process even
    though there is enough free memory in total
  • This is external fragmentation
  • If we allow free space within a partition we have
    internal fragmentation

43
Solution to fragmentation?
  • Allocate memory in equal fixed size units?
  • Reduces external fragmentation problems
  • But what about wasted space inside a unit due to
    internal fragmentation?
  • How big should the units be?
  • The smaller the better for internal fragmentation
  • The larger the better for management overhead
  • Can we use a unit size smaller than the memory
    needed by a process?
  • I.e., allocate non-contiguous units to the same
    process?
  • But how would the base and limit registers work?

44
Using pages for non-contiguous allocation
  • Memory divided into fixed size page frames
  • Page frame size = 2^n bytes
  • Lowest n bits of an address specify the byte
    offset within a page
  • But how do we associate page frames with
    processes?
  • And how do we map memory addresses within a
    process to the correct memory byte in a page
    frame?
  • Solution
  • Processes use virtual addresses
  • Hardware uses physical addresses
  • hardware support for virtual to physical address
    translation

45
Virtual addresses
  • Virtual memory addresses (what the process uses)
  • Page number plus byte offset in page
  • Low order n bits are the byte offset
  • Remaining high order bits are the page number

(Diagram: bits 31..n hold the 20-bit page number; bits
n-1..0 hold the 12-bit offset.)
Example: 32-bit virtual address, page size 2^12 = 4KB,
address space size 2^32 bytes = 4GB
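The split described above is just shifting and masking. A small sketch, assuming the 4KB-page example (the sample address is invented):

```python
PAGE_BITS = 12                 # 4 KB pages: 2**12 bytes
PAGE_SIZE = 1 << PAGE_BITS

def split(vaddr):
    """Split a 32-bit virtual address into (page number, offset)."""
    return vaddr >> PAGE_BITS, vaddr & (PAGE_SIZE - 1)

# 0x00403ABC: high 20 bits give page 0x403, low 12 bits give offset 0xABC
assert split(0x00403ABC) == (0x403, 0xABC)
```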
46
Physical addresses
  • Physical memory addresses (what the CPU uses)
  • Page frame number plus byte offset in page
  • Low order n bits are the byte offset
  • Remaining high order bits are the page frame
    number

(Diagram: bits 23..n hold the 12-bit page frame number;
bits n-1..0 hold the 12-bit offset.)
Example: 24-bit physical address, page frame size 2^12 =
4KB, max physical memory size 2^24 bytes = 16MB
47
Address translation
  • Hardware maps page numbers to page frame numbers
  • Memory management unit (MMU) has multiple
    registers for multiple pages
  • Like a base register except its value is
    substituted for the page number rather than added
    to it
  • Why don't we need a limit register for each page?

48
Memory Management Unit (MMU)
49
Virtual address spaces
  • Here is the virtual address space
  • (as seen by the process)

(Diagram: the virtual address space as a single range from
lowest to highest address.)
50
Virtual address spaces
  • The address space is divided into pages
  • In x86, the page size is 4K

(Diagram: the virtual address space divided into pages 0
through N.)
51
Virtual address spaces
  • In reality, only some of the pages are used

(Diagram: the virtual address space with some pages marked
unused.)
52
Physical memory
  • Physical memory is divided into page frames
  • (Page size = frame size)

(Diagram: physical memory divided into page frames,
alongside the virtual address space's pages 0 through N.)
53
Virtual and physical address spaces
  • Some page frames are used to hold the pages of
    this process

(Diagram: some of the physical page frames hold this
process's pages.)
54
Virtual and physical address spaces
  • Some page frames are used for other processes

(Diagram: other page frames are used by other processes.)
55
Virtual address spaces
  • Address mappings say which frame has which page

(Diagram: arrows from each used page in the virtual
address space to the physical frame that holds it.)
56
Page tables
  • Address mappings are stored in a page table in
    memory
  • One page table entry per page...
  • Is this page in memory? If so, which frame is it
    in?

(Diagram: the page table records, for each page of the
virtual address space, whether it is in memory and which
frame holds it.)
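The per-page lookup can be modeled with a table holding one entry per page. A toy sketch, assuming 4KB pages, with `None` standing in for an invalid entry and `LookupError` for a page fault (the table contents are invented):

```python
PAGE_BITS = 12   # 4 KB pages

def translate(page_table, vaddr):
    """Map a virtual address to a physical one via a one-entry-per-page
    table; a None entry means the page is not in memory (page fault)."""
    page, offset = vaddr >> PAGE_BITS, vaddr & ((1 << PAGE_BITS) - 1)
    frame = page_table[page]
    if frame is None:
        raise LookupError("page fault: page %d not resident" % page)
    return (frame << PAGE_BITS) | offset

table = [3, None, 7]   # page 0 -> frame 3, page 1 absent, page 2 -> frame 7
assert translate(table, 0x0ABC) == (3 << 12) | 0xABC
```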
57
Address mappings and translation
  • Address mappings are stored in a page table in
    memory
  • Typically one page table for each process
  • Address translation is done by hardware (i.e.,
    the MMU)
  • How does the MMU get the address mappings?
  • Either the MMU holds the entire page table (too
    expensive)
  • Or the MMU holds a portion of the page table
  • MMU caches page table entries
  • called a translation look-aside buffer (TLB)

58
Address mappings and translation
  • What if the TLB needs a mapping it doesn't have?
  • Software managed TLB
  • it generates a TLB-miss fault which is handled by
    the operating system (like interrupt or trap
    handling)
  • The operating system looks in the page tables,
    gets the mapping from the right entry, and puts
    it in the TLB
  • Hardware managed TLB
  • it looks in a pre-specified memory location for
    the appropriate entry in the page table
  • The hardware architecture defines where page
    tables must be stored in memory

59
A Simple Architecture
  • Page size
  • 4 Kbytes
  • Virtual addresses (logical addresses)
  • 32 bits -> 4GB virtual address space
  • 2^20 (1M) pages -> 20 bits for page number

60
A Simple Architecture
  • Page size
  • 4 Kbytes
  • Virtual addresses (logical addresses)
  • 32 bits -> 4GB virtual address space
  • 2^20 (1M) pages -> 20 bits for page number

(Diagram: bits 31..12 hold the 20-bit page number; bits
11..0 hold the 12-bit offset.)
61
A Simple Architecture
  • Physical addresses
  • 32 bits -> 4 Gbytes installed memory (max)
  • 2^20 (1M) frames -> 20 bits for frame number
  • Hardware Extensions

62
A Simple Architecture
  • The page table mapping
  • Page Directory -> Page Table -> Frame
  • Virtual Address

(Diagram: a 32-bit virtual address with a 20-bit page
number and a 12-bit offset; the page number indexes a
page table of 1M entries to select a page frame in
physical memory, and the offset selects the byte within
it.)
63
Quiz
  • What is the difference between a virtual and a
    physical address?
  • Why are programs not usually written using
    physical addresses?

64
Page tables
  • When and why do we access a page table?
  • On every instruction, to translate virtual to
    physical addresses?

65
Page tables
  • When and why do we access a page table?
  • On every instruction to translate virtual to
    physical addresses? NO!
  • On TLB miss faults to refill the TLB
  • During process creation and destruction
  • When a process allocates or frees memory?

66
Translation Lookaside Buffer (TLB)
  • Problem
  • MMU must go to page table on every memory access!

67
Translation Lookaside Buffer (TLB)
  • Problem
  • MMU must go to page table on every memory access!
  • Solution
  • Cache the page table entries in a hardware cache
  • Small number of entries (e.g., 64)
  • Each entry contains
  • Page number
  • Other stuff from page table entry
  • Associatively indexed on page number

68
Hardware operation of TLB
(Diagram: a 32-bit virtual address is split into a page
number and an offset; translation replaces the page
number with a frame number, and the offset is carried
through unchanged into the physical address.)
69
Hardware operation of TLB
virtual address
0
12
13
31
page number
offset
Key
Page Number
Frame Number
Other
unused
D R W V
23
37
unused
50
D R W V
17
unused
24
D R W V
92
unused
19
D R W V
5
unused
12
6
D R W V
0
12
13
31
frame number
offset
physical address
74
Software operation of TLB
  • What if the entry is not in the TLB?
  • Go to page table
  • Find the right entry
  • Move it into the TLB
  • Which entry to replace?
  • Hardware TLB refill
  • Page tables in specific location and format
  • Software refill
  • Hardware generates trap (TLB miss fault)
  • Lets the OS deal with the problem
  • Page tables become entirely an OS data structure!
  • Want to do a context switch?
  • Must empty the TLB
  • Just clear the Valid Bit

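The hit/refill behavior above can be sketched as a toy model in Python, with dicts standing in for the TLB and the page table (the mappings are invented; a real TLB has a fixed number of entries and an eviction policy, omitted here):

```python
def lookup(tlb, page_table, page):
    """TLB hit returns the cached frame; a miss consults the page table
    (as the OS miss handler would) and installs the mapping in the TLB."""
    if page in tlb:
        return tlb[page], "hit"
    frame = page_table[page]   # the OS walks the page table
    tlb[page] = frame          # refill the TLB with the new mapping
    return frame, "miss"

tlb, table = {}, {5: 19, 92: 23}
assert lookup(tlb, table, 5) == (19, "miss")
assert lookup(tlb, table, 5) == (19, "hit")   # now cached

tlb.clear()   # a context switch that empties the TLB
assert lookup(tlb, table, 5) == (19, "miss")  # faults until refilled
```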
75
Software operation of TLB
  • What should we do with the TLB on a context
    switch?
  • How can we prevent the next process from using
    the last process's address mappings?
  • Option 1: empty the TLB
  • New process will generate faults until it pulls
    enough of its own entries into the TLB
  • Option 2: just clear the Valid Bit
  • New process will generate faults until it pulls
    enough of its own entries into the TLB
  • Option 3: the hardware maintains a process id tag
    on each TLB entry
  • Hardware compares this to a process id held in a
    specific register on every translation

76
Page tables
  • Do we access a page table when a process
    allocates or frees memory?

77
Page tables
  • Do we access a page table when a process
    allocates or frees memory?
  • Not necessarily
  • Library routines (malloc) can service small
    requests from a pool of free memory within a
    process
  • When these routines run out of space a new page
    must be allocated and its entry inserted into the
    page table

78
Page tables
  • When and why do we access a page table?
  • On every instruction to translate virtual to
    physical addresses? NO!
  • On TLB miss faults to refill the TLB
  • During process creation and destruction
  • When a process allocates or frees memory?
  • Library routines (malloc) can service small
    requests from a pool of free memory within a
    process
  • When these routines run out of space a new page
    must be allocated and its entry inserted into the
    page table
  • During swapping/paging to disk

79
Page tables
  • In a well provisioned system, TLB miss faults
    will be the most frequently occurring event
  • TLB miss fault
  • Given a virtual page number we must find the
    right page table entry
  • Fastest approach: index the page table using
    virtual page numbers

80
Page table design
  • Page table size depends on
  • Page size
  • Virtual address length
  • Memory used for page tables is overhead!
  • How can we save space?
  • and still find entries quickly?
  • Two main ideas
  • Multi-level page tables
  • Inverted page tables

81
Multi-level Page Tables
82
Multi-level Page Tables
(Diagram: a top-level page table whose entries point to
2nd-level tables, whose entries in turn point to frames
in memory.)
83
Multi-level Page Tables
(Diagram: a virtual address split into a 10-bit PT1
field, a 10-bit PT2 field, and a 12-bit offset. PT1
indexes the top-level page table, PT2 indexes the
selected 2nd-level table, and the resulting entry points
to a frame in memory. Slides 84-88 step through this
lookup as an animation.)
89
Multi-level page tables
  • Ok, so how does this save space?
  • Not all pages within a virtual address space are
    allocated
  • Not only do they not have a page frame, but that
    range of virtual addresses is not being used
  • So no need to maintain complete information about
    it
  • Some intermediate page tables are empty and not
    needed
  • We could also page the page table
  • This saves space but slows access a lot!

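The two-level walk described above can be sketched in Python, with dicts standing in for the tables so that absent second-level tables take no space (the 10/10/12 split matches the slides; the sample mapping is invented):

```python
def translate2(top_table, vaddr):
    """Walk a two-level table: 10-bit PT1 indexes the top-level table,
    10-bit PT2 indexes a second-level table, the 12-bit offset is kept."""
    pt1 = (vaddr >> 22) & 0x3FF
    pt2 = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF
    second = top_table.get(pt1)     # absent -> that 4 MB region is unused
    if second is None or pt2 not in second:
        raise LookupError("page fault")
    return (second[pt2] << 12) | offset

# Only regions actually in use need a second-level table:
top = {1: {3: 42}}                  # PT1=1, PT2=3 maps to frame 42
vaddr = (1 << 22) | (3 << 12) | 0x5A
assert translate2(top, vaddr) == (42 << 12) | 0x5A
```

The space saving falls out of the sparsity: a process touching only a few regions allocates only a few second-level tables instead of one flat 1M-entry table.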
90
The x86 architecture
  • Page size
  • 4 Kbytes
  • Virtual addresses (logical addresses)
  • 32 bits -> 4GB virtual address space
  • 2^20 (1M) pages -> 20 bits for page number

91
The x86 architecture
  • The page table mapping
  • Page Directory -> Page Table -> Frame
  • Virtual Address

(Diagram: a 32-bit virtual address with two 10-bit fields
and a 12-bit offset; the first 10 bits index the Page
Directory (1024 entries), the next 10 bits index a Page
Table (1024 entries), which yields the page frame in
physical memory.)
92
Inverted page tables
  • Problem
  • Page table overhead increases with address space
    size
  • Page tables get too big to fit in memory!
  • Consider a computer with 64 bit addresses
  • Assume 4 Kbyte pages (12 bits for the offset)
  • Virtual address space = 2^52 pages!
  • Page table needs 2^52 entries!
  • This page table is much too large for memory!
  • But we only need fast access to translations for
    those pages that are in memory!
  • A 256 Mbyte memory can only hold 64K 4-Kbyte
    pages
  • So we really only need 64K page table entries!

93
Inverted page tables
  • An inverted page table
  • Has one entry for every frame of memory
  • Tells which page is in that frame
  • Is indexed by frame number not page number!
  • So how can we search it?
  • If we have a page number (from a faulting
    address) and want to find its page table entry,
    do we
  • Do an exhaustive search of all entries?

94
Inverted page tables
  • An inverted page table
  • Has one entry for every frame of memory
  • Tells which page is in that frame
  • Is indexed by frame number not page number!
  • So how can we search it?
  • If we have a page number (from a faulting
    address) and want to find its page table entry,
    do we
  • Do an exhaustive search of all entries?
  • No, that's too slow!
  • Why not maintain a hash table to allow fast
    access given a page number?

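The frame-indexed table plus hash index can be sketched as a toy model in Python (the process ids and page numbers are invented for illustration):

```python
def build_inverted(frames):
    """frames[i] holds the (pid, page) resident in frame i, or None.
    Build a hash from (pid, page) -> frame for fast search."""
    return {entry: f for f, entry in enumerate(frames) if entry is not None}

# One entry per frame of physical memory, not per virtual page:
frames = [None, (7, 0x403), (7, 0x404), (9, 0x010)]
index = build_inverted(frames)
assert index[(7, 0x404)] == 2     # process 7, page 0x404 is in frame 2
assert (9, 0x999) not in index    # not resident -> would be a page fault
```

Note the table size tracks physical memory (one entry per frame), which is why it stays small even for a 64-bit virtual address space.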
95
Inverted Page Table
96
Which page table design is best?
  • The best choice depends on CPU architecture
  • 64 bit systems need inverted page tables
  • Some systems use a combination of regular page
    tables together with segmentation (later)

97
Page tables
A typical page table entry
98
Performance of memory translation
  • Why can't memory address translation be done in
    software?
  • How often is translation done?
  • What work is involved in translating a virtual
    address to a physical address?
  • indexing into page tables
  • interpreting page descriptors
  • more memory references!

99
Memory hierarchy performance
  • The memory hierarchy consists of several types
    of memory
  • L1 cache (typically on die)
  • L2 cache (typically available)
  • Memory (DRAM, SRAM, RDRAM, ...)
  • Disk (lots of space available)
  • Tape (even more space available)

  • L1 cache: 1 cycle (0.5 ns!)
  • L2 cache: 1 - 40 cycles (0.5 ns - 20 ns)
  • Memory: 80 - 160 cycles (40 - 80 ns)
  • Disk: 16M - 26M cycles (8 - 13 ms)
  • Tape: 360 billion cycles (longer than you want!)
100
Performance of memory translation (2)
  • How can additional memory references be avoided?
  • TLB - translation look-aside buffer
  • an associative memory cache for page table
    entries
  • if there is locality of reference, performance is
    good

101
Translation lookaside buffer
(Diagram: the CPU issues a virtual address (p, o); on a
TLB hit the frame number f comes straight from the TLB,
otherwise the page table supplies it; the physical
address (f, o) then goes to physical memory.)
102
TLB entries
103
TLB implementation
  • In order to be fast, TLBs must implement an
    associative search where the cache is searched in
    parallel.
  • EXPENSIVE
  • The number of entries varies (8 to 2048)
  • Because the TLB translates logical pages to
    physical pages, the TLB must be flushed on every
    context switch in order to work
  • Can improve performance by associating process
    bits with each TLB entry
  • A TLB must implement an eviction policy which
    flushes old entries out of the TLB
  • Occurs when the TLB is full

104
Page table organization
  • How big should a virtual address space be?
  • what factors influence its size?
  • How big are page tables?
  • what factors determine their size?
  • Can page tables be held entirely in cache?
  • can they be held entirely in memory even?
  • How big should page sizes be?

105
Page Size Issues
Choose a large page size:
  • More loss due to internal fragmentation
  • Assume a process is using 5 regions of memory
    heavily...
  • Will need 5 pages, regardless of page size
  • -> Ties up more memory
Choose a small page size:
  • The page table will become very large
  • Example: Virtual Address Space = 4G bytes,
    Page Size = 4K (e.g., Pentium)
  • Page table size = 1M entries! (4 Mbytes)
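The slide's arithmetic is easy to check. A small sketch, assuming (as the slide does) a flat table with 4-byte entries:

```python
def page_table_entries(va_bits, page_bits):
    """Number of entries in a flat page table: one per virtual page."""
    return 1 << (va_bits - page_bits)

# 4 GB virtual address space with 4 KB pages (the Pentium example):
entries = page_table_entries(32, 12)
assert entries == 1 << 20          # 1M entries
assert entries * 4 == 4 << 20      # 4 Mbytes at 4 bytes per entry
```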
106
Address space organization
  • How big should a virtual address space be?
  • Which regions of the address space should be
    allocated for different purposes - stack, data,
    instructions?
  • What if memory needs for a region increase
    dynamically?
  • What are segments?
  • What is the relationship between segments and
    pages?
  • Can segmentation and paging be used together?
  • If segments are used, how are segment selectors
    incorporated into addresses?

107
Memory protection
  • At what granularity should protection be
    implemented?
  • page-level?
  • segment level?
  • How is protection checking implemented?
  • compare page protection bits with process
    capabilities and operation types on every access
  • sounds expensive!
  • How can protection checking be done efficiently?
  • segment registers
  • protection look-aside buffers

108
Memory protection with paging
  • Associate protection bits with each page table
    entry
  • Read/Write access - can provide read-only access
    for re-entrant code
  • Valid/Invalid bits - tells MMU whether or not the
    page exists in the process address space
  • Page Table Length Register (PTLR) - stores how
    long the page table is to avoid an excessive
    number of unused page table entries

(Diagram: a page table whose entries hold a frame number
plus R/W and V/I bits.)
109
Handling accesses to invalid pages
  • The page table is used to translate logical
    addresses to physical addresses
  • Pages that are not in memory are marked invalid
  • A page fault occurs when there is an access to an
    invalid page of a process
  • Page faults require the operating system to
  • suspend the process
  • find a free frame in memory
  • swap-in the page that had the fault
  • update the page table entry (PTE)
  • restart the process

110
Page fault handling in more detail
  • Hardware traps to kernel
  • General registers saved
  • OS determines which virtual page needed
  • OS checks validity of address, seeks page frame
  • If eviction is needed and the frame is dirty,
    write it to disk

111
Page fault handling in more detail
  • OS brings new page in from disk
  • Page tables updated
  • Faulting instruction backed up to when it began
  • Faulting process scheduled
  • Registers restored
  • Program continues

112
Anatomy of a page fault
(Diagram: an access to page A in logical memory faults
because its page table entry is invalid; the O.S. finds a
free frame, brings the page in from the backing store,
updates the PTE, and restarts the process.)
113
Locking pages in memory
  • An Issue to be aware of
  • Virtual memory and I/O occasionally interact
  • Process issues call for read from device into
    buffer
  • While waiting for the I/O, another process starts
    up
  • It has a page fault
  • The buffer for the first process may be chosen to
    be paged out
  • Need to specify some pages locked (pinned)
  • exempted from being target pages

114
Quiz
  • Why is hardware support required for dynamic
    address translation?
  • What is a page table used for?
  • What is a TLB used for?
  • How many address bits are used for the page
    offset in a system with 2KB page size?

115
Memory protection
  • At what granularity should protection be
    implemented?
  • page-level?
  • A lot of overhead for storing protection
    information for non-resident pages
  • segment level?
  • Coarser grain than pages
  • Makes sense if contiguous groups of pages share
    the same protection status

116
Memory protection
  • How is protection checking implemented?
  • compare page protection bits with process
    capabilities and operation types on every
    load/store
  • sounds expensive!
  • Requires hardware support!
  • How can protection checking be done efficiently?
  • Use the TLB as a protection look-aside buffer
  • Use special segment registers

117
Protection lookaside buffer
  • A TLB is often used for more than just
    translation
  • Memory accesses need to be checked for validity
  • Does the address refer to an allocated segment of
    the address space?
  • If not: segmentation fault!
  • Is this process allowed to access this memory
    segment?
  • If not: segmentation/protection fault!
  • Is the type of access valid for this segment?
  • Read, write, execute?
  • If not: protection fault!

118
Page-grain protection checking with a TLB
119
Segment-grain protection
  • All pages within a segment usually share the same
    protection status
  • So we should be able to batch the protection
    information
  • Why not just use segment-size pages?
  • Segments vary in size
  • Segments change size dynamically (stack, heap
    etc)

120
Segmentation in a single address space
Example: a compiler
121
Segmented address spaces
  • Traditional Virtual Address Space
  • flat address space (1 dimensional)
  • Segmented Address Space
  • Program made of several pieces
  • Each segment is like a mini-address space
  • Addresses within a segment start at zero
  • The program must always say which segment it
    means
  • either embed a segment id in an address
  • or load a value into a segment register
  • Addresses
  • Segment number + offset
  • Each segment can grow independently of others

122
Segmented memory
Each space grows, shrinks independently!
123
Separate instruction and data spaces
(Diagram: one combined address space vs. separate I and D
spaces.)
124
Page sharing
  • In a large multiprogramming system...
  • Some users run the same program at the same time
  • Why have more than one copy of the same page in
    memory???
  • Goal
  • Share pages among processes (not just threads!)
  • Cannot share writable pages
  • If writable pages were shared, processes would
    notice each other's effects
  • Text segment can be shared

125
Page sharing
(Diagram: process 1 and process 2 each have their own
address space and page table; their Stack (rw) and Data
(rw) pages map to private frames, while the Instructions
(rx) pages of both map to the same shared frames in
physical memory.)
126
Page sharing
  • Fork system call
  • Copy the parent's virtual address space
  • ... and immediately do an Exec system call
  • Exec overwrites the calling address space with
    the contents of an executable file (i.e. a new
    program)
  • Desired Semantics
  • pages are copied, not shared
  • Observations
  • Copying every page in an address space is
    expensive!
  • processes can't notice the difference between
    copying and sharing unless pages are modified!

127
Page sharing
  • Idea Copy-On-Write
  • Initialize new page table, but point entries to
    existing page frames of parent
  • Share pages
  • Temporarily mark all pages read-only
  • Share all pages until a protection fault occurs
  • Protection fault (copy-on-write fault)
  • Is this page really read only or is it writable
    but temporarily protected for copy-on-write?
  • If it is writable
  • copy the page
  • mark both copies writable
  • resume execution as if no fault occurred

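The copy-on-write steps above can be sketched with toy page tables. The dict layout and function names are invented for illustration; a real kernel would also reference-count frames so the last owner can reclaim them:

```python
def fork(parent):
    """Share every page: parent and child entries point at the same
    frame, and writable pages are temporarily marked read-only + COW."""
    child = {}
    for vpn, e in parent.items():
        if e["writable"]:
            e["writable"], e["cow"] = False, True
        child[vpn] = {"frame": e["frame"],
                      "writable": e["writable"], "cow": e["cow"]}
    return child

def write_byte(table, vpn, off, val):
    """A store; on a protection fault, decide real fault vs COW fault."""
    e = table[vpn]
    if not e["writable"]:
        if not e["cow"]:
            raise MemoryError("protection fault: read-only page")
        e["frame"] = bytearray(e["frame"])      # copy the page
        e["writable"], e["cow"] = True, False   # our copy is writable
    e["frame"][off] = val                       # resume the store
```

A write through the child copies only the page touched; all other pages stay shared.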
128
On Page replacement..
  • Paging performance
  • Paging works best if there are plenty of free
    frames.
  • If all frames are full of dirty pages...
  • Must perform 2 disk operations (a write-back plus
    a read) for each page fault

129
Page replacement
  • Assume a normal page table
  • User-program is executing
  • A PageInvalidFault occurs!
  • The page needed is not in memory
  • Select some frame and remove the page in it
  • If it has been modified, it must be written back
    to disk
  • the dirty bit in its page table entry tells us
    if this is necessary
  • Figure out which page was needed from the
    faulting addr
  • Read the needed page into this frame
  • Restart the interrupted process by retrying the
    same instruction

130
Page replacement algorithms
  • Which frame to replace?
  • Algorithms
  • The Optimal Algorithm
  • First In First Out (FIFO)
  • Not Recently Used (NRU)
  • Second Chance / Clock
  • Least Recently Used (LRU)
  • Not Frequently Used (NFU)
  • Working Set (WS)
  • WSClock

131
The optimal page replacement algorithm
  • Idea
  • Select the page that will not be needed for the
    longest time

132
Optimal page replacement
  • Replace the page that will not be needed for the
    longest
  • Example

(slide animation: reference trace over pages a-d; the faulting step is marked X)
133
Optimal page replacement
  • Select the page that will not be needed for the
    longest time
  • Example

(slide animation: reference trace over pages a-e; faulting steps are marked X)
134
The optimal page replacement algorithm
  • Idea
  • Select the page that will not be needed for the
    longest time
  • Problem
  • Can't know the future of a program
  • Can't know when a given page will be needed next
  • The optimal algorithm is unrealizable

135
The optimal page replacement algorithm
  • However
  • We can use it as a control case for simulation
    studies
  • Run the program once
  • Generate a log of all memory references
  • Use the log to simulate various page replacement
    algorithms
  • Can compare others to optimal algorithm

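Given a complete reference log, the optimal (Belady) policy is straightforward to simulate after the fact; a minimal sketch, with an illustrative interface:

```python
def optimal_faults(refs, nframes):
    """Simulate Belady's optimal replacement over a reference string
    (a list of page names).  Returns the number of page faults."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue                      # hit: nothing to do
        faults += 1
        if len(frames) < nframes:
            frames.append(page)           # free frame available
            continue

        def next_use(p):
            """Position of p's next reference, or infinity if none."""
            try:
                return refs.index(p, i + 1)
            except ValueError:
                return float("inf")       # never used again: ideal victim

        # Evict the resident page whose next use is farthest away
        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults
```

Running other algorithms over the same log and comparing their fault counts to this one is exactly the simulation-study methodology described above.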
136
FIFO page replacement algorithm
  • Always replace the oldest page
  • Replace the page that has been in memory for the
    longest time.

137
FIFO page replacement algorithm
  • Replace the page that was first brought into
    memory
  • Example Memory system with 4 frames

(slide animation: reference trace over pages a-d; the faulting step is marked X)
138
FIFO page replacement algorithm
  • Replace the page that was first brought into
    memory
  • Example Memory system with 4 frames

(slide animation: reference trace over pages a-e; the faulting step is marked X)
139
FIFO page replacement algorithm
  • Replace the page that was first brought into
    memory
  • Example Memory system with 4 frames

(slide animation: reference trace over pages a-e; faulting steps are marked X)
140
FIFO page replacement algorithm
  • Always replace the oldest page.
  • Replace the page that has been in memory for the
    longest time.
  • Implementation
  • Maintain a linked list of all pages in memory
  • Keep it in order of when they came into memory
  • The page at the front of the list is oldest
  • Add new page to end of list

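The linked-list scheme above is just a queue; a small simulation sketch (the interface is illustrative):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults under FIFO replacement."""
    queue, resident, faults = deque(), set(), 0
    for page in refs:
        if page in resident:
            continue                            # hit
        faults += 1
        if len(queue) == nframes:
            resident.discard(queue.popleft())   # evict the oldest page
        queue.append(page)                      # newest goes to the back
        resident.add(page)
    return faults
```

On the reference string a b c d a b e a b c d e this gives 9 faults with 3 frames but 10 with 4: FIFO can fault *more* with more memory, the well-known Belady's anomaly.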
141
FIFO page replacement algorithm
  • Disadvantage
  • The oldest page may be needed again soon
  • Some page may be important throughout execution
  • It will get old, but replacing it will cause an
    immediate page fault

142
Page table referenced and dirty bits
  • Each page table entry (and TLB entry) has a
  • Referenced bit - set by TLB when page read /
    written
  • Dirty / modified bit - set when page is written
  • If TLB entry for this page is valid, it has the
    most up to date version of these bits for the
    page
  • OS must copy them into the page table entry
    during fault handling
  • On Some Hardware...
  • ReadOnly bit but no dirty bit

143
Page table referenced and dirty bits
  • Idea
  • Software sets the ReadOnly bit for all pages
  • When program tries to update the page...
  • A trap occurs
  • Software sets the Dirty Bit and clears the
    ReadOnly bit
  • Resumes execution of the program

144
Not recently used page replacement alg.
  • Use the Referenced Bit and the Dirty Bit
  • Initially, all pages have
  • Referenced Bit = 0
  • Dirty Bit = 0
  • Periodically... (e.g. whenever a timer interrupt
    occurs)
  • Clear the Referenced Bit

145
Not recently used page replacement alg.
  • When a page fault occurs...
  • Categorize each page...
  • Class 1: Referenced = 0, Dirty = 0
  • Class 2: Referenced = 0, Dirty = 1
  • Class 3: Referenced = 1, Dirty = 0
  • Class 4: Referenced = 1, Dirty = 1
  • Choose a victim page from class 1 why?
  • If none, choose a page from class 2 why?
  • If none, choose a page from class 3 why?
  • If none, choose a page from class 4 why?

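Picking a victim from the lowest non-empty class reduces to one comparison key, since the class order is exactly 2·Referenced + Dirty; a sketch with an invented page representation:

```python
def nru_victim(pages):
    """pages: list of (name, referenced, dirty) tuples.
    Returns the name of a page from the lowest non-empty NRU class:
    (R=0,D=0) < (R=0,D=1) < (R=1,D=0) < (R=1,D=1)."""
    def nru_class(p):
        _, r, d = p
        return 2 * r + d      # 0..3, matching classes 1..4 above
    return min(pages, key=nru_class)[0]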
146
Second chance page replacement alg.
  • Modification to FIFO
  • Pages kept in a linked list
  • Oldest is at the front of the list
  • Look at the oldest page
  • If its referenced bit is 0...
  • Select it for replacement
  • Else
  • It was used recently; we don't want to replace it
  • Clear its referenced bit
  • Move it to the end of the list
  • Repeat
  • What if every page was used in last clock tick?
  • Select a page at random

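The second-chance scan can be sketched over a queue of (page, referenced-bit) pairs (representation invented for illustration):

```python
from collections import deque

def second_chance_evict(queue):
    """queue: deque of [page, referenced] pairs, oldest at the front.
    Returns the page to evict; referenced pages get their bit cleared
    and a second chance at the back of the queue."""
    while True:
        page, referenced = queue.popleft()
        if referenced:
            queue.append([page, 0])   # clear the bit, move to the end
        else:
            return page               # old and unreferenced: evict
```

If every page was referenced in the last tick, the first pass clears all the bits and the second pass evicts the oldest page, so the scan always terminates (degenerating to FIFO).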
147
Clock algorithm (same as second chance)
  • Maintain a circular list of pages in memory
  • Set a bit for the page when a page is referenced
  • Clock sweeps over memory looking for a victim
    page that does not have the referenced bit set
  • If the bit is set, clear it and move on to the
    next page
  • Replaces pages that haven't been referenced for
    one complete clock revolution

148
Least recently used algorithm (LRU)
  • Keep track of when a page is used.
  • Replace the page that has been used least
    recently.

149
LRU page replacement
  • Replace the page that hasn't been referenced in
    the longest time

150
LRU page replacement
  • Replace the page that hasn't been referenced in
    the longest time

(slide animation: reference trace over pages a-e; faulting steps are marked X)
151
Least recently used algorithm (LRU)
  • But how can we implement this?
  • Implementation 1
  • Keep a linked list of all pages
  • On every memory reference,
  • Move that page to the front of the list.
  • The page at the tail of the list is replaced.
  • on every memory reference...
  • Not feasible in software

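For simulation purposes, Implementation 1 can be mimicked with an ordered dictionary standing in for the linked list (interface illustrative; real hardware cannot afford this per reference):

```python
from collections import OrderedDict

def lru_faults(refs, nframes):
    """Count page faults under exact LRU.  The OrderedDict plays the
    role of the list: most recently used at the back, victim at the
    front."""
    frames, faults = OrderedDict(), 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)      # "move to the front of the list"
            continue
        faults += 1
        if len(frames) == nframes:
            frames.popitem(last=False)    # evict least recently used
        frames[page] = True
    return faults
```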
152
LRU implementation
  • Take referenced and put at head of list

153
LRU implementation
  • Take referenced and put at head of list

(slide animation: reference trace over pages a-d; list order after each reference)
154
LRU implementation
  • Take referenced and put at head of list

(slide animation: reference trace over pages a-d; the faulting step is marked X)
155
LRU implementation
  • Take referenced and put at head of list

(slide animation: reference trace over pages a-e; faulting steps are marked X)
156
Least recently used algorithm (LRU)
  • But how can we implement this?
  • without requiring every access to be recorded?
  • Implementation 2
  • MMU (hardware) maintains a counter
  • Incremented on every clock cycle
  • Every time a page table entry is used
  • MMU writes the counter value to the entry as a
    timestamp (time of last use)
  • When a page fault occurs
  • Software looks through the page table
  • Identifies the entry with the oldest timestamp

157
Least recently used algorithm (LRU)
  • What if we dont have hardware support?
  • Implementation 3
  • No hardware support
  • Maintain a counter in software
  • On every timer interrupt...
  • Increment counter
  • Run through the page table
  • For every entry that has ReferencedBit = 1
  • Update its timestamp
  • Clear the ReferencedBit
  • Approximates LRU
  • If several have the oldest time, choose one
    arbitrarily

158
Not frequently used algorithm (NFU)
  • Associate a counter with each page
  • On every clock interrupt, the OS looks at each
    page.
  • If the Reference Bit is set...
  • Increment that page's counter; clear the bit.
  • The counter approximates how often the page is
    used.
  • For replacement, choose the page with lowest
    counter.

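A sketch of the NFU bookkeeping (the data layout and function names are invented for illustration):

```python
def nfu_tick(counters, referenced):
    """Clock interrupt: fold each page's referenced bit into its
    counter.  The counter only ever grows -- NFU never forgets."""
    for page in counters:
        if referenced.get(page):
            counters[page] += 1
    referenced.clear()

def nfu_victim(counters):
    """For replacement, choose the page with the lowest count."""
    return min(counters, key=counters.get)
```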
159
Not frequently used algorithm (NFU)
  • Problem
  • Some page may be heavily used
  • --> Its counter is large
  • The programs behavior changes
  • Now, this page is not used ever again (or only
    rarely)
  • This algorithm never forgets!
  • This page will never be chosen for replacement!

160
Modified NFU with aging
  • Associate a counter with each page
  • On every clock tick, the OS looks at each page.
  • Shift the counter right 1 bit (divide its value
    by 2)
  • If the Reference Bit is set...
  • Set the most-significant bit
  • Clear the Referenced Bit
  • 100000 = 32
  • 010000 = 16
  • 001000 = 8
  • 000100 = 4
  • 100010 = 34
  • 111111 = 63

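The shift-and-set rule reproduces the counter sequence on the slide (32, 16, 8, 4, 34, ...); a sketch with 6-bit counters, names invented:

```python
def age_counters(counters, referenced, bits=6):
    """One clock tick of aging: shift every counter right one bit,
    then set the most-significant bit for each page referenced since
    the last tick.  Recent use dominates; old use decays away."""
    top = 1 << (bits - 1)
    for page in counters:
        counters[page] >>= 1
        if referenced.get(page):
            counters[page] |= top
    referenced.clear()
```

Unlike plain NFU, a heavily-used page that goes quiet decays to zero within `bits` ticks and becomes eligible for replacement.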
161
Paged Memory Management
  • Concepts.

162
Working set page replacement
  • Demand paging
  • Pages are only loaded when accessed
  • When process begins, all pages marked INVALID
  • Locality of Reference
  • Processes tend to use only a small fraction of
    their pages
  • Working Set
  • The set of pages a process needs
  • If working set is in memory, no page faults
  • What if you can't get the working set into memory?

163
Working set page replacement
  • Thrashing
  • If you can't get the working set into memory,
    pages fault every few instructions
  • No work gets done

164
Working set page replacement
  • Prepaging (prefetching)
  • Load pages before they are needed
  • Main idea
  • Identify the process's working set
  • How big is the working set?
  • Look at the last K memory references
  • As K gets bigger, more pages needed.
  • In the limit, all pages are needed.

165
Working set page replacement
  • The size of the working set

(graph: working set size as a function of k, the time interval)
166
Working set page replacement
  • Idea
  • Look back over the last T msec of time
  • Which pages were referenced?
  • This is the working set.
  • Current Virtual Time
  • Only consider how much CPU time this process has
    seen.
  • Implementation
  • On each clock tick, look at each page
  • Was it referenced?
  • Yes Make a note of Current Virtual Time
  • If a page has not been used in the last T msec,
  • It is not in the working set!
  • Evict it write it out if it is dirty.

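The clock-tick scan above can be sketched as follows (the per-page dict layout and function name are invented; a real implementation would write dirty pages out before reclaiming them):

```python
def working_set_scan(pages, current_vtime, tau):
    """One scan at a clock tick.  pages maps a page name to
    {"referenced": bool, "last_use": virtual time}.  Pages unused for
    more than tau of this process's virtual time have left the working
    set and are evicted (returned)."""
    evicted = []
    for name, entry in list(pages.items()):
        if entry["referenced"]:
            entry["last_use"] = current_vtime   # still in the working set
            entry["referenced"] = False
        elif current_vtime - entry["last_use"] > tau:
            evicted.append(name)                # aged out of the working set
            del pages[name]
    return evicted
```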
167
Working set page replacement
168
WSClock page replacement algorithm
  • All pages are kept in a circular list (ring)
  • As pages are added, they go into the ring.
  • The clock hand advances around the ring.
  • Each entry contains time of last use.
  • Upon a page fault...
  • If Reference Bit = 1...
  • Page is in use now. Do not evict.
  • Clear the Referenced Bit.
Update the time of last use field.