Title: Memory Management and Virtual Memory
1 Memory Management and Virtual Memory
- Background
- Overlays versus Swapping
- Contiguous Allocation
2 Memory-Management Unit (MMU)
- The run-time mapping from virtual to physical
addresses is done by the memory-management unit
(MMU), which is a hardware device.
- The user program deals with logical addresses; it
never sees the real physical addresses.
- There are two different types of addresses:
- Logical: range 0 to max.
- Physical: range R+0 to R+max, where R is a base
value.
- The user generates only logical addresses and
thinks that the process runs in locations 0 to
max.
3 Memory-Management Unit (MMU) (Cont.)
- In the MMU scheme, the value in the relocation
register is added to every address generated by a
user process at the time it is sent to memory.
- Example: dynamic relocation using a relocation
register.
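The dynamic relocation above can be sketched in a few lines; the register value of 14000 is an illustrative assumption, not a value from the slides:

```python
# Minimal sketch of dynamic relocation: the MMU adds the
# relocation register to every logical address on its way to memory.
RELOCATION_REGISTER = 14000  # base value R, loaded by the O.S. (assumed)

def to_physical(logical_address: int) -> int:
    """Map a logical address to a physical address."""
    return logical_address + RELOCATION_REGISTER

# The process believes it runs at 0..max; memory sees R..R+max.
# to_physical(0) -> 14000, to_physical(346) -> 14346
```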
4 Overlays
- The entire program and data of a process must be
in physical memory for the process to execute.
- The size of a process is therefore limited to the
size of physical memory.
- If a process is larger than the available memory,
a technique called overlays can be used.
- The idea of overlays is to keep in memory only
those instructions and data that are needed at
any given time.
- When other instructions are needed, they are
loaded into the space previously occupied by
instructions that are no longer needed.
- Overlays are implemented by the user; no special
support is needed from the operating system, but
the programming design of an overlay structure is
complex.
5 Overlays Example
- Example: consider a two-pass assembler.
- Pass 1 constructs a symbol table.
- Pass 2 generates machine-language code.
- Assume the following sizes (K = 1024 bytes):
- Pass 1: 70K
- Pass 2: 80K
- Symbol table: 20K
- Common routines: 30K
- Total size: 200K
- To load everything at once, we need 200K of
memory.
6 Overlays Example (Cont.)
- If only 150K is available, we cannot run our
process.
- Notice that Pass 1 and Pass 2 do not need to be
in memory at the same time.
- So, we define two overlays:
- Overlay A: symbol table, common routines, and
Pass 1.
- Overlay B: symbol table, common routines, and
Pass 2.
- We add an overlay driver (10K) and start with
overlay A in memory.
- When Pass 1 finishes, we jump to the overlay
driver, which reads overlay B into memory,
overwriting overlay A, and transfers control to
Pass 2.
7 Overlays Example (Cont.)
- Overlay A needs 130K (20 + 30 + 10 + 70) and
overlay B needs 140K (20 + 30 + 10 + 80), so
either overlay fits in the 150K available.
[Figure: memory layout showing the symbol table,
common routines, and overlay driver shared by
both overlays]
8 Swapping
- A process can be swapped temporarily out of
memory to a backing store, and then brought back
into memory for continued execution.
- Backing store: a fast disk large enough to
accommodate copies of all memory images for all
users; it must provide direct access to these
memory images.
- Roll out, roll in: a swapping variant used for
priority-based scheduling algorithms; a
lower-priority process is swapped out so a
higher-priority process can be loaded and
executed.
- Normally, a process that is swapped out will be
swapped back into the same memory space that it
occupied previously.
- Example: in a multiprogramming environment with
Round-Robin CPU scheduling, when a time slice
expires, the memory manager swaps out the process
that just finished and swaps in another process
to the memory space that has been freed.
9 Schematic View of Swapping
- The major part of swap time is transfer time;
total transfer time is directly proportional to
the amount of memory swapped.
- The context-switch time in a swapping system is
high.
- Modified versions of swapping are found on many
systems, e.g., UNIX and Microsoft Windows.
10 Example 1: Swapping
- Let process P1's size be 100KB, and let the
transfer rate of the hard disk be 1MB/second.
- To transfer P1 (100KB) to or from memory takes
100K / 1000K per second, which is 1/10 second =
100 milliseconds.
- Assume the average latency is 8 milliseconds;
then a swap in or out takes 108 milliseconds.
The total swap (in and out) is 216 milliseconds
for a 100KB process.
- For efficient CPU utilization, we want the
execution time of each process to be long
relative to the swap time.
- In a Round-Robin scheduling algorithm, the time
slice should therefore be larger than 216
milliseconds (from the above example).
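The swap-time arithmetic above can be checked with a small sketch (transfer rate and latency as given in the example):

```python
def swap_time_ms(process_kb: int, rate_kb_per_s: int = 1000,
                 latency_ms: int = 8) -> int:
    """One-way swap time: transfer time plus average disk latency."""
    transfer_ms = process_kb * 1000 // rate_kb_per_s  # KB / (KB/s), in ms
    return transfer_ms + latency_ms

# swap_time_ms(100) -> 108 ms; a full swap (out and in) -> 216 ms
one_way = swap_time_ms(100)
total = 2 * one_way
```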
11 Contiguous Allocation
- Main memory is usually divided into two
partitions:
- One for the resident operating system.
- One for the user processes.
[Figure: main memory split into an O.S. partition
and a user partition]
12 Example 2: Swapping
- A computer system has 1MB of main memory, and
the O.S. takes 100KB.
- The maximum size of a user process is therefore
900KB.
- A user process may be smaller than 900KB.
- From the previous example: if process P1 is 100K,
the swap time is 108 milliseconds; but if P1 is
900K, then the swap time is 908 milliseconds.
- As the size of a process increases, the swap
time increases too.
[Figure: main memory with a 100K O.S. partition
and a 900K user partition]
13 Single-Partition Allocation
- Single-partition allocation needs protection:
- Protect O.S. code and data from changes by the
user processes.
- Protect user processes from one another.
- We can provide protection by using a relocation
register together with a limit register.
- The relocation register contains the value of
the smallest physical address.
- The limit register contains the range of logical
addresses.
- Each logical address must be less than the limit
register.
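A minimal sketch of this protection scheme, assuming the 100K/900K partition sizes from Example 2 (the register values are illustrative):

```python
LIMIT = 900 * 1024        # range of legal logical addresses (assumed)
RELOCATION = 100 * 1024   # smallest physical address of the partition

def translate(logical: int) -> int:
    """MMU check-and-relocate: trap if the address is out of range."""
    if logical >= LIMIT:
        # the MMU traps to the O.S. on an addressing error
        raise MemoryError("trap: addressing error")
    return logical + RELOCATION

# translate(0) -> 102400; translate(900 * 1024) traps
```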
14 Hardware Support for Relocation and Limit Registers
15 Multiple-Partition Allocation
- Several user processes reside in memory at the
same time.
- Memory can be divided into a number of
fixed-sized partitions, where each partition may
contain exactly one process. Therefore, the
degree of multiprogramming is bounded by the
number of partitions.
- Alternatively, all memory is available for user
processes as one large block (hole).
- When a process arrives and needs memory, we
search for a hole large enough for this process.
- If we find a space, we allocate only as much
memory as is needed, keeping the rest available
to satisfy future requests.
16 Multiple-Partition Allocation (Cont.)
- Hole: a block of available memory; holes of
various sizes are scattered throughout memory.
- The operating system maintains information about:
a) allocated partitions, and b) free partitions
(holes).
17 Multiple-Partition Allocation (Cont.)
- When no available block of memory (hole) is large
enough to hold a process, the O.S. waits until a
large enough block is available.
- In general, there is at any time a set of holes,
of various sizes, scattered throughout memory.
- When a process arrives and needs memory, we
search this set for a hole that is large enough
for this process.
- If the hole is too large, it is split into two:
- One part is allocated to the arriving process.
- The other is returned to the set of holes.
- When a process terminates, it releases its block
of memory, which is then placed back in the set
of holes.
- If the new hole is adjacent to other holes, we
merge these adjacent holes to form one larger
hole.
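The split-on-allocate and merge-on-free behavior can be sketched as follows, assuming holes are kept as a sorted list of (start, size) pairs (the representation is an assumption):

```python
def allocate(holes, size):
    """Split the first sufficiently large hole; return the start address."""
    for i, (start, hole_size) in enumerate(holes):
        if hole_size >= size:
            if hole_size == size:
                holes.pop(i)  # the whole hole is consumed
            else:
                # split: the leftover part stays in the hole set
                holes[i] = (start + size, hole_size - size)
            return start
    return None  # no hole is large enough: the request must wait

def free(holes, start, size):
    """Return a block to the hole set, merging adjacent holes."""
    holes.append((start, size))
    holes.sort()
    merged = [holes[0]]
    for s, sz in holes[1:]:
        last_s, last_sz = merged[-1]
        if last_s + last_sz == s:  # adjacent holes coalesce into one
            merged[-1] = (last_s, last_sz + sz)
        else:
            merged.append((s, sz))
    holes[:] = merged

# holes = [(0, 100)]; allocate(holes, 30) returns 0 and leaves (30, 70);
# freeing (0, 30) merges back into a single (0, 100) hole.
```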
18 Dynamic Storage-Allocation Problem
- First-fit, best-fit, and worst-fit are the most
common strategies used to select a free hole from
the set of available holes.
- First-fit: allocate the first hole that is big
enough. Searching starts at the beginning of the
set of holes; we can stop as soon as we find a
free hole that is large enough.
- Best-fit: allocate the smallest hole that is big
enough; we must search the entire list, unless it
is ordered by size. This strategy produces the
smallest leftover hole.
- Worst-fit: allocate the largest hole; we must
also search the entire list, unless it is ordered
by size. This strategy produces the largest
leftover hole, which may be more useful than the
smaller leftover hole from the best-fit approach.
- First-fit and best-fit are better than worst-fit
in terms of speed and storage utilization.
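The three strategies can be compared on one set of holes; the hole sizes below are illustrative assumptions, not from the slides:

```python
def first_fit(holes, size):
    """Index of the first hole that is big enough."""
    return next((i for i, h in enumerate(holes) if h >= size), None)

def best_fit(holes, size):
    """Index of the smallest hole that is big enough."""
    fits = [(h, i) for i, h in enumerate(holes) if h >= size]
    return min(fits)[1] if fits else None

def worst_fit(holes, size):
    """Index of the largest hole."""
    fits = [(h, i) for i, h in enumerate(holes) if h >= size]
    return max(fits)[1] if fits else None

holes = [100, 500, 200, 300, 600]  # hole sizes in KB (assumed)
# For a 212KB request: first-fit picks the 500KB hole (index 1),
# best-fit the 300KB hole (index 3, smallest leftover),
# worst-fit the 600KB hole (index 4, largest leftover).
```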
19 Fragmentation
- External fragmentation: enough total memory space
exists to satisfy a request, but it is not
contiguous; storage is fragmented into a large
number of small holes.
- Example: we have a total external fragmentation
of (300 + 260) = 560KB.
- If P5 is 500KB, this space would be large enough
to run P5, but the space is not contiguous.
- The selection of first-fit versus best-fit can
affect the amount of fragmentation.
[Figure: memory map with the O.S. at 0-400K, P1 at
400-1000K, P4 at 1000-1700K, a free block at
1700-2000K, P3 at 2000-2300K, and a free block at
2300-2560K]
20 Fragmentation (Cont.)
- Internal fragmentation: memory that is internal
to a partition but is not being used, because it
is too small to be worth tracking.
- Example: assume the next request is for 18,462
bytes and the available hole is 18,464 bytes.
- If we allocate exactly the requested block, we
are left with a hole of 2 bytes.
- The overhead to keep track of this hole would be
larger than the hole itself, so we ignore this
small hole (internal fragmentation).
[Figure: memory layout with the O.S., P7, a free
hole of 18,464 bytes, and P4]
21 Virtual Memory
- Background
- Demand Paging
- Performance of Demand Paging
- Page Replacement
- Page-Replacement Algorithms
22 Background (Cont.)
- Virtual memory is the separation of user logical
memory from physical memory. This separation
allows an extremely large virtual memory to be
provided for programmers when only a smaller
physical memory is available.
- Only part of the program needs to be in memory
for execution.
- The logical address space can therefore be much
larger than the physical address space.
- We need to allow pages to be swapped in and out.
- Virtual memory can be implemented via:
- Demand paging
- Demand segmentation
23 Diagram: Virtual Memory Larger than Physical Memory
[Figure: pages 0 to n of virtual memory mapped
through a memory map to a smaller physical memory]
24 Transfer of Paged Memory to Contiguous Disk Space
[Figure: pages transferred between main memory and
contiguous disk space]
25 Valid-Invalid Bit
- We need hardware support to distinguish between
the pages that are in memory and the pages that
are on the disk.
- The valid-invalid bit scheme can be used:
- Valid indicates that the associated page is
both legal and in memory.
- Invalid indicates that the page either is not
valid (not in the logical address space) or is
valid but is currently on the disk.
- What happens if the process tries to use a page
that was not brought into memory?
- Access to a page marked invalid causes a
page-fault trap.
26 Valid-Invalid Bit (Cont.)
- With each page-table entry a valid-invalid bit
is associated (1 means in memory, 0 means not in
memory).
- Initially, the valid-invalid bit is set to 0 on
all entries.
- Example: a page-table snapshot (next slide).
- During address translation, if the valid-invalid
bit in a page-table entry is 0, a page fault
occurs.
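Translation with a valid-invalid bit can be sketched using the page-table snapshot from the next slide (pages 0, 2, 5 in frames 4, 6, 9); the 1K page size is an assumption, as the slides do not give one:

```python
PAGE_SIZE = 1024  # assumed page size, for illustration

# page table: page number -> (frame number, valid bit)
page_table = {0: (4, 1), 1: (None, 0), 2: (6, 1), 3: (None, 0),
              4: (None, 0), 5: (9, 1), 6: (None, 0), 7: (None, 0)}

def translate(logical: int) -> int:
    """Translate a logical address, trapping on an invalid entry."""
    page, offset = divmod(logical, PAGE_SIZE)
    frame, valid = page_table[page]
    if not valid:
        raise LookupError("page fault")  # trap: page not in memory
    return frame * PAGE_SIZE + offset

# translate(0) -> 4096 (page 0 in frame 4); page 1 raises a page fault
```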
27 Page Table when Some Pages Are Not in Main Memory
Logical memory: pages 0-7 hold A, B, C, D, E, F,
G, H.

Page table (frame number, valid-invalid bit):
page 0: frame 4, v
page 1: -, i
page 2: frame 6, v
page 3: -, i
page 4: -, i
page 5: frame 9, v
page 6: -, i
page 7: -, i

Physical memory: frame 4 holds A, frame 6 holds C,
frame 9 holds F; the remaining pages are on disk.
28 Steps in Handling a Page Fault (Cont.)
- We check an internal table for this process to
determine whether the reference was a valid or an
invalid memory access.
- If the reference was invalid, we terminate the
process. If it was valid, but we have not yet
brought in that page, we now page it in.
- We find a free frame.
- We schedule a disk operation to read the desired
page into the newly allocated frame.
- When the disk read is complete, we modify the
internal table kept with the process and the page
table to indicate that the page is now in memory.
- We restart the instruction that was interrupted
by the illegal-address trap. The process can now
access the page as though it had always been in
memory.
29 What Happens if There Is No Free Frame?
- Page replacement: find some page in memory that
is not really in use and swap it out.
- We need an algorithm to choose that page.
- Performance: we want an algorithm that results in
the minimum number of page faults.
- The same page may be brought into memory several
times.
30 Performance of Demand Paging
- Effective Access Time (EAT) for a demand-paged
memory.
- Memory access time (ma) for most computers now
ranges from 10 to 200 nanoseconds.
- If there is no page fault, then EAT = ma.
- If there is a page fault, then
- EAT = (1 - p) x ma + p x (page-fault time),
- where p is the probability of a page fault
(0 ≤ p ≤ 1);
- we expect p to be close to zero (a few page
faults).
- If p = 0, there are no page faults; if p = 1,
every reference is a fault.
- If a page fault occurs, we must first read the
relevant page from disk, and then access the
desired word.
31 Performance of Demand Paging (Cont.)
- We are faced with three major components of the
page-fault service time:
- Service the page-fault interrupt.
- Read in the page.
- Restart the process.
- A typical hard disk has:
- An average latency of 8 milliseconds.
- A seek time of 15 milliseconds.
- A transfer time of 1 millisecond.
- Total paging time = (8 + 15 + 1) = 24
milliseconds, including hardware and software
time, but no queuing (wait) time.
32 Demand Paging Example 1
- Assume an average page-fault service time of 25
milliseconds (25 x 10^-3 seconds) and a memory
access time of 100 nanoseconds (100 x 10^-9
seconds). Find the Effective Access Time.
- Solution: Effective Access Time (EAT)
= (1 - p) x ma + p x (page-fault time)
= (1 - p) x 100 + p x 25,000,000
= 100 - 100 x p + 25,000,000 x p
= 100 + 24,999,900 x p nanoseconds.
- Note: the Effective Access Time is directly
proportional to the page-fault rate.
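The EAT arithmetic above can be checked directly (all times in nanoseconds; 25 ms = 25,000,000 ns):

```python
def eat_ns(p: float, ma: int = 100, fault_ns: int = 25_000_000) -> float:
    """Effective access time: EAT = (1 - p) * ma + p * fault time."""
    return (1 - p) * ma + p * fault_ns

# eat_ns(0) -> 100 ns (no faults); even p = 0.001 pushes EAT to
# roughly 25,100 ns, a 250-fold slowdown.
```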
33 Page Replacement
- Example: assume each process contains 10 pages
but uses only 5 of them.
- If we had 40 frames in physical memory, we could
then run 8 processes instead of 4, increasing the
degree of multiprogramming.
- If we run 6 processes (each of which is 10 pages
in size) that each use only 5 pages, we have
higher CPU utilization and throughput, with 10
frames to spare (i.e., 6 x 5 = 30 frames needed
out of 40).
- However, it is possible that each process tries
to use all 10 of its pages, resulting in a need
for 60 frames when only 40 are available.
34 Need for Page Replacement
[Figure: logical memory and page tables for two
users. User 1's pages (H, Load M, J, M) map to
frames 3, 4, and 5, with page 3 (M) marked
invalid; user 2's pages (A, B, D, E) map to
frames 6, 2, and 7, with page 1 (B) marked
invalid. Frame 0 of physical memory holds the
monitor. A reference to an invalid page requires
page replacement.]
35 The Operating System Has Several Options
- Terminate the user process.
- Swap out a process, freeing all its frames and
reducing the level of multiprogramming.
- Page replacement takes the following approach:
- If no frame is free, find one that is not
currently being used and free it.
- We can free a frame by writing its contents to
swap space and changing the page table to
indicate that the page is no longer in memory.
36 Page Replacement
[Figure: a victim page is swapped out of physical
memory to disk and its page-table entry is marked
invalid; the desired page is swapped into the
freed frame and its entry is marked valid]
37 Page Replacement (Cont.)
- The page-fault service time is now modified to
include page replacement:
- Find the location of the desired page on the
disk.
- Find a free frame:
- If there is a free frame, use it.
- Otherwise, use a page-replacement algorithm to
select a victim frame.
- Write the victim page to the disk; change the
page and frame tables accordingly.
- Read the desired page into the newly freed frame;
change the page and frame tables.
- Restart the user process.
38 Page Replacement (Cont.)
- Note: if no frames are free, two page transfers
(one out and one in) are required. This doubles
the page-fault service time and increases the
effective access time accordingly.
- This overhead can be reduced by the use of a
modify (dirty) bit.
- Each page or frame may have a modify bit
associated with it in the hardware.
- To implement demand paging, we must develop:
- A frame-allocation algorithm, to decide how many
frames to allocate to each process.
- A page-replacement algorithm, to select the
frames that are to be replaced.
39 Page-Replacement Algorithms
- We want a page-replacement algorithm with the
lowest page-fault rate.
- We evaluate an algorithm by running it on a
particular string of memory references and
computing the number of page faults on that
string.
- The string of memory references is called a
reference string.
- We can generate reference strings by tracing a
given system and recording the address of each
memory reference.
- In our examples, the reference string is
- 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.
40 Page-Replacement Algorithms (Cont.)
- Example: if we trace a particular process, we
might record the following address sequence:
- 0100, 0432, 0101, 0102, 0609, 0601, 0612
- At 100 bytes per page, this reduces to the
reference string 1, 4, 1, 6.
- As the number of frames available increases, the
number of page faults decreases.
- From the above example: if we had 3 or more
frames, we would have only 3 faults, one fault
for the first reference to each page. If we had
only one frame, we would have a replacement with
every reference, resulting in 4 faults.
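The reduction of an address trace to a reference string can be sketched as follows, assuming (as in the example) 100 bytes per page:

```python
def to_reference_string(addresses, page_size=100):
    """Map addresses to page numbers, dropping consecutive repeats."""
    refs = []
    for addr in addresses:
        page = addr // page_size
        if not refs or refs[-1] != page:  # an immediate re-reference to
            refs.append(page)             # the same page cannot fault
    return refs

trace = [100, 432, 101, 102, 609, 601, 612]
# to_reference_string(trace) -> [1, 4, 1, 6]
```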
41 First-In-First-Out (FIFO) Algorithm
- A FIFO algorithm associates with each page the
time when that page was brought into memory.
- When a page must be replaced, the oldest page is
chosen.
- Alternatively, we can use a FIFO queue to hold
all pages in memory:
- We replace the page at the head of the queue.
- When a page is brought into memory, we insert it
at the tail of the queue.
42 Example: FIFO Algorithm
- Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3,
4, 5.
- Let 3 frames be initially empty (3 pages can be
in memory at a time per process).
- The first 3 references (1, 2, 3) cause page
faults and are brought into these empty frames;
with 3 frames the string causes 9 faults in all.
- With 4 frames, the same string causes 10 faults:
more frames, yet more faults (Belady's anomaly).
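A FIFO simulation over this reference string shows the fault counts (and Belady's anomaly) directly:

```python
from collections import deque

def fifo_faults(refs, n_frames):
    """Count page faults under FIFO replacement."""
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:
                frames.remove(queue.popleft())  # evict the oldest page
            frames.add(page)
            queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
# fifo_faults(refs, 3) -> 9; fifo_faults(refs, 4) -> 10
```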
43 Optimal (OPT) Algorithm
- An optimal algorithm has the lowest page-fault
rate of all algorithms.
- An optimal algorithm will never suffer from
Belady's anomaly.
- Replace the page that will not be used for the
longest period of time.
- This algorithm guarantees the lowest possible
page-fault rate for a fixed number of frames.
- The optimal algorithm is difficult to implement,
because it requires future knowledge of the
reference string.
- This is a similar situation to Shortest-Job-First
in CPU scheduling.
44 Example: OPT Algorithm
- Initially, 4 frames are empty.
- Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3,
4, 5.
- How do you know which page will not be used for
the longest time? In general, you cannot; OPT is
used for measuring how well a practical algorithm
performs.
45 Least Recently Used (LRU) Algorithm
- The key distinction between the FIFO and OPT
algorithms is that FIFO uses the time when a page
was brought into memory, while OPT uses the time
when a page is to be used (the future).
- The LRU algorithm replaces the page that has not
been used for the longest period of time (the
past).
- LRU replacement associates with each page the
time of that page's last use.
46 Example: LRU Algorithm
- LRU looks backward in time.
- Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3,
4, 5 (with 4 frames).
- Note the number of page faults on the same
reference string:
- LRU: 8.
- FIFO: 10.
- OPT: 6.
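A sketch of LRU and OPT on the same reference string with 4 frames reproduces the fault counts quoted above (LRU 8, OPT 6):

```python
def lru_faults(refs, n_frames):
    """Count faults under LRU; frames is ordered from LRU to MRU."""
    frames, faults = [], 0
    for page in refs:
        if page in frames:
            frames.remove(page)  # refresh recency on a hit
        else:
            faults += 1
            if len(frames) == n_frames:
                frames.pop(0)    # evict the least recently used page
        frames.append(page)
    return faults

def opt_faults(refs, n_frames):
    """Count faults under OPT, which looks at future references."""
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:
                rest = refs[i + 1:]
                # evict the page whose next use is farthest away
                # (or that is never used again)
                def next_use(p):
                    return rest.index(p) if p in rest else len(refs)
                frames.remove(max(frames, key=next_use))
            frames.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
# lru_faults(refs, 4) -> 8; opt_faults(refs, 4) -> 6
```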