Title: Chapter 8.1: Memory Management
Chapter 8 Memory Management
- Chapter 8.1
  - Background
  - Swapping
  - Contiguous Allocation
- Chapter 8.2
  - Paging
- Chapter 8.3
  - Segmentation
  - Segmentation with Paging
Background
- To keep the CPU busy executing programs, a program must be brought into memory before it can run.
- A number of memory management schemes are available, the most prominent being paging and segmentation; the choice among them depends on many factors, and each is strongly influenced by the hardware design needed to support it.
- Each of the major approaches to managing memory requires its own hardware organization and supporting data structures.
Introduction
- Fact: to keep the CPU executing, a number of processes must be in memory at any given instant.
- Memory is organized as an array of words or bytes, each of which is directly addressable. (We use hexadecimal addresses.)
- A program counter, in the control unit of the CPU, contains the address of the next instruction to fetch and execute.
- The execution of an instruction may involve a number of memory fetches: not just the instruction itself, but often operands that must also be fetched and manipulated.
- When an instruction such as "Add X to Y" is executed, both X and Y must be fetched as part of its execution. After the instruction executes, the result must be stored back into memory (at Y).
- Memory units see only streams of addresses that need to be accessed.
- We are interested in the continuous sequence of memory addresses generated by the running program, each of which requires a memory access.
- To fully appreciate how memory is accessed, we need to start with the hardware, since the hardware drives how memory is accessed.
Basic Hardware
- Your book describes main memory as if it were built into the processor itself. I differ with this view.
- The CPU does consist of registers and control logic for executing a process, but I consider main memory a separate, physical unit.
- Instructions contain an opcode that indicates the nature of the operation to be executed, plus the addresses of any data needed to execute the instruction.
- If an instruction refers to something on, say, disk, then those data must be fetched and moved into primary memory before they can be directly accessed and processed in the CPU.
- The time it takes the CPU clock to tick once is referred to as a clock cycle.
Clocks, Executions, and Timing
- In the CPU, an instruction can be decoded and many register-type operations can take place within one tick of the clock.
- So once the data are in the processor, things occur very quickly.
- But accesses to main memory, such as fetching the values of X and Y to add, require a number of clock cycles.
- We don't like to stall the CPU, so we normally add a faster layer of memory between the processor and main memory called cache memory. (We spoke about this in earlier chapters.)
- Also, the operating system must be protected from inadvertent access by the address space of the executing user process.
- So, whenever an address is generated, care must be taken to ensure that it does not fall within other, protected areas.
- Is the address located within the operating system's area? Within some other protected area?
- This leads us to the concept of contiguous allocation and base and limit registers.
A base and a limit register define a logical address space
Memory is arranged with the OS and its associated support in low memory; user processes share higher memory. Each process has two registers associated with it: one contains the starting (base) address of the process, and the second contains the limit (or range). See the figure: all addresses developed during execution that fall between these values are in range and addressable. These are okay. Any address outside this range is considered a fatal error, and attempting to access such an address results in a trap to the operating system.
HW address protection with base and limit registers
Base and limit registers are loaded only by the operating system, which uses special privileged instructions available only in kernel mode. Since only the operating system can load these registers, the OS retains control over users' memory and user programs.
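To make the check concrete, here is a minimal sketch in C of the test the protection hardware applies to every address the CPU issues; the struct and function names are illustrative, not from any real API.

/* Sketch of the base/limit check done in hardware; the registers
   themselves can be loaded only in kernel mode. */
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t addr_t;

struct protection {
    addr_t base;    /* first address the process may touch */
    addr_t limit;   /* size of the process's address range */
};

/* Legal iff base <= addr < base + limit; a real MMU would raise a
   trap to the operating system instead of returning false. */
static bool access_ok(const struct protection *p, addr_t addr)
{
    return addr >= p->base && addr < p->base + p->limit;
}

User-mode code can only trigger this check; it can never change the registers, which is exactly what gives the OS its control.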
Address Binding
- Programs normally reside on disk in executable form, as some kind of executable file (.exe, .dll, .class, and others).
- A program must be brought into memory for it to be run.
- Input queue: the collection of processes on disk that are waiting to be brought into memory so they can run.
- User programs go through several steps before being run. Sometimes the program (or parts of the program) may be moved back to disk during its execution.
- Briefly, some of the steps a program must go through before being run are captured on the next slide.
Multi-step Processing of a User Program
Discuss briefly.
Binding of Instructions and Data to Memory
Addresses in a source program are usually symbolic, as in "Add X to Y." A compiler, among other things, must bind (map, or associate) these logical/symbolic addresses to physical/relocatable addresses in primary memory. A binding is a mapping from one address space to another, such as "X is mapped to location 74014 in primary memory." Address binding of instructions and data to memory addresses can happen at three different stages.
Where / When Binding May Occur
- Compile time: if the memory location is known a priori, absolute code can be generated; the code must be recompiled if the starting location changes.
- It does sometimes happen that certain items must be found at specific memory locations.
- Load time: relocatable code must be generated if the memory location is not known at compile time. Binding is delayed until load time.
- This is very frequently the case.
- Execution time: binding is delayed until run time if the process can be moved during its execution from one memory segment to another.
- This needs hardware support for address maps (e.g., base and limit registers).
- These various binding schemes, and their required hardware support, constitute much of what we discuss in this chapter. (A small demonstration follows below.)
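As a small demonstration of the difference between the stages, the C program below prints the address of a global variable. On systems with address-space layout randomization (an assumption about the machine you run it on), the printed address typically changes from run to run, showing that the symbol's binding to an actual address was not fixed at compile time.

#include <stdio.h>

int x = 42;   /* a symbolic name in the source program */

int main(void)
{
    /* With ASLR enabled, this address usually differs between runs:
       x is bound to an actual address at load/execution time. */
    printf("x lives at %p\n", (void *)&x);
    return 0;
}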
Logical vs. Physical Address Space
- The concept of a logical address space that is bound to a separate physical address space is central to proper memory management.
- Logical address: an address generated by the CPU; also referred to as a virtual address.
- Physical address: the address seen by the memory unit.
- Compile-time and load-time address-binding methods generate identical logical and physical addresses.
- But the execution-time address-binding scheme results in differing logical and physical addresses.
- We refer to these logical addresses as virtual addresses. In this text we use the terms logical address and virtual address interchangeably.
- The set of all logical addresses generated by a program is a logical address space.
- The set of all physical addresses corresponding to these logical addresses is a physical address space.
- In the execution-time address-binding scheme, the logical and physical address spaces differ. Much more on virtual memory in Chapter 9.
Memory-Management Unit (MMU)
- The MMU is a hardware device that maps virtual addresses to physical addresses.
- In such an MMU scheme, the value in the relocation register is added to every address generated by a user process at the time the address is sent to memory.
- We will see this on the next slide.
- The user program deals with logical addresses; it never sees the real physical addresses.
- Here's a simple scheme as an example.
Dynamic Relocation using a Relocation Register
The base register is now called a relocation register. The value in the relocation register is simply added to every address generated by the user process at the time it is sent to primary memory. This reasonably simple scheme was used by MS-DOS running on the Intel 80x86 family of chips. The user deals only with logical addresses; memory mapping occurs only when memory accesses are made. Users generate only logical addresses, and each of these logical addresses must be mapped into a physical address before a memory access may be undertaken. Logical-to-physical address mapping is central to good memory management.
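A minimal sketch of this mapping in C, combining the relocation register with the limit check from the earlier slides; the relocation value 14000 and limit 3000 are purely illustrative numbers.

#include <stdint.h>
#include <stdio.h>

typedef uint32_t addr_t;

struct mmu {
    addr_t relocation;  /* where the process was loaded */
    addr_t limit;       /* size of its logical space    */
};

/* Map a logical address to a physical one; -1 stands in for the
   trap a real MMU would raise on an out-of-range address. */
static int64_t translate(const struct mmu *m, addr_t logical)
{
    if (logical >= m->limit)
        return -1;                       /* trap: out of range */
    return (int64_t)m->relocation + logical;
}

int main(void)
{
    struct mmu m = { .relocation = 14000, .limit = 3000 };
    printf("%lld\n", (long long)translate(&m, 346));   /* -> 14346 */
    printf("%lld\n", (long long)translate(&m, 4000));  /* -> trap  */
    return 0;
}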
Dynamic Loading
- Very important concept.
- So far, everything (process and data) has had to be located entirely in main memory for execution.
- This is very wasteful, since large parts of a process may never be executed in a given run, yet they still occupy space!
- The size of a process is also limited by the size of physical memory.
- With dynamic loading, a routine is not loaded until it is called.
- This provides better memory-space utilization; an infrequently used routine may never be loaded.
- Routines are maintained in executable format on disk.
- Useful when large amounts of code are needed to handle infrequently occurring cases.
- No special support from the OS is required; dynamic loading is implemented through program design.
Dynamic Loading - more
- The main routine is loaded into memory.
- When a routine is needed, the calling routine first checks to see whether the desired routine is already in memory.
- If the desired routine is not in memory, a relocatable linking loader is called to load it and to update the program's address tables.
- Control can then be passed to the newly loaded routine. (A sketch follows below.)
Dynamic Linking
- Dynamic linking is another way to create executables.
- Here the compiler produces an object module, which is not yet executable.
- This object module is stored on disk with a "stub" for each language library routine the module might need in order to form a real executable.
- In static linking, language libraries, such as those that carry out input/output, contain required supplementary code and are combined with the object module by a linkage editor into what is called a load module.
- With dynamic linking, producing a final binary executable is postponed until execution time.
- The load module can be stored on disk, just as the object module (compiled but not yet executable) may be stored on disk and then linked prior to execution.
Dynamic Linking - more
- When creating the object module, the compiler includes a stub that indicates how and where to find the library routine (if it is currently in memory), or how to have the OS load the library routine if it is not.
- The stub replaces itself with the address of the loaded (or found) library routine.
- This approach, unlike dynamic loading, does require help from the OS.
- Clearly, only the OS can check whether the desired routine is in memory.
- So there is some overhead the first time the library routine is fetched and loaded into memory, but subsequent executions of the routine require only a branch to it plus its execution.
- Only one copy of the routine becomes memory resident.
- Dynamic linking is particularly useful for libraries.
- I used this approach a lot in years past with IBM mainframe operating systems. (A stub sketch follows below.)
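A hedged sketch of the stub idea in C: a function pointer starts out aimed at a stub; on the first call the stub locates the real routine (here via POSIX dlopen/dlsym against libm.so.6, the math library's soname on a typical Linux system), patches the pointer, and forwards the call. This is an illustration of the concept, not the loader's actual mechanism.

#include <stdio.h>
#include <dlfcn.h>

static double stub(double x);               /* forward declaration */
static double (*cosine)(double) = stub;     /* starts at the stub  */

/* First call: find the real routine, patch the pointer, forward. */
static double stub(double x)
{
    void *libm = dlopen("libm.so.6", RTLD_LAZY);
    double (*real)(double) =
        libm ? (double (*)(double))dlsym(libm, "cos") : NULL;
    if (!real) {
        fprintf(stderr, "load failed: %s\n", dlerror());
        return 0.0;
    }
    cosine = real;   /* the stub "replaces itself" with the address */
    return cosine(x);
}

int main(void)
{
    printf("%f\n", cosine(0.0));  /* goes through the stub once     */
    printf("%f\n", cosine(1.0));  /* now branches straight to cos   */
    return 0;
}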
Process Swapping
- A process can be swapped temporarily out of memory to some kind of high-speed backing store, and then brought back into memory later for continued execution.
- Backing store: a fast disk, large enough to accommodate copies of all memory images for all users, which must provide direct access to these memory images.
- The memory manager can swap out a process and swap a new process into the space freed by the swapped-out process.
- We must also be very certain that there is another ready process for the CPU when one process uses up its time quantum.
- We also want to ensure that the CPU can do a good bit of computing while swapping goes on in the background.
- One timing metric for the transfer is the total transfer time, which is directly proportional to the amount of memory swapped.
- Modified versions of swapping are found on many systems (e.g., UNIX, Linux, and Windows).
Schematic View of Swapping
Address binding may require that a swapped-out process return to the same location it previously occupied. If binding is done at load or assembly time, the process cannot easily be moved to a different location. With execution-time binding, a process may be returned to a different area of memory, because the physical addresses are developed during execution.
There are Issues with Swapping
- Any kind of efficient swapping requires a very fast backup disk store, with no time spent on head selection on the disk.
- The backing store must be large enough for all memory images and must, of course, be directly accessible.
- There must be a ready queue of all processes whose memory images are on the backing store or in memory, ready to be executed.
- When the CPU scheduler looks for the next task to execute, the dispatcher first checks whether that process is currently in memory. If so, fine: go there.
- If not, and there is no free memory region, the dispatcher may swap out a process currently in memory and swap in a process from the backing store.
Swapping and Overhead
- Swapping takes a good bit of time (see your book).
- For a 10 MB process, the swap time (bringing in one process and swapping out another) may come to about 516 msec, which is roughly half a second! (A worked estimate follows below.)
- This is a lot of time! A lot of CPU time!!!
- This implies that the CPU scheduling algorithm should have a time quantum significantly larger than 0.5 sec to make up for this overhead.
- Then again, just how much swapping do we allow?
Issues with Swapping - 2
- Another issue: the process to be swapped must be completely idle.
- But what if there is pending I/O?
- If the I/O is taking place asynchronously and will access user I/O buffers, data could conceivably be delivered into an area now occupied by a newly swapped-in process!!!
- Not good!!
- Heuristic: never swap out a process that is awaiting I/O.
- Swapping is still used in a few systems, but its overhead is high.
- It is often not considered a great memory management scheme, but there are modifications to swapping that improve performance.
Contiguous Allocation
- We now discuss how memory is allocated to support both the operating system and user processes.
- This first scheme is called contiguous allocation.
- Memory is divided into two partitions.
- The OS is normally allocated the low memory addresses, because the interrupt vector (the vector of addresses of interrupt handlers) is traditionally located in low memory.
- Moreover, there is certain code that is address bound.
- In this memory management scheme, each user process occupies a single, contiguous range of addresses in memory.
- This means that an entire process address space consists of contiguous memory locations, with no holes in its area.
- The next several slides address contiguous memory allocation.
Memory Mapping and Protection
- We protect the operating system and user processes via the relocation register we've discussed earlier, together with a limit register.
- Essentially, each logical address is the displacement of a declared variable or instruction from the beginning of the program, and it must be less than the value in the limit register.
- The MMU adds the relocation register (the location where the process's executable code was loaded) to each in-range logical address as it is generated, dynamically mapping it to a physical location.
- Also, please note that the values in the relocation and limit registers must be saved and restored as part of context switching.
- Even the operating system can grow and shrink, as not all routines must be physically in memory all the time. We call such routines transient operating system code.
- E.g., device drivers or other operating system services that might not be used frequently can be rolled in and out as needed.
Memory Allocation
- This simple form of memory management consists of dividing memory into several fixed-sized partitions.
- Note: not all partitions are identical in size.
- Each partition (whatever its size) contains exactly one process.
- All addresses within a partition are contiguous.
- When a partition becomes free, it becomes available to another process.
Memory Allocation - more
- Moving beyond fixed partitions, the system keeps a table of which parts of memory are available and which are occupied.
- Hole: a block of available memory; holes of various sizes are scattered throughout memory.
- When a process arrives, it is allocated memory from a hole large enough to contain it.
- The rest of the hole remains available and can be re-allocated to another process.
- At any instant, we have a list of available block sizes and an input queue of waiting processes.
- Memory is allocated to processes until some process requires more memory than is available.
- The OS can either wait until space becomes available (letting existing programs run) or skip down the input queue to see whether another waiting process can be serviced by the available holes.
- Unfortunately, with this scheme we can end up with a set of holes of various sizes scattered all over memory.
- Freed process space is combined with adjacent available memory to form larger holes.
- This process is called (in general) dynamic storage allocation.
- There are many solutions to the problems arising from hole sizes as we strive to maximize throughput in the system.
(Figure: successive memory snapshots with the OS in low memory; processes 5, 9, 8, 10, and 2 enter and leave, creating and filling holes.)
Dynamic Storage-Allocation Problem
How do we satisfy a request of size n from a list of free holes?
- First-fit: allocate the first hole that is big enough.
- The search can start at the beginning of the set of holes or where the last first-fit search ended.
- Stop as soon as a hole of sufficient size is encountered.
- Best-fit: allocate the smallest hole that is big enough.
- Best-fit must search the entire list, unless the list is ordered by size.
- Best-fit produces the smallest leftover hole.
- Worst-fit: allocate the largest hole; it must also search the entire list (unless the list is sorted).
- Worst-fit produces the largest leftover hole.
- Worst-fit can sometimes win because it (more probably) leaves a larger, usable hole than the sliver left over by a best-fit approach.
First-fit and best-fit are better than worst-fit in terms of speed and storage utilization. Neither first-fit nor best-fit is clearly better than the other in terms of storage utilization, but first-fit is generally faster. (A first-fit sketch follows below.)
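A compact first-fit sketch in C over a singly linked free list; the data structure and names are illustrative.

#include <stddef.h>
#include <stdlib.h>

struct hole {
    size_t start, size;        /* one free block of memory          */
    struct hole *next;
};

/* Scan the free list front to back; carve the request out of the
   first hole big enough.  Returns the allocated start address, or
   (size_t)-1 if no hole can satisfy the request. */
static size_t first_fit(struct hole **list, size_t n)
{
    for (struct hole **pp = list; *pp; pp = &(*pp)->next) {
        struct hole *h = *pp;
        if (h->size >= n) {
            size_t addr = h->start;
            h->start += n;     /* leftover stays on the free list   */
            h->size  -= n;
            if (h->size == 0) {
                *pp = h->next; /* hole fully consumed: unlink it    */
                free(h);
            }
            return addr;
        }
    }
    return (size_t)-1;
}

Best-fit would instead remember the smallest adequate hole across a full scan, and worst-fit the largest; only the selection rule changes.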
Fragmentation - External
- Any of these memory management schemes can cause fragmentation!
- External fragmentation:
- Total memory space exists to satisfy a request, but the remaining available memory is not contiguous.
- Lots of small fragments here and there.
- External fragmentation arises easily from the first-fit and best-fit strategies.
- If the scattered fragments could be combined, we might be able to run more processes.
- Statistical analysis has shown that, on average (for first-fit), even with some optimization, given N allocated blocks, another 0.5N blocks will be lost to fragmentation. That's a lot!
- This implies that perhaps one-third of memory may be unusable (known as the 50-percent rule; the arithmetic follows below).
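One way to see where the one-third figure comes from:

% with N allocated blocks and 0.5N blocks lost to fragmentation,
% the unusable fraction of memory is
\frac{0.5N}{N + 0.5N} = \frac{0.5N}{1.5N} = \frac{1}{3}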
Fragmentation - Internal
- Internal fragmentation:
- Suppose a process needs almost all, but not quite all, of an available block.
- Keeping track of the tiny leftover may be of very little use and creates overhead.
- The general approach to the internal fragmentation problem is to:
- break memory into fixed-size blocks, and
- allocate memory in units of the block size.
- So, if we divide memory into 2K blocks, only part of the last block allocated to a process will be wasted.
- If we divide memory into 4K blocks, we'll have fewer blocks, which is good, but perhaps more internal fragmentation in the last block.
- This wasted space within an allocation is called internal fragmentation. (A small worked example follows below.)
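A small worked example in C, using an illustrative 10,000-byte request: the waste is the unused tail of the last block, and it tends to grow with the block size.

#include <stdio.h>

/* Bytes wasted inside the last fixed-size block of an allocation. */
static unsigned waste(unsigned need, unsigned block)
{
    unsigned rem = need % block;
    return rem ? block - rem : 0;
}

int main(void)
{
    printf("2K blocks: %u bytes wasted\n", waste(10000, 2048)); /* 240  */
    printf("4K blocks: %u bytes wasted\n", waste(10000, 4096)); /* 2288 */
    return 0;
}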
So, what to do about Fragmentation?
- We can reduce external fragmentation by compaction:
- Shuffle memory contents to place all free memory together in one large block.
- But compaction is possible only if relocation is dynamic and is done at execution time.
- The simplest compaction algorithm moves all processes toward one end of memory so that all holes move in the other direction, producing one large hole of available memory. (A sketch follows below.)
- Even when compaction is possible, one must consider its cost.
- The overhead here can be quite high.
- Perhaps the best solution is to permit the logical address space of a process to occupy non-contiguous storage.
- This thinking has given rise to the modern allocation schemes: paging and segmentation.
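A hedged sketch of that simplest algorithm, over an array of resident processes sorted by address; real compaction would also copy each process image and reload its relocation register, which is why execution-time binding is required.

#include <stddef.h>

struct block {
    size_t start, size;        /* one resident process, by address  */
};

/* Slide every process toward address 0, leaving a single large hole
   at the top of memory.  Returns the start of that hole. */
static size_t compact(struct block *procs, size_t n)
{
    size_t next = 0;           /* next free physical address        */
    for (size_t i = 0; i < n; i++) {
        procs[i].start = next; /* process now begins here           */
        next += procs[i].size;
    }
    return next;               /* all memory above here is one hole */
}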
End of Chapter 8.1