Chapter 8: Memory Management - PowerPoint PPT Presentation

About This Presentation
Title:

Chapter 8: Memory Management

Description:

Dr. T. Doom. 8.1. Chapter 8: Memory Management. A ... MS-DOS .COM format programs. CEG 433/633 - Operating Systems I. Dr. T. Doom. 8.3. Load-time Binding ... – PowerPoint PPT presentation

Slides: 41
Provided by: drtrav

Transcript and Presenter's Notes


1
Chapter 8 Memory Management
  • A program may not be executed until it is
  • associated with a process
  • brought into memory
  • To allow multiprogramming, the OS must be able
    to allocate memory to each process
  • Several processes at once
  • Requires a Memory Management scheme and
    appropriate hardware support
  • Security?
  • The memory management scheme has a large impact
    upon how a program for a particular platform must
    be designed and compiled
  • How much memory is available?
  • How should we bind addresses?

2
Address Binding
  • Instruction and data addresses in program source
    code are symbolic
  • goto errjmp
  • X = A + B
  • These symbolic addresses must be bound to
    addresses in physical memory before the code can
    be executed
  • Address binding: a mapping from one address
    space to another
  • The address binding can take place at compile
    time, load time, or execution time.
  • Compile-time binding: the compiler generates
    absolute code
  • memory location must be known a priori
  • must recompile to move code
  • e.g., MS-DOS .COM format programs

3
Load-time Binding
  • Most modern compilers generate relocatable object
    code
  • symbolic addresses are bound to relocatable
    addresses
  • e.g., 286 bytes from the beginning of the module
    doomC.o
  • The linkage editor (linker) combines the multiple
    modules into a relocatable executable
  • The load module (loader) places the program in
    memory
  • The loader performs the final binding of
    relocatable addresses to absolute addresses
  • Load-time binding: bind relocatable code to
    addresses on load
  • Must generate relocatable code
  • Memory location need not be known at compile time
  • If starting address must change, we must reload
    code

4
Execution-time Binding
  • A logical (or virtual) address space may be bound
    to a separate physical address space
  • Provides an abstraction of physical memory
  • Logical (virtual) address: generated by the CPU
  • Physical address: address seen by the memory
    unit
  • The user program deals with logical addresses; it
    never sees the real physical addresses
  • Memory-Management Unit (MMU): hardware device
    that translates CPU-generated logical addresses
    into physical memory addresses
  • Execution-time binding: binding is delayed until
    run time
  • process can be moved during its execution from
    one memory segment to another
  • logical and physical addresses differ (requires
    mapping)
  • requires hardware and OS support for address
    mapping

5
Memory-Management Unit (MMU)
  • Logical and physical addresses are the same in
    compile-time and load-time address-binding
    schemes; logical (virtual) and physical addresses
    differ in the execution-time address-binding scheme.
  • The user program deals with logical addresses; it
    never sees the real physical addresses.
  • The MMU is the hardware device that maps virtual
    to physical addresses.
  • In the most basic MMU scheme, all logical addresses
    begin at 0, and the base register is replaced by
    a relocation register
  • The value in the relocation register is added to
    every logical address generated by a user process
    at the time it is sent to memory to generate the
    necessary physical address
  • To move the program, simply change the value in
    the register
  • The limit register remains unchanged
  • Thus, each logical address is bound to a physical
    address
  • Is security maintained?
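
The relocation-register scheme on this slide can be sketched in a few lines. This is only an illustrative model, not real hardware behaviour: the class name RelocationMMU, the exception AddressingError, and the register values are all made up for the example.

```python
class AddressingError(Exception):
    """Raised when a logical address falls outside the process's range."""


class RelocationMMU:
    """Toy MMU: physical = relocation register + logical address."""

    def __init__(self, relocation: int, limit: int) -> None:
        self.relocation = relocation  # smallest physical address of the partition
        self.limit = limit            # size of the logical address space

    def translate(self, logical: int) -> int:
        # The limit check is what keeps one process out of another's memory.
        if not 0 <= logical < self.limit:
            raise AddressingError(f"logical address {logical} is out of range")
        return self.relocation + logical


mmu = RelocationMMU(relocation=14000, limit=3000)
print(mmu.translate(346))   # 14346
mmu.relocation = 20000      # "move" the program: only the register changes
print(mmu.translate(346))   # 20346
```

Note that moving the program changes only the relocation value; the limit stays the same, and the check against it is one possible answer to the security question above.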

6
Can we reduce memory requirements?
  • Loading: placing the program in memory
  • Dynamic loading: a routine is not loaded until it
    is called
  • Program must check and load before calling
  • If a needed routine is not available in memory,
    the relocatable linker/loader loads the routine
    and updates the program's address tables
  • Better memory-space utilization: an unused routine
    is never loaded
  • Size of executable is unchanged
  • Runtime footprint is smaller
  • Useful when large amounts of code are needed to
    handle infrequently occurring cases.
  • No special support from the operating system is
    required
  • Implemented through program design

7
Can we reduce executable size?
  • Linking: combining object modules into an
    executable
  • Most OSes require static linking
  • All library routines become part of the
    executable
  • Modern OSes often allow dynamic linking
  • Linking postponed until execution time
  • Instead of placing the code for each library
    routine in the executable, include only a stub (a
    small piece of code) which
  • locates the appropriate memory-resident library
    routine
  • replaces itself with the address of the routine,
    and executes the routine
  • Executable footprint is reduced
  • program will not run w/o libraries
  • New (minor) versions of the library do not
    require recompilation
  • Some operating systems provide support for
    sharing the memory associated with library
    modules between processes (shared libs.)
  • Very efficient! No read() required, less overall
    memory usage
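
The stub mechanism can be imitated in Python as a rough analogy. Everything here is an assumption made for illustration: the Stub class and call_table are hypothetical, and the standard math module merely stands in for a memory-resident shared library. A real dynamic linker patches machine-level call tables rather than a dictionary.

```python
import importlib
from typing import Any, Callable, Dict


class Stub:
    """Stand-in for a library routine that has not been linked yet."""

    def __init__(self, table: Dict[str, Callable[..., Any]], slot: str,
                 library: str, routine: str) -> None:
        self.table = table
        self.slot = slot
        self.library = library
        self.routine = routine

    def __call__(self, *args: Any) -> Any:
        # 1. Locate the library routine (loading the library if necessary).
        module = importlib.import_module(self.library)
        target = getattr(module, self.routine)
        # 2. Replace the stub with the routine itself, so later calls are direct.
        self.table[self.slot] = target
        # 3. Execute the routine on behalf of the caller.
        return target(*args)


# Hypothetical per-program call table.
call_table: Dict[str, Callable[..., Any]] = {}
call_table["sqrt"] = Stub(call_table, "sqrt", "math", "sqrt")

print(call_table["sqrt"](16.0))              # 4.0 (resolved through the stub)
print(isinstance(call_table["sqrt"], Stub))  # False (slot now holds math.sqrt)
```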

8
What if there isn't enough memory?
  • How can we execute an executable whose code
    footprint is larger than the memory available?
  • This was a major problem in the 60s and 70s for
    general purpose computers and remains a major
    problem
  • Consider memory usage in an e-mail pager or ISDN
    box
  • Solution: keep in memory only those instructions
    and data that are needed at any given time;
    load the rest over them during run time
  • Overwrite this memory with a new set of
    instructions and data when we get to a
    significantly different part of the code
  • Each set of instructions/data is an overlay
  • Programming design of overlay structure is
    non-trivial
  • No special support needed from operating system
  • Implemented by user design
  • Modern general purpose OSes use virtual memory to
    deal with this problem

9
How does the OS allocate memory?
  • Contiguous allocation scheme: all memory granted
    to a process must be contiguous
  • Single-partition contiguous allocation
  • Only one partition exists in memory for user
    processes
  • Only one user process is granted memory at a time
  • The resident operating system must also be held
    in memory
  • OS size changes as transient code is loaded
  • Place OS in low memory, use relocation-register
    to define the beginning of the user partition
  • Relocation-register protects the OS code and data
  • Allows relocation of user code if OS requirements
    change
  • Relocation register contains the value of the smallest
    physical address; limit register contains the range
    of logical addresses; each logical address must
    be less than the limit register
  • To change context, must swap out main memory to a
    backing store

10
Swapping
  • A process can be suspended and swapped
    temporarily out of memory to a backing store, and
    then brought back into memory for continued
    execution
  • Backing store: usually a fast disk large enough
    to accommodate copies of all memory images for
    all users; must provide direct access to these
    memory images
  • swap may be from memory (conventional) to memory
    (extended)
  • Roll out, roll in: swapping variant used for
    priority-based scheduling algorithms (or
    round-robin with a huge quantum); a lower-priority
    process is swapped out so a higher-priority process
    can be loaded and executed.
  • Major part of swap time is transfer time; total
    transfer time is directly proportional to the
    amount of memory swapped.
  • Requires execution-time binding if the process can be
    restored to a different memory space than it
    occupied previously
  • OS management of I/O buffers is required to swap a
    process awaiting I/O
  • Modified versions of swapping are found on many
    systems, e.g., UNIX and Microsoft Windows

11
Swapping in Single Partition Scheme
12
Contiguous Allocation (Cont.)
  • For multi-processing systems it is far more
    efficient to allow several user processes to
    allocate memory
  • The OS must keep track of the size and owner of
    each partition
  • The OS must determine how and where to allocate
    new requests
  • Multiple-partition contiguous allocation
  • Fixed-partition: memory is pre-partitioned; the
    OS must assign each process to the best free
    partition
  • Hard limit to the number of processes in memory
  • Efficient?

13
Contiguous Allocation (Cont.)
  • Multiple-partition contiguous allocation
  • Dynamic allocation: memory is partitioned by the
    OS on the fly
  • Operating system maintains information about
    a) allocated partitions b) free partitions (holes)
  • Hole: block of available memory; holes of various
    sizes are scattered throughout memory.
  • When a process arrives, it is allocated memory
    from a hole large enough to accommodate it

14
Dynamic Storage-Allocation Problem
  • How do we satisfy a request of size n from a list
    of free holes? Optimization metrics include
    speed and storage utilization.
  • First-fit: Allocate the first hole that is big
    enough. Search begins at top of list. Fast
    search.
  • Next-fit: Allocate the first hole that is big
    enough. Search begins at the end of the last
    search. Fast search.
  • Best-fit: Allocate the smallest hole that is big
    enough; must search entire list, unless ordered
    by size. Produces the smallest leftover hole.
  • Worst-fit: Allocate the largest hole; must also
    search entire list, unless ordered by size.
    Produces the largest leftover hole.
  • Simulation shows that
  • First-fit is better (in terms of storage
    utilization) than worst-fit
  • First-fit is as good (in terms of storage
    utilization) as best-fit
  • First-fit is faster than best-fit
  • Next-fit is generally better than first-fit
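
The four placement policies can be compared with a short sketch. The function names, the representation of the free list as a plain list of hole sizes, and the sample values are assumptions made for this example; a real allocator would also split the chosen hole and track addresses.

```python
from typing import List, Optional


def first_fit(holes: List[int], n: int) -> Optional[int]:
    """Index of the first hole of size >= n, scanning from the top of the list."""
    for i, size in enumerate(holes):
        if size >= n:
            return i
    return None


def next_fit(holes: List[int], n: int, start: int) -> Optional[int]:
    """Like first-fit, but the scan begins where the previous search ended."""
    for step in range(len(holes)):
        i = (start + step) % len(holes)
        if holes[i] >= n:
            return i
    return None


def best_fit(holes: List[int], n: int) -> Optional[int]:
    """Index of the smallest hole that is big enough (scans the whole list)."""
    fits = [i for i, size in enumerate(holes) if size >= n]
    return min(fits, key=lambda i: holes[i], default=None)


def worst_fit(holes: List[int], n: int) -> Optional[int]:
    """Index of the largest hole (scans the whole list)."""
    fits = [i for i, size in enumerate(holes) if size >= n]
    return max(fits, key=lambda i: holes[i], default=None)


holes = [100, 500, 200, 300, 600]    # hole sizes, in list order
print(first_fit(holes, 212))         # 1 (the 500 hole)
print(next_fit(holes, 212, start=2)) # 3 (the 300 hole)
print(best_fit(holes, 212))          # 3 (leftover 88)
print(worst_fit(holes, 212))         # 4 (leftover 388)
```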

15
Fragmentation
  • How do we measure storage utilization?
  • How much space is wasted?
  • Internal fragmentation: allocated memory may be
    slightly larger than requested memory; this size
    difference is memory internal to a partition, but
    not being used
  • Problem in fixed-partition allocation
  • External fragmentation: total memory space
    exists to satisfy a request, but it is not
    contiguous.
  • Problem in dynamic allocation
  • 50-percent rule: Simulations show that for n
    allocated blocks, another n/2 blocks are lost to
    fragmentation, so 1/3 of memory may be unusable
  • External fragmentation can be reduced by
    compaction
  • Shuffle memory contents to place all free memory
    together in one large block
  • Compaction is possible only if relocation is
    dynamic and done at execution time, and if the
    OS provides I/O buffers so that devices don't DMA
    into relocated memory
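
External fragmentation and compaction can be illustrated with a toy memory map. The representation (a list of (owner, size) pairs with owner None marking a hole) and the sizes are assumptions for this sketch; it ignores the relocation and I/O-buffer requirements listed above.

```python
from typing import List, Optional, Tuple

Block = Tuple[Optional[str], int]   # (owner, size); owner None marks a hole


def total_free(blocks: List[Block]) -> int:
    return sum(size for owner, size in blocks if owner is None)


def largest_hole(blocks: List[Block]) -> int:
    return max((size for owner, size in blocks if owner is None), default=0)


def compact(blocks: List[Block]) -> List[Block]:
    """Slide allocated blocks together and merge all holes into one block."""
    used = [b for b in blocks if b[0] is not None]
    free = total_free(blocks)
    return used + ([(None, free)] if free else [])


memory = [("P1", 100), (None, 50), ("P2", 200), (None, 70), ("P3", 30), (None, 40)]
request = 150

# Enough free memory in total, but no single hole is large enough.
print(total_free(memory) >= request, largest_hole(memory) >= request)  # True False
# After compaction the free space is one contiguous block of 160.
print(largest_hole(compact(memory)) >= request)                         # True
```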

16
Non-Contiguous Memory Allocation
  • Goal: Reduce memory loss to external
    fragmentation without incurring the overhead of
    compaction
  • Solution: Abandon the requirement that
    allocated memory be contiguous.
  • Non-contiguous memory allocation approaches
    include
  • Paging: Allow the logical address space of a process
    to be noncontiguous in physical memory. This
    complicates the binding (MMU) but allows the
    process to be allocated physical memory wherever
    it is available.
  • Segmentation: Allow the segmentation of a
    process into many logically connected components.
    Each begins at its own (local) virtual address
    0.
  • This allows many other useful features, including
    protection permissions on a per-segment basis,
    etc.
  • Example segmentation: Text, Data, Stack.
  • Segmentation with paging: hybrid approach

17
Paging
  • Physical memory is broken up into fixed-size
    partitions called frames
  • Logical memory is broken up into frame-size
    partitions called pages
  • The OS keeps track of all free frames
  • Frame size = page size (a power of 2, usually 512
    to 8K bytes)
  • To run a program of size n pages, need to find n
    free frames and load program
  • Internal fragmentation (average of 50% of one
    page per process)
  • Logical addresses must be mapped to physical
    addresses
  • Set up a page table to note which frame holds
    each page
  • Logical address generated by the CPU is divided into
  • Page number (p): used as an index into a page
    table which contains the base address of each page in
    physical memory
  • Page offset (d): combined with the base address to
    define the physical memory address that is sent
    to the memory unit

18
Paging Example
19
Implementing Paging
  • Paging is transparent to the process (still
    viewed as contiguous)
  • Divide an m-bit logical address for a system with
    pages of size 2^n into
  • n-bit page offset (d)
  • (m-n)-bit page number (p)
  • The page number p is an index to the page table
    which stores the location of the frame
  • Frames and pages are the same size, thus the
    displacement within a page is also the
    displacement within the frame
  • Mapping is
  • Physical address = page-table(p) + d
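
A minimal sketch of this mapping follows. It assumes the page table is a simple list that stores frame numbers (so the frame base address is the frame number shifted left by n bits); the function name and the tiny 4-byte page size are chosen only to keep the numbers small.

```python
from typing import List


def translate_paged(logical: int, page_table: List[int], offset_bits: int) -> int:
    """Map a logical address to a physical one via a single-level page table."""
    page = logical >> offset_bits                   # upper (m-n) bits: p
    offset = logical & ((1 << offset_bits) - 1)     # lower n bits: d
    frame = page_table[page]                        # page-table(p)
    return (frame << offset_bits) | offset          # frame base + d


# 4-byte pages (n = 2); page i lives in frame page_table[i].
page_table = [5, 6, 1, 2]
print(translate_paged(0, page_table, 2))    # 20 (page 0 -> frame 5, offset 0)
print(translate_paged(3, page_table, 2))    # 23 (page 0 -> frame 5, offset 3)
print(translate_paged(4, page_table, 2))    # 24 (page 1 -> frame 6, offset 0)
print(translate_paged(13, page_table, 2))   # 9  (page 3 -> frame 2, offset 1)
```

Because frames and pages are the same size, the offset passes through unchanged; only the page number is replaced.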

20
Address Translation Architecture
21
Page Size
  • How large should a page be?
  • Smaller pages reduce internal fragmentation
  • Larger pages reduce the number of page table
    entries
  • If s is the average process size, p is the page
    size (in bytes) and e is the number of bytes per page
    table entry, then the per-process overhead is about
    s·e/p (page table) + p/2 (internal fragmentation),
    which is smallest when p = sqrt(2·s·e)
  • For current process sizes, and available physical
    memory, optimal page sizes range between 512 and 8K
    bytes
  • Page table must be kept in main memory.
  • Why? If a page is 4K (12-bit offset) and the CPU uses
    a 32-bit address, then there are 2^20 possible
    pages per process
  • Number of bits per entry depends upon size of physical
    memory
  • The memory consumed by this table is
    overhead/waste
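
The trade-off between page-table size and internal fragmentation can be put into numbers. The sketch below assumes the common textbook estimate overhead = s*e/p + p/2 (page-table bytes plus half a page of internal fragmentation), which gives an optimum at p = sqrt(2*s*e); the sample values for s and e are illustrative, not taken from the slides.

```python
import math


def overhead(s: int, e: int, p: int) -> float:
    """Estimated wasted bytes per process: page table plus half a page."""
    return s * e / p + p / 2


def optimal_page_size(s: int, e: int) -> float:
    """Page size that minimizes overhead(): sqrt(2 * s * e)."""
    return math.sqrt(2 * s * e)


s = 2 ** 20   # assumed average process size: 1 MB
e = 8         # assumed page-table entry size: 8 bytes

print(optimal_page_size(s, e))        # 4096.0
for p in (512, 4096, 32768):
    print(p, overhead(s, e, p))       # overhead is lowest at p = 4096
```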

22
Implementation of Page Table
  • The page table must be kept in main memory
  • Page-table base register (PTBR) points to the
    page table
  • PTBR + page number (p) gives the lookup address
  • Page-table length register (PTLR) indicates size
    of the table
  • Only make the page table as large as necessary
  • Addresses in unallocated pages cause an exception
  • For each CPU memory access there are two
    physical accesses
  • access the page table (in memory) to retrieve
    frame
  • access the data/instruction
  • The inefficiency of this two memory access
    solution can be reduced by the use of a special
    fast-lookup hardware cache for the page table
  • associative registers or translation look-aside
    buffers (TLBs)
  • Hit ratio: the percentage of lookups for which the
    necessary data is present in the cache
  • otherwise, get data from page table in main memory

23
Effective Access Time
  • Effective Access Time (EAT) is a weighted average

$t_{TLB}$ = time required for a TLB lookup
$t_{mem}$ = time required for an access to main memory
$\alpha$ = hit ratio
$EAT = \alpha\,(t_{TLB} + t_{mem}) + (1 - \alpha)\,(t_{TLB} + t_{mem} + t_{mem})$
  • Even for fairly small TLBs, hit ratios of .98 -
    .99 are common
  • Most programs refer to memory very sequentially
    and locally
  • The 32-entry TLB in the 486 generally has a .98
    hit ratio
  • Thus, we can implement paging without suffering a
    significant latency cost
  • Try it with TLB search of 20ns, Memory access of
    100ns, and hit ratios of .80 and .98
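
The suggested exercise can be worked through with a small helper. The function name and the optional levels parameter (number of page-table levels walked on a TLB miss) are assumptions added for this sketch; with levels=1 it is the weighted average for a single-level page table.

```python
def effective_access_time(hit_ratio: float, t_tlb: float, t_mem: float,
                          levels: int = 1) -> float:
    """Weighted average of the TLB-hit and TLB-miss access times."""
    hit_time = t_tlb + t_mem                    # TLB lookup + the access itself
    miss_time = t_tlb + levels * t_mem + t_mem  # plus one access per table level
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time


# TLB search of 20ns, memory access of 100ns:
print(f"{effective_access_time(0.80, 20, 100):.1f} ns")   # 140.0 ns
print(f"{effective_access_time(0.98, 20, 100):.1f} ns")   # 122.0 ns
```

At a 0.98 hit ratio the cost is 22 percent over a bare 100ns access, compared with 40 percent at 0.80.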

24
Memory Protection
  • Protection bits are included for each entry in
    the page table
  • Valid-invalid bit: indicates if the associated
    page is in the process's logical address space,
    and is thus a legal page
  • Machines which have a PTLR can avoid the wasted
    page table entries necessary to house the invalid bit.
  • RO/RW/X bits: indicate if the page should be
    considered read-only, read-write and/or
    executable
  • Protection exceptions are calculated in parallel
    with the physical address (after the page table
    lookup)
  • Page tables allow processes to share memory by
    having their page tables point to the same frame
  • Note: Processes cannot reference physical memory
    that the OS does not allow them to via page table
    setup
  • The OS keeps a frame-table (one entry per frame)
    which indicates if each frame is full or empty,
    to which process the frame is allocated, when was
    it last referenced, etc
  • Memory protection implemented by associating
    protection bit with each frame

25
Shared Pages
  • Private code and data
  • Each process keeps a separate copy of the code
    and data
  • Shared code
  • To be sharable, code must be reentrant (or
    pure)
  • All non-self-modifying code is pure; it never
    changes during execution (i.e., read-only code)
  • Each process has its own copy of registers and
    data storage to hold the data for its process
    execution
  • One copy of reentrant code can be shared among
    processes (e.g., text editors, compilers, window
    systems)
  • Problem: Shared code must appear at the same
    location in the logical address space of each
    process
  • internal branch and memory addresses must be
    consistent

26
Shared Pages Example
27
Two-Level Paging
  • Consider a page table for a 32-bit logical
    address space on a machine with a 32-bit physical
    address space and 4K pages
  • logical space / page size = 2^32 / 2^12 = 2^20 entries
  • physical space / frame size = 2^32 / 2^12 = 2^20 frames, so
    20 bits/entry + 12 protection bits = 4 bytes/entry
  • Page table size = 2^20 entries × 4 bytes/entry = 4
    MB
  • 4 MB >> 4K: the page table itself is larger than
    one page!
  • We can't allocate the page table in contiguous
    memory
  • We must page the page table! The page number is
    divided into
  • How many 4-byte entries per 4K page? 2^12 / 2^2 = 2^10
  • a 10-bit page offset
  • How many bits remain? 20 - 10 = 10
  • a 10-bit page number
  • Thus, a logical address is divided into p1, an index
    into the outer page table, p2, the
    displacement within the page of the outer page
    table, and the 12-bit page offset d
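
The 10/10/12 split can be sketched as bit operations. The names split_address and translate_two_level, and the use of nested dictionaries for the outer and inner tables, are assumptions made for this example; real tables are arrays of entries in memory.

```python
from typing import Dict, Tuple

OFFSET_BITS = 12   # 4K pages
INNER_BITS = 10    # 2^10 four-byte entries fit in one 4K page


def split_address(addr: int) -> Tuple[int, int, int]:
    """Split a 32-bit logical address into (p1, p2, d)."""
    d = addr & ((1 << OFFSET_BITS) - 1)
    p2 = (addr >> OFFSET_BITS) & ((1 << INNER_BITS) - 1)
    p1 = addr >> (OFFSET_BITS + INNER_BITS)
    return p1, p2, d


def translate_two_level(outer: Dict[int, Dict[int, int]], addr: int) -> int:
    """Walk the outer table, then one page of the page table, then add d."""
    p1, p2, d = split_address(addr)
    inner = outer[p1]          # first memory access: outer page table
    frame = inner[p2]          # second memory access: page of the page table
    return (frame << OFFSET_BITS) | d


outer_table = {1: {3: 0x2A}}   # hypothetical mapping: (p1=1, p2=3) -> frame 0x2A
print(split_address(0x00403004))                          # (1, 3, 4)
print(hex(translate_two_level(outer_table, 0x00403004)))  # 0x2a004
```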

28
Two-Level Page-Table Scheme
29
Multilevel Paging Performance
  • The concept can be extended to any number of
    page-table levels
  • Since each level is stored as a separate table in
    memory, converting a logical address to a physical
    one may take many memory accesses
  • Even though time needed for one memory access is
    increased, caching (via TLB) permits performance
    to remain reasonable
  • Example: In a system with a two-level paging
    scheme, a memory access time of 100ns, and a 20ns
    TLB with a hit rate of 98 percent,
  • effective access time
    = 0.98 × (20 + 100) + 0.02 × (20 + 100 + 100 + 100)
    = 124 nanoseconds, which is only a 24 percent slowdown
    in memory access time.

30
Inverted Page Table
  • Problem: Each process requires its own page
    table, which consists of many entries (possibly
    millions). How can we reduce this overhead?
  • Solution: The number of frames is fixed (and
    shared between the processes). Store the
    process/page information by frame!
  • One entry for each real page of memory
  • Entry consists of the virtual address of the page
    stored in that real memory location, with
    information about the process that owns that page
  • Concern: Decreases memory needed to store each
    page table, but increases time needed to search
    the table when a page reference occurs
  • Use a hash table to limit the search to one or at
    most a few page-table entries
  • hash table requires another memory lookup (of
    course)
  • Concern for later: The use of an inverted page
    table does not obviate the need for a normal page
    table in demand-paged systems (ch. 9)
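
An inverted page table with a hash index might look roughly like this. The class and method names are invented for the sketch, and a Python dict stands in for the hash table; a hardware or kernel implementation would use a fixed-size hash with collision chains.

```python
from typing import Dict, List, Optional, Tuple


class InvertedPageTable:
    """One entry per physical frame, holding the (pid, page) stored there."""

    def __init__(self, num_frames: int, offset_bits: int) -> None:
        self.entries: List[Optional[Tuple[int, int]]] = [None] * num_frames
        self.hash_index: Dict[Tuple[int, int], int] = {}   # (pid, page) -> frame
        self.offset_bits = offset_bits

    def load(self, frame: int, pid: int, page: int) -> None:
        """Record that the frame now holds the given page of process pid."""
        old = self.entries[frame]
        if old is not None:
            del self.hash_index[old]       # the previous occupant is evicted
        self.entries[frame] = (pid, page)
        self.hash_index[(pid, page)] = frame

    def translate(self, pid: int, logical: int) -> int:
        page = logical >> self.offset_bits
        offset = logical & ((1 << self.offset_bits) - 1)
        frame = self.hash_index.get((pid, page))   # hashed lookup, no linear scan
        if frame is None:
            raise LookupError(f"pid {pid} has no frame for page {page}")
        return (frame << self.offset_bits) | offset


ipt = InvertedPageTable(num_frames=8, offset_bits=12)
ipt.load(frame=5, pid=7, page=2)
print(hex(ipt.translate(7, 0x2010)))   # 0x5010
```

The table size depends only on the number of frames, not on the number or size of processes, which is the saving the slide describes.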

31
Inverted Page Table Architecture
32
Segmentation
  • Segmentation is a non-contiguous memory
    allocation scheme
  • simpler than paging, but not as efficient
  • supports user view of memory
  • Programmers tend not to consider memory as a
    linear array of bytes; they prefer to view memory
    as a collection of variable-sized segments
  • Never forget, however, that memory is a linear
    array of bytes
  • A segment is a logical unit such as
  • main program, procedure, function, local
    variables, global variables, common block, stack,
    symbol table, arrays, etc.
  • Segmentation is a memory management scheme that
    supports this user view of memory
  • segments are numbered and referred to by that
    number
  • a logical address consists of a segment, and an
    offset
  • A mapping between segments and physical addresses
    must be performed

33
Logical View of Segmentation
[Figure: segments 1-4 in user space mapped to separate regions of physical memory space]
34
Segmentation Architecture
  • Logical address consists of a two-tuple:
  • <segment-number, offset>
  • Segment table: maps two-dimensional logical
    addresses to one-dimensional physical
    addresses; each table entry has
  • base: contains the starting physical address
    where the segment resides in memory.
  • limit: specifies the length of the segment.
  • Segment-table base register (STBR) points to the
    segment table's location in memory.
  • Segment-table length register (STLR) indicates
    number of segments used by a program
  • segment number s is legal if s
    < STLR.
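
The base/limit lookup can be sketched as follows. The function name, the SegmentationFault exception, and the sample table values are assumptions for illustration; the length of the Python list plays the role of the STLR.

```python
from typing import List, Tuple


class SegmentationFault(Exception):
    """Raised for an illegal segment number or an offset beyond the limit."""


def translate_segmented(segment_table: List[Tuple[int, int]], s: int, d: int) -> int:
    """Map <segment-number, offset> to a physical address via (base, limit)."""
    if not 0 <= s < len(segment_table):      # s must be less than STLR
        raise SegmentationFault(f"illegal segment number {s}")
    base, limit = segment_table[s]
    if not 0 <= d < limit:                   # offset must lie within the segment
        raise SegmentationFault(f"offset {d} exceeds limit {limit}")
    return base + d


# Hypothetical segment table: (base, limit) per segment.
segment_table = [(1400, 1000), (6300, 400), (4300, 400), (3200, 1100)]
print(translate_segmented(segment_table, 2, 53))    # 4353
print(translate_segmented(segment_table, 3, 852))   # 4052
```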

35
Segmentation Architecture (Cont.)
  • Relocation
  • dynamic (execution-time)
  • by segment table
  • Sharing
  • similar to sharing in a paged system
  • shared segments
  • must have same segment number in each program
  • protection/sharing bits in each segment table
    entry
  • Memory allocation
  • segments vary in length
  • dynamic-storage problem: first fit/best fit?
  • external fragmentation
  • segmentation doesn't use frames, thus external
    fragmentation exists
  • periodic compaction may be necessary and is
    possible as dynamic relocation is supported

36
Sharing of segments
37
Hybrid Segmentation with Paging
  • Segmentation and paging have their advantages and
    disadvantages
  • segmentation suffers from dynamic allocation
    problems
  • lengthy search time for a memory hole
  • external fragmentation can waste significant
    resources
  • paging reduces dynamic allocation problems
  • quick search (just find enough empty frames if
    they exist)
  • eliminates external fragmentation
  • Note: it does introduce internal fragmentation
  • Solution: page the segments!
  • First seen in MULTICS; dominates current
    allocation schemes
  • Solution differs from pure segmentation in that
    the segment-table entry contains not the base
    address of the segment, but rather the base
    address of a page table for the segment
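
A paged-segment lookup can be sketched by letting each segment-table entry hold a limit and a per-segment page table instead of a base address. The 1K page size, the function name, and the table contents are assumptions for this example and do not reflect the actual MULTICS field widths.

```python
from typing import List, Tuple

PAGE_BITS = 10   # assumed 1K pages


def translate_paged_segment(segment_table: List[Tuple[int, List[int]]],
                            s: int, d: int) -> int:
    """Map <segment, offset> using a page table per segment."""
    if not 0 <= s < len(segment_table):
        raise IndexError(f"illegal segment number {s}")
    limit, page_table = segment_table[s]
    if not 0 <= d < limit:
        raise IndexError(f"offset {d} exceeds segment limit {limit}")
    page = d >> PAGE_BITS                       # page within the segment
    offset = d & ((1 << PAGE_BITS) - 1)         # offset within the page
    return (page_table[page] << PAGE_BITS) | offset


# One 3000-byte segment spread over three non-adjacent frames.
segment_table = [(3000, [7, 2, 9])]
print(translate_paged_segment(segment_table, 0, 2100))   # 9268 (frame 9, offset 52)
```

Since any free frame can hold any page of the segment, no hole search is needed and external fragmentation disappears; the unused tail of the last page is the internal fragmentation noted above.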

38
MULTICS Address Translation Scheme
39
Generalized Summary
  • Parkinson's Law: Programs expand to fill
    available memory
  • Mono-programmed systems
  • One user process in memory
  • OS and device drivers also present
  • Overlays used to increase program size
  • Relocatable at compile-time only
  • Protection: base and limit registers
  • Multi-programmed systems/fixed number of tasks
    (OS/360 MFT)
  • Memory allocation on fixed-sized/numbered
    partitions
  • Queue for each partition size
  • Relocatable at load time
  • Protection: base and limit registers, or
    protection code (pid) if multiple non-contiguous
    blocks are allowed

40
Generalized Summary
  • Multi-programmed and time-shared systems with
    variable partitions
  • Memory manager must keep track of partitions and
    holes
  • Dynamic allocation algorithm: First-fit,
    Next-fit, Best-fit, etc.
  • Compaction to reduce external fragmentation
  • Protection
  • relocation (base) register and limit register, or
  • virtual addresses: the OS produces the physical
    address; user programs cannot generate addresses
    which belong to other processes
  • Relocatable during execution (or no compaction
    possible)
  • Change relocation register value or page-to-frame
    mapping