Linux Virtual Memory for Intel Processor

Slides: 30
Provided by: coe288

1
Linux Virtual Memory for Intel Processor
  • Debzani Deb

2
Overview
  • Overview of virtual memory.
  • Hardware support for virtual memory in the Intel
    architecture.
  • How Linux uses that hardware support to
    implement virtual memory.
  • Process address space.
  • Page fault handler.
  • Additional improvements in kernel 2.6.
  • References.

3
Introduction
  • In a virtual memory environment, a large logical
    address space is simulated with a small amount of
    physical memory (RAM) and some disk storage (swap
    space).
  • The logical addresses issued by the processor are
    translated to physical addresses during program
    execution.
  • Implementation requires extensive hardware
    assistance plus complex, time-consuming OS code.
  • Virtual memory can be implemented as:
  • Paging: fixed-size memory blocks.
  • Segmentation: variable-size memory blocks.
  • Fetch technique: demand paging.
  • Replacement technique: Least Recently Used (LRU)
    algorithm.
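
The combination of demand paging and LRU replacement can be illustrated with a small simulation. This is a minimal sketch, not kernel code: the function name `count_faults_lru` and the reference string used with it are hypothetical, and a real kernel only approximates LRU.

```python
from collections import OrderedDict


def count_faults_lru(references, num_frames):
    """Count page faults for a page reference string under LRU replacement."""
    frames = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)  # hit: now the most recently used page
        else:
            faults += 1  # miss: the page is fetched on demand
            if len(frames) >= num_frames:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = True
    return faults
```

Calling `count_faults_lru([1, 2, 3, 1, 4, 2], 3)` walks through three initial faults, one hit on page 1, and two further faults that each evict the least recently used page.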

4
Why Virtual Memory?
  • A process may be too big for physical memory.
  • There may be more active processes than physical
    memory can hold.
  • Solution: virtual memory, where a large virtual
    address space (4 GB) for each process is simulated
    with a small amount of physical memory (RAM) and
    some disk storage (swap space).
[Figure: RAM, OS (8 MB), Process 1 (50 MB), Process 2 (50 MB), Process 3 (30 MB)]
5
Virtual Memory
[Figure: pages of Process 1 and Process 2 sharing RAM with the OS (8 MB). Steps shown: Process 1 running; Process 2 scheduled to run; Process 1 sleeps; Process 2 running; Process 2 faults.]
  • The system works because the principle of locality
    holds.
  • Thrashing: the system swaps pages in and out all
    the time, and no real work is done.
6
IA-32 Virtual Memory
  • The IA-32 architecture supports either pure
    segmentation or segmentation combined with paging.
  • Logical address
  • Consists of a segment selector (16 bits) and an
    offset (32 bits).
  • Linear Address (LA) or Virtual Address (VA)
  • The base address of the segment plus the offset.
    This 32-bit address can address 4 GB of memory.
  • Physical Address (PA)
  • 32-bit address in RAM.
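
The logical-to-linear step can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the selector layout (index in bits 3-15, table indicator in bit 2, requested privilege level in bits 0-1) is taken from the Intel manual rather than from the slides.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into (index, table_indicator, rpl)."""
    index = (selector >> 3) & 0x1FFF  # descriptor index within the table
    table_indicator = (selector >> 2) & 1  # 0 = GDT, 1 = LDT
    rpl = selector & 0x3  # requested privilege level
    return index, table_indicator, rpl


def linear_address(segment_base, offset):
    """Linear address = segment base + offset, truncated to 32 bits."""
    return (segment_base + offset) & 0xFFFFFFFF
```

With a segment base of 0, as in a flat memory model, the linear address is simply the offset.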

7
IA-32 Virtual Memory
8
IA-32 Segmentation(1)
  • Segment registers (6)
  • Hold segment selectors so they can be retrieved
    quickly.
  • CS (code segment register) points to a segment
    containing program instructions. It also includes
    the Current Privilege Level (CPL) field: 0 means
    kernel mode and 3 means user mode.
  • DS (data segment register) points to a segment
    containing static and external data.
  • SS (stack segment register) points to a segment
    containing the current program stack.
  • ES, FS, and GS are general-purpose segment
    registers and may refer to arbitrary data
    segments.

9
IA-32 Segmentation(2)
  • Segment descriptors (8 bytes)
  • Each descriptor uniquely identifies a segment.
  • Stored in the Global Descriptor Table (GDT).
  • Contains:
  • 32-bit base address of the segment.
  • 20-bit limit.
  • 4-bit Type field that denotes the segment type
    and access rights.
  • DPL (Descriptor Privilege Level) field: 0 means
    use is restricted to kernel mode, 3 means both
    modes.

10
IA-32 Protection
  • Protection
  • Intel uses 4 privilege levels, 0-3, with 0 being
    the most privileged.
  • The privilege level of the executing program is
    determined by the privilege level of the code
    segment currently executing.
  • CPL (Current Privilege Level): bits 0 and 1 of
    the CS (code segment) register.
  • The processor changes the CPL when program control
    is transferred to a code segment with a different
    privilege level.
  • DPL (Descriptor Privilege Level): bits in the
    segment descriptor. When the currently executing
    code segment attempts to access a segment, the
    DPL is compared to the CPL of CS.
  • Programs executing at a numerically higher (less
    privileged) level cannot access segments with a
    numerically lower (more privileged) DPL, while
    programs at level 0 can access all segments.
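
The CPL/DPL comparison for data segments can be written as a one-line check. This is a simplified sketch with a hypothetical function name; the optional `rpl` argument (the selector's requested privilege level) is not discussed in the slides and is included only because the hardware check takes the weaker of CPL and RPL.

```python
def can_access_data_segment(cpl, dpl, rpl=0):
    """Return True if code at privilege level cpl may load a data segment.

    Lower numbers mean more privilege, so access is allowed only when the
    effective level max(cpl, rpl) is at least as privileged as the DPL.
    """
    return max(cpl, rpl) <= dpl
```

Under this rule kernel code (CPL 0) can use segments with DPL 0 or 3, while user code (CPL 3) can use only segments with DPL 3.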

11
Segmentation in Linux
  • There is no mode bit to disable segmentation.
  • Linux prefers paging over segmentation for
    simplicity and portability.
  • The pages are divided among 4 segments.
  • All processes use the same logical addresses and
    segment descriptors.
  • The GDT is defined in /arch/i386/kernel/head.S.
  • Each time the CPL in CS changes, DS and SS are
    changed correspondingly.
  • SS is loaded with the same segment as DS.

Segments used by Linux

Segment       Type                  DPL   Accessed by
Kernel Code   Code, Read, Execute   0     Kernel
Kernel Data   Data, Read, Write     0     Kernel
User Code     Code, Read, Execute   3     Both
User Data     Data, Read, Write     3     Both
12
Protection in Linux
  • Segments overlap in the linear address space
    (/arch/i386/kernel/head.S).
  • Thus access is effectively allowed to the entire
    virtual address space using any of the above
    segments.
  • All processes have two segments:
  • 0 - 3 GB user segment.
  • 3 GB - 4 GB kernel segment.
  • The boundary is determined by PAGE_OFFSET
    (0xC0000000).
  • A process in user mode (CPL 3) can only access
    addresses lower than 3 GB (only segments with DPL
    3).
  • A process in kernel mode (e.g. after a system call)
    can access both. When CPL is 0, it can access
    segments with DPL 0 or 3.
  • Any distinction between code and data is enforced
    at the page level, not at the segment level,
    through the R/W and U/S bits of the page.

13
IA-32 Paging
  • Paging
  • RAM is partitioned into fixed-size page frames.
  • The linear address space is divided into pages of
    the same size.
  • The processor uses information contained in page
    directories and page tables (stored in RAM) to
    map linear addresses to physical addresses and to
    generate page fault exceptions.
  • Translation Lookaside Buffers (TLBs) store the
    most recently accessed page directory and page
    table entries to reduce access time.
  • Intel supports 4 KB, 2 MB, and 4 MB page sizes.
  • Paging is controlled by three flags in the
    processor's control registers, set by the OS
    during initialization.
  • PG (paging): available in all Intel processors
    starting from the 80386. Enables paging.
  • PSE (page size extensions): introduced in the
    Pentium processor. Permits large pages (4 MB, or
    2 MB when PAE is set).
  • PAE (physical address extension): introduced in
    the Pentium Pro processor. Extends physical
    addresses to 36 bits (64 GB). Supports page sizes
    of 4 KB and 2 MB.

14
Page Table and directories
  • With 4 KB pages, the 32-bit linear address is
    divided into 3 fields:
  • Page Directory: most significant 10 bits (1024
    entries).
  • Page Table: the intermediate 10 bits (1024
    entries).
  • Offset: least significant 12 bits (each page is
    4 KB).
  • In the case of a 4 MB page, the most significant
    10 bits index the page directory and the remaining
    22 bits are the page offset. Page tables are not
    used. (2 MB pages under PAE use a 21-bit offset.)
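
The field split described above amounts to a few shifts and masks. The sketch below is illustrative, with hypothetical function names; it covers the non-PAE 4 KB and 4 MB cases only.

```python
def split_linear_4kb(addr):
    """Split a 32-bit linear address into (directory, table, offset)."""
    directory = (addr >> 22) & 0x3FF  # top 10 bits: page directory index
    table = (addr >> 12) & 0x3FF  # middle 10 bits: page table index
    offset = addr & 0xFFF  # low 12 bits: offset within the 4 KB page
    return directory, table, offset


def split_linear_4mb(addr):
    """Split a 32-bit linear address into (directory, offset) for 4 MB pages."""
    directory = (addr >> 22) & 0x3FF  # top 10 bits: page directory index
    offset = addr & 0x3FFFFF  # low 22 bits: offset within the 4 MB page
    return directory, offset
```

For example, `split_linear_4kb(0xC0001234)` selects directory entry 768, table entry 1, and offset 0x234.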

15
Page Directory and Page table Entries
  • With 32-bit addresses and 4 KB pages, each entry
    contains:
  • 20-bit base address: bits 12 through 31.
  • Present: when set, the page is in RAM.
  • Read/Write: when set, the page can be read and
    written.
  • User/Supervisor: when set, the page is accessible
    from user mode; otherwise only from supervisor
    mode.
  • Accessed: set each time the paging unit accesses
    the entry.
  • PCD (page-level cache disable) and PWT
    (page-level write-through).
  • Dirty: applies to page table entries only. Set
    when the page is written to.
  • Global: introduced in the Pentium Pro. Applies to
    page table entries only. When set, it marks a
    global page and prevents the entry from being
    flushed from the TLB on a context switch.
  • Page size: applies to page directory entries
    only. When 1, the entry refers to a 2 MB/4 MB page
    frame (the PGD entry points directly to the
    page); when 0, to a 4 KB page.
  • These flags are checked by hardware to see whether
    the requested kind of access can be performed.
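
A 32-bit entry can be decoded into its frame address and flags as sketched below. The bit positions follow the standard IA-32 layout from the Intel manual (they are not listed in the slides), and the dictionary and function names are hypothetical.

```python
ENTRY_FLAGS = {
    "present": 1 << 0,
    "read_write": 1 << 1,
    "user": 1 << 2,
    "pwt": 1 << 3,
    "pcd": 1 << 4,
    "accessed": 1 << 5,
    "dirty": 1 << 6,
    "page_size": 1 << 7,  # meaningful in page directory entries
    "global": 1 << 8,
}


def decode_entry(entry):
    """Return (frame_base, set_of_flag_names) for a 4 KB-page entry."""
    frame_base = entry & 0xFFFFF000  # bits 12-31: 20-bit base address
    flags = {name for name, bit in ENTRY_FLAGS.items() if entry & bit}
    return frame_base, flags
```

An entry such as 0x00123067 decodes to frame 0x00123000 with the present, read/write, user, accessed, and dirty flags set.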

16
Paging in Linux(1)
  • Linux uses 3-level paging so that it can adapt to
    64-bit architectures.
  • Page Global Directory (PGD)
  • Page Middle Directory (PMD)
  • Page table
  • The linear address is divided into four parts:
    three table offsets and a page offset.
  • What happens on IA-32, which uses only two-level
    page tables?
  • Linux folds the PMD into the PGD: the PMD entry
    points back to the PGD entry.
  • On IA-32 the PGD contains 1024 entries, the PMD
    one entry, and the page table 1024 entries.
  • Each process has its own PGD. During a context
    switch, the PGD base address of the process
    executing next is loaded into CR3 and the TLB is
    flushed.
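
The folded middle level can be illustrated with a three-level walk in which the PMD has a single entry. This is a sketch under stated assumptions: the constant names follow kernel conventions, but the nested-dictionary page tables and the `translate` function are hypothetical stand-ins for the real structures.

```python
PGDIR_SHIFT, PMD_SHIFT, PAGE_SHIFT = 22, 22, 12
PTRS_PER_PGD, PTRS_PER_PMD, PTRS_PER_PTE = 1024, 1, 1024


def pgd_index(addr):
    return (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)


def pmd_index(addr):
    # With a single PMD entry the mask is 0, so the index is always 0.
    return (addr >> PMD_SHIFT) & (PTRS_PER_PMD - 1)


def pte_index(addr):
    return (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)


def translate(pgd, addr):
    """Walk PGD -> PMD -> page table; return a physical address or None."""
    pmd = pgd.get(pgd_index(addr))
    if pmd is None:
        return None  # no mapping at the top level: page fault
    page_table = pmd.get(pmd_index(addr))
    if page_table is None:
        return None
    frame_base = page_table.get(pte_index(addr))
    if frame_base is None:
        return None
    return frame_base | (addr & ((1 << PAGE_SHIFT) - 1))
```

Because `pmd_index` always returns 0, the middle level adds no real translation step, which is how the generic three-level code runs unchanged on two-level hardware.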

17
Paging in Linux(2)
  • Linux uses PAE, but does not use PSE.
  • It also uses the page size (PS) flag of a PGD
    entry to select a different page size for that
    specific entry.
  • Mixing 4 MB and 4 KB page sizes:
  • The kernel uses large pages (4 MB) and one-level
    translation to reduce TLB entries and memory.
  • Applications use 4 KB pages.

PAE   PS of PGD   Page size   Physical address size
0     0           4 KB        32 bit
0     1           4 MB        32 bit
1     0           4 KB        36 bit
1     1           2 MB        36 bit
18
Paging in Linux(3)
  • include/asm-i386/page.h
  • #define PAGE_SHIFT 12
  • #define PAGE_SIZE (1UL << PAGE_SHIFT)
  • #define PAGE_MASK (~(PAGE_SIZE-1))
  • include/asm-i386/pgtable.h,
    include/asm-i386/pgtable-2level.h
  • Page table lookup code: mm/memory.c
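
The arithmetic behind these three macros can be checked with a short Python sketch. PAGE_MASK is taken here as the 32-bit complement of PAGE_SIZE - 1, which is how the i386 header defines it; the helper function names are hypothetical.

```python
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT  # 4096 bytes
PAGE_MASK = ~(PAGE_SIZE - 1) & 0xFFFFFFFF  # clears the 12 offset bits


def page_base(addr):
    """Address of the start of the page containing addr."""
    return addr & PAGE_MASK


def page_offset(addr):
    """Offset of addr within its page."""
    return addr & (PAGE_SIZE - 1)


def page_align_up(addr):
    """Round addr up to the next page boundary (unchanged if aligned)."""
    return (addr + PAGE_SIZE - 1) & PAGE_MASK
```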

19
Paging in Linux (4)
  • The linear address space is split into two parts.
  • User space (0-3 GB) can be addressed in both
    modes.
  • Kernel space (3 GB-4 GB) can be accessed only in
    kernel mode.
  • PAGE_OFFSET is defined as 0xc0000000 (3 GB).
  • Kernel paging (4 MB pages)
  • Kernel code and data are stored in a group of
    reserved page frames.
  • These frames are never dynamically assigned or
    swapped to disk.
  • The kernel maintains a set of page tables rooted at
    the Master Kernel Page Global Directory.
  • How does the kernel initialize its own page tables?
  • swapper_pg_dir is initialized during kernel
    compilation.
  • Phase 1: the kernel can address the first 8 MB of
    RAM either with linear addresses identical to the
    physical addresses or with the 8 MB of linear
    addresses starting at 0xc0000000.
  • Phase 2: only linear addresses starting at
    0xc0000000 are mapped, to physical addresses
    starting at 0.
  • Where does paging start? /arch/i386/kernel/head.S
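
Because the kernel mapping is a fixed offset, converting between kernel linear addresses and physical addresses is a single addition or subtraction. The sketch below assumes the 3 GB PAGE_OFFSET given above; the function names are hypothetical, and the conversion is meaningful only for the directly mapped part of physical memory.

```python
PAGE_OFFSET = 0xC0000000


def is_kernel_address(vaddr):
    """True if the linear address lies in kernel space (3 GB and above)."""
    return vaddr >= PAGE_OFFSET


def virt_to_phys(vaddr):
    """Physical address behind a directly mapped kernel linear address."""
    if not is_kernel_address(vaddr):
        raise ValueError("not a kernel linear address")
    return vaddr - PAGE_OFFSET


def phys_to_virt(paddr):
    """Kernel linear address for a directly mapped physical address."""
    return (paddr + PAGE_OFFSET) & 0xFFFFFFFF
```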

20
Physical Memory Management
  • Physical memory is divided into three zones: DMA,
    Normal, and HighMem.
  • Page frames are allocated from these zones.
  • Each physical page frame is associated with a page
    descriptor.
  • All page descriptors are stored in the mem_map
    array.
  • Requesting page frames: alloc_pages() allocates
    groups of contiguous page frames using the buddy
    system.
  • If alloc_pages() can't find a free page frame, it
    calls try_to_free_pages() to reclaim some.
  • try_to_free_pages() reclaims pages according to an
    LRU algorithm.
  • Memory for small data structures is allocated by
    the slab allocator.
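
The core trick of a buddy system is that a block's buddy can be found by flipping one bit of its page frame index. The following is a minimal sketch of that index arithmetic only, with hypothetical function names; it does not reproduce the kernel's free lists or zone handling.

```python
def buddy_index(page_index, order):
    """Index of the buddy of the 2**order-page block starting at page_index."""
    return page_index ^ (1 << order)


def merged_index(page_index, order):
    """Start index of the 2**(order+1)-page block formed by a block and its buddy."""
    return page_index & ~(1 << order)
```

For example, the order-2 block starting at frame 12 has its buddy at frame 8, and when both are free they merge into one order-3 block starting at frame 8.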

21
Process Address Space
  • The linear address space is split into two parts.
  • User space (0-3 GB) changes with each context
    switch and is accessible in both modes.
  • Kernel space (3 GB-4 GB) remains constant and is
    accessible only in kernel mode.
  • Memory descriptor: mm_struct.
  • One structure exists for each process and is
    shared among its threads.
  • Memory descriptor for kernel threads.

[Figure: address space layout with user code and data below PAGE_OFFSET (0xC0000000) and kernel code and data above it.]
22
Memory Regions
  • The full address space is rarely used.
  • Each address space consists of several
    non-overlapping, page-aligned regions that are in
    use.
  • Each region contains pages with the same protection
    and purpose.
  • The mapped regions of a process are listed in
    /proc/PID/maps.
  • Regions are described by vm_area_struct.
  • If a file is memory mapped, the file pointer is
    available through vm_file.
  • do_mmap(), find_vma(), get_unmapped_area()
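
The region lookup can be sketched as a search over a sorted list of regions. This is an illustrative model, not the kernel's implementation: the `VmArea` class is a hypothetical stand-in for vm_area_struct with only three fields, and the linear scan replaces the kernel's faster lookup structures.

```python
from dataclasses import dataclass


@dataclass
class VmArea:
    vm_start: int  # first address of the region
    vm_end: int  # first address after the region
    perms: str


def find_vma(regions, addr):
    """Return the first region whose vm_end is above addr, or None.

    regions must be sorted by address. As with the kernel function of the
    same name, the returned region may start above addr.
    """
    for vma in regions:
        if addr < vma.vm_end:
            return vma
    return None


def contains(vma, addr):
    """True if addr actually falls inside the region."""
    return vma is not None and vma.vm_start <= addr < vma.vm_end
```

A caller that needs to know whether an address is mapped therefore checks `contains(find_vma(regions, addr), addr)` rather than relying on `find_vma` alone.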

23
Process Address Space
[Figure: memory descriptor (mmap, mmap_cache), memory regions, and the linear address space.]
24
Page faulting
  • Demand fetching
  • A page is fetched from swap space only when the
    hardware raises a page fault exception, which the
    OS traps before allocating a page.
  • A number of pages after the faulting page are
    prefetched.
  • Two types of page fault:
  • Major: the page has to be read from disk, which is
    expensive.
  • Minor: the page is in the swap cache, or the fault
    is a protection fault.
  • Architecture-specific function do_page_fault()
  • Decides what type of fault occurred and how it can
    be handled.
  • If it is a valid page fault in a valid memory
    region, it calls the architecture-independent
    function handle_mm_fault().
  • handle_mm_fault() allocates the required page table
    entries and calls handle_pte_fault().

25
do_page_fault() flow diagram
26
handle_mm_fault() call graph
  • handle_mm_fault(): allocates the required page
    table entries if they don't exist.
  • handle_pte_fault(): calls the corresponding
    handler based on the properties of the entry.
  • do_swap_page(): pages swapped out to disk.
  • do_wp_page(): Copy on Write (COW) page.
  • do_no_page(): first-time allocation.
  • do_anonymous_page(): handles anonymous access.
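
The dispatch in the call graph can be modelled as a decision on a few properties of the page table entry. This is a simplified sketch: the handler names come from the call graph, but the boolean parameters and the idea of returning the chain of handler names are hypothetical simplifications of the kernel logic.

```python
def handle_pte_fault(present, empty, writable, write_access, file_backed):
    """Return the chain of handler names a fault would pass through."""
    if not present:
        if empty:
            # The page was never touched: first-time allocation.
            if file_backed:
                return ["do_no_page"]
            return ["do_no_page", "do_anonymous_page"]
        # The entry records a swap location: the page was swapped out.
        return ["do_swap_page"]
    if write_access and not writable:
        # Present but write-protected: copy-on-write candidate.
        return ["do_wp_page"]
    return []  # nothing left to do beyond updating accessed/dirty bits
```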
27
Copy on Write (COW)
  • During fork the kernel duplicates the parent's
    address space for the child. Done naively, this
    requires:
  • Allocating page frames for the page tables of the
    child process.
  • Allocating page frames for the pages of the child
    process.
  • Copying the pages of the parent process to the
    pages of the child process.
  • Linux uses an efficient copy-on-write approach:
  • The pages are shared between the parent and child
    processes, and their page table entries are marked
    read-only.
  • Whenever either one tries to write, a write fault
    occurs.
  • The kernel then duplicates the page into a new page
    frame and marks it as writable.
  • The original page frame remains write-protected.
    When the other process tries to write, the kernel
    checks whether that process is the only owner. If
    so, the page becomes writable.
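
The copy-on-write steps above can be traced with a toy model. Everything here is hypothetical scaffolding for illustration: `Frame` stands for a page frame with a reference count, `Mapping` for one process's page table entry, and `write` for the write path including the fault handling.

```python
class Frame:
    """A page frame with a count of the mappings that share it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refcount = 1


class Mapping:
    """One process's page table entry for a single page."""

    def __init__(self, frame, writable=True):
        self.frame = frame
        self.writable = writable


def fork_mapping(parent):
    """Share the parent's frame with a child; both become read-only."""
    parent.writable = False
    parent.frame.refcount += 1
    return Mapping(parent.frame, writable=False)


def write(mapping, offset, value):
    """Write one byte, resolving a write fault first if needed."""
    if not mapping.writable:  # write fault on a protected page
        if mapping.frame.refcount > 1:
            old = mapping.frame
            old.refcount -= 1
            mapping.frame = Frame(old.data)  # private copy in a new frame
        # Otherwise this mapping is the only owner, so no copy is needed.
        mapping.writable = True
    mapping.frame.data[offset] = value
```

The first writer after a fork pays for the copy; the second writer finds itself the sole owner of the original frame and simply regains write permission.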

28
What's different in 2.6
  • The big change is Linux's new support for NUMA
    servers: high-end systems with multiple
    processors, each with a separate memory pool
    directly connected to it.
  • Support for Intel's PAE (Physical Address
    Extension) allows access to up to 64 GB of RAM
    in paged mode. Linux can now run applications
    that access large blocks of memory.
  • For example, bigger databases are now supported
    on Linux.
  • Reverse mapping
  • Multiple virtual pages (pages shared by different
    processes) might point to the same physical page.
  • Reverse mapping lets the kernel find every page
    table entry that maps a given physical page, which
    is useful when the kernel wants to free that page.
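
The idea of reverse mapping can be shown with a table keyed by page frame. This is a conceptual sketch only: the `ReverseMap` class and its methods are hypothetical and do not mirror the kernel's actual data structures.

```python
from collections import defaultdict


class ReverseMap:
    """Track which (pid, virtual address) pairs map each page frame."""

    def __init__(self):
        self._users = defaultdict(set)

    def add(self, frame, pid, vaddr):
        self._users[frame].add((pid, vaddr))

    def remove(self, frame, pid, vaddr):
        self._users[frame].discard((pid, vaddr))

    def users(self, frame):
        """Every mapping that must be updated before the frame is freed."""
        return sorted(self._users.get(frame, ()))

    def can_free_immediately(self, frame):
        """True if no process currently maps the frame."""
        return not self._users.get(frame)
```

Without such a table, finding the users of a frame would require scanning the page tables of every process.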

29
References
  • IA-32 Intel Architecture Software Developer's
    Manual, Volume 3: System Programming Guide
    (Document 253668), Chapters 3 and 4.
  • Bovet, D., and Cesati, M. Understanding the Linux
    Kernel. O'Reilly, 2001. (Chapters 2, 7, 8, and 16)
  • Virtual memory management for the Linux 2.4 kernel:
    description and code documentation.
    http://home.earthlink.net/~jknapka/linux-mm/vmoutline.html
  • Deitel and Deitel, Operating Systems, Prentice Hall,
    2004.
  • The Wonderful World of Linux 2.6, by Joseph
    Pranevich.