Title: COMP 206: Computer Architecture and Implementation
1 COMP 206: Computer Architecture and Implementation
- Montek Singh
- Wed., Nov. 17, 2004
- Topic: Virtual Memory
2 Outline
- Introduction
- Address Translation
- VM Organization
- Examples
- Reading: HP3 Section 5.10
- For background: refer to PH (Comp. Org.)
3 Characteristics
4 Addressing
- Always a congruence mapping
- Assume:
- 4 GB VM composed of 2^20 4 KB pages
- 64 MB DRAM main memory composed of 16384 page frames (of the same size)
- Only those pages (of the 2^20) that are not empty actually exist
- Each is either in main memory or on disk
- Can be located with two mappings (implemented with tables)
- Virtual address = (virtual page number, page offset): VA = (VPN, offset), 32 bits = (20 bits + 12 bits)
- Physical address = (real page number, page offset): PA = (RPN, offset), 26 bits = (14 bits + 12 bits)
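The 20/12-bit split above amounts to a shift and a mask; a minimal C sketch of the field extraction (function names are illustrative, not from the slides):

```c
#include <stdint.h>

#define PAGE_BITS   12u                       /* 4 KB pages */
#define OFFSET_MASK ((1u << PAGE_BITS) - 1u)

/* Split a 32-bit VA into its 20-bit VPN and 12-bit page offset. */
static inline uint32_t vpn_of(uint32_t va) { return va >> PAGE_BITS; }
static inline uint32_t off_of(uint32_t va) { return va & OFFSET_MASK; }

/* Rebuild a 26-bit PA from a 14-bit RPN and the unchanged offset. */
static inline uint32_t pa_of(uint32_t rpn, uint32_t off) {
    return (rpn << PAGE_BITS) | off;
}
```

The offset passes through translation untouched; only the page-number field is mapped.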
5 Address Translation
- VA → PA: (VPN, offset within page) → (RPN, offset within page), or VA → disk address
- RPN = f_M(VPN)
- In reality, VPN is mapped to a page table entry (PTE)
- which contains the RPN
- as well as miscellaneous control information (e.g., valid bit, dirty bit, replacement information, access control)
6 Single-Level, Direct Page Table in MM
- Fully associative mapping
- when a VM page is brought in from disk to MM, it may go into any of the real page frames
- Simplest addressing scheme: one-level, direct page table
- (page table base address + VPN) → PTE or page fault
- Assume that the PTE size is 4 bytes
- Then the whole table requires 4 × 2^20 bytes = 4 MB of main memory
- Disadvantage: 4 MB of main memory must be reserved for page tables, even when the VM space is almost empty
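The one-level lookup above can be sketched in a few lines of C. The PTE layout is a plausible packing of the fields named on the previous slide (valid bit, dirty bit, 14-bit RPN, leftover control bits), not a layout the slides specify:

```c
#include <stdint.h>

#define NUM_VPAGES (1u << 20)   /* 2^20 virtual pages */

/* Hypothetical 4-byte PTE: valid and dirty bits, a 14-bit RPN,
   and the remaining bits for replacement/access-control info. */
typedef struct {
    uint32_t valid : 1;
    uint32_t dirty : 1;
    uint32_t rpn   : 14;
    uint32_t ctrl  : 16;
} pte_t;

static pte_t page_table[NUM_VPAGES];   /* the 4 MB resident table */

/* (page table base address + VPN) -> PTE; returns 1 and fills *rpn,
   or returns 0 to signal a page fault when the entry is invalid. */
static int translate(uint32_t vpn, uint32_t *rpn) {
    pte_t e = page_table[vpn];
    if (!e.valid)
        return 0;               /* page fault */
    *rpn = e.rpn;
    return 1;
}
```

The 4 MB `page_table` array is exactly the reservation the slide calls a disadvantage: it exists whether or not the VM space is populated.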
7 Single-Level Direct Page Table in VM
- To avoid tying down 4 MB of physical memory
- Put the page tables in VM
- Bring into MM only those that are actually needed
- Paging the page tables
- Needs only 1K PTEs in main memory, rather than 4 MB
- Slows down access to VM pages by possibly needing disk accesses for the PTEs
8 Multi-Level Direct Page Table in MM
- Another solution to the storage problem
- Break the 20-bit VPN into two 10-bit parts
- VPN = (VPN1, VPN2)
- This turns the original one-level page table into a tree structure
- (1st-level base address + VPN1) → 2nd-level base address
- (2nd-level base address + VPN2) → PTE or page fault
- Storage situation much improved
- Always need the root node (1K 4-byte entries = 1 VM page)
- Need only a few of the second-level nodes
- Allocated on demand
- Can be anywhere in main memory
- Access time to a PTE has doubled
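The two-step walk can be sketched directly from the formulas above; this C fragment (illustrative names, same hypothetical 14-bit RPN as before) models on-demand second-level nodes as null-able pointers:

```c
#include <stdint.h>

#define LVL_BITS 10u
#define LVL_SIZE (1u << LVL_BITS)      /* 1K entries per node */

typedef struct { uint32_t valid : 1; uint32_t rpn : 14; } pte_t;

/* Root node: 1K pointers to second-level nodes, allocated on
   demand (a NULL pointer stands for an absent subtree). */
static pte_t *root[LVL_SIZE];

static int translate2(uint32_t vpn, uint32_t *rpn) {
    uint32_t vpn1 = vpn >> LVL_BITS;          /* upper 10 bits */
    uint32_t vpn2 = vpn & (LVL_SIZE - 1u);    /* lower 10 bits */
    pte_t *node = root[vpn1];                 /* 1st MM access */
    if (node == 0 || !node[vpn2].valid)
        return 0;                             /* page fault */
    *rpn = node[vpn2].rpn;                    /* 2nd MM access */
    return 1;
}
```

The two array indexings are the two memory accesses the slide refers to: the doubled PTE access time is visible in the code.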
9 Inverted Page Tables
- Virtual address spaces may be vastly larger (and more sparsely populated) than real address spaces
- less-than-full utilization of tree nodes in a multi-level direct page table becomes more significant
- The ideal (i.e., smallest possible) page table would have one entry for every VM page actually in main memory
- Need 4 × 16K = 64 KB of main memory to store this ideal page table
- Storage overhead: 0.1%
- Inverted page table implementations are approximations of this ideal page table
- Associative inverted page table in special hardware (ATLAS)
- Hashed inverted page table in MM (IBM, HP PA-RISC)
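A hashed inverted page table can be sketched as follows. This is a simplified model in the spirit of the IBM/PA-RISC designs, not any machine's actual format: one entry per real page frame, with collision chains threaded through the table, and the frame index found by the search serving as the RPN:

```c
#include <stdint.h>

#define NUM_FRAMES 16384   /* one entry per real page frame */

/* One entry per MM frame; `next` chains frames whose VPNs hash
   to the same bucket (-1 terminates a chain). */
typedef struct { uint32_t vpn; int valid; int next; } ipte_t;

static ipte_t itable[NUM_FRAMES];
static int hash_anchor[NUM_FRAMES];   /* bucket -> first frame, or -1 */

static uint32_t hash_vpn(uint32_t vpn) { return vpn % NUM_FRAMES; }

/* Walk the collision chain; the index at which the VPN is found
   IS the RPN -- that is what makes the table "inverted". */
static int ipt_lookup(uint32_t vpn, uint32_t *rpn) {
    for (int f = hash_anchor[hash_vpn(vpn)]; f >= 0; f = itable[f].next)
        if (itable[f].valid && itable[f].vpn == vpn) {
            *rpn = (uint32_t)f;
            return 1;
        }
    return 0;   /* VPN not resident: page fault */
}
```

The table size tracks real memory (16K frames here), not the 2^20-page virtual space, which is the whole point.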
10 Translation Lookaside Buffer (TLB)
- To avoid two or more MM accesses for each VM access, use a small cache to store (VPN, PTE) pairs
- The PTE contains the RPN, from which the RA can be constructed
- This cache is the TLB, and it exploits locality
- DEC Alpha (32 entries, fully associative)
- Amdahl V/8 (512 entries, 2-way set-associative)
- Processor issues a VA
- TLB hit:
- Send RA to main memory
- TLB miss:
- Make two or more MM accesses to the page tables to retrieve the RA
- Send RA to MM
- (Any of these may cause a page fault)
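The hit/miss decision above can be modeled as a search over a small, fully associative array (software stand-in for what hardware does with parallel comparators; names are illustrative):

```c
#include <stdint.h>

#define TLB_ENTRIES 32   /* DEC Alpha-sized, fully associative */

typedef struct { uint32_t vpn; uint32_t rpn; int valid; } tlb_ent_t;
static tlb_ent_t tlb[TLB_ENTRIES];

/* Compare the VPN against every entry (hardware does this in
   parallel); a hit yields the RPN without touching main memory. */
static int tlb_lookup(uint32_t vpn, uint32_t *rpn) {
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *rpn = tlb[i].rpn;
            return 1;                 /* TLB hit */
        }
    return 0;   /* TLB miss: walk the page tables, then retry */
}
```

On a miss, the caller falls back to the one- or two-level table walk of the earlier slides and refills an entry before retrying.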
11 TLB Misses
- Causes for a TLB miss:
- VM page is not in main memory
- VM page is in main memory, but its entry has not yet been entered into the TLB
- VM page is in main memory, but its TLB entry has been removed for some reason (removed as LRU, invalidated because the page table was updated, etc.)
- Miss rates are remarkably low (about 0.1%)
- Miss rate depends on the size of the TLB and on the VM page size (coverage)
- Miss penalty varies from a single cache access to several page faults
12 Dirty Bits and the TLB: Two Solutions
- TLB is a read-only cache
- The dirty bit is contained only in the page table in MM
- The TLB contains only a write-access bit
- Initially set to zero (denying writing of the page)
- On the first attempt to write the VM page:
- An exception is caused, which
- sets the dirty bit in the page table in MM
- sets the write-access bit to 1 in the TLB
- TLB is a read-write cache
- The dirty bit is present in both the TLB and the page table in MM
- On the first write to a VM page:
- Only the dirty bit in the TLB is set
- The dirty bit in the page table is brought up to date
- when the TLB entry is evicted
- when the VM page and PTE are evicted
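The first solution (read-only TLB) can be sketched as a tiny state machine; the struct fields are illustrative stand-ins for the write-access bit in the TLB and the dirty bit in the MM page table:

```c
typedef struct { int write_ok; } tlb_entry_t;   /* TLB: write-access bit  */
typedef struct { int dirty; }    pte_t;         /* MM table: dirty bit    */

/* Solution 1: the first store to a clean page traps; the handler
   sets the dirty bit in the MM page table and grants write access
   in the TLB, so later stores proceed without further exceptions. */
static void store_to_page(tlb_entry_t *t, pte_t *p) {
    if (!t->write_ok) {      /* write-access bit still 0: exception  */
        p->dirty    = 1;     /* handler: set dirty bit in MM table   */
        t->write_ok = 1;     /* handler: set write-access bit in TLB */
    }
    /* ... the store itself is then performed/retried ... */
}
```

Only the first write to a page pays the exception cost; subsequent writes take the fast path.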
13 Virtual Memory Access Time
- Assume the existence of a TLB, a physical cache, MM, and disk
- Processor issues a VA
- TLB hit:
- Send RA to cache
- TLB miss:
- Exception: access the page tables, update the TLB, retry
- A memory reference may involve accesses to:
- the TLB
- a page table in MM
- the cache
- a page in MM
- Each of these can be a hit or a miss
- 16 possible combinations
14 Virtual Memory Access Time (2)
- Constraints among these accesses:
- Hit in TLB ⇒ hit in page table in MM
- Hit in cache ⇒ hit in page in MM
- Hit in page in MM ⇒ hit in page table in MM
- These constraints eliminate eleven combinations
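The elimination can be checked mechanically by enumerating all 16 hit/miss combinations. One assumption beyond the three constraints listed: a page-table hit (a valid PTE) also implies the page is resident in MM; with it, exactly eleven combinations are ruled out, leaving five:

```c
/* Enumerate hit (1) / miss (0) for TLB, page table, cache, page
   in MM, and count the combinations the constraints allow. */
static int count_legal(void) {
    int legal = 0;
    for (int tlb = 0; tlb <= 1; tlb++)
        for (int pt = 0; pt <= 1; pt++)
            for (int c = 0; c <= 1; c++)
                for (int pg = 0; pg <= 1; pg++) {
                    if (tlb && !pt) continue; /* TLB hit => PT hit        */
                    if (c && !pg)   continue; /* cache hit => page in MM  */
                    if (pg && !pt)  continue; /* page in MM => PT hit     */
                    if (pt && !pg)  continue; /* assumed: valid PTE =>
                                                 page resident in MM      */
                    legal++;
                }
    return legal;   /* 16 - 11 = 5 */
}
```

The five survivors are: everything hits; TLB miss with the rest hitting; cache miss with page resident (TLB hit or miss); and everything missing (a page fault).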
15 Virtual Memory Access Time (3)
- The number of MM accesses depends on the page table organization
- MIPS R2000/R4000 accomplishes table walking with CPU instructions (eight instructions per page table level)
- Several CISC machines implement this in microcode, with the MC88200 having dedicated hardware for it
- RS/6000 implements this completely in hardware
- TLB miss penalty is dominated by having to go to main memory
- Page tables may not be in the cache
- Further increase in miss penalty if the page table organization is complex
- TLB misses can have a very damaging effect on physical caches
16 Page Size
- Choices:
- Fixed at design time (most early VM systems)
- Statically configurable
- At any moment, only pages of the same size exist in the system
- MC68030 allowed page sizes between 256 B and 32 KB this way
- Dynamically configurable
- Pages of different sizes coexist in the system
- Alpha 21164, UltraSPARC: 8 KB, 64 KB, 512 KB, 4 MB
- MIPS R10000, PA-8000: 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, 16 MB
- All pages are aligned
- Dynamic configuration is a sophisticated way to decrease the TLB miss rate
- Increasing the number of TLB entries increases processor cycle time
- Increasing the VM page size increases internal memory fragmentation
- Needs fully associative TLBs
17 Segmentation and Paging
- Paged segments: segments are made up of pages
- A paging system has a flat, linear address space
- 32-bit VA = (10-bit VPN1, 10-bit VPN2, 12-bit offset)
- If, for a given VPN1, we reach the max value of VPN2 and add 1, we reach the next page, at address (VPN1 + 1, 0)
- The segmented version has a two-dimensional address space
- 32-bit VA = (10-bit segment number, 10-bit page number, 12-bit offset)
- If, for a given segment number, we reach the max page number and add 1, we get an undefined value
- Segments are not contiguous
- Segments do not need to have the same size
- Size can even vary dynamically
- Implemented by storing an upper bound for each segment and checking every reference against it
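The bound check described in the last bullet can be sketched in a few lines; the per-segment limit table and names are illustrative:

```c
#include <stdint.h>

#define NUM_SEGS (1u << 10)   /* 10-bit segment number */

/* Per-segment upper bound, in bytes; segment sizes may differ
   and may change dynamically by updating this table. */
static uint32_t seg_limit[NUM_SEGS];

/* Check a 2-D reference (segment number, offset within segment)
   against the segment's bound; 0 means a protection fault. */
static int seg_check(uint32_t seg, uint32_t off) {
    return seg < NUM_SEGS && off < seg_limit[seg];
}
```

Because each segment carries its own bound, segments of different sizes coexist naturally, and growing a segment is just a table update.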
18 Example 1: Alpha 21264 TLB
19 Example 2: Hypothetical Virtual Mem