Title: IA32 Paging Scheme
1. IA32 Paging Scheme
- Introduction to the Intel x86's support for
virtual memory
2. What is paging?
- It's a scheme for dynamically remapping addresses
for fixed-size memory-blocks
[Figure: memory-blocks of the virtual address-space remapped onto the physical address-space]
3. What's paging good for?
- For efficient time-sharing among multiple
tasks, an operating system needs to have several
programs residing in main memory at the same time
- To accomplish this using actual physical
memory-addressing would require doing
address-relocation calculations each time a
program was loaded (to avoid conflicting with any
addresses already being used)
4. Why use Paging?
- Use of paging allows relocations to be done
just once (by the linker), and every program can
reuse the same addresses
[Figure: Task 1, Task 2, and Task 3 mapped into physical memory]
5. How to enable paging
Control Register CR0 (bits 31..0)

  bit 31  PG
  bit 30  CD
  bit 29  NW
  bit 18  AM
  bit 16  WP
  bit  5  NE
  bit  4  ET
  bit  3  TS
  bit  2  EM
  bit  1  MP
  bit  0  PE

Protected-Mode must be enabled (PE=1)
Then Paging can be enabled (set PG=1)

Here is how you can enable paging (if the CPU is in
protected-mode):

    mov  %cr0, %eax     # get current machine status
    bts  $31, %eax      # turn on the PG-bit's image
    mov  %eax, %cr0     # put modified status in CR0
    jmp  .+2            # now flush the prefetch queue

but you had better prepare the mapping beforehand!
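The ordering rule above (PE first, then PG) can be sketched in C as plain arithmetic on a saved copy of CR0. This is only an illustration: the function name `cr0_enable_paging` and the macro names are hypothetical, and the code never touches the real register, which only privileged code can read or write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CR0_PE (1u << 0)    /* Protection Enable, bit 0 */
#define CR0_PG (1u << 31)   /* Paging, bit 31 */

/* Given an image of CR0, compute the image with paging turned on.
   Returns false, leaving *new_cr0 untouched, when PE is clear,
   because paging may only be enabled in protected-mode. */
bool cr0_enable_paging(uint32_t cr0, uint32_t *new_cr0)
{
    assert(new_cr0 != NULL);
    if ((cr0 & CR0_PE) == 0)
        return false;
    *new_cr0 = cr0 | CR0_PG;   /* same effect as 'bts $31, %eax' */
    return true;
}
```

For an image of 0x00000011 (PE and ET set) the function yields 0x80000011; for 0x00000010 it refuses.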
6. Several paging schemes
- Intel's design for paging has continued to
evolve since its introduction in the 80386 CPU
- Our Core-2 Quad CPUs support the initial paging
design (plus several extensions)
- Here we shall describe the initial design (it's
simplest and it remains the default)
- It is based on subdividing the entire 4GB
physical address-space into 4KB blocks
7. Terminology
- The 4KB memory-blocks are called page frames --
and they are non-overlapping
- Therefore each page-frame begins at a
memory-address which is a multiple of 4K
- Remember: 4K = 4 x 1024 = 4096 = 2^12
- So the address of any page-frame will have its
lowest 12 bits equal to zeros
- Example: page six begins at 0x00006000
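The page-frame arithmetic described above comes down to shifts and masks by 12 bits. The following is a minimal sketch; the helper names `frame_base`, `frame_number_of`, and `is_page_aligned` are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                  /* 4K = 2^12 */
#define PAGE_SIZE  (1u << PAGE_SHIFT)   /* 4096 bytes */

/* Starting address of page-frame number n. */
uint32_t frame_base(uint32_t n)
{
    return n << PAGE_SHIFT;
}

/* Number of the page-frame that contains the given address. */
uint32_t frame_number_of(uint32_t addr)
{
    return addr >> PAGE_SHIFT;
}

/* A page-aligned address has its lowest 12 bits equal to zero. */
bool is_page_aligned(uint32_t addr)
{
    return (addr & (PAGE_SIZE - 1u)) == 0;
}
```

With these helpers, `frame_base(6)` gives 0x00006000, matching the example of page six.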
8. Control Register CR3
- Register CR3 is used by the CPU to find the
paging-tables in memory which define its
virtual-to-physical address-translation
- Specifically, CR3 points to a page-frame, called
the Page Directory, which contains addresses of
frames called Page Tables
- An address in CR3 must be page-aligned
CR3 (bits 31..0): Physical Address of the Page-Directory
9. Two-Level Translation Scheme
[Figure: CR3 points to the PAGE DIRECTORY, whose entries point to PAGE TABLES, whose entries point to PAGE FRAMES]
10. Page-Directory
- The Page-Directory occupies one frame, so it has
room for 1024 4-byte entries
- Each page-directory entry can contain a pointer
to a further data-structure, called a Page-Table
(also page-aligned, 4KB in size)
- Each Page-Table occupies one frame and has enough
room for 1024 4-byte entries
- Page-Table entries can contain pointers to
page-frames
11. Address-translation
- The CPU examines any virtual address it
encounters, subdividing it into three fields

  bits 31..22  (10 bits)  index into page-directory
  bits 21..12  (10 bits)  index into page-table
  bits 11..0   (12 bits)  offset into page-frame

- The first field selects one of the 1024
array-entries in the Page-Directory
- The second field selects one of the 1024
array-entries in that Page-Table
- The third field provides the offset to one of
the 4096 bytes in that Page-Frame
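The three-field subdivision can be sketched in C with shifts and masks. The type `va_fields` and the functions `split_va` and `join_va` are illustrative names, not anything taken from the course demos.

```c
#include <stdint.h>

/* The three fields of a 32-bit virtual address. */
typedef struct {
    uint32_t dir_index;    /* bits 31..22: index into page-directory */
    uint32_t table_index;  /* bits 21..12: index into page-table     */
    uint32_t offset;       /* bits 11..0 : offset into page-frame    */
} va_fields;

va_fields split_va(uint32_t va)
{
    va_fields f;
    f.dir_index   = va >> 22;             /* top 10 bits    */
    f.table_index = (va >> 12) & 0x3FFu;  /* middle 10 bits */
    f.offset      = va & 0xFFFu;          /* low 12 bits    */
    return f;
}

/* Inverse of split_va: reassemble the address from its fields. */
uint32_t join_va(va_fields f)
{
    return (f.dir_index << 22) | (f.table_index << 12) | f.offset;
}
```

For example, the address 0x00010000 splits into directory index 0x000, table index 0x010, and offset 0x000.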
12. Identity-mapping
- When the CPU first turns on the paging
capability, it must be executing code from an
identity-mapped page (or it crashes!)
[Figure: identity-mapping, with the code page at the same address in virtual memory and in physical memory]
13. Additional mappings
- Besides having at least one page that is
identity-mapped (for turning paging on),
there can be multiple other mappings
[Figure: the code page is identity-mapped, while data pages in virtual memory map to data pages in physical memory]
14. Demo program
- We wrote a very simple demo-program showing how
to create a Page-Directory and a Page-Table for
an identity-mapping of the page-frame that
contains program-code, plus a non-identity
mapping for the initial page of the video display
memory
- This demo is named vrampage.s (you can find it
on our CS 630 course website)
15. Demo's page-mapping
[Figure: CR3 points to the page-directory; the frames shown are the program arena, one page-table, the page-directory, and video memory, with unused frames in between]
Our vrampage.s demo-program uses only four
page-frames of physical memory (16K):
1) the program's arena (at 0x00010000)
2) the page-directory (at 0x00011000)
3) only one page-table (at 0x00012000)
4) one page of vram (at 0x000B8000)
16. Virtual-to-Physical
[Figure: via the page-directory and page-table, virtual address 0x00010000 (code and data) maps to physical address 0x00010000, and virtual address 0x00000000 maps to physical address 0x000B8000 (video memory)]
17. The demo's table-entries
- Our page-directory uses only one entry
- And our page-table uses only two entries

  pgdir[0x000] = 0x00011003
  pgtbl[0x000] = 0x000B8003
  pgtbl[0x010] = 0x00010003   (identity-mapping)
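To see how these three entries produce the demo's two mappings, here is a small user-space simulation of the two-level lookup. It is a sketch under stated assumptions: the entry values are copied from the slide, but the arrays `pgdir` and `pgtbl`, the helper `table_at`, and the function `translate` are hypothetical stand-ins for physical memory and for the CPU's table walk, and CR3 itself is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES     1024u
#define ENTRY_P     0x001u        /* Present bit */
#define FRAME_MASK  0xFFFFF000u   /* upper 20 bits: frame address */
#define PGTBL_FRAME 0x00011000u   /* frame named by pgdir[0] on the slide */

static uint32_t pgdir[ENTRIES];   /* all other entries stay zero */
static uint32_t pgtbl[ENTRIES];   /* (zero means not present)    */

/* Fill in the three table-entries shown on the slide. */
void setup_demo_tables(void)
{
    pgdir[0x000] = PGTBL_FRAME | 0x003u;
    pgtbl[0x000] = 0x000B8000u | 0x003u;   /* video memory     */
    pgtbl[0x010] = 0x00010000u | 0x003u;   /* identity-mapping */
}

/* Stand-in for reading physical memory: find the table at a frame. */
static const uint32_t *table_at(uint32_t frame)
{
    return frame == PGTBL_FRAME ? pgtbl : NULL;
}

/* Walk directory, then table; false stands in for a page-fault. */
bool translate(uint32_t va, uint32_t *pa)
{
    assert(pa != NULL);
    uint32_t pde = pgdir[va >> 22];
    if ((pde & ENTRY_P) == 0)
        return false;
    const uint32_t *table = table_at(pde & FRAME_MASK);
    if (table == NULL)
        return false;
    uint32_t pte = table[(va >> 12) & 0x3FFu];
    if ((pte & ENTRY_P) == 0)
        return false;
    *pa = (pte & FRAME_MASK) | (va & 0xFFFu);
    return true;
}
```

After `setup_demo_tables()`, virtual address 0x00000000 translates to 0x000B8000, addresses in the page at 0x00010000 translate to themselves, and every other address is reported as not mapped.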
18. The segment descriptors
- Our demo's GDT uses three descriptors

  executable segment at virtual-address 0x00010000:
    0x00009A010000FFFF
  writable segment at virtual-address 0x00010000:
    0x000092010000FFFF
  writable segment at virtual-address 0x00000000:
    0x000092000000FFFF
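A segment descriptor is a 64-bit quadword (16 hex digits) with its base-address and limit scattered across several bit-fields. The sketch below pulls those fields apart; the function names are illustrative only, and the descriptors are written here as 64-bit values.

```c
#include <stdint.h>

/* Base-address: bits 39..16 hold base[23..0], bits 63..56 hold base[31..24]. */
uint32_t descriptor_base(uint64_t d)
{
    uint32_t low  = (uint32_t)((d >> 16) & 0xFFFFFFu);
    uint32_t high = (uint32_t)((d >> 56) & 0xFFu);
    return (high << 24) | low;
}

/* Limit: bits 15..0 hold limit[15..0], bits 51..48 hold limit[19..16]. */
uint32_t descriptor_limit(uint64_t d)
{
    uint32_t low  = (uint32_t)(d & 0xFFFFu);
    uint32_t high = (uint32_t)((d >> 48) & 0xFu);
    return (high << 16) | low;
}

/* Access-rights byte: bits 47..40 (0x9A = code, 0x92 = writable data). */
uint8_t descriptor_access(uint64_t d)
{
    return (uint8_t)((d >> 40) & 0xFFu);
}
```

Applied to 0x00009A010000FFFF, these give base 0x00010000, limit 0xFFFF, and access byte 0x9A.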
19. Page-Level protection
- Each entry in a Page-Table can assign a
collection of attributes to the Page-Frame it
points to; for example:
- The P-bit (page is present) can be used by an
operating system to implement demand paging and
memory-mapping of disk-files
- The W/R-bit can be used to mark a page as either
Writable (1) or as Read-Only (0)
- The U/S-bit can be used to mark a page as
User-accessible or as Supervisor-only
20. Format of a Page-Table entry

  bits 31..12  PAGE-FRAME BASE ADDRESS
  bits 11..9   AVAIL
  bit  8       0
  bit  7       0
  bit  6       D
  bit  5       A
  bit  4       PCD
  bit  3       PWT
  bit  2       U
  bit  1       W
  bit  0       P

LEGEND
  P   = Present (1=yes, 0=no)
  W   = Writable (1=yes, 0=no)
  U   = User (1=yes, 0=no)
  A   = Accessed (1=yes, 0=no)
  D   = Dirty (1=yes, 0=no)
  PWT = Page Write-Through (1=yes, 0=no)
  PCD = Page Cache-Disable (1=yes, 0=no)
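The entry layout above translates directly into bit-masks. Below is a minimal sketch with hypothetical helper names. The access check is deliberately simplified: it looks at a single entry and only at user-mode (CPL=3) accesses, whereas the real CPU also consults the page-directory entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P    0x001u   /* Present            */
#define PTE_W    0x002u   /* Writable           */
#define PTE_U    0x004u   /* User               */
#define PTE_PWT  0x008u   /* Page Write-Through */
#define PTE_PCD  0x010u   /* Page Cache-Disable */
#define PTE_A    0x020u   /* Accessed           */
#define PTE_D    0x040u   /* Dirty              */
#define PTE_FRAME_MASK 0xFFFFF000u

/* Combine a page-aligned frame address with attribute bits. */
uint32_t make_pte(uint32_t frame, uint32_t flags)
{
    assert((frame & ~PTE_FRAME_MASK) == 0);  /* must be page-aligned   */
    assert((flags & PTE_FRAME_MASK) == 0);   /* flags live in low bits */
    return frame | flags;
}

/* Extract the page-frame base address from an entry. */
uint32_t pte_frame(uint32_t pte)
{
    return pte & PTE_FRAME_MASK;
}

/* Simplified check: would this one entry permit a user-mode access? */
bool pte_allows_user(uint32_t pte, bool is_write)
{
    if ((pte & PTE_P) == 0) return false;             /* not present     */
    if ((pte & PTE_U) == 0) return false;             /* supervisor-only */
    if (is_write && (pte & PTE_W) == 0) return false; /* read-only       */
    return true;
}
```

For instance, `make_pte(0x000B8000, PTE_P | PTE_W)` yields 0x000B8003, an entry that is present and writable but not user-accessible.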
21. Format of a Page-Directory entry

  bits 31..12  PAGE-TABLE BASE ADDRESS
  bits 11..9   AVAIL
  bit  8       0
  bit  7       PS
  bit  6       0
  bit  5       A
  bit  4       PCD
  bit  3       PWT
  bit  2       U
  bit  1       W
  bit  0       P

LEGEND
  P   = Present (1=yes, 0=no)
  W   = Writable (1=yes, 0=no)
  U   = User (1=yes, 0=no)
  A   = Accessed (1=yes, 0=no)
  PS  = Page-Size (0=4KB, 1=4MB)
  PWT = Page Write-Through (1=yes, 0=no)
  PCD = Page Cache-Disable (1=yes, 0=no)
NOTE: The PS-bit is only meaningful when the
PSE-bit in register CR4 is set
22. Violations
- When a task violates the page-attributes of any
Page-Frame, the CPU will generate a Page-Fault
Exception (interrupt 0x0E)
- Then the operating system's page-fault
exception-handler gets control and can take
whatever action it deems suitable
- The CPU will provide help to the OS in
determining why a Page-Fault occurred
23. The Error-Code format
- The CPU will push an Error-Code onto the
operating system's stack

  bits 31..3  reserved (0)
  bit  2      U/S
  bit  1      W/R
  bit  0      P

Legend
  P (Present): 0 = attempted access was to a
  not-present page, 1 = the page was present but
  its attributes were violated
  W/R (Write/Read): 1 = the faulting access was a
  write (for example, to a read-only page)
  U/S (User/Supervisor): 1 = the faulting access
  was made by user code (for example, to a
  supervisor page)
NOTE: User means that CPL = 3; Supervisor means
that CPL = 0, 1, or 2
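A page-fault handler typically begins by decoding these three bits. Here is a minimal sketch, in which the struct `pf_info` and the function `decode_pf_error` are hypothetical names chosen for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define PF_P   0x1u   /* bit 0: 0 = not-present page, 1 = protection violation */
#define PF_WR  0x2u   /* bit 1: 1 = the faulting access was a write            */
#define PF_US  0x4u   /* bit 2: 1 = the access was made at CPL = 3             */

typedef struct {
    bool page_was_present;  /* false means a not-present page was touched */
    bool was_write;
    bool from_user;
} pf_info;

/* Decode the low three bits of a Page-Fault error-code. */
pf_info decode_pf_error(uint32_t code)
{
    pf_info info;
    info.page_was_present = (code & PF_P) != 0;
    info.was_write        = (code & PF_WR) != 0;
    info.from_user        = (code & PF_US) != 0;
    return info;
}
```

An error-code of 0x7, for example, describes a user-mode write to a page that was present, which points to a protection violation rather than a missing page.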
24. Control Register CR2
- Whenever a Page-Fault is encountered, the CPU
will save the virtual-address that caused that
fault into the CR2 register
- If the CPU was trying to modify the value of an
operand in a read-only page, then that
operand's virtual address is written into CR2
- If the CPU was trying to read the value of an
operand in a supervisor-only page (or was trying
to fetch-and-execute an instruction) while CPL=3,
the relevant virtual address will be written into
CR2
25. CR3 and Task-Switching
The 32-bit Task-State Segment (26 longwords, each 32 bits wide)

  offset  field
     0    link
     4    esp0
     8    ss0
    12    esp1
    16    ss1
    20    esp2
    24    ss2
    28    PTDB (Page-Table Directory Base)
    32    EIP
    36    EFLAGS
    40    EAX
    44    ECX
    48    EDX
    52    EBX
    56    ESP
    60    EBP
    64    ESI
    68    EDI
    72    ES
    76    CS
    80    SS
    84    DS
    88    FS
    92    GS
    96    LDTR
   100    TRAP (low word), IOMAP (high word: locates
          the I/O permission bitmap)

The PTDB value will get loaded into register CR3
as part of the context-switching mechanism
when paging has been enabled (PG=1). So the
incoming task will automatically have its
own individual mapping of its virtual
address-space to page-frames in the CPU's
physical address-space.
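The Task-State Segment layout can be written as a C structure whose field offsets are checked at compile time. This is a sketch only: the struct name `tss32`, its field names, and the helper `cr3_after_task_switch` are illustrative. The 16-bit selector fields are stored here as full longwords, with their upper halves treated as reserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct tss32 {
    uint32_t link;                  /* offset   0 */
    uint32_t esp0, ss0;             /* offset   4 */
    uint32_t esp1, ss1;             /* offset  12 */
    uint32_t esp2, ss2;             /* offset  20 */
    uint32_t cr3;                   /* offset  28: PTDB */
    uint32_t eip, eflags;           /* offset  32 */
    uint32_t eax, ecx, edx, ebx;    /* offset  40 */
    uint32_t esp, ebp, esi, edi;    /* offset  56 */
    uint32_t es, cs, ss, ds, fs, gs;/* offset  72 */
    uint32_t ldtr;                  /* offset  96 */
    uint16_t trap;                  /* offset 100 */
    uint16_t iomap;                 /* offset 102 */
};

/* Compile-time checks (C11) that the layout matches the table. */
_Static_assert(offsetof(struct tss32, cr3) == 28, "PTDB at offset 28");
_Static_assert(sizeof(struct tss32) == 104, "26 longwords");

#define CR0_PG (1u << 31)

/* Model of the rule above: CR3 is reloaded from the incoming task's
   TSS only when paging has been enabled (PG=1). */
uint32_t cr3_after_task_switch(const struct tss32 *incoming,
                               uint32_t cr0, uint32_t current_cr3)
{
    return (cr0 & CR0_PG) ? incoming->cr3 : current_cr3;
}
```

Because every field is naturally aligned, an ordinary compiler inserts no padding, so the structure occupies exactly 104 bytes.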
26. Extensions to paging scheme
- The Core-2 Quad CPU provides several enhancements
to the original 386 paging
- These enhancements are optional and must be
selectively enabled by software
- Control Register CR4 implements bits to turn on
the desired paging-extension and some other
enhancements that are unrelated to the paging
architectures
27. Control Register CR4

  bit 13  VMXE
  bit  8  PCE
  bit  7  PGE
  bit  6  MCE
  bit  5  PAE
  bit  4  PSE
  bit  3  DE
  bit  2  TSD
  bit  1  PVI
  bit  0  VME

Legend (for paging-related extensions)
  PSE = Page-Size Extensions enabled (1=yes, 0=no)
  PAE = Physical-Address Extension enabled (1=yes, 0=no)
  PGE = Page-Global Enable (1=yes, 0=no)
28. What about efficiency?
- When paging is enabled, every reference to memory
requires the CPU to translate the
virtual-address into a physical-address
- That translation is based on table-lookups
- These lookups must be done sequentially
- So address-translation could be costly in terms
of CPU speed, since a high percentage of
instructions typically refer to memory
29. The TLB solution
- When the CPU has performed the table lookups that
map a virtual-address to a physical-address, it
remembers that relationship by saving the pair
of page-addresses (virtual-page → physical-page)
in a special CPU cache known as the TLB
(Translation Look-aside Buffer)
- Subsequent references to this same page can be
resolved quickly -- via that cache!
30. 4-way set-associative
- The TLB is implemented as a 4-way
set-associative cache -- it's like a
parallelized version of a Hash Table (with
evictions)
- Due to the locality of reference principle, the
TLB concept generally works well in most common
programming contexts as an efficient speedup of
the page-address table-lookup translation-mechanism
- Modifying CR3 invalidates the TLB cache
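The hash-table analogy can be made concrete with a small software model of a 4-way set-associative cache. This is only a sketch: the number of sets (16) and the round-robin eviction policy are assumptions chosen for illustration, not the actual parameters of any Intel TLB, and all names are hypothetical. The hardware compares all four ways of a set in parallel; the loop here does it sequentially.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TLB_WAYS 4u
#define TLB_SETS 16u   /* assumed size, for illustration only */

typedef struct {
    bool     valid;
    uint32_t vpn;      /* virtual page number   */
    uint32_t pfn;      /* physical frame number */
} tlb_entry;

typedef struct {
    tlb_entry way[TLB_SETS][TLB_WAYS];
    unsigned  next_victim[TLB_SETS];   /* round-robin eviction pointer */
} tlb;

/* Invalidate everything, as happens when CR3 is modified. */
void tlb_flush(tlb *t)
{
    memset(t, 0, sizeof *t);
}

/* The low bits of the page number pick the set (the 'hash'). */
bool tlb_lookup(const tlb *t, uint32_t vpn, uint32_t *pfn)
{
    const tlb_entry *set = t->way[vpn % TLB_SETS];
    for (unsigned w = 0; w < TLB_WAYS; w++) {
        if (set[w].valid && set[w].vpn == vpn) {
            *pfn = set[w].pfn;
            return true;               /* TLB hit */
        }
    }
    return false;                      /* miss: a table walk is needed */
}

/* Record a translation, evicting an older one if the set is full. */
void tlb_insert(tlb *t, uint32_t vpn, uint32_t pfn)
{
    unsigned s = vpn % TLB_SETS;
    tlb_entry *set = t->way[s];
    unsigned slot = TLB_WAYS;
    for (unsigned w = 0; w < TLB_WAYS; w++)      /* already cached? */
        if (set[w].valid && set[w].vpn == vpn) { slot = w; break; }
    for (unsigned w = 0; slot == TLB_WAYS && w < TLB_WAYS; w++)
        if (!set[w].valid) slot = w;             /* free way? */
    if (slot == TLB_WAYS) {                      /* set is full: evict */
        slot = t->next_victim[s];
        t->next_victim[s] = (slot + 1u) % TLB_WAYS;
    }
    set[slot].valid = true;
    set[slot].vpn = vpn;
    set[slot].pfn = pfn;
}
```

In this model, a fifth page that hashes to an already-full set evicts one of the four earlier translations, while pages that hash to other sets are unaffected.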