Title: Processor
2Processor
- Processor Optimizations
- Instruction Pipelining
- Several instructions are worked on simultaneously
- Instruction Parallelism
- Several instructions are executed in parallel
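- Register Renaming
- Remap a register name to a different physical
register to remove false dependencies, without
affecting the program logic
- Speculative Execution
- One or both paths of a branch are executed in
advance, before the branch outcome is known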
3Memory Caches
- An instruction to fetch data from main memory
causes the processor to stall, that is, consume
many cycles doing nothing
- Spatial locality
- Instruction, data
- Temporal locality
- Small loop
- Data typically is moved between cache and main
memory in chunks bigger than a word (a cache line,
e.g. 128 bytes)
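A minimal sketch of why cache lines reward spatial locality, assuming the 128-byte line size quoted above. The helper names (cache_line_of, lines_touched) are hypothetical and exist only for this illustration.

```python
CACHE_LINE_BYTES = 128  # line size quoted on the slide; real processors vary


def cache_line_of(address: int) -> int:
    """Return the index of the cache line that contains a byte address."""
    return address // CACHE_LINE_BYTES


def lines_touched(start: int, length: int) -> int:
    """Count the distinct cache lines covered by `length` bytes at `start`."""
    if length <= 0:
        return 0
    return cache_line_of(start + length - 1) - cache_line_of(start) + 1


# 32 consecutive 4-byte words starting at an aligned address share one line.
sequential = lines_touched(0, 32 * 4)

# 32 words spaced one line apart each land in a different line.
strided = len({cache_line_of(i * CACHE_LINE_BYTES) for i in range(32)})
```

In this model the sequential access pattern needs one transfer from main memory, while the strided pattern needs one transfer per word.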
4Memory Caches
- Cache hit / cache miss
- Write-through / write-behind (delayed write)
- Write hit
- If the same address is written to the cache again
before the old data goes down to main memory, we
get a write hit
- L1 cache / L2 cache
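The difference between write-through and write-behind can be sketched with a toy cache. This is an illustration under simplifying assumptions (a dict stands in for main memory, there is no eviction, and the class and counter names are hypothetical), not a model of any real cache controller.

```python
class ToyCache:
    """Toy cache in front of a dict that stands in for main memory."""

    def __init__(self, write_through: bool):
        self.write_through = write_through
        self.memory = {}        # simulated main memory
        self.pending = {}       # data written to cache but not yet to memory
        self.memory_writes = 0  # number of writes that reached main memory
        self.write_hits = 0     # writes that overwrote still-pending data

    def write(self, address, value):
        if self.write_through:
            # Write-through: every write goes down to memory immediately.
            self.memory[address] = value
            self.memory_writes += 1
            return
        # Write-behind: keep the data in the cache and write it out later.
        if address in self.pending:
            self.write_hits += 1
        self.pending[address] = value

    def flush(self):
        """Write all pending (delayed) data down to main memory."""
        for address, value in self.pending.items():
            self.memory[address] = value
            self.memory_writes += 1
        self.pending.clear()


def run(write_through: bool) -> ToyCache:
    cache = ToyCache(write_through)
    for value in range(10):      # ten writes to the same address
        cache.write(0x1000, value)
    cache.flush()
    return cache


through = run(write_through=True)
behind = run(write_through=False)
```

With ten writes to one address, the write-through cache performs ten memory writes, while the write-behind cache records nine write hits and performs a single memory write at flush time.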
7Memory caches
- Internal buffer
- 0.031KB (32Bytes)
- L1
- 4KB-32KB
- L2
- 512KB-4MB
- Main memory
- 128MB-1GB
8Multiple Processors
- Symmetric MultiProcessor (SMP)
- Typically, multiple processors of an SMP do not
share caches; each has its own private cache
- Cache coherency
- Use bus snooping protocol
- Each cache monitors the memory bus all the time
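A minimal sketch of bus snooping with private caches. The slide does not say which snooping protocol is meant, so this assumes a simple write-invalidate scheme with write-through caches; the Bus and SnoopingCache classes are hypothetical.

```python
class Bus:
    """Shared memory bus: holds main memory and the list of attached caches."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def broadcast_write(self, writer, address):
        # Every other cache sees (snoops) the write on the bus.
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_write(address)


class SnoopingCache:
    """Private per-processor cache that watches the bus for writes."""

    def __init__(self, bus: Bus):
        self.bus = bus
        self.lines = {}
        bus.caches.append(self)

    def read(self, address):
        if address not in self.lines:  # miss: fetch from main memory
            self.lines[address] = self.bus.memory.get(address, 0)
        return self.lines[address]

    def write(self, address, value):
        self.lines[address] = value
        self.bus.memory[address] = value  # write-through, to keep the sketch short
        self.bus.broadcast_write(self, address)

    def snoop_write(self, address):
        self.lines.pop(address, None)  # invalidate the stale copy, if any


bus = Bus()
bus.memory[0x40] = 1
cpu0, cpu1 = SnoopingCache(bus), SnoopingCache(bus)

cpu1.read(0x40)                        # cpu1 now caches the old value
cpu0.write(0x40, 2)                    # cpu1 snoops this write
stale_copy_dropped = 0x40 not in cpu1.lines
value_seen_by_cpu1 = cpu1.read(0x40)   # miss, refetched from memory
```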
11Purpose of the locks
- To allow writes to be atomic and serialized (as
required by POSIX)
- To provide critical sections in code, where
multiple clients might want to modify the same
metadata (e.g. file creation and deletion)
- To provide coherency of cached data
- When reading data from disk into cache, such that
other readers may also do the same, but other
writers are prevented from modifying parts of the
data as it is being read
- When writing data from cache to disk, such that
other writers and readers cannot access
inconsistent data (as it is being written)
12Lock modes and their compatibility
13Relationship between Frangipani, the lock clerk
and the distributed lock manager
(Diagram: Nodes A, B and C exchange token
request / revoke / release messages with the DLM)
14Lock Tables, Locks and Lock Groups
(Diagram: a DLS_Context contains Lock Tables 1, 2
and 3 in a user-specified hierarchy; locks are
mapped to a lock group at creation by hashing the
name; lock groups are assigned to Servers 1, 2 and
3, and the mapping is reassigned whenever the
number of servers changes)
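The two mappings named in the diagram (lock name to lock group by hashing, lock group to server) can be sketched as follows. The number of groups, the CRC32 hash and the round-robin assignment are all assumptions made for illustration; the slides do not specify them.

```python
import zlib

NUM_GROUPS = 16  # hypothetical; the slides do not give the number of lock groups


def lock_group(lock_name: str) -> int:
    """Map a lock name to a lock group by hashing it (fixed at creation)."""
    # crc32 is used because it is stable across runs, unlike Python's hash().
    return zlib.crc32(lock_name.encode("utf-8")) % NUM_GROUPS


def assign_groups(servers):
    """Assign every lock group to a live server (round-robin assumption)."""
    return {group: servers[group % len(servers)] for group in range(NUM_GROUPS)}


group = lock_group("tableA/inode42")   # hypothetical lock name

plan_before = assign_groups(["server1", "server2", "server3"])
plan_after = assign_groups(["server2", "server3"])   # server1 has left
```

The lock keeps its group for its whole lifetime; only the group-to-server plan is recomputed when the set of servers changes.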
15Sequence of events
(Timeline diagram, t1 to t10: R1, R2 and R3 each
hold or request a Read token; W1 requests a Write
token; the DLM revokes the Read tokens, the readers
invalidate their caches and release the tokens, and
the DLM grants the Write token; W1 writes data to
disk and its Write token is downgraded to Read; the
DLM grants Read tokens again and the readers
refresh their caches)
- Scenario: 3 readers (R1, R2, R3) and 1 writer (W1)
- Readers hold Read locks while cached data is
valid
- Cache is revalidated on grant after revoke
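The reader/writer scenario can be approximated with a toy lock manager for a single resource. This is a sketch under assumptions: the class, its method names and the log messages are hypothetical, message passing is replaced by direct calls, and competing writers are not handled.

```python
class ToyLockManager:
    """Toy DLM for one resource: many Read tokens or one Write token."""

    def __init__(self):
        self.readers = set()
        self.writer = None
        self.log = []

    def request_read(self, client):
        if self.writer is not None:
            # The writer flushes its data and its token is downgraded to Read.
            self.log.append(f"{self.writer} writes data to disk")
            self.log.append(f"downgrade {self.writer} to Read")
            self.readers.add(self.writer)
            self.writer = None
        self.readers.add(client)
        self.log.append(f"grant Read to {client}; {client} refreshes cache")

    def request_write(self, client):
        # All Read tokens are revoked before the Write token is granted.
        for reader in sorted(self.readers - {client}):
            self.log.append(f"revoke Read from {reader}; {reader} invalidates cache")
        self.readers.clear()
        self.writer = client
        self.log.append(f"grant Write to {client}")


dlm = ToyLockManager()
for reader in ("R1", "R2", "R3"):
    dlm.request_read(reader)
dlm.request_write("W1")   # revokes the three Read tokens
dlm.request_read("R1")    # forces W1 to flush and downgrade

revocations = sum(1 for entry in dlm.log if entry.startswith("revoke"))
```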
16Failure handling
- DLM is designed to withstand all single points of
failure
- Alive state of each member (clients or servers)
is ascertained through pings
- Clients pinged every 5 seconds
- Servers pinged every 30 seconds
- Lock mastering distribution checked every 10
seconds
- Servers recover by querying all clients for
granted locks, and rebuilding the lock states
- Clients recover when a server picks an alternate
client to clean up the failed client's locks
17Pinging servers to detect state changes and Group
Ownership Adjustments
(Timeline diagram, t1 to t7, for DLM1, DLM2 and
DLM3: servers are pinged every 30 sec ("Ping
Server") and group ownership is checked every 10
sec ("Adjust Groups?"), normally with "No Change";
after DLM1 fails, failover on DLM2 and DLM3 runs a
Paxos operation that marks DLM1 failed and
publishes the DLM1 failure; a later adjustment
publishes a new group plan through a Paxos
operation: lost groups VOID, new groups RVYG)
18Lock Recovery on Server failure
(Timeline diagram, t1 to t6: DLC1 holds Share 1
token at Grp5, DLC2 holds Share 1 token at Grp7,
DLC3 holds Excl 2 token at Grp9; the failover
thread detects the departure of DLM1 and a Paxos
operation reassigns the groups, Grp 5 to DLM2 and
Grp 7/9 to DLM3; the reassigned groups enter the
Recovering state, each client is told the new
servers and is asked for its list of tokens in
Grp 5 and Grp 7/9, and the recovery thread then
sets the group states to Ready)
- Scenario: 3 clients, 3 servers
- Clients hold tokens in Grps 5 9
- DLM1 serves Grp5, Grp9
- Lock Server DLM1 dies
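Server-side recovery, where the new owner of a group rebuilds its lock state by asking every client which tokens it holds, might be sketched like this. The data layout (tuples of group, lock name and mode) and the lock names are hypothetical; only the Recovering and Ready states and the client and group names come from the slides.

```python
def recover_groups(reassigned_groups, clients):
    """Rebuild lock state for reassigned groups from what the clients report.

    `clients` maps a client name to the tokens it holds, each token being a
    (group, lock_name, mode) tuple. This layout is an assumption.
    """
    state = {group: "Recovering" for group in reassigned_groups}
    lock_table = {group: [] for group in reassigned_groups}

    # "Get list of tokens": query every client for its granted tokens.
    for client, tokens in clients.items():
        for group, lock_name, mode in tokens:
            if group in lock_table:
                lock_table[group].append((lock_name, mode, client))

    # Once every client has answered, the groups can serve requests again.
    for group in reassigned_groups:
        state[group] = "Ready"
    return state, lock_table


clients = {
    "DLC1": [(5, "lockA", "Share")],
    "DLC2": [(7, "lockB", "Share")],
    "DLC3": [(9, "lockC", "Excl")],
}

# DLM2 takes over group 5; DLM3 takes over groups 7 and 9.
dlm2_state, dlm2_table = recover_groups([5], clients)
dlm3_state, dlm3_table = recover_groups([7, 9], clients)
```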
19Lock Recovery on Client failure
(Timeline diagram, t1 to t5: DLC1 holds table A
tokens at Grp5, DLC2 holds table B tokens at Grp5,
DLC3 holds table C tokens at Grp9; DLM1 and DLM2
each detect the DLC1 failure, unregister DLC1 as a
client and, for each table, find a client (e.g.
DLC2) in the same group as DLC1; one request,
"Recover tbl A for DLC1", proceeds, and the
duplicate is answered with "Error! Already
recovering")
20Memory Addresses
- Logical address
- Included in the machine language instructions to
specify the address of an operand or of an
instruction
- Each logical address consists of a segment and an
offset that denotes the distance from the start
of the segment to the actual address
- Linear address
- A single 32-bit unsigned integer that can be used
to address up to 4GB (0x00000000 to 0xffffffff)
- Physical address
- Used to address memory cells in memory chips
21Linear Address
(Diagram: Logical Address -> Segmentation Unit ->
Linear Address -> Paging Unit -> Physical Address)
22Segmentation in Hardware
- A logical address consists of two parts
- The segment identifier is a 16-bit field called
the Segment Selector
- The offset is a 32-bit field
- The processor provides segmentation registers
- cs
- The code segment register, which points to a
segment containing program instructions
- ss
- The stack segment register, which points to a
segment containing the current program stack
- ds
- The data segment register, which points to a
segment containing static and external data
23Segment Descriptors
- 8-byte Segment Descriptor that describes the
segment characteristics
- Stored either in the Global Descriptor Table (GDT)
or in the Local Descriptor Table (LDT)
- Usually only one GDT is defined, while each
process is permitted to have its own LDT if it
needs to create additional segments
- The address of the GDT in main memory is
contained in the gdtr processor register and the
address of the currently used LDT is contained in
the ldtr processor register
24Segment Descriptors
- 32-bit base field that contains the linear
address of the first byte of the segment
- G granularity flag. If it is cleared, the segment
size is expressed in bytes; otherwise, it is
expressed in multiples of 4096 bytes
- 20-bit limit field that denotes the segment
length, in bytes or in 4096-byte units depending
on the G flag
- When G is set to 0, the size of a non-null
segment may vary between 1 byte and 1MB; otherwise,
it may vary between 4KB and 4GB
- An S system flag. If it is cleared, the segment
is a system segment that stores kernel data
structures; otherwise, it is a normal code or
data segment
- 4-bit Type field that characterizes the segment
type and its access rights
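The size ranges above follow from the 20-bit limit and the G flag. A small sketch, assuming the usual 80x86 convention that the limit field holds the segment size minus one (the function name is hypothetical):

```python
def segment_size(limit: int, g_flag: int) -> int:
    """Return the segment size in bytes for a 20-bit limit and the G flag."""
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    unit = 4096 if g_flag else 1   # G set: limit counts 4096-byte units
    return (limit + 1) * unit


smallest_byte_granular = segment_size(0x00000, 0)   # 1 byte
largest_byte_granular = segment_size(0xFFFFF, 0)    # 1 MB
smallest_page_granular = segment_size(0x00000, 1)   # 4 KB
largest_page_granular = segment_size(0xFFFFF, 1)    # 4 GB
```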
25Segment Descriptors
- 4-bit Type field
- Code Segment Descriptor
- Indicates that the segment descriptor refers to a
code segment
- Data Segment Descriptor
- Indicates that the segment descriptor refers to a
data segment
- Task State Segment Descriptor
- Indicates that the segment descriptor refers to a
task state segment (TSS), a segment used to save
the contents of the processor registers
- Local Descriptor Table Descriptor
- Indicates that the segment descriptor refers to a
segment containing an LDT
26Segment Descriptors
- DPL (Descriptor Privilege Level)
- Used to restrict accesses to the segment. It
represents the minimal CPU privilege level
required for accessing the segment
- A segment with its DPL set to 0 is accessible
only in Kernel Mode
- A segment with its DPL set to 3 is accessible in
User Mode as well
- Segment-Present flag
- Equal to 0 if the segment is currently not stored
in main memory
- Linux always sets this field to 1, since it never
swaps out whole segments to disk
- An additional flag called D or B
- Used differently, depending on whether the
segment contains code or data
- Set to 1 if the addresses used as segment offsets
are 32 bits long, and cleared if they are 16
bits long
27Segmentation Unit
(Diagram: the Segment Selector held in a
segmentation register selects a Segment Descriptor
in the Descriptor Table, which is copied into the
matching nonprogrammable register)
28Segmentation Unit
- The 80x86 processor provides a nonprogrammable
register for each segmentation register to speed
up the translation of logical addresses into
linear addresses
- Each nonprogrammable register contains the 8-byte
segment descriptor specified by the segment
selector
- Every time a segment selector is loaded in a
segmentation register, the corresponding segment
descriptor is loaded from memory into the
matching nonprogrammable CPU register
- Translations of logical addresses referring to
that segment can be performed without accessing
the GDT or LDT stored in main memory
- Accesses to the GDT or LDT are necessary only
when the contents of the segmentation register
change
29Segment Descriptor
- If the GDT is at 0x00020000 (the value stored in
the gdtr register) and the index specified by the
segment selector is 2,
- the address of the corresponding segment
descriptor is
- 0x00020000 + (2 x 8) = 0x00020010
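The same arithmetic as a short sketch; the function name is hypothetical, and the 8-byte descriptor size is the one stated on the Segment Descriptors slides.

```python
DESCRIPTOR_SIZE = 8  # bytes per segment descriptor


def descriptor_address(table_base: int, index: int) -> int:
    """Address of descriptor number `index` in a table starting at `table_base`."""
    return table_base + index * DESCRIPTOR_SIZE


address = descriptor_address(0x00020000, 2)
address_hex = f"0x{address:08x}"
```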
30Segment Selector
- 13-bit index that identifies the corresponding
segment descriptor entry contained in the GDT or
in the LDT
- TI (Table Indicator) flag that specifies whether
the Segment Descriptor is included in the
GDT (TI = 0) or in the LDT (TI = 1)
- RPL (Requestor Privilege Level) 2-bit field,
which is precisely the Current Privilege Level of
the CPU when the corresponding Segment Selector
is loaded into the cs register
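A sketch of how the three fields can be pulled out of a 16-bit selector, using the standard 80x86 layout (index in bits 15 to 3, TI in bit 2, RPL in bits 1 to 0). The function name and the example selector values are chosen for illustration only.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its index, TI and RPL fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    return {
        "index": selector >> 3,       # 13-bit descriptor index
        "ti": (selector >> 2) & 1,    # 0 = GDT, 1 = LDT
        "rpl": selector & 0b11,       # requestor privilege level
    }


kernel_like = decode_selector(0x0010)   # index 2, GDT, RPL 0
user_like = decode_selector(0x002B)     # index 5, GDT, RPL 3
```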
31Translating a logical address
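(Diagram: the logical address is a Selector (Index,
TI) plus an Offset; gdtr/ldtr locates the gdt/ldt,
the Index picks the Descriptor, and the descriptor
base plus the Offset gives the linear address)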
32Translating a logical address
- Examines the TI field of the segment selector to
determine which descriptor table stores the
segment descriptor
- Computes the address of the segment descriptor
from the index field of the segment selector. The
index field is multiplied by 8 (the size of a
segment descriptor), and the result is added to
the content of the gdtr or ldtr register
- Adds the offset of the logical address to the
base field of the segment descriptor, thus
obtaining the linear address
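The three steps above can be sketched as one function. This is a simplified model, not the hardware algorithm in full: the toy memory maps a descriptor address directly to that descriptor's base field, limit and privilege checks are omitted, and the function name and sample addresses are hypothetical.

```python
DESCRIPTOR_SIZE = 8


def translate(selector: int, offset: int, gdtr: int, ldtr: int, memory: dict) -> int:
    """Translate a logical address (selector, offset) into a linear address."""
    # Step 1: the TI flag (bit 2) selects the GDT or the LDT.
    table_base = ldtr if (selector >> 2) & 1 else gdtr

    # Step 2: index * 8 added to the table base gives the descriptor address.
    descriptor_addr = table_base + (selector >> 3) * DESCRIPTOR_SIZE

    # Toy memory: only the base field of each descriptor is stored.
    base = memory[descriptor_addr]

    # Step 3: base + offset is the 32-bit linear address.
    return (base + offset) & 0xFFFFFFFF


toy_memory = {
    0x00020010: 0x00000000,   # GDT entry 2, base 0
    0x00030008: 0x10000000,   # LDT entry 1, base 0x10000000
}

via_gdt = translate(0x0010, 0x1234, gdtr=0x00020000, ldtr=0x00030000, memory=toy_memory)
via_ldt = translate(0x000C, 0x0020, gdtr=0x00020000, ldtr=0x00030000, memory=toy_memory)
```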
33Segmentation in Linux
- Linux uses segmentation in a very limited way
- Memory management is simpler when all processes
use the same segment register values, that is,
when they share the same set of linear addresses
- One of the design objectives of Linux is
portability to a wide range of architectures;
RISC architectures in particular have limited
support for segmentation
- Linux uses segmentation only when required by the
80x86 architecture; all processes use the same
logical addresses, so Linux stores all segment
descriptors in the Global Descriptor Table (GDT)
- GDT is implemented by the array gdt_table
referred to by the gdt variable
34Segmentation in Linux
- A kernel code segment
- Base = 0x00000000
- Limit = 0xfffff
- G (granularity flag) = 1, for segment size
expressed in pages
- S (system flag) = 1, for normal code or data
segment
- Type = 0xa, for code segment that can be read and
executed
- DPL (Descriptor Privilege Level) = 0, for Kernel
Mode
- D/B (32-bit address flag) = 1, for 32-bit offset
addresses
35Segmentation in Linux
- A kernel data segment
- Base = 0x00000000
- Limit = 0xfffff
- G (granularity flag) = 1, for segment size
expressed in pages
- S (system flag) = 1, for normal code or data
segment
- Type = 2, for data segment that can be read and
written
- DPL (Descriptor Privilege Level) = 0, for Kernel
Mode
- D/B (32-bit address flag) = 1, for 32-bit offset
addresses
36Segmentation in LINUX
- A user code segment shared by all processes in
User Mode
- Base = 0x00000000
- Limit = 0xfffff
- G (granularity flag) = 1, for segment size
expressed in pages
- S (system flag) = 1, for normal code or data
segment
- Type = 0xa, for code segment that can be read and
executed
- DPL (Descriptor Privilege Level) = 3, for User Mode
- D/B (32-bit address flag) = 1, for 32-bit offset
addresses
37Segmentation in Linux
- A user data segment shared by all processes in
User Mode
- Base = 0x00000000
- Limit = 0xfffff
- G (granularity flag) = 1, for segment size
expressed in pages
- S (system flag) = 1, for normal code or data
segment
- Type = 2, for data segment that can be read and
written
- DPL (Descriptor Privilege Level) = 3, for User Mode
- D/B (32-bit address flag) = 1, for 32-bit offset
addresses
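As an illustration, the four segments listed on these slides can be packed into 8-byte descriptors using the standard 80x86 descriptor layout. The pack_descriptor helper is hypothetical; the Segment-Present flag is taken as 1, as stated on the DPL slide, and the remaining descriptor bits are assumed to be 0.

```python
def pack_descriptor(base, limit, g, s, type_, dpl, db, present=1):
    """Pack segment fields into a 64-bit 80x86 segment descriptor value."""
    low = ((base & 0xFFFF) << 16) | (limit & 0xFFFF)
    high = (
        (base & 0xFF000000)        # base bits 31..24
        | (g << 23)                # granularity flag
        | (db << 22)               # D/B flag
        | (limit & 0xF0000)        # limit bits 19..16
        | (present << 15)          # Segment-Present flag
        | (dpl << 13)              # descriptor privilege level
        | (s << 12)                # system flag
        | (type_ << 8)             # 4-bit type
        | ((base >> 16) & 0xFF)    # base bits 23..16
    )
    return (high << 32) | low


# The four flat segments: base 0, limit 0xfffff, G = 1, S = 1, D/B = 1.
kernel_code = pack_descriptor(0, 0xFFFFF, g=1, s=1, type_=0xA, dpl=0, db=1)
kernel_data = pack_descriptor(0, 0xFFFFF, g=1, s=1, type_=0x2, dpl=0, db=1)
user_code = pack_descriptor(0, 0xFFFFF, g=1, s=1, type_=0xA, dpl=3, db=1)
user_data = pack_descriptor(0, 0xFFFFF, g=1, s=1, type_=0x2, dpl=3, db=1)

# With base 0 and a page-granular limit of 0xfffff, each segment spans 4GB,
# so a linear address equals the offset of the logical address.
segment_bytes = (0xFFFFF + 1) * 4096
```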
38Segmentation in Linux
- A Task State Segment (TSS) for each processor.
All the Task State Segments are sequentially
stored in the init_tss array
- Base field of the TSS descriptor for the nth CPU
points to the nth component of the init_tss array
- G (granularity) flag is cleared
- Limit field is set to 0xeb, since the TSS segment
is 236 bytes
- Type field is set to 9 or 11
- DPL is set to 0, since processes in User Mode are
not allowed to access TSS segments