Kein Folientitel - PowerPoint PPT Presentation

Title: Kein Folientitel. Author: Udi. Last modified by: brinks. Created date: 7/4/2002 8:33:44 PM. Document presentation format: Bildschirmpräsentation (on-screen show).

Slides: 57
Provided by: udi7

Transcript and Presenter's Notes


1
Computer Architecture Slide Sets, WS 2011/2012
Prof. Dr. Uwe Brinkschulte, Prof. Dr. Klaus Waldschmidt
Part 7: Instruction Set Architecture (ISA)
2
Programming model
The Instruction Set Architecture (ISA) is the
programming model needed to program a processor.
The ISA leaves out all details of how the
processor is implemented. It can therefore be
regarded as an abstract interface between the
compiler and the microarchitecture of the processor.
3
Programming model
  • The following key questions lead us to the
    specification of this interface:
  • How is data represented?
  • Where is data stored?
  • How is data accessed?
  • How are instructions coded?
  • Which instructions are available to process
    data?

4
Programming model
  • Therefore, the ISA defines:
  • machine data types
  • address space organisation
  • register model
  • addressing modes
  • machine instruction set

5
Programming model
Since the programming model abstracts from
implementation details, it can be realized either
in hardware (real processors) or in software
(virtual processors). The model still constrains
the implementation: for instance, if the
instruction set includes an instruction for
multiplication, the CPU needs a digital
combinational circuit for multiplication.
In this sense, a relation between the abstract
ISA and the microarchitecture exists.
6
Machine data types
  • A data type is a tuple of values and operations
    which can be performed on these values.
  • The operations are implemented by the machine
    instructions.
  • Machine data types (like data types in high level
    languages) are classified into structured and
    unstructured data types.
  • The primitive data types form an additional class.

7
Primitive machine data types
  • Bit: value set {0, 1}; operations: AND, OR, XOR,
    negation, compare
  • Byte: value set: bit pattern (8 bit); normally
    the smallest addressable unit; operations: same
    as for bit, additionally ADD, SUB, MUL, DIV,
    SHIFT, ROTATE, ...
  • Word: value set: normally a multiple of bytes;
    largest addressable unit (in a single
    operation); operations: same as for byte
  • (Sometimes the following convention is used:
    half-word = 16 bit, word = 32 bit,
    double word = 64 bit)

8
Examples for more data types
  • bit vector: bits n-1 ... i ... 0, n = 8, 16, 32
  • BCD number (binary coded decimal): 2 digits in
    8 bit, 4 digits in 16 bit, 8 digits in 32 bit
  • unsigned binary number: bits n-1 (MSB) ... 0 (LSB),
    n = 8, 16, 32
  • two's complement number: bits n-1 (MSB = sign bit)
    ... 0 (LSB), n = 8, 16, 32
  • floating point number: bit 31 sign s, bits 30-23
    biased exponent, bits 22-0 fraction
  • string: sequence of elements with bits n-1 ... 0
    each, n = 8, 16, 32
(taken from MC680x0)
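The field layouts listed on this slide can be tried out with a minimal sketch. It assumes the floating point format is IEEE 754 single precision (1 sign bit, 8 exponent bits, 23 fraction bits, which matches the bit positions on the slide); the function names are hypothetical.

```python
import struct


def decode_float32(value):
    """Split a number into sign, biased exponent and fraction fields.

    Assumption: IEEE 754 single precision encoding.
    """
    # Reinterpret the 32 bit encoding as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                      # bit 31
    biased_exponent = (bits >> 23) & 0xFF  # bits 30-23
    fraction = bits & 0x7FFFFF             # bits 22-0
    return sign, biased_exponent, fraction


def to_twos_complement(value, n):
    """Encode a signed integer as an n bit two's complement pattern."""
    return value & ((1 << n) - 1)
```

For example, decode_float32(1.0) yields sign 0, biased exponent 127 and fraction 0, and to_twos_complement(-1, 8) yields the bit pattern 0xFF.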
9
Address space organisation
Physical organisation: depends on the processor
[Figure: physical memory words at physical addresses 0 ... n; bits 7-0 on an 8 bit processor, bits 15-0 on a 16 bit processor, bits 31-0 on a 32 bit processor]
n: physical address, n = 2^(address bus width)
10
Address space organisation
Logical organisation: byte-oriented access for
most processor types
[Figure: bytes (bits 7-0) at logical addresses 0, 1, 2, 3, ... m; one byte forms a physical word on an 8 bit processor, two bytes on a 16 bit processor, four bytes on a 32 bit processor]
m: logical address, m = n · bit width / 8
11
Address space organisation
Physical to logical mapping
[Figure: 32 bit physical words (bits 31-24, 23-16, 15-8, 7-0) at physical addresses 0 ... n; each physical word holds four consecutive logical (byte) addresses: 0-3, 4-7, 8-11, 12-15, ..., m-3 to m]
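The mapping between logical (byte) addresses and physical word addresses can be sketched as follows. This is an illustration only; the function name is hypothetical, and which bit lane of the word a byte position corresponds to depends on the byte order.

```python
def locate_byte(logical_address, word_bits=32):
    """Return (physical word address, byte position inside that word)."""
    bytes_per_word = word_bits // 8
    physical_address = logical_address // bytes_per_word
    byte_in_word = logical_address % bytes_per_word
    return physical_address, byte_in_word
```

On a 32 bit processor, logical address 13 lies in physical word 3; on a 16 bit processor the same byte lies in physical word 6.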
12
Address space organisation
Aligned access: the accessed word is aligned
according to its length in the physical
address space:
(logical address mod length) = 0
  • bytes to byte boundaries
  • half-words to half-word boundaries
  • words to word boundaries
[Figure: a 32 bit physical word (bits 31-24, 23-16, 15-8, 7-0) holding four aligned bytes, two aligned half-words or one aligned word]
13
Address space organisation
Unaligned (misaligned) access: the accessed word
is not aligned according to its length in
the physical address space:
(logical address mod length) ≠ 0
[Figure: a half-word and a word that each straddle the boundary between two 32 bit physical words]
Some processors do not support unaligned access
(e.g. SPARC)
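The alignment condition can be checked directly. The following sketch also counts how many physical words an access touches, assuming 4 byte (32 bit) physical words; both function names are hypothetical.

```python
def is_aligned(logical_address, length):
    """Aligned access: (logical address mod length) = 0."""
    return logical_address % length == 0


def words_touched(logical_address, length, word_bytes=4):
    """Number of physical words covered by an access of `length` bytes."""
    first = logical_address // word_bytes
    last = (logical_address + length - 1) // word_bytes
    return last - first + 1
```

A 4 byte word at logical address 6 is unaligned and spans two physical words, so a processor that supports such an access needs two memory accesses for it.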
14
Byte order in words
Two different formats:
big endian byte ordering
  • 8 bit byte: bits 7-0 at address N
  • 16 bit word: bits 15-8 at N, bits 7-0 at N+1
  • 32 bit word: bits 31-24 at N, bits 23-16 at N+1,
    bits 15-8 at N+2, bits 7-0 at N+3
Word address is the address of the most
significant byte (used e.g. in MC680x0 or SPARC)
little endian byte ordering
  • 8 bit byte: bits 7-0 at address N
  • 16 bit word: bits 7-0 at N, bits 15-8 at N+1
  • 32 bit word: bits 7-0 at N, bits 15-8 at N+1,
    bits 23-16 at N+2, bits 31-24 at N+3
Word address is the address of the least
significant byte (used e.g. in Pentium family)
N: least significant byte, N+3: most
significant byte
15
Byte order in words
Logical (byte-oriented) memory organization of a
32 bit word with bytes b0 (least significant) to
b3 (most significant)
big endian byte ordering
  • b3 at byte address N, b2 at N+1, b1 at N+2,
    b0 at N+3
little endian byte ordering
  • b0 at byte address N, b1 at N+1, b2 at N+2,
    b3 at N+3
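The two byte orders can be compared with a small sketch that lists the bytes of a 32 bit word in the order of ascending byte addresses N, N+1, N+2, N+3. The function names are hypothetical.

```python
def store_word(value, byte_order):
    """Bytes of a 32 bit word at addresses N, N+1, N+2, N+3.

    byte_order is "big" or "little".
    """
    return list(value.to_bytes(4, byteorder=byte_order))


def load_word(stored_bytes, byte_order):
    """Rebuild the 32 bit word from the bytes at N ... N+3."""
    return int.from_bytes(bytes(stored_bytes), byteorder=byte_order)
```

Storing 0x0A0B0C0D in big endian order places 0x0A (the most significant byte) at N; in little endian order 0x0D (the least significant byte) is at N.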
16
Register model
  • The number of registers in a processor varies
    between 20 and 200. The advantages of storing
    data in registers rather than in DRAM or SRAM
    memories are:
  • faster access time
  • register addresses are shorter than memory
    addresses, which saves bits in the instruction
    format.
  • An ISA is called Load-Store-ISA if all machine
    instructions except register load and store
    instructions operate on the register file only.

17

Register model
  • Registers are classified into hidden registers
    and programmer visible registers.
  • The visible registers are the workplace of the
    programmer and are often organized as register
    files.
  • Hidden registers are auxiliary registers needed
    for the internal functionality of the processing
    unit (CPU).
  • Both visible and hidden registers serve various
    purposes and functions.

18

Register model
  • A register model defines which processor
    registers are visible (addressable) to the
    programmer.
  • Usually these are the working registers and the
    state register.
  • The state register monitors the state of the
    processor through condition flags.
  • It shows, for example, whether the processor
    operates in system or user mode.
  • The state register is mostly read-only.
  • Commonly existing hidden registers are the
    instruction register and the memory interface
    registers.

19
Register implementation
[Figure: 32 bit register built from D-latches with inputs D0 ... D31, outputs Q0 ... Q31 and a common clock clk, shown together with its block symbol]
[Figure: asynchronous counter built from four D-latches with outputs Q0 ... Q3, driven by clk]
20

Common visible registers
  • program counter (PC) - contains the next
    instruction address
  • state register (SR) - monitors the state of the
    processor
  • stack pointer (SP) - holds the address of the
    top of the stack
  • accumulator (ACCU) - stores computation results
    (in older or simple processors)
  • data registers (DXi) - storing operands for
    computations
  • address registers (AXi) - storing operand
    addresses
  • general purpose registers (GPi) - storing either
    operands or operand addresses

21

Common hidden registers
  • instruction register (IR) - contains the
    currently processed instruction
  • instruction queue (IQ) - contains the next
    instructions to be processed
  • memory address register (MAR) - buffers the
    address of a memory access (e.g. to save or load
    a general purpose register)
  • memory data register (MDR) - buffers the content
    of a memory access (e.g. to save or load a
    general purpose register)

22

Program counter register
  • Pointer to the next instruction to be executed
  • Normally incremented
  • Set by a jump, jump subroutine, interrupt, return
    or return from interrupt instruction

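[Figure: 32 bit program counter (bits 31 ... 0) pointing into memory at instruction addresses N-4, N, N+4, ..., M; an Add instruction with operands A and B and a Jump instruction with target M are shown]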


23

Stackpointer register
  • Addresses a location in the memory which is
    organized as a stack (LIFO).
  • Elements can be pushed (write) and popped (read)
    only from the top of the stack.
  • Consequence: data is stored in sequential
    order.
  • Used e.g. for jump subroutine/return operations
    on the PC

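[Figure: 32 bit stack pointer (bits 31 ... 0) pointing into memory at addresses N-4, N, N+4; Push writes a value X to the stack, Pop reads it back]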

Some processors distinguish between user
stackpointer (e.g. for jump subroutine/return)
and supervisor stackpointer (e.g. for
interrupt/return from interrupt)
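Push, pop and their use for jump subroutine/return can be sketched as follows. The sketch assumes 4 byte stack entries, a stack that grows towards lower addresses and 4 byte instructions; the slides do not fix these details, and the class and function names are hypothetical.

```python
class Stack:
    """LIFO memory region addressed through a stack pointer (SP)."""

    def __init__(self, top_address):
        self.sp = top_address  # assumed: stack grows downwards
        self.memory = {}       # address -> stored value

    def push(self, value):
        self.sp -= 4
        self.memory[self.sp] = value

    def pop(self):
        value = self.memory.pop(self.sp)
        self.sp += 4
        return value


def jump_subroutine(stack, pc, target):
    """Save the return address on the stack and return the new PC."""
    stack.push(pc + 4)  # assumed: 4 byte instructions
    return target


def return_from_subroutine(stack):
    """Restore the PC from the return address on the stack."""
    return stack.pop()
```

Nested calls push several return addresses; because the stack is LIFO, the returns pop them in reverse order.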
24

Sample CISC register set
Intel Pentium
25

Sample RISC register set
Power PC (extract)
  • The register file of a RISC processor has to be
    much bigger than that of a CISC processor.
  • A RISC needs more registers because the register
    file is source and destination of all arithmetic
    and logic instructions.

26
Multiple register sets
27
Multiple register sets
Processors with multiple register sets: a step
towards multithreaded processors
  • Processor with multiple register sets
      • Each register set can store the program counter
        (PC) and the state register (SR)
      • PC and SR exist only once
      • => several contexts can be stored, fast
        context switching
  • Multithreaded processor
      • multiple PCs and SRs exist
      • instructions from several threads can be
        executed at the same time in the pipeline
      • => several contexts can be processed

28
Multiple overlapping register sets, register
windows
  • The registers of a register file are grouped into
    blocks called windows.
  • These overlapping windows are used by the
    subroutines of a program.
  • MORS (multiple overlapping register set)

[Figure: overall register set divided into register windows 1, 2, 3, ..., n; jump subroutine moves to the next window, return moves back to the previous one]
29
Multiple overlapping register sets, register
windows
  • Simplifies parameter passing on jumping to
    subroutines
  • Each subroutine has its own working space within
    the register file
  • Parameters can be directly passed with no need to
    copy registers or pass parameters by memory
  • => mainly used in RISC processors
  • Two possible approaches:
  • Fixed size register window
  • Variable size register window

30
Fixed size window
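[Figure: overlapping fixed size register windows, based on SPARC architecture. Each window consists of in registers (r31-r24), local registers (r23-r16) and out registers (r15-r8); the global registers r7-r0 are shared by all windows. The out registers of the preceding window (i+1) are the in registers of the current window (i), which is selected by the CWP; the out registers of the current window are the in registers of the succeeding window (i-1). save and restore switch between windows.]
Alternative register naming: r31 = i7 ... r24 = i0, r23 = l7 ... r16 = l0, r15 = o7 ... r8 = o0, r7 = g7 ... r0 = g0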
31
Fixed size window
  • In case of the SPARC architecture, a window
    consists of 32 registers of which the first 8
    also belong to the preceding window and the last
    8 also belong to the succeeding window.
  • The registers are addressed relative to the
    current window pointer (CWP).
  • A subroutine call is performed by incrementing
    the CWP and saving the PC.
  • The parameters are passed through the overlapping
    registers of the two windows.
  • The content of the program counter is saved
    (return address) into one of these registers.
  • A time consuming save and reload of registers is
    omitted.
  • In case of an overflow of the MORS the window
    contents have to be saved to a stack.
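How overlapping windows let caller and callee share registers can be sketched by mapping a visible register number to a physical register. This is a simplified model, not the SPARC implementation: it assumes 8 windows arranged in a ring and follows the slide text in incrementing the CWP on a call (the direction of the step is a convention); the names N_WINDOWS and physical_register are hypothetical.

```python
N_WINDOWS = 8  # assumed number of windows in the register file


def physical_register(cwp, r):
    """Map visible register r (0-31) of window cwp to a physical register."""
    if not 0 <= r <= 31:
        raise ValueError("register number out of range")
    if r < 8:
        return ("global", r)   # r0-r7 are shared by all windows
    if r >= 24:
        slot = r - 24          # in registers:    slots 0-7
    elif r >= 16:
        slot = r - 8           # local registers: slots 8-15
    else:
        slot = r + 8           # out registers:   slots 16-23
    # Each window starts 16 registers after the previous one, so the
    # out registers of one window are the in registers of the next.
    return ("windowed", (16 * cwp + slot) % (16 * N_WINDOWS))
```

With this mapping, the caller's out register r8 in window 0 and the callee's in register r24 in window 1 are the same physical register, so a parameter written by the caller is visible to the callee without any copy.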

32
Variable size window
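[Figure: variable size register windows, based on AMD 29000 architecture. Global registers and local registers form separate groups. Within the local registers, the previous RSP marks the preceding window (Local, Out) and the current RSP marks the current window (In, Local, Out); registers inside a window are addressed as r0, r1, ... relative to the RSP.]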
33
Register size of processors with 3-address
architecture

processor/architecture (vendor) | GPRs overall | GPRs directly accessible | register width | register address width | immediate operand width | instruction width
Alpha 21364 (Compaq) | 32 | 32 | 64 Bit | 5 Bit | 8 Bit | 32 Bit
Am29000 (AMD) | 192 | 192 | 32 Bit | 8 Bit | 8 Bit | 32 Bit
ARM7TDMI (ARM) | 16 | 16 | 32 Bit | 4 Bit | 8 Bit | 32 Bit
Crusoe TM5800 (Transmeta) | 64 | 64 | 32 Bit | 6 Bit | - | -
PA-8700 (HP) | 32 | 32 | 64 Bit | 5 Bit | 11 Bit | 32 Bit
Itanium 2 (Intel, HP) | 128 | 128 | 64 Bit | 7 Bit | 8 Bit | 41 Bit
MC88100 (Motorola) | 32 | 32 | 32 Bit | 5 Bit | 16 Bit | 32 Bit
MIPS64 20Kc (MIPS) | 32 | 32 | 64 Bit | 5 Bit | 16 Bit | 32 Bit
Nemesis C (TU Berlin) | 96 | 16 | 32 Bit | 4 Bit | 1 Bit | 16 Bit
PowerPC 970 (IBM) | 32 | 32 | 64 Bit | 5 Bit | 16 Bit | 32 Bit
UltraSPARC III Cu (SUN) | 160 | 32 | 64 Bit | 5 Bit | 13 Bit | 32 Bit
(GPRs: general purpose registers)
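The register address widths in the table follow from the number of directly accessible registers: addressing k registers takes the smallest number of bits b with 2^b ≥ k. A minimal sketch (the function name is hypothetical):

```python
def register_address_bits(directly_accessible_registers):
    """Smallest number of bits that can address the given register count."""
    return (directly_accessible_registers - 1).bit_length()
```

For example, 32 registers need 5 bits and the 192 registers of the Am29000 need 8 bits, as listed in the table.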
34
Register size of processors with 2-address
architecture

processor (vendor) | GPRs overall | GPRs directly accessible | register width | register address width | immediate operand width | smallest instruction width
Athlon (AMD X86-64) | 16 | 16 | 64 Bit | 4 Bit | 8 - 32 Bit | 8 Bit
ColdFire MCF5206 (Motorola) | 8 + 8 | 8 + 8 | 32 Bit | 3 Bit | 8 - 32 Bit | 16 Bit
MC680xx (Motorola) | 8 + 8 | 8 + 8 | 32 Bit | 3 Bit | 8 - 32 Bit | 16 Bit
Pentium X (Intel X86) | 8 | 8 | 32 Bit | 3 Bit | 8 - 32 Bit | 8 Bit
(GPRs: general purpose registers)
35
Addressing modes
  • Machine instructions normally hold information
    about the operand addresses.
  • This can either be a physical address, e.g. a
    register number or the address of a memory
    location, or it can be an address specification.
  • An address specification defines how to calculate
    the address.
  • Thus, the address information determines the
    location of the operand(s) belonging to the
    instruction using one of many addressing modes.

36
Addressing modes
  • Instruction format, e.g. arithmetic instruction:
    opcode | target | source | source
  • Target and sources are the operands needed for
    the execution defined by the opcode.
  • An operand field holds the operand itself, a
    register number, a memory location or an address
    specification (dynamic address calculation).
  • The result of the dynamic address calculation is
    called the effective address.
37


Addressing modes
  • immediate: The operand is part of the
    instruction.
  • memory direct and register direct: The
    instruction contains the operand address.
  • register indirect: The instruction contains
    a register number pointing to a register
    holding the address of the operand. In
    assembler code this addressing mode is typically
    denoted by (register name).

38


Addressing modes
  • memory indirect: A register addressed in
    the instruction contains the address of a
    memory cell which holds the operand address.
  • register offset: The instruction contains a
    register number and an offset. The operand
    address is the sum of the register's content and
    the offset.
  • implicit: The instruction implicitly targets
    a single register (like the ACCU).
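The addressing modes listed on the last two slides can be sketched as an operand fetch on a toy machine. Registers and memory are modelled as Python dictionaries, and the function name and mode strings are hypothetical; the implicit mode is left out because it needs no address field.

```python
def fetch_operand(mode, field, registers, memory, offset=0):
    """Return the operand selected by `field` under the given mode."""
    if mode == "immediate":
        return field                      # operand is in the instruction
    if mode == "memory direct":
        return memory[field]              # field is a memory address
    if mode == "register direct":
        return registers[field]           # field is a register number
    if mode == "register indirect":
        return memory[registers[field]]   # register holds operand address
    if mode == "register offset":
        return memory[registers[field] + offset]
    if mode == "memory indirect":
        # Register holds the address of the cell with the operand address.
        return memory[memory[registers[field]]]
    raise ValueError("unknown addressing mode: " + mode)
```

With r2 = 2000, memory[2000] = 3000 and memory[3000] = 42, register indirect addressing through r2 yields 3000, while memory indirect addressing through r2 yields 42.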

39


Effective address
The address is calculated from several parts
found in the instruction and in registers or
memory cells at runtime (dynamic address
calculation). The calculated address is defined
as effective address.
  • Reasons for using dynamic address calculation
  • Addresses of data structure elements are
    composed of the first address of the data
    structure and the offset of the element to the
    beginning. Often this offset is unknown at
    compile time, therefore the effective address
    has to be calculated at runtime.
  • Repeated execution of the same instruction,
    e.g. in a loop, often accesses successive memory
    addresses which have to be calculated at
    runtime.

40


Effective address (cont.)
  • An operand address often is unknown at compile
    time, because it is calculated during program
    execution.
  • The partitioning of addresses into a base
    address stored in a register and an offset
    simplifies the handling of relocatable
    variables and relocatable program code.

41
Addressing modes 1
  • immediate: the instruction holds the operand
    e.g. LOAD 8, r1
  • memory direct: the instruction holds the eff.
    address of the operand in memory
    e.g. LOAD (2000), r1
  • register direct: the instruction holds the eff.
    address of the register containing the operand
    e.g. LOAD r2, r1
42
Addressing modes 2
  • register indirect: the instruction holds a
    register address; the register holds the
    effective address of the operand in memory
    e.g. LOAD (r2), r1
  • register indirect with predecrement: the
    instruction holds a register address; the memory
    address in the register is decremented first, and
    the result is the eff. address of the operand in
    memory
    e.g. LOAD -(r2), r1
43
Addressing modes 3
  • register indirect with displacement (indexed):
    eff. address = memory address (from the register
    addressed in the instruction) + displacement
    + index (from a second register) × scaling,
    with scaling 1, 2 or 4
    e.g. LOAD.B 126(r3)(r2), r1
44
Addressing modes 4
  • memory indirect:
    indirect memory address = memory address (from
    the register addressed in the instruction)
    + displacement1
    eff. address = content of the memory cell at the
    indirect memory address + displacement2
    e.g. LOAD 28(126(r2)), r1
45
Addressing modes 5
  • memory indirect (post indexed):
    indirect memory address = memory address (from
    the register addressed in the instruction)
    + displacement1
    eff. address = content of the memory cell at the
    indirect memory address + displacement2
    + index (from a second register) × scaling,
    with scaling 1, 2 or 4
    e.g. LOAD.B 28(r3)(126(r2)), r1
46
Addressing modes 6
  • memory indirect (preindexed):
    indirect memory address = memory address (from
    the register addressed in the instruction)
    + displacement1 + index (from a second register)
    × scaling, with scaling 1, 2 or 4
    eff. address = content of the memory cell at the
    indirect memory address + displacement2
    e.g. LOAD.B 28(126(r3)(r2)), r1
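The effective address calculations of the indexed and memory indirect modes can be written out as plain arithmetic. In this sketch the base and index values are passed in directly instead of being read from named registers, memory is a dictionary from address to content, and all function names are hypothetical.

```python
def indexed(base, displacement, index, scale):
    """Register indirect with displacement (indexed)."""
    return base + displacement + index * scale


def memory_indirect_post_indexed(memory, base, disp1, disp2, index, scale):
    """The index is added after the indirect memory access."""
    indirect_address = base + disp1
    return memory[indirect_address] + disp2 + index * scale


def memory_indirect_pre_indexed(memory, base, disp1, disp2, index, scale):
    """The index is added before the indirect memory access."""
    indirect_address = base + disp1 + index * scale
    return memory[indirect_address] + disp2
```

The two memory indirect variants read different memory cells for the same inputs, because the scaled index takes part in the first address calculation only in the preindexed case.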
47
Access to branch target table by PC relative
addressing
JMP disp(PC)(rn)
[Figure: branch target table access through program counter relative addressing. The table (target 0, target 1, target 2, ...) lies in memory at (PC) + displacement; the index selects the table entry.]
48
Machine instruction set
  • The machine instruction set of a computer
    normally includes instructions of different
    formats, e.g. 0-address instructions, 1-address
    instructions, 2-address instructions and
    3-address instructions.
  • An instruction is divided into so called fields.
  • The more address fields an instruction contains
    the smaller the number of addressable memory
    cells and/or the number of operations encoded in
    the opcode field becomes (if we assume a
    constant instruction length).
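The trade-off between address fields and opcode space under a constant instruction length is simple arithmetic. A minimal sketch, with a hypothetical function name and the assumption that all address fields have the same width:

```python
def remaining_bits(instruction_bits, address_fields, bits_per_field):
    """Bits left for the opcode and other fields after the address fields."""
    remaining = instruction_bits - address_fields * bits_per_field
    if remaining < 0:
        raise ValueError("address fields do not fit into the instruction")
    return remaining
```

With 32 bit instructions and 5 bit register addresses, a 3-address format leaves 17 bits and a 2-address format leaves 22 bits for everything else.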

49
Variable length vs. constant length instruction
format
  • Variable length (e.g. 16 - 256 Bit) mostly used
    in CISC architectures
  • flexible instruction format
  • high code density
  • long immediate and displacement values
  • Constant length (e.g. 32 Bit) mostly used in
    RISC architectures
  • simple and fast fetch
  • simple and fast decode
  • simplified pipelining

50
Scheme of basic operations of common processors
[Figure: classification tree of basic operations. Node labels: conditional operations, unconditional operations, combinatorial operations, control flow operations (simple branches, subroutine branches with call and return, system branches with call and return), transport operations (load operations, store operations), arithmetic logic operations (arithmetic operations, logic and shift operations), semaphore operations, state and control operations]
51


Instruction classes
  • Instruction sets are divided into groups
    combining instructions with similar
    functionality
  • Typical instruction groups
  • transport instructions
  • arithmetic instructions
  • logic instructions
  • shift and rotate instructions
  • bitwise instructions
  • string and array instructions
  • branch instructions
  • system instructions
  • synchronization instructions

52
Load store architecture
  • All instructions - except load and store
    instructions - address registers only.
  • Load and store instructions are needed to
    transfer data to and from main memory.
  • Mainly used in RISC ISAs; combined with
    pipelining, it allows most instructions to
    complete in one cycle.
  • Furthermore, the address fields of instructions
    become shorter as they only have to hold a
    register address instead of a memory address.
  • A load store ISA accelerates a machine if there
    are only small caches or if the caches are
    completely missing and a big register file is
    available.

53
Two examples for an instruction format
Example: an arithmetic instruction
SUBc r3, r7, r21
binary code: 11010 10101 00111 00011 1 00000000000
hex code: D54E3800
instruction format: OP (bits 31-27) | TR (26-22) | SR1 (21-17) | SR2 (16-12) | c/x (11) | unused (10-0)
OP: opcode, TR: target register, SRn: source register, c/x: set/do not set condition code

Example: a store instruction
STORE r24, 126(r5)
binary code: 00111 11000 00101 00000000001111110
hex code: 3E0A007E
instruction format: OP (bits 31-27) | SR (26-22) | BR (21-17) | DP (16-0)
OP: opcode, SR: source register, BR: base register, DP: displacement (signed)
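Both example encodings can be reproduced by shifting the field values to their bit positions. The field boundaries used here (5 bit opcode and register fields, condition code bit at position 11, 17 bit displacement) are derived from the binary and hex codes of the two examples; the function names are hypothetical.

```python
def encode_arithmetic(op, tr, sr1, sr2, set_cc):
    """OP (31-27) | TR (26-22) | SR1 (21-17) | SR2 (16-12) | c/x (11)."""
    return (op << 27) | (tr << 22) | (sr1 << 17) | (sr2 << 12) | (int(set_cc) << 11)


def encode_store(op, sr, br, dp):
    """OP (31-27) | SR (26-22) | BR (21-17) | DP (16-0, signed)."""
    # Masking keeps a negative displacement as 17 bit two's complement.
    return (op << 27) | (sr << 22) | (br << 17) | (dp & 0x1FFFF)
```

encode_arithmetic(0b11010, 21, 7, 3, True) gives 0xD54E3800 and encode_store(0b00111, 24, 5, 126) gives 0x3E0A007E, matching the hex codes of the two examples.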
54
State register of a RISC processor (based on
SPARC-architecture)
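[Figure: 32 bit state register SR (bits 31 ... 16 and 15 ... 0) with the fields IE, PS, S, IM, CWP and N Z V C]
  • IE: interrupt enable
  • PS: previous S-bit
  • S: supervisor/user
  • IM: interrupt mask
  • CWP: current window pointer
  • N, Z, V, C: conditional bits (negative, zero,
    overflow, carry)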
55
Condition codes depend on the conditional bits Z
(zero), N (negative), C (carry) and V (overflow).
Mnemonics according to Motorola's ColdFire MCF5206
processor.
conditional value | mnemonic | operation | expression | operand type
equal | eq | = | Z | independent
not equal | ne | ≠ | ¬Z | independent
higher than | hi | > | ¬C ∧ ¬Z | unsigned
higher than or same | hs | ≥ | ¬C | unsigned
lower than | lo | < | C | unsigned
lower than or same | ls | ≤ | C ∨ Z | unsigned
greater than | gt | > | (N = V) ∧ ¬Z | signed
greater than or equal | ge | ≥ | (N = V) | signed
less than | lt | < | (N ≠ V) | signed
less than or equal | le | ≤ | (N ≠ V) ∨ Z | signed
arithmetic overflow | vs | - | V | signed
arithmetic shortfall | vc | - | ¬V | signed
negative | mi | - | N | signed
positive | pl | - | ¬N | signed
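How the conditional bits select between signed and unsigned comparisons can be sketched by computing N, Z, V and C for a 32 bit subtraction a - b and evaluating the expressions from the table. The sketch assumes that C is the borrow of the subtraction; the function names are hypothetical, and the mnemonic spellings "hi" and "mi" follow the usual Motorola convention, which is an assumption here.

```python
MASK = 0xFFFFFFFF  # 32 bit operands


def compare(a, b):
    """Return (N, Z, V, C) for the 32 bit subtraction a - b."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = result >> 31              # sign bit of the result
    z = int(result == 0)
    c = int(a < b)                # assumed: carry = borrow
    sign_a, sign_b = a >> 31, b >> 31
    # Overflow: operand signs differ and the result sign differs from a.
    v = int(sign_a != sign_b and n != sign_a)
    return n, z, v, c


def holds(mnemonic, n, z, v, c):
    """Evaluate a condition code expression on the conditional bits."""
    conditions = {
        "eq": z, "ne": not z,
        "hi": not c and not z, "hs": not c, "lo": c, "ls": c or z,
        "gt": n == v and not z, "ge": n == v,
        "lt": n != v, "le": n != v or z,
        "vs": v, "vc": not v, "mi": n, "pl": not n,
    }
    return bool(conditions[mnemonic])
```

Comparing the bit pattern 0xFFFFFFFF with 1 satisfies both "hi" (unsigned: 4294967295 > 1) and "lt" (signed: -1 < 1), which is why the table distinguishes the operand type.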
56
Multimedia instructions
  • Typical SIMD instructions to process a single
    operation on a set of data (e.g. changing the
    brightness of image pixels)
  • Operations can be on packed integers (e.g. MMX on
    Pentium) or packed floats (e.g. SSE2 on Pentium)
  • Typical operations: arithmetic (saturated or
    overflow), logic, compare, pack, unpack
  • Example (figure not included in the transcript)
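The difference between saturated and overflowing (wrap-around) arithmetic can be illustrated on 8 bit pixel values. This is a scalar sketch of the semantics only, not actual MMX or SSE code: a SIMD instruction would apply the same operation to all packed elements at once. The function names are hypothetical.

```python
def add_saturated(pixels, delta):
    """Add delta to each 8 bit pixel, clamping the result to 0 ... 255."""
    return [min(255, max(0, p + delta)) for p in pixels]


def add_wraparound(pixels, delta):
    """Add delta to each 8 bit pixel, discarding the overflow (mod 256)."""
    return [(p + delta) % 256 for p in pixels]
```

Brightening the pixel value 250 by 20 gives 255 with saturation but 14 with wrap-around, which would turn a bright pixel dark.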