Instruction Set Architecture


Instruction Set Architecture, Pradondet Nilagupta, Spring 2005 (original notes from Prof. Baniasadi, Prof. Shaaban, Prof. Katz)


1
Instruction Set Architecture

Pradondet Nilagupta
Spring 2005
(original notes from Prof. Baniasadi,
Prof. Shaaban, Prof. Katz)

2
Outline

ISA Introduction
ISA Classifying
Memory Addressing
Addressing Modes
Operands
Encoding ISA

3
Hot Topics in Computer Architecture

1950s and 1960s
Computer Arithmetic
1970s and 1980s
Instruction Set Design
ISA Appropriate for Compilers
1990s
Design of CPU
Design of memory system
Design of I/O system
Multiprocessors
Instruction Set Extensions

4
Instruction Set Architecture

Instruction set architecture is the structure of
a computer that a machine language programmer
must understand to write a correct (timing
independent) program for that machine.
The instruction set architecture is also the
machine description that a hardware designer must
understand to design a correct implementation of
the computer.

5
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model from Implementation
High-level Language Based (B5000 1963)
Concept of a Family (IBM 360 1964)
General Purpose Register Machines
Complex Instruction Sets (Vax, Intel 432 1977-80)
Load/Store Architecture (CDC 6600, Cray 1 1963-76)
RISC (Mips, SPARC, HP-PA, IBM RS6000, . . . 1987)
6
Instruction Set Architecture

The instruction set architecture serves as the
interface between software and hardware

software
instruction set
hardware
7
Interface Design

A good interface
Lasts through many implementations (portability,
compatibility)
Is used in many different ways (generality)
Provides convenient functionality to higher
levels
Permits an efficient implementation at lower
levels

8
What Are the Components of an ISA? (1/2)

Sometimes known as the Programmer's Model of the machine
Storage cells
General and special purpose registers in the CPU
Many general purpose cells of same size in memory
Storage associated with I/O devices
The machine instruction set
The instruction set is the entire repertoire of
machine operations
Makes use of storage cells, formats, and results
of the fetch/execute cycle
i.e., register transfers

9
What Are the Components of an ISA? (2/2)

The instruction format
Size and meaning of fields within the instruction
The nature of the fetch-execute cycle
Things that are done before the operation code is
known

10
Instruction Set Architecture
Computer Program (Instructions)
Programmer's View
ADD SUBTRACT AND OR COMPARE . . .
01010 01110 10011 10001 11010 . . .
Memory
CPU
I/O
Computer's View
Princeton (Von Neumann) Architecture
--- Data and instructions mixed in same memory ("stored program computer")
--- Program as data (dubious advantage)
--- Storage utilization
--- Single memory interface
Harvard Architecture
--- Data and instructions in separate memories
--- Has advantages in certain high performance implementations
11
Basic Issues in Instruction Set Design
--- What operations (and how many) should be provided
LD/ST/INC/BRN sufficient to encode any computation
But not useful because programs too long!
--- How (and how many) operands are specified
Most operations are dyadic (e.g., A <- B + C)
Some are monadic (e.g., A <- ~B)
--- How to encode these into consistent instruction formats
Instructions should be multiples of basic data/address widths
Typical instruction set:
32 bit word
basic operand addresses are 32 bits long
basic operands, like integers, are 32 bits long
in general case, instruction could reference 3 operands (A <- B + C)
challenge: encode operations in a small number of bits!
12
Execution Cycle
Instruction Fetch
Obtain instruction from program storage
Instruction Decode
Determine required actions and instruction size
Operand Fetch
Locate and obtain operand data
Execute
Compute result value or status
Result Store
Deposit results in storage for later use
Next Instruction
Determine successor instruction
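
The six steps above can be sketched as a tiny interpreter loop. This is only an illustration for a hypothetical one-accumulator machine: the opcode names (LOAD, ADD, STORE, HALT), the tuple instruction format, and the `run` function are assumptions made for this example, not part of any real ISA.

```python
def run(program, memory):
    """Run a toy accumulator-machine program and return the final memory."""
    pc = 0   # program counter
    acc = 0  # the single accumulator register
    while True:
        instr = program[pc]                      # instruction fetch
        opcode, addr = instr                     # instruction decode
        operand = memory.get(addr, 0) if addr is not None else None  # operand fetch
        next_pc = pc + 1                         # default successor instruction
        if opcode == "LOAD":                     # execute
            acc = operand
        elif opcode == "ADD":
            acc = acc + operand
        elif opcode == "STORE":
            memory[addr] = acc                   # result store
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        pc = next_pc                             # next instruction


# Hypothetical program computing C = A + B.
program = [("LOAD", "A"), ("ADD", "B"), ("STORE", "C"), ("HALT", None)]
result = run(program, {"A": 2, "B": 3})
print(result["C"])
```

A real machine would also let the execute step change the successor (jumps and branches); the sketch always falls through to the next instruction.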
13
What Must be Specified?
Instruction Fetch

Instruction Format or Encoding
how is it decoded?
Location of operands and result
where other than memory?
how many explicit operands?
how are memory operands located?
which can or cannot be in memory?
Data type and Size
Operations
what are supported
Successor instruction
jumps, conditions, branches

Instruction Decode
Operand Fetch
Execute
Result Store
Next Instruction
14
ISA

What are the important questions?

15
Operand Locations in Four ISA Classes
GPR
16
ISA Classes

ISA Classes?
Stack
Accumulator
Register memory
Register register/load store

Input1
Input2
Operation
Output
17
ISA Classes

Accumulator
1 address: add A; acc <- acc + mem[A]
1+x address: addx A; acc <- acc + mem[A + x]
Stack
0 address: add; tos <- tos + next
General Purpose Register
2 address: add A B; EA(A) <- EA(A) + EA(B)
3 address: add A B C; EA(A) <- EA(B) + EA(C)
Load/Store
3 address: add Ra Rb Rc; Ra <- Rb + Rc
load Ra Rb; Ra <- mem[Rb]
store Ra Rb; mem[Rb] <- Ra

tos
stack
Comparison
Bytes per instruction? Number of Instructions?
Cycles per instruction?
18
ISA Classes Stack

Operate on TOS, put result on TOS
C = A + B?
PUSH A
PUSH B
ADD
POP C
Memory not touched

TOP OF STACK
Operation
MEMORY
19
ISA Classes Accumulator

Accumulator
Implicit input and output.
C = A + B?
LOAD A - Put A in accumulator
ADD B - Add B to AC, put result in AC
STORE C - Put AC in C

Accumulator (AC)
Operation
MEMORY
20
ISA Classes Register-Memory

Input, Output: Register or Memory
C = A + B?
LOAD R1, A
ADD R3, R1, B
STORE R3, C

LOAD/STORE ARCH.
C = A + B?
LOAD R1, A
LOAD R2, B
ADD R3, R1, R2
STORE R3, C

Example
How do we compute C = A + B using four classes of ISAs?

Stack | Accumulator | Register (Reg-Mem) | Register (Load/Store)
Push A | Load A | Load R1,A | Load R1,A
Push B | Add B | Add R3,R1,B | Load R2,B
Add | Store C | Store R3,C | Add R3,R1,R2
Pop C | | | Store R3,C
Comparing number of instructions?
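
One rough way to answer the comparison question is to count instructions and explicit memory operands for each sequence. The sketch below copies the four sequences from the example; the `summarize` helper and the convention that A, B, and C are the only memory variables are assumptions made for illustration.

```python
# Instruction sequences for C = A + B, copied from the example above.
SEQUENCES = {
    "stack": ["PUSH A", "PUSH B", "ADD", "POP C"],
    "accumulator": ["LOAD A", "ADD B", "STORE C"],
    "register-memory": ["LOAD R1,A", "ADD R3,R1,B", "STORE R3,C"],
    "load-store": ["LOAD R1,A", "LOAD R2,B", "ADD R3,R1,R2", "STORE R3,C"],
}
MEMORY_VARIABLES = {"A", "B", "C"}  # assumed: everything else is a register


def summarize(instructions):
    """Return (instruction count, number of explicit memory operands)."""
    memory_operands = 0
    for instruction in instructions:
        parts = instruction.split(maxsplit=1)
        operands = parts[1].split(",") if len(parts) > 1 else []
        memory_operands += sum(1 for op in operands if op.strip() in MEMORY_VARIABLES)
    return len(instructions), memory_operands


summary = {name: summarize(seq) for name, seq in SEQUENCES.items()}
for name, (count, mem_ops) in summary.items():
    print(f"{name:16s} instructions={count} memory operands={mem_ops}")
```

Instruction count alone does not settle the question: bytes per instruction and cycles per instruction differ across the four classes too.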
23
General Purpose Registers Dominate
24
General-Purpose Register (GPR) Machines

Nearly every ISA designed after 1980 uses a load-store
GPR architecture (i.e., RISC, to simplify CPU
design).
Registers, like any other storage form internal
to the CPU, are faster than memory.
Registers are easier for a compiler to use.
GPR architectures are divided into several types
depending on two factors
Whether an ALU instruction has two or three
operands.
How many of the operands in ALU instructions may
be memory addresses.

25
ISA Examples

Machine | Number of General Purpose Registers | Architecture | Year
EDSAC | 1 | accumulator | 1949
IBM 701 | 1 | accumulator | 1953
CDC 6600 | 8 | load-store | 1963
IBM 360 | 16 | register-memory | 1964
DEC PDP-8 | 1 | accumulator | 1965
DEC PDP-11 | 8 | register-memory | 1970
Intel 8008 | 1 | accumulator | 1972
Motorola 6800 | 1 | accumulator | 1974
DEC VAX | 16 | register-memory, memory-memory | 1977
Intel 8086 | 1 | extended accumulator | 1978
Motorola 68000 | 16 | register-memory | 1980
Intel 80386 | 8 | register-memory | 1985
MIPS | 32 | load-store | 1985
HP PA-RISC | 32 | load-store | 1986
SPARC | 32 | load-store | 1987
PowerPC | 32 | load-store | 1992
DEC Alpha | 32 | load-store | 1992
HP/Intel IA-64 | 128 | load-store | 2001
AMD64 (EMT64) | 16 | register-memory | 2003
26
Pros and cons for each ISA type
Machine Type Advantages Disadvantages
Stack
Accumulator
Register
27
Pros and cons for each ISA type
Machine Type | Advantages | Disadvantages
Stack | Short instructions; good code density; simple to decode instruction | Lack of random access; efficient code hard to get; stack is often a bottleneck
Accumulator | Minimal internal state; short instructions; simple to decode instruction | Very high memory traffic
Register | Lots of code generation options; efficient code (compiler options) | Longer instructions; complex instructions
28
Examples of GPR Machines
For Arithmetic/Logic Instructions

Number of memory addresses | Maximum number of operands allowed | Examples
0 | 3 | SPARC, MIPS, PowerPC, Alpha
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX
3 | 3 | VAX

29
Pros/Cons of Mem. Operands/Operands (1/3)

Register-register (0,3): 0 memory operands/instr, 3 (register) operands/instr
Pro: Simple, fixed-length instruction encoding. Simple code generation model. Instructions take similar numbers of clocks to execute.
Con: Higher instruction count than architectures with memory references in instructions. Some instructions are short and bit encoding may be wasteful.

30
Pros/Cons of Mem. Operands/Operands (2/3)

Register-memory (1,2)
Pro: Data can be accessed without loading first. Instruction format tends to be easy to encode and yields good density.
Con: Operands are not equivalent since a source operand in a binary operation is destroyed. Encoding a register number and a memory address in each instruction may restrict the number of registers. Clocks per instruction varies by operand location.

31
Pros/Cons of Mem. Operands/Operands (3/3)

Memory-memory (3,3)
Pro: Most compact. Doesn't waste registers for temporaries.
Con: Large variation in instruction size, especially for three-operand instructions. Also, large variation in work per instruction. Memory accesses create memory bottleneck.

32
Memory Addressing

How do we specify memory addresses?
This issue is independent of type of ISA (they
all need to address memory)
We need to specify
(1) Operand sizes
(2) Address alignment
(3) Byte ordering for multi-byte operands
(4) Addressing Modes

33
Operand Sizes (1)

Byte (8 bits), half-word (16 bits), word (32 bits), double word (64 bits)
An ISA may (and typically does) support multiple operand sizes
Instruction must specify the operand size
E.g. LOAD.b R1,A vs. LOAD.w R1,A
Why? Make sure there's no garbage data
But usually there is a default size
Most commonly word on 32-bit machines
On x86, different register names for different sizes
(And think about Amdahl's Law too)

34
Alignment (2)

For multi-byte memory operands
An aligned address for an n-byte operand is an address that is a multiple of n
Word-aligned: 0, 4, 8, 12, etc.
An ISA can require alignment of operands
Assume it is required unless otherwise specified
MIPS: all memory operands must be aligned (special two-instruction sequences for accessing unaligned data in memory)
x86: no alignment required (but unaligned accesses are slower)
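
The alignment rule can be sketched in a couple of lines. The helper names `is_aligned` and `align_up` are hypothetical; `align_up` is an extra convenience that the slide does not mention.

```python
def is_aligned(address, size):
    """True if address is a multiple of the operand size in bytes."""
    return address % size == 0


def align_up(address, size):
    """Round address up to the next multiple of size (hypothetical helper)."""
    return (address + size - 1) // size * size


# Word (4-byte) accesses: 0, 4, 8, 12 are aligned, 6 is not.
for addr in (0, 4, 6, 8, 12):
    print(addr, is_aligned(addr, 4))
```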

35
Byte Ordering (Endianness) (3)

Layout of multi-byte operands in memory
Little endian (x86)
Least significant byte at lowest address in memory
Big endian (most other ISAs)
Most significant byte at lowest address in memory
Assume this ordering unless otherwise specified
Some ISAs support both byte orderings
E.g. MIPS has a little-endian mode

36
Another view of Endianness

No, we're not making this up.
at word address 100 (assume a 4-byte word)
long a = 0x11223344;
big-endian (MSB at word address) layout
byte address: 100 101 102 103
byte value: 11 22 33 44
byte offset: 0 1 2 3
little-endian (LSB at word address) layout
byte address: 103 102 101 100
byte value: 11 22 33 44
byte offset: 3 2 1 0
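
The two layouts can be checked with Python's standard `struct` module, which packs an integer in an explicitly chosen byte order. This is a small sketch: the base address 100 comes from the slide, while the `layout` helper is a hypothetical name used only here.

```python
import struct

value = 0x11223344

big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first


def layout(data, base=100):
    """Map each byte address (starting at base) to the byte stored there."""
    return {base + i: byte for i, byte in enumerate(data)}


print({addr: hex(b) for addr, b in layout(big).items()})
print({addr: hex(b) for addr, b in layout(little).items()})
```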

37
Endianness Continued

Usually instruction sets are byte addressed
Provide access for bytes (8 bits), half-words (16
bits), words (32 bits), double words (64 bits)
Two different ordering types big/little endian

Little Endian (bit positions 31, 23, 15, 7, 0)
Puts byte w/addr. xx00 at least significant position in the word
Big Endian (bit positions 31, 23, 15, 7, 0)
Puts byte w/addr. xx00 at most significant position in the word
38
Addressing Modes (4)

What is the location of an operand?
Three basic possibilities
Register: operand is in a register
Register number encoded in the instruction
Immediate: operand is a constant
Constant encoded in the instruction
Memory: operand is in memory
Many addressing mode possibilities

39
Immediate Addressing Mode

Operand is a constant encoded in instruction
Can we have any value as an immediate?
x86: yes. The number of bytes used to encode the instruction will change to accommodate.
RISC: no, instruction size is fixed (e.g. 32 bits)
Immediates in RISC
Typically a load immediate instruction
Some bits used to specify the instruction opcode
Remaining bits encode the immediate value
This is OK: most-frequently needed constants have few bits
MIPS also has a special two-instruction sequence to put a full 32-bit immediate into a register
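
A minimal sketch of the two ideas on this slide: checking whether a constant fits in a fixed-size immediate field, and building a full 32-bit constant from two 16-bit halves. On MIPS the two-instruction sequence is typically LUI followed by ORI; the function names below are hypothetical and only model the arithmetic, not the real instruction encodings.

```python
def fits_signed(value, bits):
    """True if value is representable as a signed immediate of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def split_immediate(value):
    """Split a 32-bit constant into (upper 16 bits, lower 16 bits)."""
    value &= 0xFFFFFFFF
    return value >> 16, value & 0xFFFF


def load_32bit_immediate(value):
    """Model the two-step load: set the upper half, then OR in the lower half."""
    upper, lower = split_immediate(value)
    reg = upper << 16   # step 1: LUI-style, lower 16 bits become zero
    reg = reg | lower   # step 2: ORI-style, fill in the lower 16 bits
    return reg


print(fits_signed(1000, 16), fits_signed(100000, 16))
print(hex(load_32bit_immediate(0x12345678)))
```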

40
Size of Immediate Operand
(comments?)
41
Memory Addressing Modes (A)

Register Indirect
Address is in a register
LD R1, (R2)
Use: access via pointer or computed address
Direct (Absolute)
Address is a constant
LD R1, (100)
Use: access to static data
Note: constant encoded in the instruction

42
Memory Addressing Modes (B)

Displacement
Address is register + immediate
LD R1, 100(R2)
Use: local variables, fields in a structure
Can simulate register-indirect and direct modes
LD R1, 0(R2) and LD R1, 100(R0)
Note: displacement encoded in instruction

43
Size of Displacement
44
Memory Addressing Modes (C)

Autoincrement
Address is in a register
LD R1, (R2)+
Register value increased by d after access (d is the data size, e.g. 4 for word accesses)
Some flavors useful for iterating through arrays, implementing a stack, etc.
Indexed, Memory Indirect, Scaled, Autodecrement
See textbook (Page 98)

45
FYI More Addressing Modes
Addressing Mode | Example Instruction | Meaning | When Used
Register | Add R4, R3 | R4 <- R4 + R3 | When a value is in a register
Immediate | Add R4, 3 | R4 <- R4 + 3 | For constants
Displacement | Add R4, 100(R1) | R4 <- R4 + Mem[100 + R1] | Accessing local variables
Register deferred or Indirect | Add R4, (R1) | R4 <- R4 + Mem[R1] | Accessing using pointer or computed address
Indexed | Add R3, (R1 + R2) | R3 <- R3 + Mem[R1 + R2] | Array addressing: R1 = base of array, R2 = index amount
Direct or Absolute | Add R1, (1001) | R1 <- R1 + Mem[1001] | Accessing static data; addr. constant may need to be big
46
FYI Addressing modes continued
Addressing Mode | Example Instruction | Meaning | When Used
Memory indirect or Memory deferred | Add R1, @(R3) | R1 <- R1 + Mem[Mem[R3]] | If R3 is the address of a pointer p, then mode yields *p
Autoincrement | Add R1, (R2)+ | R1 <- R1 + Mem[R2]; R2 <- R2 + d | Useful for stepping through arrays within a loop; R2 points to start of array; each ref. increments R2 by d
Autodecrement | Add R1, -(R2) | R2 <- R2 - d; R1 <- R1 + Mem[R2] | Same as autoincrement; can be used for push/pop on stack
Scaled | Add R1, 100(R2)[R3] | R1 <- R1 + Mem[100 + R2 + R3 * d] | Used to index arrays
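
The address computations in the two tables can be written as one small function. This is a sketch under stated assumptions: registers and memory are plain dictionaries, the mode names are labels chosen for this example, and autoincrement/autodecrement are left out because they also modify a register.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=4):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":        # Mem[disp + R_base]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[R_base]
        return regs[base]
    if mode == "indexed":             # Mem[R_base + R_index]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp], constant address
        return disp
    if mode == "memory_indirect":     # Mem[Mem[R_base]]
        return mem[regs[base]]
    if mode == "scaled":              # Mem[disp + R_base + R_index * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown addressing mode {mode!r}")


regs = {"R1": 200, "R2": 3, "R3": 40}
mem = {40: 1000}
print(effective_address("displacement", regs, mem, base="R1", disp=100))
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100))
```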
47
Use of Addressing Modes (DSP)

Hand-coded
Uses more powerful modes when possible
Figure 2.11 in textbook
Goes through mix of addressing modes and the % of time each is used for a TI DSP
70%: Immediate, displacement, register immediate, direct
25%: Auto increment, Auto decrement
Thoughts?
(random comment: make these 6 addressing modes, out of 17 total, the fastest)

48
Control Flow Instructions

Up until now we have implicitly discussed memory and arithmetic instructions
Now, control flow: 4 basic types
Procedure Call and Return
Jumps
Conditional branches

49
Addr. Modes for CF Instructions

PC-relative
Most commonly used for branches and jumps
Position-independent code
Target known at compile time
Register indirect
Used when target not known at compile time (procedure returns, virtual functions and function pointers, dynamically loaded libraries, case/switch statements, etc.)

50
Size of Branch Displacement
(comments?)
51
Branch Conditions
(thoughts on s, what constructs they apply to,
etc.?)
52
Call/Return Instructions

Call
Minimum: save return address to the stack (x86) or in a register (MIPS)
Can create a stack frame, save registers, etc.
Return
Jumps to return address
Can pop the stack frame, restore registers, etc.
Simpler typically turns out to be better
E.g. many functions do not need a stack frame

53
MIPS Call/Return

Call: Jump-And-Link (JAL <function>)
Puts return address into R31, then jumps to target address
Return: Register-Indirect Jump (JR R31)
Jumps to address in R31 (no special RET instruction)
Stack frame create/pop via ordinary add/sub instrs (stack-pointer register is R29)
Register save/restore via ordinary load/store instrs

54
Procedure call essentials (1): Caller/Callee Mechanics
who does what when?

Four places

foo()
  bar(42)
  ...

bar(int a)
  int temp = 3
  ...
  return(temp + a)

1. caller at call time
2. callee at entry
3. callee at exit
4. caller after return
55
Procedure call essentials (2): MIPS Registers
56
Procedure call essentials (3): Good Strategy
do most work at callee entry/exit

Caller at call time
put arguments in a0..a3
save any caller-save temporaries
jalr ..., ra
Callee at entry
allocate all stack space
save ra s0..s3 if necessary
Callee at exit
restore ra s0..s3 if used
deallocate all stack space
put return value in v0
Caller after return
retrieve return value from v0
restore any caller-save temporaries

most of the work
57
Procedure call essentials (4): Summary

Summary
Caller saves registers (outside the agreed upon
convention i.e. ax) at point of call
Callee saves registers (per convention i.e. sx)
at point of entry
Callee restores saved registers, and re-adjusts
stack before return
Caller restores saved registers, and re-adjusts
stack before resuming from the call
Big ?
Is this clear? I can work through an example if
needed

58
Instruction Encoding

Instruction must specify
What is supposed to be done (opcode)
What are the operands
Three popular formats
Variable format (VAX, x86)
Opcode specifies how many operands, operands are
listed after opcode
Each operand has an address specifier and an
address field
Address specifier describes addressing mode for
that operand
Fixed format (RISC)
All instructions of the same size
Opcode specifies addressing mode for load/store
operations
All other operations use register operands
Hybrid Format (IBM 360, some DSP processors)
Several (but few) fixed size instruction formats
Some formats have address specifier fields

59
How is the operation specified?

Typically in a bit field called the opcode
Also must encode addressing modes, etc.
Some options

Variable
Operation & # of operands | Address Specifier 1 | Address Field 1 | ... | Address Specifier n | Address Field n
Fixed
Operation | Address Field 1 | Address Field 2 | Address Field 3
Hybrid
Operation | Address Specifier | Address Field
Operation | Address Specifier 1 | Address Specifier 2 | Address Field
Operation | Address Specifier | Address Field 1 | Address Field 2
60
Some random comments

Variable addressing mode allows virtually all
addressing modes with all operations
Best when many addressing modes and operations
Fixed addressing mode combines operation and addressing mode into opcode
Best when few addressing modes and operations
Good for RISC
Hybrid approach is 3rd alternative
Usually need a separate address specifier per operand
When encoding instructions, the number of registers and addressing modes can affect instruction size

61
Instruction Encoding Tradeoffs

Decoding vs. Programming
Fixed formats easy to decode
Variable formats easier to program (assembler)
But we mostly use compilers now
Number of registers
More registers help optimization (a lot)
Operand fields smaller with few registers
In general, we want many (e.g. 32) registers, but
whether we want even more is still an open issue
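
The register-count tradeoff comes down to simple arithmetic: each register operand needs enough bits to name any register. A minimal sketch, with hypothetical helper names and a three-operand format assumed by default:

```python
def register_field_bits(num_registers):
    """Bits needed to name one of num_registers registers."""
    return max(num_registers - 1, 0).bit_length()


def operand_bits(num_registers, num_operands=3):
    """Total bits spent on register fields in one instruction."""
    return register_field_bits(num_registers) * num_operands


for n in (8, 16, 32, 128):
    print(n, register_field_bits(n), operand_bits(n))
```

With 32 registers, a three-operand instruction spends 15 of its bits on register names; with 128 registers that grows to 21, which is hard to fit in a 32-bit fixed format alongside an opcode.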

62
Helping the Compiler Writers

Regularity and Orthogonality
General-Purpose Registers
If an operation works with one data type, it
should work with all supported data types
If an operation works with one addr. mode, it should work with all supported addr. modes
Primitives, not solutions
E.g. JAL vs. elaborate function call instruction
Simplify tradeoffs
Bind constants at compile time

63
Today's compilers work like this
Pass | Dependencies | Function
Front-end per language | Language dependent; machine independent | Transform language to common, intermediate form
(Intermediate representation: output of the front end)
High-level optimizations | Somewhat language dependent; largely machine independent | For example, procedure inlining and loop transformations
Global optimizer | Small language dependencies; machine dependencies slight (i.e. register counts/types) | Including global and local optimization + register allocation
Code generator | Highly machine dependent; language independent | Detailed instruction selection and machine-dependent optimizations (assembler next?)
64
How the architect can help the compiler writer

Keep in mind
Most programs are locally simple!
Simple translations work just fine
Complexity arises b/c programs require lots of instructions and they must interact globally
Also b/c of the whole multiple pass thing
The compiler writer's corollary/rule/manifesto
Make the frequent cases fast and the rare cases correct!

65
Reading Assignment

Read Section 2.12 (MIPS Architecture)
Especially Figure 2.27
Not required, but good to read anyway ?
Section 2.14 (Fallacies and Pitfalls)
Section 2.16 (Historical Perspective)

66
The DLX µProcessor

A generic µP that we'll use from time-to-time
Very similar to a MIPS machine
Compiled by taking the average of a number of recent experimental and commercial machines
Has 32 general purpose registers (R0, R1, ..., R31) and floating point registers
Data types include
8-bit bytes
16-bit half words
32-bit words for integer data
32-bit single precision and 64-bit double precision floating-point words

67
DLX addressing modes

The only data addressing modes are immediate and
displacement
Possible to implement register deferred and
absolute
DLX memory is byte addressable in the Big Endian
mode with a 32-bit address
DLX uses a load/store architecture so
All memory references are through loads or stores between memory and either GPRs or FPRs

Add R1, (1001): R1 <- R1 + M(1001)
Add R4, (R1): R4 <- R4 + Mem(R1)
68
DLX instruction format
DLX has 2 addressing modes which are encoded in the opcode
I-type instruction
Opcode (6 bits) | rs1 (5 bits) | rd (5 bits) | Immediate (16 bits)

Encodes:
Loads and stores of bytes, words, half words
All immediates (rd <- rs1 op immediate)
Conditional branch instructions (rs1 is register, rd is unused)
Jump register, jump and link register (rd = 0, rs1 = destination, immediate = 0)

R-type instruction
Field widths: 6 | 5 | 5 | 5 | 11 bits

Register-register ALU operations: rd <- rs1 func rs2
Function encodes the data path operation: Add, Sub, ...
Read/write special registers and moves

69
DLX instruction format
J-type instruction
Opcode (6 bits) | Offset added to PC (26 bits)
Jump and jump and link
Trap and return from exception
70
An example MIPS machine
71
Memory Addressing
72
Displacement Address Size
12 - 16 bits of displacement needed
73
Addressing Objects Endianess and Alignment

Big Endian: address of most significant byte = word address
IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA
Little Endian: address of least significant byte = word address
Intel 80x86, DEC Vax, DEC Alpha (Windows NT)

Byte numbering within a word (msb on the left, lsb on the right)
little endian byte 0: 3 2 1 0
big endian byte 0: 0 1 2 3
Alignment requires that objects fall on an address that is a multiple of their size.
[Figure: aligned and not aligned objects at byte offsets 0 1 2 3]
74
Addressing Objects

Big Endian: address of most significant byte

Byte numbering within a word (msb on the left, lsb on the right)
little endian word 0: 3 2 1 0
big endian word 0: 0 1 2 3
Alignment requires that objects fall on an address that is a multiple of their size (e.g., a 2-byte word should be at 0, 2, 4, ...)
75
Byte Swap Problem
Byte address | Little Endian | Big Endian
3 | A | D
2 | B | C
1 | C | B
0 | D | A
(increasing byte address from 0 to 3)
When words are transferred
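
A minimal sketch of the swap that is needed when a 32-bit word is written as bytes by a machine of one endianness and read back by a machine of the other. The function name `swap32` is hypothetical; `int.to_bytes` and `int.from_bytes` are standard Python.

```python
def swap32(word):
    """Reverse the byte order of a 32-bit word."""
    return int.from_bytes(word.to_bytes(4, "big"), "little")


sent = 0x11223344
# Written big-endian, then read as little-endian without any conversion:
misread = int.from_bytes(sent.to_bytes(4, "big"), "little")
print(hex(misread))          # the bytes arrive reversed
print(hex(swap32(misread)))  # swapping restores the original value
```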
76
Typical Memory Addressing Modes (Again!!)
Addressing mode | Example | Meaning
Register indirect | Add R4,(R1) | R4 <- R4 + Mem[R1]
Indexed | Add R3,(R1+R2) | R3 <- R3 + Mem[R1+R2]
Direct or absolute | Add R1,(1001) | R1 <- R1 + Mem[1001]
Memory indirect | Add R1,@(R3) | R1 <- R1 + Mem[Mem[R3]]
Auto-increment | Add R1,(R2)+ | R1 <- R1 + Mem[R2]; R2 <- R2 + d
Auto-decrement | Add R1,-(R2) | R2 <- R2 - d; R1 <- R1 + Mem[R2]
Scaled | Add R1,100(R2)[R3] | R1 <- R1 + Mem[100+R2+R3*d]
77
Addressing Modes Usage Example
For 3 programs running on VAX, ignoring direct register mode:
Displacement: 42% avg, 32% to 55%
Immediate: 33% avg, 17% to 43%
Register deferred (indirect): 13% avg, 3% to 24%
Scaled: 7% avg, 0% to 16%
Memory indirect: 3% avg, 1% to 6%
Misc: 2% avg, 0% to 3%
75% displacement and immediate
88% displacement, immediate and register indirect
Observation: In addition to register direct, the displacement, immediate, and register indirect addressing modes are important.
CISC to RISC observation (fewer addressing modes simplify CPU design)
78
Immediate Size
50% to 60% fit within 8 bits; 75% to 80% fit within 16 bits (size of the immediate number used in an instruction)
79
Addressing Summary

Data Addressing modes that are important
Displacement, Immediate, Register Indirect
Displacement size should be 12 to 16 bits
Immediate size should be 8 to 16 bits

80
Utilization of Memory Addressing Modes
Most Common
81
Displacement Address Size Example
Avg. of 5 SPECint92 programs vs. avg. of 5 SPECfp92 programs
1% of addresses > 16 bits
12 - 16 bits of displacement needed
CISC to RISC observation
82
Operation Types in The Instruction Set

Operator Type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, or
Data transfer | Loads-stores (move on machines with memory addressing)
Control | Branch, jump, procedure call and return, traps
System | Operating system call, virtual memory management instructions
Floating point | Floating point operations: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Media | The same operation performed on multiple data (e.g. Intel MMX, SSE)
83
Instruction Usage Example: Top 10 Intel X86 Instructions
[Table: rank (1 to 10) and integer average percent of total executed]
Observation: Simple instructions dominate instruction usage frequency.
CISC to RISC observation
84
Instruction Set Encoding

Considerations affecting instruction set
encoding
To have as many registers and addressing modes as possible.
The impact of the size of the register and addressing mode fields on the average instruction size and on the average program size.
To encode instructions into lengths that will be easy to handle in the implementation. At a minimum, a multiple of bytes.
Fixed length encoding: Faster and easiest to implement in hardware.
Variable length encoding: Produces smaller instructions.
Hybrid encoding.

CISC to RISC observation
85
Three Examples of Instruction Set Encoding
Variable: VAX (1-53 bytes)
Operation & no. of operands | Address specifier 1 | Address field 1 | ... | Address specifier n | Address field n
Fixed: MIPS, PowerPC, SPARC (each instruction is 4 bytes)
Operation | Address field 1 | Address field 2 | Address field 3
Hybrid: IBM 360/370, Intel 80x86
Operation | Address specifier | Address field
Operation | Address specifier 1 | Address specifier 2 | Address field
Operation | Address specifier | Address field 1 | Address field 2
86
Complex Instruction Set Computer (CISC)

Emphasizes doing more with each instruction
Thus fewer instructions per program (more compact
code).
Motivated by the high cost of memory and hard
disk capacity when original CISC architectures
were proposed
When M6800 was introduced: 16K RAM = $500, 40M hard disk = $55,000
When MC68000 was introduced: 64K RAM = $200, 10M HD = $5,000
Original CISC architectures evolved with faster
more complex CPU designs but backward instruction
set compatibility had to be maintained.
Wide variety of addressing modes
14 in MC68000, 25 in MC68020
A number of instruction modes for the location and
number of operands
The VAX has 0- through 3-address instructions.
Variable-length instruction encoding.

87
Example CISC ISA Motorola 680X0

18 addressing modes
Data register direct.
Address register direct.
Immediate.
Absolute short.
Absolute long.
Address register indirect.
Address register indirect with postincrement.
Address register indirect with predecrement.
Address register indirect with displacement.
Address register indirect with index (8-bit).
Address register indirect with index (base).
Memory indirect postindexed.
Memory indirect preindexed.
Program counter indirect with index (8-bit).
Program counter indirect with index (base).
Program counter indirect with displacement.
Program counter memory indirect postindexed.
Program counter memory indirect preindexed.

GPR ISA (Register-Memory)

Operand size
Range from 1 to 32 bits, 1, 2, 4, 8, 10, or 16
bytes.
Instruction Encoding
Instructions are stored in 16-bit words.
The smallest instruction is 2 bytes (one word).
The longest instruction is 5 words (10 bytes) in
length.

88
Example CISC ISA: Intel IA-32, X86 (80386)
GPR ISA (Register-Memory)

12 addressing modes
Register.
Immediate.
Direct.
Base.
Base Displacement.
Index Displacement.
Scaled Index Displacement.
Based Index.
Based Scaled Index.
Based Index Displacement.
Based Scaled Index Displacement.
Relative.

Operand sizes
Can be 8, 16, 32, 48, 64, or 80 bits long.
Also supports string operations.
Instruction Encoding
The smallest instruction is one byte.
The longest instruction is 12 bytes long.
The first bytes generally contain the opcode,
mode specifiers, and register fields.
The remaining bytes are for address displacement
and immediate data.

89
Reduced Instruction Set Computer (RISC)

Focuses on reducing the number and complexity of instructions of the machine. (Thus more instructions executed than CISC)
Reduced CPI. Goal: At least one instruction per clock cycle. (CPI = 1 or less)
Designed with pipelining in mind.
Fixed-length instruction encoding.
Only load and store instructions access memory.
Simplified addressing modes.
Usually limited to immediate, register indirect, register displacement, indexed.
Delayed loads and branches.
Instruction pre-fetch and speculative execution.
Examples: MIPS, SPARC, PowerPC, Alpha
90
Example RISC ISA HP Precision Architecture, HP
PA-RISC

Operand sizes
Five operand sizes ranging in powers of two from
1 to 16 bytes.
Instruction Encoding
Instruction set has 12 different formats.
All are 32 bits in length.

7 addressing modes
Register
Immediate
Base with displacement
Base with scaled index and displacement
Predecrement
Postincrement
PC-relative

91
Example RISC ISA DEC/Compaq/Intel? Alpha AXP

Operand sizes
Four operand sizes 1, 2, 4 or 8 bytes.
Instruction Encoding
Instruction set has 7 different formats.
All are 32 bits in length.

4 addressing modes
Register direct.
Immediate.
Register indirect with displacement.
PC-relative.

92
RISC ISA Example MIPS R3000 (32-bits)

5 Addressing Modes
Register direct (arithmetic).
Immediate (arithmetic).
Base register immediate offset (loads and
stores).
PC relative (branches).
Pseudodirect (jumps)
Operand Sizes
Memory accesses in any multiple between 1 and 4
bytes.

Instruction Categories
Load/Store.
Computational.
Jump and Branch.
Floating Point
(using coprocessor).
Memory Management.
Special.

93
A RISC ISA Example MIPS
94
An Instruction Set Example MIPS64

A RISC-type 64-bit instruction set architecture
based on instruction set design considerations of
chapter 2
Use general-purpose registers with a load/store
architecture to access memory.
Reduced number of addressing modes: displacement (offset size of 16 bits), immediate (16 bits).
Data sizes: 8 (byte), 16 (half word), 32 (word), 64 (double word) bit integers and 32-bit or 64-bit IEEE 754 floating-point numbers.
Use fixed instruction encoding (32 bits) for performance.
32 general-purpose integer registers, each 64 bits wide (GPRs): R0, ..., R31. R0 always has a value of zero.
Separate 32 floating point registers, each 64 bits wide (FPRs): F0, F1, ..., F31. When holding a 32-bit single-precision number, the upper half of the FPR is not used.

95
MIPS64 Instruction Format (1/2)
I-type instruction
Encodes: Loads and stores of bytes, words, half words. All immediates (rt <- rs op immediate). Conditional branch instructions. Jump register, jump and link register (rs = destination, immediate = 0)
R-type instruction
Opcode (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | func (6 bits)
Register-register ALU operations: rd <- rs func rt
Function encodes the data path operation: Add, Sub, ...
Read/write special registers and moves.
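
Packing and unpacking the R-type fields is a matter of shifts and masks. The sketch below assumes the usual MIPS field order (opcode, rs, rt, rd, shamt, func, from most to least significant bits); the function names are hypothetical and the opcode/func values in the example are illustrative numbers only.

```python
# (field name, width in bits), most significant field first.
R_TYPE_FIELDS = [("opcode", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("func", 6)]


def encode_r_type(**fields):
    """Pack the six R-type fields into one 32-bit instruction word."""
    word = 0
    for name, width in R_TYPE_FIELDS:
        value = fields[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r_type(word):
    """Unpack a 32-bit word into its R-type fields."""
    fields = {}
    shift = 32
    for name, width in R_TYPE_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


# rd = R1, rs = R2, rt = R3; opcode and func values are illustrative.
word = encode_r_type(opcode=0, rs=2, rt=3, rd=1, shamt=0, func=0x2D)
print(hex(word), decode_r_type(word))
```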
96
MIPS64 Instruction Format (2/2)
J - Type instruction
Jump and jump and link. Trap and return from
exception
97
MIPS Addressing Modes/Instruction Formats

All instructions 32 bits wide

[Figure: instruction field layouts for R-type (ALU), loads/stores, and branches. Pseudodirect addressing for jumps (J-type) is not shown here.]
98
MIPS64 Instructions Load and Store

Notation: ←n transfers n bits, ## concatenates, x^n replicates x n times, and a suffix _i or _i..j selects bits of a value (bit 0 is the most significant).
LD R1, 30(R2)    Load double word      Regs[R1] ←64 Mem[30+Regs[R2]]
LW R1, 60(R2)    Load word             Regs[R1] ←64 (Mem[60+Regs[R2]]_0)^32 ## Mem[60+Regs[R2]]
LB R1, 40(R3)    Load byte             Regs[R1] ←64 (Mem[40+Regs[R3]]_0)^56 ## Mem[40+Regs[R3]]
LBU R1, 40(R3)   Load byte unsigned    Regs[R1] ←64 0^56 ## Mem[40+Regs[R3]]
LH R1, 40(R3)    Load half word        Regs[R1] ←64 (Mem[40+Regs[R3]]_0)^48 ## Mem[40+Regs[R3]] ## Mem[41+Regs[R3]]
L.S F0, 50(R3)   Load FP single        Regs[F0] ←64 Mem[50+Regs[R3]] ## 0^32
L.D F0, 50(R2)   Load FP double        Regs[F0] ←64 Mem[50+Regs[R2]]
SD R3, 500(R4)   Store double word     Mem[500+Regs[R4]] ←64 Regs[R3]
SW R3, 500(R4)   Store word            Mem[500+Regs[R4]] ←32 Regs[R3]
S.S F0, 40(R3)   Store FP single       Mem[40+Regs[R3]] ←32 Regs[F0]_0..31
S.D F0, 40(R3)   Store FP double       Mem[40+Regs[R3]] ←64 Regs[F0]
SH R3, 502(R2)   Store half            Mem[502+Regs[R2]] ←16 Regs[R3]_48..63
SB R2, 41(R3)    Store byte            Mem[41+Regs[R3]] ←8 Regs[R2]_56..63
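
The difference between the sign-extending loads (LB, LH, LW) and the zero-extending LBU can be made concrete with a short sketch. It assumes byte-addressed memory modeled as a Python dict and big-endian byte order (the lower address holds the more significant byte, as in the LH description above). The names `load`, `sign_extend` and `zero_extend` are hypothetical.

```python
MASK64 = (1 << 64) - 1

def sign_extend(value, bits):
    """Replicate the top bit of a `bits`-wide value across a 64-bit register."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK64

def zero_extend(value, bits):
    """Fill the upper register bits with zeros."""
    return value & ((1 << bits) - 1)

def load(mem, addr, nbytes, signed=True):
    """Read nbytes (big-endian) starting at addr and extend to 64 bits."""
    raw = 0
    for i in range(nbytes):
        raw = (raw << 8) | mem[addr + i]
    width = 8 * nbytes
    return sign_extend(raw, width) if signed else zero_extend(raw, width)
```

With `mem = {40: 0x80, 41: 0x01}`, `load(mem, 40, 1)` behaves like LB and fills the upper 56 bits with ones, while `load(mem, 40, 1, signed=False)` behaves like LBU and leaves them zero.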

99
MIPS64 Instructions Arithmetic/Logical

DADDU R1, R2, R3   Add unsigned           Regs[R1] ← Regs[R2] + Regs[R3]
DADDI R1, R2, 3    Add immediate          Regs[R1] ← Regs[R2] + 3
LUI R1, 42         Load upper immediate   Regs[R1] ← 0^32 ## 42 ## 0^16
DSLL R1, R2, 5     Shift left logical     Regs[R1] ← Regs[R2] << 5
DSLT R1, R2, R3    Set less than          if (Regs[R2] < Regs[R3]) Regs[R1] ← 1 else Regs[R1] ← 0
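
A minimal sketch of the register-transfer descriptions on this slide, treating registers as 64-bit unsigned Python integers. The function names mirror the mnemonics but are otherwise hypothetical. The set-less-than comparison is assumed to be signed, and `lui` follows the slide's description with the upper 32 bits zero.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Reinterpret a 64-bit pattern as a signed two's-complement value."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def daddu(a, b):
    """Add, wrapping around at 64 bits (no overflow trap)."""
    return (a + b) & MASK64

def lui(imm16):
    """0^32 ## imm ## 0^16: place the immediate in bits 16..31 of the low word."""
    return (imm16 & 0xFFFF) << 16

def dsll(a, shamt):
    """Shift left logical, discarding bits shifted out of the register."""
    return (a << shamt) & MASK64

def dslt(a, b):
    """Set less than: 1 if a < b as signed values, else 0."""
    return 1 if to_signed64(a) < to_signed64(b) else 0
```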

100
MIPS64 Instructions Control-Flow

J name            Jump                       PC_36..63 ← name
JAL name          Jump and link              Regs[R31] ← PC+4; PC_36..63 ← name;
                                             ((PC+4) - 2^27) ≤ name < ((PC+4) + 2^27)
JALR R2           Jump and link register     Regs[R31] ← PC+4; PC ← Regs[R2]
JR R3             Jump register              PC ← Regs[R3]
BEQZ R4, name     Branch equal zero          if (Regs[R4] == 0) PC ← name;
                                             ((PC+4) - 2^17) ≤ name < ((PC+4) + 2^17)
BNEZ R4, name     Branch not equal zero      if (Regs[R4] != 0) PC ← name;
                                             ((PC+4) - 2^17) ≤ name < ((PC+4) + 2^17)
MOVZ R1, R2, R3   Conditional move if zero   if (Regs[R3] == 0) Regs[R1] ← Regs[R2]
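
The reach limits quoted for branches (2^17 bytes around PC+4) and jumps (2^27 bytes) follow from a 16-bit or 26-bit word offset shifted left by 2. The sketch below checks a target against those limits and derives the 16-bit branch field. The function names are hypothetical, and the field is assumed to hold a word offset relative to PC+4.

```python
def branch_in_range(pc, target):
    """True if (PC+4) - 2^17 <= target < (PC+4) + 2^17."""
    base = pc + 4
    return base - 2**17 <= target < base + 2**17

def jump_in_range(pc, target):
    """True if (PC+4) - 2^27 <= target < (PC+4) + 2^27."""
    base = pc + 4
    return base - 2**27 <= target < base + 2**27

def branch_offset_field(pc, target):
    """16-bit field for a branch: signed word offset relative to PC+4."""
    delta = target - (pc + 4)
    if delta % 4 != 0 or not branch_in_range(pc, target):
        raise ValueError("target not encodable as a branch from this PC")
    return (delta >> 2) & 0xFFFF
```

A branch back to its own address, for instance, has a byte offset of -4 relative to PC+4 and so encodes as the word offset -1, or 0xFFFF in the 16-bit field.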

101
The Role of Compilers

The Structure of Recent Compilers

Front end per language
Dependencies: language dependent, machine independent
Function: transform language to common intermediate form
High-level optimizations
Dependencies: somewhat language dependent, largely machine independent
Function: for example, procedure inlining and loop transformations
Global optimizer
Dependencies: small language dependencies, machine dependencies slight (e.g., register counts/types)
Function: include global and local optimizations + register allocation
Code generator
Dependencies: highly machine dependent, language independent
Function: detailed instruction selection and machine-dependent optimizations, may include or be followed by assembler
102
The Role of Compilers
103
Compiler Optimization and Instruction Count
Change in instruction count for the programs
lucas and mcf from SPEC2000 as compiler
optimizations vary.
104
Typical Operations
Data Movement
Load (from memory), Store (to memory),
memory-to-memory move, register-to-register move,
input (from I/O device), output (to I/O device),
push, pop (to/from stack)
Arithmetic
integer (binary + decimal) or FP: Add, Subtract,
Multiply, Divide
Logical
not, and, or, set, clear
Shift
shift left/right, rotate left/right
Control (Jump/Branch)
unconditional, conditional
Subroutine Linkage
call, return
Interrupt
trap, return
Synchronization
test & set (atomic r-m-w)
String
search, translate (e.g., char to int)
105
Top 10 80x86 Instructions
106
Methods of Testing Condition

Condition Codes
Processor status bits are set as a side effect
of arithmetic instructions (possibly on moves) or
explicitly by compare or test instructions.
Ex: add r1, r2, r3
    bz label
Condition Register
Ex: cmp r1, r2, r3
    bgt r1, label
Compare and Branch
Ex: bgt r1, r2, label

107
Condition Codes
Setting CC as a side effect can reduce the number of
instructions:
X: . . .
   SUB r0, 1, r0
   BRP X
vs.
X: . . .
   SUB r0, 1, r0
   CMP r0, 0
   BRP X
But it also has disadvantages:
--- not all instructions set the condition codes;
which do and which do not is often confusing
(e.g., the shift instruction sets the carry bit)
--- dependency between the instruction that sets the
CC and the one that tests it: to overlap
their execution, may need to separate them
with an instruction that does not change the CC
[Figure: pipeline timing of two overlapped instructions (ifetch, read, compute, write), marking where the new CC is computed and where the old CC is read.]
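
To make "set as a side effect" concrete, the sketch below computes a subtraction together with the four condition-code bits. It is an illustration only: the function name is hypothetical, the width defaults to 32 bits, and the carry bit is assumed to mean "a borrow occurred" on subtract, a convention that differs between architectures.

```python
def sub_with_flags(a, b, bits=32):
    """Return (result, flags) for a - b on `bits`-wide two's-complement values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    flags = {
        "N": bool(result & sign),                   # result is negative
        "Z": result == 0,                           # result is zero
        # Overflow: operands differ in sign and the result's sign differs from a.
        "V": bool((a ^ b) & (a ^ result) & sign),
        "C": a < b,                                 # borrow (assumed convention)
    }
    return result, flags
```

In the loop example above, a branch-on-positive can test N directly after the SUB, which is why the separate CMP becomes unnecessary.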
108
Branches
--- Conditional control transfers
Four basic conditions:
N -- negative
Z -- zero
V -- overflow
C -- carry
Sixteen combinations of the basic four conditions
(~ = NOT, + = OR, xor = exclusive OR):
Always                   Unconditional
Never                    NOP
Not Equal                ~Z
Equal                    Z
Greater                  ~[Z + (N xor V)]
Less or Equal            Z + (N xor V)
Greater or Equal         ~(N xor V)
Less                     N xor V
Greater Unsigned         ~(C + Z)
Less or Equal Unsigned   C + Z
Carry Clear              ~C
Carry Set                C
Positive                 ~N
Negative                 N
Overflow Clear           ~V
Overflow Set             V
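
The sixteen branch conditions can be written as Boolean functions of the four condition-code bits. The table below is a sketch that assumes the conventional definitions (the ones used by SPARC-style condition codes, for example). The dictionary and function names are hypothetical, and the arguments are expected to be Python booleans.

```python
CONDITIONS = {
    "always":                 lambda N, Z, V, C: True,
    "never":                  lambda N, Z, V, C: False,
    "not_equal":              lambda N, Z, V, C: not Z,
    "equal":                  lambda N, Z, V, C: Z,
    "greater":                lambda N, Z, V, C: not (Z or (N != V)),
    "less_or_equal":          lambda N, Z, V, C: Z or (N != V),
    "greater_or_equal":       lambda N, Z, V, C: N == V,
    "less":                   lambda N, Z, V, C: N != V,
    "greater_unsigned":       lambda N, Z, V, C: not (C or Z),
    "less_or_equal_unsigned": lambda N, Z, V, C: C or Z,
    "carry_clear":            lambda N, Z, V, C: not C,
    "carry_set":              lambda N, Z, V, C: C,
    "positive":               lambda N, Z, V, C: not N,
    "negative":               lambda N, Z, V, C: N,
    "overflow_clear":         lambda N, Z, V, C: not V,
    "overflow_set":           lambda N, Z, V, C: V,
}

def taken(name, N, Z, V, C):
    """Return True if the named conditional branch would be taken."""
    return CONDITIONS[name](N, Z, V, C)
```

The signed comparisons use N xor V rather than N alone because a subtraction that overflows produces a result whose sign bit is inverted.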
109
Conditional Branch Distance
Distance from branch in instructions: bucket 2^i
covers distances ≤ ±2^(i-1) and > 2^(i-2).
25% of integer branches are > 2 to ≤ 4
instructions away.
110
Conditional Branch Addressing

PC-relative, since most branches are to nearby
instructions. At least 8 bits suggested
(±128 instructions).
Compare Equal/Not Equal most important for
integer programs (86%)
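
The link between field width and reach can be checked with a few lines of code. The helper below is hypothetical and assumes a signed two's-complement offset field counted in instructions, so an n-bit field covers -2^(n-1) through 2^(n-1) - 1.

```python
def offset_bits_needed(distance):
    """Smallest signed field width (in bits) that can hold `distance` instructions."""
    bits = 1
    while not (-(1 << (bits - 1)) <= distance <= (1 << (bits - 1)) - 1):
        bits += 1
    return bits
```

An 8-bit field therefore reaches 128 instructions backward and 127 forward, which is the "roughly ±128" figure quoted above.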

111
Operation Summary

Support these simple instructions, since they
will dominate the number of instructions
executed
load,
store,
add,
subtract,
move register-register,
and,
shift,
compare equal, compare not equal,
branch (with a PC-relative address at least
8-bits long), jump,
call,
return

112
Data Types
Bit: 0, 1
Bit String: sequence of bits of a particular length
4 bits is a nibble
8 bits is a byte
16 bits is a half-word (VAX word)
32 bits is a word (VAX long word)
Character:
ASCII 7 bit code
EBCDIC 8 bit code
Decimal:
digits 0-9 encoded as 0000b thru 1001b
two decimal digits packed per 8 bit byte
Integers:
Sign & Magnitude: 0X vs. 1X
1's Complement: 0X vs. 1(~X)
2's Complement: 0X vs. (1's comp) + 1
Positive #'s same in all. First 2 have two zeros. Last one usually chosen.
Floating Point:
Single Precision
Double Precision
Extended Precision
Representation: M x R^E (M = mantissa, R = base, E = exponent)
How many +/- #'s? Where is decimal pt? How are +/- exponents represented?
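
The three integer representations and the packed decimal format can be compared side by side with a short sketch. The function names are hypothetical and an 8-bit width is assumed for the examples.

```python
def sign_magnitude(x, bits=8):
    """Top bit is the sign, remaining bits hold the magnitude."""
    mag = abs(x)
    if mag >= 1 << (bits - 1):
        raise ValueError("magnitude does not fit")
    return ((1 << (bits - 1)) if x < 0 else 0) | mag

def ones_complement(x, bits=8):
    """Negative values are the bitwise complement of the magnitude."""
    mask = (1 << bits) - 1
    return x & mask if x >= 0 else ~(-x) & mask

def twos_complement(x, bits=8):
    """Negative values are the one's complement plus one."""
    return x & ((1 << bits) - 1)

def packed_bcd(n):
    """Pack a two-digit decimal number (0-99) into one byte, one digit per nibble."""
    if not 0 <= n <= 99:
        raise ValueError("need a two-digit decimal number")
    tens, ones = divmod(n, 10)
    return (tens << 4) | ones
```

For -5 the three encodings are 0x85, 0xFA and 0xFB, while +5 is 0x05 in all of them. Sign-magnitude and one's complement each have a second zero (0x80 and 0xFF in 8 bits), which is one reason two's complement is usually chosen.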
113
Operand Size Usage

Support these data sizes and types: 8-bit,
16-bit, 32-bit integers and 32-bit and 64-bit
IEEE 754 floating-point numbers

114
Instruction Format

If there are many memory operands per instruction
and many addressing modes, an address specifier
is needed per operand.
If it is a load-store machine with one address per
instruction and one or two addressing modes, then
just encode the addressing mode in the opcode.

115
Generic Examples of Instruction Formats

Variable Fixed Hybrid
116
Summary of Instruction Formats