Topic II Instruction-Set Architecture

Provided by: guang4

Learn more at: https://www.capsl.udel.edu

1
Topic II: Instruction-Set Architecture

Introduction
A Case Study: The MIPS Instruction-Set Architecture

2
Reading List

Slides: Topic2x
Henn & Patt, Chapter 2
Other papers as assigned in class or homeworks

3
The Stored Memory Computer

Five parts of a computer
Datapath (channels/changes bits)
Control (directs operations)
Memory (places to keep bits)
Input (get data from outside)
Output (send data to outside)

4
Steps in Executing an Instruction

Instruction Fetch: Fetch the next instruction from memory
Instruction Decode: Examine the instruction to determine
- What operation is performed by the instruction (e.g., addition)
- What operands are required, and where the result goes
Operand Fetch: Fetch the operands
Execution: Perform the operation on the operands
Result Writeback: Write the result to the specified location
Next Instruction: Determine where to get the next instruction

5
What is Specified in an ISA?

Instruction Decode: How are operations and operands specified?
Operand Fetch: Where can operands be located? How many?
Execution: What operations can be performed? What data types and sizes?
Result Writeback: Where can results be written? How many?
Next Instruction: How can we choose the next instruction?

6
A Simple ISA: Memory-Memory

What operations can be performed? Basic arithmetic (for now)
What data types and sizes? 32-bit integers
Where can operands and results be located? Memory
How many operands and results? 2 operands, 1 result
How are operations and operands specified? OP DEST, SRC1, SRC2
How can we choose the next instruction? Next in sequence

7
Memory Model

Think of memory as a large array M of n integers, referenced by index (Random Access Memory, or RAM).

For instance, M[1] contains the value 3. We can read and write these locations. These are the only locations available to us. All abstract locations (such as variables in a C program) must be assigned locations in M.

Address   Contents
0         14
1         3
2         99
. . .     . . .
n - 1     0

8
Simple Code Translation

Given the C code
A = B + C
Assume that variable A uses location 100, B uses 48, and C uses 76.
Convert the code above to the following assembly code:
ADD M[100], M[48], M[76]
How would we express
A = (B + C) * (D + E)?

9
Using a Temporary Location

Assume we put A in 100, B in 48, C in 76, D in 20, and E in 32.
Now choose an unused memory location (e.g., 84).
ADD M[100], M[48], M[76]     # A = B + C
ADD M[84], M[20], M[32]      # temp = D + E
MUL M[100], M[100], M[84]    # A = A * temp
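
The three-instruction sequence above can be checked with a small simulation. This is only a sketch of a hypothetical memory-memory machine: the function name `run_memory_memory`, the tuple encoding of instructions, and the sample values stored for B, C, D, and E are assumptions made for illustration, not part of the slides.

```python
import operator

# Hypothetical three-operand memory-memory machine: OP DEST, SRC1, SRC2.
OPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}


def run_memory_memory(program, memory):
    """Run (op, dest, src1, src2) tuples; every operand is a memory address."""
    for op, dest, src1, src2 in program:   # next instruction: next in sequence
        a = memory[src1]                   # operand fetch (two memory reads)
        b = memory[src2]
        memory[dest] = OPS[op](a, b)       # execute, then write back to memory
    return memory


# Addresses from the slide: A=100, B=48, C=76, D=20, E=32, temp=84.
# The stored values are arbitrary sample data.
memory = {100: 0, 48: 2, 76: 3, 20: 4, 32: 5, 84: 0}
program = [
    ("ADD", 100, 48, 76),    # A = B + C
    ("ADD", 84, 20, 32),     # temp = D + E
    ("MUL", 100, 100, 84),   # A = A * temp
]
run_memory_memory(program, memory)
print(memory[100])  # (2 + 3) * (4 + 5) = 45
```

Each instruction makes three memory accesses (two reads, one write), which is the cost the following slides set out to reduce.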

10
Problems with Memory-Memory ISAs

Main memory is much slower than arithmetic circuits
- This was as true in 1950 as it is today!
It takes a lot of room to specify memory addresses
Results are often used one or two instructions later
Remember: make the common case fast!
Solution: store temporary or intermediate results in fast memories near the arithmetic units.

11
Accumulator Machines

An accumulator machine keeps a single high-speed buffer (e.g., a set of D latches or flip-flops, one for each data bit) near the arithmetic logic.
In the simplest kind, only one operand can be specified; the accumulator is implicit.
OP operand means acc. <- acc. OP operand
Example:
LOAD M[48]      # Load B into acc.
ADD M[76]       # Add C to acc. (now has B + C)
STORE M[100]    # Write acc. to A

12
Accumulator Machine Does A = (B + C) * (D + E)

LOAD M[20]      # Load D into acc.
ADD M[32]       # Add E to acc. (now has D + E)
STORE M[100]    # Write acc. to A
LOAD M[48]      # Load B into acc.
ADD M[76]       # Add C to acc. (now has B + C)
MUL M[100]      # Multiply acc. by A
STORE M[100]    # Write (B + C) * (D + E) to A
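
The seven-instruction program can be traced with a minimal accumulator simulator. This is a sketch under assumptions: the function `run_accumulator`, the (opcode, address) encoding, and the sample data values are hypothetical and chosen only to make the trace concrete.

```python
def run_accumulator(program, memory):
    """Run (op, address) pairs on a one-accumulator machine.

    Returns the final accumulator value and the number of data-memory accesses.
    """
    acc = 0
    accesses = 0
    for op, addr in program:
        accesses += 1                  # each instruction here touches memory once
        if op == "LOAD":
            acc = memory[addr]
        elif op == "ADD":
            acc = acc + memory[addr]
        elif op == "MUL":
            acc = acc * memory[addr]
        elif op == "STORE":
            memory[addr] = acc
        else:
            raise ValueError(f"unknown opcode: {op}")
    return acc, accesses


# Addresses from the slide: A=100, B=48, C=76, D=20, E=32 (sample values).
acc_memory = {100: 0, 48: 2, 76: 3, 20: 4, 32: 5}
acc_program = [
    ("LOAD", 20),    # acc = D
    ("ADD", 32),     # acc = D + E
    ("STORE", 100),  # A = D + E (A doubles as the temporary)
    ("LOAD", 48),    # acc = B
    ("ADD", 76),     # acc = B + C
    ("MUL", 100),    # acc = (B + C) * A
    ("STORE", 100),  # A = (B + C) * (D + E)
]
final_acc, accesses = run_accumulator(acc_program, acc_memory)
print(acc_memory[100], accesses)  # 45 7
```

Each instruction names only one address, but the intermediate sum D + E still has to be spilled to memory and read back.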

13
Shortcomings of Accumulator Machines

Still requires storing lots of temporary and
intermediate values in memory
Accumulator is only really beneficial for a chain
(sequence) of calculations where the result of
one is the input to the next.

14
Still, Accumulator Machines Were Common in Early Computers

A simple design, and hence popular, especially for
- Early computers
- Early microprocessors (4004, 8008)
- Low-end (cheap) models
Reason: accumulator logic is much more expensive than memory
- Vacuum tubes vs. core memory
- D flip-flops vs. DRAM
- Precious space on processor chip vs. off-chip DRAM

15
Alternatives to Accumulator Machines

If more hardware resources are available, put
more fast storage locations alongside the
accumulator
Stack machines
Register machines
- Special purpose
- General purpose

16
Stack Machines

Idea: A pile of fast storage locations with a top and a bottom.

An instruction can only get at the top value, or maybe the top two or three values. We can put new values on the top ("push") or take them off the top ("pop"), but that's it. We can't get to locations underneath the top unless we remove everything above.

Position       Contents
top            14
2nd from top   3
3rd from top   99
. . .          . . .
bottom         0

17
Stack Machine ISA

Basic operations include
Load: get value from memory and push onto stack
Store: pop value off of stack and put into memory
Arithmetic: pop 1 or 2 values off of stack; push result on stack
Dup: get value at top of stack without removing it; push new copy onto stack (why is this useful?)

18
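Stack Machine Does A = (B + C) * (D + E)

Stack contents are listed top first; XXX is whatever was on the stack at the start.

Instruction     Stack after instruction
(start)         XXX
LOAD M[20]      (D), XXX
LOAD M[32]      (E), (D), XXX
ADD             (D + E), XXX
LOAD M[48]      (B), (D + E), XXX
(continued next slide)

19
Stack Machine (cont.)

Instruction     Stack after instruction
LOAD M[76]      (C), (B), (D + E), XXX
ADD             (B + C), (D + E), XXX
MULT            ((B + C) * (D + E)), XXX
STORE M[100]    XXX
Note that the stack is now the same as when we began.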
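
The stack trace across these two slides can be reproduced with a short simulator. This is a sketch only: `run_stack`, the tuple encoding, and the sample data values are assumptions for illustration, and the top of the stack is modeled as the end of a Python list.

```python
def run_stack(program, memory, stack):
    """Run a stack-machine program; the top of the stack is the end of the list."""
    for instr in program:
        op = instr[0]
        if op == "LOAD":                  # push a value from memory
            stack.append(memory[instr[1]])
        elif op == "STORE":               # pop a value into memory
            memory[instr[1]] = stack.pop()
        elif op == "DUP":                 # push a copy of the top value
            stack.append(stack[-1])
        elif op in ("ADD", "MULT"):       # pop two values, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# Addresses from the slides: A=100, B=48, C=76, D=20, E=32 (sample values).
stack_memory = {100: 0, 48: 2, 76: 3, 20: 4, 32: 5}
stack = ["XXX"]                           # whatever was on the stack at the start
stack_program = [
    ("LOAD", 20), ("LOAD", 32), ("ADD",),   # D + E
    ("LOAD", 48), ("LOAD", 76), ("ADD",),   # B + C
    ("MULT",),                              # (D + E) * (B + C)
    ("STORE", 100),                         # write the product to A
]
run_stack(stack_program, stack_memory, stack)
print(stack_memory[100], stack)  # 45 ['XXX']
```

No temporary memory location is needed here: the intermediate sums live on the stack, and the stack ends exactly as it started.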
20
Stack Machines Used

Some early computers
x86 floating-point unit (x87) (sort of)
Java Virtual Machine (JVM)

21
Register Machines

Idea: Put more storage locations (registers) near the accumulator
Regs have names/numbers and can be used instead of memory
Accessed much faster than main memory (1-2 CPU cycles vs. 10s to 100s of cycles)
Far fewer registers than memory locations
- MIPS has 32 32-bit registers
- Fewer regs, smaller addresses, fewer bits to name them
A scarce resource: use them carefully!

22
Special- vs. General-Purpose Registers

A special-purpose register is used for specific purposes, and there may be limitations on which operations can use it
- Easier on the HW design: put the reg right where it's needed
- More difficult for the compiler to use effectively
A general-purpose register can be used in any operation
- Datapaths are more general, but routing is more difficult

23
Special-Purpose Registers: The Z-80 CPU

Seven 8-bit registers: A, B, C, D, E, H, L (BC, DE, HL can be pairs)
Three 16-bit registers: SP, IX, IY, plus PC (program counter)
Add, subtract, shift can only be done to A (8-bit accumulator)
Increment and decrement can be done to all regs and reg pairs
Can fetch from memory at address (HL) and put in any 8-bit reg
A fetch from address (BC) or (DE) can only go to A
Fetches from (BC), (HL) and (IX) take different numbers of cycles
Anyone want to write a compiler for this?

24
General Purpose Register (GPR) Machines

MIPS (and similar processors) has 32 General Purpose Registers (GPRs), each 32 bits long. All can be read or written, except register 0, which is always 0 and can't be changed.
Register access time is uniform.

Register  Contents
0         0
1         3
2         99
. . .     . . .
31        14

25
GPR Machine Does A = (B + C) * (D + E)

ADD $1, M[48], M[76]     # R1 = B + C
ADD $2, M[20], M[32]     # R2 = D + E
MUL M[100], $1, $2       # A = R1 * R2
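
The same computation can be sketched on a register machine whose register 0 is hardwired to zero. The class `GPRMachine`, the ("R", n) / ("M", address) operand encoding, and the sample data are hypothetical; like the slide's example, this sketch lets an instruction name either a register or a memory location as an operand.

```python
MASK32 = 0xFFFFFFFF  # registers and memory words are 32 bits wide


class GPRMachine:
    """Toy machine with 32 registers; register 0 always reads as 0."""

    def __init__(self, memory):
        self.regs = [0] * 32
        self.memory = memory

    def read(self, operand):
        kind, n = operand
        return self.regs[n] if kind == "R" else self.memory[n]

    def write(self, operand, value):
        kind, n = operand
        value &= MASK32
        if kind == "M":
            self.memory[n] = value
        elif n != 0:                      # writes to register 0 are ignored
            self.regs[n] = value

    def execute(self, op, dest, src1, src2):
        a, b = self.read(src1), self.read(src2)
        if op == "ADD":
            result = a + b
        elif op == "MUL":
            result = a * b
        else:
            raise ValueError(f"unknown opcode: {op}")
        self.write(dest, result)


def R(n):
    return ("R", n)


def M(address):
    return ("M", address)


# Addresses from the slides: A=100, B=48, C=76, D=20, E=32 (sample values).
machine = GPRMachine({100: 0, 48: 2, 76: 3, 20: 4, 32: 5})
machine.execute("ADD", R(1), M(48), M(76))    # R1 = B + C
machine.execute("ADD", R(2), M(20), M(32))    # R2 = D + E
machine.execute("MUL", M(100), R(1), R(2))    # A = R1 * R2
machine.execute("ADD", R(0), R(1), R(2))      # attempt to overwrite register 0
print(machine.memory[100], machine.regs[0])   # 45 0
```

Both intermediate sums stay in registers, so the only memory traffic is the four operand reads and the final store.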

26
Some Trends

From hardware technology: the number of registers (# of Rs) that can be put on a chip has the potential to grow very fast (Moore's Law?)
A very large register set will have a slow access time.
Instruction set evolution is slow to accommodate the change in the # of Rs

27
Memory and Data Sizes

So far, we've only talked about uniform data sizes. Actual data come in many different sizes:
Single bits: boolean values (true or false)
Bytes (8 bits): characters (ASCII), very small integers
Halfwords (16 bits): characters (Unicode), short integers
Words (32 bits): long integers, floating-point (FP) numbers
Double-words (64 bits): very long integers, double-precision FP
Quad-words (128 bits): quad-precision floating-point numbers

NOTE: There is another data size, called extended double precision, which is 80 bits long. It is used in x86 FPUs.

28
Different Data Sizes

How do we handle different data sizes?
Pick one size to be the unit stored in a single address
Store a larger datum in a set of contiguous memory locations
Store a smaller datum in one location; use shift & mask ops
Today, almost all machines (including MIPS) are byte-addressable: each addressable location in memory holds 8 bits.
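
The "shift & mask" idea can be sketched as follows for a machine whose addressable unit is a 32-bit word. The helper names `extract_byte` and `insert_byte`, and the choice to number byte 0 as the least significant byte, are assumptions made for this illustration.

```python
def extract_byte(word, index):
    """Return byte `index` of a 32-bit word (index 0 = least significant byte)."""
    return (word >> (8 * index)) & 0xFF


def insert_byte(word, index, byte):
    """Return `word` with byte `index` replaced by `byte`."""
    shift = 8 * index
    cleared = word & ~(0xFF << shift) & 0xFFFFFFFF   # zero out the target byte
    return cleared | ((byte & 0xFF) << shift)        # drop the new byte in


word = 0x000F4240
print(hex(extract_byte(word, 1)))        # 0x42
print(hex(insert_byte(word, 3, 0xAB)))   # 0xab0f4240
```

A byte-addressable machine does this selection in hardware, which is one reason byte addressing became the norm.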

29
MIPS Memory

On a byte-addressable machine such as the MIPS, if we say a word (32 bits) is stored at address 80, we mean it occupies locations 80-83. (The next word would start at 84.)
Normally, multi-byte loads and stores must be aligned: the address of an n-byte load/store must be a multiple of n. For instance, halfwords can only be stored at even addresses.
MIPS allows non-aligned loads and stores using special instructions, but they may be slower. (Most processors don't allow this at all!)
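
The alignment rule is simple modular arithmetic, as this short sketch shows. The helper names `occupied_addresses` and `is_aligned` are hypothetical.

```python
def occupied_addresses(address, size):
    """Byte addresses covered by a `size`-byte datum stored at `address`."""
    return list(range(address, address + size))


def is_aligned(address, size):
    """An n-byte access is aligned when its address is a multiple of n."""
    return address % size == 0


print(occupied_addresses(80, 4))   # [80, 81, 82, 83]
print(is_aligned(80, 4), is_aligned(82, 4), is_aligned(82, 2), is_aligned(81, 2))
# True False True False
```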

30
Byte-Order (Endianness)

For a multi-byte datum, which part goes in which byte?
If $1 contains 1,000,000 (F4240H) and we store it into address 80:
On a big-endian machine, the "big" end (most significant byte) goes into address 80
On a little-endian machine, it's the other way around

Big-endian:
Address    79  80  81  82  83  84
Contents       00  0F  42  40

Little-endian:
Address    79  80  81  82  83  84
Contents       40  42  0F  00
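
The two byte layouts can be reproduced with Python's built-in `int.to_bytes` and `int.from_bytes`. The helpers `store_word` and `load_word` and the dictionary used as a byte-addressable memory are assumptions for this sketch.

```python
def store_word(memory, address, value, byteorder):
    """Store a 32-bit value as four bytes starting at `address`."""
    for offset, byte in enumerate(value.to_bytes(4, byteorder)):
        memory[address + offset] = byte


def load_word(memory, address, byteorder):
    """Read four bytes starting at `address` and combine them into one integer."""
    raw = bytes(memory[address + i] for i in range(4))
    return int.from_bytes(raw, byteorder)


big, little = {}, {}
store_word(big, 80, 1_000_000, "big")
store_word(little, 80, 1_000_000, "little")
print([f"{big[a]:02X}" for a in range(80, 84)])      # ['00', '0F', '42', '40']
print([f"{little[a]:02X}" for a in range(80, 84)])   # ['40', '42', '0F', '00']

# Reading big-endian bytes with the little-endian rule gives a different number.
print(load_word(big, 80, "little") == 1_000_000)     # False
```

The last line is the compatibility problem in miniature: the bytes are intact, but the reader and the writer disagree about their order.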

31
Big-Endian vs. Little-Endian

Big-endian machines: MIPS, Sparc, 68000
Little-endian machines: most Intel processors, Alpha, VAX, Intel 8086
No real reason one is better than the other
Compatibility problems when transferring multi-byte data between big-endian and little-endian machines: CAREFUL!
Read Appendix A-43 for more information.

32
Addressing Modes

An ISA's addressing modes answer the question "where can operands be located?"
We have two types of storage in the MIPS (and most other machines): registers and main memory.
We can go to either or both for operands. A single operand can come from either a register or a memory location, and addressing modes offer various ways of specifying this location.

33
Simple Addressing Modes

In these modes, a location or datum is given directly in the instruction

Mode name              Example         Meaning
Register               mov $1, $2      R2 -> R1
Direct (or absolute)   mov $1, (40)    M[40] -> R1
Immediate              mov $1, 40      40 -> R1

34
Indirect Addressing Modes

One or more registers are used to produce a memory address

Mode name       Example            Meaning
Reg. Indirect   mov $1, ($2)       M[R2] -> R1
Displacement    mov $1, 40($2)     M[40 + R2] -> R1
Indexed         mov $1, $4($2)     M[R4 + R2] -> R1
Mem. Indirect   mov $1, @($2)      M[M[R2]] -> R1
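
The simple and indirect modes differ only in how the source value is located, which a small sketch can make concrete. The function `fetch_operand`, its mode names and keyword parameters, and the sample register and memory contents are all hypothetical; the sketch returns the value that would be copied into R1.

```python
def fetch_operand(mode, regs, mem, *, reg=None, index=None, const=None):
    """Return the source value selected by one addressing mode."""
    if mode == "register":          # value is in a register
        return regs[reg]
    if mode == "immediate":         # value is the constant itself
        return const
    if mode == "direct":            # constant is a memory address
        return mem[const]
    if mode == "reg_indirect":      # register holds the address
        return mem[regs[reg]]
    if mode == "displacement":      # address = constant + register
        return mem[const + regs[reg]]
    if mode == "indexed":           # address = register + register
        return mem[regs[index] + regs[reg]]
    if mode == "mem_indirect":      # memory holds the address of the value
        return mem[mem[regs[reg]]]
    raise ValueError(f"unknown mode: {mode}")


# Sample state (made up): R2 = 100, R4 = 8.
regs = {2: 100, 4: 8}
mem = {40: 7, 100: 140, 108: 13, 140: 11}

print(fetch_operand("register", regs, mem, reg=2))                # 100
print(fetch_operand("direct", regs, mem, const=40))               # 7
print(fetch_operand("displacement", regs, mem, reg=2, const=40))  # 11
print(fetch_operand("mem_indirect", regs, mem, reg=2))            # 11
```

Memory indirect needs two memory reads to find one value, which hints at why the richer modes are costlier to implement.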
35
Advanced Addressing Modes

Extra features to support features in high-level languages or reduce the number of instructions during common memory accesses

Mode name        Example               Meaning
Auto-increment   mov $1, 4($2)++       M[4 + R2] -> R1
Auto-decrement   mov $1, 4($2)--       M[R2 - 4] -> R1
Scaled           mov $1, 40($2)[s]     M[40 + R2 x s] -> R1

36
Choices in Addressing Modes

Anything goes: Any addressing mode may be used for any operand at any time
- Easier to map high-level statements directly to instructions
- Hard to design the processor, due to all the complexity
Limited addressing: Only allow a few modes, and/or restrict some operands to certain modes
- Harder for compiler/programmer to follow all the rules
- Code may be longer

37
Frequency of Addressing Modes

3 programs measured on the VAX, which supports all kinds of modes

Frequency of mode (%)
Mode name       Min.  Ave.  Max.
Displacement    32    42    55
Immediate       17    33    43
Reg. Indirect   3     13    24
Scaled          0     7     16
Mem. Indirect   1     3     6
Others          0     2     3

38
Empirical Data on Addressing Modes

How big do the displacements need to be?
- In a study of SPECint92 and SPECfp92, 99% of displacements fell within 2^15
How big do the immediates (constants) need to be?
- Studies show 50-60% fit within 8 bits
- 75-80% fit within 16 bits

Exercise: search for current results (e.g., for SPEC2005?)
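
A measurement like "what fraction of constants fit within n bits" can be sketched in a few lines. This assumes constants are encoded as signed two's-complement fields, which the slides do not state; the helper names and the sample list of constants are made up and are not measured data.

```python
def fits_signed(value, bits):
    """True if `value` is representable as a `bits`-wide two's-complement field."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def fraction_fitting(values, bits):
    """Fraction of `values` that fit in a signed field of the given width."""
    return sum(fits_signed(v, bits) for v in values) / len(values)


# Made-up constants standing in for a program's immediates (not real measurements).
sample = [0, 1, -1, 4, 100, 255, 1024, -4096, 40000, 1_000_000]
print(fraction_fitting(sample, 8), fraction_fitting(sample, 16))  # 0.5 0.8
```

Running the same count over the constants of a real benchmark is one way to approach the exercise above.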
39
How Do We Represent Instructions?

We need some bits to tell what operation is performed (e.g., add, sub, mul, etc.); this is called the opcode.
We need some bits for each operand and result (3 total, in our case)
- What type of addressing mode
- Number of the register, memory address and/or immediate constant

40
Variable-Length Instructions

Since the VAX allows any mode for any operand, there could be an instruction with three 32-bit addresses (direct addressing), giving more than 12 bytes in this instruction.
But registers need only a few bits to specify, so 12 bytes would be wasteful for an instruction using 3 registers only!
Must use variable-length instructions. On the VAX, instructions can vary from 1 to 17 bytes!
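
The size gap between the two cases is easy to quantify. The sketch below counts operand bits only; it ignores the opcode and any addressing-mode bits, and it assumes a 32-register machine for the register case, so the numbers illustrate the argument rather than any real VAX encoding. The helper names are hypothetical.

```python
def bits_to_name(count):
    """Bits needed to distinguish `count` locations (count >= 1)."""
    return max(1, (count - 1).bit_length())


def operand_bytes(num_operands, bits_per_operand):
    """Whole bytes needed to hold the operand fields alone."""
    total_bits = num_operands * bits_per_operand
    return (total_bits + 7) // 8        # round up to a whole byte


register_bits = bits_to_name(32)        # 5 bits name one of 32 registers
print(operand_bytes(3, 32))             # 12 bytes for three 32-bit addresses
print(operand_bytes(3, register_bits))  # 2 bytes for three register numbers
```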

41
Fixed-Length Instructions

If every instruction has the same number of bits (preferably a nice even number like 16 or 32), many components of the processor will be simpler.
But we either waste some amount of space or can't support all the addressing modes!

42
Loading Small Integers