Today: a look to the future - PowerPoint PPT Presentation

About This Presentation

Title:

Today: a look to the future

Description:

So far, we've assumed that ALU operations can have only register and constant operands. ... are needed to move data between memory and the register file. ... – PowerPoint PPT presentation

Slides: 21

Provided by: howard2

Learn more at: http://charm.cs.uiuc.edu

Transcript and Presenter's Notes

1
Today: a look to the future

Future classes
where you will learn ways in which real computers
are more sophisticated than the one we looked at
Future computer architectures in the coming
decades
RISC
Fractional representations
Fixed-point
Floating-point
Multi-cycle datapaths
Pipelining
Memory hierarchies (Caches)
Speculation (Optimistically proactive)
Next decade: Moore's law continues
And Beyond

2
Data movement instructions

Finally, the types of operands allowed in data
manipulation instructions are another way of
characterizing instruction sets.
So far, we've assumed that ALU operations can
have only register and constant operands.
Many real instruction sets allow memory-based
operands as well.
We'll use the book's example and illustrate how
the following operation can be translated into
some different assembly languages.
X = (A + B) * (C + D)
Assume that A, B, C, D and X are really memory
addresses.

3
Register-to-register RISC architectures

Our programs so far assume a register-to-register,
or load/store, architecture, which matches our
datapath from last week nicely.
Operands in data manipulation instructions must
be registers.
Other instructions are needed to move data
between memory and the register file.
With a register-to-register, three-address
instruction set, we might translate
X = (A + B) * (C + D) into

LD  R1, A         R1 ← M[A]        // Use direct addressing
LD  R2, B         R2 ← M[B]
ADD R3, R1, R2    R3 ← R1 + R2     // R3 = M[A] + M[B]
LD  R1, C         R1 ← M[C]
LD  R2, D         R2 ← M[D]
ADD R1, R1, R2    R1 ← R1 + R2     // R1 = M[C] + M[D]
MUL R1, R1, R3    R1 ← R1 * R3     // R1 has the result
ST  X, R1         M[X] ← R1        // Store that into M[X]
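The load/store sequence above can be traced step by step. The following is a minimal Python sketch, not part of the original slides: the memory contents (2, 3, 4, 5) are hypothetical values chosen for illustration, and memory addresses are modeled simply as dictionary keys.

```python
# Hypothetical values stored at addresses A, B, C, D; X receives the result.
memory = {"A": 2, "B": 3, "C": 4, "D": 5, "X": 0}
regs = {"R1": 0, "R2": 0, "R3": 0}

# Only LD and ST touch memory; ADD and MUL use registers exclusively.
regs["R1"] = memory["A"]                # LD  R1, A
regs["R2"] = memory["B"]                # LD  R2, B
regs["R3"] = regs["R1"] + regs["R2"]    # ADD R3, R1, R2
regs["R1"] = memory["C"]                # LD  R1, C
regs["R2"] = memory["D"]                # LD  R2, D
regs["R1"] = regs["R1"] + regs["R2"]    # ADD R1, R1, R2
regs["R1"] = regs["R1"] * regs["R3"]    # MUL R1, R1, R3
memory["X"] = regs["R1"]                # ST  X, R1

print(memory["X"])  # (2 + 3) * (4 + 5) = 45
```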
4
Memory-to-memory architectures

In memory-to-memory architectures, all data
manipulation instructions use memory addresses as
operands.
With a memory-to-memory, three-address
instruction set, we might translate
X = (A + B) * (C + D) into simply

ADD X, A, B    M[X] ← M[A] + M[B]
ADD T, C, D    M[T] ← M[C] + M[D]    // T is temporary storage
MUL X, X, T    M[X] ← M[X] * M[T]

How about with a two-address instruction set?

MOVE X, A      M[X] ← M[A]           // Copy M[A] to M[X] first
ADD  X, B      M[X] ← M[X] + M[B]    // Add M[B]
MOVE T, C      M[T] ← M[C]           // Copy M[C] to M[T]
ADD  T, D      M[T] ← M[T] + M[D]    // Add M[D]
MUL  X, T      M[X] ← M[X] * M[T]    // Multiply
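To check that the three-address and two-address memory-to-memory translations compute the same result, here is a small Python sketch of an interpreter. The `run` function, the tuple encoding of instructions, and the initial memory values are all assumptions made for this illustration; they are not from the slides.

```python
import operator

OPS = {"ADD": operator.add, "MUL": operator.mul}

def run(program, memory):
    """Execute memory-to-memory instructions; every operand is a memory address."""
    mem = dict(memory)  # work on a copy so both programs start from the same state
    for op, dest, *srcs in program:
        if op == "MOVE":
            mem[dest] = mem[srcs[0]]
        elif len(srcs) == 2:
            # Three-address form: dest <- src1 op src2
            mem[dest] = OPS[op](mem[srcs[0]], mem[srcs[1]])
        else:
            # Two-address form: the destination doubles as the first source
            mem[dest] = OPS[op](mem[dest], mem[srcs[0]])
    return mem

three_address = [("ADD", "X", "A", "B"), ("ADD", "T", "C", "D"), ("MUL", "X", "X", "T")]
two_address = [("MOVE", "X", "A"), ("ADD", "X", "B"), ("MOVE", "T", "C"),
               ("ADD", "T", "D"), ("MUL", "X", "T")]

initial = {"A": 2, "B": 3, "C": 4, "D": 5, "X": 0, "T": 0}  # hypothetical contents
result3 = run(three_address, initial)
result2 = run(two_address, initial)
print(result3["X"], result2["X"])  # both 45
```

The two-address version needs the extra MOVE instructions because each ADD overwrites one of its own operands.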
5
Register-to-memory architectures

Finally, register-to-memory architectures let the
data manipulation instructions access both
registers and memory.
With two-address instructions, we might do the
following

LD  R1, A     R1 ← M[A]         // Load M[A] into R1 first
ADD R1, B     R1 ← R1 + M[B]    // Add M[B]
LD  R2, C     R2 ← M[C]         // Load M[C] into R2
ADD R2, D     R2 ← R2 + M[D]    // Add M[D]
MUL R1, R2    R1 ← R1 * R2      // Multiply
ST  X, R1     M[X] ← R1         // Store
6
Size and speed

There are lots of tradeoffs in deciding how many
and what kind of operands and addressing modes to
support in a processor.
These decisions can affect the size of machine
language programs.
Memory addresses are long compared to register
file addresses, so instructions with memory-based
operands are typically longer than those with
register operands.
Permitting more operands also leads to longer
instructions.
There is also an impact on the speed of the
program.
Memory accesses are much slower than register
accesses.
Longer programs require more memory accesses,
just for loading the instructions!
Most newer processors use register-to-register
designs.
Reading from registers is faster than reading
from RAM.
Using register operands also leads to shorter
instructions.
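One way to make the tradeoff concrete is to count instructions and data memory accesses for the four translations of X = (A + B) * (C + D) from the earlier slides. The Python sketch below is an illustration under a stated assumption: each memory operand that is read or written counts as one data access, and instruction fetches are not counted. The per-instruction counts were tallied by hand.

```python
# Each entry: (instruction, number of data memory operands read or written)
programs = {
    "register-to-register": [
        ("LD R1, A", 1), ("LD R2, B", 1), ("ADD R3, R1, R2", 0),
        ("LD R1, C", 1), ("LD R2, D", 1), ("ADD R1, R1, R2", 0),
        ("MUL R1, R1, R3", 0), ("ST X, R1", 1),
    ],
    "memory-to-memory, three-address": [
        ("ADD X, A, B", 3), ("ADD T, C, D", 3), ("MUL X, X, T", 3),
    ],
    "memory-to-memory, two-address": [
        ("MOVE X, A", 2), ("ADD X, B", 3), ("MOVE T, C", 2),
        ("ADD T, D", 3), ("MUL X, T", 3),
    ],
    "register-to-memory, two-address": [
        ("LD R1, A", 1), ("ADD R1, B", 1), ("LD R2, C", 1),
        ("ADD R2, D", 1), ("MUL R1, R2", 0), ("ST X, R1", 1),
    ],
}

summary = {
    name: (len(instrs), sum(accesses for _, accesses in instrs))
    for name, instrs in programs.items()
}

for name, (count, accesses) in summary.items():
    print(f"{name}: {count} instructions, {accesses} data memory accesses")
```

Under this counting, the memory-to-memory programs have the fewest instructions but the most data accesses, while the register-based programs touch data memory only five times.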

7
Representing fractional numbers

Fixed-point numbers
Represent numbers using a fixed number of bits
dedicated to the integer and fractional portion.
10001001.0010110 - 8 bits each
.0111010010010101 all 16 to the fractional
portion
Frequently used in business applications
Evenly spaced gaps between representable numbers
across the full range.
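As an illustration, a fixed-point bit string can be converted to its value by reading all the bits as one integer and dividing by 2 raised to the number of fractional bits. The Python sketch below assumes unsigned values (the slide does not say how the sign is handled), and the function name `fixed_point_value` is made up for this example.

```python
def fixed_point_value(bits: str) -> float:
    """Interpret an unsigned binary string with a binary point, e.g. '101.01'."""
    int_part, frac_part = bits.split(".")
    # Read all bits as one integer, then scale by the number of fractional bits.
    return int(int_part + frac_part, 2) / 2 ** len(frac_part)

def spacing(frac_bits: int) -> float:
    """Gap between adjacent representable values; the same everywhere in the range."""
    return 2.0 ** -frac_bits

print(fixed_point_value("10001001.0010110"))   # 137.171875
print(fixed_point_value(".0111010010010101"))  # 29845 / 65536
print(spacing(16))                              # 2**-16
```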

8
Representing fractional numbers

Floating-point numbers
Somewhat similar to scientific notation
E.g. 1.001 × 2^13, 1.1 × 2^-4
Several standards, most popular is the IEEE 754
32-bit float consists of
1 sign bit
8 exponent bits
excess-127 format. So treat bits as unsigned
and subtract 127 from them to find actual
exponents
11111111 reserved to signify infinities and NaNs
00000000 reserved to represent denormalized
numbers (no leading 1 assumed and exponent is -126)
23 mantissa bits
Represent the fractional component, i.e. 1.01101
Assume a leading 1 unless exponent is 0.
Value = (-1)^s × 2^(e-127) × 1.m
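The field layout can be checked in Python with the standard `struct` module, which packs a number as a 32-bit IEEE 754 float. This is a sketch for illustration; the helper names `decode_float32` and `value_from_fields` are made up here.

```python
import struct

def decode_float32(x: float):
    """Return the (sign, exponent, mantissa) fields of x stored as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 sign bit
    exponent = (bits >> 23) & 0xFF   # 8 exponent bits, excess-127
    mantissa = bits & 0x7FFFFF       # 23 mantissa bits
    return sign, exponent, mantissa

def value_from_fields(sign: int, exponent: int, mantissa: int) -> float:
    """Rebuild the value from the fields using the formula on the slide."""
    if exponent == 0xFF:
        raise ValueError("exponent 11111111 encodes an infinity or NaN")
    if exponent == 0:
        # Denormalized: no leading 1, actual exponent fixed at -126
        return (-1) ** sign * 2.0 ** -126 * (mantissa / 2 ** 23)
    return (-1) ** sign * 2.0 ** (exponent - 127) * (1 + mantissa / 2 ** 23)

# -6.5 = -1.101 (binary) x 2^2, so the stored exponent is 2 + 127 = 129
fields = decode_float32(-6.5)
print(fields)                      # (1, 129, 5242880)
print(value_from_fields(*fields))  # -6.5
```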

9
Representing fractional numbers

Floating-point numbers
IEEE 754
64-bit double-precision float consists of
1 sign bit
11 exponent bits
excess-1023 format.
11111111111 (all ones) reserved to signify infinities and NaNs
00000000000 (all zeros) reserved to represent denormalized
numbers (no leading 1 assumed and exponent is -1022)
52 mantissa bits
Represent the fractional component, i.e. 1.01101
Assume a leading 1 unless exponent is 0.
Value = (-1)^s × 2^(e-1023) × 1.m
Uneven gaps between representable numbers (closer
together when very small, larger gaps with larger
numbers)
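The uneven spacing can be observed directly. The Python sketch below finds the next representable double above a given positive value by adding 1 to its 64-bit pattern; the helper name `gap_above` is made up for this example, and it is only meant for positive finite inputs.

```python
import struct

def gap_above(x: float) -> float:
    """Distance from a positive finite double x to the next larger double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    # Incrementing the bit pattern steps to the adjacent representable value.
    (next_x,) = struct.unpack(">d", struct.pack(">Q", bits + 1))
    return next_x - x

print(gap_above(1.0))        # 2**-52
print(gap_above(1024.0))     # 2**-42
print(gap_above(2.0 ** 52))  # 1.0
```

With 52 mantissa bits, the gap is 2^-52 times the power of two at or below the value, so it grows as the numbers grow.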

10
Problems with our datapath

Other than the obvious: need more registers, more
bits in each register (and therefore in the datapath)
The clock cycle time is constrained by the longest
possible instruction execution time.
Solution:
break an instruction execution into multiple
cycles
Bucket brigade: pipelined datapath
Microprogrammed datapath

Access PC 1 ns
Instruction Memory 4 ns
Register read 3 ns
MUX B 1 ns
ALU or Memory 4 ns
MUX D 1 ns
Register write 3 ns
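A quick calculation with the delays listed above, as a Python sketch. It assumes that the longest instruction passes through every listed stage in series, so the single-cycle clock period is the sum of the delays; that reading is an assumption, not something the slide states.

```python
# Stage delays in nanoseconds, copied from the slide
stage_delays_ns = {
    "Access PC": 1,
    "Instruction Memory": 4,
    "Register read": 3,
    "MUX B": 1,
    "ALU or Memory": 4,
    "MUX D": 1,
    "Register write": 3,
}

# Single-cycle design: one clock period must cover the whole path.
single_cycle_clock_ns = sum(stage_delays_ns.values())

# The slowest individual stage bounds how short a cycle could get if work is split up.
slowest_stage_ns = max(stage_delays_ns.values())

print(single_cycle_clock_ns)  # 17
print(slowest_stage_ns)       # 4
```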
11
A Microprogrammed Datapath

The datapath we worked with for the past few
weeks was just an example
We will look at another datapath today
To emphasize that alternate designs are possible
To show an example where each instruction takes
multiple cycles to finish
To show a different way of generating control
signals
Material is not based on the book
It used to be in an older edition.
For the exam
Basic understanding of the slides, and
section 8-7 (of the 3rd edition)
Follow the web link there if you are interested

12
Why multiple cycles?

Wouldn't it be slower?
Not necessarily, if each clock cycle can be made
shorter
Variable number of cycles for instructions (some
2, some 5)
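A rough comparison, as a Python sketch with assumed numbers: the 17 ns single-cycle period is the sum of the stage delays listed on the "Problems with our datapath" slide, while the 4 ns multi-cycle period and the 50/50 instruction mix are hypothetical values chosen only for illustration.

```python
SINGLE_CYCLE_NS = 17  # sum of the stage delays on the earlier slide
MULTI_CYCLE_NS = 4    # assumed: each cycle covers about one slow stage

def multi_cycle_time_ns(cycles: int) -> int:
    """Time for one instruction that needs the given number of short cycles."""
    return cycles * MULTI_CYCLE_NS

# Hypothetical mix: half the instructions take 2 cycles, half take 5.
mix = {2: 0.5, 5: 0.5}
average_ns = sum(share * multi_cycle_time_ns(cycles) for cycles, share in mix.items())

print(multi_cycle_time_ns(2))  # 8: faster than the single-cycle design
print(multi_cycle_time_ns(5))  # 20: slower than the single-cycle design
print(average_ns)              # 14.0: faster on average for this mix
```

Whether the multi-cycle design wins depends on the mix: a program dominated by 5-cycle instructions would be slower under these numbers.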

13
New Datapath

Let us use one memory module
for both data and instructions
Allow for multiple cycles for each instruction

14
[Figure: datapath block diagram. Labels: Register File, MUX B, ALU, Memory (Data In, Address, Data Out), MUX D]
15
How to generate control signals

Consequence of this datapath
Needs a cycle to fetch instruction from memory
Control word
the set of control signals
In our older datapath
Control word was determined fully by the
instruction
Here
It depends on instruction and on which cycle
within the instruction we are in
Example
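The slide's own example is not in the transcript. As a hypothetical illustration of the idea, the Python sketch below looks up a control word from both the opcode and the cycle number. The signal names (`mem_read`, `load_ir`, `alu_op`, `reg_write`), the opcodes, and the cycle assignments are all invented for this sketch and do not describe the lecture's actual datapath.

```python
# Cycle 0 is the instruction fetch, identical for every instruction.
FETCH = {"mem_read": True, "load_ir": True, "alu_op": None, "reg_write": False}

# Later cycles depend on which instruction is in the instruction register.
CONTROL_TABLE = {
    ("ADD", 1): {"mem_read": False, "load_ir": False, "alu_op": "add", "reg_write": True},
    ("LD", 1): {"mem_read": True, "load_ir": False, "alu_op": None, "reg_write": True},
}

def control_word(opcode: str, cycle: int) -> dict:
    """Return the control signals for this instruction at this cycle."""
    if cycle == 0:
        return FETCH
    return CONTROL_TABLE[(opcode, cycle)]

print(control_word("ADD", 0) == control_word("LD", 0))  # True: same fetch cycle
print(control_word("ADD", 1) == control_word("LD", 1))  # False: differs by instruction
```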

16
Generating control: sequential circuit
[Figure labels: IR (Instruction Register), Control Unit, Control word, Cycle Counter]
17
Generating control: microprogram memory
[Figure labels: IR (Instruction Register), Microprogram Memory, Control word, Next MicroInstruction Address, MicroProgram Counter]
19
Pipelined datapath

Simplified scenario
4-step assembly line
Instruction Fetch
Operand Fetch
Execution of operation
Writeback
Although total time for each instruction to
finish is the same (or slightly larger)
The unit as a whole processes more instructions
per unit time
Just as in assembly of a car
More on this in CS 232 and beyond
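The assembly-line argument can be put in numbers. The Python sketch below assumes an idealized pipeline: every stage takes the same time and there are no stalls, so once the pipeline is full one instruction finishes per step. The function names and the example sizes are made up for illustration.

```python
def unpipelined_time(n_instructions: int, n_stages: int, stage_time: int) -> int:
    """Each instruction runs through all stages before the next one starts."""
    return n_instructions * n_stages * stage_time

def pipelined_time(n_instructions: int, n_stages: int, stage_time: int) -> int:
    """The first instruction needs n_stages steps; each later one finishes one step after."""
    return (n_stages + n_instructions - 1) * stage_time

# 100 instructions through the 4-step line, one time unit per step
print(unpipelined_time(100, 4, 1))  # 400
print(pipelined_time(100, 4, 1))    # 103
```

Each instruction still takes 4 steps from start to finish, but the throughput approaches one instruction per step.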

20
Other ideas