Title: CMPT 250 Computer Architecture
- Instructor: Yuzhuang Hu
- yhu1_at_cs.sfu.ca
2. Another Design Example: PIG (Chapter 7-10)
- PIG is a single-die game. Two players take turns
rolling the die. When a 1 is rolled, the current
turn total becomes 0. The first player to reach or
exceed 100 wins.
[Figure: PIG game panel with the Turn Total, Player 1, and Player 2 displays and the HOLD, ROLL, NEW GAME, and RESET buttons.]
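The rules above can be sketched behaviourally before looking at the hardware. This is a minimal sketch, not the design from the slides: `play_turn`, `has_won`, and the `hold_after` policy are hypothetical names, and it assumes that holding adds the turn total to the player's total.

```python
WIN_SCORE = 100  # first total to reach or exceed this wins


def play_turn(total, rolls, hold_after):
    """Play one turn and return the player's new total.

    rolls      -- die values seen during the turn, in order
    hold_after -- number of rolls after which the player holds (assumed policy)
    """
    turn_total = 0
    for count, value in enumerate(rolls, start=1):
        if value == 1:
            return total          # a 1 zeroes the turn total; the turn ends
        turn_total += value
        if count == hold_after:
            break                 # player presses HOLD
    return total + turn_total


def has_won(total):
    """A player wins on reaching or exceeding 100."""
    return total >= WIN_SCORE


print(play_turn(10, [3, 4, 5], hold_after=2))  # 17
print(play_turn(10, [3, 1, 5], hold_after=3))  # 10
```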
3. Inputs, Outputs, and Registers of PIG
Symbol  | Function                                              | Type
ROLL    | 1: starts die rolling; 0: stops die rolling           | Control input
HOLD    | 1: ends player turn; 0: continues player turn         | Control input
NEWGAME | 1: starts new game; 0: continues current game         | Control input
RESET   | 1: resets game to INIT state; 0: no action            | Control input
DIE     | Die value; specialized counter that counts 1..6       | 3-bit data register
SUR     | Subtotal for active player; parallel-load register    | 7-bit data register
TR1     | Total for player 1; parallel-load register            | 7-bit data register
TR2     | Total for player 2; parallel-load register            | 7-bit data register
FP      | First player; flip-flop, 0: Player 1, 1: Player 2     | 1-bit control register
CP      | Current player                                        | 1-bit control register
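4. State-Machine Diagram for PIG
[Figure: state-machine diagram with states INIT, BEGIN, ROL, ONE, ROH, TEST, and WIN. The labels recovered from the diagram are listed below; X' denotes the complement of X.]
- Default outputs: P1 = CP', P2 = CP.
- RESET: DIE <- 000, FP <- 0; go to INIT.
- INIT: TR1 <- 0, TR2 <- 0, CP <- FP.
- BEGIN: SUR <- 0; on ROLL go to ROL.
- ROL: while ROLL = 1, if (DIE = 110) DIE <- 001 else DIE <- DIE + 1.
- Leaving ROL (ROLL = 0): if DIE = 1 go to ONE; if DIE <> 1, SUR <- SUR + DIE and go to ROH.
- ONE: CP <- CP'.
- ROH: stay while ROLL'·HOLD'; ROLL returns to ROL; HOLD performs CP'/(TR1 <- TR1 + SUR), CP/(TR2 <- TR2 + SUR) and goes to TEST.
- TEST: CP'(TR1 < 1100100) + CP(TR2 < 1100100) continues the game; CP'(TR1 >= 1100100) + CP(TR2 >= 1100100) goes to WIN.
- WIN: CP'/P1 BLINK, CP/P2 BLINK; stay while NEWGAME = 0.
- NEWGAME: FP <- FP'.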
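The DIE register is described as a specialized counter that counts 1..6, and the diagram's rule for it reads as: if DIE is 110, load 001, otherwise add 1. A minimal sketch of that update, with the hypothetical helper names `next_die` and `spin`:

```python
def next_die(die):
    """One clock of the DIE counter: 6 (binary 110) wraps to 1 (binary 001)."""
    return 0b001 if die == 0b110 else die + 1


def spin(die, cycles):
    """Hold ROLL at 1 for `cycles` clocks and return the final DIE value."""
    for _ in range(cycles):
        die = next_die(die)
    return die


# After reset DIE is 000; the first clock of rolling moves it into 1..6,
# and it never leaves that range afterwards.
print([spin(0, k) for k in range(1, 9)])  # [1, 2, 3, 4, 5, 6, 1, 2]
```

Because the number of clocks for which a player holds ROLL is effectively unpredictable, the value left in DIE when ROLL is released serves as the die roll.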
5. Algorithmic State Machine (ASM)
- An ASM chart is like a state diagram but less
formal, and thus easier to understand. An ASM chart
consists of a set of blocks. Each block can be
viewed as a directed graph with three types of
nodes:
- State box (node).
- Binary decision box (node).
- Conditional action box (node).
6. ASM contd.
- State box: represented by a labeled rectangle. It
may contain several register transfer statements
or variables.
- Binary decision box: represented by a hexagon. It
indicates that a condition needs to be tested. It
is similar to the input condition defined for
State-Machine Diagrams.
- Conditional output action box: represented by an
oval box. It contains several register transfer
statements or variables. It is similar to the
output condition defined for State-Machine
Diagrams.
7. Boxes in ASM Charts
[Figure: the three ASM chart boxes. (a) State box: state name and register transfer statements (Moore type). (b) Binary decision box: condition expression with exits 0 (False) and 1 (True). (c) Conditional output action box: conditional outputs or actions (Mealy type).]
8. A Design Example Using ASM
- Problem: find the sum of N numbers.
- algorithm sum_n(S)
- input: a list S consisting of N numbers.
- output: the sum of the N numbers in S.
- sum = 0
- N = get_input()
- while ( N > 0 )
-     sum = sum + get_input()
-     N = N - 1
- endwhile
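The algorithm can be run directly as a small Python function. This is a sketch under one assumption: the input stream delivers N first, followed by the N numbers, which stands in for the slide's get_input().

```python
def sum_n(inputs):
    """Sum N numbers read from a stream whose first item is N itself."""
    stream = iter(inputs)
    total = 0                   # sum = 0
    n = next(stream)            # N = get_input()
    while n > 0:
        total += next(stream)   # sum = sum + get_input()
        n -= 1                  # N = N - 1
    return total


print(sum_n([3, 10, 20, 30]))  # 60
```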
9. Interface of the Sum Machine
Symbol  | Function                                       | Type
In_bus  | An input bus                                   | Bus
Out_bus | An output bus                                  | Bus
Data    | 1: in_bus is ready to send data                | Control input
Rdy     | 1: the machine is ready to receive data        | Control output
Ack     | 1: the machine has processed the input data    | Control output
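10. ASM Diagram for the Sum Machine
[Figure: ASM chart with states S0, S1, and S2. Labels recovered from the chart:]
- S0: asserts rdy and tests data. When data = 1: Sum <- 0, N <- in_bus, ack; go to S1.
- S1: tests N = 0. When N = 0 is false, go to S2.
- S2: asserts rdy and tests data. When data = 1: N <- N - 1, Sum <- Sum + in_bus, ack; go to S1.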
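The chart can be exercised with a cycle-by-cycle simulation. This is only a sketch built on one reading of the chart: S0 and S2 wait for data, S1 tests N = 0. Two further assumptions are mine, not the slide's: the sender always has the next word ready (data = 1) while words remain, and the simulation simply stops when S1 finds N = 0. The name `run_sum_machine` is hypothetical.

```python
def run_sum_machine(words):
    """Feed `words` (N followed by N values) through the three-state chart.

    Returns (Sum, trace), where trace records each cycle that raised ack.
    """
    pending = list(words)
    state, total, n = "S0", 0, 0
    trace = []
    while True:
        data = 1 if pending else 0        # assumed: sender is always ready
        if state == "S0":                 # rdy asserted, waiting for N
            if not data:
                break
            total, n = 0, pending.pop(0)  # Sum <- 0, N <- in_bus, ack
            trace.append("S0:ack")
            state = "S1"
        elif state == "S1":               # decision box: N = 0 ?
            if n == 0:
                break                     # assumed: finished, stop simulating
            state = "S2"
        else:                             # S2: rdy asserted, waiting for a value
            if not data:
                break
            total += pending.pop(0)       # Sum <- Sum + in_bus
            n -= 1                        # N <- N - 1, ack
            trace.append("S2:ack")
            state = "S1"
    return total, trace


print(run_sum_machine([3, 10, 20, 30]))
# (60, ['S0:ack', 'S2:ack', 'S2:ack', 'S2:ack'])
```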
11. Digital System Design
- In most digital system designs, we partition the
system into two types of modules: a datapath and
a control unit.
[Figure: the control unit takes control inputs and status signals and produces control outputs and control signals; the datapath takes control signals and data inputs and produces status signals and data outputs.]
12. Design from ASM
[Figure: block diagram with a processor (data in, data out, status, control pts), a CTRL PTS SELECTOR, and a SEQ block, driven by the clock and external control inputs.]
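13. Datapath of the Sum Machine
[Figure: datapath with registers SUM and N and a full adder (FA) between in_bus and out_bus. Control signals: ls, ln, cs, dn. Status signals: eq0 (N = 0) and overflow.]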
14. ASM Design Guidelines
- Write an algorithm for the problem.
- Translate the algorithm to a sequence of register
transfer statements.
- Group adjacent independent register transfer
statements.
- Draw the ASM diagram, and introduce control
signals.
15. Datapath Definition (Chapter 9)
- The datapath is defined by three basic
components:
- A set of registers.
- The micro-operations performed on data stored in
the registers.
- The control interface.
16. A Generic Datapath
- Four parallel-load registers
- Two mux-based register selectors
- Register destination decoder
- Mux B for external constant input
- Buses A and B with external address and data
outputs
- ALU and Shifter with Mux F for output select
- Mux D for external data input
- Logic for generating status bits V, C, N, Z
17. Datapath Examples
- What to do for R1 <- R2 + R3?
- A select: choose R2.
- B select: choose R3.
- G select: choose A + B.
- MF select: choose the ALU output.
- MD select: choose the Mux F output.
- Destination select: choose R1.
- Load enable: enable R1.
18. Other Micro-operation Alternatives
- MF = 1: shift operation.
- MB = 1: use a constant.
- Load enable = 0: no register loading, e.g. when
providing an address out or data out.
- MD = 1: read from memory.
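The control selections for R1 <- R2 + R3 and the alternatives above can be mimicked in a few lines. This is a simplified sketch, not the full generic datapath: the function unit is fixed to G = A + B (no shifter, so MF is omitted), and `datapath_step` and its parameter names are hypothetical.

```python
def datapath_step(regs, a_sel, b_sel, dest, load_enable,
                  mb=0, constant=0, md=0, data_in=0, width=8):
    """Apply one control word to the register file and return the Bus D value."""
    mask = (1 << width) - 1
    bus_a = regs[a_sel]                          # A select
    bus_b = constant if mb else regs[b_sel]      # Mux B: register or constant
    f = (bus_a + bus_b) & mask                   # function unit, fixed to A + B
    bus_d = data_in if md else f                 # Mux D: function unit or data in
    if load_enable:
        regs[dest] = bus_d                       # destination select + load enable
    return bus_d


regs = {"R0": 0, "R1": 0, "R2": 5, "R3": 7}
datapath_step(regs, "R2", "R3", "R1", load_enable=1)   # R1 <- R2 + R3
print(regs["R1"])  # 12
```

With load_enable = 0 the same call leaves the registers untouched and only drives the buses, which matches the address-out or data-out case.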
19. The Arithmetic/Logic Unit
- The ALU performs arithmetic and logic micro-operations.
20. The Arithmetic Circuit
- The arithmetic circuit consists of a parallel
n-bit adder and selection logic for its B input.
21. Function Table for Arithmetic Circuit
- It is easy to see that Yi = Bi·S0 + Bi'·S1, where Bi' is the complement of Bi.
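The selection logic can be tabulated to see which value reaches the adder's Y input for each setting of S1 and S0. The sketch reads the expression on the slide as Yi = Bi AND S0, OR, NOT Bi AND S1; the function name `y_input` and the 4-bit width are assumptions for illustration.

```python
def y_input(b, s1, s0, width=4):
    """Y = B·S0 + B'·S1, applied to every bit of a width-bit word."""
    mask = (1 << width) - 1
    s0_bits = mask if s0 else 0      # S0 replicated across the word
    s1_bits = mask if s1 else 0      # S1 replicated across the word
    return (b & s0_bits) | (~b & mask & s1_bits)


b = 0b0101
for s1 in (0, 1):
    for s0 in (0, 1):
        print(s1, s0, format(y_input(b, s1, s0), "04b"))
# 0 0 0000   (Y = 0)
# 0 1 0101   (Y = B)
# 1 0 1010   (Y = B')
# 1 1 1111   (Y = all ones)
```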
22. Function Table for ALU
Operation Select | Operation       | Function
S2 S1 S0 Cin     |                 |
0  0  0  0       | G = A           | Transfer A
0  0  0  1       | G = A + 1       | Increment A
0  0  1  0       | G = A + B       | Addition
0  0  1  1       | G = A + B + 1   | Add with carry input of 1
0  1  0  0       | G = A + B'      | A plus 1s complement of B
0  1  0  1       | G = A + B' + 1  | Subtraction
0  1  1  0       | G = A - 1       | Decrement A
0  1  1  1       | G = A           | Transfer A
1  X  0  0       | G = A ∧ B       | AND
1  X  0  1       | G = A ∨ B       | OR
1  X  1  0       | G = A ⊕ B       | XOR
1  X  1  1       | G = A'          | NOT (1s complement)
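The function table can be checked with a small behavioural model. This is a sketch, not a gate-level design: it assumes the arithmetic rows compute G = A + Y + Cin with Y chosen by S1 and S0, and that the logic rows are selected by S0 and Cin with S1 as a don't-care, as the table lists them. The name `alu` and the 4-bit width are hypothetical.

```python
def alu(a, b, s2, s1, s0, cin, width=4):
    """Return G for one row of the ALU function table (result truncated to width bits)."""
    mask = (1 << width) - 1
    if s2 == 0:
        # Arithmetic: Y is 0, B, B', or all ones depending on S1 S0.
        y = (b if s0 else 0) | ((~b & mask) if s1 else 0)
        return (a + y + cin) & mask
    # Logic: S1 is a don't-care; S0 and Cin pick the operation.
    ops = {(0, 0): a & b, (0, 1): a | b, (1, 0): a ^ b, (1, 1): ~a}
    return ops[(s0, cin)] & mask


a, b = 6, 3
print(alu(a, b, 0, 0, 1, 0))  # 9  (A + B)
print(alu(a, b, 0, 1, 0, 1))  # 3  (A - B)
print(alu(a, b, 0, 1, 1, 0))  # 5  (A - 1)
print(alu(a, b, 1, 0, 1, 0))  # 5  (A XOR B)
```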
23. More on ALU
- The ALU has a fairly high number of logic levels,
which contributes to propagation delay in the
circuit. In particular, a simple ripple-carry adder
can incur a large propagation delay, because a carry
may have to pass through every bit position.
24. Carry Look-Ahead
- Carry look-ahead is designed to reduce the carry
propagation delay in the ALU.
- For a single-bit full adder:
- Generate a carry out when x = y = 1: g = xy.
- Propagate the carry in through to the carry out when
exactly one of x and y is 1: p = x xor y.
- In terms of p and g, the carry out is co = g + p·ci.
25. Full Adder With Ports p and g
26. A Slight Optimization
- Redefine p to be x + y (OR).
- We can do this for the following reason: the two
definitions of p differ only when x = y = 1. In that
case g = 1, so whether p is 0 or 1, co is always
equal to 1.
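The argument can be confirmed exhaustively over the eight input combinations of a full adder. The sketch compares co = g + p·ci against the usual majority form of the carry out, once with p = x xor y and once with p = x or y; the function names are hypothetical.

```python
from itertools import product


def carry_out(x, y, ci):
    """Reference carry out of a full adder: 1 when at least two inputs are 1."""
    return (x & y) | (x & ci) | (y & ci)


def both_definitions_agree():
    """Check co = g + p·ci for p = x xor y and for p = x or y."""
    for x, y, ci in product((0, 1), repeat=3):
        g = x & y
        for p in (x ^ y, x | y):
            if (g | (p & ci)) != carry_out(x, y, ci):
                return False
    return True


print(both_definitions_agree())  # True
```

Note that the sum bit still needs x xor y, so the OR form of p only replaces the propagate signal used in the carry logic.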
27. Computing the Carry In for Each Bit
- ci(1) = co(0) = g(0) + p(0) ci(0).
- ci(2) = co(1) = g(1) + p(1) g(0) + p(1) p(0) ci(0).
- ci(3) = co(2) = g(2) + p(2) g(1) + p(2) p(1) g(0)
+ p(2) p(1) p(0) ci(0).
- ci(4) = co(3) = g(3) + p(3) g(2) + p(3) p(2) g(1)
+ p(3) p(2) p(1) g(0) + p(3) p(2) p(1) p(0) ci(0).
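The expanded carry equations can be typed in almost verbatim and checked against ordinary addition. This is a behavioural sketch with hypothetical names (`cla4`); it uses p = x or y for the carries and x xor y for the sum bits.

```python
def cla4(x, y, ci0=0):
    """4-bit carry look-ahead addition; returns (sum, carry out)."""
    g = [(x >> i) & (y >> i) & 1 for i in range(4)]        # generate bits
    p = [((x >> i) | (y >> i)) & 1 for i in range(4)]      # propagate bits
    c = [ci0, 0, 0, 0, 0]
    # Each carry depends only on g, p, and ci(0), never on another carry.
    c[1] = g[0] | p[0] & c[0]
    c[2] = g[1] | p[1] & g[0] | p[1] & p[0] & c[0]
    c[3] = g[2] | p[2] & g[1] | p[2] & p[1] & g[0] | p[2] & p[1] & p[0] & c[0]
    c[4] = (g[3] | p[3] & g[2] | p[3] & p[2] & g[1]
            | p[3] & p[2] & p[1] & g[0]
            | p[3] & p[2] & p[1] & p[0] & c[0])
    s = 0
    for i in range(4):
        s |= (((x >> i) ^ (y >> i) ^ c[i]) & 1) << i       # sum bit i
    return s, c[4]


print(cla4(0b1011, 0b0110))  # (1, 1): 11 + 6 = 17 = 16 + 1
```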
28. Faster Four-bit Addition
- p(3:0) and g(3:0) are available after 1 gate
delay.
- co(3:0) are available after 2 more gate delays.
- s(3:0) are available after 1 more gate delay.
- In total 1 + 2 + 1 = 4 gate delays.
29. A 4-Bit CLA Adder
30. Explanation of P and G
- Consider the msb position of a bit vector (3:0).
Under what condition will a carry be generated
out of that position? Under what condition will a
carry be propagated through that position?
- Define:
- G = g(3) + p(3) g(2) + p(3) p(2) g(1)
+ p(3) p(2) p(1) g(0)
- P = p(3) p(2) p(1) p(0)
31. A 16-bit CLA Adder
- Use the 4-bit CLA adder as a building block and
design a second-level CLA logic to build a 16-bit
CLA adder.
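The two-level idea can be sketched in software: each 4-bit slice reports a group generate G and group propagate P, and a second level turns those into the carries into bits 4, 8, and 12. This is a behavioural sketch with hypothetical names (`group_pg`, `cla16`); the second-level carries are computed in a loop here, whereas hardware would expand them into two-level logic, and each slice's sum uses ordinary addition as a stand-in for the 4-bit CLA.

```python
def group_pg(x, y):
    """Return (G, P) for one 4-bit slice, using p = x or y."""
    g = [(x >> i) & (y >> i) & 1 for i in range(4)]
    p = [((x >> i) | (y >> i)) & 1 for i in range(4)]
    G = g[3] | p[3] & g[2] | p[3] & p[2] & g[1] | p[3] & p[2] & p[1] & g[0]
    P = p[3] & p[2] & p[1] & p[0]
    return G, P


def cla16(x, y, ci0=0):
    """16-bit addition from four 4-bit slices; returns (sum, carry out)."""
    slices = [((x >> 4 * k) & 0xF, (y >> 4 * k) & 0xF) for k in range(4)]
    carries = [ci0]
    for xs, ys in slices:
        G, P = group_pg(xs, ys)
        carries.append(G | P & carries[-1])    # ci(4), ci(8), ci(12), co(15)
    total = 0
    for k, (xs, ys) in enumerate(slices):
        total |= ((xs + ys + carries[k]) & 0xF) << 4 * k   # slice sum
    return total, carries[4]


print(cla16(0xFFFF, 0x0001))  # (0, 1)
```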
32. Delay of the 16-bit CLA Adder
- p(15:0) and g(15:0) are available after one gate
delay.
- It takes 2 more gate delays to produce the P and G
signals of each 4-bit block.
- It takes 2 more gate delays for the second level
to produce ci(12), ci(8) and ci(4).
- It takes 2 more gate delays for the first level
to produce the remaining carry-in values.
- It takes one more gate delay for the sum.
- In total 1 + 2 + 2 + 2 + 1 = 8 gate delays.
- In general, for n a power of 4, the total delay is
1 + 2 + 4((log2 n)/2 - 1) + 1 gate delays, which
gives 4 for n = 4 and 8 for n = 16.
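The delay count can be tabulated for a few adder widths. The sketch assumes the general formula is 1 + 2 + 4((log2 n)/2 - 1) + 1 for n a power of 4, which is the form that reproduces the 4 gate delays of the 4-bit adder and the 8 gate delays of the 16-bit adder; `cla_delay` is a hypothetical name.

```python
def cla_delay(n):
    """Gate delays of an n-bit hierarchical CLA built from 4-bit blocks.

    Assumes n is a power of 4, so (log2 n)/2 is the number of CLA levels.
    """
    levels = (n.bit_length() - 1) // 2      # (log2 n) / 2
    # 1 for p and g, 2 for the first level, 4 per extra level
    # (2 up for P and G, 2 back down for carries), 1 for the sum.
    return 1 + 2 + 4 * (levels - 1) + 1


for n in (4, 16, 64):
    print(n, cla_delay(n))
# 4 4
# 16 8
# 64 12
```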