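Title: Computer Architecture: Processor Finite State Machine and Microprogramming (based on the text by David A. Patterson and John L. Hennessy)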
1 COMPUTER ARCHITECTURE
Processor: Finite State Machine and Microprogramming
- (Based on the text: David A. Patterson and John L. Hennessy,
  Computer Organization and Design: The Hardware/Software Interface,
  3rd Ed., Morgan Kaufmann, 2007)
2 COURSE CONTENTS
- Introduction
- Instructions
- Computer Arithmetic
- Performance
- Processor Datapath
- Processor Control
- Pipelining Techniques
- Memory
- Input/Output Devices
3 PROCESSOR DATAPATH CONTROL
- Multicycle Datapath Control
- Control Finite State Machine
- Control Microprogramming
4 Defining the Control for Multi-Cycle Datapath
- Multi-cycle vs. single-cycle datapath
  - for single-cycle, truth tables specify the setting of control signals based on the instruction
  - for multi-cycle, control is more complex because each instruction is executed in steps; control must specify both the control signals in any step and the next step in the sequence
- Value of control signals depends upon
  - what instruction is being executed
  - which step is being performed
- Two different control techniques
  - Finite state machine (FSM)
  - Microprogramming
- Implementation can be derived from the specification
5 Finite State Machine (FSM) Control
- Consists of a set of states and directions on how to change states
- Each state specifies a set of control signal outputs that are asserted when the machine is in that state
- Each state in the FSM takes 1 clock cycle
- The first two states (state 0 and state 1) are common to all instructions
- After state 1, the signals asserted depend on the instruction (this process is called instruction decoding)
- After the last step (state) of an instruction, the FSM returns to state 0 to begin fetching the next instruction
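The sequencing rules above can be sketched as a next-state function. This is a minimal sketch, not the slides' own specification: the state numbers follow the review of instruction execution steps later in this deck (0 fetch, 1 decode, 2 address computation, 3/4 lw, 5 sw, 6/7 R-type, 8 branch, 9 jump), and the names `DECODE`, `next_state`, and `states_visited` are hypothetical.

```python
# Opcodes of the MIPS subset handled by the multi-cycle control (6-bit values).
R_TYPE, J, BEQ, LW, SW = 0b000000, 0b000010, 0b000100, 0b100011, 0b101011

# After state 1 the next state depends on the opcode (instruction decoding).
DECODE = {LW: 2, SW: 2, R_TYPE: 6, BEQ: 8, J: 9}


def next_state(state, opcode):
    """Next-state function of the 10-state control FSM (illustrative helper)."""
    if state == 0:                       # instruction fetch
        return 1
    if state == 1:                       # decode / register fetch
        return DECODE[opcode]
    if state == 2:                       # memory address computation (lw, sw only)
        return 3 if opcode == LW else 5
    if state == 3:                       # memory read (lw)
        return 4
    if state == 6:                       # R-type execution
        return 7
    return 0                             # states 4, 5, 7, 8, 9 end the instruction


def states_visited(opcode):
    """States traversed by one instruction, one clock cycle per state."""
    path, state = [0], next_state(0, opcode)
    while state != 0:
        path.append(state)
        state = next_state(state, opcode)
    return path


for name, op in [("lw", LW), ("sw", SW), ("R-type", R_TYPE), ("beq", BEQ), ("j", J)]:
    print(name, states_visited(op))
```

The length of each returned list is the number of clock cycles that instruction class takes, since every state lasts one cycle.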
6 The Complete FSM Control
[Figure: graphical specification (state diagram) of the complete FSM control]
7 CPI in Multi-Cycle CPU
Example:
CPI = 0.22 x 5 + 0.11 x 4 + 0.49 x 4 + 0.16 x 3 + 0.02 x 3
    = 1.1 + 0.44 + 1.96 + 0.48 + 0.06
    = 4.04
Better than the worst-case CPI (if all instructions took the same number of cycles, CPI = 5)
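The CPI example is a weighted average, which can be checked with a few lines of code. The frequencies and cycle counts are the ones in the example; the instruction-class names attached to them are an assumption inferred from the cycle counts (5 for loads, 4 for stores and R-type, 3 for branches and jumps).

```python
# (fraction of instructions, cycles per instruction) from the CPI example.
# The class names are an assumption inferred from the cycle counts.
MIX = {
    "load":   (0.22, 5),
    "store":  (0.11, 4),
    "R-type": (0.49, 4),
    "branch": (0.16, 3),
    "jump":   (0.02, 3),
}


def average_cpi(mix):
    """Weighted average of cycles per instruction over the instruction mix."""
    return sum(freq * cycles for freq, cycles in mix.values())


cpi = average_cpi(MIX)
print(f"CPI = {cpi:.2f}")   # CPI = 4.04
```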
8 FSM Controller Implementation
- Typically implemented by a block of combinational logic and a state register that holds the current state
- Total of 10 states (0 to 9) --> 4-bit state register
- Combinational control logic
  - Inputs: current state and any input used to determine the next state (in this case, the 6-bit opcode)
  - Outputs: next-state number and the control signals to be asserted for the current state
- Note: here the control-signal outputs depend only on the current state, not on the inputs (Moore machine)
9 PLA Implementation of the Combinational Control Logic
- If I picked a horizontal or a vertical line, could you explain it?
- Note: upper half is the AND plane, lower half is the OR plane
Example: PCWrite = 1 if (current state is state 0) or (current state is state 9)
Example: next-state bit 2, NS2 = 1 (i.e., next state is 4, 5, 6, or 7) if (current state is 3) or (current state is 2 and op = 101011 (sw)) or (current state is 1 and op = 000000 (R-type)) or (current state is 6)
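The two PLA examples can be written out as sum-of-products expressions. The sketch below assumes the 4-bit state encoding S3..S0 with state 0 = 0000 and state 9 = 1001; the helper names are hypothetical, and `ns2` uses comparisons rather than individual bit literals to stay short.

```python
R_TYPE, SW = 0b000000, 0b101011


def bit(value, i):
    """Bit i of an integer (0 = least significant)."""
    return (value >> i) & 1


def pc_write(state):
    """PCWrite as two product terms: S3'S2'S1'S0' (state 0) + S3 S2'S1'S0 (state 9)."""
    s3, s2, s1, s0 = (bit(state, i) for i in (3, 2, 1, 0))
    inv = lambda b: 1 - b
    term_state0 = inv(s3) & inv(s2) & inv(s1) & inv(s0)
    term_state9 = s3 & inv(s2) & inv(s1) & s0
    return term_state0 | term_state9


def ns2(state, opcode):
    """Next-state bit 2 as an OR of the four product terms in the example."""
    return int(
        state == 3
        or (state == 2 and opcode == SW)
        or (state == 1 and opcode == R_TYPE)
        or state == 6
    )


print([s for s in range(16) if pc_write(s)])   # [0, 9]
```

Each `term_...` variable corresponds to one horizontal line of the AND plane, and the final OR corresponds to one output column of the OR plane.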
10 ROM Implementation of Combinational Control Logic
- Combinational control logic can be expressed in a truth table
  - inputs are the current-state values (S3 - S0) and the opcode bits (Op5 - Op0)
  - outputs are the control signals and the next-state values (NS3 - NS0)
- A ROM can be used to implement a truth table
  - if the address (input) is m bits, we can address 2^m entries in the ROM
  - outputs are the bits of data that the address points to
11 ROM Implementation of Combinational Control Logic
- How many inputs are there? 6 bits for opcode + 4 bits for current state = 10 address lines (i.e., 2^10 = 1024 different addresses)
- How many outputs are there? 16 datapath-control outputs + 4 next-state bits = 20 output bits
- ROM is 2^10 x 20 = 20K bits (and a rather unusual size)
- Rather wasteful, since lots of input combinations (addresses) will never occur, e.g., many opcodes are illegal and some states (e.g., states 10 to 15) are illegal
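The ROM sizing follows directly from the bit counts, as the short calculation below shows. The count of wasted entries is an illustrative lower bound that considers only the unused state codes 10 to 15; the variable names are not from the slides.

```python
OPCODE_BITS = 6
STATE_BITS = 4
CONTROL_OUTPUTS = 16
NEXT_STATE_BITS = 4
USED_STATES = 10                                   # states 0..9

address_bits = OPCODE_BITS + STATE_BITS            # 10 address lines
entries = 2 ** address_bits                        # 1024 addresses
word_bits = CONTROL_OUTPUTS + NEXT_STATE_BITS      # 20 output bits
rom_bits = entries * word_bits                     # 20480 bits = 20K bits

# Every address that carries one of the unused state codes 10..15 never occurs.
unused_states = 2 ** STATE_BITS - USED_STATES
wasted_entries = unused_states * 2 ** OPCODE_BITS  # 6 * 64 = 384 entries

print(rom_bits, rom_bits // 1024, wasted_entries)  # 20480 20 384
```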
12 ROM vs. PLA
- Break up the table into two parts
  - 4 state bits tell you the 16 outputs: 2^4 x 16 bits of ROM
  - 10 bits tell you the 4 next-state bits: 2^10 x 4 bits of ROM
  - Total: 4.3K bits of ROM, a small circuit
- PLA is much smaller
  - can share product terms
  - only needs entries that produce an active output
  - can take don't-cares into account
- Size is (#inputs x #product-terms) + (#outputs x #product-terms)
  - For this example, PLA size is proportional to (10 x 17) + (20 x 17) = 510 PLA cells
- A PLA cell is usually about the size of a ROM cell (bit), or slightly bigger
- PLA is a much more efficient implementation for this control unit
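The size comparison can be reproduced with two small formulas. This sketch only restates the arithmetic in the bullets; the function names are hypothetical, and the 17 product terms are taken from the example as given.

```python
def rom_bits(address_bits, output_bits):
    """A ROM stores one output word for every possible address."""
    return (2 ** address_bits) * output_bits


def pla_cells(inputs, outputs, product_terms):
    """AND-plane cells plus OR-plane cells."""
    return inputs * product_terms + outputs * product_terms


single_rom = rom_bits(10, 20)                # 20480 bits (20K)
control_rom = rom_bits(4, 16)                # 256 bits: state -> 16 control outputs
next_state_rom = rom_bits(10, 4)             # 4096 bits: state + opcode -> next state
split_total = control_rom + next_state_rom   # 4352 bits, about 4.3K
pla = pla_cells(10, 20, 17)                  # 170 + 340 = 510 cells

print(single_rom, split_total, pla)          # 20480 4352 510
```

Splitting the ROM works because the control outputs depend only on the current state (Moore machine), so they do not need the 6 opcode bits in their address.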
13 Microprogramming Control
- If the assembly-language instruction set becomes very large, the FSM could require hundreds to thousands of states and many arcs (sequences) -- very complex
- Complex control is better managed by microprogramming
- Basic idea
  - All control signals in a cycle form a microinstruction; each microinstruction defines
    - the set of datapath control signals that must be asserted in a given state (cycle)
    - the next microinstruction
  - Executing a microinstruction = asserting the control signals specified
  - A sequence of microinstructions forms a microprogram
  - Each cycle, a microinstruction is fetched from the microprogram and executed
- Microprogramming -- designing the control as a program, implementing machine instructions by simpler microinstructions
- Each control state corresponds to a microinstruction
  - Our basic FSM: 10 states --> 10 microinstructions
14 Microinstruction Format
- A microinstruction contains several fields + 1 label
- Each field specifies a non-overlapping set of control signals
  - Signals that are never asserted simultaneously may share the same field
- A last field specifies how to choose the next microinstruction
- Label: some microinstructions have a label that can be branched to
- In our example, we have 7 fields + 1 label
  - 1st to 6th fields: control specification; 7th field: next microinstruction
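One way to picture the format is as a record with a label, six control fields, and a sequencing field. The slide does not name the fields; the names below (ALU control, SRC1, SRC2, Register control, Memory, PCWrite control, Sequencing) are taken from the Patterson and Hennessy text and should be read as an assumption here, as should the example values.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Microinstruction:
    """Label + 6 control fields + 1 sequencing field (field names assumed)."""
    label: Optional[str] = None      # optional name used as a branch target
    alu_control: str = ""            # fields 1-6: control specification
    src1: str = ""
    src2: str = ""
    register_control: str = ""
    memory: str = ""
    pcwrite_control: str = ""
    sequencing: str = "Seq"          # field 7: how to choose the next microinstruction


# First microinstruction of the fetch step: read memory at PC and compute PC + 4.
fetch = Microinstruction(
    label="Fetch", alu_control="Add", src1="PC", src2="4",
    memory="Read PC", pcwrite_control="ALU", sequencing="Seq",
)
print(fetch)
```

A blank field means that none of the control signals belonging to that field is asserted in that cycle.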
15 A Microprogram Control Unit
- Microinstructions are placed in a ROM or PLA
- The state (in the state register) enters as the input or address that selects the current microinstruction, which in turn asserts the relevant control signals
- State changes at the clock edge
- Sequencing: ways to choose the next microinstruction (next state)
  - increment the current address/state (AddrCtl selects the +1 adder) (Seq)
  - branch to the microinstruction that begins execution of the next MIPS instruction (AddrCtl selects address 0) (Fetch)
  - choose the next microinstruction based on the opcode (AddrCtl selects a dispatch table) (Dispatch)
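The three sequencing choices amount to a multiplexer in front of the state register. The sketch below is an assumption-laden illustration: the numeric AddrCtl encoding, the use of two dispatch tables, and their contents follow the Patterson and Hennessy text rather than anything stated on this slide, and all identifiers are hypothetical.

```python
R_TYPE, J, BEQ, LW, SW = 0b000000, 0b000010, 0b000100, 0b100011, 0b101011

# Assumed AddrCtl encoding: which source feeds the state register next.
FETCH, DISPATCH_1, DISPATCH_2, SEQ = 0, 1, 2, 3

# Dispatch tables map the opcode to the address of the next microinstruction.
DISPATCH_ROM_1 = {R_TYPE: 6, J: 9, BEQ: 8, LW: 2, SW: 2}
DISPATCH_ROM_2 = {LW: 3, SW: 5}

# AddrCtl value stored with each of the 10 microinstructions (addresses 0..9).
ADDR_CTL = [SEQ, DISPATCH_1, DISPATCH_2, SEQ, FETCH, FETCH, SEQ, FETCH, FETCH, FETCH]


def next_address(current, opcode):
    """Address-select logic: pick the next microinstruction address."""
    addr_ctl = ADDR_CTL[current]
    if addr_ctl == SEQ:
        return current + 1                # +1 adder
    if addr_ctl == FETCH:
        return 0                          # start the next MIPS instruction
    if addr_ctl == DISPATCH_1:
        return DISPATCH_ROM_1[opcode]
    return DISPATCH_ROM_2[opcode]         # DISPATCH_2


address, trace = 0, [0]
for _ in range(5):                        # follow one lw through the control unit
    address = next_address(address, LW)
    trace.append(address)
print(trace)                              # [0, 1, 2, 3, 4, 0]
```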
16 A Review of Our State Diagram
[Figure: graphical specification (state diagram)]
17 Sequencing: Address Select Logic
18 A Microprogram Control Unit
- A microprogram control unit controlling the datapath
  - ROM or PLA is now the microcode memory (control memory)
  - state register is now the microprogram counter (µPC)
19 A Review of Datapath Control
20 A Review of the Instruction Execution Steps
- 1. IR <= Memory[PC]; PC <= PC + 4 (State 0)
- 2. Instruction Decode (All instructions) (State 1)
  - A <= Reg[IR[25-21]]; B <= Reg[IR[20-16]]
  - ALUOut <= PC + (sign-extend(IR[15-0]) << 2)
- 3. Memory address computation (for lw, sw)
  - ALUOut <= A + sign-extend(IR[15-0]) (State 2)
  - ALU (R-type): ALUOut <= A op B (State 6)
  - Conditional branch: if (A == B) then PC <= ALUOut (State 8)
  - Jump: PC <= PC[31-28] || (IR[25-0] << 2) (State 9)
- 4. For lw or sw instructions (access memory)
  - MDR <= Memory[ALUOut] (State 3) or Memory[ALUOut] <= B (State 5)
  - For ALU (R-type) instructions (write result to register): Reg[IR[15-11]] <= ALUOut (State 7)
- 5. For lw instruction only (write data from MDR to register): Reg[IR[20-16]] <= MDR (State 4)
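The bit-field selections and address computations used in these steps can be tried out on a concrete instruction word. This is a minimal sketch with hypothetical helper names; it assumes the PC has already been incremented by 4 in state 0, and it uses `lw $t0, 8($s1)` (rs = 17, rt = 8) purely as an example.

```python
def bits(word, hi, lo):
    """Return word[hi-lo] (inclusive bit range) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend16(value):
    """Interpret a 16-bit field as a signed number."""
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, ir):
    """State 1: ALUOut <= PC + (sign-extend(IR[15-0]) << 2), PC already incremented."""
    return (pc + (sign_extend16(bits(ir, 15, 0)) << 2)) & 0xFFFFFFFF


def jump_target(pc, ir):
    """State 9: PC <= PC[31-28] || (IR[25-0] << 2)."""
    return (pc & 0xF0000000) | (bits(ir, 25, 0) << 2)


# lw $t0, 8($s1): opcode 100011, rs = 17, rt = 8, immediate = 8
ir = (0b100011 << 26) | (17 << 21) | (8 << 16) | 8
print(hex(ir))                                            # 0x8e280008
print(bits(ir, 25, 21), bits(ir, 20, 16), bits(ir, 15, 0))  # 17 8 8
```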
21 A Symbolic Microprogram
- A specification methodology
  - appropriate if there are hundreds of opcodes, modes, cycles, etc.
  - signals are specified symbolically using microinstructions
  - E.g., Read PC: read memory using PC as the address and write the result into IR (and MDR) (see next slide for details)
- Our symbolic microprogram with 10 microinstructions
Microassembler performs checks to remove combinations that cannot be supported in the datapath
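A symbolic microprogram of this kind can be represented as a table of rows plus a small label resolver. The ten rows below are reproduced from the microprogram figure in the Patterson and Hennessy text, not from this slide (the slide's table did not survive), so they should be checked against the book; the `trace` helper and the instruction-class keys are hypothetical.

```python
# Columns: label, ALU control, SRC1, SRC2, register control, memory,
# PCWrite control, sequencing. Rows assumed from the textbook figure.
MICROPROGRAM = [
    ("Fetch",    "Add",       "PC", "4",       "",          "Read PC",   "ALU",          "Seq"),
    ("",         "Add",       "PC", "Extshft", "Read",      "",          "",             "Dispatch 1"),
    ("Mem1",     "Add",       "A",  "Extend",  "",          "",          "",             "Dispatch 2"),
    ("LW2",      "",          "",   "",        "",          "Read ALU",  "",             "Seq"),
    ("",         "",          "",   "",        "Write MDR", "",          "",             "Fetch"),
    ("SW2",      "",          "",   "",        "",          "Write ALU", "",             "Fetch"),
    ("Rformat1", "Func code", "A",  "B",       "",          "",          "",             "Seq"),
    ("",         "",          "",   "",        "Write ALU", "",          "",             "Fetch"),
    ("BEQ1",     "Subt",      "A",  "B",       "",          "",          "ALUOut-cond",  "Fetch"),
    ("JUMP1",    "",          "",   "",        "",          "",          "Jump address", "Fetch"),
]

# Symbolic dispatch tables: instruction class -> label of the next microinstruction.
DISPATCH = {
    "Dispatch 1": {"R-type": "Rformat1", "j": "JUMP1", "beq": "BEQ1", "lw": "Mem1", "sw": "Mem1"},
    "Dispatch 2": {"lw": "LW2", "sw": "SW2"},
}

# What a microassembler would do first: turn labels into addresses.
LABELS = {row[0]: addr for addr, row in enumerate(MICROPROGRAM) if row[0]}


def trace(instr):
    """Addresses of the microinstructions executed for one instruction."""
    addr, path = 0, []
    while True:
        path.append(addr)
        seq = MICROPROGRAM[addr][7]
        if seq == "Seq":
            addr += 1
        elif seq == "Fetch":
            return path
        else:
            addr = LABELS[DISPATCH[seq][instr]]


for instr in ("lw", "sw", "R-type", "beq", "j"):
    print(instr, trace(instr))
```

Because each microinstruction takes one cycle, the length of each trace gives the cycle count of that instruction class.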
22 Control Signals for Each Symbol in Each Field in the Microprogram
23 Maximally vs. Minimally Encoded
- No encoding of control signals in the microinstruction format (horizontal microprogram)
  - 1 bit for each control signal in a datapath operation, e.g., control signals s, t, u, v, w, x, y, z occupy 8 bits in the microinstruction
  - faster, but requires more memory (logic)
  - used for the VAX 780: an astonishing 400K of control memory!
- Lots of encoding of control signals in the microinstruction format (vertical microprogram)
  - E.g., s, t, u, v, w, x, y, z are encoded in, say, 4 bits, with 0000 meaning u = 1 (others = 0), 1010 meaning u = w = 1 (others = 0), etc.; i.e., every combination of signals that can occur is assigned a code
  - send the microinstructions through logic to get the control signals
  - uses less memory, but slower
- Select a good trade-off
- Microcode implementation: on-chip vs. off-chip
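The difference between the two formats can be seen in a few lines. This sketch uses the eight signal names and the two 4-bit codes from the example; the decode table is deliberately left incomplete because the slide gives only those two codes, and the function names are hypothetical.

```python
SIGNALS = ("s", "t", "u", "v", "w", "x", "y", "z")


def horizontal(asserted):
    """Horizontal format: one stored bit per control signal, no decoding."""
    return {name: int(name in asserted) for name in SIGNALS}


# Vertical format: a 4-bit code selects one combination of signals.
# Only the two codes given in the example are filled in.
DECODE_LOGIC = {
    0b0000: {"u"},
    0b1010: {"u", "w"},
}


def vertical(code):
    """Vertical format: extra decode step turns the 4-bit code into 8 signals."""
    return horizontal(DECODE_LOGIC[code])


print(vertical(0b1010))
```

The horizontal word costs 8 bits per microinstruction but drives the datapath directly; the vertical word costs 4 bits but needs the lookup in `DECODE_LOGIC`, which models the extra logic delay.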
24 Exceptions
- Exception: unexpected event from within the processor (e.g., arithmetic overflow)
- Interrupt: unexpected event from outside of the processor (e.g., from an I/O device)
- An exception or an interrupt causes an unexpected change in control flow. How does the control unit handle an exception/interrupt?
- In case of an exception, the processor should
  - save the address of the offending instruction in the exception program counter (EPC)
  - indicate the reason for the exception in the Cause register (status register)
  - transfer control to the operating system at some specified address (the OS can then provide some service, take a predefined action in response to overflow, or stop the program and report an error). If the OS continues program execution, it uses EPC to determine where to restart
- Another way is vectored interrupts
  - the address to which control is transferred is determined by the cause of the exception
Type of event                  From where?   MIPS terminology
I/O device request             External      Interrupt
Invoke OS from user program    Internal      Exception
Arithmetic overflow            Internal      Exception
Using undefined instruction    Internal      Exception
Hardware malfunctions          Either        Exception or interrupt
25 Exception Handling by Control Unit
- Control unit
  - two more control signals, EPCWrite and CauseWrite, plus IntCause
  - modify the mux to the PC into a 4-way mux to allow the exception address into the PC (the exception address is the OS entry point for exception handling, and is 8000 0180hex for MIPS)
- To handle two types of exceptions: undefined instruction and arithmetic overflow
  - add two states to the state diagram to do the above: one when no state is defined for the op value at state 1 (then --> state 10), the other when overflow is detected from the ALU in state 7 (then --> state 11)
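The effect of the two added states on the processor registers can be sketched as follows. Several details are assumptions borrowed from the Patterson and Hennessy text rather than stated on this slide: the IntCause values (0 for undefined instruction, 1 for overflow) and the use of PC - 4 to recover the offending instruction's address, since the PC was already incremented in state 0. The function and dictionary keys are hypothetical.

```python
EXCEPTION_ADDRESS = 0x80000180                     # OS entry point for MIPS
UNDEFINED_INSTRUCTION, ARITHMETIC_OVERFLOW = 0, 1  # assumed IntCause values


def take_exception(regs, int_cause):
    """Register updates of state 10 (undefined instruction) or state 11 (overflow)."""
    regs = dict(regs)                              # leave the caller's copy unchanged
    regs["Cause"] = int_cause                      # CauseWrite, value chosen by IntCause
    regs["EPC"] = (regs["PC"] - 4) & 0xFFFFFFFF    # EPCWrite: address of offending instruction
    regs["PC"] = EXCEPTION_ADDRESS                 # 4-way PC mux selects the exception address
    return regs


before = {"PC": 0x00400008, "EPC": 0, "Cause": 0}
after = take_exception(before, ARITHMETIC_OVERFLOW)
print({name: hex(value) for name, value in after.items()})
```

After either state the control returns to state 0, so the next instruction fetched is the first instruction of the OS handler.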
26 Chapter Summary
- Part 1
  - Elements of datapath: instruction subset, resources, clocking method
  - Datapath for different instruction classes
  - Building single-cycle datapath: multiplexors, functional units, control signals
  - Single-cycle datapath control unit logic: ALU control, main control
  - Single-cycle datapath and control: complete picture, critical path, problems
- Part 2
  - Multi-cycle datapath: approach, additional registers and multiplexors, control signals
  - Breaking instructions into execution steps
  - Multi-cycle datapath and control: complete picture
  - Finite state machine (FSM) (hardwired) control: controller implementation
  - Microprogramming: control, microinstruction format, controller implementation, symbolic microprogram and its control signals, issues
  - Exception handling