Lecture 8 Reduced Instruction Set Computer - PowerPoint PPT Presentation


Title: Lecture 14 - RISC Author: Last modified by: Created Date: 9/26/1998 11:40:50 AM Document presentation format: – PowerPoint PPT presentation


Transcript and Presenter's Notes

Title: Lecture 8 Reduced Instruction Set Computer

1
Lecture 8: Reduced Instruction Set Computer
2
Lecture 8: RISC

In this lecture, we will study
Program execution characteristics
RISC Philosophy
Make the most frequently executed statement fast
Functional, Transfer instructions
Simple, small number of fixed format instructions
Large register file
Make the most time consuming statements fast
Procedure Call and Return instructions
Large register file
Large Register File
Overlapping Register Windows
Linear and Circular organization of ORWs
Ultimate RISC

3
Instruction Execution Characteristics: Type of Operations

What type of statements is most frequent?
Assignment statements dominate
Functional instructions and Transfer instructions
Movements of data must be made simple, thus fast
Conditional statements (if and loop together)
Instructions with Control function
Sequence control mechanism is important

4
Instruction Execution Characteristics: Time Consumed by Statements
Machine-instruction weighted = Average no. of machine instructions per statement x Frequency of occurrence
Memory-reference weighted = Average no. of memory references per statement x Frequency of occurrence
The most time-consuming statement is procedure CALL/RETURN
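The two weightings above can be sketched in a few lines. This is only an illustration: the function name `weighted_shares` is made up here, and the numbers in `example` are placeholders chosen for the sketch, not measurements from the lecture.

```python
def weighted_shares(stats):
    """stats maps statement type -> (frequency, avg instructions, avg memory refs).

    Returns two dicts giving each statement type's share of executed machine
    instructions and of memory references.
    """
    instr = {name: freq * n_instr for name, (freq, n_instr, _) in stats.items()}
    mem = {name: freq * n_mem for name, (freq, _, n_mem) in stats.items()}
    total_instr = sum(instr.values())
    total_mem = sum(mem.values())
    instr_share = {name: v / total_instr for name, v in instr.items()}
    mem_share = {name: v / total_mem for name, v in mem.items()}
    return instr_share, mem_share


# Placeholder numbers for illustration only (not data from the lecture).
example = {
    "ASSIGN": (0.50, 2, 1),
    "IF": (0.35, 3, 1),
    "CALL": (0.15, 20, 15),
}

instr_share, mem_share = weighted_shares(example)
for name in example:
    print(f"{name:6s} instr {instr_share[name]:.2f}  mem {mem_share[name]:.2f}")
```

With inputs of this shape, a statement that is rare in the source but expensive per occurrence can still dominate both weighted columns, which is the point the slide makes about CALL/RETURN.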
5
Instruction Execution Characteristics: Type of Operands

The majority of references are to scalars
80% of them are local to a procedure
References to arrays/structures require an index or pointer

Locations of operands (average per instruction)
0.5 operands in memory
1.4 operands in registers

6
Instruction Execution Characteristics: Procedure Calls

Two most significant aspects in implementing this operation
Number of parameters
Depth of nesting
Statistics on the number of parameters
98% of dynamically called procedures were passed fewer than 6 parameters
92% of them used fewer than 6 local scalar variables

7
Multiple Register Sets
Multiple register sets
- Assume that we have several sets of registers, so that each set can be used by a different procedure
- Saves time in procedure CALL/RETURN, which is done simply by changing the R set pointer value
Figure: the R set pointer selects one of Set 0, Set 1, Set 2, ..., Set n-1
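A minimal sketch of the idea, assuming a small fixed number of non-overlapping sets. The class name `MultiRegisterSets` and the sizes are hypothetical; the only thing taken from the slide is that CALL and RETURN change a set pointer instead of copying registers.

```python
class MultiRegisterSets:
    """Several register sets; the active one is selected by a set pointer."""

    def __init__(self, n_sets=8, regs_per_set=16):  # sizes are assumptions
        self.sets = [[0] * regs_per_set for _ in range(n_sets)]
        self.pointer = 0  # the "R set pointer"

    def call(self):
        # CALL: switch to the next set; no register is saved to memory.
        if self.pointer + 1 >= len(self.sets):
            raise OverflowError("no free set: one would have to be saved to memory")
        self.pointer += 1

    def ret(self):
        # RETURN: switch back to the caller's set.
        if self.pointer == 0:
            raise IndexError("return without a matching call")
        self.pointer -= 1

    def read(self, r):
        return self.sets[self.pointer][r]

    def write(self, r, value):
        self.sets[self.pointer][r] = value


regs = MultiRegisterSets()
regs.write(0, 11)      # caller's R0
regs.call()
regs.write(0, 22)      # callee's R0 lives in a different set
regs.ret()
print(regs.read(0))    # 11: the caller's value was never touched
```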
8
Instruction Execution Characteristics: Depth of Procedure Nesting
Procedure Nesting and Register Set Window
Figure: nesting depth plotted against time t
Shifting the register set window requires saving the contents of one register set to memory, so that the set can be used by the new procedure
Statistics: with a window depth of 8, a shift is needed on less than 1% of calls and returns
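How often a shift is needed depends on how far the nesting depth wanders, not on how deep it is. The sketch below counts shifts for a call/return trace under a simplified model in which all `n_windows` sets can hold frames; the function name and the trace encoding (+1 for CALL, -1 for RETURN) are assumptions made for this illustration.

```python
def count_window_shifts(trace, n_windows):
    """Count register-set saves/restores for a trace of +1 (CALL) / -1 (RETURN)."""
    depth = 0    # current nesting depth
    lowest = 0   # depth of the oldest frame still held in registers
    shifts = 0
    for step in trace:
        depth += step
        if depth - lowest >= n_windows:
            # CALL with no free set: save the oldest set to memory.
            lowest += 1
            shifts += 1
        elif depth < lowest:
            # RETURN to a frame that was saved earlier: restore it.
            lowest -= 1
            shifts += 1
    return shifts


trace = [+1, +1, +1, -1, -1, -1]
print(count_window_shifts(trace, 8))  # 0
print(count_window_shifts(trace, 2))  # 4
```

A trace that goes very deep but then oscillates within a narrow band causes few shifts, which is why a modest window depth can be enough.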
9
Complex Instruction Set Computer(CISC)

Design Philosophy of CISC
Distinction between Architecture and
Implementation via microprogrammed control unit
Richer Instruction Set
Performance through more powerful instructions
Reduce the semantic gap to make programming easier
Simplifying compiler functions
Larger Microprogram
Moving hardware functions to micro-code
Moving software functions to micro-code
Parallelism
Pipelining
Multiple function units, processors, computers
NO ATTENTION PAID TO INSTRUCTION FREQUENCY, TIME-CONSUMING INSTRUCTIONS, etc.

10
RISC Philosophy (1): Make the Most Frequent Statements Execute Fast
The most frequent statements are assignment statements, and each of them is translated by the compiler into a set of Functional and/or Transfer instructions. Thus Functional and Transfer instructions need to be made to execute fast.
Instruction Cycle of Functional Instruction or Transfer Instruction
11
Assignment Statements

To make the Instruction Fetch fast
Short OP-code part: a small number of instructions in the instruction set
Short operand address part: keep the operands in registers instead of memory (M)
To make the Instruction Preparation fast
Fixed length instruction
Fixed format instruction
Simple addressing modes
To make the Operand Fetch fast
Make the operands available from registers
instead of memory
Needs a large register file
To make the Instruction Execution fast
Multiple register set Overlapping MRS
Instruction execution pipeline

12
RISC Philosophy (2): Make the Most Time-Consuming Statements Execute Fast

Methods of passing Parameters
Through memory
Parameters are stored in the memory locations
which are commonly accessible by both calling
and called procedures
Execution of CALL and RETURN instructions is very slow due to the memory accesses, especially when there are many parameters to pass
Through registers
Parameters are stored in the registers in CPU
Calling procedure needs to save the registers,
which are not used for passing parameters, in the
memory. This results in a lot of memory accesses
and makes the execution times of these
instructions slow.

13
Time Out


14
CISC and RISC

RISC
A limited and simple instruction set
A large number of general-purpose registers (register file)
An emphasis on optimizing the instruction
pipeline

15
Large Register File
If the number of registers is small, a strategy is needed to keep the most frequently accessed operands in registers and minimize register-memory traffic
- Software approach: maximize register usage by the compiler (requires sophisticated program analysis)
- Hardware approach: more registers in the register file
16
Register Window

Fact
Statistically, most operand references are to local scalars (80%)
Variables local to a procedure cannot be accessed by other procedures
Problem
The set of locals changes with each procedure CALL/RETURN
CALL/RETURN occurs frequently
Parameters need to be passed around
Observations
Statistically, few parameters (< 6) and local variables (< 6)
Statistically, the depth of procedure activation fluctuates within a relatively narrow range (< 8)
Solution
Multiple small sets of registers
Each set is assigned to a different procedure
Windows for adjacent procedures overlap to allow
parameter passing

17
Multiple Register Set
Each register set is assigned to a different procedure
- The size of a register set is equal to the size of a window
- Parameters need to be copied into the called/calling procedure's register set
- Requires register move instructions
18
Overlapping Register Window
When the register sets are implemented in a large register file, we call each register set a register window.
Overlapping register windows
- Portions of adjacent register windows overlap for passing parameters
- At any time only one window is visible
- No need to move information for parameter passing
How about global variables?
19
Global Variables

Global Variables are commonly accessible by all
the procedures
Assign to memory locations by compiler
Straight forward but inefficient for the
frequently accessed global variables because of
frequent memory accesses
Set aside a set of Global Variable registers
Available to all procedures
Unified register numbering system to simplify
instruction format
e.g. R0-R7: global
R8-R13: current window

20
Linear Organization of Register Windows
21
Circular Organization of Register Windows
n-window register file accommodates n-1 procedure
calls
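One way to picture the circular organization is as a mapping from (window, logical register) to a physical register. The sketch below uses the Berkeley RISC sizes quoted later in this lecture (10 global, 6 parameter, 10 local, 6 temporary, 138 registers, hence 8 windows). The logical numbering (R0-R9 global, R10-R15 incoming parameters, R16-R25 local, R26-R31 outgoing) and the function name are assumptions for this illustration, not something the slides specify.

```python
N_GLOBAL = 10                              # registers visible to every procedure
N_PARAM = 6                                # registers shared by adjacent windows
N_LOCAL = 10                               # registers private to one window
N_WINDOWS = 8                              # (138 - 10) / 16
UNIQUE_PER_WINDOW = N_LOCAL + N_PARAM      # 16 new registers per window
RING_SIZE = N_WINDOWS * UNIQUE_PER_WINDOW  # 128 windowed registers


def physical_register(window, r):
    """Map logical register r (0..31) of the given window to a physical index.

    Assumed numbering: R0-R9 global, R10-R15 incoming parameters,
    R16-R25 local, R26-R31 outgoing parameters (the next window's R10-R15).
    """
    if not 0 <= r < 32:
        raise ValueError("logical register must be in 0..31")
    if r < N_GLOBAL:
        return r  # globals do not move with the window
    offset = window * UNIQUE_PER_WINDOW + (r - N_GLOBAL)
    return N_GLOBAL + offset % RING_SIZE   # wrap around: circular organization


# The caller's outgoing R26 is the callee's incoming R10: no copy is needed.
print(physical_register(0, 26) == physical_register(1, 10))
# The last window wraps around and overlaps window 0.
print(physical_register(7, 26) == physical_register(0, 10))
```

The wrap-around overlap between the last and the first window is what limits an n-window file to n-1 nested calls before a window has to be saved.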
22
Code Size

Smaller programs
Program takes less memory space
Smaller program improves performance
Fewer instructions
Fewer bytes to fetch
In a paging environment, a smaller program occupies fewer pages, which reduces page faults
CISC
Smaller number of instructions in the program (the program may be shorter, but it does not necessarily take less space)

23
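Example
CISC: a single memory-to-memory instruction computing A = B + C
Memory traffic: Instruction 56 bits, Data 32 x 3 = 96 bits, Total memory bits used 56 + 96 = 152 bits
RISC:
LD Rb, B
LD Rc, C
ADD Ra, Rb, Rc
ST Ra, A
Memory traffic: Instruction 104 bits, Data 96 bits, Total memory bits used 104 + 96 = 200 bits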
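The traffic figures can be reproduced with a small calculation. The field widths below (8-bit opcode, 16-bit memory address, 4-bit register number) are assumptions chosen because they yield the 56-bit CISC instruction; only the 32-bit data word comes from the slide. Under these assumptions the four-instruction RISC sequence needs 104 instruction bits, which is the figure consistent with the 200-bit total.

```python
OPCODE_BITS = 8      # assumption
ADDRESS_BITS = 16    # assumption
REGISTER_BITS = 4    # assumption
WORD_BITS = 32       # from the slide: data is 32 x 3 bits


def memory_traffic(program):
    """program: list of (memory_addresses, registers, data_words) per instruction.

    Returns (instruction_bits, data_bits, total_bits).
    """
    instr_bits = sum(
        OPCODE_BITS + n_addr * ADDRESS_BITS + n_reg * REGISTER_BITS
        for n_addr, n_reg, _ in program
    )
    data_bits = sum(n_words * WORD_BITS for _, _, n_words in program)
    return instr_bits, data_bits, instr_bits + data_bits


cisc = [(3, 0, 3)]        # one memory-to-memory add: A = B + C
risc = [
    (1, 1, 1),            # LD Rb, B
    (1, 1, 1),            # LD Rc, C
    (0, 3, 0),            # ADD Ra, Rb, Rc
    (1, 1, 1),            # ST Ra, A
]
print(memory_traffic(cisc))   # (56, 96, 152)
print(memory_traffic(risc))   # (104, 96, 200)
```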
24
Characteristic of RISC(1, 2)

(1) 1 instruction per cycle (memory cycle)
Machine cycle: IF, IP, and the time to fetch the operands from registers, perform the operation, and store the result in a register
RISC instruction <-> CISC micro-instruction
-> No need to microprogram (hardwired control)
(2) Register-to-register operation
With only simple Load and Store operations for accessing memory (Load/Store architecture)
Simplifies the instruction set and the control unit

25
Characteristic of RISC(3, 4)

(3) Simple Addressing Modes - Shorten EA
generation time
Almost all instructions use register addressing
Relative addressing using PC, BAR, and Index
address
Other complex modes may be synthesized by software

(4) Simple Instruction Format - Shorten
instruction Decoding Time
Usually one format
Fixed length/align on word boundary
Fixed field length

26
Characteristic of RISC(5)

(5) Pipelining (we will learn this later in detail)
At this time, you just need to know that
- Instruction execution hardware can be made of a few interconnected independent sub-modules, called pipeline STAGEs
- The execution of an instruction progresses through each pipeline stage in sequence
- When an instruction completes its execution at the i-th stage, the next instruction commences its execution at the i-th stage
- Thus, in the ideal situation, throughput increases nearly n times, where n is the number of pipeline stages
- Branch instructions make pipelined execution inefficient
27
Laundry Task

Laundry Example
Ann, Brian, Cathy, Dave each have one load of
clothes to wash, dry, and fold

Washer takes 30 minutes

Dryer takes 40 minutes

Folder takes 20 minutes

We have 3 different work stages
28
Sequential Laundry
29
Pipelined Laundry

Pipelined laundry takes 3.5 hours for 4 loads
Maximum of 3 tasks can be carried out concurrently
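The 3.5-hour figure can be checked with a short simulation of the three stages. This is a sketch under one assumption not stated on the slides: a load that has finished a stage can wait outside it until the next stage is free, so a stage is released as soon as its own work is done. The function name is made up for the example.

```python
def pipeline_finish_time(stage_minutes, n_loads):
    """Return the minute at which the last load leaves the last stage."""
    free_at = [0] * len(stage_minutes)  # when each stage next becomes free
    finished = 0
    for _ in range(n_loads):
        ready = 0                       # every load is available at time 0
        for stage, minutes in enumerate(stage_minutes):
            start = max(ready, free_at[stage])  # wait for the load and the stage
            ready = start + minutes
            free_at[stage] = ready
        finished = ready
    return finished


stages = [30, 40, 20]                   # washer, dryer, folder (minutes)
sequential = 4 * sum(stages)
pipelined = pipeline_finish_time(stages, 4)
print(sequential / 60, pipelined / 60)  # 6.0 3.5
```

After the first load, one load finishes every 40 minutes, the time of the slowest stage (the dryer).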

30
Pipelined Execution
1 instruction execution: t x 4 (stages S0 to S3)
Execution of a sequence of instructions
I0 completes at 4t
I1 completes at 5t
I2 completes at 6t
I3 completes at 7t
I4 completes at 8t
n instructions complete at (n + 3)t
When n is large, this becomes approximately nt. Thus, 1 instruction completes in every t
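The completion times above follow a simple pattern that generalizes to k stages: the first instruction needs k stage times and each later one adds a single stage time. A small sketch, with function names chosen for this example and an ideal pipeline (equal stage times, no branches) assumed:

```python
def pipelined_time(n_instructions, n_stages=4):
    """Stage times needed to finish n instructions on an ideal pipeline."""
    return n_stages + n_instructions - 1


def speedup(n_instructions, n_stages=4):
    """Unpipelined time (n * k) divided by pipelined time (k + n - 1)."""
    return n_instructions * n_stages / pipelined_time(n_instructions, n_stages)


print(pipelined_time(1))   # 4  (I0 completes at 4t)
print(pipelined_time(5))   # 8  (I4 completes at 8t)
print(round(speedup(1000), 2))
```

As the instruction count grows, the speedup approaches the number of stages but never reaches it, because of the time needed to fill the pipeline.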
31
Pipeline Characteristics

Multiple tasks operating simultaneously
Pipeline does not help latency of single task,
but it helps throughput of entire workload
Pipeline rate is limited by the slowest pipeline
stage
Unbalanced lengths of pipeline stages reduce
speedup
Potential speedup = number of pipeline stages
The time to fill the pipeline and the time to drain it reduce speedup

32
Time Out


33
Berkeley RISC
RISC-I and RISC-II
A 32-bit processor
31 and 39 instructions, respectively
ORW, 138 registers
Window: 10 global, 6 temporary, 10 local, 6 parameter
Instruction Format
Cond (flag): C, Z, O, N
Rd: destination register
Rs1: source register
S2:
Functional instr.: if MSB = 0, then S2 = Rs2 (another source register)
if MSB = 1, imm13 (13-bit immediate data)
Transfer or sequencing instr.: if MSB = 0, EA = Rs1 + Rs2 (index reg.)
if MSB = 1, EA = Rs1 + imm13
RISC-II: EA = PC + S2
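The two interpretations of the S2 field can be sketched as follows. The sketch treats S2 as a 14-bit field (1 selector bit plus 13 bits), takes the register number from the low 5 bits, and sign-extends the immediate; the last two points, along with the function names, are assumptions made for this illustration rather than details given on the slide.

```python
def sign_extend_13(value):
    """Interpret the low 13 bits of value as a two's-complement number."""
    value &= 0x1FFF
    return value - 0x2000 if value & 0x1000 else value


def s2_operand(regs, s2):
    """Second operand: a register if the MSB of S2 is 0, else a 13-bit immediate."""
    if (s2 >> 13) & 1:
        return sign_extend_13(s2)      # MSB = 1: imm13
    return regs[s2 & 0x1F]             # MSB = 0: Rs2 (5-bit number assumed)


def effective_address(regs, rs1, s2, pc=None):
    """EA = Rs1 + S2, or EA = PC + S2 when a PC value is supplied (relative)."""
    base = regs[rs1] if pc is None else pc
    return (base + s2_operand(regs, s2)) & 0xFFFFFFFF


regs = [0] * 32
regs[1], regs[2] = 100, 8
print(effective_address(regs, 1, 2))                  # 108: Rs1 + Rs2
print(effective_address(regs, 1, (1 << 13) | 0x1FFF)) # 99: Rs1 + imm13 (-1)
print(effective_address(regs, 1, (1 << 13) | 4, pc=200))  # 204: PC + imm13
```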
34
RISC-II Instruction Set

Functional (C = carry, R = reverse)
ADD, ADDC, SUB, SUBC, SUBR, SUBCR, AND, OR, XOR,
SLL, SRL, SRA
Transfer (X = index, W = word, H = half, B = byte, R = relative, U/S = unsigned/signed)
(Index: EA = Rs1 + S2 (Rs2); Relative: EA = PC + S2 (Rs2))
LDXW, LDXHU(S), LDXBU(S), LDRW, LDRHU(S),
LDRBU(S)
STXW, STXHU(S), STXBU(S), STRW, STRHU(S),
STRBU(S)
Sequence Control
JMP, JMPR, CALL, CALLR, RET, CALLINT, RETINT, ...

35
Ultimate RISC Instruction Set

BN instruction
Conditional branch phase in each instruction
cycle
Does not conform with RISC philosophy, that is,
inefficient use of instruction pipeline
Ultimate RISC instruction set
Move the content of the SOURCE(Read) to the
DESTINATION(Write), both within memory
2-address instruction
1 address fits in an M word
4-cycle instruction

36
Ultimate RISC Architecture
Memory-mapped I/O
Memory-mapped ALU
PC: 1 special word (address 0)
ALU contains an accumulator (AC) and flags
Memory-mapped ALU: arithmetic operations via special addresses
When the ALU is used as a destination
- Store a value in AC
- Operate on AC
When the ALU is used as the source
- One address gets the value of AC
- Other addresses test the condition codes and set the destination address
(branch to either one of the 2 consecutive addresses)
37
Memory Mapped ALU
Writing an operand into an address associated with the operation, then reading the result from the other address
38
Condition Codes and Branching
Condition Codes: 2 (10) = True, 0 (00) = False
- Upon testing a CC, it sets the LSB of the destination address
- This allows a branch to either one of two consecutive instructions
Branch: moving a target address to location 0 (PC)
39
Instruction Cycle
Instruction layout in memory
- 2 adjoining words per instruction
- Contiguous storage of instructions
Instruction Cycle
- 4 clean cycles for pipelining
1 Fetch Source Address and increment PC (IS read)
2 Read Source Data (RS read)
3 Fetch Destination Address (ID read)
4 Write Data to Destination (WD write)
Pipelining with a 4-port memory (3 reads and 1 write)
Instruction 1  IS1  RS1  ID1  WD1
Instruction 2       IS2  RS2  ID2  WD2
Instruction 3            IS3  RS3  ID3  WD3
Instruction 4                 IS4  RS4  ID4  WD4
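The move-only machine can be sketched as a toy simulator. Taken from the slides: the PC is the word at address 0, an instruction is two adjoining words (source address, destination address), and the four steps of the cycle. Everything else is an assumption for illustration, in particular the ALU addresses 0xF0 to 0xF2, the memory size, and the class and method names.

```python
class MoveMachine:
    """Toy move-only machine: each instruction is (source addr, destination addr)."""

    PC = 0       # from the slides: the PC is the special word at address 0
    AC = 0xF0    # hypothetical address: write loads AC, read returns AC
    ADD = 0xF1   # hypothetical address: write adds the value to AC
    SUB = 0xF2   # hypothetical address: write subtracts the value from AC

    def __init__(self, size=256):
        self.mem = [0] * size
        self.ac = 0

    def read(self, addr):
        return self.ac if addr == self.AC else self.mem[addr]

    def write(self, addr, value):
        if addr == self.AC:
            self.ac = value
        elif addr == self.ADD:
            self.ac += value
        elif addr == self.SUB:
            self.ac -= value
        else:
            self.mem[addr] = value      # writing address 0 (the PC) is a branch

    def step(self):
        pc = self.mem[self.PC]
        src = self.mem[pc]              # 1: fetch source address ...
        self.mem[self.PC] = pc + 2      #    ... and move the PC past both words
        value = self.read(src)          # 2: read source data
        dst = self.mem[pc + 1]          # 3: fetch destination address
        self.write(dst, value)          # 4: write data to destination

    def load(self, start, moves):
        for i, (src, dst) in enumerate(moves):
            self.mem[start + 2 * i] = src
            self.mem[start + 2 * i + 1] = dst
        self.mem[self.PC] = start


m = MoveMachine()
m.mem[0x81], m.mem[0x82] = 5, 7
# mem[0x80] = mem[0x81] + mem[0x82], written as three moves through the ALU
m.load(0x10, [(0x81, m.AC), (0x82, m.ADD), (m.AC, 0x80)])
for _ in range(3):
    m.step()
print(m.mem[0x80])   # 12
```

Because the PC is incremented before the write in step 4, a move whose destination is address 0 simply replaces the PC, so no separate branch instruction is needed.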
40
Improvement: 3-Cycle Design
Instruction Cycle
- 3 clean cycles
1 Fetch Source and Destination Addresses and increment PC (ISD read)
2 Read Source Data (RS read)
3 Write Data to Destination (WD write)
3-way pipelining using a 3-port memory (2 read ports and 1 write port)
Instruction 1  ISD1  RS1   WD1
Instruction 2        ISD2  RS2   WD2
Instruction 3              ISD3  RS3   WD3
Instruction 4                    ISD4  RS4   WD4
41
Improvement: 2-Cycle Design
Instruction Cycle (2 dedicated memory units: 1 instruction, 1 data)
1 Read Data from Source (RS read)
2 Write Data to Destination (WD write), Read instruction (RI read), Increment PC
2-way pipelining
Instruction p+1  WDp  RSp+1  WDp+1
Instruction p+2              RSp+2  WDp+2
Instruction p+3                     RSp+3  WDp+3
Instruction p+4                            RSp+4  WDp+4
