Lecture 8 Reduced Instruction Set Computer


Title: Lecture 14 - RISC
Created Date: 9/26/1998 11:40:50 AM



1
Lecture 8 Reduced Instruction Set Computer
2
Lecture 8 RISC
  • In this lecture, we will study
    • Program execution characteristics
    • RISC philosophy
      • Make the most frequently executed statements fast
        • Functional, Transfer instructions
        • Simple, small number of fixed-format instructions
        • Large register file
      • Make the most time-consuming statements fast
        • Procedure Call and Return instructions
        • Large register file
    • Large register file
      • Overlapping Register Windows (ORWs)
      • Linear and circular organization of ORWs
    • Ultimate RISC

3
Instruction Execution Characteristics: Type of Operations
  • Which types of statements are most frequent?
    • Assignment statements dominate
      • Functional instructions and Transfer instructions
      • Movement of data must be made simple, and thus fast
    • Conditional statements (if and loop together)
      • Instructions with a control function
      • The sequence control mechanism is important

4
Instruction Execution Characteristics: Time Consumed by Statements
  • Machine-instruction weighted = (average number of machine instructions per statement) x (frequency of occurrence)
  • Memory-reference weighted = (average number of memory references per statement) x (frequency of occurrence)
  • The most time-consuming statement is procedure CALL/RETURN
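The two weightings can be tried out with a short script. This is only a sketch: the statement mix and the per-statement averages below are invented for illustration and are not measurements from the lecture. Normalising the products into shares is also an added step, used here to make the comparison easy to read.

```python
# Hypothetical data, invented for illustration only.
# Each entry: (frequency of occurrence,
#              average machine instructions per statement,
#              average memory references per statement)
STATEMENTS = {
    "ASSIGN": (0.45, 1.0, 1.5),
    "IF": (0.30, 2.0, 1.0),
    "CALL/RETURN": (0.15, 10.0, 16.0),
    "LOOP": (0.10, 4.0, 3.0),
}


def weighted_shares(statements, column):
    """Weight one per-statement average by frequency and normalise to shares.

    column 1 selects machine instructions, column 2 selects memory references.
    """
    raw = {name: row[0] * row[column] for name, row in statements.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


instr_weighted = weighted_shares(STATEMENTS, 1)
mem_weighted = weighted_shares(STATEMENTS, 2)

for name in STATEMENTS:
    print(f"{name:12s} instr={instr_weighted[name]:.2f} mem={mem_weighted[name]:.2f}")
```

With these made-up numbers, CALL/RETURN is only 15% of the statements but takes the largest share under both weightings, which is the kind of effect the slide describes.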
5
Instruction Execution Characteristics: Type of Operands
  • The majority of references are to scalars
    • 80% are local to a procedure
    • References to arrays/structures require an index or pointer
  • Locations of operands (average per instruction)
    • 0.5 operands in memory
    • 1.4 operands in registers

6
Instruction Execution Characteristics: Procedure Calls
  • The two most significant aspects in implementing this operation
    • Number of parameters
    • Depth of nesting
  • Statistics on the number of parameters
    • 98% of dynamically called procedures were passed fewer than 6 parameters
    • 92% of them used fewer than 6 local scalar variables

7
Multiple Register Sets
  • Assume that we have several sets of registers, so that each set can be used by a different procedure
  • Saves time in procedure CALL/RETURN, which simply changes the R set pointer value
Figure: the R set pointer selects one of Set 0, Set 1, Set 2, ..., Set n-1
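The register set pointer idea can be sketched in a few lines. The class name, method names, and sizes here are hypothetical, chosen only for illustration; the point is that CALL and RETURN change one pointer instead of copying registers to memory.

```python
class MultipleRegisterSets:
    """Toy model: one register set per active procedure (hypothetical sizes)."""

    def __init__(self, num_sets=4, regs_per_set=8):
        self.sets = [[0] * regs_per_set for _ in range(num_sets)]
        self.pointer = 0  # the "R set pointer": index of the active set

    def call(self):
        """CALL: switch to the next register set; no registers are copied."""
        if self.pointer + 1 >= len(self.sets):
            raise OverflowError("all register sets are in use")
        self.pointer += 1

    def ret(self):
        """RETURN: switch back to the caller's register set."""
        if self.pointer == 0:
            raise RuntimeError("no caller to return to")
        self.pointer -= 1

    def read(self, reg):
        return self.sets[self.pointer][reg]

    def write(self, reg, value):
        self.sets[self.pointer][reg] = value


rs = MultipleRegisterSets()
rs.write(0, 11)    # caller's R0
rs.call()
rs.write(0, 22)    # callee's R0 lives in a different set
rs.ret()
print(rs.read(0))  # the caller's R0 is untouched
```

What this toy leaves out is parameter passing and what happens when the sets run out.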
8
Instruction Execution Characteristics: Depth of Procedure Nesting
Figure: Procedure nesting and register set window (nesting depth plotted against time t)
  • Shifting the register set window requires saving the information in one register set to memory, so that the register set can be used by the new procedure
  • Statistics: with a window depth of 8, a shift is needed on less than 1% of calls and returns
9
Complex Instruction Set Computer(CISC)
  • Design Philosophy of CISC
    • Distinction between Architecture and Implementation via a microprogrammed control unit
    • Richer instruction set
      • More powerful individual instructions
      • Reduce the semantic gap to ease programming
      • Simplify compiler functions
    • Larger microprogram
      • Moving hardware functions to microcode
      • Moving software functions to microcode
    • Parallelism
      • Pipelining
      • Multiple function units, processors, computers
  • NO ATTENTION TO INSTRUCTION FREQUENCY, TIME-CONSUMING INSTRUCTIONS, etc.

10
RISC Philosophy (1): Make the Most Frequent Statements Execute Fast
The most frequent statements are assignment statements, and each of them is translated by the
compiler into a set of Functional and/or Transfer instructions. Thus Functional and
Transfer instructions need to be made to execute fast.
Figure: Instruction cycle of a Functional or Transfer instruction
11
Assignment Statements
  • To make the Instruction Fetch fast
    • Short op-code part: small number of instructions in the instruction set
    • Short operand address part: keep the operands in registers instead of memory
  • To make the Instruction Preparation fast
    • Fixed-length instructions
    • Fixed-format instructions
    • Simple addressing modes
  • To make the Operand Fetch fast
    • Make the operands available from registers instead of memory
    • Needs a large register file
  • To make the Instruction Execution fast
    • Multiple register sets (MRS), overlapping MRS
    • Instruction execution pipeline

12
RISC Philosophy (2): Make the Most Time-Consuming Statements Execute Fast
  • Methods of passing parameters
    • Through memory
      • Parameters are stored in memory locations that are accessible to both the calling and the called procedures
      • Execution of CALL and RETURN instructions is very slow because of the memory accesses, especially when there are many parameters to pass
    • Through registers
      • Parameters are stored in CPU registers
      • The calling procedure needs to save the registers that are not used for passing parameters in memory. This results in many memory accesses and makes these instructions slow.

13
Time Out

14
CISC and RISC
  • RISC
  • A limited and simple instruction set
  • A large number of GPRs (register file)
  • An emphasis on optimizing the instruction
    pipeline

15
Large Register File
If the number of registers is small, a strategy is needed to keep the most frequently accessed
operands in registers and so minimize register-memory traffic.
  • Software approach: maximize register usage by the compiler (requires sophisticated program analysis)
  • Hardware approach: more registers in the register file
16
Register Window
  • Fact
    • Statistically, most operand references (80%) are to local scalars
    • Variables local to a procedure cannot be accessed by other procedures
  • Problem
    • The set of local variables changes with each procedure CALL/RETURN
    • CALL/RETURN occurs frequently
    • Parameters need to be passed around
  • Observations
    • Statistically, procedures use few parameters (< 6) and few local variables (< 6)
    • Statistically, the depth of procedure activation fluctuates within a relatively narrow range (< 8)
  • Solution
    • Multiple small sets of registers
    • Each set is assigned to a different procedure
    • Windows for adjacent procedures overlap to allow parameter passing

17
Multiple Register Set
  • Each register set is assigned to a different procedure
  • The size of a register set is equal to the size of a window
  • Parameters need to be copied between the register sets of the calling and called procedures
  • This requires register move instructions
18
Overlapping Register Window
When the register sets are implemented in a large register file, each register set is called a
Register Window.
Overlapping Register Windows
  • Portions of adjacent register windows overlap for passing parameters
  • At any time only one window is visible
  • No need to move information for parameter passing
How about global variables?
19
Global Variables
  • Global variables are accessible by all the procedures
  • Assigned to memory locations by the compiler
    • Straightforward, but inefficient for frequently accessed global variables because of the frequent memory accesses
  • Set aside a set of global variable registers
    • Available to all procedures
    • Unified register numbering system to simplify the instruction format
    • e.g. R0-R7: global; R8-R13: current window
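A unified numbering scheme amounts to a mapping from (current window, register number) to a physical register. The sketch below uses the sizes quoted on the Berkeley RISC slide later in this lecture (10 global, 6 parameter, 10 local, 6 temporary, 138 registers in total), not the R0-R13 example above. The count of 8 windows, the function name, and the direction in which windows advance are assumptions made for illustration.

```python
# Sizes taken from the Berkeley RISC slide; N_WINDOWS = 8 is an assumption
# that is consistent with 138 = 10 + 8 * 16.
N_GLOBAL, N_IN, N_LOCAL, N_OUT, N_WINDOWS = 10, 6, 10, 6, 8
UNIQUE_PER_WINDOW = N_IN + N_LOCAL                      # 16 registers not shared with the next window
N_PHYSICAL = N_GLOBAL + N_WINDOWS * UNIQUE_PER_WINDOW   # 138
N_VISIBLE = N_GLOBAL + N_IN + N_LOCAL + N_OUT           # 32 registers visible at a time


def physical_index(window, reg):
    """Map a visible register number (0..31) in a given window to a physical register."""
    if not 0 <= reg < N_VISIBLE:
        raise ValueError("register number out of range")
    if reg < N_GLOBAL:
        return reg  # globals are the same physical registers for every procedure
    offset = window * UNIQUE_PER_WINDOW + (reg - N_GLOBAL)
    # Wrap around so that the last window overlaps the first one.
    return N_GLOBAL + offset % (N_WINDOWS * UNIQUE_PER_WINDOW)


print(N_PHYSICAL)
print(physical_index(0, 26), physical_index(1, 10))  # caller's temporary = callee's parameter
```

In this layout the caller writes a parameter into one of its last six registers, and after the window advances the callee finds the same physical register among its first six windowed registers, so nothing has to be moved.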

20
Linear Organization of Register Windows
21
Circular Organization of Register Windows
n-window register file accommodates n-1 procedure
calls
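The circular organization can be modelled as a counter of resident windows plus a count of windows spilled to memory. This is a sketch under assumptions: the class and attribute names are hypothetical, and the n-1 limit is read here as "at most n-1 procedure activations resident at once", because the outgoing parameter registers of the newest window would otherwise overlap the oldest window.

```python
class CircularWindows:
    """Toy model of a circular register-window file with overflow and underflow."""

    def __init__(self, n_windows=8):
        self.n = n_windows
        self.resident = 1    # activations currently held in the register file
        self.spilled = 0     # windows saved to memory
        self.overflows = 0
        self.underflows = 0

    def call(self):
        if self.resident == self.n - 1:
            # File is full: save the oldest window to memory and reuse its registers.
            self.spilled += 1
            self.overflows += 1
        else:
            self.resident += 1

    def ret(self):
        if self.resident == 1:
            if self.spilled == 0:
                raise RuntimeError("return from the outermost procedure")
            # The caller's window was spilled earlier: restore it from memory.
            self.spilled -= 1
            self.underflows += 1
        else:
            self.resident -= 1


w = CircularWindows(8)
for _ in range(6):
    w.call()          # 7 activations now resident, no memory traffic yet
w.call()              # one level deeper forces a spill
print(w.overflows)
```

Memory is touched only when the nesting depth crosses the capacity of the file, which fits the earlier observation that the depth fluctuates within a narrow range.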
22
Code Size
  • Smaller programs
    • A program takes less memory space
    • A smaller program improves performance
      • Fewer instructions
      • Fewer bytes to fetch
      • In a paging environment, it occupies fewer pages and reduces page faults
  • CISC
    • Smaller number of instructions in the program (the program may be shorter, but it does not necessarily occupy less space)

23
Example
CISC
Memory traffic: Instruction 56 bits; Data 32 x 3 = 96 bits;
Total memory bits (MB) used: 56 + 96 = 152 bits
RISC
    LD  Rb, B
    LD  Rc, C
    ADD Ra, Rb, Rc
    ST  Ra, A
Memory traffic: Instruction 112 bits; Data 96 bits;
Total memory bits (MB) used: 112 + 96 = 208 bits
24
Characteristics of RISC (1, 2)
  • (1) One instruction per cycle (memory cycle)
    • Machine cycle = IF + IP + time to fetch the operands from registers + perform the operation + store the result in a register
    • A RISC instruction is comparable to a CISC micro-instruction, so there is no need to microprogram (hardwired control)
  • (2) Register-to-register operation
    • Only simple Load and Store operations access memory (Load/Store architecture)
    • Simplifies the instruction set and the control unit

25
Characteristics of RISC (3, 4)
  • (3) Simple addressing modes: shorten EA (effective address) generation time
    • Almost all instructions use register addressing
    • Relative addressing using PC, BAR, and index address
    • Other complex modes may be synthesized by software
  • (4) Simple instruction format: shorten instruction decoding time
    • Usually one format
    • Fixed length, aligned on word boundaries
    • Fixed field lengths

26
Characteristics of RISC (5)
  • (5) Pipelining (we will learn this later in detail)
  • At this time, you just need to know that
    • Instruction execution hardware can be made of a few interconnected independent sub-modules, called pipeline STAGEs
    • An instruction's execution progresses through each pipeline stage in sequence
    • When an instruction completes its execution at the i-th stage, the next instruction commences its execution at the i-th stage
    • Thus, in the ideal situation, throughput increases nearly n times, where n is the number of pipeline stages
    • Branch instructions make pipelined execution inefficient
27
Laundry Task
  • Laundry Example
  • Ann, Brian, Cathy, Dave each have one load of
    clothes to wash, dry, and fold
  • Washer takes 30 minutes
  • Dryer takes 40 minutes
  • Folder takes 20 minutes

We have 3 different work stages
28
Sequential Laundry
29
Pipelined Laundry
  • Pipelined laundry takes 3.5 hours for 4 loads
  • Maximum of 3 tasks can be carried out concurrently
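The 3.5-hour figure can be checked with a small timing model. This is a sketch: the function names are hypothetical, and it assumes a load may wait between stages for as long as needed.

```python
def sequential_minutes(stage_minutes, loads):
    """Each load goes through every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """Each load enters a stage as soon as both the load and the stage are free."""
    finish = [0] * len(stage_minutes)  # time at which each stage finished its latest load
    for _ in range(loads):
        ready = 0  # time at which the current load leaves the previous stage
        for stage, duration in enumerate(stage_minutes):
            start = max(ready, finish[stage])
            finish[stage] = start + duration
            ready = finish[stage]
    return finish[-1]


STAGES = [30, 40, 20]  # washer, dryer, folder (minutes)
print(sequential_minutes(STAGES, 4) / 60)  # hours for 4 loads, one after another
print(pipelined_minutes(STAGES, 4) / 60)   # hours for 4 loads, pipelined
```

The sequential schedule takes 4 x 90 = 360 minutes (6 hours). The pipelined schedule takes 30 + 4 x 40 + 20 = 210 minutes (3.5 hours), because the 40-minute dryer is the slowest stage and sets the pace.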

30
Pipelined Execution
Figure: execution of one instruction I0 takes t x 4 (the last stage is S3)
Execution of a sequence of instructions
  • I0 completes at 4t
  • I1 completes at 5t
  • I2 completes at 6t
  • I3 completes at 7t
  • I4 completes at 8t
  • n instructions complete at (n+3)t; when n is large this approaches nt, so one instruction completes in every t
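The completion times above follow a simple rule: with k stages of length t, n instructions finish at (n + k - 1)t, which is (n + 3)t for the four-stage case. The sketch below uses hypothetical function names and assumes an ideal pipeline with no branches or stalls.

```python
def pipelined_time(n, stages=4, t=1):
    """Time for n instructions on an ideal pipeline: fill once, then one per t."""
    return (n + stages - 1) * t


def sequential_time(n, stages=4, t=1):
    """Time for n instructions when each one uses all stages before the next starts."""
    return n * stages * t


def speedup(n, stages=4):
    return sequential_time(n, stages) / pipelined_time(n, stages)


for n in (1, 5, 100, 1000):
    print(n, pipelined_time(n), round(speedup(n), 3))
```

The speedup approaches the number of stages as n grows but never reaches it, because the pipeline still has to be filled once.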
31
Pipeline Characteristics
  • Multiple tasks operating simultaneously
  • Pipeline does not help latency of single task,
    but it helps throughput of entire workload
  • Pipeline rate is limited by the slowest pipeline
    stage
  • Unbalanced lengths of pipeline stages reduce
    speedup
  • Potential speedup = number of pipeline stages
  • The time to fill the pipeline and the time to drain it reduce speedup

32
Time Out

33
Berkeley RISC
  • RISC-I and RISC-II
    • A 32-bit processor
    • 31 and 39 instructions, respectively
    • ORW, 138 registers
    • Window: 10 global, 6 temporary, 10 local, 6 parameter
  • Instruction format
    • Cond (flags): C, Z, O, N
    • Rd: destination register
    • Rs1: source register
    • S2
      • Functional instructions: if MSB = 0, then S2 = Rs2 (another source register); if MSB = 1, S2 = imm13 (13-bit immediate data)
      • Transfer or Sequencing instructions: if MSB = 0, EA = Rs1 + Rs2 (index register); if MSB = 1, EA = Rs1 + imm13
      • RISC-II: EA = PC + S2
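The two interpretations of S2 can be written as a small decoder. This is a sketch under stated assumptions, not the documented bit layout: it treats S2 as a 14-bit field whose MSB is the selector, takes the register number from the low 5 bits, and sign-extends imm13. All function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def decode_s2(field14, regs):
    """Return the second operand selected by the S2 field (assumed layout)."""
    selector = (field14 >> 13) & 1      # the MSB chooses the interpretation
    low13 = field14 & 0x1FFF
    if selector == 0:
        return regs[low13 & 0x1F]       # S2 = Rs2: register number in the low 5 bits
    return sign_extend(low13, 13)       # S2 = imm13: 13-bit immediate


def effective_address(rs1, field14, regs):
    """EA = Rs1 + Rs2 (indexed) or EA = Rs1 + imm13, depending on the MSB."""
    return regs[rs1] + decode_s2(field14, regs)


regs = [10 * i for i in range(32)]      # made-up register contents
print(decode_s2(3, regs))                       # MSB = 0: contents of R3
print(decode_s2((1 << 13) | 5, regs))           # MSB = 1: immediate 5
print(effective_address(2, (1 << 13) | 4, regs))  # R2 + 4
```

Using a single selector bit means one instruction format covers both register and immediate operands, which fits the "usually one format" characteristic listed earlier.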
34
RISC-II Instruction Set
  • Functional (C = carry, R = reverse)
    • ADD, ADDC, SUB, SUBC, SUBR, SUBCR, AND, OR, XOR, SLL, SRL, SRA
  • Transfer (X = index, W = word, H = half, B = byte, R = relative, U/S = unsigned/signed)
    (Index: EA = Rs1 + S2 (Rs2); Relative: EA = PC + S2 (Rs2))
    • LDXW, LDXHU(S), LDXBU(S), LDRW, LDRHU(S), LDRBU(S)
    • STXW, STXHU(S), STXBU(S), STRW, STRHU(S), STRBU(S)
  • Sequence Control
    • JMP, JMPR, CALL, CALLR, RET, CALLINT, RETINT, ...

35
Ultimate RISC Instruction Set
  • BN instruction
    • Conditional branch phase in each instruction cycle
    • Does not conform to the RISC philosophy, that is, it makes inefficient use of the instruction pipeline
  • Ultimate RISC instruction set
    • Move the content of the SOURCE (read) to the DESTINATION (write), both within memory
    • 2-address instruction
    • One address fits in a memory (M) word
    • 4-cycle instruction

36
Ultimate RISC Architecture
  • Memory-mapped I/O
  • Memory-mapped ALU
  • PC: 1 special word (address 0)
  • The ALU contains an accumulator (AC) and flags
Memory-Mapped ALU: arithmetic operations use special addresses
  • When the ALU is used as a Destination
    • Store a value in AC
    • Operate on AC
  • When the ALU is used as the Source
    • One address gets the value of AC
    • Other addresses test the condition codes and set the destination address (branch to either one of 2 consecutive addresses)
37
Memory Mapped ALU
An operation is performed by writing an operand into the address associated
with the operation and then reading the result from another address.
38
Condition Codes and Branching
  • Condition codes: True = 2 (binary 10), False = 0 (binary 00)
    • Upon testing a CC, it sets the LSB of the destination address
    • This allows a branch to either one of two consecutive instructions
  • Branch: move a target address to location 0 (PC)
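A move-only machine with a memory-mapped ALU fits in a short simulator. This is a sketch only: address 0 for the PC comes from the slides, but the AC, ADD and SUB addresses, the class name, and the memory layout are hypothetical, and the condition-code test addresses are left out.

```python
PC, AC, ADD, SUB = 0, 1, 2, 3  # special addresses; only PC = 0 is from the slides


class UltimateRisc:
    """Every instruction is two adjoining words: (source address, destination address)."""

    def __init__(self, memory, start):
        self.mem = list(memory)
        self.pc = start
        self.ac = 0

    def read(self, addr):
        if addr == PC:
            return self.pc
        if addr == AC:
            return self.ac            # reading the AC address yields the result
        return self.mem[addr]

    def write(self, addr, value):
        if addr == PC:
            self.pc = value           # moving a target address to the PC is a branch
        elif addr == AC:
            self.ac = value           # store a value in AC
        elif addr == ADD:
            self.ac += value          # operate on AC
        elif addr == SUB:
            self.ac -= value
        else:
            self.mem[addr] = value

    def step(self):
        src, dst = self.mem[self.pc], self.mem[self.pc + 1]
        self.pc += 2
        self.write(dst, self.read(src))

    def run(self, end):
        while self.pc < end:
            self.step()


# Program at addresses 4..9 computes mem[12] = mem[10] + mem[11].
memory = [0, 0, 0, 0,   # addresses 0..3 are reserved for PC and the ALU
          10, AC,       # move mem[10] -> AC
          11, ADD,      # move mem[11] -> ADD (AC += mem[11])
          AC, 12,       # move AC -> mem[12]
          7, 5, 0]      # data at addresses 10, 11, 12
cpu = UltimateRisc(memory, start=4)
cpu.run(end=10)
print(cpu.mem[12])
```

The arithmetic happens as a side effect of where data is moved, so the instruction set needs no opcode field at all.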
39
Instruction Cycle
  • Instruction layout in memory
    • 2 adjoining words per instruction
    • Contiguous storage of instructions
  • Instruction cycle: 4 clean cycles for pipelining
    1. Fetch Source Address and increment PC (IS, read)
    2. Read Source Data (RS, read)
    3. Fetch Destination Address (ID, read)
    4. Write Data to Destination (WD, write)
  • Pipelining with a 4-port memory (3 reads and 1 write)
    Instruction 1: IS1 RS1 ID1 WD1
    Instruction 2:     IS2 RS2 ID2 WD2
    Instruction 3:         IS3 RS3 ID3 WD3
    Instruction 4:             IS4 RS4 ID4 WD4
40
Improvement: 3-Cycle Design
  • Instruction cycle: 3 clean cycles
    1. Fetch Source and Destination Addresses and increment PC (ISD, read)
    2. Read Source Data (RS, read)
    3. Write Data to Destination (WD, write)
  • 3-way pipelining using a 3-port memory (2 read ports and 1 write port)
    Instruction 1: ISD1 RS1  WD1
    Instruction 2:      ISD2 RS2  WD2
    Instruction 3:           ISD3 RS3  WD3
    Instruction 4:                ISD4 RS4  WD4
41
Improvement: 2-Cycle Design
  • Instruction cycle (2 dedicated memory units: 1 instruction, 1 data)
    1. Read Data from Source (RS, read)
    2. Write Data to Destination (WD, write); read instruction (RI, read); increment PC
  • 2-way pipelining
    Instruction 1: WDp, RSp+1, WDp+1
    Instruction 2: RSp+2, WDp+2
    Instruction 3: RSp+3, WDp+3
    Instruction 4: RSp+4, WDp+4