Chapter 13 Reduced Instruction Set Computers (RISC) - PowerPoint PPT Presentation


About This Presentation

Title:

Chapter 13 Reduced Instruction Set Computers (RISC)


Slides: 45

Provided by: adria216

Learn more at: http://faculty.washington.edu


Transcript and Presenter's Notes


1
Chapter 13: Reduced Instruction Set Computers (RISC)

CISC: Complex Instruction Set Computer
RISC: Reduced Instruction Set Computer

2
Some Major Advances in Computers in 50 years

VLSI
The family concept
Microprogrammed control unit
Cache memory
MiniComputers
Microprocessors
Pipelining
PCs
Multiple processors
RISC processors

3
RISC

Reduced Instruction Set Computer
Key features
Large number of general-purpose registers (or use of compiler technology to optimize register use)
Limited and simple instruction set
Emphasis on optimising the instruction pipeline and memory management

4
Comparison of processors
5
Driving force for CISC

Software costs far exceed hardware costs
Increasingly complex high level languages
A semantic gap between HLL (high-level language) and ML (machine language)
This leads to
Large instruction sets
More addressing modes
Hardware implementations of HLL statements
e.g. CASE (switch) on VAX (long, complex structure)

6
Intention of CISC

Ease compiler writing
Improve execution efficiency
Complex operations in microcode
Support more complex HLLs

7
Execution Characteristics Studied

What was studied?
Operations performed
Operands used
Execution sequencing
How was it Studied?
Studies were based on programs written in HLLs
Dynamic studies took measurements during the execution of the program

8
Operations

Assignments
Movement of data
Conditional statements (IF, LOOP)
Sequence control
Observations?
Procedure call-return is very time consuming
Some HLL instructions lead to very many machine
code operations

9
Weighted Relative Dynamic Frequency of HLL Operations (Patterson)
10
Operands

Observations?
Predominately local scalar variables
Implications?
Optimization should concentrate on accessing
local variables

11
Procedure Calls

Observations?
Context switching is quite time consuming
Depends on number of parameters passed
Depends on level of nesting
Most programs do not do a lot of calls followed
by lots of returns
Most variables used are local

12
Implications → Characterize RISC

Best support is provided by optimising
most utilized features and
most time consuming features
Conclusions
Large number of registers
Used for operand referencing
Careful design of pipelines
Address branching - Branch prediction etc.
Simplified instruction set
Reduced length
Reduced number

13
Register File

Software solution
Require compiler to allocate registers
Allocate based on the most-used variables in a given period of time
→ Requires sophisticated program analysis
Hardware solution
Have more registers
→ Thus more variables will be in registers

14
Using Registers for Local Variables

Store local scalar variables in registers
Reduces memory accesses
Every procedure (function) call changes locality
Parameters must be passed
Partial context switch
Results must be returned
Variables from calling program must be restored
Partial Context switch

15
Using Register Windows

Observations
Typically only a few local variables and passed parameters
Typically limited range of depth of calls
Implications
Use multiple small sets of registers
Calls switch to a different set of registers
Returns switch back to a previously used set of
registers
Partition register set

16
Using Register Windows cont.

Partition register set into
Local registers
Parameter registers (Passed Parameters)
Temporary registers (Passing Parameters)
Then
Temporary registers from one set overlap parameter registers from the next
→ This provides parameter passing without moving data (just move one pointer)
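
The overlap described on this slide can be sketched in a few lines of Python. This is a toy model, not any real processor's layout: the class name WindowedRegisterFile, the window sizes, and the read/write/call/ret methods are all assumptions made for illustration. The point it shows is that the caller's temporary registers and the callee's parameter registers are the same physical storage, so a call only moves the window pointer.

```python
class WindowedRegisterFile:
    """Toy register file with overlapping windows (illustrative sizes only)."""

    def __init__(self, n_windows=4, n_params=2, n_locals=3):
        self.n_windows = n_windows
        self.n_params = n_params
        self.n_locals = n_locals
        # Registers owned by one window: its parameters plus its locals.
        self.stride = n_params + n_locals
        # One extra parameter-sized block serves as the last window's temporaries.
        self.regs = [0] * (n_windows * self.stride + n_params)
        self.cwp = 0  # current window pointer

    def _index(self, kind, i):
        base = self.cwp * self.stride
        # Temporaries of window k start where parameters of window k+1 start.
        offsets = {"param": 0, "local": self.n_params, "temp": self.stride}
        return base + offsets[kind] + i

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        if self.cwp + 1 >= self.n_windows:
            raise OverflowError("no free window (spilling is not modelled here)")
        self.cwp += 1  # no data is copied, only the pointer moves

    def ret(self):
        if self.cwp == 0:
            raise RuntimeError("return without a matching call")
        self.cwp -= 1


rf = WindowedRegisterFile()
rf.write("temp", 0, 42)        # caller places an argument in a temporary
rf.call()
argument = rf.read("param", 0)  # callee sees it as its first parameter
rf.write("param", 0, argument + 1)  # callee leaves a result in the same slot
rf.ret()
result = rf.read("temp", 0)     # caller reads the result from its temporary
```

The sketch stops at a fixed number of windows; what happens when the windows run out is the subject of the circular-buffer slides that follow.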

17
Overlapping Register Windows
Picture of Calls and Returns
18
Circular Buffer diagram
19
Operation of Circular Buffer

When a call is made, a current window pointer is moved to show the currently active register window
If all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memory
A saved window pointer indicates where the next saved window should be restored
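
As a rough illustration of the behaviour above, the following Python sketch models the circular buffer with a spill to memory on overflow and a restore on underflow. It is a simplification under stated assumptions: the class WindowBuffer and its fields are hypothetical names, each window is represented by a label rather than real registers, the overlap between adjacent windows is ignored, and the "interrupt" is just an ordinary branch in the code.

```python
class WindowBuffer:
    """Toy circular buffer of register windows with spill/restore."""

    def __init__(self, n_windows=4):
        self.n_windows = n_windows
        self.windows = ["frame0"] + [None] * (n_windows - 1)
        self.cwp = 0            # current window pointer
        self.resident = 1       # windows currently held in registers
        self.depth = 0          # call nesting depth
        self.memory_stack = []  # windows saved to memory, oldest first

    def current(self):
        return self.windows[self.cwp]

    def call(self):
        self.depth += 1
        if self.resident == self.n_windows:
            # Overflow: the oldest resident window sits just after the
            # current one in the ring, so save it to memory.
            oldest = (self.cwp + 1) % self.n_windows
            self.memory_stack.append(self.windows[oldest])
            self.resident -= 1
        self.cwp = (self.cwp + 1) % self.n_windows
        self.windows[self.cwp] = f"frame{self.depth}"
        self.resident += 1

    def ret(self):
        if self.depth == 0:
            raise RuntimeError("return without a matching call")
        self.depth -= 1
        self.windows[self.cwp] = None
        self.cwp = (self.cwp - 1) % self.n_windows
        self.resident -= 1
        if self.resident == 0:
            # Underflow: the caller's window was spilled earlier, restore it.
            self.windows[self.cwp] = self.memory_stack.pop()
            self.resident = 1


buf = WindowBuffer(n_windows=4)
for _ in range(4):
    buf.call()              # the fourth call finds all windows in use
spilled_after_calls = list(buf.memory_stack)
for _ in range(4):
    buf.ret()               # the last return has to restore frame0
```

With four windows, only the fourth nested call touches memory, which matches the slide's observation that call depth usually stays within a limited range.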

20
Global Variables

How should we accommodate Global Variables?
Allocate by the compiler to memory
Have a static set of registers for global
variables
Put them in cache

21
Registers vs Cache: which is better?
22
Referencing a Scalar - Window Based Register File
23
Referencing a Scalar - Cache
24
Compiler Based Register Optimization

Basis
Assuming relatively small number of registers
(16-32)
Optimizing the use is up to compiler
HLL programs have no explicit references to
registers
Process
Assign symbolic or virtual register to each
candidate variable
Map (unlimited) symbolic registers to (limited)
real registers
Symbolic registers that do not overlap can share
real registers
If you run out of real registers some variables
use memory
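
To make "symbolic registers that do not overlap can share real registers" concrete, here is a small Python sketch. It assumes each symbolic register has a single live range given as a (first use, last use) pair of instruction indices; that representation and the function name build_interference are illustrative assumptions, not something taken from the slides.

```python
from itertools import combinations


def build_interference(live_ranges):
    """Return {register: set of registers whose live ranges overlap it}."""
    graph = {name: set() for name in live_ranges}
    for a, b in combinations(live_ranges, 2):
        start_a, end_a = live_ranges[a]
        start_b, end_b = live_ranges[b]
        # Two closed intervals overlap when each starts before the other ends.
        if start_a <= end_b and start_b <= end_a:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical live ranges for three symbolic registers.
ranges = {"v1": (0, 3), "v2": (2, 5), "v3": (4, 7)}
graph = build_interference(ranges)
```

Here v1 and v3 are never live at the same time, so they have no edge between them and could share one real register, while v2 overlaps both and needs its own.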

25
Graph Coloring Algorithm for Reg Assign

Given
A graph of nodes and edges
Nodes are symbolic registers
Two symbolic registers that are live in the same
program fragment are joined by an edge
Then
Assign a color to each node
Adjacent nodes must have different colors
Assign minimum number of colors
And then
Try to color the graph with n colors, where n is
the number of real registers
Nodes that cannot be colored are placed in memory
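
A minimal Python sketch of the "color with n colors, spill the rest" step follows. It uses a simple greedy heuristic (highest-degree node first), which is an assumption for illustration: greedy coloring does not guarantee the minimum number of colors, and production compilers use more refined schemes. The function name assign_registers and the example graph are hypothetical.

```python
def assign_registers(interference, n_registers):
    """Greedy coloring of an interference graph.

    interference maps each symbolic register to the set of symbolic
    registers it is live together with. Returns (assignment, spilled),
    where assignment maps a symbolic register to a real register number
    and spilled lists the symbolic registers left in memory.
    """
    assignment = {}
    spilled = []
    # Heuristic (assumed): handle the most constrained nodes first.
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for node in order:
        used = {assignment[n] for n in interference[node] if n in assignment}
        free = [r for r in range(n_registers) if r not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)  # no color left: this node goes to memory
    return assignment, spilled


# Hypothetical graph: A, B, C are mutually live; D is only live with C.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
three_regs, spilled_three = assign_registers(graph, 3)
two_regs, spilled_two = assign_registers(graph, 2)
```

With three real registers every node gets a color; with two, the triangle A-B-C cannot be colored and one of its nodes ends up in memory.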

26
Graph Coloring Algorithm Example
27
The Debate: Why CISC? (1 of 2)

Compiler simplification?
Dispute
- Complex machine instructions are harder to exploit
- Optimization may actually be more difficult
Smaller programs? (Memory is now cheap)
Programs may use fewer instructions, but
may not occupy less memory;
they just look shorter in symbolic form
More instructions require longer op-codes, more memory references
Register references require fewer bits

28
The Debate: Why CISC? (2 of 2)

Faster programs?
More complex control unit
Microprogram control store larger
→ Thus instructions take longer to execute
Bias towards use of simpler instructions
→ It is far from clear that CISC is the appropriate solution

29
Early RISC Computers

MIPS: Microprocessor without Interlocked Pipeline Stages
Stanford (John Hennessy)
MIPS Technologies
SPARC: Scalable Processor Architecture
Berkeley (David Patterson)
Sun Microsystems
801: IBM Research (George Radin)

30
Concentrating on RISC

Major Characteristics
One instruction per cycle
Register to register operations
Few, simple addressing modes
Few, simple instruction formats
Also
Hardwired design (no microcode)
Fixed instruction format
But
More compile time/effort

31
Breadth of RISC Characteristics
32
Characteristics of Example Processors
33
Memory to memory vs Register to memory Operations
Lab Project 1
34
Controversy: CISC vs RISC

Challenges of comparison
There is no pair of RISC and CISC machines that are directly comparable
There is no definitive set of test programs
It is difficult to separate hardware effects from compiler effects
Most comparisons are done on toy rather than
production machines
Most commercial machines are a mixture

35
Controversy: RISC vs CISC

Not clear cut
Today's designs borrow from both philosophies

36
RISC Pipelining basics

Two phases of execution for register-based instructions
I: Instruction fetch
E: Execute
ALU operation with register input and output
Load and store instructions need three phases
I: Instruction fetch
E: Execute
Calculate memory address
D: Memory
Register-to-memory or memory-to-register operation

37
Effects of RISC Pipelining
(Allows 2 memory accesses per stage)
(E1: register read; E2: execute and register write. Particularly beneficial if the E phase can be longer)
38
Optimization of RISC Pipelining

Delayed branch
Leverages a branch that does not take effect until after execution of the following instruction
This following instruction occupies the delay slot
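
The effect of a delay slot can be sketched with a tiny interpreter in Python. Everything here is a made-up illustration: the three-instruction machine ("addi", "jump", "nop"), the run function, and the example programs are assumptions, and no pipeline timing is modelled. What the sketch does show is the semantic rule: with delayed branching, the instruction after the jump always executes before control transfers.

```python
def run(program, delayed, max_steps=100):
    """Execute a toy program; return (registers, instructions executed)."""
    regs = {}
    pc = 0
    pending = None  # branch target waiting for the delay slot to finish
    executed = 0
    while pc < len(program) and executed < max_steps:
        op = program[pc]
        executed += 1
        next_pc = pc + 1
        if pending is not None:
            # This instruction sits in the delay slot: branch afterwards.
            next_pc, pending = pending, None
        if op[0] == "addi":
            regs[op[1]] = regs.get(op[1], 0) + op[2]
        elif op[0] == "jump":
            if delayed:
                pending = op[1]   # takes effect after the next instruction
            else:
                next_pc = op[1]   # takes effect immediately
        pc = next_pc
    return regs, executed


# Delayed branch with a NOP filling the delay slot.
with_nop = [
    ("addi", "r1", 1),
    ("jump", 4),
    ("nop",),              # delay slot, does no useful work
    ("addi", "r2", 100),   # skipped
    ("addi", "r3", 5),
]

# Same effect, with the independent addi moved into the delay slot.
reordered = [
    ("jump", 3),
    ("addi", "r1", 1),     # delay slot, does useful work
    ("addi", "r2", 100),   # skipped
    ("addi", "r3", 5),
]

regs_nop, count_nop = run(with_nop, delayed=True)
regs_reordered, count_reordered = run(reordered, delayed=True)
```

Both programs leave the same register contents, but the reordered one executes one instruction fewer because the delay slot holds useful work instead of a NOP. Moving an instruction is only safe when the branch does not depend on it, which is the case for the unconditional jump used here.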

39
Normal vs Delayed Branch
40
Example of Delayed Branch (clever!)
What is wrong with this example? Why is there a write-back?
41
More Options for RISC Architectures

RISC Pipelining
Superpipelined: more fine-grained pipeline (more stages in pipeline)
Superscalar: replicates stages of pipeline (multiple pipelines)

42
MIPS R4000 RISC Machine

64 bit architecture (4 Gig address space)
(1 Terabyte of file mapping)
Partitioned into CPU and MMU
32 registers (R0 = 0), but
128K cache: ½ instructions, ½ data
One 32-bit word for each instruction (94 instructions)
All operations are register to register
No condition codes! Flags are stored in general registers for explicit use; this simplifies branch optimization
Only one load/store format: base, offset
Extended addressing is synthesized with multiple instructions
Uses branch prediction
Especially designed for embedded computing
Has multiple FPUs; FP likely stalls pipeline