Title: RISC: Reduced Instruction Set Computing
1 RISC: Reduced Instruction Set Computing
2 Overview
- What is RISC architecture?
- How did RISC evolve?
- How does RISC use instruction pipelining?
- How does RISC use register windowing?
- What is the future of RISC?
3 Early Microprocessors
- Early microprocessors were very simple
- They had a small instruction set
- Gradually, more and more instructions were added
4 CISC: Complex Instruction Set Computing
- May include over 300 instructions
- Approximately a 1:1 relationship with higher-level languages
- Only some of these instructions are used all the time
5 Why are more instructions slower?
- A 16-instruction set uses a 4-to-16 decoder
- If you had a 32-instruction set, you would have to use a 5-to-32 decoder
- The larger the decoder, the longer the propagation delay
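The decoder sizes above follow from the number of opcode bits needed to tell the instructions apart. A minimal sketch, assuming a hypothetical helper named `decoder_size`; it only models input and output counts, not the propagation delay itself:

```python
def decoder_size(num_instructions):
    """Return (input_bits, output_lines) of the smallest decoder that can
    select one of `num_instructions` opcodes."""
    if num_instructions < 1:
        raise ValueError("need at least one instruction")
    # (n - 1).bit_length() is the number of bits needed to count 0 .. n-1.
    input_bits = max(1, (num_instructions - 1).bit_length())
    return input_bits, 2 ** input_bits

print(decoder_size(16))   # (4, 16)
print(decoder_size(32))   # (5, 32)
```

Each doubling of the instruction set adds one input bit and doubles the number of output lines.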
6 Problem with CISC
- The more instructions in the instruction set, the larger the propagation delay
- CISC is too slow
7 Get rid of some of those instructions
- It takes 20 ns to complete each instruction
- If we reduce the instruction set, we can get it down to 18 ns to complete each instruction
- Every instruction we deleted can be replaced by 3 of the simpler remaining instructions
- We choose to eliminate instructions used less than 2% of the time
8 Consider This
- 100% (20c) vs. 98% (18c) + 2% (54c), since each deleted instruction now costs 3 × 18c = 54c
- 20c vs. 17.64c + 1.08c
- 20c > 18.72c
- In this case, reducing instructions is faster
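The comparison above is a weighted average. A minimal sketch of the arithmetic, assuming a hypothetical function name `average_time` and using the slide's numbers (20 for the full set, 18 for the reduced set, 3 replacement instructions per deleted one):

```python
def average_time(reduced_time, fraction_removed, replacement_count=3):
    """Average time per original instruction after trimming the set.

    reduced_time: time of one instruction in the reduced set
    fraction_removed: share of executed instructions that were deleted
    replacement_count: simple instructions needed per deleted instruction
    """
    kept = (1 - fraction_removed) * reduced_time
    replaced = fraction_removed * replacement_count * reduced_time
    return kept + replaced

full_set_time = 20
reduced = average_time(18, 0.02)
print(round(reduced, 2), reduced < full_set_time)   # 18.72 True
```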
9 Don't reduce too much
- Say we eliminate instructions used 10% of the time
- 100% (20c) vs. 90% (18c) + 10% (54c)
- 20c vs. 16.2c + 5.4c
- 20c < 21.6c
- If we reduce our instruction set too much, the end result could be slower
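Between the 2% case (faster) and the 10% case (slower) there is a break-even point. A rough sketch of where it falls, assuming the same numbers as the slides; the function name `break_even_fraction` is hypothetical:

```python
def break_even_fraction(full_time, reduced_time, replacement_count=3):
    """Fraction f of deleted-instruction usage at which both designs tie.

    Solves (1 - f) * reduced + f * k * reduced = full for f.
    """
    return (full_time - reduced_time) / (reduced_time * (replacement_count - 1))

f = break_even_fraction(20, 18)
print(f"{f:.4f}")                                  # 0.0556
# Substituting f back in should give the full-set time of 20.
print(round((1 - f) * 18 + f * 3 * 18, 6))         # 20.0
```

Under these assumptions, removing instructions that account for more than about 5.6% of execution makes the reduced set slower.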
10 RISC: Reduced Instruction Set Architecture
- Fewer than 100 instructions in instruction set
- Fixed Length Instructions
- Limited Loading and Storing instructions
- Fewer Addressing modes
- Instruction Pipeline
- Large number of registers
11 RISC: Reduced Instruction Set Architecture (cont.)
- Hardwired control unit
- Delayed loads and branches
- Speculative Execution of Instructions
- Optimizing compiler
- Separated Instruction and Data Streams
12 RISC vs. CISC
- RISC
  - Faster
  - Less complicated instruction set
  - More difficult to program
- CISC
  - Slower
  - More complicated instruction set
  - Easier to program
13 Ex: Fixed-Length Instructions: Instruction Formats for SPARC CPU
14 SPARC CPU: add r1 ← r2 + r3
10 00001 000000 00010 0 00000000 00011
- 10: format of instruction (op = 2, add)
- 00001: destination register (register 1)
- 000000: add
- 00010: source register (register 2)
- 0 00000000: unused in this instruction
- 00011: source register (register 3)
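The fixed-length layout above can be packed and unpacked with shifts and masks. A minimal sketch; the field names (`op`, `rd`, `op3`, `rs1`, `i`, `unused`, `rs2`) are labels chosen for this example, and the widths are the ones listed on the slide:

```python
# Field widths, most significant first: 2 + 5 + 6 + 5 + 1 + 8 + 5 = 32 bits.
FIELDS = [("op", 2), ("rd", 5), ("op3", 6), ("rs1", 5),
          ("i", 1), ("unused", 8), ("rs2", 5)]

def encode(values):
    """Pack a dict of field values into one 32-bit instruction word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def decode(word):
    """Split a 32-bit instruction word back into its fields."""
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields

# add r1 <- r2 + r3
add_r1_r2_r3 = encode({"op": 2, "rd": 1, "op3": 0, "rs1": 2, "rs2": 3})
print(f"{add_r1_r2_r3:032b}")   # 10000010000000001000000000000011
print(decode(add_r1_r2_r3)["rd"])   # 1
```

Because every instruction is the same length and the fields sit at fixed positions, decoding needs no lookahead.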
15 Pipelines
16 Assembly Lines and Pipelines
Why are assembly lines cool?
- Work on more than one item at a time
- Finish more items faster
17 Instruction Pipelines
- Very similar to assembly lines in manufacturing
- Divides the execution of a task into several stages
- Then it can work on more than one task at a time
- Overall, faster and more efficient
18 Pipeline example: 3 stages
- Stage 1: Fetch instruction
- Stage 2: Decode instruction, select registers
- Stage 3: Execute instruction, store result
Each stage must be completed in 1 clock cycle for this to work
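The overlap between stages can be sketched as a small schedule generator. This assumes an ideal pipeline with no conflicts, one cycle per stage, and one new instruction entering per cycle; the function name `schedule` is hypothetical:

```python
STAGES = ["Fetch", "Decode", "Execute"]

def schedule(num_instructions, stages=STAGES):
    """Return {cycle: [(instruction, stage), ...]} for an ideal pipeline."""
    timeline = {}
    for instr in range(1, num_instructions + 1):
        for offset, stage in enumerate(stages):
            cycle = instr + offset          # instruction k enters at cycle k
            timeline.setdefault(cycle, []).append((instr, stage))
    return timeline

timeline = schedule(3)
for cycle in sorted(timeline):
    work = ", ".join(f"{stage} {instr}" for instr, stage in timeline[cycle])
    print(f"t{cycle}: {work}")
print(max(timeline))   # 5 cycles, versus 9 if the instructions did not overlap
```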
19 Example 1
r1 ← r2 + r3
r4 ← r5 + r6
r7 ← r8 + r9
20 Timing diagram for Example 1
Instructions: r1 ← r2 + r3; r4 ← r5 + r6; r7 ← r8 + r9

               t1       t2       t3       t4       t5
Instruction 1  Fetch    Decode   Execute
Instruction 2           Fetch    Decode   Execute
Instruction 3                    Fetch    Decode   Execute

Decode = decode instruction, select registers. Execute = execute instruction, store results.
- Instruction 2: 10 00100 000000 00101 0 00000000 00110. Decode: add r5, r6 with r5 = 5, r6 = 6. Execute: 5 + 6 = 11, r4 ← 11
- Instruction 3: 10 00111 000000 01000 0 00000000 01001. Decode: add r8, r9 with r8 = 8, r9 = 9. Execute: 8 + 9 = 17, r7 ← 17
21 Consider a more problematic example
r1 ← r2 + r3
r4 ← r1 + r3
r5 ← r6 + r3
22 Timing diagram for the problematic example
Instructions: r1 ← r2 + r3; r4 ← r1 + r3; r5 ← r6 + r3

               t1       t2       t3       t4       t5
Instruction 1  Fetch    Decode   Execute
Instruction 2           Fetch    Decode   Execute
Instruction 3                    Fetch    Decode   Execute

Decode = decode instruction, select registers. Execute = execute instruction, store results.
- Instruction 2: 10 00100 000000 00001 0 00000000 00011. Decode (t3): add r1, r3 reads r1 = 1, r3 = 3. Execute (t4): 3 + 1 = 4, r4 ← 4
- Problem: data conflict. Since t3 is not yet completed when instruction 2 selects its registers, r1 contains the wrong value
- Instruction 3: 10 00101 000000 00110 0 00000000 00011. Decode (t4): add r6, r3 reads r6 = 6, r3 = 3. Execute (t5): 6 + 3 = 9, r5 ← 9
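A data conflict of this kind can be spotted by comparing each instruction's source registers with the destination of the one before it. A minimal sketch, assuming instructions are written as `(dest, src1, src2)` tuples and that, as in this 3-stage pipeline, only the immediately preceding instruction can conflict; the function name is hypothetical:

```python
def find_data_conflicts(program):
    """Return the indices of instructions that read a register written by
    the instruction directly before them (read-after-write conflict)."""
    conflicts = []
    for i in range(1, len(program)):
        prev_dest = program[i - 1][0]
        if prev_dest in program[i][1:]:
            conflicts.append(i)
    return conflicts

example_1 = [("r1", "r2", "r3"), ("r4", "r5", "r6"), ("r7", "r8", "r9")]
example_2 = [("r1", "r2", "r3"), ("r4", "r1", "r3"), ("r5", "r6", "r3")]
print(find_data_conflicts(example_1))   # []
print(find_data_conflicts(example_2))   # [1]
```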
23 Solutions to Data Conflict
- No-op insertions
- Instruction reordering
- Stall insertions
- Data forwarding
24 Solution 1: Add No-Op
Instructions: r1 ← r2 + r3; r4 ← r1 + r3; r5 ← r6 + r3
- Instruction 1: fetch (t1), decode and select registers (t2), execute and store results (t3)
- Instruction 2: fetch (t2), no-op (t3), decode and select registers (t4), execute and store results (t5)
- Instruction 3: fetch, decode and select registers, execute and store results, following instruction 2
- Instruction 2: 10 00100 000000 00001 0 00000000 00011. Decode: add r1, r3 with r1 = 5, r3 = 3. Execute: 3 + 5 = 8, r4 ← 8
- Instruction 3: 10 00101 000000 00110 0 00000000 00011. Decode: add r6, r3 with r6 = 6, r3 = 3. Execute: 6 + 3 = 9, r5 ← 9
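No-op insertion can be sketched as a pass over the program that adds an empty slot wherever an instruction reads the result of the one just before it. This is an illustration only: instructions are assumed to be `(dest, src1, src2)` tuples, `NOP` is a placeholder invented for the sketch, and one no-op per conflict is assumed to be enough for the 3-stage pipeline:

```python
NOP = None  # placeholder for a no-op slot (an assumption of this sketch)

def insert_noops(program):
    """Return a copy of `program` with a no-op before each conflicting read."""
    result = []
    for instr in program:
        prev = result[-1] if result else NOP
        if prev is not NOP and prev[0] in instr[1:]:
            result.append(NOP)          # give the previous result time to be stored
        result.append(instr)
    return result

program = [("r1", "r2", "r3"), ("r4", "r1", "r3"), ("r5", "r6", "r3")]
for slot in insert_noops(program):
    print("no-op" if slot is NOP else slot)
```

The padded program is one slot longer, which is why this solution costs time.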
25 Possible problems with no-op
26 Solution 2: Instruction Reordering
Reordered instructions: r1 ← r2 + r3; r5 ← r6 + r3; r4 ← r1 + r3

               t1       t2       t3       t4       t5
Instruction 1  Fetch    Decode   Execute
Instruction 2           Fetch    Decode   Execute
Instruction 3                    Fetch    Decode   Execute

Decode = decode instruction, select registers. Execute = execute instruction, store results.
- Instruction 2: 10 00101 000000 00110 0 00000000 00011. Decode: add r6, r3 with r6 = 6, r3 = 3. Execute: 6 + 3 = 9, r5 ← 9
- Instruction 3: 10 00100 000000 00001 0 00000000 00011. Decode: add r1, r3 with r1 = 5, r3 = 3. Execute: 5 + 3 = 8, r4 ← 8
27 Possible problems with re-ordering
- It is not possible to reorder every set of operations successfully
- Consider: r1 ← r1 + r2; r1 ← r1 + r3; r1 ← r1 + r4
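Whether a reordering exists can be checked by brute force for small programs. The sketch below is not how a real optimizing compiler works; it simply tries every ordering, keeps only those that preserve all register dependencies, and rejects orderings where an instruction reads the result of the one directly before it. Instructions are assumed to be `(dest, src1, src2)` tuples and all names are hypothetical:

```python
from itertools import permutations

def depends(earlier, later):
    """True if `later` must stay after `earlier`."""
    e_dest, *e_src = earlier
    l_dest, *l_src = later
    return (e_dest in l_src        # later reads what earlier writes
            or l_dest in e_src     # later overwrites what earlier reads
            or e_dest == l_dest)   # both write the same register

def reorder(program):
    """Return a conflict-free ordering that keeps every dependency, or None."""
    n = len(program)
    for order in permutations(range(n)):
        position = {idx: pos for pos, idx in enumerate(order)}
        keeps_deps = all(
            position[i] < position[j]
            for i in range(n) for j in range(i + 1, n)
            if depends(program[i], program[j])
        )
        no_adjacent_conflict = all(
            program[a][0] not in program[b][1:]
            for a, b in zip(order, order[1:])
        )
        if keeps_deps and no_adjacent_conflict:
            return [program[i] for i in order]
    return None

fixable = [("r1", "r2", "r3"), ("r4", "r1", "r3"), ("r5", "r6", "r3")]
chain = [("r1", "r1", "r2"), ("r1", "r1", "r3"), ("r1", "r1", "r4")]
print(reorder(fixable))   # [('r1', 'r2', 'r3'), ('r5', 'r6', 'r3'), ('r4', 'r1', 'r3')]
print(reorder(chain))     # None
```

The chained example has no valid reordering because every instruction depends on the one before it.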
28 Solution 3: Add Stall Insertion
Instructions: r1 ← r2 + r3; r4 ← r1 + r3; r5 ← r6 + r3
- Instruction 1: fetch (t1), decode and select registers (t2), execute and store results (t3)
- Instruction 2: fetch (t2), stall (t3), decode and select registers (t4), execute and store results (t5)
- Instruction 3: fetch, decode and select registers, execute and store results, following instruction 2
- Instruction 2: 10 00100 000000 00001 0 00000000 00011. Decode: add r1, r3 with r1 = 5, r3 = 3. Execute: 3 + 5 = 8, r4 ← 8
- Instruction 3: 10 00101 000000 00110 0 00000000 00011. Decode: add r6, r3 with r6 = 6, r3 = 3. Execute: 6 + 3 = 9, r5 ← 9
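The cost of stalling can be estimated by counting one lost cycle per conflict. A rough sketch, assuming `(dest, src1, src2)` tuples, a 3-stage pipeline, and exactly one stall cycle for each instruction that reads the result of its predecessor; the function name is hypothetical:

```python
def cycles_with_stalls(program, num_stages=3):
    """Total cycles to run `program` when each conflict costs one stall."""
    stalls = sum(
        1 for prev, cur in zip(program, program[1:]) if prev[0] in cur[1:]
    )
    # An ideal pipeline needs n + stages - 1 cycles; each stall adds one more.
    return len(program) + num_stages - 1 + stalls

no_conflict = [("r1", "r2", "r3"), ("r4", "r5", "r6"), ("r7", "r8", "r9")]
conflict = [("r1", "r2", "r3"), ("r4", "r1", "r3"), ("r5", "r6", "r3")]
print(cycles_with_stalls(no_conflict))   # 5
print(cycles_with_stalls(conflict))      # 6
```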
29 Solution 4: Data Forwarding
Instructions: r1 ← r2 + r3; r4 ← r1 + r3; r5 ← r6 + r3

               t1       t2       t3       t4       t5
Instruction 1  Fetch    Decode   Execute
Instruction 2           Fetch    Decode   Execute
Instruction 3                    Fetch    Decode   Execute

Decode = decode instruction, select registers. Execute = execute instruction, store results.
- Instruction 1: executes in t3. Data is passed within the same time cycle to the next instruction
- Instruction 2: 10 00100 000000 00001 0 00000000 00011. Decode (t3): add r1, r3 with r1 = 5 (forwarded), r3 = 3. Execute (t4): 3 + 5 = 8, r4 ← 8
- Instruction 3: 10 00101 000000 00110 0 00000000 00011. Decode (t4): add r6, r3 with r6 = 6, r3 = 3. Execute (t5): 6 + 3 = 9, r5 ← 9
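The effect of forwarding can be sketched with a small model in which each instruction selects its registers before the previous instruction's result has been stored. This is an illustration under assumptions: instructions are `(dest, src1, src2)` add operations, the function name `run` is hypothetical, and the starting value r2 = 2 is inferred from the slides (r1 becomes 5 when r3 = 3):

```python
def run(program, registers, forwarding):
    """Run add instructions on a model of the 3-stage pipeline.

    Without forwarding, an instruction sees the register file as it was
    before the previous instruction stored its result. With forwarding,
    the pending result is handed directly to the next instruction.
    """
    regs = dict(registers)
    pending = None                          # (dest, value) from the previous instruction
    for dest, src1, src2 in program:
        operands = []
        for src in (src1, src2):
            if forwarding and pending and pending[0] == src:
                operands.append(pending[1])     # take the forwarded value
            else:
                operands.append(regs[src])      # read the (possibly stale) register
        if pending:
            regs[pending[0]] = pending[1]       # previous result is now stored
        pending = (dest, operands[0] + operands[1])
    if pending:
        regs[pending[0]] = pending[1]
    return regs

start = {"r1": 1, "r2": 2, "r3": 3, "r4": 0, "r5": 0, "r6": 6}
program = [("r1", "r2", "r3"), ("r4", "r1", "r3"), ("r5", "r6", "r3")]
print(run(program, start, forwarding=False)["r4"])   # 4 (stale r1 = 1 was used)
print(run(program, start, forwarding=True)["r4"])    # 8 (forwarded r1 = 5 was used)
```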
30 Solutions to Data Conflict
- No-op insertions: slow and wasteful
- Stall insertions: slow and wasteful
- Instruction reordering: not always possible
- Data forwarding
31 Register Windowing
32 Each window overlaps with the next
- Main method would be window 1
- Subroutine is window 2
- Since they overlap, window 2 can return values to window 1 easily
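Overlapping windows can be modeled as views onto one shared register file. The sketch below is a simplified illustration: the class name, the window size of 8, and the overlap of 2 are assumptions made for the example, not the sizes used by any particular CPU. Windows are indexed from 0 here, so the slide's window 1 is index 0:

```python
class RegisterWindows:
    def __init__(self, num_windows=4, window_size=8, overlap=2):
        self.num_windows = num_windows
        self.window_size = window_size
        self.step = window_size - overlap       # distance between window starts
        self.regs = [0] * (self.step * (num_windows - 1) + window_size)
        self.current = 0                        # index of the active window

    def _physical(self, index):
        if not 0 <= index < self.window_size:
            raise IndexError("register index outside window")
        return self.current * self.step + index

    def read(self, index):
        return self.regs[self._physical(index)]

    def write(self, index, value):
        self.regs[self._physical(index)] = value

    def call(self):
        """Enter a subroutine: slide to the next window."""
        if self.current + 1 >= self.num_windows:
            raise OverflowError("out of register windows")
        self.current += 1

    def ret(self):
        """Return from a subroutine: slide back to the caller's window."""
        if self.current == 0:
            raise RuntimeError("no caller window to return to")
        self.current -= 1

rw = RegisterWindows()
rw.write(7, 20)            # main places an argument in the shared region
rw.call()                  # the subroutine gets the next window
arg = rw.read(1)           # same physical register as main's register 7
rw.write(0, arg + 1)       # result goes into the shared region
rw.ret()
print(rw.read(6))          # 21
```

No values are copied on call or return; the overlap is what lets the two windows exchange data.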
33 Summary
- RISC architecture defined
- Benefits and drawbacks of RISC architecture
- Pipelines
- Problems with pipelines
- Register Windowing
34 Future of RISC
- Hotly debated
- CISC is still easier to support
- Provides backward compatibility
- RISC is faster
- More than likely, we will see a convergence of the two systems
- Ex: Pentium processor