RISC, CISC, Limitations - PowerPoint PPT Presentation

Title:

RISC, CISC, Limitations

Slides: 29
Provided by: bryand1

Transcript and Presenter's Notes

Title: RISC, CISC, Limitations


1
RISC, CISC, Limitations & Solutions
  • Bryan Duggan

2
Overview
  • RISC vs CISC
  • Characteristics of RISC & CISC
  • Advantages & Disadvantages of RISC & CISC
  • Limitations of the Von Neumann Architecture
  • Solutions
  • Pipelining
  • Speculative Execution
  • Branch Prediction
  • Multi-processor Systems

3
Evolution of Instruction Sets
  • Single Accumulator (EDSAC, 1950)
  • Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
  • Separation of Programming Model from Implementation
  • High-level Language Based (B5000, 1963)
  • Concept of a Family (IBM 360, 1964)
  • General Purpose Register Machines
  • Complex Instruction Sets (Vax, Intel 432, 1977-80)
  • Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
  • RISC (Mips, Sparc, HP-PA, IBM RS6000, ... 1987)

4
RISC/CISC
  • Complex Instruction Set Computer
  • Intel x86
  • DEC VAX, PDP11
  • Motorola 68k
  • IBM 360, 370
  • Complex instructions bring the hardware closer to
    high-level languages
  • Memory was expensive
  • Fewer, more powerful instructions
  • Smaller programs
  • More space for data

5
CISC - ISA
  • Instruction Set Architecture
  • Addressing Modes
  • Additional Instructions
  • Procedure and function call
  • Procedure call overhead is significant
  • Registers (state of the processor) must be saved
    and restored (Motorola 68k: MOVEM; x86: PUSH, POP)
  • Array Indexing
  • y xijk VAX
  • Math functions
  • sqrt, sin, log, ... (Intel x86: 8087; Motorola 68k?)
  • Yet more instructions!
  • Graphics support
  • MMX

6
CISC - ISA
  • Instruction count
  • Usually close to 256
  • The maximum number of 8-bit opcodes!
  • Powerful instructions
  • Many microcode steps
  • Multiple cycle latency
  • Faster in microcode than in the user's program
  • Added some complexity to interrupt handling,
    page faulting, etc
  • Instructions too long to be uninterruptible!
  • Variable length, multiple formats
  • 1 to 17 bytes

7
CISC - ISA critique
  • Studies of compilers showed
  • Many instructions unused
  • DEC even dropped an indexed memory access with
    post-decrement, y = x[i--], from the ISA going
    from PDP -> VAX
  • Compiler writers were sometimes simply not using
    complex instructions when they were appropriate
  • because they could write faster sequences of
    simple instructions for the most common cases
  • Operand constants
  • -15 to 15: 56%
  • -511 to 511: 98%
  • 12 words of storage for subroutines: 95%

8
CISC
  • Irrespective of its performance ...
  • Complex hardware is expensive
  • Speed improvements
  • Irregular (long design times)
  • Long lead times to market
  • Instruction set and chip hardware become more
    complex with each generation of computers.
  • Number of control words and number of clock
    cycles vary between instructions. Difficult to
    implement instruction pipelining.

9
80x86
  • 1978: The Intel 8086 is announced (16-bit
    architecture)
  • 1980: The 8087 floating point coprocessor is
    added
  • 1982: The 80286 increases address space to 24
    bits and adds instructions
  • 1985: The 80386 extends to 32 bits and adds new
    addressing modes
  • 1989-1995: The 80486, Pentium, and Pentium Pro add a
    few instructions (mostly designed for higher
    performance)
  • 1997: MMX is added
  • This history illustrates the impact of the
    "golden handcuffs" of compatibility

10
RISC Characteristics
  • No universally accepted definition
  • Most of the following
  • Instructions are conceptually simple
  • Instructions are of a uniform length
  • Instructions use one (or very few) instruction
    formats
  • Instruction set is orthogonal
  • Little overlapping of instruction functionality
  • Instructions use very few addressing modes
  • Architecture is a load-and-store architecture
  • Only LOAD and STORE instructions reference memory
  • All operate instructions are register-to-register
  • The ISA supports few data types

11
RISC Characteristics (Cont'd)
  • Other possible attributes
  • Almost all instructions execute in one clock
    cycle
  • Implementation detail
  • Architecture takes advantage of strengths of
    software
  • All reasonable architectures do
  • Architecture should have many registers
  • Not part of RISC
  • Useful, however, for speeding up CPU

12
Reduced Instruction Set Computer
  • No memory-memory instructions
  • Data loaded to registers
  • lw $3, 0($2)
  • Data stored from registers
  • st $4, 40($5)
  • Arithmetic, logical, etc. operations are all
  • Register -> Register
  • Mostly 3-operand type
  • op dest_reg, src_regA, src_regB
  • Mostly 1-cycle in ALU
  • Throughput 1 instruction/cycle
  • Register Windows

13
RISC
  • Simplicity of RISC instructions
  • permits high clock rates
  • long-latency ALU instructions are divided further
    as necessary
  • MIPS R4000 8-stage pipeline
  • All instructions 32-bits
  • Simplifies fetch

14
RISC - Simple Hardware, Complex Compiler
  • Basic hardware is simple
  • and hard-wired
  • i.e. no microcode
  • but
  • Pipeline stalls can reduce throughput
  • Optimising Compiler needed
  • Fully exploit capabilities
  • Dependence Analysis
  • Instruction re-ordering
  • Avoid pipeline stalls

15
RISC Disadvantages
  • A more sophisticated compiler is required.
  • A sequence of RISC instructions is needed to
    implement complex instructions.
  • Require very fast memory systems to feed them
    instructions.
  • Performance of a RISC application depends
    critically on the quality of the code generated
    by the compiler.

16
Von Neumann Limitation
17
Pipelining
  • Laundry Example
  • Ann, Brian, Cathy, Dave each have one load of
    clothes to wash, dry, and fold
  • Washer takes 30 minutes
  • Dryer takes 40 minutes
  • Folder takes 20 minutes
  • How long to do laundry?
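
The question can be worked through with a short simulation. The stage times come from the slide; the function names and the assumption that a finished load can wait between stages without blocking the previous machine are choices made for this sketch.

```python
STAGES = [30, 40, 20]  # washer, dryer, folder (minutes), from the slide
LOADS = 4              # Ann, Brian, Cathy, Dave


def sequential_minutes(loads, stages):
    """Each load finishes every stage before the next load starts."""
    return loads * sum(stages)


def pipelined_minutes(loads, stages):
    """Each stage handles one load at a time; a load enters a stage as soon
    as it has left the previous stage and that stage is free."""
    stage_free_at = [0] * len(stages)  # time at which each stage becomes idle
    finish = 0
    for _ in range(loads):
        ready = 0  # time at which this load is ready for its next stage
        for s, duration in enumerate(stages):
            start = max(ready, stage_free_at[s])
            ready = start + duration
            stage_free_at[s] = ready
        finish = ready
    return finish


print(sequential_minutes(LOADS, STAGES))  # 360 minutes = 6 hours
print(pipelined_minutes(LOADS, STAGES))   # 210 minutes = 3.5 hours
```

The pipelined total is 30 + 4 x 40 + 20 minutes: once the pipeline is full, a load completes every 40 minutes, the time of the slowest stage (the dryer).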

18
(No Transcript)
19
(No Transcript)
20
Pipelining Lessons
  • Pipelining doesn't help the latency of a single task;
    it helps the throughput of the entire workload
  • Multiple tasks operate simultaneously using
    different resources
  • Potential speedup = number of pipe stages
  • Pipeline rate is limited by the slowest pipeline stage
  • Unbalanced lengths of pipe stages reduce speedup
  • Time to fill the pipeline and time to drain it
    reduce speedup
  • Stall for Dependences
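
The fill and drain effect in the lessons above can be made concrete with a short calculation. It assumes an idealised pipeline with perfectly balanced stages and no stalls; the function name is hypothetical.

```python
def pipeline_speedup(tasks: int, stages: int) -> float:
    """Speedup of an ideal, balanced pipeline over unpipelined execution.

    Unpipelined: every task takes `stages` time units, one after another.
    Pipelined:   the first task takes `stages` units to fill the pipeline,
                 then one task completes per unit.
    """
    unpipelined = tasks * stages
    pipelined = stages + (tasks - 1)
    return unpipelined / pipelined


for n in (1, 4, 100, 10_000):
    print(n, round(pipeline_speedup(n, stages=5), 3))
```

For a single task the speedup is 1 (latency is unchanged), and it approaches the number of stages only as the number of tasks grows large.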

[Figure: laundry timeline from 6 PM to 9 PM; axes: Time, Task Order]
21
How does it work?
[Figure: pipeline diagram showing instructions i, i+1 and i+2, each with stages IF, OF, E, OS]
IF = Instruction Fetch, OF = Operand Fetch, E = Execute, OS = Operand Store
22
Pipeline Bubble
[Figure: pipeline diagram of instructions i-1, i (BRA N), i+1, i+2, N and N+1, each with stages IF, OF, E, OS, showing a BUBBLE after the branch]
Solution: Always put an instruction after a
branch, even if it is a NOOP
23
Data Dependency
  • Consider
  • ADD A, B, Temp1
    SUB Temp1, C, Temp2
    AND Temp1, Temp2, X
  • Generates bubbles
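
The dependences in this three-instruction sequence can be found mechanically. The sketch below assumes the slide's operand order (two sources, then the destination) and uses a hypothetical `raw_dependences` helper; it reports every read-after-write pair, which is what forces the pipeline to wait.

```python
program = [
    ("ADD", "A", "B", "Temp1"),
    ("SUB", "Temp1", "C", "Temp2"),
    ("AND", "Temp1", "Temp2", "X"),
]


def raw_dependences(program):
    """Return (producer_index, consumer_index, name) for each read-after-write pair."""
    last_writer = {}  # operand name -> index of the most recent instruction writing it
    deps = []
    for i, (_op, src1, src2, dest) in enumerate(program):
        for src in (src1, src2):
            if src in last_writer:
                deps.append((last_writer[src], i, src))
        last_writer[dest] = i
    return deps


for producer, consumer, name in raw_dependences(program):
    print(f"instruction {consumer} needs {name} from instruction {producer}")
```

SUB cannot fetch Temp1 until ADD has stored it, and AND needs the results of both earlier instructions, so each instruction waits on its predecessor.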

24
Branch Prediction
  • Predicting the outcome of a branch
  • Conditional/Unconditional
  • Direction
  • Taken / Not Taken
  • Direction predictors
  • Target Address
  • PC + offset (Taken) / PC + 4 (Not Taken)
  • Why do we need branch prediction?
  • Increases the number of instructions available
    for the scheduler to issue. Increases
    instruction level parallelism (ILP)
  • Allows useful work to be completed while waiting
    for the branch to resolve

25
Branch Prediction Strategies
  • Static
  • Decided before runtime
  • Examples
  • Always-Not Taken
  • Always-Taken
  • Backwards Taken, Forward Not Taken (BTFNT)
  • Profile-driven prediction
  • Dynamic
  • Prediction decisions may change during the
    execution of the program
  • AMD Athlon K7
  • 10-stage integer, 15-stage fp pipeline, predictor
    accessed in fetch
  • Branch Penalties
  • Correctly predicted taken branch: 1 cycle
  • Mispredict penalty: at least 10 cycles
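
To compare the static and dynamic strategies, here is a minimal sketch. The slides do not say which dynamic scheme is meant; a 2-bit saturating counter is used here as one common example, and the branch history is invented for illustration.

```python
def static_correct(outcomes, always_taken):
    """Correct predictions for Always-Taken or Always-Not-Taken."""
    return sum(taken == always_taken for taken in outcomes)


def two_bit_correct(outcomes, state=2):
    """Correct predictions for a 2-bit saturating counter.

    States 0 and 1 predict not taken, states 2 and 3 predict taken.
    The counter moves one step towards each actual outcome.
    """
    correct = 0
    for taken in outcomes:
        if (state >= 2) == taken:
            correct += 1
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# Hypothetical branch: taken 10 times, then not taken 10 times.
history = [True] * 10 + [False] * 10
for name, correct in [
    ("always taken", static_correct(history, True)),
    ("always not taken", static_correct(history, False)),
    ("2-bit counter", two_bit_correct(history)),
]:
    mispredicts = len(history) - correct
    # The slide gives a penalty of at least 10 cycles per mispredict.
    print(name, correct, "correct, at least", mispredicts * 10, "cycles lost")
```

Each static scheme gets only half of this history right, while the counter mispredicts twice when the behaviour changes and then follows the new direction.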

26
Speculative Execution
  • Speculative execution increases parallelism by
    fetching, issuing, and completing instructions
    even in the presence of unresolved conditional
    branches and possible exceptions.

27
Multi Processors
  • MIMD
  • Multiple Instruction stream, Multiple Data stream
  • MIMD is often SPMD (Single Program Multiple Data)
  • Processors independently execute programs
  • Interactions between processors are costly
  • Flexible, can do this here, that there
  • Debugging is complicated by races and lack of
    repeatability
  • SMP is Symmetric MultiProcessor
  • Multiple processors as interchangeable peers
  • SMP usually implies
  • MIMD execution mode
  • Shared memory
  • Some problems are inherently sequential
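
The cost of the inherently sequential part can be estimated with Amdahl's law, which the slides do not name but which is the usual way to quantify this point. The function name and the example fractions are assumptions for the sketch, and communication costs between processors are ignored.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel.

    speedup = 1 / ((1 - p) + p / n)
    where p is the parallelisable fraction and n the number of processors.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


for n in (2, 4, 16, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

With 10% of the work sequential, the speedup can never exceed 10 however many processors are added.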

28
Summary
  • RISC
  • CISC
  • Limitations of Von Neumann
  • Pipelining
  • Branch prediction
  • Speculative Execution
  • Multi-processor systems