CSC 159 COMPUTER ORGANIZATION
Diploma in Computer Science CS110
1
CSC 159 COMPUTER ORGANIZATION
Diploma in Computer Science CS110
  • Chapter 5
  • Modern Computer System

2
  • Pipeline
  • Superscalar
  • Cache: associative, direct mapped
  • Virtual Memory
  • Multiprocessing
  • RISC

3
Subject
  • You see a gorgeous girl at a party. You go up to her and say,
    "I am very rich. Marry me!" That's Direct Marketing.
  • You're at a party with a bunch of friends and see a gorgeous
    girl. One of your friends goes up to her and, pointing at you,
    says, "He's very rich. Marry him." That's Advertising.
  • You see a gorgeous girl at a party. You go up to her and get
    her telephone number. The next day you call and say, "Hi, I'm
    very rich. Marry me." That's Telemarketing.
  • You're at a party and see a gorgeous girl. You get up,
    straighten your tie, walk up to her, and pour her a drink. You
    open the door for her, pick up her bag after she drops it,
    offer her a ride, and then say, "By the way, I'm very rich.
    Will you marry me?" That's Public Relations.
  • You're at a party and see a gorgeous girl. She walks up to you
    and says, "You are very rich. I want to marry you." That's
    Brand Recognition.
  • You see a gorgeous girl at a party. You go up to her and say,
    "I'm rich. Marry me!" She gives you a nice hard slap on your
    face. That's Customer Feedback.

4
  • Instruction execution is complex and involves several
    operations that are executed successively (see slide 2). This
    implies a large amount of hardware, but only one part of this
    hardware works at any given moment.
  • Pipelining is an implementation technique whereby multiple
    instructions are overlapped in execution. This is achieved
    without additional hardware, but by letting different parts of
    the hardware work on different instructions at the same time.
  • The pipeline organization of a CPU is similar to an assembly
    line: the work to be done for an instruction is broken into
    smaller steps (pieces), each of which takes a fraction of the
    time needed to complete the entire instruction. Each of these
    steps is a pipe stage (or a pipe segment).
  • Pipe stages are connected to form a pipe.
  • The time required for moving an instruction from one stage to
    the next is a machine cycle (often this is one clock cycle).
    The execution of one instruction takes several machine cycles
    as it passes through the pipeline.

5
Pipelining
  • Pipelining: the laundry analogy. Suppose you had to do 4 loads
    of laundry and each stage takes 30 minutes.
  • Which is faster, doing it sequentially or pipelined?
  • ExeTime_seq = 4 loads x 4 stages x 30 min = 480 minutes
  • ExeTime_pipe = 4 x 30 + 3 x 30 = 120 + 90 = 210 minutes
  • How much faster?
  • Pipelined is about 2.3x (480/210) faster than sequential.

6
Pipelining: so what's the best to expect?
  • Suppose you had 1000 loads to do!
  • ExeTime_seq = 1000 x 4 x 30 = 120,000 minutes
  • ExeTime_pipe = 1000 x 30 + 3 x 30 = 30,090 minutes
  • How much faster is this case?
  • Perf_ratio = 120,000 min / 30,090 min ≈ 3.99
  • Here pipelined is almost 4x faster than sequential.
  • So, as the number of loads increases, the speedup approaches
    the number of stages in the pipeline.
  • So the more stages, the more concurrency, hence better
    throughput.
  • BUT no change in execution time per load! (See the calculation
    sketch below.)
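To double-check these numbers: for n loads on a k-stage pipeline with stage time t, the sequential time is n x k x t and the pipelined time is (k + n - 1) x t. The short Python sketch below is only an illustration (not part of the original slides) that evaluates both cases from the slides:

```python
def pipeline_times(n_loads, n_stages=4, stage_min=30):
    """Return (sequential, pipelined) laundry time in minutes."""
    t_seq = n_loads * n_stages * stage_min          # one load at a time
    t_pipe = (n_stages + n_loads - 1) * stage_min   # fill the pipe, then one
    return t_seq, t_pipe                            # load finishes per stage time

for n in (4, 1000):
    seq, pipe = pipeline_times(n)
    print(f"{n:5d} loads: seq = {seq:6d} min, pipe = {pipe:6d} min, "
          f"speedup = {seq / pipe:.2f}")
# 4 loads:    480 vs   210 min, speedup 2.29
# 1000 loads: 120000 vs 30090 min, speedup 3.99
```

As the slide notes, the speedup approaches the number of stages as the number of loads grows, while the time for any single load is unchanged.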

7
Pipelining: why such improvement?
  • Multiple tasks happen simultaneously
  • Each resource is kept busy (usually)
  • Except when filling and draining pipe
  • Not much idle time
  • Pipelining: what permits this parallelism?
  • Each stage is independent of others
  • Some method to transition from one stage to the
    next
  • Need to empty a stage before reloading
  • May need a basket to carry to next stage
  • Time for each stage is about the same

8
Acceleration by Pipelining
  • Apparently, a greater number of stages always provides better
    performance. However:
  • a greater number of stages increases the overhead in moving
    information between stages and in synchronization between
    stages;
  • the complexity of the CPU grows with the number of stages;
  • it is difficult to keep a large pipeline at maximum rate
    because of pipeline hazards.
  • 80486 and Pentium: five-stage pipeline for integer
    instructions, eight-stage pipeline for FP instructions.
  • PowerPC: four-stage pipeline for integer instructions,
    six-stage pipeline for FP instructions.

9
Pipeline hazards
  • Pipeline hazards are situations that prevent the next
    instruction in the instruction stream from executing during
    its designated clock cycle. The instruction is said to be
    stalled. When an instruction is stalled, all instructions
    later in the pipeline than the stalled instruction are also
    stalled. Instructions earlier than the stalled one can
    continue. No new instructions are fetched during the stall.
    (A small simulation sketch follows the list of hazard types.)
  • Types of hazards:
  • 1. Structural hazards
  • 2. Data hazards
  • 3. Control hazards
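To make the stall behaviour concrete, here is a minimal, hypothetical Python sketch (not from the original slides): an in-order model in which an instruction that reads a register not yet written back must stall, and every later instruction is delayed with it. The instruction names, registers, and 3-cycle latency are all illustrative.

```python
# Toy in-order pipeline model: an instruction may enter execution only after
# the registers it reads have been written back. Instruction tuples are
# (name, destination register, source registers).
LATENCY = 3   # cycles from start of execution until the result is readable

program = [
    ("load r1", "r1", []),       # produces r1
    ("add r2",  "r2", ["r1"]),   # reads r1 -> data hazard, must stall
    ("sub r3",  "r3", ["r4"]),   # independent, but waits behind the stall
]

available = {}   # register -> first cycle in which its value can be read
cycle = 1        # cycle in which the next instruction would like to start

for name, dest, sources in program:
    start = max([cycle] + [available.get(reg, 1) for reg in sources])
    stalls = start - cycle                 # extra cycles spent waiting
    available[dest] = start + LATENCY      # when this result becomes readable
    print(f"{name}: starts in cycle {start} ({stalls} stall cycle(s))")
    cycle = start + 1                      # in-order: later instructions wait too
```

Here "add r2" stalls for two cycles, and "sub r3" reports no stall of its own yet starts in cycle 5 rather than cycle 3, because it must wait behind the stalled instruction, matching the description on the slide.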

10
Hazards: 3 types of pipelining hazards
  • structural hazards: attempt to use the same resource in two
    different ways at the same time
  • E.g., a combined washer/dryer would be a structural hazard, or
    the folder is busy doing something else (watching TV)
  • data hazards: attempt to use an item before it is ready
  • E.g., one sock of a pair is in the dryer and one in the
    washer; you can't fold until you get the sock from the washer
    through the dryer
  • an instruction depends on the result of a prior instruction
    still in the pipeline
  • control hazards: attempt to make a decision before a condition
    is evaluated
  • E.g., washing football uniforms and needing to get the proper
    detergent level; you need to see the result after the dryer
    before putting the next load in
  • branch instructions

11
Pipelining Advances: super-pipelining and superscalar
  • Super-pipelining: longer pipes
  • Increase in pipe stages (up to 10 or more)
  • The more stages, the more throughput
  • Superscalar: multiple pipes
  • More than one instruction started each cycle (referred to as
    multiple issue)
  • Requires more hardware
  • More complex dependency detection
  • Sometimes different types of pipes: ALU, FPU, branch, etc.

12
Superscalar
  • A superscalar architecture is one in which
    several instructions can be initiated
    simultaneously and executed independently.
  • Pipelining allows several instructions to be
    executed at the same time, but they have to be in
    different pipeline stages at a given moment.
  • Superscalar architectures include all features of
    pipelining but, in addition, there can be several
    instructions executing simultaneously in the same
    pipeline stage.
  • They have the ability to initiate multiple
    instructions during the same clock cycle.
  • There are two typical approaches today to improve performance:
  • 1. Superpipelining
  • 2. Superscalar

13
Superscalar (contd)
  • Superscalar architectures allow several
    instructions to be issued and completed per clock
    cycle.
  • A superscalar architecture consists of a number
    of pipelines that are working in parallel.
  • Depending on the number and kind of parallel
    units available, a certain number of instructions
    can be executed in parallel.
  • In the following example a floating point and two integer
    operations can be issued and executed simultaneously; each
    unit is pipelined and can execute several operations in
    different pipeline stages. (A small issue-scheduling sketch
    follows below.)
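As an illustration of this issue limit (a hypothetical sketch, not from the slides), the following Python model issues at most one floating-point and two integer instructions per cycle, in program order. The program, unit names, and slot counts are assumptions chosen to match the example described above.

```python
# Illustrative in-order issue model: per cycle, at most one FP and two
# integer instructions may be issued to the parallel functional units.
from collections import deque

ISSUE_SLOTS = {"fp": 1, "int": 2}   # per-cycle capacity of each unit type

def issue_schedule(instructions):
    """instructions: list of (name, unit_type); returns {name: issue_cycle}."""
    pending = deque(instructions)
    schedule, cycle = {}, 1
    while pending:
        slots = dict(ISSUE_SLOTS)
        # issue in program order until a needed slot is exhausted
        while pending and slots[pending[0][1]] > 0:
            name, unit = pending.popleft()
            slots[unit] -= 1
            schedule[name] = cycle
        cycle += 1
    return schedule

prog = [("fadd", "fp"), ("add1", "int"), ("add2", "int"),
        ("fmul", "fp"), ("add3", "int")]
print(issue_schedule(prog))
# {'fadd': 1, 'add1': 1, 'add2': 1, 'fmul': 2, 'add3': 2}
```

With this example program, one FP and two integer instructions issue together in cycle 1, while the remaining two wait for cycle 2 because the FP slot is already taken.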

14
Limitations on Parallel Execution
  • The situations which prevent instructions from being executed
    in parallel by a superscalar architecture are very similar to
    those which prevent an efficient execution on any pipelined
    architecture (see pipeline hazards - lectures 3, 4).
  • The consequences of these situations on superscalar
    architectures are more severe than those on simple pipelines,
    because the potential of parallelism in superscalars is
    greater and, thus, a greater opportunity is lost.
  • Limitations on Parallel Execution (contd)
  • Three categories of limitations have to be considered:
  • 1. Resource conflicts
  • - They occur if two or more instructions compete for the same
    resource (register, memory, functional unit) at the same time;
    they are similar to the structural hazards discussed with
    pipelines. By introducing several parallel pipelined units,
    superscalar architectures try to reduce some of the possible
    resource conflicts.
  • 2. Control (procedural) dependency
  • - The presence of branches creates major problems in assuring
    an optimal parallelism. How to reduce branch penalties has
    been discussed in lectures 7-8.
  • - If instructions are of variable length, they cannot be
    fetched and issued in parallel: an instruction has to be
    decoded in order to identify the following one and to fetch
    it; thus superscalar techniques are efficiently applicable to
    RISCs, with fixed instruction length and format.

15
  • 3. Data conflicts
  • - Data conflicts are produced by data dependencies between
    instructions in the program. Because superscalar architectures
    provide great freedom in the order in which instructions can
    be issued and completed, data dependencies have to be
    considered with much attention (a small sketch of the classic
    dependency types follows).
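As a concrete illustration (not from the original slides), the classic data-dependency classes between two instructions can be detected by comparing the registers each one reads and writes. The register names and instruction encodings below are hypothetical.

```python
# Classify data dependencies between two instructions, given the registers
# each one reads and writes (RAW = true dependency, WAR = anti-dependency,
# WAW = output dependency).
def dependencies(first, second):
    reads1, writes1 = first
    reads2, writes2 = second
    deps = []
    if writes1 & reads2:
        deps.append("RAW")   # second reads a value the first produces
    if reads1 & writes2:
        deps.append("WAR")   # second overwrites a value the first still reads
    if writes1 & writes2:
        deps.append("WAW")   # both write the same register
    return deps

i1 = ({"r2", "r3"}, {"r1"})   # r1 = r2 + r3
i2 = ({"r1", "r4"}, {"r5"})   # r5 = r1 * r4  -> true dependency on r1
print(dependencies(i1, i2))   # ['RAW']
```

A superscalar issue unit has to detect exactly these relations before reordering or overlapping instructions.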

16-20
(No Transcript)
21
REDUCED INSTRUCTION SET COMPUTERS (RISC)
  • What are RISCs and why do we need them?
  • RISC architectures represent an important
    innovation in the area of computer organization.
  • The RISC architecture is an attempt to produce
    more CPU power by simplifying the instruction set
    of the CPU.
  • The trend opposed to RISC is that of complex instruction set
    computers (CISC).
  • Both RISC and CISC architectures have been
    developed as an attempt to cover the semantic
    gap.

22
The Semantic Gap
  • In order to improve the efficiency of software development,
    new and powerful programming languages have been developed
    (Ada, C, Java). They provide a high level of abstraction,
    conciseness, and power.
  • Through this evolution, the semantic gap between high-level
    languages and the processor architecture grows.
  • Problem: How should new HLL programs be compiled and executed
    efficiently on a processor architecture?
  • Two possible answers:
  • 1. The CISC approach: design very complex architectures
    including a large number of instructions and addressing modes;
    also include instructions close to those present in HLLs.
  • 2. The RISC approach: simplify the instruction set and adapt
    it to the real requirements of user programs.

23
Main Characteristics of RISC Architectures
  • The instruction set is limited and includes only
    simple instructions.
  • - The goal is to create an instruction set containing
    instructions that execute quickly; most of the RISC
    instructions are executed in a single machine cycle (after
    being fetched and decoded).
  • Pipeline operation (without memory reference)
  • - RISC instructions, being simple, are hard-wired, while CISC
    architectures have to use microprogramming in order to
    implement complex instructions.
  • - Having only simple instructions results in reduced
    complexity of the control unit and the data path; as a
    consequence, the processor can work at a high clock frequency.
  • - The pipelines are used efficiently if instructions are
    simple and of similar execution time.
  • - Complex operations on RISCs are executed as a sequence of
    simple RISC instructions. In the case of CISCs they are
    executed as a single complex instruction or a few of them.

24
Main Characteristics of RISC Architectures
  • Assume:
  • - we have a program in which 80% of the executed instructions
    are simple and 20% are complex;
  • - on a CISC machine, simple instructions take 4 cycles and
    complex instructions take 8 cycles; the cycle time is 100 ns
    (10^-7 s);
  • - on a RISC machine, simple instructions are executed in one
    cycle; complex operations are implemented as a sequence of
    instructions; we consider on average 14 instructions (14
    cycles) for a complex operation; the cycle time is 75 ns
    (0.75 x 10^-7 s).
  • How long does a program of 1,000,000 instructions take? (A
    short check of this arithmetic follows below.)
  • CISC: (10^6 x 0.80 x 4 + 10^6 x 0.20 x 8) x 10^-7 = 0.48 s
  • RISC: (10^6 x 0.80 x 1 + 10^6 x 0.20 x 14) x 0.75 x 10^-7 =
    0.27 s
  • Complex operations take more time on the RISC, but their
    number is small.
  • Because of its simplicity, the RISC works at a shorter cycle
    time; with the CISC, simple instructions are slowed down
    because of the increased data path length and the increased
    control complexity.
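A quick way to re-check the slide's arithmetic: the Python sketch below simply re-evaluates the two expressions (the figures come from the slide above; variable names are illustrative).

```python
# Re-check of the CISC-vs-RISC timing example from the slide.
N = 1_000_000             # executed instructions
SIMPLE, COMPLEX = 0.80, 0.20

# CISC: 4 cycles (simple) / 8 cycles (complex), 100 ns cycle time
cisc_cycles = N * (SIMPLE * 4 + COMPLEX * 8)
cisc_time = cisc_cycles * 100e-9

# RISC: 1 cycle (simple) / 14 cycles (complex operation expanded into
# about 14 simple instructions), 75 ns cycle time
risc_cycles = N * (SIMPLE * 1 + COMPLEX * 14)
risc_time = risc_cycles * 75e-9

print(f"CISC: {cisc_time:.2f} s")   # 0.48 s
print(f"RISC: {risc_time:.2f} s")   # 0.27 s
```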

25
Main Characteristics of RISC Architectures
  • Load-and-store architecture
  • - Only LOAD and STORE instructions reference data in memory;
    all other instructions operate only with registers (they are
    register-to-register instructions). Thus, only the few
    instructions accessing memory need more than one cycle to
    execute (after being fetched and decoded).
  • Pipeline operation with memory reference
  • Instructions use only a few addressing modes
  • - Addressing modes are usually register, direct, register
    indirect, and displacement.
  • Instructions are of fixed length and uniform format
  • - This makes the loading and decoding of instructions simple
    and fast; there is no need to wait until the length of an
    instruction is known in order to start decoding the following
    one.
  • - Decoding is simplified because the opcode and address fields
    are located in the same position for all instructions.

26
Main Characteristics of RISC Architectures
  • A large number of registers is available
  • - Variables and intermediate results can be
    stored in registers and do not require repeated
    loads and stores from/to memory.
  • - All local variables of procedures and the
    passed parameters can be stored in registers (see
    slide 8 for comments on possible number of
    variables and parameters).
  • What happens when a new procedure is called?
  • - Normally the registers have to be saved in memory (they
    contain values of variables and parameters for the calling
    procedure); at return to the calling procedure, the values
    have to be loaded again from memory. This takes a lot of time.
  • - If a large number of registers is available, a
    new set of registers can be allocated to the
    called procedure and the register set assigned to
    the calling one remains untouched.

27
Main Characteristics of RISC Architectures
  • Is the above strategy realistic?
  • - The strategy is realistic, because the number of local
    variables in procedures is not large. The chain of nested
    procedure calls is only exceptionally deeper than 6.
  • - If the chain of nested procedure calls becomes long, at a
    certain call there will be no registers left to be assigned to
    the called procedure; in this case local variables and
    parameters have to be stored in memory (a toy sketch of this
    register-window scheme follows).
  • Why is a large number of registers typical for RISC
    architectures?
  • - Because of the reduced complexity of the processor, there is
    enough space on the chip to be allocated to a large number of
    registers. This is usually not the case with CISCs.
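As an illustration of the register-set-per-call idea described above, here is a toy Python sketch (hypothetical, not from the course material): each call receives a fresh register window, and when the hardware windows run out the oldest one is spilled to memory and brought back on return.

```python
# Toy model of register windows: each procedure call gets a fresh window of
# registers; when the hardware windows are exhausted, the oldest window is
# spilled to memory and restored when control returns towards its owner.
NUM_WINDOWS = 6          # hardware register windows (illustrative value)

windows = []             # windows currently in hardware, oldest first
spilled = []             # windows saved to "memory", oldest first

def call():
    """Allocate a fresh window for the called procedure."""
    if len(windows) == NUM_WINDOWS:        # no free hardware window left
        spilled.append(windows.pop(0))     # spill the oldest to memory
    windows.append({})                     # new, empty window for the callee

def ret():
    """Discard the callee's window and restore a spilled one if any."""
    windows.pop()                          # callee's window is freed
    if spilled:                            # a caller's window was spilled:
        windows.insert(0, spilled.pop())   # bring the most recent one back

for _ in range(8):                         # nest deeper than the window count
    call()
print(len(windows), "windows in hardware,", len(spilled), "spilled")
# -> 6 windows in hardware, 2 spilled
```

Only when the call chain grows past the number of hardware windows does memory traffic appear, which is exactly the argument made on the slide.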

28
Are RISCs Really Better than CISCs?
  • RISC architectures have several advantages and
    they were discussed throughout this lecture.
    However, a definitive answer to the above
    question is difficult to give.
  • Many performance comparisons have shown that benchmark
    programs really do run faster on RISC processors than on
    processors with CISC characteristics.
  • However, it is difficult to identify which feature of a
    processor produces the higher performance. Some "CISC fans"
    argue that the higher speed is not produced by the typical
    RISC features but by technology, better compilers, etc.
  • An argument in favour of CISC: the simpler instruction set of
    RISC processors results in a larger memory requirement
    compared to the same program compiled for a CISC architecture.
  • Most current processors are not typical RISCs or CISCs but try
    to combine the advantages of both approaches.

29
Some Processor Examples
  • CISC Architectures
  • VAX 11/780
  • Nr. of instructions: 303
  • Instruction size: 2 - 57 bytes
  • Instruction format: not fixed
  • Addressing modes: 22
  • Number of general purpose registers: 16
  • Pentium
  • Nr. of instructions: 235
  • Instruction size: 1 - 11 bytes
  • Instruction format: not fixed
  • Addressing modes: 11
  • Number of general purpose registers: 8
  • RISC Architectures
  • Sun SPARC
  • Nr. of instructions: 52
  • Instruction size: 4 bytes
  • Instruction format: fixed
  • Addressing modes: 2
  • Number of general purpose registers: up to 520
  • PowerPC
  • Nr. of instructions: 206
  • Instruction size: 4 bytes
  • Instruction format: not fixed (but small differences)
  • Addressing modes: 2
  • No. of gen. purpose registers: 32

30
Summary
  • Both RISCs and CISCs try to solve the same problem: to cover
    the semantic gap. They do it in different ways. CISCs go the
    traditional way of implementing more and more complex
    instructions. RISCs try to simplify the instruction set.
  • Innovations in RISC architectures are based on
    a close analysis of a large set of widely used
    programs.
  • The main features of RISC architectures are: a reduced number
    of simple instructions, few addressing modes, a load-store
    architecture, instructions of fixed length and format, and a
    large number of registers.
  • One of the main concerns of RISC designers was
    to maximise the efficiency of pipelining.
  • Present architectures often include both RISC
    and CISC features.

31
(No Transcript)
32
Memory Hierarchy: distance versus speed and size
  • The closer to the CPU, the smaller the memory, but the faster
    the access time.
  • Larger capacity memory is typically slower.

33
The End