CPE 431/531 Chapter 9 Multiprocessors and Clusters

1
CPE 431/531 Chapter 9 Multiprocessors and Clusters
  • Swathi T. Gurumani
  • Modified From Slides of
  • Dr. Rhonda Kay Gaede
  • UAH

2
9.1 Introduction - Motivation
  • Multiprocessor: Parallel processors with a single
    shared memory
  • Parallel Processing Program: A single program
    that runs on multiple processors simultaneously
  • Why multiprocessors?
  • More effective than building a high-performance
    uniprocessor with more advanced technology
  • Many scientific applications are too demanding to
    make progress with a single processor: weather
    prediction, protein folding

3
9.1 Introduction - Design Questions
  • How do parallel processors share data?
  • Shared memory processors
  • Message passing for communication
  • Shared memory processors
  • Shared Memory: Memory for a parallel processor
    with a single address space, implying implicit
    communication with loads and stores
  • Synchronization: Process of coordinating the
    behavior of two or more processes, which may be
    running on different processors
  • Lock: A synchronization device that allows
    only one processor at a time to access data
  • UMA and NUMA
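
As a rough illustration of a lock, the sketch below uses Python threads in place of processors. The names counter, add_many, and run are made up for this example; threading.Lock is the standard-library lock.

```python
import threading

counter = 0                      # shared data, visible to every thread
counter_lock = threading.Lock()  # only one thread may hold it at a time

def add_many(times):
    """Increment the shared counter `times` times under the lock."""
    global counter
    for _ in range(times):
        with counter_lock:       # acquire; released automatically on exit
            counter += 1         # the read-modify-write is now exclusive

def run(num_threads=4, times=10_000):
    """Start the threads, wait for them, and return the final counter."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Calling run(4, 10_000) should return 40000, because every increment happens while the lock is held.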

4
9.1 Introduction - Design Questions
  • Symmetric multiprocessor (SMP) or Uniform Memory
    Access (UMA): Accesses to main memory take the
    same amount of time no matter which processor
    requests the access and no matter which word is
    accessed
  • Nonuniform memory access (NUMA): Single address
    space multiprocessor in which some memory accesses
    are faster than others depending on which processor
    asks for which word
  • NUMA machines can scale to more processors and
    hence can potentially provide higher performance

5
9.1 Introduction - Design Questions
  • Message Passing: Communication between multiple
    processors by explicitly sending and receiving
    information
  • Needed in machines with private memories
  • Cluster: Set of computers connected over a LAN
    that function as a single large multiprocessor
  • Send message routine: Used by a processor in a
    machine with private memories to pass a message
    to another processor
  • Receive message routine: Used by a processor to
    accept the message from another processor
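
The send and receive routines can be sketched as follows. This is only a stand-in: Python threads and a queue.Queue mailbox play the roles of separate computers and the network, and the names send, receive, node, and cluster_sum are hypothetical.

```python
import queue
import threading

def send(mailbox, message):
    """Hypothetical send routine: put a message in another node's mailbox."""
    mailbox.put(message)

def receive(mailbox):
    """Hypothetical receive routine: block until a message arrives."""
    return mailbox.get()

def node(private_data, outbox):
    # Each node computes only on its own private data and shares
    # the result explicitly by sending a message.
    send(outbox, sum(private_data))

def cluster_sum(chunks):
    """Sum several private chunks by collecting one message per node."""
    mailbox = queue.Queue()           # mailbox of the collecting node
    workers = [threading.Thread(target=node, args=(chunk, mailbox))
               for chunk in chunks]
    for w in workers:
        w.start()
    total = sum(receive(mailbox) for _ in chunks)
    for w in workers:
        w.join()
    return total
```

No node reads another node's data directly; the only communication is the explicit send and the matching receive.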

6
9.1 Introduction - Design Questions
  • Apart from the two main communication styles,
    multiprocessors are connected in two basic
    organizations
  • Connected by a single bus
  • Connected by a network
  • The number of processors in the multiprocessor has
    a lot to do with this choice

7
9.2 Programming Multiprocessors
  • It's difficult to rewrite programs to run on
    multiprocessors, therefore it's not done often.
  • Furthermore, many applications don't require many
    processors.
  • Problems
  • Must get good performance and efficiency
  • Communication overhead
  • The programmer should have good knowledge of the
    hardware

8
9.2 Programming Multiprocessors - Speedup Challenge
  • Suppose you want to achieve linear speedup with
    100 processors. What fraction of the original
    computation can be sequential?
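
One way to explore this question is with Amdahl's law. The function below is a small sketch; the name speedup and its parameters are chosen for this example only, and it assumes the non-sequential part divides perfectly among the processors.

```python
def speedup(sequential_fraction, processors):
    """Amdahl's law: overall speedup when only the parallel part scales."""
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)

def fractions_table(processors=100):
    """Speedup for a few sequential fractions, as (fraction, speedup) pairs."""
    return [(f, speedup(f, processors)) for f in (0.0, 0.001, 0.01, 0.1)]
```

Linear speedup with 100 processors means a speedup of 100, and the formula reaches that value only when sequential_fraction is 0. Even 1% sequential work cuts the speedup to roughly half of that.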

9
9.2 Programming Multiprocessors - Speedup Challenge, Bigger Problem
  • Suppose you want to perform two sums: one is a
    sum of two scalar variables and one is a matrix
    sum of a pair of two-dimensional arrays, size
    1000 by 1000. What speedup do you get with 1000
    processors?
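
A worked version of this question, under stated assumptions: every addition takes one time unit, the scalar sum stays sequential, and the 1,000,000 matrix additions split evenly across the processors with no overhead. The function name two_sum_speedup is made up for this sketch.

```python
def two_sum_speedup(n=1000, processors=1000):
    """Speedup for one scalar add plus an n-by-n matrix add."""
    scalar_adds = 1                  # assumed sequential
    matrix_adds = n * n              # assumed perfectly parallel
    time_before = scalar_adds + matrix_adds                 # 1,000,001 units
    time_after = scalar_adds + matrix_adds / processors     # 1,001 units
    return time_before / time_after
```

With the defaults this gives 1,000,001 / 1,001, or about 999, so the larger problem gets close to linear speedup even though one addition is sequential.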

10
9.3 Multiprocessors Connected by a Single Bus
  • Each processor is smaller than a multichip
    processor, so more processors can be placed on
    a bus
  • Caches can lower bus traffic
  • Mechanisms for cache coherency

11
9.3 Multiprocessors Connected by a Single Bus - Parallel Program


  • Suppose we want to sum 100,000 numbers on a
    single-bus multiprocessor computer. Let's assume
    we have 100 processors.
    sum[Pn] = 0;
    for (i = 1000*Pn; i < 1000*(Pn+1); i = i + 1)
      sum[Pn] = sum[Pn] + A[i];   /* sum the assigned areas */
    half = 100;   /* 100 processors in 1-bus multiprocessor */
    repeat
      synch();   /* wait for partial sum completion */
      if (half%2 != 0 && Pn == 0)
        sum[0] = sum[0] + sum[half-1];
        /* Conditional sum needed when half is odd;
           Processor0 gets missing element */
      half = half/2;   /* dividing line on who sums */
      if (Pn < half) sum[Pn] = sum[Pn] + sum[Pn+half];
    until (half == 1);   /* exit with final sum in sum[0] */
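
The pseudocode above can be tried out with a small simulation. This is a sketch, not the slide author's code: Python threads stand in for the 100 processors, threading.Barrier plays the role of synch(), and the input A is assumed to be the integers 0 to 99,999.

```python
import threading

P = 100                          # number of simulated processors
N = 100_000                      # numbers to sum
A = list(range(N))               # assumed input data
partial = [0] * P                # partial[Pn] plays the role of sum[Pn]
barrier = threading.Barrier(P)   # plays the role of synch()

def worker(pn):
    """Code run by simulated processor number pn."""
    chunk = N // P
    partial[pn] = sum(A[chunk * pn: chunk * (pn + 1)])  # assigned area
    half = P
    while True:
        barrier.wait()           # wait for partial sum completion
        if half % 2 != 0 and pn == 0:
            partial[0] += partial[half - 1]   # odd case: P0 takes the extra
        half //= 2               # dividing line on who sums
        if pn < half:
            partial[pn] += partial[pn + half]
        if half == 1:            # final sum is now in partial[0]
            break

def parallel_sum():
    """Run all simulated processors and return the total."""
    threads = [threading.Thread(target=worker, args=(pn,)) for pn in range(P)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return partial[0]
```

Each pass halves the number of processors that are still adding, so the 100 partial sums are combined in 7 passes (half = 100, 50, 25, 12, 6, 3, then 1) instead of 99 sequential additions.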

12
9.3 Multiprocessors Connected by a Single Bus - Multiprocessor Cache Coherence
Cache coherence: Consistency in the value of
data between the versions in the caches of
several processors.
Snooping: Maintaining cache coherency such that
all cache controllers monitor or snoop on the bus
to determine whether or not they have a copy of
the desired block.

  • Snooping
  • Write-invalidate: Invalidate all other shared
    copies
  • Write-update: Update shared copies with the value
    being written
13
9.3 Multiprocessors Connected by a Single Bus - Cache Coherency Protocol
  • Transitions in the state of a cache block happen
    on read misses, write misses, or write hits; read
    hits do not change cache state.
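
The rule above can be written as a small transition table. The slide does not name the states, so this sketch assumes a common three-state write-invalidate, write-back scheme (Invalid, Shared, Modified) and covers only events issued by the block's own processor. The names TRANSITIONS and next_state are hypothetical.

```python
INVALID, SHARED, MODIFIED = "Invalid", "Shared", "Modified"

# (current state, processor event) -> next state; states are assumed
TRANSITIONS = {
    (INVALID, "read miss"): SHARED,
    (INVALID, "write miss"): MODIFIED,
    (SHARED, "read miss"): SHARED,       # block replaced by another clean block
    (SHARED, "write miss"): MODIFIED,
    (SHARED, "write hit"): MODIFIED,     # other copies get invalidated
    (MODIFIED, "read miss"): SHARED,     # old block written back first
    (MODIFIED, "write miss"): MODIFIED,  # old block written back first
    (MODIFIED, "write hit"): MODIFIED,
}

def next_state(state, event):
    """Return the new state of a cache block after a processor event."""
    if event == "read hit":
        return state                     # read hits never change the state
    return TRANSITIONS[(state, event)]
```

For example, a write hit on a Shared block moves it to Modified, while a read hit on any valid block leaves the state unchanged.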