1
Parallel Programming on the SGI Origin2000
Taub Computer Center, Technion
Anne Weill-Zrahia
With thanks to Moshe Goldberg (TCC) and Igor Zacharov (SGI)
March 2005
2
Parallel Programming on the SGI Origin2000
  • Parallelization Concepts
  • SGI Computer Design
  • Efficient Scalar Design
  • Parallel Programming - OpenMP
  • Parallel Programming - MPI

3
Academic Press 2001
ISBN 1-55860-671-8
4
  • Parallelization Concepts

5
Introduction to Parallel Computing
  • Parallel computer: a set of processors that
    work cooperatively to solve a computational
    problem
  • Distributed computing: a number of processors
    communicating over a network
  • Metacomputing: use of several parallel computers

6
Parallel classification
  • Parallel architectures
    - Shared Memory
    - Distributed Memory
  • Programming paradigms
    - Data parallel
    - Message passing

7
Why parallel computing?
  • Single-processor performance is limited by physics
  • Multiple processors break the problem down into
    simple tasks or domains
  • Plus: obtain the same results as the sequential
    program, but faster
  • Minus: the code must be rewritten

8
Three HPC Architectures
  • Shared memory
  • Cluster
  • Vector processor
9
Shared Memory
  • Each processor can access any part of the memory
  • Access times are uniform (in principle)
  • Easier to program (no explicit message passing)
  • Bottleneck when several tasks access the same
    location (see the sketch below)
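A small OpenMP illustration of that bottleneck, added here as a sketch (it is not part of the original slides): when every thread updates one shared variable the updates serialize, while a reduction gives each thread a private copy and combines the partial results once at the end.

  #include <stdio.h>
  #include <omp.h>

  #define N 1000000

  int main(void)
  {
      static double x[N];
      double sum_atomic = 0.0, sum_reduction = 0.0;

      for (int i = 0; i < N; i++)
          x[i] = 1.0;

      /* Contended version: all threads update the same shared location,
         so the atomic update becomes the bottleneck. */
      #pragma omp parallel for
      for (int i = 0; i < N; i++) {
          #pragma omp atomic
          sum_atomic += x[i];
      }

      /* Reduction version: each thread accumulates privately; the partial
         sums are combined only once, removing the contention. */
      #pragma omp parallel for reduction(+:sum_reduction)
      for (int i = 0; i < N; i++)
          sum_reduction += x[i];

      printf("atomic: %.0f  reduction: %.0f  (threads: %d)\n",
             sum_atomic, sum_reduction, omp_get_max_threads());
      return 0;
  }

Both loops compute the same sum; only the second scales well, because it avoids repeated accesses to a single shared location.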

10
Symmetric Multiple Processors
(Diagram: four CPUs connected through a memory bus to a single shared memory)
Examples: SGI Power Challenge, Cray J90/T90
11
Data-parallel programming
  • Single program defining the operations
  • Single (shared) memory
  • Loosely synchronous (synchronization at the
    completion of each loop)
  • Parallel operations on array elements (see the
    sketch below)
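A minimal OpenMP sketch of this data-parallel style, added for illustration (it does not appear in the original slides); compile with an OpenMP-capable compiler, for example gcc -fopenmp.

  #include <stdio.h>
  #include <omp.h>

  #define N 1000000

  int main(void)
  {
      static double a[N], b[N], c[N];

      for (int i = 0; i < N; i++) {
          a[i] = i;
          b[i] = 2.0 * i;
      }

      /* One program, one shared memory: the loop iterations are divided
         among the threads, and there is an implicit synchronization at
         the end of the loop (the "loosely synchronous" completion point). */
      #pragma omp parallel for
      for (int i = 0; i < N; i++)
          c[i] = a[i] + b[i];

      printf("c[N-1] = %f (max threads: %d)\n", c[N-1], omp_get_max_threads());
      return 0;
  }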

12
Distributed Parallel Computing
(Diagram: four CPUs, each with its own local memory, linked by an interconnection network)
Examples: SP2, Beowulf clusters
13
Message Passing Programming
  • Separate program on each processor
  • Local memory
  • Control over distribution and transfer of data
  • Debugging is more complex because of the
    communication (see the sketch below)
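A minimal MPI sketch of this style, added for illustration (not from the original slides): each process owns only its local data, and values move between processes exclusively through explicit send/receive calls. Build with mpicc and run with, for example, mpirun -np 4 ./a.out.

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank, nprocs;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

      /* Each process has its own copy of 'local' in its own memory. */
      double local = (double)rank;

      if (rank != 0) {
          /* Workers explicitly send their value to process 0. */
          MPI_Send(&local, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
      } else {
          double sum = local, incoming;
          for (int src = 1; src < nprocs; src++) {
              MPI_Recv(&incoming, 1, MPI_DOUBLE, src, 0,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);
              sum += incoming;
          }
          printf("sum of ranks = %.0f\n", sum);
      }

      MPI_Finalize();
      return 0;
  }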

14
Distributed Memory
  • A processor can access only its own local memory
  • Access times depend on the data's location
  • Processors must communicate via explicit message
    passing

15
Message Passing or Shared Memory?
Message Passing
  • Takes longer to implement
  • More details to worry about
  • Increases source lines
  • Complex to debug and time
  • Increase in total memory used
  • Scalability limited by
    - communications overhead
    - process synchronization
  • Parallelism is visible

Shared Memory
  • Easier to implement
  • System handles many details
  • Little increase in source
  • Easier to debug and time
  • Efficient memory use
  • Scalability limited by
    - serial portion of code
    - process synchronization
  • Compiler-based parallelism
16
Performance issues
  • Concurrency: the ability to perform actions
    simultaneously
  • Scalability: performance is not impaired by an
    increasing number of processors (see the worked
    example below)
  • Locality: a high ratio of local to remote memory
    accesses (i.e., low communication)
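One way to quantify the scalability limit set by the serial portion of the code (slide 15) is Amdahl's law; the law itself is standard, but this worked example is added here and is not part of the original slides.

  speedup(P) = 1 / (s + (1 - s) / P),  where s is the serial fraction of the code

  With s = 0.05 and P = 32 processors:
  speedup = 1 / (0.05 + 0.95 / 32) ≈ 12.5

Even 5 percent of serial code caps a 32-processor run at roughly 12.5x, which is why the serial fraction and locality matter as much as the processor count.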

17
Objectives of HPC in the Technion
  • Maintain a leading position in science/engineering
  • Production: sophisticated calculations that
    - require high speed
    - require large memory
  • Teach techniques of parallel computing
    - in research projects
    - as part of courses

18
HPC in the Technion
  • SGI Origin2000
    - 22 CPUs (R10000, 250 MHz), total memory 9 GB
    - 32 CPUs (R12000, 300 MHz), total memory 9 GB
  • PC cluster (Linux Red Hat 9.0)
    - 6 CPUs (Pentium II, 866 MHz), memory 500 MB/CPU
  • PC cluster (Linux Red Hat 9.0)
    - 16 CPUs (Pentium III, 800 MHz), memory 500 MB/CPU
19
Origin2000 (SGI) 128 processors
20
Origin2000 (SGI) 22 processors
21
  • PC clusters (Intel)
  • 6 processors
  • 16 processors

23
Data Grids for High Energy Physics
Image courtesy Harvey Newman, Caltech
24
GRIDS: Globus Toolkit
  • Grid Security Infrastructure (GSI)
  • Globus Resource Allocation Manager (GRAM)
  • Monitoring and Discovery Service (MDS)
  • Global Access to Secondary Storage (GASS)

25
November 2004
26
A Recent Example: Matrix Multiply
27
Profile -- original
28
Profile -- optimized code
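The original and optimized routines behind these two profiles are not reproduced in the transcript. As a hedged illustration only (the actual optimization used in the course may differ, e.g. calling a tuned BLAS routine instead), a classic scalar improvement to a naive matrix multiply in C is to interchange the loops so the innermost loop walks contiguous memory.

  /* Illustrative sketch, not the code that was profiled in the slides. */
  #define N 512
  static double a[N][N], b[N][N], c[N][N];   /* statics start zeroed */

  /* Naive i-j-k order: the innermost loop strides through b column-wise,
     touching elements N doubles apart on every iteration. */
  void matmul_naive(void)
  {
      for (int i = 0; i < N; i++)
          for (int j = 0; j < N; j++)
              for (int k = 0; k < N; k++)
                  c[i][j] += a[i][k] * b[k][j];
  }

  /* i-k-j order: the innermost loop runs along contiguous rows of b and c,
     which is far friendlier to the cache and typically shows up directly
     in a profile like the ones above. */
  void matmul_interchanged(void)
  {
      for (int i = 0; i < N; i++)
          for (int k = 0; k < N; k++) {
              double aik = a[i][k];
              for (int j = 0; j < N; j++)
                  c[i][j] += aik * b[k][j];
          }
  }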