Transcript and Presenter's Notes

Title: CS160


1
CS160 Spring 2000
http://www-cse.ucsd.edu/classes/sp00/cse160
  • Prof. Fran Berman - CSE
  • Dr. Philip Papadopoulos - SDSC

2
Two Instructors/One Class
  • We are team-teaching the class
  • Lectures will be split about 50-50 along topic
    lines. (We'll keep you guessing as to who will
    show up next lecture.)
  • TA is Derrick Kondo. He is responsible for
    grading homework and programs
  • Exams will be graded by Papadopoulos/Berman

3
Prerequisites
  • Know how to program in C
  • CSE 100 (Data Structures)
  • CSE 141 (Computer Architecture) would be helpful
    but not required.

4
Grading
  • 25% Homework
  • 25% Programming assignments
  • 25% Midterm
  • 25% Final
  • Homework and programming assignments are due at
    the beginning of section

5
Policies
  • Exams are closed book, closed notes
  • No Late Homework
  • No Late Programs
  • No Makeup exams
  • All assignments are to be your own original work.
  • Cheating/copying from anyone/anyplace will be
    dealt with severely

6
Office Hours (Papadopoulos)
  • My office is SSB 251 (Next to SDSC)
  • Hours will be TuTh 2:30 - 3:30 or by appointment.
  • My email is phil@sdsc.edu
  • My campus phone is 822-3628

7
Course Materials
  • Book: Parallel Programming: Techniques and
    Applications Using Networked Workstations and
    Parallel Computers, by B. Wilkinson and Michael
    Allen.
  • Web site: We will try to make lecture notes
    available before class
  • Handouts: As needed.

8
Computers/Programming
  • Please see the TA about getting an account for
    the undergrad APE lab.
  • We will use PVM for programming on workstation
    clusters (a minimal sketch appears after this
    slide).
  • A word of advice: With the web, you can probably
    find almost-completed source code somewhere.
    Don't do this. Write the code yourself. You'll
    learn more. See the policy on copying.
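
To give a feel for what PVM code looks like, here is a minimal
sketch (an illustration, not an assignment solution). It assumes a
standard PVM 3 installation and its C interface (pvm3.h, linked with
-lpvm3); the program name "sum", the worker count, the problem size,
and the message tags are arbitrary choices for the example. The first
copy started has no PVM parent and so acts as the master; it spawns
worker copies of the same executable, and each worker sums one slice
of 1..N and sends its partial sum back.

/* sum.c - minimal PVM 3 master/worker sketch (illustrative only).
 * Typical build: cc -o sum sum.c -lpvm3
 */
#include <stdio.h>
#include <pvm3.h>

#define NWORK      4        /* number of worker tasks (arbitrary)   */
#define N          1000000  /* problem size (arbitrary)             */
#define TAG_WORK   1        /* message tags (arbitrary)             */
#define TAG_RESULT 2

int main(void)
{
    int parent;

    pvm_mytid();                   /* enroll this process in PVM     */
    parent = pvm_parent();

    if (parent == PvmNoParent) {   /* ---------- master ------------ */
        int tids[NWORK], i, rank;
        double total = 0.0, part;

        pvm_spawn("sum", NULL, PvmTaskDefault, "", NWORK, tids);

        for (i = 0; i < NWORK; i++) {     /* give each worker a rank */
            rank = i;
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&rank, 1, 1);
            pvm_send(tids[i], TAG_WORK);
        }
        for (i = 0; i < NWORK; i++) {     /* collect partial sums    */
            pvm_recv(-1, TAG_RESULT);
            pvm_upkdouble(&part, 1, 1);
            total += part;
        }
        printf("sum of 1..%d = %.0f\n", N, total);
    } else {                       /* ---------- worker ------------ */
        int rank, i, lo, hi;
        double part = 0.0;

        pvm_recv(parent, TAG_WORK);       /* which slice is mine?    */
        pvm_upkint(&rank, 1, 1);

        lo = rank * (N / NWORK) + 1;      /* assumes NWORK divides N */
        hi = (rank + 1) * (N / NWORK);
        for (i = lo; i <= hi; i++)
            part += (double)i;

        pvm_initsend(PvmDataDefault);     /* report back to master   */
        pvm_pkdouble(&part, 1, 1);
        pvm_send(parent, TAG_RESULT);
    }
    pvm_exit();                    /* leave PVM before exiting       */
    return 0;
}

The executable would typically live where the PVM daemon can find it
(by default under $HOME/pvm3/bin/$PVM_ARCH) and be started once from
the PVM console; we will cover these details when we get to PVM.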

9
Any other Administrative Questions?
10
Introduction to Parallel Computing
  • Topics to be covered. See syllabus (online) for
    full details
  • Machine architecture and history
  • Parallel machine organization
  • Parallel algorithm paradigms
  • Parallel programming environments and tools
  • Heterogeneous computing.
  • Evaluating Performance
  • Grid Computing
  • Parallel programming and project
    assignments

11
What IS Parallel Computing?
  • Applying multiple processors to solve a single
    problem
  • Why?
  • Increased performance for rapid turnaround time
    (wall clock time); see the rough numbers below
  • More available memory on multiple machines
  • Natural progression of standard Von Neumann
    Architecture
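
To put rough, purely illustrative numbers on the first point: a run
that needs 100 hours of computation on one processor would, at
perfect parallel efficiency, finish in about 1 hour on 100
processors; real codes fall short of that ideal for reasons we look
at a few slides from now. The memory argument is similar: a problem
too large for any one machine's memory can be spread across the
combined memory of many machines.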

12
World's 10th Fastest Machine (as of November 1999) @ SDSC
1152 Processors
13
Are There Really Problems that Need O(1000)
processors?
  • Grand Challenge Codes
  • First Principles Materials Science
  • Climate modeling (ocean, atmosphere)
  • Soil Contamination Remediation
  • Protein Folding (gene sequencing)
  • Hydrocodes
  • Simulated nuclear device detonation
  • Code breaking (No Such Agency)

14
There must be problems with the approach
  • Scaling with efficiency (speedup)
  • Unparallelizable portions of code (Amdahl's law;
    worked example below)
  • Reliability
  • Programmability
  • Algorithms
  • Monitoring
  • Debugging
  • I/O
  • These and more keep the field interesting
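
As a worked illustration of the Amdahl's law bullet (numbers chosen
for illustration): if a fraction f of the work is inherently serial,
the best possible speedup on p processors is

    S(p) = 1 / (f + (1 - f) / p)

With f = 0.05 (only 5% serial) and p = 1000,
S = 1 / (0.05 + 0.95/1000) ≈ 19.6, and no number of processors can
ever push the speedup past 1/f = 20.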

15
A Brief History of Parallel Supercomputers
  • There have been many (dead) supercomputers
  • The Dead Supercomputer Society
  • http://ei.cs.vt.edu/history/Parallel.html
  • Parallel Computing Works
  • Will touch on about a dozen of the important ones

16
Basic Measurement Yardsticks
  • Peak Performance (AKA "guaranteed never to
    exceed"): nprocs × FLOPS/proc (worked example below)
  • NAS Parallel Benchmarks
  • Linpack Benchmark for the TOP 500
  • Later in the course, we will explore how to
    "Fool the Masses" and valid ways to measure
    performance
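
As a worked example of the peak-performance yardstick (numbers are
illustrative, not for any particular machine): 1152 processors, each
with a 500 MFLOPS peak, give 1152 × 500 MFLOPS = 576 GFLOPS of peak
performance. Measured Linpack or NAS results come in well below that,
which is why peak is the "guaranteed never to exceed" number.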

17
Illiac IV (1966-1970)
  • $100 Million in 1990 Dollars
  • Single instruction multiple data (SIMD)
  • 32-64 Processing elements
  • 15 Megaflops
  • Ahead of its time

18
ICL DAP (1979)
  • Distributed Array Processor (also SIMD)
  • 1K-4K bit-serial processors
  • Connected in a mesh
  • Required an ICL mainframe to front-end the main
    processor array
  • Never caught on in the US

19
Goodyear MPP (late 1970s)
  • 16K bit-serial processors (SIMD)
  • NASA Goddard Space Flight Center
  • Only a few sold. Similar to the ICL DAP
  • About 100 Mflops (100 MHz Pentium)

20
Cray-1 (1976)
  • Seymour Cray, Designer
  • NOT a parallel machine
  • Single processor machine with vector registers
  • Largely regarded as starting the modern
    supercomputer revolution
  • 80 MHz Processor (80 MFlops)

21
Denelcor HEP (Heterogeneous Element Processor,
early 80s)
  • Burton Smith, Designer
  • Multiple Instruction, Multiple Data (MIMD)
  • Fine-grain (instruction-level) and large-grain
    parallelism (16 processors)
  • Instructions from different programs ran in
    per-processor hardware queues (128 threads/proc)
  • Precursor to the Tera MTA (Multithreaded
    Architecture)
  • Full-empty bit for every memory location.
    Allowed fast synchronization
  • Important research machine

22
Caltech Cosmic Cube - 1983
  • Chuck Seitz (Founded Myricom) and Geoffrey Fox
    (Lattice gauge theory)
  • First Hypercube interconnection network
  • 8086/8087-based machine with Eugene Brooks'
    Crystalline Operating System (CrOS)
  • 64 Processors by 1983
  • About 15x cheaper than a VAX 11/780
  • Begat nCUBE, Floating Point Systems, Ametek,
    Intel Supercomputers (all dead companies)
  • In 1987, a vector coprocessor system achieved
    500 MFlops

23
Cray X-MP (1983) and Cray-2 (1985)
  • Up to 4-Way shared memory machines
  • This was the first supercomputer at SDSC
  • Best Performance (600 Mflop Peak)
  • Best Price/Performance of the time

24
Late 1980s
  • Proliferation of (now dead) parallel computers
  • CM-2 (SIMD) (Danny Hillis)
  • 64K bit-serial, 2048 Vector Coprocessors
  • Achieved 5.2 Gflops on Linpack (LU Factorization)
  • Intel iPSC/860 (MIMD - MPP)
  • 128 Processors
  • 1.92 Gigaflops (Linpack)
  • Cray Y-MP (Vector Super)
  • 8 processors (333 Mflops/proc peak)
  • Achieved 2.1 Gigaflops (Linpack)
  • BBN Butterfly (Shared memory)
  • Many others (long since forgotten)

25
Early 90s
  • Intel Touchstone Delta and Paragon (MPP)
  • Follow-on to the iPSC/860
  • 13.2 Gflops on 512 Processors
  • 1024 Nodes delivered to ORNL in 1993 (150 GFLOPS
    Peak)
  • Cray C-90 (Vector Super)
  • 16-processor update of the Y-MP
  • Extremely popular, efficient and expensive
  • Thinking Machines CM-5 (MPP)
  • Up to 16K Processors
  • 1024 Node System at Los Alamos National Lab

26
More 90s
  • Distributed Shared Memory
  • KSR-1 (Kendall Square Research)
  • COMA (Cache Only Memory Architecture)
  • University Projects
  • Stanford DASH Processor (Hennessy)
  • MIT Alewife (Agarwal)
  • Cray T3D/T3E: Fast processor mesh with up to 512
    Alpha CPUs

27
What Can You Buy Today? (not an exhaustive list)
  • IBM SP
  • Large MPP or Cluster
  • SGI Origin 2000
  • Large Distributed Shared Memory Machine
  • Sun HPC 10000: 64-Processor True Shared Memory
  • Compaq Alpha Cluster
  • Tera MTA
  • Multithreaded architecture (only one in existence)
  • Cray SV-1 Vector Processor
  • Fujitsu and Hitachi Vector Supers

28
Clusters
  • Poor man's Supercomputer?
  • A pile-of-PCs
  • Ethernet or high-speed (e.g., Myrinet) network
  • Likely to be the dominant high-end architecture.
  • Essentially a build-it-yourself MPP.

29
Next Time
  • Flynn's Taxonomy
  • Bit-Serial, Vector, Pipelined Processors
  • Interconnection Networks
  • Routing Techniques
  • Embedding
  • Cluster interconnects
  • Network Bisection