1
Large-Scale Scientific Computing 1946-2006
John G. Zabolitzky
2
Segments of Computation
  • 1. Scientific vs. Commercial vs. Consumer vs.
    Embedded
  • Solution of technical/scientific problems like
    weather, fluid dynamics, nuclear reactor
    simulation (usually involving many complicated
    operations on real floating-point numbers), as
    opposed to commercial problems like accounting,
    inventory, banking (usually involving characters
    and few, simple operations on fixed-point
    numbers). Not considering consumer or embedded
    applications like music, movies, games, web
    servers, dishwashers, coffeemakers, automotive.
  • 2. Large-Scale vs. Small/Medium-Scale
  • Looking at the largest problems which can be
    treated in the current year. Not looking at
    small-scale problems, e.g. laboratory automation,
    a student paper, or a small research problem.
    (M problems, not k problems)
  • 3. Mainstream vs. Experimental, Unique, small
    market share machines
  • Machines which have had a major influence on
    science/technology in general on a broad scale.
  • 4. What is a Computer?
  • Stored-program (not fixed, not external),
    electronic (not electromechanical) computer

3
First 30 Years: Timeline 1946-1975, Scalar ("von
Neumann") Computing
  • 1946 Zuse (electromechanical), ENIAC (wired
    program), Whirlwind ... early attempts
  • 1950 ERA 1101 (Atlas 1)
  • 1953 ERA 1103 (Atlas 2), IBM 701 ("defense
    calculator")
  • 1957 IBM 709
  • 1959 CDC 1604
  • 1960 IBM 7090 709t
  • 1962 IBM 7094
  • 1963 CDC 3600
  • 1964 CDC 6600
  • 1965 IBM /360 family
  • 1969 CDC 7600
  • 1971 IBM /360-195

4
ERA 1101 (1950)
  • Vacuum tubes
  • 2 registers (A(48), Q(24))
  • 24-bit binary parallel
  • Drum memory, 16k words
  • 4,400 add/mul/sec
1-arithmetic section; 2-power supply; 3-control
section; 4-maintenance section; 5-memory,
electronic section; 6-memory, drum section; 7-heat
transfer unit; 8, 9-control, paper tape
reader/punch
5
ERA 1103 (1953)
  • Vacuum tubes
  • 2 registers (A(72), Q(36))
  • 36-bit binary parallel
  • Williams tube memory (CRT tube memory), 1k words
  • Drum memory, 16k words
  • 4,400 add/mul/sec
6
IBM 701 ("defense calculator") (1953)
  • Vacuum tubes
  • 2 registers (A(38), Q(36))
  • 36-bit binary parallel
  • Williams tube memory (CRT tube memory), 2k words
  • Drum memory, 8k words
  • 4,000 add/mul/sec
7
IBM 709 (1957)
  • Vacuum tubes
  • 5 registers (A(38), Q(36), 3 index)
  • 36-bit binary parallel
  • Magnetic core memory, 4/8/32k words
  • Drum memory, 8/16k words
  • 5,500 add/mul/sec
8
CDC 1604 (1959)
  • Discrete transistors
  • 8 registers (A(96), Q(48), 6 index)
  • 48-bit binary parallel
  • Magnetic core memory, 32k words
  • 40k add/mul/sec
9
IBM 7090 (1960)
  • Discrete transistors
  • 5 registers (A(38), Q(36), 3 index)
  • 36-bit binary parallel
  • Magnetic core memory, 32k words
  • 40k add/mul/sec
10
IBM 7094 (1962)
  • Discrete transistors
  • 9 registers (A(38), Q(36), 7 index)
  • 36-bit binary parallel
  • Magnetic core memory, 32k words
  • 80k add/mul/sec
11
CDC 6600 (1964)
  • Discrete transistors
  • 32 registers (8 X, 8 A, 8 B, 8 instruction stack)
  • 60-bit binary parallel
  • Magnetic core memory, 128k words
  • 1 MFLOPS
  • First fluid-cooled machine
12
CDC 6600 10 core modules - each 6 kByte - 130
modules total 2 logic frames
13
discrete wire mat vector graphic console
14
"Last week Control Data ... announced the 6600
system. I understand that in the laboratory
developing the system there are only 34 people
including the janitor. Of these, 14 are engineers
and 4 are programmers ... Contrasting this
modest effort with our vast development
activities, I fail to understand why we have lost
our industry leadership position by letting
someone else offer the world's most powerful
computer."
-- Thomas Watson, CEO of IBM, 1964
"It seems like Mr. Watson has answered his
own question."
-- Seymour Cray, Control Data Corporation
16
CDC 7600 (1969)
  • The 7600 has a hardware structure similar to
    that of the 6600 (discrete transistors), with
    some improvements
  • - 12-word instruction stack (was 8 words), for a
    total of 36 "registers"
  • - 275 nsec small core memory cycle time (64 kW;
    the 6600 had 1000 nsec, 128 kW), large core
    memory 512 kW
  • - 36 MHz clock (was 10 MHz)
  • - more consistently pipelined functional units
  • - faster peripheral processors

17
IBM /360-195 (1971)
  • Integrated circuits
  • 20 registers (16 GP, 4 FP)
  • 32/64-bit binary parallel
  • Magnetic core memory, 1 Mword max, 756 nsec
  • Silicon cache, 32 kByte (4 kword), 54 nsec
  • Model 195: hidden registers in CPU to overcome
    /360 limitations
18
Compiled by Erich Strohmaier
19
Second 30 Years: Timeline 1976-2006, Vector and
Parallel Computing
  • 1976 Cray-1 first successful vector computer
    (50 MFLOPS)
  • 1982 Cray X-MP first multiple-processor
    shared-memory vector computer
  • 1985 Cray-2 large memory (256 MW = 2 GByte)
  • 1988 Cray Y-MP first to break the 1 GFLOPS
    barrier
  • 1993 Cray T3D first successful massively
    parallel machine, 3D-Torus
  • 16 x 1 GFLOPS < 512 x 0.150 GFLOPS = 76 GFLOPS
  • 1995 Cray T3E most widely sold MPP machine,
    broke the 1 TFLOPS barrier
  • 1700 x 1.2 GFLOPS = 2 TFLOPS
  • 2004 IBM Blue Gene/L world performance leader
    (development started 1999)
  • IBM today has dominant market share (> 50%)
  • leadership recovered after 40 years of
    CDC/Cray dominance
  • same interconnect structure as Cray T3D/T3E
    (3D-Torus)
  • 2006 lowest-power processors (64k x 5 GFLOPS
    = 320 TFLOPS)
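
The aggregate figures in the timeline above are simple products of processor count and per-processor peak. A minimal sketch of that arithmetic follows; the dictionary merely restates the slide's numbers, "64k" is read as 64,000 (an assumption), and the function name is hypothetical. These are theoretical peaks, not measured results.

```python
# Peak performance as processor count times per-processor peak (in GFLOPS).
# The entries restate the figures quoted on the slide; nothing is measured.
SYSTEMS = {
    "Cray Y-MP": (16, 1.0),
    "Cray T3D": (512, 0.150),
    "Cray T3E": (1700, 1.2),
    "IBM Blue Gene/L": (64_000, 5.0),  # "64k" read as 64,000 (assumption)
}


def aggregate_peak_gflops(processors: int, gflops_each: float) -> float:
    """Theoretical peak: every processor running at its own peak at once."""
    return processors * gflops_each


for name, (count, each) in SYSTEMS.items():
    total = aggregate_peak_gflops(count, each)
    print(f"{name:16s} {count:6d} x {each:5.3f} GFLOPS = {total:10.1f} GFLOPS")
```

The slide rounds 76.8 GFLOPS down to 76 and 2040 GFLOPS to 2 TFLOPS.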

20
Seymour Cray, Cray-1 (1976)
  • Single processor
  • 80/160 MFLOPS peak
  • 1 Mword = 8 MByte
Photograph courtesy of Charles Babbage Institute,
University of Minnesota, Minneapolis
21
MUCH larger working set
  • 8 vector registers, 64 words each
  • 8 scalar registers
  • 8 address registers
  • large instruction buffer
Performance features
  • vector processing: one operation affects 64
    vector elements, streamed through the functional
    unit
  • small vector startup time
  • chaining between vector ops
  • large, fast semiconductor memory
  • requires vectorization effort
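
The "vectorization effort" mentioned above largely means restructuring loops so that each step works on up to 64 elements at a time, the length of one vector register. The following is a rough illustration in Python, not Cray code: `strip_mined_add` and `scalar_add` are hypothetical names, and the inner list comprehension only stands in for a single vector instruction streaming through a functional unit.

```python
VECTOR_LENGTH = 64  # elements per vector register, as stated on the slide


def scalar_add(a, b):
    """Scalar style: one add per loop iteration."""
    out = []
    for x, y in zip(a, b):
        out.append(x + y)
    return out


def strip_mined_add(a, b):
    """Vector style: loop over strips of at most 64 elements.

    Each strip stands in for one vector instruction; the comprehension
    models the functional unit streaming through the strip.
    """
    out = []
    strips = 0
    for start in range(0, len(a), VECTOR_LENGTH):
        stop = start + VECTOR_LENGTH
        out.extend([x + y for x, y in zip(a[start:stop], b[start:stop])])
        strips += 1
    return out, strips


a = [float(i) for i in range(200)]
b = [2.0 * i for i in range(200)]
result, strips = strip_mined_add(a, b)
print(strips)  # 200 elements need 4 strips: 64 + 64 + 64 + 8
```

Both functions return the same values; the difference is that the strip-mined form issues 4 vector-sized steps instead of 200 scalar ones.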
22
Cray X-MP (1982)
  • 4 processors
  • 800 MFLOPS
  • 16 Mword = 128 MByte
23
Cray-2 (1985)
  • 4 processors
  • 1200 MFLOPS
  • 256 Mword = 2 GByte
24
Minnesota Supercomputer Center, Minneapolis, 1986:
CDC Cyber 205, Cray-2 (4), Cray-2 (1)
25
Cray Y-MP (1988)
  • 8/16 processors
  • 1-16 GFLOPS
  • 16 Mword-1 Gword = 128 MByte-8 GByte
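
The paired memory sizes quoted for the Cray vector machines (1 Mword and 8 MByte, 256 Mword and 2 GByte, and so on) all follow from a 64-bit word, i.e. 8 bytes per word. A small sketch of the conversion, with a hypothetical helper name and the slide's figures restated as data:

```python
BYTES_PER_WORD = 8  # 64-bit word, implied by the slide's 1 Mword / 8 MByte


def mwords_to_mbytes(mwords: int) -> int:
    """Convert a memory size in Mwords to MBytes for 64-bit words."""
    return mwords * BYTES_PER_WORD


# Memory sizes in Mwords as quoted on the slides (1 Gword taken as 1024 Mword).
MACHINES = [
    ("Cray-1", 1),
    ("Cray X-MP", 16),
    ("Cray-2", 256),
    ("Cray Y-MP (max)", 1024),
]

for machine, mwords in MACHINES:
    mbytes = mwords_to_mbytes(mwords)
    print(f"{machine:16s} {mwords:5d} Mword = {mbytes:5d} MByte")
```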
26
Cray T3D (1993)
First widely successful massively parallel system
  • 512 x 150 MFLOPS = 76 GFLOPS
  • 4 Gword = 32 GByte distributed memory
  • 3D Torus interconnect
  • MPP requires massive software effort
27
Cray T3E (1995)
Most successful massively parallel system in the
1990s
  • 2048 x 1200 MFLOPS = 2.4 TFLOPS max. (8 cabinets)
  • 64 Gword, 256 GByte distributed memory (large
    end of config.)
  • 3D Torus interconnect
3 cabinets, 768 processors
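
In a 3D torus interconnect, each node is linked to its two nearest neighbors along each of three axes, and the links wrap around at the edges so that every axis forms a ring. The sketch below illustrates that addressing scheme only; the 8 x 8 x 8 shape and the function names are assumptions for illustration and are not taken from the slides or from Cray documentation.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbors of `node` in a 3D torus.

    The modulo operation models the wraparound links that close
    each axis into a ring.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimum hop count, taking the shorter way round each ring."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


DIMS = (8, 8, 8)  # hypothetical shape: 8 * 8 * 8 = 512 nodes
print(torus_neighbors((0, 0, 0), DIMS))
print(torus_hops((0, 0, 0), (7, 4, 1), DIMS))  # 1 + 4 + 1 = 6
```

Thanks to the wraparound, no two nodes in this example shape are more than 12 hops apart, whereas the same grid without wraparound links would have a worst case of 21.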
28
Cache not always useful
Latency, congestion not discussed here
29
From Thomas Lippert, FZJ
30
From Thomas Lippert, FZJ
31
From Thomas Lippert, FZJ 1 MW 1 M/year !!
32
After 40 years (1964 - 2004) of CDC - Cray
(vector) dominance, IBM has regained the market
leadership.
Low-power technology is the key to success
  • high density → fast communication
  • low utility cost, low building cost
Scalar → Vector → Parallel: increasing burden on
programmer to obtain performance/efficiency