Title: High Performance Distributed Computing
1. High Performance Distributed Computing
- Andrew A. Chien
- CSE 225, Winter Quarter 1999
2. What is High Performance Distributed Computing?
- Traditional distributed computing
- Heterogeneity
- Local area networks or low-speed coupling
- Canonical problems in making systems work, and work reliably
- HPDC involves systems that have high levels of connectivity and are used to deliver high performance on single applications.
3. Examples
DC
- File service to an interactive Unix workgroup (50 clients) doing edit-compile-debug
- File service to 5000 clients doing edit-compile-debug, sending mail, reading mail, and surfing the network
- Data service to 100s of clients doing video editing on a shared movie image (temporal, spatial, or other partitioning)
- Data service to 1000s of clients doing immersive virtual reality capture and direct manipulation (with recording)
HPDC
4. How does this matter?
- The HP in HPDC requires the pervasive use of parallelism to achieve high performance.
- Processing, memory, networks, storage
- Additional challenges include
- Aggregation, fault tolerance, resource matching, parallelism matching, synchronous interaction
- Systems that differ by many orders of magnitude in performance (200 MIPS vs. 1 TeraOp, 10 Gbit vs. 10 Mbit vs. 30 Kbit)
- ...
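To make the performance spread on this slide concrete, the short Python sketch below computes how many orders of magnitude separate the quoted figures. It assumes, purely for illustration, that "200 MIPS" and "1 TeraOp" can be compared as raw operations per second and that the link figures are bits per second; the helper name `orders_of_magnitude` is hypothetical.

```python
import math


def orders_of_magnitude(fast: float, slow: float) -> float:
    """Return log10 of the ratio between two rates (hypothetical helper)."""
    return math.log10(fast / slow)


# Figures quoted on the slide, converted to base units.
# Assumption: instructions/s and operations/s are treated as comparable.
MICRO_OPS = 200e6   # 200 MIPS
SUPER_OPS = 1e12    # 1 TeraOp
LINK_BITS = {"10 Gbit": 10e9, "10 Mbit": 10e6, "30 Kbit": 30e3}

compute_gap = orders_of_magnitude(SUPER_OPS, MICRO_OPS)
network_gap = orders_of_magnitude(LINK_BITS["10 Gbit"], LINK_BITS["30 Kbit"])

print(f"compute gap: {compute_gap:.1f} orders of magnitude")
print(f"network gap: {network_gap:.1f} orders of magnitude")
```

Under these assumptions the network spread is wider than the compute spread, which is one reason resource matching appears among the challenges.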
5. Roots of the Grid
- Heterogeneous Distributed Computing
- Making unlike operating systems talk with each other
- Digital VAX/VMS, IBM 360, Prime, Data General, Hewlett-Packard, BUNCH, ...
- Unixes and PCs
- RPC, IDL, Distributed Objects
- => Working systems, but generally not focused on highest performance
- => Asynchronous implementation model
- => Static resource model, moderate heterogeneity in retrospect
6. Roots of the Grid (cont.)
- Metacomputing for High Performance Computing
- Making unlike supercomputers talk to one another, and cooperate on a job simultaneously
- First, traditional distributed-systems kinds of ideas: Cray computational server, Convex file and storage server
- Later, exploiting performance heterogeneity: vector code on a vector machine, massively parallel code on a massively parallel machine
- Now, innovative online applications: online data processing and instrument control, interactive visualization and direct manipulation environments, immersive interaction coupled with distributed interactive simulation
7. Roots of the Grid (cont. II)
- => Coupling via high-speed networks (HiPPI, a 100 MB/s I/O interconnect) from the late 80s
- => Only today are ATM, GigE, and other networks faster, delivering this bandwidth over the wide area
- => Why would you want to do these applications synchronously?
- So, what changed? (many things)
- Headlines from the late 1990s...
8. Advent of High-Speed Networking (LAN/MAN/WAN)
- Terabit fiber speeds, no technology barriers
- Legal, organizational, market obstacles remain
- Networks are faster than computers (fundamental physics favors this)
- Installed bandwidth exploding (100x/year)
- => we are moving from a sparsely connected to a richly connected world
9. Cheap, Ubiquitous Computing/Electronics
- Proliferation of computing devices (embedded and non-embedded)
- Proliferation of data-acquisition / sensor / point-of-sale / monitoring devices
- Computing is everywhere (increases usability)
- Laptops, ATMs, cell phones, automobiles, children's toys, etc.
- Electronics/computing generates huge amounts of data (something to compute on)
- Video cameras, credit card terminals, Hubble Space Telescope, EOSDIS system (1 TB/day!), smart cards (and dumb cards too); things that watch the world; things that watch the computing; simulations of things that might be; etc.
- Can embed smarts to solve lots of problems
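As a rough illustration of what a figure like "1 TB/day" means for networking, the sketch below converts a daily data volume into a sustained bit rate and into transfer times over two link speeds. It assumes a decimal terabyte (10^12 bytes) and links running at their full nominal rate with no protocol overhead; the function names and the chosen link speeds are illustrative, not taken from any particular system.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mbit_per_s(bytes_per_day: float) -> float:
    """Average bit rate (Mbit/s) needed to move a daily volume within one day."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


def transfer_days(num_bytes: float, link_bits_per_s: float) -> float:
    """Days needed to push num_bytes through a link at its full nominal rate."""
    return num_bytes * 8 / link_bits_per_s / SECONDS_PER_DAY


ONE_TB = 1e12  # assumption: decimal terabyte

rate = sustained_mbit_per_s(ONE_TB)
print(f"1 TB/day needs about {rate:.1f} Mbit/s sustained")

# Illustrative link speeds (assumed to be fully available to one transfer).
for name, bits_per_s in [("10 Mbit/s", 10e6), ("10 Gbit/s", 10e9)]:
    print(f"{name}: {transfer_days(ONE_TB, bits_per_s):.4f} days per TB")
```

Under these assumptions a 10 Mbit/s link cannot keep up with a 1 TB/day source, while a 10 Gbit/s link moves a day's data in minutes.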
10. Micros Catch up with Supercomputers
- Physics and scaling of CMOS technologies
- Caches, locality, and clock rates
- Workstation, then PC market volumes, now Sony Playstations? Games?
- => high performance achieved by aggregating large numbers of small processors
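A back-of-the-envelope sketch of the aggregation argument: how many small processors would it take to reach a target rate? The example reuses the 200 MIPS and 1 TeraOp figures quoted earlier in these slides and treats them as comparable operations per second. The `efficiency` parameter is an assumed, simplified stand-in for parallel overheads, and `processors_needed` is a hypothetical helper.

```python
import math


def processors_needed(target_ops: float, per_cpu_ops: float,
                      efficiency: float = 1.0) -> int:
    """Processors required to reach target_ops at a given parallel efficiency.

    efficiency is the assumed fraction of each processor's peak rate that
    is actually delivered when running in parallel (0 < efficiency <= 1).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return math.ceil(target_ops / (per_cpu_ops * efficiency))


TARGET = 1e12    # 1 TeraOp
MICRO = 200e6    # 200 MIPS microprocessor

ideal = processors_needed(TARGET, MICRO)            # perfect scaling
half = processors_needed(TARGET, MICRO, 0.5)        # assumed 50% efficiency

print(f"ideal scaling: {ideal} processors")
print(f"50% efficiency: {half} processors")
```

Even with perfect scaling the count runs into the thousands, which is why parallelism and its overheads are unavoidable at this scale.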
11. What does all this mean?
- Rich connectivity means many more interesting networked applications are now possible.
- Ex: Distributed collaboration with high modality
- Cheap ubiquitous electronics means we are drowning in a sea of data
- Many orders of magnitude more than humans can input/type
- also enables a raft of new applications involving sensors, actuators, and humans
- Micros as building blocks (PCs as building blocks, fast!)
- Units of assembly are small, so we must deal with parallelism and the issues thereof
- and much more...
12. What is a Grid?
- Analogy to electrical power grids
- Electrical power generation without distribution
- The electrical power grid (EPG) is infrastructure for electric power sharing that made power universally accessible
- generation decoupled from use
- large-scale resource pooling and sharing possible
- catalyzes markets for generators (how many types?)
- catalyzes markets for appliances (how many types?)
- enables new uses not previously conceived
- but don't take this too literally...
13. What is a Computational Grid?
- Infrastructure for computing resources that enables similar benefits
- decouple provision of computing / data / networking from use
- large-scale resource pooling and sharing
- catalyzes markets for generators?
- catalyzes markets for appliances (consumers)?
- enables new uses?
14. Where does the analogy work well?
15. Where does the analogy break down?
- Homogeneity (power is homogeneous; computing resources are not)
- computations do I/O and have data
- large dynamic range, want this immediately
- computing resources are multidimensional (power not as much so)
- more centralized administrative control in the power grid (will this be true for the Internet? For the grid?)
16. Example Application
Instruments/Actuators
- Online intelligent control of instrument/computation/vehicle, etc.
- Collaborative control, from multiple sites with different perspectives
- Humans, computers, various forms of feedback
- Wealth of new applications for Science, Industry, Training, Games, etc.
17. Key Grid Requirements
- Dependable service (predictable, sustained, high performance)
- Consistency (standards)
- Pervasive
- Inexpensive
18. Dimensions of Grid Activities
- Distributed Supercomputing (wide-area aggregation)
- High Throughput Computing
- On-Demand Computing (peaks)
- Data-intensive Computing
- Collaborative Computing
- Others?
19. What challenges do Grids present?
- What can you do today?
- Challenges
- Protection and access
- Heterogeneity
- Orchestrating performance
20. Grid Research Efforts and Initiatives
- the Alliance (NCSA) and NPACI (SDSC) under NSF support
- the NASA IPG effort
- the Department of Energy (Clipper, Super Vis Corridors, DISCOM, etc.)
- ... and ... many international efforts.
21. Commercially Related Efforts
- Sun's Jini (JavaSpaces, EJB, resource discovery)
- Microsoft's Millennium (ubiquitous distributed computing)
- CORBA and DCOM evolution...
- => despite all the marketing hype, they don't know all the answers
- => very easy to influence industry these days...
22. CSE225: High Performance Distributed Computing
- What's the course about?
- Who should be in the course?
- Takeaways
23. CSE225 Topics
- Focus on Grid Computing for high performance systems
- High Performance Networking
- Resource Discovery and other Grid services
- Achieving Performance
- Coscheduling, Application Scheduling
- Research and commercial systems
- => focus on systems and delivering performance (and then increasing domain of interest)
- Possible topics
- too long to list
24. CSE225 Organization and Workload
- Two lectures/discussions per week
- Per-class reading assignments
- Course Textbook
- Homeworks (every 1-2 weeks)
- Quarter Project
- => Homeworks and project will be done in groups
25. Should you be in this course?
- Good reasons to be in this course
- Interested in Grids as a research area
- Interested in broadening/deepening my graduate education
- Enjoy building/studying/understanding systems
- Bad reasons to be in this course
- Need a few more units to graduate, it fit my schedule, seems like it might be easy (it won't be :-)
- Folks who probably shouldn't be in this course
- Part-timers, auditors, folks who can't commit a significant regular increment of time (won't work with the teams)
26. Takeaways
- What you should get out of this course
- Introduction/exposure to state-of-the-art research in Computational Grid systems
- Familiarity with and insight into the key research problems
- A perspective on the field as a whole and how the problems studied relate to the commercial state of the art
- Experience with some research Grid systems
27. Questions
- Administrative?
- Technical?
- Other?
28. Handouts
- Course Handout
- Policies and Information
- Tentative syllabus
- Readings for next time
- Grid Book, Chapters 1-3
- Communications of the ACM, November 1997, 40(11), "Computational Infrastructure: Toward the 21st Century" (articles by Smarr, Smith, Reed, Stevens, and Kennedy)
- Optional reading: articles on applications by McRae and Ostriker/Norman, and Communications of the ACM, November 1998, 41(11), "High Performance Computing Continuum" (articles from the National Partnership for Advanced Computational Infrastructure, the other NSF PACI)
30. Perspective on Academic and Commercial Activities
- Exciting area with lots of open problems
- Academic and commercial efforts typically have complementary goals
- Academic: deep understanding, find best solution, make it known to all
- Commercial: quick win, find dominant solution, sell it to all