Algorithms and Tools for (Distributed) Heterogeneous Computing - PowerPoint PPT Presentation

1
Algorithms and Tools for (Distributed)
Heterogeneous Computing
  • Yves ROBERT
  • www.ens-lyon.fr/yrobert

2
Contents
  • Framework
  • Hardware, system and administration issues,
    applications
  • Programming environments
  • Globus, Legion, Albatross, AppLeS, NetSolve
  • Algorithmic and programming aspects
  • Data decomposition techniques for cluster
    computing
  • Granularity issues for metacomputing
  • Scheduling and load-balancing methods
  • Conclusion

3
Bibliography
  • Books
  • The Grid: Blueprint for a New Computing
    Infrastructure
  • High Performance Cluster Computing. Vol. 1:
    Architecture and Systems; Vol. 2: Programming
    and Applications. R. Buyya (ed.), Prentice
    Hall, 1999.
  • Journals
  • Blueprint for the future of high-performance
    computing
  • The high-performance computing continuum,
    CACM, Nov. 1997 and Nov. 1998

4
The Grid: Blueprint for a New Computing
Infrastructure. I. Foster, C. Kesselman (eds),
Morgan Kaufmann, 1999
  • ISBN 1-55860-475-8
  • 22 chapters by expert authors including Andrew
    Chien, Jack Dongarra, Tom DeFanti, Andrew
    Grimshaw, Roch Guerin, Ken Kennedy, Paul Messina,
    Cliff Neuman, Jon Postel, Larry Smarr, Rick
    Stevens, and many others

5
Bibliography (contd)
  • Web
  • NPACI (National Partnership for Advanced
    Computational Infrastructure) www.npaci.edu
  • An Overview of Computational Grids and Survey of
    a Few Research Projects, Jack Dongarra
    http://www.netlib.org/utk/people/JackDongarra/talks.html
  • LIP Report 99-36
  • Algorithms and Tools for (Distributed)
    Heterogeneous Computing A Prospective Report
    www.ens-lyon.fr/yrobert

6
Framework

7
Metacomputing
  • Future of parallel computing: distributed and
    heterogeneous
  • Metacomputing: making use of distributed
    collections of heterogeneous platforms
  • Target: tightly-coupled high-performance
    distributed applications (rather than
    loosely-coupled cooperative applications)

8
Metacomputing Platforms (1)
  • Low end of the field: cluster computing with
    heterogeneous networks of workstations or PCs
  • Ubiquitous in university departments and
    companies
  • The typical poor man's parallel computer
  • Running large PVM or MPI experiments
  • Make use of all available resources: slower
    machines in addition to more recent ones

9
Metacomputing Platforms (2)
  • High end of the field: computational grid linking
    the most powerful supercomputers of the largest
    supercomputing centers through dedicated
    high-speed networks
  • Middle of the field: connecting medium-size
    parallel servers (equipped with
    application-specific databases and
    application-oriented software) through fast but
    non-dedicated links, thus creating a meta-system

10
High end (1)
  • Globus Ubiquitous Supercomputing Testbed
    Organization (GUSTO)
  • November 1998, 70 institutions, 3 continents
  • 17 sites, 330 supercomputers (over 3600
    processors)
  • Aggregate global power in excess of 2 teraflops!

11
High end Gusto (2)
12
Low end (1)
  • Distributed ASCI Supercomputer (DAS)
  • Common platform for research
  • (Wide-area) parallel computing and distributed
    applications
  • November 1998, 4 universities, 200 nodes
  • Per node:
  • 200 MHz Pentium Pro
  • 128 MB memory, 2.5 GB disk
  • Myrinet 1.28 Gbit/s (full duplex)
  • Operating system: BSD/OS
  • ATM network

13
Low end (2)
14
Administrative Issues (1)
  • Intensive computations on a set of processors
    across several countries and institutions
  • Strict rules to define the (good) usage of shared
    resources. These rules must be guaranteed by the
    runtime, together with methods to migrate
    computations to other sites whenever some local
    request is raised

15
Administrative Issues (2)
  • A major difficulty is to avoid a large increase
    in the administrative overhead
  • Each user cannot have an account on each machine
    on the network
  • A single meta-user cannot be the one and only
    authorized user on the whole set of machines
  • Challenge: find a tradeoff that does not
    increase the administrative load while preserving
    the users' security

16
Tomorrow's Virtual Super-Computer (1)
  • The Web (and its associated databases) is built
    using:
  • A set of disks to store the data
  • A network infrastructure enabling a large number
    of users to access this data
  • Metacomputing:
  • Using the computing power of the computers linked
    by the Internet to execute various applications
    (numerically-intensive applications first, but
    many others to follow)
  • The Internet will slowly evolve into a virtual
    super-computer

17
Tomorrow's Virtual Super-Computer (2)
  • Metacomputing applications will execute on a
    hierarchical grid
  • Interconnection of clusters scattered all around
    the world
  • A fundamental characteristic of the virtual
    super-computer
  • A set of strongly heterogeneous and
    geographically scattered resources

18
Algorithmic and Software Issues (1)
Whereas the architectural vision is clear, the
software developments are not so well understood
19
Algorithmic and Software Issues (2)
  • Low end of the field:
  • Cope with heterogeneity
  • Major algorithmic effort to be undertaken
  • High end of the field:
  • Logically assemble the distributed computers:
    extensions to PVM and MPI to handle distributed
    collections of clusters
  • Configuration and performance optimization:
  • Inherent complexity of networked and
    heterogeneous systems
  • Resources often identified only at runtime
  • Dynamic nature of resource characteristics

20
Algorithmic and Software Issues (3)
  • High-performance computing applications must:
  • Configure themselves to fit the execution
    environment
  • Adapt their behavior to subsequent changes in
    resource characteristics
  • Parallel environments have focused on strongly
    homogeneous architectures (processor, memory,
    network):
  • Array and loop distribution, parallelizing
    compilers, HPF constructs, gang scheduling, MPI

However, metacomputing platforms are strongly
heterogeneous!
21
Applications (1)
  • All applications involving parallel computing.
    Performance problems are due to:
  • Using a network of heterogeneous machines
  • Relying on current (limited) programming
    environments
  • Classical applications such as the grand
    challenges can be ported to metacomputing
    platforms
  • Forget fine-grain parallelism: there is a deep
    hierarchy between all memory and communication
    layers
  • Code coupling: the nicest application for
    metacomputing

22
Applications (2)
  • Other applications outside the world of numerical
    (or scientific) computing:
  • databases, decision-support systems
  • all kinds of multimedia servers (PPI project at
    Caltech)
  • Best candidates: loosely-coupled applications
  • All kinds of decomposition (functional, pipeline,
    data-parallel, macrotasking, ...)
  • The actual challenge: implementation of
    tightly-coupled applications

23
Programming environments

24
Programming Models (1)
  • Extensions of MPI:
  • MPI_Connect, Nexus, PACX-MPI, MPI-Plus,
    Data-Exchange, VCM, MagPIe, ...
  • Globus: a layered approach
  • Fundamental layer: a set of core services,
    including resource management, security, and
    communications, that enables the linking and
    interoperation of distributed computer systems

25
Programming Models (2)
  • Object-oriented technologies to cope with
    heterogeneity
  • Encapsulate "technical details" such as
    protocols, data representations, migration
    policies
  • Legion builds on Mentat, an object-oriented
    parallel processing system
  • Albatross relies on a high-performance Java
    system, with a very efficient implementation of
    Java Remote Method Invocation

26
Programming Models (3)
  • Far from achieving the ultimate goal:
  • Using the computing resources remotely and
    transparently, just as we do with electricity,
    without knowing where it comes from

27
References
  • Globus: www.globus.org
  • Legion: www.cs.virginia.edu/legion
  • Albatross: www.cs.vu.nl/bal/albatross
  • AppLeS: www-cse.ucsd.edu/groups/hpcl/apples/apples.html
  • NetSolve: www.cs.utk.edu/netsolve

28
Case study Globus
  • A big machinery
  • A sophisticated machinery
  • The most widely used testbed

29
Layered Architecture
(Diagram of the layered architecture)
  • Applications
  • High-level services and tools: GlobusView,
    Testbed Status, DUROC, globusrun, MPI, Nimrod/G,
    MPI-IO, CC++
  • Core services: GRAM, Nexus, Metacomputing
    Directory Service, Globus Security Interface,
    Heartbeat Monitor, Gloperf, GASS
30
Core Globus Services
  • Communication infrastructure (Nexus)
  • Information services (MDS)
  • Network performance monitoring (Gloperf)
  • Process monitoring (HBM)
  • Remote file and executable management (GASS and
    GEM)
  • Resource management (GRAM)
  • Security (GSI)

31
Running a Program
  • Goal: run a Message Passing Interface (MPI)
    program on multiple computers
  • MPICH-G uses Globus for authentication, resource
    allocation, executable staging, output
    redirection, etc.

mpirun -np 4 my_app
32
Globus Components in Action
(Diagram: mpirun invokes globusrun, which calls DUROC.
DUROC contacts one GRAM per site, each interfacing to a
local scheduler (fork, LSF, or LoadLeveler) to start
the processes P1 and P2 on each machine; the processes
then communicate through Nexus)
33
DUROC Review
  • Simultaneous allocation of a resource set
  • Handled via optimistic co-allocation based on
    free nodes or queue prediction
  • In the future, advance reservations will also be
    supported
  • globusrun will co-allocate specific
    multi-requests
  • Uses a Globus component called the
    Dynamically-Updated Request Online Co-allocator
    (DUROC)

34
Using Information for Resource Brokering
(Diagram: a Resource Broker handles info service
location and selection by querying the Metacomputing
Directory Service: what computers? what speed? when
available? A sample request, "50 processors + storage
from 10:20 to 10:40 pm" over a 20 Mb/sec link, is then
passed to the GRAMs (Globus Resource Allocation
Managers), which interface to local schedulers: fork,
LSF, EASY-LL, Condor, etc.)
35
Examples of Useful Information
  • Characteristics of a compute resource
  • IP address, software available, system
    administrator, networks connected to, OS version,
    load
  • Characteristics of a network
  • Bandwidth and latency, protocols, logical
    topology
  • Characteristics of the Globus infrastructure
  • Hosts, resource managers

36
Metacomputing Directory Service
  • Store information in a distributed directory
  • Directory stored in collection of servers
  • Directory can be updated by
  • Globus system
  • Other information providers and tools
  • Applications (i.e., users)
  • Information dynamically available to
  • Tools
  • Applications

37
Remote Service Request
  • Sending side: init_rsr(), put_int(), put_float(),
    send_rsr()
  • Receiving side: handler, get_int(), get_float()
  • Allow the communication method to be selected
    independently, either automatically or manually
(Diagram: at the application level, a startpoint sends
to an endpoint; at the implementation level, the
communication method is selected among the available
methods)
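The buffered put/send/get style above can be illustrated with a minimal in-process sketch. This is hypothetical code (the class names, a fixed big-endian wire format, and the in-memory "transport" are all assumptions for illustration), not the actual Nexus API or implementation:

```python
import struct

class RequestBuffer:
    """Hypothetical model of a remote-service-request buffer:
    arguments are packed in an architecture-independent format."""
    def __init__(self):          # plays the role of init_rsr()
        self.data = b""

    def put_int(self, value):
        self.data += struct.pack(">i", value)   # big-endian 32-bit int

    def put_float(self, value):
        self.data += struct.pack(">d", value)   # big-endian 64-bit float

class Endpoint:
    """Receiving side: the handler unpacks arguments in order."""
    def __init__(self, data):
        self.data = data
        self.offset = 0

    def get_int(self):
        value, = struct.unpack_from(">i", self.data, self.offset)
        self.offset += 4
        return value

    def get_float(self):
        value, = struct.unpack_from(">d", self.data, self.offset)
        self.offset += 8
        return value

# Startpoint side: build and "send" the request
buf = RequestBuffer()
buf.put_int(42)
buf.put_float(3.5)

# Endpoint side: the handler unpacks in the same order
ep = Endpoint(buf.data)
assert ep.get_int() == 42
assert ep.get_float() == 3.5
```

Packing into a self-describing, architecture-independent byte format is what lets the communication method (the actual transport) be chosen independently of the application-level calls.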
38
Algorithmic issues

39
Data Decomposition Techniques for Cluster
Computing
  • The block-cyclic distribution paradigm is the
    preferred layout for data-parallel programs (HPF,
    ScaLAPACK)
  • It evenly balances the total workload only if all
    processors have the same speed
  • Extending ScaLAPACK to heterogeneous clusters
    turns out to be surprisingly difficult

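For reference, the homogeneous block-cyclic mapping these libraries assume is easy to state; here is a minimal 1D sketch (illustrative code, not taken from HPF or ScaLAPACK):

```python
def block_cyclic_owner(i, block_size, p):
    """Owner of global index i under a 1D block-cyclic
    distribution over p processes: blocks of consecutive
    indices are dealt out to processes in round-robin order."""
    return (i // block_size) % p

# With block size 2 and 3 processes, indices map as
# 0,1 -> P0   2,3 -> P1   4,5 -> P2   6,7 -> P0 ...
owners = [block_cyclic_owner(i, 2, 3) for i in range(8)]
assert owners == [0, 0, 1, 1, 2, 2, 0, 0]
```

The round-robin dealing is exactly why the layout balances the load only for equal-speed processors: every processor receives the same number of blocks regardless of how fast it is.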
40
Algorithmic challenge
  • Bad news: designing a matrix-matrix product or a
    dense linear solver proves to be a hard task on a
    heterogeneous cluster!
  • Next problems:
  • Simple linear algebra kernels on a collection of
    clusters (extending the platform)
  • More ambitious routines, composed of a variety
    of elementary kernels, on a heterogeneous cluster
    (extending the application)
  • Implementing more ambitious routines on more
    ambitious platforms (extending both)

41
Scheduling (1)
  • Two-step clustering heuristics
    for classical parallel machines
  • Difficult to trade off parallelism and
    communication, even in the presence of
    unlimited resources

42
Scheduling (2)
  • Heterogeneity poses new challenges to scheduling
    techniques
  • Clustering with unlimited resources no longer
    makes sense
  • Sophisticated scheduling heuristics are
    available, such as a dynamic remapping of tasks
    after a first allocation has been computed from
    critical paths

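To give the flavor of such heuristics, here is a simplified greedy list-scheduling sketch for heterogeneous processors. It is an illustration only, not one of the published heuristics: it ignores communication costs and uses a plain topological order rather than critical-path priorities.

```python
from collections import deque

def list_schedule(work, deps, speeds):
    """Greedy list scheduling on heterogeneous processors:
    walk the task graph in topological order and map each task
    to the processor that would finish it earliest.
    work[t]   = amount of work of task t
    deps[t]   = list of predecessors of t
    speeds[p] = speed of processor p (work units per time unit)."""
    indeg = {t: len(deps.get(t, [])) for t in work}
    succ = {t: [] for t in work}
    for t, preds in deps.items():
        for p in preds:
            succ[p].append(t)
    ready = deque(t for t in work if indeg[t] == 0)
    finish = {}                 # task -> (processor, finish time)
    free = [0.0] * len(speeds)  # processor -> time it becomes free
    while ready:
        t = ready.popleft()
        # Earliest start: all predecessors must have finished
        earliest = max((finish[p][1] for p in deps.get(t, [])), default=0.0)
        # Pick the processor with the smallest completion time for t
        best = min(range(len(speeds)),
                   key=lambda q: max(free[q], earliest) + work[t] / speeds[q])
        start = max(free[best], earliest)
        free[best] = start + work[t] / speeds[best]
        finish[t] = (best, free[best])
        for s in succ[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return finish

# A 3-task chain on processors of speed 2 and 1: the chain
# stays on the fast processor, finishing at (4+2+2)/2 = 4.0
sched = list_schedule({"a": 4, "b": 2, "c": 2},
                      {"b": ["a"], "c": ["b"]}, [2.0, 1.0])
assert sched["c"][1] == 4.0
```

Even this toy version shows why heterogeneity complicates scheduling: the best processor for a task depends on both its speed and when it becomes free, so the mapping cannot be fixed before execution order is known.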
43
Load-balancing (1)
  • Distributing the computations (together with the
    associated data) can be performed either
    dynamically or statically, or by a mixture of both
  • Some simple schedulers are available, but they
    use naive mapping strategies:
  • master-slave techniques
  • "use the past to predict the future"

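A master-slave scheme can be sketched in a few lines. This is a toy in-process model: threads pulling from a shared queue stand in for the slave processes (real codes would use PVM or MPI processes, and the squaring stands in for the real computation):

```python
import queue
import threading

def master_slave(tasks, num_slaves):
    """Demand-driven master-slave sketch: slaves pull tasks from
    a shared queue as they become free, so faster slaves
    naturally end up processing more tasks."""
    todo = queue.Queue()
    for t in tasks:
        todo.put(t)
    results = {}
    lock = threading.Lock()

    def slave():
        while True:
            try:
                t = todo.get_nowait()
            except queue.Empty:
                return               # no work left: slave retires
            r = t * t                # stand-in for the real computation
            with lock:
                results[t] = r

    workers = [threading.Thread(target=slave) for _ in range(num_slaves)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results

assert master_slave(range(5), 3) == {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
```

The appeal of this strategy is that load balancing happens implicitly; its naivety is that it says nothing about data placement, which is exactly what matters for tightly-coupled kernels.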
44
Load-balancing (2)
  • Trade-off between the data distribution
    parameters and the process spawning and possible
    migration policies
  • Redundant computations might also be necessary to
    use a heterogeneous cluster at its best
    capabilities

45
AppLeS, a high-level scheduling and
load-balancing tool
  • Both application-specific and system-specific
    information are required for good schedules
  • Dynamic information is necessary to accurately
    assess the system state. Predictions are accurate
    only within a particular time frame
    -> Network Weather Service
  • Built on top of Globus or Legion

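"Using the past to predict the future" can be as simple as smoothing recent measurements. The sketch below shows exponential smoothing, one of the elementary forecasters that NWS-style tools combine; the function and the smoothing weight alpha are illustrative assumptions, not the actual NWS code or parameters:

```python
def smooth_forecast(measurements, alpha=0.5):
    """Exponential smoothing of past measurements (bandwidth,
    CPU load, ...): each new sample is blended into the running
    estimate with weight alpha, so old samples fade out."""
    estimate = measurements[0]
    for m in measurements[1:]:
        estimate = alpha * m + (1 - alpha) * estimate
    return estimate

# Bandwidth samples in Mb/s: the forecast tracks the downturn
# without jumping all the way to the latest sample
assert smooth_forecast([10.0, 10.0, 6.0, 6.0]) == 7.0
```

The "particular time frame" caveat on the slide is visible here: the forecast lags behind any change, so it is only trustworthy over horizons comparable to the smoothing window.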
46
NetSolve
  • The remote computing paradigm:
  • the program resides on the server
  • the user's data is sent to the server, where the
    appropriate programs or numerical libraries
    operate on it
  • the result is then sent back to the user's
    machine.

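That round trip (ship data, run the server-resident routine, ship the result back) can be mimicked in a few lines. This is a toy in-process model with pickle as the wire format and a made-up routine registry; the actual NetSolve protocol and API differ:

```python
import pickle

# Hypothetical server-side registry of installed routines
ROUTINES = {"dot": lambda x, y: sum(a * b for a, b in zip(x, y))}

def server_handle(request_bytes):
    """Server side: unpack the user's data, run the named
    routine where the library lives, and ship the result back."""
    name, args = pickle.loads(request_bytes)
    return pickle.dumps(ROUTINES[name](*args))

def remote_call(name, *args):
    """Client side: only data travels; the program stays on
    the server. (In the real system the transport is a
    network connection, not a function call.)"""
    reply = server_handle(pickle.dumps((name, args)))
    return pickle.loads(reply)

assert remote_call("dot", [1, 2, 3], [4, 5, 6]) == 32
```

The design choice is clear from the sketch: the client needs no numerical libraries at all, at the price of moving the data both ways for every call.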
47
NetSolve - The big picture
(Figure: a client request is routed to the
computational servers)
48
NetSolve - Solving a Problem
(Figure: the computational servers solve the problem
and return the result)
49
The ScaLAPACK NetSolve Server
(Figure only)
50
ScaLAPACK on heterogeneous clusters
  • Dynamic allocation strategies are not suited:
  • large (prohibitive?) communication overhead
  • dependences may keep fast processors idle
  • Static allocation: load inversely proportional to
    processor cycle-time (i.e., proportional to speed)
  • efficient, but difficult to accurately estimate
    and predict speeds
  • static communication schemes and static memory
    allocation (for library users)

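The static rule (each processor's share proportional to its speed, i.e. inversely proportional to its cycle-time) can be sketched directly; the rounding policy for leftover rows is an illustrative choice, not taken from ScaLAPACK:

```python
def static_rows(n, cycle_times):
    """Distribute n matrix rows so that each processor's share
    is proportional to its speed (1 / cycle-time). Rows lost to
    rounding are handed to the fastest processors first."""
    speeds = [1.0 / t for t in cycle_times]
    total = sum(speeds)
    shares = [int(n * s / total) for s in speeds]
    # int() floors, so some rows may remain undistributed
    for p in sorted(range(len(speeds)), key=lambda q: cycle_times[q]):
        if sum(shares) == n:
            break
        shares[p] += 1
    return shares

# Cycle-times 1, 2, 2 give relative speeds 2:1:1,
# so 100 rows split as 50 / 25 / 25
assert static_rows(100, [1.0, 2.0, 2.0]) == [50, 25, 25]
```

The slide's caveat is exactly the weak point of this code: it is only as good as the cycle-time estimates fed into it, and those are hard to measure and predict on shared machines.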
51
Matrix product on a 2D grid
(Figure: processor P_ij is assigned a block of r_i rows
and c_j columns)
  • Intuition: the load of a processor is proportional
    to its speed (i.e., inversely proportional to its
    cycle-time)
52
The 2D grid allocation problem
  • Maximize the amount of work:
    (sum_i r_i) * (sum_j c_j)
  • Subject to the constraints: for all i, j,
    r_i * t_ij * c_j <= 1, where t_ij is the
    cycle-time of processor P_ij
53
Grid layout yet to be found!
(Figure: where should each processor P_ij be placed in
the grid?)
  • Search over all permutations
  • Use heuristics to solve this problem!

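The permutation search can be sketched as follows. To keep the sketch self-contained, each layout is scored with a crude but always-feasible choice of r and c (setting r_i = min_j sqrt(s_ij) and c_j = min_i sqrt(s_ij), with s_ij = 1/t_ij, guarantees r_i * t_ij * c_j <= 1); this scoring rule is an assumption for illustration, not the optimal solution of the constrained problem above:

```python
from itertools import permutations
from math import sqrt

def layout_work(grid, cycle_times):
    """Score one placement of processors on a 2D grid with a
    feasible (not optimal) choice of row/column shares."""
    s = [[1.0 / cycle_times[p] for p in row] for row in grid]
    r = [min(sqrt(x) for x in row) for row in s]
    c = [min(sqrt(row[j]) for row in s) for j in range(len(s[0]))]
    return sum(r) * sum(c)

def best_layout(cycle_times, rows, cols):
    """Exhaustive search over all placements of the processors
    on a rows x cols grid: factorial cost, hence the need for
    heuristics on anything but tiny grids."""
    best, best_grid = -1.0, None
    for perm in permutations(range(len(cycle_times))):
        grid = [perm[i * cols:(i + 1) * cols] for i in range(rows)]
        w = layout_work(grid, cycle_times)
        if w > best:
            best, best_grid = w, grid
    return best, best_grid

# Four processors, one four times slower, on a 2x2 grid:
# every corner placement is symmetric here, giving work 2.25
w, grid = best_layout([1.0, 1.0, 1.0, 4.0], 2, 2)
assert abs(w - 2.25) < 1e-9
```

The factorial loop over permutations is the point of the slide: even with a cheap per-layout evaluation, exhaustive search is hopeless beyond a handful of processors, so heuristics are unavoidable.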
54
Collections of clusters (1)
(Figure: clusters whose nodes are connected by fast
links, with slower links interconnecting the clusters)
55
Collections of clusters (2)
  • Introduce yet another level of granularity
  • Overlap inter-cluster communication with
    independent computation
  • A static approach?
  • So far, Globus uses a batch system ->
    dedicated machines
  • Not sufficient in the long term

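The overlap idea can be sketched with a background thread standing in for the slow inter-cluster transfer (an illustrative in-process model; real codes would overlap non-blocking MPI or PVM communication with computation):

```python
import threading
import time

def process_with_overlap(local_blocks, fetch_remote, compute):
    """Overlap inter-cluster communication with independent
    computation: start fetching the remote data in the
    background, work on local blocks meanwhile, then process
    the remote data once it has arrived."""
    remote_result = {}

    def fetch():
        remote_result["data"] = fetch_remote()

    comm = threading.Thread(target=fetch)
    comm.start()                                  # slow transfer begins
    local = [compute(b) for b in local_blocks]    # independent local work
    comm.join()                                   # wait only for the tail
    return local + [compute(remote_result["data"])]

def slow_fetch():        # stands in for an inter-cluster transfer
    time.sleep(0.05)
    return 10

out = process_with_overlap([1, 2, 3], slow_fetch, lambda x: x * 2)
assert out == [2, 4, 6, 20]
```

The coarser granularity is what makes this pay off: the transfer is hidden only if there is enough independent local computation to cover its latency.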
56
Conclusion

57
(A) Europe
  • While there are several projects related to
    metacomputing in Europe, there is little
    coordination and exchange between these projects
  • Only a few European institutions have joined the
    NPACI initiative: there are only three
    international affiliates in Europe

58
(B) Algorithmic issues
  • The difficulties seem largely underestimated
  • Data decomposition, scheduling heuristics, and
    load balancing become extremely difficult in the
    context of metacomputing platforms
  • The research community focuses on low-level
    communication protocols and distributed-system
    issues (light-weight process invocation,
    migration, ...)

59
(C) Programming level
  • Which is the right level?
  • Data parallelism: unrealistic, due to
    heterogeneity
  • Explicit message passing: too low-level
  • Object-oriented approaches: still require the
    user to have deep knowledge of both the
    application's behavior and the underlying
    resources
  • Remote computing systems (NetSolve): face severe
    limitations in efficiently load-balancing the work
  • Relying on specialized but highly-tuned libraries
    of all kinds may prove a good trade-off

60
(D) Applications
  • Key applications (from scientific computing to
    databases) have dictated the way classical
    parallel machines are used, programmed, and even
    upgraded into more efficient platforms
  • Key applications will strongly influence, or even
    guide, the development of metacomputing
    environments

61
(D) Applications (contd)
  • Which applications will be worth the abundant but
    hard-to-access resources of the grid?
  • tightly-coupled grand challenges?
  • mobile computing applications?
  • micro-transactions on the Web?
  • All these applications require new programming
    paradigms to enable inexperienced users to access
    the magic grid!