1. Towards a World-Wide Computer: Software Technology for Computational Grids
Williams College
- Carlos Varela, cvarela@cs.rpi.edu
- Department of Computer Science
- Rensselaer Polytechnic Institute
- http://wcl.cs.rpi.edu/
- Graduate Students
- Travis Desell
- Kaoutar El Maghraoui
- Wei-Jen Wang
- April 8, 2005
2. Adaptive Partial Differential Equation Solvers
- Investigators
- J. Flaherty, M. Shephard, B. Szymanski, C. Varela (RPI); J. Teresco (Williams); E. Deelman (ISI-USC)
- Problem Statement
- How to dynamically adapt solutions to PDEs to account for the underlying computing infrastructure?
- Applications/Implications
- Materials fabrication, biomechanics, fluid dynamics, aeronautical design, ecology.
- Approach
- Partition the problem, dynamically map it onto the computing infrastructure, and balance the load.
- Low communication overhead over low-latency connections.
- Software
- Rensselaer Partition Model (RPM)
- Algorithm Oriented Mesh Database (AOMD)
- Dynamic Resource Utilization Model (DRUM)
3. Virtual Surgical Planning
- Investigators
- K. Jansen, M. Shephard (RPI)
- C. Taylor, C. Zarins (Stanford)
- Problem Statement
- How to develop a software framework to enable virtual surgical planning based on real patient data?
- Applications/Implications
- Surgeons will be able to virtually evaluate vascular surgical options based on simulation rather than intuition alone.
- Approach
- A scan of a real patient is processed to extract a solid model and inlet flow waveform.
- The model is discretized and the flow equations solved.
- Multiple alterations to the model are made within an intuitive human-computer interface and evaluated similarly.
- Software
- MEGA (SCOREC discretization toolkit)
- PHASTA (RPI flow solver)
- Funded by NSF-ITR (7/02-7/07)
4. Particle Physics and Bacterial Pathogenicity
- Investigators
- J. Cummings, J. Napolitano (RPI Physics)
- M. Nishiguchi (NMSU Biology), W. Wheeler (AMNH)
- B. Szymanski, C. Varela, J. Flaherty (RPI CS)
- Problem Statement
- Do missing baryons exist? (Sub-atomic particles that have not been observed.)
- How do bacteria evolve? What are the mechanisms of infection and colonization?
- Applications/Implications
- Physics: particle physics, search for missing baryons.
- Biology: origins of bacterial pathogenicity, evolution of species.
- Approach
- Experimental data analysis and simulation.
- Comparison and analysis of complete genome sequences to identify evolutionary patterns.
- Software
- Domain-specific code for parallel computing on homogeneous clusters.
5. Milky Way Origin and Structure
- Investigators
- H. Newberg (RPI Astronomy), J. Teresco (Williams)
- M. Magdon-Ismail, B. Szymanski, C. Varela (RPI CS)
- Problem Statement
- What is the structure and origin of the Milky Way galaxy?
- How to use data from 10,000 square degrees of the north galactic cap, collected in five optical filters over five years by the Sloan Digital Sky Survey?
- Applications/Implications
- Astrophysics: origins and evolution of our galaxy.
- Approach
- Experimental data analysis and simulation.
- Using A stars as tracers of the galactic halo, and photometrically determined metallicities of main-sequence F-K stars, to determine whether the thick disk is chemically distinct from the thin disk and galactic halo of our galaxy.
- Status
- Sequential code that takes multiple days to run on a single node.
6. The Rensselaer Grid
- External networks: Internet2, 155 Mbit
- 694 existing processors + 530 projected processors = 1224 grid processors

Existing Clusters
- CS Clusters
- 168 processors
- 64 dual 2.4 GHz Xeon
- 40 800 MHz xSeries
- Multiscale Cluster
- 172 processors
- 66 dual 2.0 GHz Xeon
- 40 400 MHz Netra X1
- Multipurpose Clusters
- 326 processors
- Biotechnology: 134 P3 processors
- Nanotechnology: 192 processors (Athlon, P4, and P3)
- WCL Cluster
- 28 processors
- 4 dual Sun Blade 100
- 4 single-processor IBM nodes
- 4 quad IBM Power series

Projected Clusters
- Bioscience Cluster
- 160 processors
- 80 dual 2.0 GHz Microway Navion-A Opteron
- Multiscale Cluster
- 160 processors
- 80 dual 2.0 GHz Microway Navion-A Opteron
- CS Cluster
- 82 processors
- 41 dual 2 GHz PowerPC
- Multiscale Cluster
- 128 processors
- 64 dual 2.0 GHz Opteron
7. Map of Rensselaer Grid Clusters
[Campus map showing cluster locations: Nanotech, Multiscale, Bioscience Cluster, CS/WCL, Multipurpose Cluster, CS]
8. TeraGrid
[Diagram: TeraGrid sites (Caltech, Argonne, NCSA/PACI at 10.3 TF and 240 TB, SDSC at 4.1 TF and 225 TB), each with site resources, HPSS or UniTree archival storage, and external networks]
9. Extensible TeraScale Facility (ETF)
[Diagram: the TeraGrid sites (Caltech, Argonne, NCSA/PACI at 10.3 TF and 240 TB, SDSC at 4.1 TF and 225 TB) with the Rensselaer Grid attached, each with site resources, HPSS or UniTree storage, and external networks]
10. Extensible TeraScale Facility (ETF)
[Map of ETF sites, with RPI highlighted]
11. Data Grid for High Energy Physics
Image courtesy Harvey Newman, Caltech
12. iVDGL: International Virtual Data Grid Laboratory
www.ivdgl.org
13. World's Largest Computing Grid (CERN, 3/2005)
www.cern.ch
www.ivdgl.org
14. PlanetLab: An Open Platform for Worldwide Services
- 550 nodes over 261 sites, as of April 2005
www.planet-lab.org
15. Worldwide Computing Software
- Computational Resources and Devices
- Large pool of idle resources available on the Internet
- Heterogeneous platforms
- Networks
- Wide range of latencies/bandwidths
- Dynamic resources
- Different degrees of availability
- Different types of failures
- Research Goals
- Scalability to worldwide execution environments
- Inherent adaptability to environmental changes and resource availability
- Programmability and high performance
- Approach
- Adaptive reflective middleware to trigger automatic reconfiguration of applications
- High-level programming abstractions
16. Actors/SALSA
- Actor Model
- A reasoning framework to model concurrent computations
- Programming abstractions for distributed open systems
- G. Agha, Actors: A Model of Concurrent Computation in Distributed Systems, MIT Press, 1986.
- SALSA
- Simple Actor Language System and Architecture
- An actor-oriented language for mobile and internet computing
- Programming abstractions for internet-based concurrency, distribution, mobility, and coordination
- C. Varela and G. Agha, "Programming Dynamically Reconfigurable Open Systems with SALSA", ACM SIGPLAN Notices, OOPSLA 2001, 36(12), pp. 20-34.
17. SALSA Basics
- Programmers define behaviors for actors.
- Messages are sent asynchronously.
- Messages are modeled as potential method invocations.
- Continuation primitives are used for coordination.
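To make these four points concrete, here is a minimal sketch in the style of the examples that follow (the Cell behavior and its handlers are illustrative, not from the talk):

  module demo;

  behavior Cell {
    int value;

    Cell(int initial) { this.value = initial; }

    // Handlers look like methods but are only invoked via asynchronous sends.
    int get() { return value; }
    void set(int newValue) { this.value = newValue; }

    void act(String[] args) {
      Cell c = new Cell(0);
      // '@' is a continuation: get() runs only after set(1) is processed,
      // and get()'s result reaches println as the token.
      c<-set(1) @ c<-get() @ standardOutput<-println(token);
    }
  }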
18. Actor Creation
- To create an actor locally:
  TravelAgent a = new TravelAgent();
- To create an actor with a specified UAN and UAL:
  TravelAgent a = new TravelAgent() at (uan, ual);
- Another possibility:
  TravelAgent a = new TravelAgent() at (uan);
19. Message Sending
  TravelAgent a = new TravelAgent();
  a<-book( flight );
20. Remote Message Sending
- Obtain a remote actor reference by name:
  TravelAgent a = getReferenceByName("uan://myhost/ta");
  a<-printItinerary();
- Obtain a remote actor reference by location:
  TravelAgent a = getReferenceByLocation("rmsp://myhost/ta1");
  a<-printItinerary();
21. Migration
- Obtain a remote actor reference and migrate the actor:
  TravelAgent a = getReferenceByName("uan://myhost/ta");
  a<-migrate( "rmsp://yourhost/travel" ) @
  a<-printItinerary();
22. Token-Passing Continuation
- Ensures that each message in the expression is sent after the previous message has been processed. Also allows the return value of one message invocation to be used as an argument for a later invocation in the expression.
- Example:
  a1<-m1() @ a2<-m2( token );
- Send m1 to a1; then, after m1 finishes, send its result to a2 with m2.
23. Join Blocks
- Provide a mechanism for synchronizing the processing of a set of messages.
- The set of results is sent along as a token.
- Example:
  Actor[] actors = { searcher0, searcher1, searcher2, searcher3 };
  join actors<-find( phrase ) @
  resultActor<-output( token );
- Send the find( phrase ) message to each actor in actors; then, after all have completed, send the results to resultActor with an output message.
24. Example: Acknowledged Multicast
  join a1<-m1(), a2<-m2(), a3<-m3(), ... @
  cust<-n(token);
25. Lines of Code Comparison
26. First-Class Continuations
- Enable actors to delegate computation to a third party independently of the processing context.
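As an illustrative sketch (the Squarer behavior is hypothetical; the Fibonacci example on the next slide is the real use), an actor can pass currentContinuation so a helper's reply goes straight to the original requester:

  behavior Squarer {
    int square(int x) { return x * x; }

    // compute() delegates to a fresh helper; via currentContinuation the
    // helper's result is delivered directly to compute()'s requester,
    // independently of this actor's own processing context.
    int compute(int x) {
      Squarer helper = new Squarer();
      helper<-square(x) @ currentContinuation;
    }

    void act(String[] args) {
      Squarer s = new Squarer();
      s<-compute(4) @ standardOutput<-println(token);  // prints 16
    }
  }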
27. Fibonacci Example

  module examples.fibonacci;

  behavior Fibonacci {
    int n;

    Fibonacci(int n) { this.n = n; }

    int add(int[] numbers) { return numbers[0] + numbers[1]; }

    int compute() {
      if (n == 0) return 0;
      else if (n < 2) return 1;
      else {
        Fibonacci fib1 = new Fibonacci(n-1);
        Fibonacci fib2 = new Fibonacci(n-2);
        join fib1<-compute(), fib2<-compute() @ add(token) @ currentContinuation;
      }
    }
  }
28. SALSA and Java
- SALSA source files are compiled into Java source files before being compiled into Java byte code.
- SALSA programs may take full advantage of the Java API.
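For instance, handler bodies can call standard Java classes directly. A minimal sketch, assuming SALSA's Java-style import syntax (the Dice behavior and its use of java.util.Random are illustrative, not from the talk):

  module demo;

  import java.util.Random;  // ordinary Java class from the Java API

  behavior Dice {
    void act(String[] args) {
      Random r = new Random();       // plain Java object, not an actor
      int roll = r.nextInt(6) + 1;   // Java API call inside a handler
      standardOutput<-println("Rolled: " + roll);
    }
  }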
29. Hello World Example

  module demo;

  behavior HelloWorld {
    void act( String[] argv ) {
      standardOutput<-print( "Hello" ) @
      standardOutput<-print( "World!" );
    }
  }
30. Hello World Example
- The act( String[] args ) message handler is similar to the main() method in Java and is used to bootstrap SALSA programs.
31Migration Example
behavior Migrate void print()
standardOutputlt-println( "Migrate actor just
migrated here." ) void act( String
args ) if (args.length ! 3)
standardOutputlt-println("Usage java
migration.Migrate ltUANgt ltsrcUALgt
ltdestUALgt") return UAN
uan new UAN(args0) UAL ual new
UAL(args1) Migrate migrateActor
new Migrate() at (uan, ual)
migrateActorlt-print() _at_
migrateActorlt-migrate( args2 ) _at_
migrateActorlt-print()
32. Migration Example
- The program must be given a valid name and locations.
- After remotely creating the actor, the program sends it the print message, migrates it to the second theater, and sends the message again.
33Compilation
java SalsaCompiler demo/Migrate.salsa SALSA
Compiler Version 1.0 Reading from file
demo/Migrate.salsa . . . SALSA Compiler Version
1.0 SALSA program parsed successfully. SALSA
Compiler Version 1.0 SALSA program compiled
successfully. javac demo/Migrate.java java
demo.Migrate Usage java migration.Migrate
ltuangt ltualgt ltualgt
- Compile Migrate.salsa file into Migrate.java.
- Compile Migrate.java file into Migrate.class.
- Execute Migrate
34. Migration Example
[Diagram: two theaters and a UAN server]
- The actor will print "Migrate actor just migrated here." at theater 1, then at theater 2.
35. World Migrating Agent Example
36. Middleware/IOS
- Middleware
- A software layer between distributed applications and operating systems.
- Frees application programmers from directly dealing with distribution issues:
- Heterogeneous hardware/OSs
- Load balancing
- Fault tolerance
- Security
- Quality of service
- Internet Operating System (IOS)
- A decentralized framework for adaptive, scalable execution
- Modular architecture to evaluate different distribution and reconfiguration strategies
- T. Desell, K. El Maghraoui, and C. Varela, "Load Balancing of Autonomous Actors over Dynamic Networks", HICSS-37 Software Technology Track, Hawaii, January 2004, 10 pp.
37. World-Wide Computer Architecture
- SALSA application layer
- Programming language constructs for actor communication, migration, and coordination.
- IOS middleware layer
- A Resource Profiling Component
- Captures information about actor and network topologies and available resources
- A Decision Component
- Makes migration, split/merge, or replication decisions based on profiled information
- A Protocol Component
- Performs communication between nodes in the middleware system
- WWC run-time layer
- Theaters provide runtime support for actor execution and access to local resources
- Pluggable transport, naming, and messaging services
38. Autonomous Actors
- Actors
- Unit of concurrency
- Asynchronous message passing
- State encapsulation
- Universal actors
- Universal names
- Location/theater
- Ability to migrate between theaters
- Autonomous actors
- Performance profiling to improve quality of service
- Autonomous migration to balance computational load
- Split and merge to tune granularity
- Replication to increase fault tolerance
39. Middleware Agents and Load Balancing
- Middleware agents are organized in a virtual network and exchange information periodically:
- New peers join and old peers leave
- Workloads change
- Middleware agents can organize in different topologies, e.g., peer-to-peer (p2p) and cluster-to-cluster (c2c) virtual networks
- IOS's modular architecture enables different load-balancing and profiling strategies, e.g.:
- Random work stealing (RS)
- Actor topology-sensitive work stealing (ATS)
- Network topology-sensitive work stealing (NTS)
- Weighted resource-sensitive work stealing (WRS)
40. Random Work Stealing (RS)
- Loosely based on Cilk's random work stealing
- Lightly loaded theaters periodically send work-steal packets to randomly picked peer theaters
- Actors migrate from highly loaded theaters to lightly loaded theaters
- Simple strategy: no broadcasts required
- Stable strategy: it avoids additional traffic on overloaded networks
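A hypothetical sketch of the idea in the SALSA style used earlier (all names and structure are illustrative, not the actual IOS code):

  behavior TheaterAgent {
    boolean overloaded;

    TheaterAgent(boolean overloaded) { this.overloaded = overloaded; }

    // A steal request: an overloaded peer would migrate actors toward the
    // requester; a lightly loaded one simply ignores it (no broadcasts).
    void steal(TheaterAgent requester) {
      if (overloaded)
        standardOutput<-println("migrating actors to requester");
    }

    void act(String[] args) {
      TheaterAgent busy = new TheaterAgent(true);
      TheaterAgent idle = new TheaterAgent(false);
      // A lightly loaded theater periodically picks one random peer:
      TheaterAgent[] peers = { busy };
      java.util.Random r = new java.util.Random();
      peers[r.nextInt(peers.length)]<-steal(idle);
    }
  }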
41. Actor Topology-Sensitive Work Stealing (ATS)
- An extension of RS that collocates actors that communicate frequently
- The decision agent picks the actor that will minimize inter-theater communication after migration, based on:
- Location of acquaintances
- Profiled communication history
- Tries to minimize the frequency of remote communication, improving overall system throughput
42. Network Topology-Sensitive Work Stealing (NTS)
- An extension of ATS that takes network topology and performance into consideration
- Periodically profiles end-to-end network performance among peer theaters:
- Latency
- Bandwidth
- Tries to minimize the cost of remote communication, improving overall system throughput:
- Tightly coupled actors stay within reasonably low latencies / high bandwidths
- Loosely coupled actors can flow more freely
43. A General Model for Weighted Resource-Sensitive Work Stealing (WRS)
- Given:
- A set of resources, R = {r0, ..., rn}
- A set of actors, A = {a0, ..., an}
- w(r,A) is a weight based on the importance of resource r to the performance of the set of actors A:
- 0 <= w(r,A) <= 1
- Σ_{all r} w(r,A) = 1
- a(r,f) is the amount of resource r available at foreign node f
- u(r,l,A) is the amount of resource r used by actors A at local node l
- M(A,l,f) is the estimated cost of migrating actors A from l to f
- L(A) is the average life expectancy of the set of actors A
- The predicted increase in overall performance, G, gained by migrating A from l to f (normalized so that G <= 1):
- D(r,l,f,A) = (a(r,f) - u(r,l,A)) / (a(r,f) + u(r,l,A))
- G = Σ_{all r} (w(r,A) * D(r,l,f,A)) - M(A,l,f) / (10 * log L(A))
- When work is requested by f, migrate the actor(s) A with the greatest predicted increase in overall performance, if positive.
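In LaTeX notation, and as a small worked example under assumed values (a single resource, cpu, with w(cpu,A) = 1, a(cpu,f) = 3, u(cpu,l,A) = 1, and a migration term M(A,l,f)/(10 log L(A)) = 0.1; these numbers are illustrative only):

  D(\mathit{cpu},l,f,A) = \frac{a(\mathit{cpu},f) - u(\mathit{cpu},l,A)}{a(\mathit{cpu},f) + u(\mathit{cpu},l,A)} = \frac{3 - 1}{3 + 1} = 0.5

  G = \sum_{\text{all } r} w(r,A)\, D(r,l,f,A) - \frac{M(A,l,f)}{10 \log L(A)} = 1 \cdot 0.5 - 0.1 = 0.4

Since G = 0.4 > 0, the actors A would be migrated to f when f requests work.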
44. Preliminary Results
- Application Actor Topologies
- Unconnected
- Sparse
- Tree
- Hypercube
- Middleware Agent Topologies
- Peer-to-peer
- Cluster-to-cluster
- Network Topologies
- Grid-like (set of homogeneous clusters)
- Internet-like (more heterogeneous)
- Migration Policies
- Single Actor
- Actor Groups
- Dynamic Networks
45. Unconnected and Sparse Application Topologies
- Load-balancing experiments use RR, RS, and ATS
46. Tree and Hypercube Application Topologies
- RS and ATS do not add substantial overhead to RR
- ATS performs best in all cases with some interconnectivity
47. Peer-to-Peer Middleware Agent Topology (P2P)
- List of peers, arranged in groups based on latency:
- Local (0-10 ms)
- Regional (11-100 ms)
- National (101-250 ms)
- Global (251+ ms)
- Work-steal requests are:
- Propagated randomly within the closest group until the time-to-live is reached or work is found
- Propagated to progressively farther groups if no work is found
- Peers respond to steal packets when the decision component decides to reconfigure the application based on the performance model
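A hypothetical sketch of the TTL-bounded propagation within one latency group (all names are illustrative, not the actual IOS protocol code):

  behavior GroupPeer {
    GroupPeer[] localGroup;  // peers in the closest latency group
    boolean hasWork;

    GroupPeer(boolean hasWork) { this.hasWork = hasWork; }

    void setGroup(GroupPeer[] group) { this.localGroup = group; }

    void steal(GroupPeer requester, int ttl) {
      if (hasWork) {
        // Here the decision component would evaluate the performance
        // model and possibly migrate actors toward the requester.
        standardOutput<-println("offering work to requester");
      } else if (ttl > 0) {
        // Forward randomly within the group until the TTL expires;
        // on expiry, the requester retries in a farther group.
        java.util.Random r = new java.util.Random();
        localGroup[r.nextInt(localGroup.length)]<-steal(requester, ttl - 1);
      }
    }

    void act(String[] args) {
      GroupPeer busy = new GroupPeer(true);
      GroupPeer idle = new GroupPeer(false);
      GroupPeer[] group = { busy };
      idle<-setGroup(group) @ idle<-steal(idle, 3);
    }
  }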
48. Cluster-to-Cluster Middleware Agent Topology (C2C)
- Hierarchical peer organization:
- Each cluster has a manager
- Each node in a cluster periodically reports profiling information to its manager
- Managers perform intra-cluster load balancing
- Cluster managers form a dynamic peer-to-peer network:
- Managers may join and leave at any time
- Clusters can split and merge depending on network conditions
- Inter-cluster load balancing is based on work stealing, similar to the p2p protocol component
- Clusters are organized dynamically based on latency
49. Physical Network Topologies
- Grid-like Topology
- Relatively homogeneous processors
- Very high-performance networking within clusters (e.g., Myrinet and gigabit Ethernet)
- Networking between clusters is dedicated, with high-bandwidth links (e.g., the Extensible TeraScale Facility)
- Internet-like Topology
- Wider range of processor architectures and operating systems
- Nodes are less reliable
- Networking between nodes can range from low bandwidth and high latency to dedicated fiber-optic links
50. Results for Applications with a High Communication-to-Computation Ratio
51. Results for Applications with a Low Communication-to-Computation Ratio
52. Middleware Agent Topology Evaluation Summary
- Simulation results show that:
- The peer-to-peer protocol generally performs better in Internet-like environments, with the exception of the sparse application topology
- The cluster-to-cluster protocol generally performs better in grid-like environments, with the exception of the unconnected application topology
53. Single vs. Group Migration
54. Dynamic Networks
- Theaters were added and removed dynamically to test scalability:
- During the first half of the experiment, a theater was added every 30 seconds.
- During the second half, a theater was removed every 30 seconds.
- Throughput improves as the number of theaters grows.
55. Actor Distribution in Dynamic Networks
- Both RS and ATS distributed actors evenly across the dynamic network of theaters.
56. Ongoing/Future Work
- Splitting, merging, and replication components
- Profiling memory and storage resources
- Interoperability with existing high-performance messaging implementations (e.g., MPI, OpenMP)
- IOS/MPI project
- Interoperability with the Globus/Open Grid Services Architecture (OGSA)
- Interoperability with Web Services
57. Related Work: Work Stealing, Internet Computing, and P2P Systems
- Work Stealing
- Cilk's runtime system for multithreaded parallel programming
- Cilk's scheduler's work-stealing techniques
- R. D. Blumofe and C. E. Leiserson, "Scheduling Multithreaded Computations by Work Stealing", FOCS '94
- Internet Computing
- SETI@home (Berkeley)
- Folding@home (Stanford)
- P2P Systems
- Distributed storage: Freenet, KaZaA
- File sharing: Napster, Gnutella
- Distributed hash tables: Chord, CAN, Pastry
58. Related Work: Grid/Distributed Computing
- Cluster/Grid Computing Software
- OGSA/Web Services
- Globus (Univa)
- Condor
- Legion
- Network Infrastructure
- PlanetLab
- Distributed Computing Services
- WebOS
- 2K
- Network Weather Service
- Much other work on distributed systems
59. Thank You
- Software freely available at http://wcl.cs.rpi.edu/
60. Using the IOS Middleware
- Start IOS peer servers (a mechanism for peer discovery).
- Start a network of IOS theaters.
- Write your SALSA programs and extend all actors to autonomous actors.
- Bind autonomous actors to theaters (see the sketch below).
- IOS automatically reconfigures the location of actors in the network to improve application performance.
- IOS supports the dynamic addition and removal of theaters.
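The binding step uses only constructs shown earlier (slide 18's creation with at (uan, ual)). A minimal sketch with placeholder UAN/UAL values; the autonomous-actor extension itself is elided, since its API is not shown in this talk:

  behavior Worker {
    void hello() { standardOutput<-println("running where IOS placed me"); }

    void act(String[] args) {
      // Placeholder name and location; a real run would take these as arguments.
      UAN uan = new UAN("uan://myhost/worker");
      UAL ual = new UAL("rmsp://theater1/worker");
      // Bind the actor to a theater at creation; IOS may later migrate it
      // automatically as theaters join, leave, or change load.
      Worker w = new Worker() at (uan, ual);
      w<-hello();
    }
  }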