Title: Flexibility and Interoperability in a Parallel MD code
1. Flexibility and Interoperability in a Parallel MD code
- Robert Brunner,
- Laxmikant Kale,
- Jim Phillips
- University of Illinois at Urbana-Champaign
2. Contributors
- Principal investigators
- Laxmikant Kale, Klaus Schulten, Robert Skeel
- Development team
- Milind Bhandarkar, Robert Brunner, Attila Gursoy,
Neal Krawetz, Jim Phillips, Ari Shinozaki, ...
3. Middle layers
Applications
Middle layers: languages, tools, libraries
Parallel machines
5. What is needed
- Not application-centered CS research
- Not isolated CS research
- Application-oriented, yet computer-science-centered, research that will enhance the enabling layers in the middle
6. Challenges in Parallel Applications
- Scalable High Performance
- To a small and large number of processors
- Small and large molecular systems
- Modifiable and extensible design
- Ability to incorporate new algorithms
- Reusing new libraries without re-implementation
- Experimenting with alternate strategies
- How to achieve both simultaneously?
7. Suggested OO Approach
- Dynamic irregular applications
- Use multi-domain decomposition
- (multiple tasks assigned to each processor)
- Data driven scheduling
- Migratable objects
- Use registration and callbacks to avoid hardwiring of object connections on a processor
- Measurement-based migration/load balancing
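The registration-and-callback idea above can be sketched as follows. This is an illustrative sketch, not Charm++'s actual API; the names (`CallbackRegistry`, `registerCallback`, `deliver`) are invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>

// Illustrative sketch (not Charm++'s actual API; names are invented).
// Objects find each other through a registry of callbacks keyed by name
// instead of holding raw pointers, so a migrated object simply re-registers
// on its new processor and its peers are unaffected.
class CallbackRegistry {
    std::unordered_map<std::string, std::function<void(double)>> table;
public:
    void registerCallback(const std::string& name, std::function<void(double)> cb) {
        table[name] = std::move(cb);   // migration = re-register under the same name
    }
    bool deliver(const std::string& name, double payload) {
        auto it = table.find(name);
        if (it == table.end()) return false;  // target not registered on this PE
        it->second(payload);
        return true;
    }
};
```

Because peers hold only the name, migrating an object does not invalidate any connection; delivery simply resumes once the object re-registers.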
8. Suggested Approach (contd.)
- Use the most appropriate parallel programming paradigm for each module
- Reuse existing libraries, irrespective of the language/paradigm in which they are implemented
- Need support for multi-paradigm interoperability
- (Supported by Converse)
- Careful class design and use of C++ features
9. Molecular Dynamics
- Collection of charged atoms, with bonds
- Newtonian mechanics
- At each time-step
- Calculate forces on each atom
- bonds
- non-bonded electrostatic and van der Waals
- Calculate velocities and advance positions
- 1 femtosecond time-step, millions needed!
- Thousands of atoms (1,000 - 100,000)
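The per-time-step loop described above might look like the following minimal sketch (not NAMD source code): a 1-D velocity-Verlet step, with a hypothetical harmonic force standing in for the full bonded plus non-bonded sum.

```cpp
#include <cassert>

// Illustrative sketch of one MD time-step (not NAMD source code).
// A 1-D atom with a hypothetical harmonic force F = -k*x stands in for
// the full bonded + non-bonded force sum; dt would be ~1 fs in practice.
struct Atom { double x, v, m; };

double force(const Atom& a, double k) { return -k * a.x; }

// One velocity-Verlet step: advance position, recompute force, advance velocity.
void step(Atom& a, double k, double dt) {
    double f = force(a, k);
    a.x += a.v * dt + 0.5 * (f / a.m) * dt * dt;   // advance position
    double fNew = force(a, k);
    a.v += 0.5 * (f + fNew) / a.m * dt;            // advance velocity
}
```

With millions of such steps needed, the force calculation inside `step` is where essentially all the time goes, which motivates the decomposition strategies on the following slides.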
10. Further MD
- Use of cut-off radius to reduce work
- 8-14 Å
- Faraway charges ignored!
- 80-95% of the work is non-bonded force computation
- Some simulations need faraway contributions
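The cut-off idea can be illustrated with a toy pair count (1-D positions for brevity, and the function name is invented): pairs separated by more than the cut-off radius are skipped entirely.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Toy illustration of the cut-off (1-D positions for brevity): pairs
// separated by more than rCut contribute nothing and are skipped, which
// eliminates most of the O(N^2) pair work.
int countInteractingPairs(const std::vector<double>& pos, double rCut) {
    int pairs = 0;
    for (size_t i = 0; i < pos.size(); ++i)
        for (size_t j = i + 1; j < pos.size(); ++j)
            if (std::fabs(pos[i] - pos[j]) <= rCut)
                ++pairs;                      // faraway charges ignored
    return pairs;
}
```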
11. NAMD Design Objectives
- Scalable High Performance
- To a small and large number of processors
- Small and large molecular systems
- Modifiable and extensible design
- Ability to incorporate new algorithms
- Reusing new libraries without re-implementation
- Experimenting with alternate strategies
12. Force Decomposition
- Distribute the force matrix to processors
- Matrix is sparse and non-uniform
- Each processor has one block
- Communication: N/sqrt(P); ratio: sqrt(P)
- Better scalability in practice (can use 100 processors)
- Hwang, Saltz, et al.: speedup of 6 on 32 PEs, 36 on 128 processors
- Not scalable
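As a back-of-envelope illustration of the N/sqrt(P) communication claim: with the force matrix split into sqrt(P) x sqrt(P) blocks, each processor needs the coordinates of two strips of N/sqrt(P) atoms. The factor of 2 (one row strip plus one column strip) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>

// Back-of-envelope sketch of the slide's scaling claim: each processor
// owning one block of a sqrt(P) x sqrt(P) blocked force matrix needs the
// coordinates of two strips of N/sqrt(P) atoms (row strip + column strip;
// the factor of 2 is this sketch's assumption), so per-processor
// communication shrinks as P grows.
double commPerProcessor(double nAtoms, double nProcs) {
    return 2.0 * nAtoms / std::sqrt(nProcs);
}
```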
13. Spatial Decomposition
14. Spatial decomposition, modified
15. Implementation
- Multiple objects per processor
- Different types: patches, pairwise forces, bonded forces, ...
- Each may have its data ready at different times
- Need the ability to map and remap them
- Need prioritized scheduling
- Charm++ supports all of these
16. Charm++
- Data-driven objects
- Object groups
- a global object with a representative on each PE
- Asynchronous method invocation
- Prioritized scheduling
- Mature, robust, portable
- http://charm.cs.uiuc.edu
17. Data-driven execution
[Figure: two processors, each running a scheduler that picks work from its own message queue]
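The picture of data-driven execution, per-processor schedulers pulling messages off prioritized queues, can be sketched roughly as below. The names are illustrative; this is not the Charm++ scheduler.

```cpp
#include <cassert>
#include <functional>
#include <queue>

// Rough sketch of data-driven execution (illustrative; not the Charm++
// scheduler). Each processor's scheduler pulls the highest-priority
// available message off its queue and invokes the bound object method.
struct Message {
    int priority;                          // smaller value runs first
    std::function<void()> handler;         // object method bound to this message
    bool operator<(const Message& o) const { return priority > o.priority; }
};

class Scheduler {
    std::priority_queue<Message> q;        // one queue per processor
public:
    void enqueue(int prio, std::function<void()> h) {
        q.push({prio, std::move(h)});      // a message arrives and is queued
    }
    void run() {                           // execution is driven by available data
        while (!q.empty()) { q.top().handler(); q.pop(); }
    }
};
```

Because objects run only when their message is selected, work proceeds in whatever order data becomes available, rather than in a fixed program order.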
18. Object-oriented design
- Two top-level classes
- Patches: cubes containing atoms
- Computes: force calculations
- Home patches and proxy patches
- A home patch sends coordinates to its proxies, and receives forces from them
- Each compute interacts with local patches only
19. Compute hierarchy
- Many compute subclasses
- Allow reuse of coordination code
- Reuse of bookkeeping tasks
- Easy to add new types of force objects
- Example: steered molecular dynamics
- The implementor focuses on the new force functionality
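The compute hierarchy described above might be sketched as follows. The class names (`Compute`, `SteeredCompute`) and the 1-D force interface are invented for illustration and are not NAMD's actual classes.

```cpp
#include <cassert>
#include <vector>

// Sketch of the compute hierarchy (class names invented for illustration;
// not NAMD's actual classes). The base class owns shared coordination and
// bookkeeping; a subclass supplies only the force kernel.
class Compute {
public:
    virtual ~Compute() = default;
    // Shared coordination code: in NAMD this would gather coordinates from
    // local patches, invoke the kernel, and return forces to the patches.
    std::vector<double> work(const std::vector<double>& coords) {
        return doForce(coords);
    }
protected:
    virtual std::vector<double> doForce(const std::vector<double>& coords) = 0;
};

// Adding a new force type (e.g. steered MD) means writing one small subclass.
class SteeredCompute : public Compute {
public:
    explicit SteeredCompute(double p) : pull(p) {}
protected:
    std::vector<double> doForce(const std::vector<double>& coords) override {
        return std::vector<double>(coords.size(), pull);  // constant pull per atom
    }
private:
    double pull;  // hypothetical steering force
};
```

A new force object inherits all the coordination and bookkeeping for free; only `doForce` is new code.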
20. Multi-paradigm programming
- Long-range electrostatic interactions
- Some simulations require this feature
- Contributions of faraway atoms can be computed infrequently
- PVM-based library: DPMTA
- Developed at Duke by John Board et al.
- Patch life cycle
- better expressed as a thread
21. Converse and Interoperability
- Supports multi-paradigm programming
- Provides portability
- Makes it easy to implement runtime systems for new paradigms
- Several languages/libraries:
- Charm++, threaded MPI, PVM, Java, md-perl, pC++, Nexus, Path, Cid, CC++, ...
22. NAMD2 with Converse
23. Separation of concerns
- Different developers, with different interests and knowledge, can contribute effectively
- Separation of communication and parallel logic
- Threads encapsulate the life cycle of patches
- Adding a new integrator, improving performance, or trying new MD ideas can be done modularly and independently
24. Load balancing
- Collect timing data for several cycles
- Run heuristic load balancer
- Several alternative ones
- Re-map and migrate objects accordingly
- Registration mechanisms facilitate migration
- Needs a separate talk!
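One simple heuristic of the kind the slide alludes to is a greedy mapping: sort objects by measured cost, then place each on the currently least-loaded processor. This sketch is only illustrative; NAMD's actual balancers are more elaborate (and account for communication, not just compute load).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative greedy heuristic (NAMD's real balancers are more elaborate):
// sort objects by measured cost, then place each on the currently
// least-loaded processor. Returns, for each object, its assigned processor.
std::vector<int> greedyMap(const std::vector<double>& cost, int nProcs) {
    std::vector<int> order(cost.size());
    for (size_t i = 0; i < order.size(); ++i) order[i] = static_cast<int>(i);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return cost[a] > cost[b]; });  // heaviest first
    std::vector<double> load(nProcs, 0.0);
    std::vector<int> placement(cost.size());
    for (int obj : order) {
        int pe = static_cast<int>(
            std::min_element(load.begin(), load.end()) - load.begin());
        placement[obj] = pe;          // migrate object obj to processor pe
        load[pe] += cost[obj];
    }
    return placement;
}
```

The measured timing data from previous cycles plays the role of `cost` here; the registration mechanisms mentioned above are what make the resulting migrations safe to perform.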
25. Performance: size of system
26. Performance: various machines
27. Speedup
28. Conclusion
- Multi-domain decomposition works well for dynamically evolving or irregular apps
- when supported by data-driven objects (Charm++), user-level threads, and callbacks
- Object-oriented parallel programming
- promotes reuse
- gives good performance
- Multi-paradigm programming is effective!
- Measurement-based load balancing
29. What works?
- To effectively parallelize irregular/dynamic applications:
- decompose into multiple entities per processor
- use adaptive scheduling via data-driven objects
- object-migration-based load balancing
- registration and callbacks to support migration