Transcript and Presenter's Notes

Title: Flexibility and Interoperability in a Parallel MD code


1
Flexibility and Interoperability in a Parallel MD
code
  • Robert Brunner,
  • Laxmikant Kale,
  • Jim Phillips
  • University of Illinois at Urbana-Champaign

2
Contributors
  • Principal investigators
  • Laxmikant Kale, Klaus Schulten, Robert Skeel
  • Development team
  • Milind Bhandarkar, Robert Brunner, Attila Gursoy,
    Neal Krawetz, Jim Phillips, Ari Shinozaki, ...

3
Middle layers
  • Applications
  • Middle layers: languages, tools, libraries
  • Parallel machines
4
(No Transcript)
5
What is needed
  • Not application-centered CS research
  • Not isolated CS research
  • Application-oriented yet computer-science-centered
    research that will enhance the enabling layers in
    the middle

6
Challenges in Parallel Applications
  • Scalable high performance
  • on both small and large numbers of processors
  • for both small and large molecular systems
  • Modifiable and extensible design
  • ability to incorporate new algorithms
  • reuse of new libraries without re-implementation
  • experimenting with alternate strategies
  • How to achieve both simultaneously?

7
Suggested OO Approach
  • Dynamic, irregular applications
  • Use multi-domain decomposition
  • (multiple tasks assigned to each processor)
  • Data-driven scheduling
  • Migratable objects
  • Use registration and callbacks to avoid
    hardwiring of object connections on a processor
    (see the sketch below)
  • Measurement-based migration/load balancing
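A minimal sketch of the registration-and-callback idea in plain C++ (Patch, registerDependent, and positionsReady are invented names, not NAMD's API): objects register interest in a patch's data, and the patch notifies whoever is currently registered, so nothing is hard-wired and registrations can simply be redone after a migration.

```cpp
#include <cstdio>
#include <functional>
#include <utility>
#include <vector>

// Invented sketch: a patch keeps a list of callbacks registered by interested
// compute objects, so connections are not hard-wired and can simply be
// re-registered after an object migrates to another processor.
class Patch {
public:
    using Callback = std::function<void(const std::vector<double>&)>;

    // A compute object registers interest in this patch's coordinates.
    void registerDependent(Callback cb) { dependents_.push_back(std::move(cb)); }

    // When new coordinates are ready, notify every registered dependent.
    void positionsReady(const std::vector<double>& positions) {
        for (auto& cb : dependents_) cb(positions);
    }

private:
    std::vector<Callback> dependents_;
};

int main() {
    Patch p;
    p.registerDependent([](const std::vector<double>& pos) {
        std::printf("force compute sees %zu coordinates\n", pos.size());
    });
    p.positionsReady({0.0, 1.0, 2.0});  // triggers all registered callbacks
}
```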

8
Suggested Approach contd.
  • Use the most appropriate parallel programming
    paradigm for each module
  • Reuse existing libraries, irrespective of the
    language/paradigm in which they are implemented
  • Needs support for multi-paradigm interoperability
  • (supported by Converse)
  • Careful class design and use of C++ features

9
Molecular Dynamics
  • Collection of charged atoms, with bonds
  • Newtonian mechanics
  • At each time-step (see the sketch below)
  • calculate forces on each atom
  • bonded forces
  • non-bonded electrostatic and van der Waals forces
  • calculate velocities and advance positions
  • 1 femtosecond time-step; millions of steps needed!
  • Thousands of atoms (1,000 - 100,000)
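A minimal sketch of the per-time-step work just listed, in C++ (the Atom struct and the placeholder force routines are invented for illustration; production MD codes use velocity Verlet or multiple-time-stepping integrators rather than this simple explicit update):

```cpp
#include <vector>

struct Atom {
    double x[3];    // position
    double v[3];    // velocity
    double f[3];    // accumulated force
    double mass;
};

// Placeholder force routines; real code loops over bond lists and pair lists.
void computeBondedForces(std::vector<Atom>&) {}
void computeNonbondedForces(std::vector<Atom>&, double /*cutoff*/) {}

// One MD time-step: zero forces, accumulate bonded and non-bonded forces,
// then update velocities and advance positions.
void timeStep(std::vector<Atom>& atoms, double dt, double cutoff) {
    for (auto& a : atoms)
        for (int d = 0; d < 3; ++d) a.f[d] = 0.0;

    computeBondedForces(atoms);               // bonds, angles, dihedrals
    computeNonbondedForces(atoms, cutoff);    // electrostatics + van der Waals

    for (auto& a : atoms) {
        for (int d = 0; d < 3; ++d) {
            a.v[d] += dt * a.f[d] / a.mass;   // calculate velocities
            a.x[d] += dt * a.v[d];            // advance positions
        }
    }
}

int main() {
    std::vector<Atom> atoms(100);             // zero-initialized toy system
    for (auto& a : atoms) a.mass = 12.0;      // avoid dividing by zero mass
    for (int step = 0; step < 1000; ++step) timeStep(atoms, 1.0e-15, 12.0);
}
```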

10
Further MD
  • Use of a cut-off radius to reduce work
  • 8 - 14 Å
  • Faraway charges ignored! (see the sketch below)
  • 80-95% of the work is non-bonded force computation
  • Some simulations need faraway contributions
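For concreteness, a naive C++ sketch of cut-off based non-bonded evaluation (simplified to a Coulomb-only energy with unit constants; the function and types are invented for illustration). Real codes avoid the O(N^2) scan with cell or patch lists, which is exactly what the spatial decomposition on the following slides provides.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

struct Vec3 { double x, y, z; };

// Only pairs within the cut-off radius are evaluated; everything else is skipped.
double pairwiseEnergyWithinCutoff(const std::vector<Vec3>& pos,
                                  const std::vector<double>& charge,
                                  double cutoff) {
    const double rc2 = cutoff * cutoff;
    double energy = 0.0;
    for (std::size_t i = 0; i < pos.size(); ++i) {
        for (std::size_t j = i + 1; j < pos.size(); ++j) {
            const double dx = pos[i].x - pos[j].x;
            const double dy = pos[i].y - pos[j].y;
            const double dz = pos[i].z - pos[j].z;
            const double r2 = dx * dx + dy * dy + dz * dz;
            if (r2 > rc2) continue;                           // faraway charges ignored
            energy += charge[i] * charge[j] / std::sqrt(r2);  // Coulomb term only
        }
    }
    return energy;
}

int main() {
    std::vector<Vec3> pos = {{0, 0, 0}, {3, 0, 0}, {20, 0, 0}};
    std::vector<double> q = {1.0, -1.0, 1.0};
    std::printf("E = %f\n", pairwiseEnergyWithinCutoff(pos, q, 12.0));
}
```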

11
NAMD Design Objectives
  • Scalable high performance
  • on both small and large numbers of processors
  • for both small and large molecular systems
  • Modifiable and extensible design
  • ability to incorporate new algorithms
  • reuse of new libraries without re-implementation
  • experimenting with alternate strategies

12
Force Decomposition
  • Distribute the force matrix to processors
  • Matrix is sparse, non-uniform
  • Each processor has one block
  • Communication N/sqrt(P); communication-to-computation
    ratio sqrt(P) (see the note below)
  • Better scalability (can use 100+ processors)
  • Hwang, Saltz, et al. 6 on 32 PEs, 36 on 128 processors
Not Scalable
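A brief note on where the Communication and Ratio figures above come from (a sketch, assuming the N x N force matrix is split into a sqrt(P) x sqrt(P) grid of blocks and cut-off based computation is O(N) overall, i.e. O(N/P) per processor):

```latex
% Each processor holds one block, so it needs the coordinates of the
% N/\sqrt{P} "row" atoms and the N/\sqrt{P} "column" atoms of that block:
\[
  \text{communication per processor} \;\sim\; \frac{N}{\sqrt{P}}, \qquad
  \frac{\text{communication}}{\text{computation}} \;\sim\;
    \frac{N/\sqrt{P}}{\,N/P\,} \;=\; \sqrt{P}.
\]
% The ratio grows with P, which is why the slide's verdict is "Not Scalable".
```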
13
Spatial Decomposition
14
Spatial decomposition modified
15
Implementation
  • Multiple objects per processor
  • Different types: patches, pairwise forces, bonded
    forces, ...
  • Each may have its data ready at different times
  • Need ability to map and remap them
  • Need prioritized scheduling
  • Charm++ supports all of these

16
Charm++
  • Data-driven objects
  • Object groups
  • a global object with a representative on each PE
  • Asynchronous method invocation (see the sketch below)
  • Prioritized scheduling
  • Mature, robust, portable
  • http://charm.cs.uiuc.edu
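As an illustration of data-driven objects and asynchronous method invocation, a rough Charm++-style sketch follows. It is assembled from the standard Charm++ programming model, not taken from NAMD; the module, class, and method names are invented, and the interface (.ci) file shown in the comment would normally be a separate file from which hello.decl.h and hello.def.h are generated.

```cpp
/* Sketch of the accompanying interface (.ci) file:
   mainmodule hello {
     mainchare Main {
       entry Main(CkArgMsg* m);
       entry void done(int result);
     };
     chare Worker {
       entry Worker();
       entry void compute(int value, CProxy_Main mainProxy);
     };
   }
*/
#include "hello.decl.h"

class Main : public CBase_Main {
public:
  Main(CkArgMsg* m) {
    delete m;
    CProxy_Worker w = CProxy_Worker::ckNew();  // create a worker chare; the runtime picks its PE
    w.compute(21, thisProxy);                  // asynchronous invocation: returns immediately
  }
  void done(int result) {                      // data-driven: runs when the worker's message arrives
    CkPrintf("result = %d\n", result);
    CkExit();
  }
};

class Worker : public CBase_Worker {
public:
  void compute(int value, CProxy_Main mainProxy) {
    mainProxy.done(2 * value);                 // the reply is also an asynchronous invocation
  }
};

#include "hello.def.h"
```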

17
Data driven execution
(Diagram: each processor has a message queue feeding a scheduler; a minimal
scheduler-loop sketch follows.)
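A minimal sketch of this data-driven execution loop in plain C++ (Scheduler and Message are illustrative names, not Charm++ or Converse API): each message carries a handler bound to an object, and the scheduler repeatedly dispatches the highest-priority pending message.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Each incoming message names the work to do when it arrives.
struct Message {
    int priority;                       // smaller value = more urgent
    std::function<void()> handler;      // bound object method to invoke
};

struct ByPriority {
    bool operator()(const Message& a, const Message& b) const {
        return a.priority > b.priority; // min-heap on priority
    }
};

class Scheduler {
public:
    void enqueue(Message m) { queue_.push(std::move(m)); }

    // Run until no messages remain (a real runtime also polls the network).
    void run() {
        while (!queue_.empty()) {
            Message m = queue_.top();
            queue_.pop();
            m.handler();                // data-driven: work happens when its message arrives
        }
    }

private:
    std::priority_queue<Message, std::vector<Message>, ByPriority> queue_;
};

int main() {
    Scheduler s;
    s.enqueue({2, [] { /* lower-priority bonded-force work */ }});
    s.enqueue({0, [] { /* urgent coordinate message */ }});
    s.run();                            // processes the priority-0 message first
}
```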
18
Object oriented design
  • Two top-level classes
  • Patches: cubes containing atoms
  • Computes: force calculation objects
  • Home patches and proxy patches
  • Home patch sends coordinates to its proxies, and
    receives forces from them
  • Each compute interacts with local patches only
    (see the sketch below)
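A structural sketch of the patch side of this design in C++ (the class names echo NAMD's Patch/HomePatch/ProxyPatch concepts, but the methods shown are invented for illustration):

```cpp
#include <vector>

struct Vector3 { double x = 0, y = 0, z = 0; };

class Patch {
public:
    virtual ~Patch() = default;

    const std::vector<Vector3>& positions() const { return positions_; }

    // Local compute objects add their force contributions here.
    void depositForces(const std::vector<Vector3>& f) {
        forces_.resize(f.size());
        for (std::size_t i = 0; i < f.size(); ++i) {
            forces_[i].x += f[i].x;
            forces_[i].y += f[i].y;
            forces_[i].z += f[i].z;
        }
    }

protected:
    std::vector<Vector3> positions_;
    std::vector<Vector3> forces_;
};

// Owns its atoms; ships coordinates to proxy copies on other processors and
// combines the forces they send back before the next integration step.
class HomePatch : public Patch {
public:
    std::vector<Vector3> packCoordinatesForProxies() const { return positions_; }
    void receiveForcesFromProxy(const std::vector<Vector3>& f) { depositForces(f); }
};

// Read-only copy on another processor: computes there interact only with this
// local object, never with the remote home patch directly.
class ProxyPatch : public Patch {
public:
    void receiveCoordinates(const std::vector<Vector3>& p) { positions_ = p; }
    std::vector<Vector3> packForcesForHome() const { return forces_; }
};

int main() {
    HomePatch home;
    ProxyPatch proxy;
    proxy.receiveCoordinates(home.packCoordinatesForProxies()); // coordinates out
    home.receiveForcesFromProxy(proxy.packForcesForHome());     // forces back
}
```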

19
Compute hierarchy
  • Many compute subclasses
  • Allow reuse of coordination code
  • Reuse of bookkeeping tasks
  • Easy to add new types of force objects
  • Example: steered molecular dynamics
  • Implementor focuses on the new force functionality
    (see the sketch below)
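A hypothetical C++ sketch of such a hierarchy (Compute, ComputeSteeredMD, and the method names are invented; NAMD's real classes differ): the base class owns the coordination and bookkeeping, so adding steered MD means supplying only the new force kernel.

```cpp
#include <cstddef>
#include <vector>

struct Vector3 { double x, y, z; };

class Compute {
public:
    virtual ~Compute() = default;

    // Called once the patches this compute depends on have delivered their
    // coordinates (coordination/bookkeeping shared by all subclasses).
    void doWork(const std::vector<Vector3>& coords) {
        std::vector<Vector3> forces(coords.size());
        computeForces(coords, forces);   // the only part a subclass provides
        submitForces(forces);            // common result-handling path
    }

protected:
    virtual void computeForces(const std::vector<Vector3>& coords,
                               std::vector<Vector3>& forces) = 0;
    void submitForces(const std::vector<Vector3>& /*forces*/) {
        // deposit into the local patch, reduce energies, etc.
    }
};

// Adding steered MD means writing just this kernel: pull one atom toward a
// target point with a spring, leaving all coordination code untouched.
class ComputeSteeredMD : public Compute {
public:
    ComputeSteeredMD(std::size_t atom, Vector3 target, double k)
        : atom_(atom), target_(target), k_(k) {}

protected:
    void computeForces(const std::vector<Vector3>& coords,
                       std::vector<Vector3>& forces) override {
        forces[atom_].x += k_ * (target_.x - coords[atom_].x);
        forces[atom_].y += k_ * (target_.y - coords[atom_].y);
        forces[atom_].z += k_ * (target_.z - coords[atom_].z);
    }

private:
    std::size_t atom_;
    Vector3 target_;
    double k_;
};

int main() {
    std::vector<Vector3> coords(10);           // zero-initialized toy patch
    ComputeSteeredMD smd(3, Vector3{1.0, 0.0, 0.0}, 2.5);
    smd.doWork(coords);                        // base class drives the shared path
}
```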

20
Multi-paradigm programming
  • Long-range electrostatic interactions
  • Some simulations require this feature
  • Contributions of faraway atoms can be computed
    infrequently
  • PVM-based library: DPMTA
  • developed at Duke by John Board et al.
  • Patch life cycle
  • better expressed as a thread (see the sketch below)
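Why a thread fits the patch life cycle: written as a thread body, the per-step cycle is an ordinary loop that suspends at the waits instead of being split across callbacks. A minimal sketch with placeholder messaging functions (in NAMD2 this runs on lightweight Converse user-level threads; nothing below is Converse API):

```cpp
#include <cstddef>
#include <vector>

struct Vector3 { double x = 0, y = 0, z = 0; };

// Placeholder stubs standing in for the real messaging layer.
void sendCoordinatesToComputes(const std::vector<Vector3>&) {}
std::vector<Vector3> collectForces(std::size_t n) {        // would suspend the thread
    return std::vector<Vector3>(n);                        // until all forces arrive
}
void integrate(std::vector<Vector3>&, const std::vector<Vector3>&, double) {}

// The patch life cycle reads as straight-line code when run on its own thread:
// send coordinates, suspend until the forces come back, integrate, repeat.
void patchLifeCycle(std::vector<Vector3> positions, int numSteps, double dt) {
    for (int step = 0; step < numSteps; ++step) {
        sendCoordinatesToComputes(positions);                          // asynchronous sends
        std::vector<Vector3> forces = collectForces(positions.size()); // thread blocks here
        integrate(positions, forces, dt);                              // resume with results
    }
}

int main() {
    patchLifeCycle(std::vector<Vector3>(8), 10, 1.0);  // 8 atoms, 10 steps
}
```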

21
Converse and Interoperability
  • Supports multi-paradigm programming
  • Provides portability
  • Makes it easy to implement run-time systems (RTS)
    for new paradigms
  • Several languages/libraries
  • Charm++, threaded MPI, PVM, Java, md-perl, pc++,
    Nexus, Path, Cid, CC++, ...

22
Namd2 with Converse
23
Separation of concerns
  • Different developers, with different interests
    and knowledge, can contribute effectively
  • Separation of communication and parallel logic
  • Threads encapsulate the life cycle of patches
  • Adding a new integrator, improving performance, or
    trying new MD ideas can be done modularly and
    independently

24
Load balancing
  • Collect timing data for several cycles
  • Run a heuristic load balancer
  • several alternative strategies are available
  • Re-map and migrate objects accordingly (a greedy
    sketch follows below)
  • Registration mechanisms facilitate migration
  • Needs a separate talk!
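A sketch of one simple heuristic the re-mapping step could use (greedy largest-task-first onto the least-loaded processor), driven by the measured per-object times. The function is invented for illustration; NAMD's actual load balancers are more elaborate and also account for communication.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Returns, for each object, the processor it should be migrated to.
std::vector<int> greedyRemap(const std::vector<double>& measuredLoad, int numProcs) {
    // Sort object indices by measured cost, heaviest first.
    std::vector<std::size_t> order(measuredLoad.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return measuredLoad[a] > measuredLoad[b];
    });

    // Min-heap of (current load, processor id).
    using Proc = std::pair<double, int>;
    std::priority_queue<Proc, std::vector<Proc>, std::greater<Proc>> procs;
    for (int p = 0; p < numProcs; ++p) procs.push({0.0, p});

    std::vector<int> assignment(measuredLoad.size(), -1);
    for (std::size_t obj : order) {
        Proc lightest = procs.top();
        procs.pop();
        assignment[obj] = lightest.second;        // migrate this object there
        lightest.first += measuredLoad[obj];
        procs.push(lightest);
    }
    return assignment;
}

int main() {
    std::vector<double> loads = {5.0, 1.0, 3.0, 2.0, 4.0};   // measured object times
    std::vector<int> where = greedyRemap(loads, 2);
    for (std::size_t i = 0; i < where.size(); ++i)
        std::printf("object %zu -> processor %d\n", i, where[i]);
}
```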

25
Performance: size of system
26
Performance: various machines
27
Speedup
28
Conclusion
  • Multi-domain decomposition works well for
    dynamically evolving or irregular applications
  • when supported by data-driven objects (Charm++),
    user-level threads, and callbacks
  • Object-oriented parallel programming
  • promotes reuse
  • gives good performance
  • Multi-paradigm programming is effective!
  • Measurement-based load balancing

29
What works?
  • To effectively parallelize irregular/dynamic
    applications
  • decompose into multiple entities per processor
  • use adaptive scheduling via data-driven objects
  • use object migration for load balancing
  • use registration and callbacks to make migrations
    transparent