Spatial Decomposition and Pairlist Management for Molecular Dynamics

Anton Arkhipov, Peter L. Freddolino, Lingling Miao, Leonardo Trabuco ...

1
Spatial Decomposition and Pairlist Management for
Molecular Dynamics
2
Molecular Dynamics (MD)
The potential function U_total is derived from quantum
mechanical calculations. It is divided into bonded,
short-range nonbonded, and long-range
electrostatic components.
3
The goal: NAMD-GPU
  • NAMD is a fast, highly parallel MD program
    developed by the Schulten group
  • Short-range nonbonded interactions already
    implemented on GPU
  • Speedups are limited as other portions of the calculation
    become the bottleneck

From J. Stone et al., JCC 28(16):2618-2640 (2007)
4
Pairlists in NAMD
  • Short-range nonbonded interactions must be
    calculated for all pairs within cutoff
  • O(n^2) if all distances are calculated every step
  • Instead, maintain a pairlist of all atoms within
    a certain distance: O(n rp^3)
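
To make the cost argument concrete, here is a minimal Python sketch of the brute-force search that a pairlist avoids repeating every step. It is an illustration only, not NAMD code; the function name pairs_within and the sample coordinates are hypothetical.

```python
import math

def pairs_within(coords, r):
    """Return all index pairs (i, j), i < j, closer than r.

    Every pair of atoms is tested, so the cost grows as O(n^2).
    """
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < r:
                pairs.append((i, j))
    return pairs

# Hypothetical coordinates: three atoms on a line.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(pairs_within(coords, 2.0))
```

With a pairlist, a search like this is run only occasionally with the larger pairlist distance, and the steps in between loop over the stored pairs.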

5
Tasks for Pairlists
  • Spatial binning -- divide atoms into a set of
    patches (cubes) of side length r_pairlist to limit
    the total number of comparisons
  • Pairlist generation -- for each atom, check the
    distance to all atoms in neighboring patches and
    make a list of atoms that are within r_pairlist
  • Pairlist update checking -- at every MD time
    step, check whether any atom has moved farther
    than (r_pairlist - r_cut) / 2
  • Pairlist generation functions can also be used
    for MD postprocessing

6
Tasks for Pairlists
7
Spatial Binning
  • Assign each atom to a patch based on its position
  • Arrange atoms for efficient memory access during
    pairlist generation
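
A minimal Python sketch of the first bullet, assuming a rectangular box divided into a regular grid of cubic patches and positions that lie inside the box. The function name patch_index and the parameters origin, patch_side, and dims are hypothetical, chosen for illustration.

```python
import math

def patch_index(pos, origin, patch_side, dims):
    """Map a 3D position to a flat patch ID.

    dims = (nx, ny, nz) is the number of patches along each axis.
    Assumes pos lies inside the box; no bounds check or periodic wrap.
    """
    ix, iy, iz = (math.floor((p - o) / patch_side)
                  for p, o in zip(pos, origin))
    # x varies fastest, then y, then z.
    return (iz * dims[1] + iy) * dims[0] + ix

# Hypothetical 2 x 2 x 2 grid of unit-side patches.
print(patch_index((1.5, 0.5, 0.5), (0.0, 0.0, 0.0), 1.0, (2, 2, 2)))
```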

8
Binning Functions
1. Make a new coordinates array (coords2) sorted by increasing atomID.
   atomID:  11 0 7 1
   coords:  x11 y11 z11  x0 y0 z0  x7 y7 z7  x1 y1 z1
   coords2: x0 y0 z0  x1 y1 z1  x2 y2 z2  x3 y3 z3
2. Bin atoms using atomID and coords (independent of step 1).
   atomID:  11 0 7 1
   Patch:   5 1 1 8   (atom k is in patch Patch_k)
3. Sort Patch and atomID based on patchID.
   Patch:   0 0 0 0 1 1 1 1 2 2
   atomID:  20 3 2 17 0 7 5 15 19 14
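
Step 3 can be sketched in Python as a key-value sort. The sketch reuses the four-atom values shown in step 2 and assumes the Patch row is aligned entry by entry with the atomID row; the variable names are hypothetical.

```python
# Values from step 2: entry i of patch is the patch of atom atom_id[i].
atom_id = [11, 0, 7, 1]
patch = [5, 1, 1, 8]

# Sort positions by patch ID. sorted() is stable, so atoms in the
# same patch keep their original relative order.
order = sorted(range(len(patch)), key=lambda i: patch[i])
sorted_patch = [patch[i] for i in order]
sorted_atom_id = [atom_id[i] for i in order]

print(sorted_patch)    # [1, 1, 5, 8]
print(sorted_atom_id)  # [0, 7, 11, 1]
```

After the sort, all atoms belonging to one patch sit next to each other, which is what the later offset and count steps rely on.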
9
Binning Functions (Cont.)
4. Process the Patch array to calculate per-patch memory offsets.
5. Process the offsets to calculate the number of atoms in each patch.
6. Sort coords2 based on patchID.
   coords2: x0 y0 z0  x1 y1 z1  x2 y2 z2  x3 y3 z3
   Patch:   0 0 0 0 1 1 1 1 2 2
   atomID:  20 3 2 17 0 7 5 15 19 14
   coords:  x20 y20 z20  x3 y3 z3  x2 y2 z2
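
Steps 4 and 5 can be sketched in Python using the sorted Patch array shown on the slide. This is an illustrative serial version that uses a binary search from the standard library; the slides do not say how the GPU version finds the patch boundaries. The function name patch_offsets and the value num_patches = 3 are assumptions.

```python
from bisect import bisect_left

def patch_offsets(sorted_patch, num_patches):
    """offsets[p] is the index of the first atom of patch p in the
    sorted arrays; offsets[num_patches] is the total atom count."""
    return [bisect_left(sorted_patch, p) for p in range(num_patches + 1)]

sorted_patch = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
offsets = patch_offsets(sorted_patch, 3)

# Step 5: the atom count of a patch is the gap between adjacent offsets.
counts = [offsets[p + 1] - offsets[p] for p in range(3)]

print(offsets)  # [0, 4, 8, 10]
print(counts)   # [4, 4, 2]
```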
10
Coalesced Memory Operations
A common and efficient pattern for operating on
data in R^3.
11
Binning Optimization
12
Binning Timing
13
Pairlist Generation
  • Any atom within the neighboring patches might be
    a pair atom
  • When the distance between two atoms is less than
    r_pairlist, they are considered to be a pair
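
The two bullets above can be sketched in serial Python as follows. This is a simplified illustration, not the GPU implementation: it assumes cubic patches of side r_pairlist, ignores periodic boundaries, and uses a dictionary instead of sorted arrays. The function name build_pairlists is hypothetical.

```python
import math
from collections import defaultdict

def build_pairlists(coords, r_pairlist):
    """For each atom, list the atoms closer than r_pairlist, searching
    only the 3x3x3 block of patches around the atom's own patch."""
    patches = defaultdict(list)
    for i, pos in enumerate(coords):
        key = tuple(math.floor(c / r_pairlist) for c in pos)
        patches[key].append(i)

    pairlists = [[] for _ in coords]
    for i, pos in enumerate(coords):
        cx, cy, cz = (math.floor(c / r_pairlist) for c in pos)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    # .get() avoids creating empty patches on lookup.
                    for j in patches.get((cx + dx, cy + dy, cz + dz), ()):
                        if j != i and math.dist(pos, coords[j]) < r_pairlist:
                            pairlists[i].append(j)
    return pairlists

# Hypothetical coordinates: atoms 0 and 1 are 0.9 apart, atom 2 is far away.
coords = [(0.5, 0.5, 0.5), (1.4, 0.5, 0.5), (3.5, 0.5, 0.5)]
print(build_pairlists(coords, 1.0))
```

Because the patch side equals r_pairlist, two atoms closer than r_pairlist can differ by at most one patch along each axis, so the 27-patch search cannot miss a pair.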

14
Pairlist Generation Implementation
  • The density of our systems is uniform, i.e. each
    atom has 1300 pair atoms
  • Preset a certain pairlist length (PL_SIZE) for
    all the atoms in the system, which can be changed
    by the user
  • A flag is set when the given PL_SIZE is
    not big enough
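
A minimal Python sketch of a fixed-length pairlist layout with an overflow flag, under the assumption that each atom owns PL_SIZE consecutive slots in one flat array. The function name fill_fixed_pairlists, the -1 padding value, and the tiny PL_SIZE used here are hypothetical choices for illustration.

```python
PL_SIZE = 2  # hypothetical small value; in practice set by the user

def fill_fixed_pairlists(neighbors, pl_size=PL_SIZE):
    """Pack per-atom neighbor lists into n * pl_size slots.

    Unused slots hold -1. Returns (flat, counts, overflow), where
    overflow is True if any atom had more neighbors than pl_size.
    """
    n = len(neighbors)
    flat = [-1] * (n * pl_size)
    counts = [0] * n
    overflow = False
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            if counts[i] == pl_size:
                overflow = True  # pl_size is too small for atom i
                break
            flat[i * pl_size + counts[i]] = j
            counts[i] += 1
    return flat, counts, overflow

flat, counts, overflow = fill_fixed_pairlists([[1, 2], [0], [0]])
print(flat, counts, overflow)
```

A fixed stride keeps the address of every atom's list computable without a lookup, at the price of wasted slots and the need to rerun with a larger PL_SIZE when the flag is raised.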


15
Pairlist Generation Implementation
  • Using registers
  • Tiling data into shared memory
  • Using coalesced reading from global memory
  • Also some parameters of the system are stored in
    constant memory

16
Pairlist Generation Optimization
17
Pairlist Generation Timing
18
Pairlist Update Checking
  • For all atoms, check whether any has moved more than
    (r_pairlist - r_cut)/2 since the last update
  • If any atom has, run a complete binning and
    pairlist cycle
  • Occurs once per 10 steps in MD
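
The check above can be sketched in Python as follows. The threshold works because two atoms that each move at most (r_pairlist - r_cut)/2 can approach each other by at most r_pairlist - r_cut, so a pair that was outside r_pairlist at the last update is still outside r_cut. The function name needs_pairlist_update and the numeric values are hypothetical.

```python
import math

def needs_pairlist_update(coords, coords_at_last_update, r_pairlist, r_cut):
    """True if any atom has moved farther than (r_pairlist - r_cut) / 2
    since the pairlist was last rebuilt."""
    limit = (r_pairlist - r_cut) / 2
    return any(math.dist(p, q) > limit
               for p, q in zip(coords, coords_at_last_update))

# Hypothetical distances: r_pairlist = 14, r_cut = 12, so the limit is 1.0.
old = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
new = [(0.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(needs_pairlist_update(new, old, 14.0, 12.0))
```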

19
Plupdate Check Implementation
Coalesced memory reads
20
Plupdate Check Implementation, Multiple Loads
Global -> Registers
Repeat ELEM_PER_BLOCK times
Repeated loads allow one to cover global memory
latency
Registers -> Shared
Global -> Registers
Computation
Execution
21
Plupdate Check Optimization
22
Plupdate Check Timing
23
Conclusions
  • Pairlist generation and pairlist update check
    both yielded good speedups (5-10x)
  • Binning on the CPU is much faster due to the efficiency
    of the cache; this requires further study
  • Functions generated here are fast enough to be
    usefully incorporated into NAMD-GPU
  • Other molecular modeling applications (e.g., VMD)
    can also make use of them

24
Acknowledgements
  • John Stone and Jim Phillips (NAMD-GPU)
  • Professor and TAs
  • David Kirk and NVIDIA
  • Theoretical and Computational Biophysics Group