Title: Spatial Decomposition and Pairlist Management for Molecular Dynamics
1 Spatial Decomposition and Pairlist Management for Molecular Dynamics
2 Molecular Dynamics (MD)
The potential function U_total is derived from quantum
mechanical calculations. It is divided into bonded,
short-range nonbonded, and long-range
electrostatic components.
3 The goal: NAMD-GPU
- NAMD is a fast, highly parallel MD program
developed by the Schulten group
- Short-range nonbonded interactions already
implemented on GPU
- Speedups limited as other portions of the
calculation become the bottleneck
From J. Stone et al., JCC 28(16):2618-2640 (2007)
4 Pairlists in NAMD
- Short-range nonbonded interactions must be
calculated for all pairs within the cutoff
- O(n^2) if all distances are calculated every step
- Instead, maintain a pairlist of all atoms within
a certain distance: O(n rp^3)
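As a minimal sketch of the idea above (not NAMD's implementation), the following builds a pairlist once with an all-pairs scan and then tests only the listed pairs against the real cutoff at each step. The function names, the half-list convention (j > i), and the toy coordinates are assumptions made for illustration.

```python
import math

def build_pairlist_bruteforce(coords, r_pairlist):
    """O(n^2) scan: for each atom i, list the atoms j > i within r_pairlist."""
    n = len(coords)
    pairlist = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < r_pairlist:
                pairlist[i].append(j)
    return pairlist

def pairs_within_cutoff(coords, pairlist, r_cut):
    """Per-step work: only the listed pairs are tested against r_cut."""
    return [(i, j)
            for i, js in enumerate(pairlist)
            for j in js
            if math.dist(coords[i], coords[j]) < r_cut]

# Toy system (illustrative values only).
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.5, 0.0), (9.0, 9.0, 9.0)]
pairlist = build_pairlist_bruteforce(coords, r_pairlist=3.0)
close_pairs = pairs_within_cutoff(coords, pairlist, r_cut=2.0)
```

The expensive scan runs only when the list is rebuilt; between rebuilds, each step costs time proportional to the list length rather than n^2.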
5 Tasks for Pairlists
- Spatial binning -- divide atoms into a set of
patches (cubes) of side length rpairlist to limit
the total number of comparisons
- Pairlist generation -- for each atom, check the
distance to all atoms in neighboring patches and
make a list of atoms that are within rpairlist
- Pairlist update checking -- at every MD time
step, check whether any atom has moved farther
than (rpairlist - rcut) / 2
- Pairlist generation functions can also be used
for MD postprocessing
6 Tasks for Pairlists
7 Spatial Binning
- Assign each atom to a patch based on its position
- Arrange atoms for efficient memory access during
pairlist generation
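One way to assign an atom to a patch is sketched below. It assumes a non-periodic box with a known origin and a flat, x-fastest patch numbering; `patch_id`, `origin`, and `dims` are hypothetical names, not taken from the slides.

```python
import math

def patch_id(pos, origin, r_pairlist, dims):
    """Map a position to a flat patch index for cubic patches of side r_pairlist.

    Assumes the atom lies inside the box spanned by dims patches per axis.
    """
    ix, iy, iz = (int(math.floor((p - o) / r_pairlist))
                  for p, o in zip(pos, origin))
    nx, ny, _nz = dims
    return (iz * ny + iy) * nx + ix

# Illustrative values: a 3 x 3 x 3 grid of patches of side 4.0.
pid = patch_id((5.0, 1.0, 9.5), origin=(0.0, 0.0, 0.0),
               r_pairlist=4.0, dims=(3, 3, 3))
```

Here the atom falls in patch cell (1, 0, 2), which the x-fastest numbering maps to index 19.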
8 Binning Functions
1. Make a new coordinates array sorted by
increasing atomID
atomID:  11 0 7 1
coords:  x11 y11 z11 x0 y0 z0 x7 y7 z7 x1 y1 z1
coords2: x0 y0 z0 x1 y1 z1 x2 y2 z2 x3 y3 z3
2. Bin atoms using atomID and coords (independent
of step 1).
atomID: 11 0 7 1
Patch:  5 1 1 8 (atom k is in patch Patch_k)
3. Sort Patch and atomID based on patchID
Patch:  0 0 0 0 1 1 1 1 2 2
atomID: 20 3 2 17 0 7 5 15 19 14
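Steps 1 to 3 can be sketched on the CPU as follows. This is an illustration under stated assumptions: atom IDs are a permutation of 0..n-1, the box is non-periodic with its origin at zero, and the helper names (`patch_id`, `bin_and_sort`) are hypothetical. A GPU version would use a parallel sort rather than Python's `sorted`.

```python
import math

def patch_id(pos, r_pairlist, dims):
    """Flat patch index (x-fastest) for a box with origin at zero."""
    ix, iy, iz = (int(math.floor(p / r_pairlist)) for p in pos)
    return (iz * dims[1] + iy) * dims[0] + ix

def bin_and_sort(atom_id, coords, r_pairlist, dims):
    n = len(atom_id)
    # Step 1: coords2[a] holds the position of atom a.
    coords2 = [None] * n
    for k, a in enumerate(atom_id):
        coords2[a] = coords[k]
    # Step 2: patch[k] is the patch of the atom stored in slot k.
    patch = [patch_id(coords[k], r_pairlist, dims) for k in range(n)]
    # Step 3: sort Patch and atomID together by patch (stable sort).
    order = sorted(range(n), key=lambda k: patch[k])
    return coords2, [patch[k] for k in order], [atom_id[k] for k in order]

# Toy input: four atoms along x, two patches of side 4.0.
atom_id = [3, 0, 2, 1]
coords = [(5.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.5, 0.0, 0.0), (0.5, 0.0, 0.0)]
coords2, sorted_patch, sorted_atom_id = bin_and_sort(
    atom_id, coords, r_pairlist=4.0, dims=(2, 1, 1))
```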
9 Binning Functions (Cont.)
4. Process Patch array to calculate per-patch
memory offsets
5. Process offsets to calculate
number of atoms in each patch
6. Sort coords2 based on patchID
coords2: x0 y0 z0 x1 y1 z1 x2 y2 z2 x3 y3 z3
Patch:   0 0 0 0 1 1 1 1 2 2
atomID:  20 3 2 17 0 7 5 15 19 14
coords:  x20 y20 z20 x3 y3 z3 x2 y2 z2
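Steps 4 to 6 can be sketched with the Patch and atomID values shown on the slide. The function names are hypothetical, and the coordinates are placeholders (atom a sits at x = a) chosen only so the reordering is easy to follow.

```python
def patch_offsets(sorted_patch, n_patches):
    """Step 4: offsets[p] is the index of the first atom of patch p.

    offsets[n_patches] is the total atom count; empty patches get the
    same offset as the next patch.
    """
    offsets = [0] * (n_patches + 1)
    k = 0
    for p in range(n_patches):
        offsets[p] = k
        while k < len(sorted_patch) and sorted_patch[k] == p:
            k += 1
    offsets[n_patches] = k
    return offsets

def patch_counts(offsets):
    """Step 5: atoms per patch are differences of neighboring offsets."""
    return [offsets[p + 1] - offsets[p] for p in range(len(offsets) - 1)]

def gather_coords(coords2, sorted_atom_id):
    """Step 6: reorder positions so atoms of one patch are contiguous."""
    return [coords2[a] for a in sorted_atom_id]

patch = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
atom_id = [20, 3, 2, 17, 0, 7, 5, 15, 19, 14]
coords2 = [(float(a), 0.0, 0.0) for a in range(21)]  # placeholder positions

offsets = patch_offsets(patch, n_patches=3)
counts = patch_counts(offsets)
coords = gather_coords(coords2, atom_id)
```

With these inputs the offsets are 0, 4, 8, 10 and the first three entries of `coords` belong to atoms 20, 3, and 2, matching the slide.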
10 Coalesced Memory Operations
A common and efficient pattern for operating on
data in R^3.
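The figure for this slide is not reproduced here. One widely used pattern for interleaved x, y, z data, and only an assumption about what the slide showed, is to stage a block's triples through shared memory: in each of three passes, consecutive threads read consecutive array elements, and afterwards each thread picks up its own triple. The sequential simulation below illustrates the index arithmetic only; `staged_load` and its parameters are hypothetical.

```python
def staged_load(flat, block_start, block_size):
    """Simulate one thread block loading block_size xyz triples.

    flat is an interleaved array [x0, y0, z0, x1, y1, z1, ...].
    """
    shared = [0.0] * (3 * block_size)
    base = 3 * block_start
    # Three passes: in pass p, "thread" t reads element base + p*block_size + t,
    # so neighboring threads touch neighboring addresses.
    for p in range(3):
        for t in range(block_size):
            shared[p * block_size + t] = flat[base + p * block_size + t]
    # After a barrier, thread t reads its own triple from the staging buffer.
    return [tuple(shared[3 * t:3 * t + 3]) for t in range(block_size)]

flat = [float(i) for i in range(24)]  # 8 atoms, interleaved xyz
triples = staged_load(flat, block_start=4, block_size=4)
```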
11 Binning Optimization
12 Binning Timing
13 Pairlist Generation
- Any atom within the neighboring patches might be
a pair atom
- When the distance between two atoms is < rpairlist,
they are considered to be a pair
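A CPU sketch of this search is shown below, assuming a non-periodic box with its origin at zero and patches keyed by integer cell coordinates. `build_pairlist` is a hypothetical name, and a dictionary stands in for the sorted patch arrays.

```python
import math
from itertools import product

def build_pairlist(coords, r_pairlist):
    """For each atom, list all other atoms closer than r_pairlist.

    Only the atom's own patch and its 26 neighbors are searched.
    """
    def cell(pos):
        return tuple(int(math.floor(p / r_pairlist)) for p in pos)

    # Bin atoms into cubic patches of side r_pairlist.
    patches = {}
    for i, pos in enumerate(coords):
        patches.setdefault(cell(pos), []).append(i)

    pairlist = [[] for _ in coords]
    for i, pos in enumerate(coords):
        home = cell(pos)
        for shift in product((-1, 0, 1), repeat=3):
            key = tuple(h + s for h, s in zip(home, shift))
            for j in patches.get(key, ()):
                if j != i and math.dist(pos, coords[j]) < r_pairlist:
                    pairlist[i].append(j)
        pairlist[i].sort()
    return pairlist

# Toy system along x with r_pairlist = 4.0 (illustrative values).
coords = [(0.5, 0.5, 0.5), (3.5, 0.5, 0.5), (4.7, 0.5, 0.5), (11.0, 0.5, 0.5)]
pairlist = build_pairlist(coords, r_pairlist=4.0)
```

Because the patch side equals rpairlist, any atom within rpairlist of a given atom must lie in one of these 27 patches.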
14 Pairlist Generation Implementation
- The density of our systems is uniform, i.e. each
atom has 1300 pair atoms
- Preset a certain pairlist length (PL_SIZE) for
all the atoms in the system, which can be changed
by the user
- A flag is there to test when the given PL_SIZE is
not big enough
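The fixed-length layout with an overflow flag could look like the sketch below. Only PL_SIZE comes from the slide; the flat table layout, the padding value, and the function name are assumptions made for illustration.

```python
def fill_fixed_pairlists(neighbors, pl_size, pad=-1):
    """Pack per-atom neighbor lists into a flat table with pl_size slots per atom.

    Returns (table, counts, overflow). overflow is True if any atom had more
    than pl_size neighbors, meaning PL_SIZE was too small and entries were dropped.
    """
    n = len(neighbors)
    table = [pad] * (n * pl_size)
    counts = [0] * n
    overflow = False
    for i, js in enumerate(neighbors):
        for j in js:
            if counts[i] < pl_size:
                table[i * pl_size + counts[i]] = j
                counts[i] += 1
            else:
                overflow = True  # caller should retry with a larger PL_SIZE
    return table, counts, overflow

neighbors = [[1, 2], [0], [0]]
table, counts, overflow = fill_fixed_pairlists(neighbors, pl_size=2)
_, _, overflow_small = fill_fixed_pairlists(neighbors, pl_size=1)
```

A fixed stride per atom keeps the location of every list predictable, at the cost of unused padding slots when an atom has fewer than PL_SIZE neighbors.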
15 Pairlist Generation Implementation
- Using registers
- Tiling data into shared memory
- Using coalesced reads from global memory
- Some parameters of the system are also stored in
constant memory
16 Pairlist Generation Optimization
17 Pairlist Generation Timing
18 Pairlist Update Checking
Time step
- For all atoms, check whether any has moved
(rpairlist - rcut)/2 since the last update
- If any atom has, run a complete binning and
pairlist cycle
- Occurs once per 10 steps in MD
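The check above can be sketched as follows. The threshold is half the margin because two atoms approaching each other can each use up at most half of (rpairlist - rcut) before an unlisted pair could come within rcut. The function name and the numeric values are illustrative assumptions.

```python
import math

def needs_pairlist_update(coords, coords_at_last_update, r_pairlist, r_cut):
    """True if any atom has moved farther than (r_pairlist - r_cut) / 2."""
    threshold = (r_pairlist - r_cut) / 2.0
    return any(math.dist(now, ref) > threshold
               for now, ref in zip(coords, coords_at_last_update))

# Illustrative values: margin 1.5, so the threshold is 0.75.
reference = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
small_move = [(0.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
large_move = [(0.8, 0.0, 0.0), (5.0, 0.0, 0.0)]

keep = needs_pairlist_update(small_move, reference, r_pairlist=13.5, r_cut=12.0)
rebuild = needs_pairlist_update(large_move, reference, r_pairlist=13.5, r_cut=12.0)
```

When the check returns True, the reference positions would be replaced by the current ones after the binning and pairlist cycle is rerun.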
19 Plupdate Check Implementation
Coalesced memory reads
20 Plupdate Check Implementation, Multiple Loads
Global -> Registers
Repeat ELEM_PER_BLOCK times
Repeating the loads helps cover global memory
latency
Registers -> Shared
Global -> Registers
Computation
Execution
21 Plupdate Check Optimization
22 Plupdate Check Timing
23 Conclusions
- Pairlist generation and pairlist update check
both yielded good speedups (5-10x)
- Binning on the CPU is much faster, due to the
efficiency of cache; this requires further study
- Functions generated here are fast enough to be
usefully incorporated into NAMD-GPU
- Other molecular modeling applications (e.g., VMD)
can also make use of them
24 Acknowledgements
- John Stone and Jim Phillips (NAMD-GPU)
- Professor and TAs
- David Kirk and NVIDIA
- Theoretical and Computational Biophysics Group