Latency Tolerance Through Parallelization of Time in Scientific Applications - PowerPoint PPT Presentation

Slides: 21
Provided by: asri9
Learn more at: http://www.cs.fsu.edu

Transcript and Presenter's Notes

Title: Latency Tolerance Through Parallelization of Time in Scientific Applications


1
Latency Tolerance Through Parallelization of Time
in Scientific Applications
  • Ashok Srinivasan
  • Computer Science
  • Florida State University

  • Namas Chandra
  • Mechanical Engineering
  • Florida State University

Aim: Long time scales on small physical systems
Solution features: Time parallelization to avoid fine granularity
www.cs.fsu.edu/asriniva
2
Outline
  • Application
  • Time parallelization
  • Prediction of a Carbon nanotube state
  • Experimental results
  • Conclusions and future work

3
Applications
  • Small physical systems for long time scales
  • Class of applications considered
  • $\mathrm{State}(T_i) = F(\mathrm{State}(T_{i-1}))$
  • Inherently sequential
  • Example
  • Molecular dynamics simulations of Carbon
    nanotubes
  • Time step size: $10^{-15}$ second
  • After a million steps, we are still only in the
    nanosecond range
  • Even that requires about a day of sequential
    computing time for around 3000 atoms
  • Spatial parallelization will lead to too fine a
    granularity
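
The arithmetic behind these figures can be checked with a short sketch. The step size, step count, and one-day estimate come from the slide; the variable names are only illustrative.

```python
# Simulated physical time covered by a sequential MD run.
time_step_s = 1e-15      # time step size from the slide
num_steps = 1_000_000    # "a million steps"

simulated_time_s = time_step_s * num_steps
simulated_time_ns = simulated_time_s / 1e-9

# Wall-clock cost per step if the whole run takes about one day
# (the slide's estimate for around 3000 atoms).
seconds_per_day = 24 * 60 * 60
wall_seconds_per_step = seconds_per_day / num_steps

print(f"simulated time: {simulated_time_ns:.1f} ns")
print(f"wall-clock time per step: {wall_seconds_per_step * 1e3:.1f} ms")
```

Each step costs well under a tenth of a second, which is why splitting a single step across many processors yields very fine-grained work.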

4
CNT application
  • Pull the CNT at a constant velocity
  • Performed to determine material property
  • Material response can be used by an FEM
    simulation in a multiscale model

5
Time parallelization
  • Based on a predict-verify approach
  • Use results of old simulations to speed up the
    current simulation
  • Relationship between different problem parameters
    often occurs in engineering
  • Example: temperature and time, stress and time
  • Find a relationship and use it to predict the
    state at different times
  • The relationship is determined automatically, and
    updated dynamically
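
The predict-verify idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `predict_verify_round` and its callback parameters are hypothetical, the state is an arbitrary object, and the per-processor work is run in a plain loop rather than in parallel.

```python
from typing import Callable, List, Tuple, TypeVar

S = TypeVar("S")


def predict_verify_round(
    start_state: S,
    num_procs: int,
    predict: Callable[[S, int], S],
    simulate: Callable[[S], S],
    equivalent: Callable[[S, S], bool],
) -> Tuple[List[S], int]:
    """Run one round of time-parallel predict-verify.

    Returns the verified end states of the accepted intervals and their count.
    """
    # Predicted state at the start of each interval; interval 0 starts
    # from a known (exact) state.
    starts = [start_state] + [predict(start_state, k) for k in range(1, num_procs)]

    # Each "processor" runs the exact simulation over its own interval.
    # In a real run these calls would execute concurrently.
    ends = [simulate(s) for s in starts]

    accepted = [ends[0]]  # interval 0 started from an exact state
    for k in range(1, num_procs):
        # The predicted start of interval k must match the computed end
        # of interval k - 1; stop at the first mismatch.
        if not equivalent(starts[k], ends[k - 1]):
            break
        accepted.append(ends[k])
    return accepted, len(accepted)
```

The fraction of intervals accepted per round measures how much useful progress the round made; the next round would restart from the last accepted state.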

6
Guided simulations
  • Notation
  • $r$ = exact time / parallel overhead
  • $P$ = number of processors
  • $\alpha$ = progress rate
  • Speedup
  • $= P\alpha / (1 + 1/r)$
  • $\approx P\alpha$
  • If prediction and communication overheads are
    relatively small
  • P Time steps
  • $\alpha \in [1/P, 1]$
  • Requires all-reduce and broadcast
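
As a rough illustration, the speedup expression can be evaluated directly. The sketch assumes the expression reads P·α/(1 + 1/r), with α the progress rate between 1/P and 1; the function name `speedup` is hypothetical.

```python
def speedup(num_procs: int, progress_rate: float, r: float) -> float:
    """Speedup = P * alpha / (1 + 1/r).

    num_procs     -- P, the number of processors
    progress_rate -- alpha, assumed to lie in [1/P, 1]
    r             -- exact computation time divided by parallel overhead
    """
    if not (1.0 / num_procs <= progress_rate <= 1.0):
        raise ValueError("progress rate must lie in [1/P, 1]")
    if r <= 0:
        raise ValueError("r must be positive")
    return num_procs * progress_rate / (1.0 + 1.0 / r)


# With negligible overhead (large r) the speedup approaches P * alpha.
print(speedup(50, 0.8, 1e6))
```

When r is large the denominator is close to 1, so the progress rate alone decides how close the speedup gets to P.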

7
Fault tolerance too
  • In case of node failure, another processor fills
    in the missing time interval
  • Other computations need not be discarded
  • Efficiency close to 1
  • For large P
  • Excluding loss in efficiency from errors
  • If communication cost is negligible
  • A master-worker design may be useful sometimes

[Figure: master assigning time intervals t1, t2, t3, t4 to processors P1, P2, P3, P4]
8
Fault tolerance

[Figure: master assigning time intervals t2, t5, t6 among processors P1, P2, P3, P4]
9
Requirements for this approach
  • Method for predicting a state
  • Criterion for determining whether two states
    (predicted and actual) are similar
  • Choice of suitable base (old) simulation

10
Prediction of a Carbon nanotube state
  • Definition of equivalence of two states
  • Atoms vibrate around their mean position
  • Consider states equivalent if difference in
    position, potential energy, and temperature are
    within the normal range of fluctuations
  • Max displacement 0.211
  • Mean displacement 0.0789
  • Potential energy fluctuation 0.35
  • Temperature fluctuation 12.5 K

[Figure: atom displacement from its mean position]
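
The equivalence criterion could be sketched as a simple threshold check. The four limits are the values listed on the slide, which does not state units for displacement or energy; treating each as an absolute difference, and the names `Thresholds` and `states_equivalent`, are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Thresholds:
    # Values from the slide; units are not given there.
    max_displacement: float = 0.211
    mean_displacement: float = 0.0789
    potential_energy: float = 0.35
    temperature_k: float = 12.5


def states_equivalent(
    pos_a: Sequence[Point],
    pos_b: Sequence[Point],
    energy_a: float,
    energy_b: float,
    temp_a: float,
    temp_b: float,
    limits: Thresholds = Thresholds(),
) -> bool:
    """Return True if two states differ by no more than normal fluctuations."""
    # Per-atom distance between corresponding positions in the two states.
    displacements = [math.dist(p, q) for p, q in zip(pos_a, pos_b)]
    return (
        max(displacements) <= limits.max_displacement
        and sum(displacements) / len(displacements) <= limits.mean_displacement
        and abs(energy_a - energy_b) <= limits.potential_energy
        and abs(temp_a - temp_b) <= limits.temperature_k
    )
```

A predicted state that passes this check would be accepted in place of the exactly computed one.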
11
Prediction
  • Predictor
  • Independently predict change in each coordinate
  • Normalize coordinates to be in [0, 1]
  • $x_{t+\Delta t} = x_t + \dot{x}_{t+\Delta t}\,\Delta t$
  • $\dot{x}_{t+\Delta t}$ is the rate of change of $x$ in this time
    interval
  • It is unknown and needs to be estimated

12
Predict change in coordinates
  • Express $\dot{x}$ in terms of basis functions
  • Example
  • $\dot{x}_{t+\Delta t} = a_{0,t+\Delta t} + a_{1,t+\Delta t}\, x_t$
  • $a_{0,t+\Delta t}$ and $a_{1,t+\Delta t}$ are unknown
  • Express changes, $y$, for the base (old) simulation
    similarly, in terms of coefficients $b$, and perform a
    least squares fit
  • Predict $a_{i,t+\Delta t}$ as $b_{i,t+\Delta t} + R_{t+\Delta t}$
  • $R_{t+\Delta t} = (1-\beta) R_t + \beta (a_{i,t} - b_{i,t})$
  • Intuitively, the difference between the base
    coefficient and the current coefficient is
    predicted as a weighted combination of previously
    observed differences
  • We use $\beta = 0.5$
  • Gives more weight to latest results
  • Does not let random fluctuations affect the
    predictor too much
  • Velocity is estimated from the latest accurate results
    known
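
A minimal sketch of the coefficient prediction follows, under stated assumptions: a single linear basis (constant plus the normalized coordinate), a closed-form least squares fit, and a correction term R updated by exponential smoothing with weight β = 0.5. All function names here are hypothetical, and the recurrence is read as R_new = (1 - β)·R_old + β·(a - b).

```python
from typing import Sequence, Tuple


def fit_linear(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least squares fit ys ~ c0 + c1 * xs; returns (c0, c1).

    Assumes xs contains at least two distinct values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    return c0, c1


def update_correction(r_prev: float, a_current: float, b_base: float,
                      beta: float = 0.5) -> float:
    """Smooth the observed gap between current and base coefficients."""
    return (1.0 - beta) * r_prev + beta * (a_current - b_base)


def predict_coefficient(b_base_next: float, r_next: float) -> float:
    """Predicted current coefficient = base coefficient + correction."""
    return b_base_next + r_next


# Toy data: changes y of the base simulation against normalized coordinates x.
b0, b1 = fit_linear([0.0, 0.5, 1.0], [1.0, 2.0, 3.0])
r = update_correction(r_prev=0.0, a_current=1.2, b_base=b0)
print(predict_coefficient(b0, r))
```

With β = 0.5 the most recent gap and the accumulated history contribute equally, so a single noisy interval moves the correction only halfway.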

13
Experimental results
  • Experimental parameters
  • Carbon nanotube with 1000 atoms
  • Around 200 atoms in the beginning fixed
  • Around 200 atoms at the end moved
    deterministically
  • Time step size 0.5 femtoseconds
  • Time interval per processor 1000 time steps
  • Tersoff-Brenner potential for MD
  • Temperature: 300 K for the current simulation, 10 K for the base
  • $\beta = 0.5$
  • Base simulation: $v$ = 0.05 Å per 1000 time steps
  • Actual simulation: $v$ = 0.0625 Å per 1000 time steps
  • A parallel run was simulated

14
Errors on 50 processors
[Figure: difference between predictor and verifier, with the threshold for accepting the results]
15
Errors on 50 processors
[Figure: difference between predictor and verifier, with the threshold for accepting the results]
16
Errors on 50 processors
[Figure: energy error; difference between predictor and verifier, with the threshold for accepting the results]
17
Errors on 50 processors
[Figure: temperature error; difference between predictor and verifier, with the threshold for accepting the results]
18
Speedup
[Figure: speedup, expected based on progress rate and observed in simulations]
  • Computation time for one time interval: 10 s
  • Prediction time: $10^{-3}$ s
  • Broadcast on 100 processors of IBM SP3: 0.005 s
  • Allreduce on 100 processors of IBM SP3: 0.0005 s

The overhead-to-computation ratio is negligible, and
speedup is determined only by errors
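
The size of the overhead can be estimated from the timings listed on this slide. The sketch assumes one prediction, one broadcast, and one all-reduce per time interval, and that their costs simply add; neither assumption is stated on the slide.

```python
# Timings from the slide, in seconds.
compute_s = 10.0        # exact computation for one time interval
prediction_s = 1e-3     # prediction time
broadcast_s = 0.005     # broadcast on 100 processors
allreduce_s = 0.0005    # all-reduce on 100 processors

# Assumed per-interval overhead: one of each operation, costs summed.
overhead_s = prediction_s + broadcast_s + allreduce_s

# Ratio of exact computation time to parallel overhead.
r = compute_s / overhead_s

# Fraction of the ideal speedup that remains after overhead.
overhead_factor = 1.0 / (1.0 + 1.0 / r)

print(f"overhead per interval: {overhead_s:.4f} s")
print(f"r = {r:.0f}")
print(f"overhead factor: {overhead_factor:.5f}")
```

Under these assumptions overhead removes well under one percent of the ideal speedup, which agrees with the statement that prediction errors dominate.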
19
Limitations of the experiments
  • They are simulations of a parallel implementation
  • But large difference between computation and
    communication time suggests efficient
    implementation

20
Conclusions and future work
  • Conclusions
  • Promises significant improvement in speedup and
    efficiency for long-time simulations, through
    latency tolerance and fault tolerance
  • Future work
  • Implementation on a parallel machine
  • Base simulations with a smaller time scale
  • Better predictors
  • Basis functions corresponding to physical
    phenomena likely to be experienced
  • Use clustering techniques to determine phenomena
    experienced by different regions
  • Automatically and dynamically determine a
    suitable base to use from a large set of
    existing results