Technologies for Computational Science

1
Technologies for Computational Science
  • Boyana Norris
  • Argonne National Laboratory
  • http://www.mcs.anl.gov/norris

2
Outline
  • Automatic differentiation
  • Applications in optimization
  • How AD works
  • Components for scientific computing
  • Performance evaluation and modeling
  • Bringing it all together

3
What is automatic differentiation?
  • Automatic Differentiation (AD): a technology for
    automatically augmenting computer programs,
    including arbitrarily complex simulations, with
    statements for the computation of derivatives,
    also known as sensitivities.

The Computational Differentiation Project at
Argonne National Laboratory
4
What is it good for?
  • The need to accurately and efficiently compute
    derivatives of complicated simulation codes
    arises regularly in:
      • Optimization (finding a minimum)
      • Solving nonlinear differential equations
      • Sensitivity and uncertainty analysis
      • Inverse problems, including:
          • Data assimilation
          • Parameter identification
  • AD tools automate the generation of derivative
    code without precluding the exploitation of
    high-level knowledge.

5
Sensitivity Analysis
MM5 (a mesoscale weather model, NCAR and Penn
State)
Impact of perturbations of initial temperature on
temperature in the system: low-amplitude
supersonic waves are clearly visible with AD (left),
but not visible with divided-difference
approximations of derivatives (right).
6
Parameter Tuning
Sea Ice Model (Todd Arbetter, University of
Colorado)
Ice thickness for the standard (left) and tuned
(right) parameter values, with actual
observations at two locations indicated.
7
Optimization Problems
  • Often we look for extreme, or optimum, values
    that a function has on a given domain. More
    formally: given $f: D \to \mathbb{R}$, find
    $x^* \in D$ such that $f(x^*) \le f(x)$ for all
    $x \in D$.
  • Unconstrained minimization problems are ones in
    which $D = \mathbb{R}^n$.
  • Note: since a maximum of f is a minimum of -f, we
    need only look for the minimum.

8
Newton's Method
  • Method for finding x such that f(x) = 0
  • For optimization, we want ∇f(x) = 0, so
    iterate
    $x_{k+1} = x_k - [\nabla^2 f(x_k)]^{-1} \nabla f(x_k)$
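
A minimal Python sketch of Newton's method applied to the gradient, for a one-dimensional problem where the update reduces to x_new = x - f'(x)/f''(x). The objective f(x) = exp(x) - 2x, the starting point, and the names `newton_minimize`, `grad`, and `hess` are assumptions chosen for illustration; they do not come from the slides.

```python
import math

def newton_minimize(grad, hess, x0, tol=1e-12, max_iter=50):
    """Find a stationary point of f by applying Newton's method to grad(x) = 0."""
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)  # 1-D case: inverse Hessian times gradient
        x -= step
        if abs(step) < tol:
            break
    return x

# Illustrative objective (an assumption): f(x) = exp(x) - 2x, minimized at x = ln 2.
grad = lambda x: math.exp(x) - 2.0   # f'(x)
hess = lambda x: math.exp(x)         # f''(x)

x_star = newton_minimize(grad, hess, x0=0.0)
print(x_star, math.log(2.0))
```

Both derivatives are written by hand here; supplying them accurately is exactly the job an AD tool would take over.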


9
Example: Minimum Surface
Objective: Find a surface with the minimal area
that satisfies Dirichlet boundary conditions and
is constrained to lie above a solid plate.
[Figure panels: Solution, Error]
10
Example: Minimum Surface (Cont.)
11
We can compute derivatives via:
  • Analytic code
      • By hand
      • Automatic differentiation
  • Numerical approximation: finite differencing
    (FD). For finite differences, recall
    $f'(x) \approx \frac{f(x + h) - f(x)}{h}$
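
The accuracy limits of finite differencing can be seen in a small Python experiment. This is a sketch under assumptions: the test function sin(x), the point x = 1, and the step sizes are chosen for illustration only. The forward-difference error first shrinks as h decreases (less truncation error) and then grows again (floating-point cancellation).

```python
import math

def forward_difference(f, x, h):
    """One-sided finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x)) / h

x = 1.0
exact = math.cos(x)  # analytic derivative of sin at x

# Error of the approximation for a range of step sizes.
errors = {h: abs(forward_difference(math.sin, x, h) - exact)
          for h in (1e-1, 1e-4, 1e-8, 1e-12, 1e-15)}

for h, err in errors.items():
    print(f"h = {h:.0e}   error = {err:.3e}")
```

No single step size gives full precision, which is one reason the slides list accuracy as an advantage of AD.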

12
Why use AD?
  • Compared with other methods (numerical
    differentiation via finite differences, hand
    coding, etc.), AD offers a number of advantages:
      • Accuracy
      • Performance
      • Reduced effort
      • Algorithm-awareness

13
More accurate derivatives = faster convergence
Application: modeling transonic flow over an
ONERA M6 airplane wing.
14
Who uses it?
  • AD has been successfully employed in applications
    in:
      • Atmospheric chemistry
      • Breast cancer modeling
      • Computational fluid dynamics
      • Mesoscale climate modeling
      • Network Enabled Optimization System
      • Semiconductor device modeling
      • And also groundwater remediation,
        multidisciplinary design optimization, reactor
        engineering, superconductor simulation,
        multibody simulations, molecular dynamics
        simulations, power system analysis, water
        reservoir simulation, and storm modeling.

15
How AD Works
  • Every programming language provides a limited
    number of elementary mathematical functions,
    e.g., +, -, *, /, sin, cos, ...
  • Thus, every function computed by a program may be
    viewed as the composition of these so-called
    intrinsic functions.
  • Derivatives for the intrinsic functions are known
    and can be combined using the chain rule of
    differential calculus.

16
A Simple Example (Fortran)
Original program:
    x = 3.14159265/4.0
    a = sin(x)
    b = cos(x)
    t = a/b
Differentiated program:
    x = 3.14159265/4.0
    dxdx = 1.0                     ! Initialize seed matrix
    a = sin(x)
    dadx = cos(x)*dxdx             ! TL/CR
    b = cos(x)
    dbdx = -sin(x)*dxdx            ! TL/CR
    t = a/b
    dtda = 1.0/b                   ! TL
    dtdb = -a/(b*b)                ! TL
    dtdx = dtda*dadx + dtdb*dbdx   ! CR
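
The same computation can be sketched in Python with operator overloading instead of generated source. The `Dual` class and the overloaded `sin` and `cos` helpers below are illustrative assumptions, not part of any tool named in this talk. Each value carries its derivative with respect to x, so each intrinsic applies its known derivative together with the chain rule (presumably what the TL and CR comments in the Fortran listing stand for: table lookup and chain rule).

```python
import math

class Dual:
    """A value paired with its derivative with respect to one independent variable."""
    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    def __truediv__(self, other):
        # Quotient rule: d(a/b) = da/b - a*db/b^2
        return Dual(self.val / other.val,
                    self.dot / other.val
                    - self.val * other.dot / (other.val * other.val))

def sin(u):
    # Known derivative of the intrinsic, combined with the chain rule.
    return Dual(math.sin(u.val), math.cos(u.val) * u.dot)

def cos(u):
    return Dual(math.cos(u.val), -math.sin(u.val) * u.dot)

x = Dual(math.pi / 4.0, 1.0)   # seed: dx/dx = 1
a = sin(x)
b = cos(x)
t = a / b                      # t = tan(x)
print(t.val, t.dot)            # close to tan(pi/4) = 1 and 1/cos(pi/4)^2 = 2
```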
17
Modes of AD
  • Forward mode
      • Mode used in simple example
      • Propagates derivative vectors, often denoted ∇u
        or g_u
      • Derivative vector ∇u contains derivatives of u
        with respect to independent variables
      • Time and storage proportional to vector length
        (number of independent variables)
  • Reverse (or adjoint) mode
      • Propagates adjoints, denoted ū or u_bar
      • Adjoint ū contains derivatives of dependent
        variables with respect to u
      • Propagation starts with dependent variables, so it
        must reverse the flow of computation
      • Time proportional to adjoint vector length
        (number of dependent variables)
      • Storage proportional to number of operations
      • Because of this limitation, often applied to
        subprograms
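
A minimal reverse-mode sketch in Python, assuming a simple global tape; the names `Var`, `record`, `backward`, and the test function y = x1*x2 + sin(x1) are illustrative and not taken from any AD tool. Every operation is recorded, which is why storage grows with the number of operations, and a single backward sweep from the dependent variable yields the derivatives with respect to all inputs.

```python
import math

class Var:
    """A recorded node: value, adjoint, and (parent, local partial) pairs."""
    def __init__(self, val, parents=()):
        self.val = val
        self.parents = parents
        self.adj = 0.0

tape = []  # nodes in creation order, which is a valid topological order

def record(val, parents=()):
    node = Var(val, parents)
    tape.append(node)
    return node

def mul(u, v):
    return record(u.val * v.val, ((u, v.val), (v, u.val)))

def add(u, v):
    return record(u.val + v.val, ((u, 1.0), (v, 1.0)))

def sin(u):
    return record(math.sin(u.val), ((u, math.cos(u.val)),))

def backward(output):
    """Propagate adjoints from the dependent variable back to the inputs."""
    output.adj = 1.0
    for node in reversed(tape):
        for parent, partial in node.parents:
            parent.adj += node.adj * partial

x1 = record(2.0)
x2 = record(3.0)
y = add(mul(x1, x2), sin(x1))   # y = x1*x2 + sin(x1)
backward(y)
print(x1.adj, x2.adj)           # dy/dx1 = x2 + cos(x1), dy/dx2 = x1
```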

18
Another Simple Example (C code)
Original code: y = x1*x2*x3*x4
DERIV_val(y): value of program variable y
DERIV_grad(y): derivative object associated with y
19
The AD Process
[Diagram labels: Application Code, Control Files, AD Tool, Code with Derivatives, AD Support Libraries, User's Derivative Driver, Compile & Link, Derivative Program]
20
Ways of Implementing AD
  • Operator Overloading
      • Use language features to generate trace (tape)
        of computation -> implicit computational graph
      • Easy to implement; hard to optimize
      • Examples: ADOL-C
  • Source Transformation (ST)
      • Relies on compiler technology
      • Hard to implement; more powerful
      • Examples: ADIFOR, ADIC, ODYSSEE, TAMC

21
Example AD Tool Architecture (ST)
  • AD engine isolated from front and back ends via XAIF
    (XML AD Interface Format)
      • XML representation of the computational graph
      • Unifies relevant Fortran and C constructs
      • Implements abstractions, e.g., derivative object
  • Shared plug-in differentiation modules

22
XAIF Representation
23
XAIF - Abstraction of the Program at AD-Level
  • Only the core structure of the program is
    reflected in XAIF:
      • Control flow
      • Variable information for active variables
      • Basic blocks
      • Expression DAGs
[Figure: Expression Example, a DAG with nodes var_1, const, var_2, var_3]
24
Estimates of Incremental Computational Costs
25
Hessian Module
  • The Hessian module can compute H, HV, V^T H V,
    W^T H V, as well as arbitrary elements of the
    Hessian (e.g., diagonal, n predetermined
    entries).
  • Tradeoffs in code generation between source
    expansion and speed.
[Figure: Hessian/Function Ratio]

26
Techniques for Improving Performance of AD Code
  • Exploit sparsity (SparsLinC and/or coloring)
  • Exploit parallelism
      • data: stripmine derivative computation
      • task: multithread independent loops
      • time: break computation into phases; pipeline
        derivative computations
  • Exploit interface contractions
      • For computations of the form f(g(x)):
        compute dg/dx, df/dg, multiply to form df/dx
  • Exploit mathematics (e.g., differentiating
    through linear/nonlinear equation solvers)
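
The coloring idea can be sketched in Python for a function with a tridiagonal Jacobian. Columns j and j + 3 never share a nonzero row, so three seed vectors (one per color) recover the whole Jacobian instead of n. The function `F`, the helper names, and the use of central differences as a stand-in for an AD-generated Jacobian-vector product are all assumptions made to keep the sketch short.

```python
def F(x):
    """Illustrative function whose Jacobian is tridiagonal."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append(x[i] ** 2 + left * x[i] + 3.0 * right)
    return out

def jvp(x, s, h=1e-5):
    """Directional derivative J(x) s by central differences (stand-in for AD)."""
    plus = F([xi + h * si for xi, si in zip(x, s)])
    minus = F([xi - h * si for xi, si in zip(x, s)])
    return [(p - m) / (2.0 * h) for p, m in zip(plus, minus)]

def compressed_jacobian(x):
    """Recover the tridiagonal Jacobian from three colored seed vectors."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for color in range(3):
        # Seed: sum of the unit vectors of all columns with this color.
        seed = [1.0 if j % 3 == color else 0.0 for j in range(n)]
        column = jvp(x, seed)
        for i in range(n):
            # In row i, at most one of the columns i-1, i, i+1 has this color.
            for j in (i - 1, i, i + 1):
                if 0 <= j < n and j % 3 == color:
                    J[i][j] = column[i]
    return J

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
J = compressed_jacobian(x)
for row in J:
    print([round(v, 6) for v in row])
```

The cost is three directional derivatives regardless of n, which is the saving that sparsity exploitation aims for.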

27
ANL Tools for AD
  • ADIFOR was developed in collaboration with Rice
    University
      • full support for Fortran 77
      • support for parallelism via MPI and PVM
      • support for sparse Jacobians
  • ADIC is the first and only compiler-based AD tool
    for ANSI C
      • support for the complete ANSI standard
      • will soon support a large subset of C++
      • www.mcs.anl.gov/adic, www.mcs.anl.gov/adicserver
  • XAIF specification and differentiation modules
    (OpenAD project)
      • http://www-unix.mcs.anl.gov/utke/OpenAD

28
AD in Numerical Toolkits
  • NEOS: Network-Enabled Optimization Server
      • http://neos.mcs.anl.gov
      • Efficient computation of gradients for large
        problems, where the objective function has the
        form
  • PETSc (Portable Extensible Toolkit for Scientific
    Computation) solvers (work in progress)
      • User only needs to provide the sequential
        subdomain update function in F77 or ANSI C.
      • Differentiated version of toolkit enables
        optimization/sensitivity analysis of models based
        on PETSc
      • www.mcs.anl.gov/petsc

29
Optimization Solution (PETSc TAO)
Main Routine
Nonlinear Solvers (SNES)
Gradient Evaluation
30
Using AD with the Toolkit for Advanced
Optimization (TAO)

[Diagram labels, function side: Global-to-local scatter of ghost values; Local Function computation; Local Min. Function computation; Parallel function assembly]
[Diagram labels, AD step: Script file; ADIFOR or ADIC]
[Diagram labels, Hessian side: Global-to-local scatter of ghost values; Seed matrix initialization; Local Hessian computation; Parallel Hessian assembly; coded manually, can be automated]
31
Outline
  • Automatic differentiation
  • Components for scientific computing
  • Introduction
  • Example applications
  • Performance evaluation and modeling
  • Summary

CCA
Common Component Architecture
32
Software development approaches
Architectures
Components
Object-oriented libraries: collections of classes
Libraries: collections of subroutines
Unstructured code (everything in main)
33
Components
  • Working definition: a component is a piece of
    software that can be composed with other
    components within a framework; composition can be
    either static (at link time) or dynamic (at run
    time)
      • plug-and-play model for building applications
      • For more info: C. Szyperski, Component Software:
        Beyond Object-Oriented Programming, ACM Press,
        New York, 1998
  • Components enable:
      • Software and tool interoperability
      • Automation of performance instrumentation/monitoring
      • Application adaptivity (automated or user-guided)
  • Pictorial intro

34
Object-oriented vs component-oriented development
  • Component-oriented development can be viewed as
    augmenting OOD with certain policies, e.g.,
    require that certain abstract interfaces be
    implemented
  • Components, once compiled, require a special
    execution environment
  • OO techniques are useful for building individual
    components by relatively small teams; component
    technologies facilitate sharing of code developed
    by different groups by addressing issues in:
      • Language interoperability
          • Via interface definition language (IDL)
      • Well-defined abstract interfaces
          • Enable plug-and-play
      • Dynamic composability
          • Components can discover information about their
            environment (e.g., interface discovery) from
            framework and connected components
  • Can convert from an object orientation to a
    component orientation
      • Automatic tools can help with conversion (ongoing
        work by C. Rasmussen and M. Sottile, LANL)

35
Motivating scientific applications
Physics
Adaptive Solution
Optimization
Meshes
Derivative Computation
Discretization
Molecular structures
Astrophysics
Data Redistribution
Parallel I/O
Aerodynamics
Fusion
36
Motivation: For Application Developers and Users
  • You have difficulty managing multiple third-party
    libraries in your code
  • You (want to) use more than two languages in your
    application
  • Your code is long-lived and different pieces
    evolve at different rates
  • You want to be able to swap competing
    implementations of the same idea and test without
    modifying any of your code
  • You want to compose your application with some
    other(s) that weren't originally designed to be
    combined

37
The model for scientific component programming
CCA
38
CCA Delivers Performance
  • Local
      • No CCA overhead within components
      • Small overhead between components
      • Small overhead for language interoperability
      • Be aware of costs; design with them in mind
          • Small costs, easily amortized
  • Parallel
      • No CCA overhead on parallel computing
      • Use your favorite parallel programming model
      • Supports SPMD and MPMD approaches
  • Distributed (remote)
      • No CCA overhead; performance depends on
        networks, protocols
      • CCA frameworks support OGSA/Grid Services/Web
        Services and other approaches

39
Overhead from Component Invocation
  • Invoke a component with different arguments
  • Array
  • Complex
  • Double Complex
  • Compare with f77 method invocation
  • Environment
  • 500 MHz Pentium III
  • Linux 2.4.18
  • GCC 2.95.4-15
  • Components took 3X longer
  • Ensure granularity is appropriate!
  • Paper by Bernholdt, Elwasif, Kohl and Epperly

Function arg type   f77     Component
Array               80 ns   224 ns
Complex             75 ns   209 ns
Double complex      86 ns   241 ns
40
Language interoperability: what is so hard?
[Diagram: pairwise bindings (Native, cfortran.h, SWIG, JNI, Siloon, Chasm, Platform Dependent) among f77, f90, C, C++, Python, and Java]
41
SIDL/Babel makes all supported languages peers
[Diagram: f77, f90, C, C++, Python, and Java shown as peers around SIDL/Babel]
This is not a Lowest Common Denominator Solution!
42
CCA Concepts Components and Ports
  • Components provide or use one or more ports
  • Components include some code which interacts with
    a CCA framework
  • Frameworks provide services, such as component
    instantiation and port connection

[Diagram: Objective Function, Function Gradient, Function Hessian, and Optimization Algorithm components, with FunctionPort, GradientPort, HessianPort, and OptimizerPort]
  • Implementation details
      • CCA components
          • Inherit from gov.cca.Component
          • Implement setServices method to register ports
            this component will provide and use
          • Implement the ports they provide
          • Use ports on other components
              • Call getPort/releasePort methods of framework
                Services object
      • Ports (interfaces) extend the gov.cca.Port
        interface
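
The provides/uses pattern described above can be sketched in plain Python. This is a toy illustration only: the `Services` class, its method names, and both components are hypothetical stand-ins written for this sketch and are not the real CCA or Babel API. One component provides a GradientPort, another fetches it through its services object, and the wiring happens outside both components.

```python
class Services:
    """Toy stand-in for a framework Services object (not the real CCA API)."""
    def __init__(self):
        self.provided = {}
        self.connections = {}

    def add_provides_port(self, name, port):
        self.provided[name] = port

    def connect(self, uses_name, provider_services, provides_name):
        # Done by the framework, not by the components themselves.
        self.connections[uses_name] = provider_services.provided[provides_name]

    def get_port(self, name):
        return self.connections[name]

    def release_port(self, name):
        pass  # a real framework would track port usage here


class ObjectiveFunction:
    """Component providing function and gradient ports for f(x) = (x - 3)^2."""
    def set_services(self, services):
        services.add_provides_port("FunctionPort", self)
        services.add_provides_port("GradientPort", self)

    def evaluate(self, x):
        return (x - 3.0) ** 2

    def gradient(self, x):
        return 2.0 * (x - 3.0)


class GradientDescentOptimizer:
    """Component that uses a GradientPort without knowing who provides it."""
    def set_services(self, services):
        self.services = services

    def solve(self, x0, step=0.25, iterations=60):
        grad_port = self.services.get_port("GradientPort")
        x = x0
        for _ in range(iterations):
            x -= step * grad_port.gradient(x)
        self.services.release_port("GradientPort")
        return x


# "Framework" wiring: instantiate components and connect ports.
objective_services, optimizer_services = Services(), Services()
objective, optimizer = ObjectiveFunction(), GradientDescentOptimizer()
objective.set_services(objective_services)
optimizer.set_services(optimizer_services)
optimizer_services.connect("GradientPort", objective_services, "GradientPort")

x_min = optimizer.solve(x0=0.0)
print(x_min)
```

Because the optimizer depends only on the port name and its interface, a different objective component could be swapped in by changing the wiring alone.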
43
Example: Unconstrained Minimization Problem
  • Given a rectangular 2-dimensional domain and
    boundary values along the edges of the domain
  • Find the surface with minimal area that satisfies
    the boundary conditions, i.e., compute
    min f(x), where f: R^n → R
  • Solve using optimization components based on
    TAO (ANL)

44
Unconstrained Minimization Using a Structured Mesh
Reused TAO
Solver Driver/Physics
45
Computational Chemistry: Molecular Optimization
  • Investigators: Yuri Alexeev (PNNL), Steve Benson
    (ANL), Curtis Janssen (SNL), Joe Kenny (SNL),
    Manoj Krishnan (PNNL), Lois McInnes (ANL), Jarek
    Nieplocha (PNNL), Jason Sarich (ANL), Theresa
    Windus (PNNL)
  • Goals: Demonstrate interoperability among
    software packages, develop experience with large
    existing code bases, seed interest in chemistry
    domain
  • Problem Domain: Optimization of molecular
    structures using quantum chemical methods

46
Molecular Optimization Overview
  • Decouple geometry optimization from electronic
    structure
  • Demonstrate interoperability of electronic
    structure components
  • Build towards more challenging optimization
    problems, e.g., protein/ligand binding studies

Components in gray can be swapped in to create
new applications with different capabilities.
47
Wiring Diagram for Molecular Optimization
  • Electronic structure components
      • MPQC (SNL):
        http://aros.ca.sandia.gov/cljanss/mpqc
      • NWChem (PNNL):
        http://www.emsl.pnl.gov/pub/docs/nwchem
  • Optimization components: TAO (ANL),
    http://www.mcs.anl.gov/tao
  • Linear algebra components
      • Global Arrays (PNNL):
        http://www.emsl.pnl.gov:2080/docs/global/ga.html
      • PETSc (ANL):
        http://www.mcs.anl.gov/petsc

48
Outline
  • Automatic differentiation
  • Components for scientific computing
  • Performance evaluation and modeling
  • Performance evaluation challenges
  • Component-based approach
  • Motivating example: adaptive linear system
    solution
  • A component infrastructure for performance
    monitoring and adaptation of applications
  • Summary

49
Why Performance Model?
  • Performance models enable understanding of the
    factors that affect performance
  • Inform the tuning process (of application and
    machine)
  • Identify bottlenecks
  • Identify underperforming components
  • Guide applications to the best machine
  • Enable applications-driven architecture design
  • Extrapolate the performance of future systems

50
Challenges in performance evaluation
  • Many tools for performance data gathering and
    analysis
      • PAPI, TAU, SvPablo, Kojak, ...
      • Various interfaces, levels of automation, and
        approaches to information presentation
  • User's point of view
      • What do the different tools do? Which is most
        appropriate for a given application?
      • (How) can multiple tools be used in concert?
      • I have tons of performance data, now what?
      • What automatic tuning tools are available, what
        exactly do they do?
      • How hard is it to install/learn/use tool X?
      • Is instrumented code portable? What's the
        overhead of instrumentation? How does code
        evolution affect the performance analysis process?

51
Incomplete list of tools
  • Source instrumentation: TAU/PDT, KOJAK
    (MPI/OpenMP), SvPablo, Performance Assertions, ...
  • Binary instrumentation: HPCToolkit, Paradyn,
    DyninstAPI, ...
  • Performance monitoring: MetaSim Tracer (memory),
    PAPI, HPCToolkit, Sigma (memory), DPOMP
    (OpenMP), mpiP, gprof, psrun, ...
  • Modeling/analysis/prediction: MetaSim Convolver
    (memory), DIMEMAS (network), SvPablo
    (scalability), Paradyn, Sigma, ...
  • Source/binary optimization: Automated Empirical
    Optimization of Software (ATLAS), OSKI, ROSE
  • Runtime adaptation: ActiveHarmony, SALSA

54
Challenges (where is the complexity?)
  • More effective use → integration
  • Tool developer's perspective
      • Overhead of initially implementing one-to-one
        interoperability
      • Ongoing management of dependencies on other tools
  • Individual scientist's perspective
      • Learning curve for performance tools → less time
        to focus on own research (modeling, physics,
        mathematics, optimization)
      • Potentially significant time investment needed to
        find out whether/how using someone else's tool
        would improve performance → tend to do own
        hand-coded optimizations (time-consuming,
        non-reusable)
      • Lack of tools that automate (at least partially)
        algorithm discovery, assembly, configuration, and
        enable runtime adaptivity

55
What can be done
  • How to manage complexity? Provide:
      • Performance tools that are truly interoperable
      • Uniform easy access to tools
      • Component implementations of software, esp.
        supporting numerical codes, such as linear
        algebra algorithms
      • New algorithms (e.g., interactive/dynamic
        techniques, algorithm composition)
  • Implementation approach: components, both for
    tools and the application software

56
Performance Evaluation Research Center
(http://perc.nersc.gov)
57
What is being done
  • No integrated environment for performance
    monitoring, analysis, and optimization (yet)
  • Most past efforts:
      • One-to-one tool interoperability
  • More recently:
      • OSPAT (initial meeting at SC04), focus on common
        data representation and interfaces
      • Tool-independent performance databases: PerfDMF
      • Eclipse parallel tools project (LANL)

58
OSPAT
  • The following areas were recommended for OSPAT to
    investigate:
  • A common instrumentation API for source level,
    compiler level, library level, binary
    instrumentation
  • A common probe interface for routine entry and
    exit events
  • A common profile database schema
  • An API to walk the callstack and examine the heap
    memory
  • A common API for thread creation and fork
    interface
  • Visualization components for drawing histograms
    and hierarchical displays typically used by
    performance tools

59
Example component infrastructure for multimethod
linear solvers
  • Goal: provide a framework for:
      • Performance monitoring of numerical components
      • Dynamic adaptivity, based on:
          • Off-line analyses of past performance information
          • Online analysis of current execution performance
            information
  • Motivating application examples
      • Driven cavity flow [Coffey et al., 2003],
        nonlinear PDE solution
      • FUN3D: incompressible and compressible Euler
        equations
  • Prior work in multimethod linear solvers
      • [McInnes et al., 03], [Bhowmick et al., 03 and 05],
        [Norris et al., 05]

60
Adaptive Linear System Solution
  • Motivation
      • Approximately 80% of total solution time devoted
        to linear system solution
      • Multi-phase nonlinear solution method, requiring
        the solution of linear systems with varying
        levels of ill-conditioning [Kelley and Keyes,
        1998]
  • New approach aiming to reduce overall time to
    solution
      • Combine more robust (but more costly) methods
        when needed in some phases with faster (but less
        powerful) methods in other phases
      • Dynamically select a new preconditioner in each
        phase based on CFL number
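
A hedged Python sketch of what a CFL-based selection rule might look like. The thresholds, the fill levels, and the function name `select_ilu_fill_level` are hypothetical, and the assumption that larger CFL numbers call for a more robust (higher-fill) ILU preconditioner is ours rather than something stated on the slide.

```python
def select_ilu_fill_level(cfl, thresholds=((10.0, 0), (1000.0, 1)), default_level=2):
    """Return an ILU fill level for the current CFL number (hypothetical rule)."""
    for limit, level in thresholds:
        if cfl < limit:
            return level       # cheaper preconditioner for this phase
    return default_level       # most robust (and most costly) choice

# Illustrative CFL history over the phases of a nonlinear solve.
cfl_history = [1.0, 5.0, 50.0, 500.0, 5000.0]
levels = [select_ilu_fill_level(cfl) for cfl in cfl_history]
print(levels)
```

In a real solver the rule would be re-evaluated at each phase and the chosen level passed to the linear solver's preconditioner setup.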

61
Example: driven cavity flow
  • Linear solver: GMRES(30); vary only fill level of
    ILU preconditioner
  • Adaptive heuristic based on:
      • Previous linear solution convergence rate,
        nonlinear solution convergence rate, rate of
        increase of linear solution iterations
  • 96x96 mesh, Grashof number 10^5, lid velocity 100
  • Intel P4 Xeon, dual 2.2 GHz, 4 GB RAM

62
Bringing it all together
  • Integration of ongoing efforts in:
      • Performance tools: common interfaces and data
        representation (leverage OSPAT, PerfDMF, TAU
        performance interfaces, and similar efforts)
      • Numerical components: emerging common interfaces
        (e.g., TOPS solver interfaces) increase choice of
        solution method → automated composition and
        adaptation strategies
      • Code generation, e.g., AD
  • Long term
      • Is a more organized (but not too restrictive)
        environment for scientific software lifecycle
        development possible/desirable?

63
Multimethod linear solver components
Adaptive Heuristic
Linear Solver B
Linear Solver C
64
AD as Component Factory
  • Both NEOS and PETSc rely on a well-defined
    function interface in order to provide
    derivatives via AD
  • Extend this idea to components

Function
AD Tool
Jacobian
65
Summary
  • Automation at all levels of the application
    development process can simplify and speed up
    application development and result in better
    software quality and performance
  • AD addresses the widespread need for accurate
    and efficient derivative computations
  • CCA defines a high-performance component model,
    enabling large-scale software development
  • A growing array of performance tools and
    methodologies aid in understanding and
    fine-tuning application performance
  • Current and future work: bringing these
    technologies together in a coherent way, making
    large-scale scientific application development as
    easy as possible

66
Acknowledgments
  • Paul Hovland, Jean Utke, Lois Curfman McInnes
    (ANL)
  • Sanjukta Bhowmick (ANL/Columbia)
  • Ivana Veljkovic, Padma Raghavan (Penn State)
  • Sameer Shende, Al Malony (U. Oregon)
  • CCA and PERC members
  • Funding: DOE and NSF

67
For More Information
  • Automatic differentiation
      • Andreas Griewank. Evaluating Derivatives:
        Principles and Techniques of Algorithmic
        Differentiation, SIAM, 2000.
      • www.autodiff.org: publications, tools, etc.
      • www.mcs.anl.gov/adicserver: ADIC server
      • neos.mcs.anl.gov: NEOS server
  • Common component architecture
      • www.cca-forum.org
  • Performance tools
      • perc.nersc.gov
  • Student opportunities at MCS/ANL
      • www-fp.mcs.anl.gov/division/information/educational_programs/studentopps.html
  • Boyana Norris
      • Email: norris@mcs.anl.gov, Web:
        www.mcs.anl.gov/norris