Lessons Learned from Managing and Deploying ESMF

1
Lessons Learned from Managing and Deploying ESMF
Cecelia DeLuca / NCAR Portfolio / Institute
Create Meeting April 23-25, 2008
2
Outline
  • About ESMF
  • Best Practices, On and Off List
  • Governance
  • Personal History
  • Conclusion

3
About ESMF
  • ESMF provides component wrappers with standard
    interfaces, a set of data structures for data
    exchanges, and customizable drivers.
  • ESMF provides common utilities, such as data
    communications, regridding, time management,
    configuration, and message logging.
  • Goals are component interoperability and software
    reuse

4
Project History
  • First phase started in 2002 with NASA funding
    support for framework development and framework
    integration into applications
  • Second phase began in 2005 with a transition to
    multi-agency support and management (NASA, NOAA,
    NSF, and DoD sponsors)
  • ESMF component interfaces are the basis of
    programs at multiple agencies:
  • DoD Battlespace Environments Institute (BEI)
  • NOAA National Environmental Modeling System
    (NEMS)
  • NASA Modeling Analysis and Prediction Program for
    Climate Variability and Change (MAP)
  • Community Sediment Transport Model (ONR)
  • In operational use since August 2006, initially
    at the National Weather Service

5
ESMF Earth Science Components Models
6
ESMF Application Example
GEOS-5 Atmospheric General Circulation
Model Application Example
  • Each box is an ESMF component
  • Every component has a standard interface to
    facilitate exchanges: call ESMF_CompRun(myComp,
    importState, exportState, clock, phase,
    blockingFlag, rc)
  • Hierarchical architecture enables the systematic
    assembly of many different systems

7
ESMF Package Description, the Basics
  • Component-based architecture, data and task
    parallel
  • Components can run concurrently, sequentially or
    in mixed mode
  • Serial or parallel
  • Single or multiple executable or combinations
  • Shared or distributed memory or hybrid
  • Support for model ensembles, including execution
    of multiple instances in the same address space
  • 400,000 SLOC - mostly written in C, wrapped in
    Fortran 90
  • 2000 unit tests, system tests, and examples
    regression tested nightly on 26
    platform/compiler combinations
  • Reference Manual, Users Guide and the examples
    therein updated automatically with code changes

8
ESMF Package Description, Coupling
  • ESMF data structures are used to wrap (copy or
    reference) user data and transfer/transform it
    between components
  • Data transformations can be executed within a
    coupler component, or arranged in a coupler
    component and executed within model components
  • Coupling can be done in index space
  • General multi-dimensional distributed arrays
  • Sparse matrix multiply for regridding, user
    defined weights
  • Coupling can be done in physical space
  • Fields combine grids, arrays, and metadata
  • Field regrid operation for regridding,
    ESMF-generated weights
  • Grids supported are logically rectangular, 3D
    finite element mesh coming
  • Performance requirements: < 5% overhead in time
    to solution vs. customized native approaches,
    highly scalable in performance and memory
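
The index-space coupling described above (a sparse matrix multiply with user-defined weights) can be illustrated with a minimal Python sketch. This is not ESMF code: the function name `sparse_regrid`, the triple layout, and the averaging weights are all assumptions chosen for illustration.

```python
# Conceptual sketch only: regridding as a sparse matrix multiply.
# Names and weights are hypothetical and are not part of the ESMF API.

def sparse_regrid(weights, src, n_dst):
    """Compute dst[i] = sum over j of w[i, j] * src[j].

    weights: list of (dst_index, src_index, weight) triples,
             i.e. only the non-zero entries of the weight matrix
    src:     flat list of source values in index space
    n_dst:   number of destination points
    """
    dst = [0.0] * n_dst
    for i, j, w in weights:
        dst[i] += w * src[j]
    return dst


# Four source points mapped to three destination points by
# averaging neighbouring pairs (illustrative, user-defined weights).
weights = [
    (0, 0, 0.5), (0, 1, 0.5),
    (1, 1, 0.5), (1, 2, 0.5),
    (2, 2, 0.5), (2, 3, 0.5),
]
src = [10.0, 20.0, 30.0, 40.0]
dst = sparse_regrid(weights, src, n_dst=3)
print(dst)  # [15.0, 25.0, 35.0]
```

Only the non-zero weights are stored, so the cost scales with the number of weights rather than with the full source-by-destination matrix.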

9
Standardization of Component APIs
  • Only three ESMF component methods, Initialize,
    Run, and Finalize (I/R/F)
  • Users create a component by assigning their user
    code I/R/F methods to an ESMF component type
  • The ESMF component calls down into the specific
    user-assigned methods
  • I/R/F methods cascade down the tree
  • Small set of standard arguments

call ESMF_CompRun(myComp, importState, exportState, clock, phase, blockingFlag, rc)
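
The pattern on this slide (user code assigned to Initialize/Run/Finalize, with calls cascading down a component tree) can be sketched in Python. This is a conceptual illustration, not the ESMF interface: the `Component` class, `make_method`, and the component names are hypothetical.

```python
# Conceptual sketch only: three standard methods per component,
# each cascading to child components. Not the ESMF API.

log = []


def make_method(label):
    """Build a stand-in for a user-supplied I/R/F routine."""
    def method(comp, import_state, export_state, clock):
        log.append(f"{label}:{comp.name}@{clock}")
    return method


class Component:
    def __init__(self, name, init, run, final, children=()):
        self.name = name
        # User code is assigned to the three standard entry points.
        self._methods = {"initialize": init, "run": run, "finalize": final}
        self.children = list(children)

    def _call(self, phase, import_state, export_state, clock):
        # Call down into the user-assigned method, then into the children.
        self._methods[phase](self, import_state, export_state, clock)
        for child in self.children:
            child._call(phase, import_state, export_state, clock)

    def initialize(self, import_state, export_state, clock):
        self._call("initialize", import_state, export_state, clock)

    def run(self, import_state, export_state, clock):
        self._call("run", import_state, export_state, clock)

    def finalize(self, import_state, export_state, clock):
        self._call("finalize", import_state, export_state, clock)


def new_component(name, children=()):
    return Component(name, make_method("init"), make_method("run"),
                     make_method("final"), children)


# A small hierarchy: one parent with two children.
agcm = new_component("agcm", [new_component("dynamics"),
                              new_component("physics")])
imp, exp = {}, {}
agcm.initialize(imp, exp, clock=0)
for step in range(2):
    agcm.run(imp, exp, clock=step)
agcm.finalize(imp, exp, clock=2)
print(log[:3])  # ['init:agcm@0', 'init:dynamics@0', 'init:physics@0']
```

Because every component exposes the same three methods with the same arguments, a driver only needs to know about the top of the tree.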
10
Standardization of Data Structures
  • ESMF State data structures contain all data
    exchanged between components
  • Flexibility in data representation - States can
    contain lists of varied data structures,
    including Arrays, ArrayBundles, Fields,
    FieldBundles, and other States
  • The ESMF philosophy is to constrain the number
    of component methods and their arguments, but
    retain flexibility in the data structures used in
    inter-component exchanges
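
The idea of a State that holds varied data structures, including other States, can be sketched as a nested container. The `State` class below is a hypothetical Python stand-in, not the ESMF data structure; the lists and dictionary merely stand in for Arrays and FieldBundles.

```python
# Conceptual sketch only: a container that can hold mixed items,
# including other containers of the same type. Not the ESMF API.

class State:
    def __init__(self, name):
        self.name = name
        self.items = {}

    def add(self, name, item):
        self.items[name] = item

    def get(self, path):
        """Look up an item by a '/'-separated path through nested States."""
        node = self
        for part in path.split("/"):
            node = node.items[part]
        return node

    def names(self, prefix=""):
        """List the path of every item that is not itself a State."""
        out = []
        for name, item in self.items.items():
            if isinstance(item, State):
                out.extend(item.names(prefix + name + "/"))
            else:
                out.append(prefix + name)
        return out


export_state = State("atm_export")
export_state.add("temperature", [280.0, 281.5, 279.0])         # stands in for an Array
export_state.add("winds", {"u": [1.0, 2.0], "v": [0.5, 0.0]})  # stands in for a FieldBundle

surface = State("surface")                                      # a nested State
surface.add("pressure", [1000.0, 998.5])
export_state.add("surface", surface)

print(export_state.names())  # ['temperature', 'winds', 'surface/pressure']
print(export_state.get("surface/pressure"))  # [1000.0, 998.5]
```

The component methods never change signature; only the contents of the State vary from one exchange to another.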

11
ESMF Design Philosophy
  • Design and build comprehensive software engines
    at lower levels
  • Build simple user interfaces on top for specific
    problems
  • Expose a general interface for users who need
    more flexibility
  • Tricky issue: it can be difficult to involve
    customers in early design discussions, because
    developers are most concerned with implementing
    generality, and users first want to see simple
    problems

What the developer sees
What the user sees
12
ESMF Release Path
[Figure: release timeline spanning 2002 to 2010, with phases labeled
"Full system prototype", "Building bottom-up", and "Standardization"]
  • ESMF v1: Prototype
  • ESMF v2: Components, VM and Utils
    (ESMF_GridCompRun())
  • ESMF v3: Index Space Operations
    (ESMF_ArraySparseMatMul())
  • ESMF v4: Grid Operations (ESMF_GridCreate(),
    ESMF_FieldRegrid())
  • ESMF v5: Standardization (build, init, data
    types, error handling, ...)
  • Last public release: ESMF v2.2.2r
  • Last internal release: ESMF v3.1.0p1
13
Team Composition
  • Team composition (blue is off-site)
  • Manager (agency coordination and technical
    coordination)
  • Operations Manager (website, metrics, space
    issues, local administration)
  • Integrator/Test lead (regression testing, release
    management)
  • Tester for numerical methods
  • 7 developers
  • Systems level/porting/build
  • Low level data structures and architecture
  • Structured grids
  • Meshes
  • High level numerical data structures
  • Language interfaces
  • Utilities including calendaring
  • Metadata and attributes
  • External and related:
  • External: ½ FTE for performance testing
  • Related: FTE for component distribution portal

14
Outline
  • About ESMF
  • Best Practices, On and Off List
  • Governance
  • Personal History
  • Conclusion

15
Best Practices: Listening to the Old-Timers
  • Rational Unified Process
  • Develop software iteratively
  • Manage requirements
  • Use component-based architectures
  • Visually model software
  • Control changes to software
  • Agile Development Checklist
  • Aggressive refactoring
  • Automated, frequent testing
  • Automated, frequent build and deployment
  • Continuous integration
  • Source control
  • Communication plan
  • Task tracking
  • Self-documenting code
  • Peer review
  • Customer view of work in progress
  • Feedback mechanism
  • Airlie Council (original set)
  • Formal risk management
  • Agreement on interfaces
  • Formal inspections
  • Metrics-based scheduling and management
  • Binary quality gates at the inch-pebble level -
    status should be tracked through binary
    completion of relatively small tasks
  • Program-wide visibility of progress vs plan
  • Defect tracking against quality targets
  • Configuration management
  • People-aware management accountability

Blue means a practice followed by ESMF.
16
... and Not Listening
  • We should have:
  • Visually modeled the software
  • Why not? Lack of tools for automating generation
    of diagrams in Fortran and language mixtures;
    a maintenance issue when done by hand
  • Would still like to improve this area
  • We backed off:
  • Aggressive refactoring
  • Not consistent across code: Fortran developers
    don't, C developers do
  • Formal risk assessment
  • Top risks to project survival involve adoption
    and funding
  • In a highly politicized multi-agency environment,
    better to discuss than to formally document
  • Agreement on interfaces
  • Didn't agree to all interfaces before
    implementing, but hold design reviews before and
    after implementation of new classes

17
Disagreeing With the Old-Timers: Distributed
Development Can Work
  • Even with all the cool communication technology
    in the world, we really can't pull off that kind
    of feedback in a highly distributed environment.
    Heck, even the cool technology in StarTrek
    couldn't pull this off. The technology needed to
    make us truly agile is face-to-face people.
    (Steve McConnell)
  • Making face-to-face optional
  • Saves time and money on travel
  • Enables people to accommodate their home lives
  • Creates collaboration infrastructure that
    supports team work off hours
  • Creates collaboration infrastructure that
    supports interactions with remote customers
  • Enables hiring from a national pool

18
How ESMF Does Distributed Development
  • GOAL - Everybody on the team has access to all
    information, current and past
  • Archived email list where all development
    correspondence gets cc'd
  • Frequent telecons with minutes
  • Web browsable repository, mail summary on
    check-ins
  • Daily archived test results
  • Monthly archived metrics
  • Public archived trackers (bugs, feature requests,
    support requests, etc.)
  • Discouraged: IMing, one-to-one correspondence or
    calls; the medium matters
  • "If it's not in the project archive, it doesn't
    exist."

19
How ESMF Does Distributed Development, Continued
  • Strict Battle Rhythm
  • Regular meeting times, reporting periods, annual
    project cycle, etc. build confidence
  • Remote members can tell what's going on without
    constant updates
  • Articulated, non-negotiable project values
  • Gives the team an identity
  • Helps quickly filter out people who do not fit
    the team profile
  • Distributed development is not for everyone:
    good fits are secure, work-focused, smart,
    communicative, positive

ESMF team member
20
Values
  • Community driven development and community
    ownership
  • Openness of project processes, management, code
    and information
  • Correctness
  • Commitment to a globally distributed and diverse
    development and customer base
  • Simplicity
  • Efficiency
  • Public storage of project records and other
    information
  • Engagement
  • Web link for detail:
    http://www.esmf.ucar.edu/about_us/values.shtml

21
Staff
  • Best advice received: spend most management
    time with best people
  • Attention can fix broken software, doesn't often
    fix people
  • Tension demoralizes the team and is distracting
  • Use term positions, contractors, and redirection
    to address difficulties where possible

22
Outline
  • About ESMF
  • Best Practices, On and Off List
  • Governance
  • Personal History
  • Conclusion

23
Beyond Best Practices: Governance
  • Management of ESMF required governance that
    recognized social and cultural factors as well as
    technical factors
  • Main objectives of governance
  • Enabling people to fight and criticize in a
    civilized, contained, constructive way
  • Enabling people to make decisions based on
    resource realities
  • Observations
  • Sometimes just getting everyone equally
    dissatisfied and ready to move on is a victory
  • Thorough, informed criticism is about the most
    useful input a project can get

24
Governance Interactions
  • Multiple timescales, all staff levels
  • Places for structured argument

[Diagram: governance bodies and their interactions]
Executive Management
  • Executive Board: strategic direction,
    organizational changes, board appointments
  • Interagency Working Group: stakeholder liaison,
    programmatic assessment and feedback
  • Advisory Board: external projects coordination,
    general guidance and evaluation
Working Project
  • Joint Specification Team: requirements
    definition, design and code reviews, external
    code contributions
  • Change Review Board: development priorities,
    release review and approval
  • Core Development Team: project management,
    software development, testing and maintenance,
    distribution and user support
Interactions shown: reporting, functionality change
requests, resource constraints, implementation
schedule, collaborative design, beta testing
Timescales shown: annually, quarterly, weekly, daily
25
Outline
  • About ESMF
  • Best Practices, On and Off List
  • Governance
  • Personal History
  • Conclusion

26
Early Challenges
  • New manager, new team, newly at institution
    introduced inefficiencies in administrative and
    technical process (how do you get it done here?
    who can do what?). RESOLVED BY: Time
  • Key low-level data structures (data blocks,
    grids) were not well designed/implemented.
    RESOLVED BY: Determined unsalvageable and began
    redesign
  • Releases without sufficient testing or
    documentation. RESOLVED BY: Put more resources
    into testing and documentation and implementation
    of best practices

27
Later Challenges
  • Some staff lacked necessary expertise to
    implement new requirements. RESOLVED BY: Keeping
    the best, but turning over the other staff;
    hard, but brought in a better team
  • Historical conflicts, competition for resources
    and conflicts of interest. RESOLVED BY: Strategic
    planning and partnerships help, but many issues
    are out of the project's control

28
Success Factors
  • Though new, I had:
  • domain and technical expertise
  • many contacts in Earth and space modeling
  • Experience working with a team at Lincoln
    Laboratory that was well managed (by Bob Bond)
    and developed high performance framework
    software; this was critical as both a technical
    and process model
  • Strong mentorship and support from a number of
    senior scientists and developers across the
    community
  • Enough resources to plan, design, implement,
    document, test

29
Diligence of Management: The Daily Checklist
  • Funds need to come or go?
  • Staff tasked and working? Hires?
  • Customer issues?
  • Does the product need attention? Website, legal,
    reviews, design decisions?
  • Routine administration or meeting planning?
  • PR, papers, presentations due?
  • Strategic actions or issues?
  • In the end, you have to make the project work
    every day with every conversation and decision

30
Outline
  • About ESMF
  • Best Practices, On and Off List
  • Governance
  • Personal History
  • Conclusion

31
Conclusion: Recurring Themes
  • Information Management
  • Discipline and Diligence
  • Planning
  • Conflict Management
  • Honest Evaluation

32
Extras
33
What Governance Needs to Achieve
  • Prioritize development tasks in a manner
    acceptable to major stakeholders and the broader
    community, and define development schedules based
    on realistic assessments of resource constraints
    (CRB)
  • Deliver a product that meets the needs of
    critical applications, including adequate and
    correct functionality, satisfactory performance
    and memory use, ... (Core)
  • Support users via prompt responses to questions,
    training classes, minimal code changes for
    adoption, thorough documentation, ... (Core)
  • Encourage community participation in design and
    implementation decisions frequently throughout
    the development cycle (JST)
  • Leverage contributions of software from the
    community when possible (JST)
  • Create frank and constructive mechanisms for
    feedback (Adv. Board)
  • Enable stakeholders to modify the organizational
    structure as required (Exec. Board)
  • Coordinate and communicate at many levels in
    order to create a knowledgeable and supportive
    network that includes developers, technical
    management, institutional management, and program
    management (IAWG and other bodies)

34
Facilitating Science: Coupled Climate-Chemistry
with ESMF
The image shows results from a version of the
GEOS-5 atmospheric general circulation model
coupled to a stratospheric chemistry
package (STRAT-CHEM), also developed at NASA, but
independently of GEOS-5, which has now been made
ESMF compliant.