Developing Reliable Complex Software Systems in a Research Environment

1
Developing Reliable Complex Software Systems in a
Research Environment
  • Christopher Mueller and Andrew Lumsdaine
  • Open Systems Laboratory
  • Indiana University

2
Research Software Projects
  • Software is an essential research tool
  • Many projects use custom software
  • Data gathering and processing
  • Simulation
  • Analysis and visualization
  • Algorithm/protocol development
  • Glue for 3rd party applications
  • Many research areas are developing common
    application frameworks
  • Software is often developed by a combination of
    grad students, undergrads, PIs, collaborators,
    and consultants using little or no process.

3
Evolution of an Application
  • First application written in Fortran (F77)
  • A Model for Baconian Dynamics
  • Tom ports to C
  • Adds command line parameters, makefile
  • An Application Framework for Baconian Dynamics
  • Jenny ports to F90
  • Extends model
  • An Extended Model of Baconian Dynamics
  • Brad ports to C++
  • Models system using objects
  • An Object Oriented System for Dynamical Baconian
    Systems
  • Jeremy (consultant) rewrites existing versions as
    a C++ version
  • Advanced template and object patterns lead to
    fast and extensible code that is indecipherable
    by scientists
  • Maria implements model in Java for the Grid
  • Implements original model
  • A Scalable, Grid Enabled Toolkit for Baconian
    Systems
  • Baconian Dynamics predicts summer blockbusters
  • Everyone wants a copy of the software

[Diagram: language of each major version, in order:
F77, C, F90, C++, C++, Java]

Baconian Dynamics predicts the success of a
movie using models based on the cast's Bacon
Numbers, i.e., how many degrees of separation lie
between the actors and a movie starring Kevin Bacon.
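
The slide does not say how Bacon Numbers are computed. A minimal sketch, assuming the conventional definition (Kevin Bacon has number 0, his co-stars 1, their co-stars 2, and so on), is a breadth-first search over a co-star graph. The function name `bacon_numbers`, the `casts` structure, and the movie and actor names are all hypothetical placeholders, not data from the presentation.

```python
from collections import deque


def bacon_numbers(casts, source="Kevin Bacon"):
    """Return {actor: degrees of separation from `source`}.

    `casts` maps a movie title to the list of actors in it. Two actors
    are neighbors if they appear in the same movie. Actors with no
    chain of shared movies back to `source` are left out of the result.
    """
    # Build the co-star graph: actor -> set of actors sharing a movie.
    costars = {}
    for actors in casts.values():
        for actor in actors:
            others = (a for a in actors if a != actor)
            costars.setdefault(actor, set()).update(others)

    # Breadth-first search visits actors in order of increasing distance.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in costars.get(current, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return distance


# Hypothetical casts, for illustration only.
casts = {
    "Movie 1": ["Kevin Bacon", "Actor A"],
    "Movie 2": ["Actor A", "Actor B"],
    "Movie 3": ["Actor B", "Actor C"],
    "Movie 4": ["Actor D"],
}
numbers = bacon_numbers(casts)
```

Here "Actor D" shares no movie with anyone connected to Kevin Bacon, so it is absent from `numbers` rather than assigned a made-up value.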
4
A Closer Look
  • There were 6 major versions
  • 13 actual implementations
  • 5 Languages
  • 2 major versions advanced the science
  • 4 major versions were simply software projects
  • All versions re-implemented basic features
  • The implementations used for the papers were not
    always used for the next major version

[Diagram: the 13 implementations (F77, F90, C, C++,
and Java variants), marking the version used for
each paper and the versions that advanced the science]
5
Research Software Crisis!
Problem: Research software applications are
difficult to develop and are costing researchers
time and money.
Solution: Separate research and development, and
use a development model derived from industrial
software development.
6
Software Development
Software development is an iterative process that
consists of three main phases: business modeling,
application development, and maintenance.
[Process diagram: Business Modeling, Requirements,
Design, Implement, Test, Deploy, Maintain]
7
Business Modeling
Goal: Understand the main roles and procedures
used in a research program
  • Identify Roles
  • Researcher
  • Support staff
  • Developer
  • PI
  • Identify projects
  • Identify common workflows
  • Data processing pipelines
  • Experimental protocols
  • Identify commonly used data
  • Instrument data
  • Reference data collections
  • Parameter files
  • Identify physical resources
  • Instruments
  • Reagents
  • Identify computational resources
  • Commercial software tools (e.g., Excel, Spotfire,
    ChemDraw)
  • In-house software

8
Requirements and Design
Goal: Understand and agree upon the main features
for the application and each iteration
  • Requirements will change as the project evolves
    based on user feedback
  • Initial requirements should include only features
    that are needed by users, not features that might
    be needed in the future
  • The design should be fairly coarse grained, but
    should identify all the major components
  • Components that use unfamiliar technology should
    be prototyped

9
Implementation and Testing
Goal: Implement the current iteration's features
  • This is where code is written
  • Unit tests are fine-grained tests that cover one
    or two low level features
  • As the code is written, it is versioned. This
    makes it possible to revert to older versions.
  • For in-house software, testing is generally
    performed by the user and developer
  • Short iterations and direct contact between
    developers and users facilitate bug fixes
  • For scientific software, testing must include
    validation, that is, confirming that the code
    generates correct results
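
The distinction between unit tests and validation can be sketched with a small numerical routine. This is an illustrative example only: the `trapezoid` function and the test names are hypothetical and do not come from the presentation. The unit tests check one low-level behavior each. The validation test confirms that the code reproduces a result known independently, here the analytic value of the integral of sin(x) from 0 to pi, which is exactly 2.

```python
import math
import unittest


def trapezoid(f, a, b, n):
    """Approximate the integral of f over [a, b] using n trapezoids."""
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


class UnitTests(unittest.TestCase):
    """Fine-grained tests that each cover one low-level feature."""

    def test_rejects_zero_intervals(self):
        with self.assertRaises(ValueError):
            trapezoid(math.sin, 0.0, 1.0, 0)

    def test_exact_for_linear_function(self):
        # The trapezoid rule is exact for straight lines.
        self.assertAlmostEqual(trapezoid(lambda x: 2 * x, 0.0, 1.0, 4), 1.0)


class ValidationTests(unittest.TestCase):
    """Checks the computed answer against a known scientific result."""

    def test_matches_analytic_integral_of_sine(self):
        approx = trapezoid(math.sin, 0.0, math.pi, 1000)
        self.assertAlmostEqual(approx, 2.0, places=5)
```

Saved as a module, these tests can be run with `python -m unittest`. Keeping validation tests in the same suite as the unit tests means a change that silently breaks the science fails as loudly as one that breaks the code.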

10
Deployment and Maintenance
Goal: Deliver the application to the users and
continue to support it
  • Deployment consists of two steps
  • Staging
  • Application is installed in a production
    sandbox
  • Users test application
  • Deployment
  • Application is installed on the users' machines
  • After deployment, the development process is
    repeated until the application is complete and
    enters maintenance mode
  • "Complete" is agreed upon by the developers and
    users
  • No application is ever really complete, which
    leads us to...
  • Maintenance accounts for roughly 60% of software
    costs (time and money)
  • This is good! It means the application is being
    used and improved

11
Software Tools
  • Diagram software (Visio, etc.), spreadsheets, word
    processors
  • Rapid prototyping tools (VB, Python)
  • Interpreted languages (Java, VB, Python)
  • Libraries/Components (numerical, plotting,
    instrument communication)
  • Compiled languages (C/C++/Fortran)
  • Integrated Development Environments (IDEs)
  • Debuggers
  • Bug/Feature tracking system
  • Packaging systems
  • Automated build system (nightly)
12
Roles
(bold roles are essential)
  • End User
  • Anyone who uses the software
  • Project Manager
  • Coordinates development efforts, resolves
    conflicts, ensures project is moving along
  • Note: This is the hardest job to fill
  • Lead Developer (Architect, Sr. Software Engineer)
  • Experienced member of the team, understands
    technologies and is able to advise other
    developers
  • Same responsibilities as developer
  • Developer
  • Responsible for all aspects of a portion of the
    application (requirements, design,
    implementation, testing)
  • Web developer
  • Similar to a developer, but with a skill set
    targeted at designing and implementing Web sites
    and applications
  • Database Administrator (DBA)
  • Maintains and optimizes the database and helps
    developers design database applications
  • Technical Writer
  • Develops tutorials and user manuals
  • Quality Assurance
  • On projects that are released to a wide audience, a
    separate QA team is responsible for testing
  • System administrator

13
Keys to Success
  • Process is necessary but not sufficient
  • Developer/User Interaction
  • The more levels of communication required, the
    higher the chance that requirements will be
    miscommunicated
  • Neutral management
  • The project manager's role is to keep things
    moving smoothly without getting in the way
  • Small, incremental deliverables
  • This ensures the application evolves based on
    users' needs and that requirements have a chance
    to be adjusted
  • Implement what's needed, not what might be needed
  • This keeps developers and users focused on the
    current problems
  • Put experienced developers in lead roles
  • You would never make an undergraduate a lead
    scientist
  • Mutual respect
  • The hierarchy and reward systems for software and
    science are different.
  • Scientists should treat developers as colleagues,
    not as servants
  • Developers should respect the ideals and
    institutions of science
  • Developers should be willing to understand the
    scientific field they are supporting

14
Benefits
  • Software Quality is improved
  • Applications are not single-user prototypes
  • Features are available to all researchers
  • Developers are not distracted by classes, papers,
    etc.
  • Research Process is improved
  • Researchers can focus on research
  • Development is not a bottleneck
  • Reproducibility and Traceability
  • Reproduce old experiments, trace the data/process
    that led to a result
  • Easier to integrate new/visiting researchers
  • Tools can be shared with a larger community
  • High-end software becomes possible
  • Parallel and high-performance implementations
  • Well-designed user interfaces
  • Visualization
  • Databases
  • Data mining
  • Web applications/services

15
Implementing a Software Process
  • Step 1
  • Train research staff about basic software
    processes
  • Incorporate basic tools into the research
    environment
  • Version control
  • Unit tests/validation
  • Bug/Feature Tracking
  • Standard locations for deployed applications and
    data
  • Assign development roles to research staff
  • Make sure to separate research work and
    development work
  • Step 2
  • Build a full time development staff as the
    projects grow
  • Initial staff should include a lead developer and
    a project manager
  • Use project manager to coordinate research
    projects, too
  • A full time developer also helps track
    institutional knowledge as students come and go
  • Additional staff can be added on a consulting,
    part time, or full time basis as needed
  • Step 3
  • Get back to doing what you love: science!

16
Costs and Funding
  • Good software is not cheap
  • Personnel Costs
  • Lead developer: $70-100k
  • Expect $80k to keep a good developer around
  • Developer: $40-100k, same as above (contract:
    $30-200/hr)
  • Project manager: $70-96k
  • System administrator: $50-70k
  • Database administrator: $70-110k
  • Note that TCO (total cost of ownership) is
    1.5-2.5x base salary
  • Funding
  • Share resources with collaborators, department
  • Take advantage of university support services
  • Systems, HPC, visualization, consulting
  • Classes! (e.g., Software Carpentry)
  • Write development costs and infrastructure
    directly into grants
  • Look for software infrastructure grants
  • Lobby!
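
As a worked example of the TCO note above, the sketch below applies the 1.5-2.5x multiplier to the slide's salary ranges for a lead developer and a project manager, the initial staff suggested earlier in the deck. The function name `total_cost_range` and the choice of team are illustrative assumptions. The salary figures and multipliers are the ones on the slide.

```python
def total_cost_range(salary_low, salary_high, tco_low=1.5, tco_high=2.5):
    """Return (low, high) total cost of ownership for a salary range.

    The low end pairs the lowest salary with the lowest multiplier and
    the high end pairs the highest salary with the highest multiplier.
    """
    return salary_low * tco_low, salary_high * tco_high


# Base salary ranges from the slide (70-100k and 70-96k).
staff = {
    "Lead developer": (70_000, 100_000),
    "Project manager": (70_000, 96_000),
}

costs = {role: total_cost_range(low, high) for role, (low, high) in staff.items()}
team_low = sum(low for low, _ in costs.values())
team_high = sum(high for _, high in costs.values())

for role, (low, high) in costs.items():
    print(f"{role}: {low:,.0f} to {high:,.0f} per year")
print(f"Team total: {team_low:,.0f} to {team_high:,.0f} per year")
```

Under these assumptions the two-person team costs between 210,000 and 490,000 per year, which is the scale of figure a grant budget would need to carry.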

17
Conclusions
  • Developing software is a complex process
  • Training can help understand and manage
    complexity
  • Separating research and development can help
    improve the quality of research software
  • Existing staff can do this to some extent, but
    outside help is needed as projects expand
  • The funding climate needs to change to fully
    support this
  • Software should be considered essential research
    equipment, on par with microscopes, mass
    spectrometers, and supercomputers

18
References
  • The Mythical Man-Month, Frederick P. Brooks,
    Jr.
  • Peopleware: Productive Projects and Teams, Tom
    DeMarco and Timothy Lister
  • Software Project Survival Guide, Steve
    McConnell
  • Facts and Fallacies of Software Engineering,
    Robert L. Glass

19
Questions?