Title: The mechanics of EPCC
1The mechanics of EPCC
- How EPCC learnt to do technology transfer and
software engineering the hard way
2Structure of talk
- What is EPCC today
- Learning to deliver on time
- The structure of a commercial project
- Software development and OGSA-DAI
- Project management
- Questions and discussion
3EPCC Activities
- Europe's largest, most successful supercomputing centre, 15 years old
- Vital statistics
- 65 staff
- 3.2M turnover, (almost) all from external sources
- Multidisciplinary and multi-funded
- with a large spectrum of activities
- and a critical mass of expertise
- Strong engagement with industry
- from local SMEs to large multinationals
- project based consultancy services
- Supports research at University of Edinburgh via
- access to facilities
- training and support
- TRACS visitor programme
- NeSC
- founding partner of National e-Science Centre
- Wide variety of leading-edge systems
- 1,600 processor HPCx system
- 2,000 processor IBM Blue Gene/L
4Commercial activities today
- Bespoke software development and software project management for business
- network, cluster and high-performance computing
- novel application areas
- from mushrooms to internet packets
- Start-to-finish projects
- full software development lifecycle, 3 - 12 months
- most commercial projects are < 6 months
- Operate like a business
- Commercial Group brings in business, Software Development Group delivers
- charge at commercial rates (1,000 per day)
- very delivery focused
- all commercial contracts are fixed cost
- funded by cash contracts, public funds and the European Commission (EU)
- many of the smaller projects are supported by SE
5Clients
UK: Almond Engineering Ltd, Altamira Ltd, Arran Aromatics Ltd, Callanders Sawmills Ltd, Calman Ltd, CB Technology Ltd, Centre for Customer Awareness Ltd, CERN, Cheltenham & Gloucester plc, DTI, Digital Bridges Ltd, Elektrobit Ltd, First Group plc, Golden Crumb Ltd, High Speed Productions Ltd, Integriti Solutions Ltd, IP Technology Ltd, Ironside Farrar Ltd, Jardine Technology Ltd, Pepper's Ghost Productions Ltd, Radar World Ltd, Red Lemon Ltd, Rosti (Scotland) Ltd, Quadstone Ltd, SCI Ltd, Scottish Enterprise, The Crown Office, TSB Bank Scotland Ltd, UK Meteorological Office, Upstream Systems Ltd, Alpha Data Parallel Systems Ltd, Nallatech Ltd
2000 - 2004
USA: Cisco Systems Inc, Sun Microsystems Inc, IBM Corporation, Oracle Corporation, Hewlett Packard, Microsoft, Xilinx Corporation
Japan: Hitachi, NEC Europe, Fujitsu Labs Europe
Europe: European Commission, many EU project partners
6Business Strategy
- to solve business problems NOT sell technology
- individual solutions for clients
[Figure: "Technology Push Down" from "Academic Research", with projects arranged by project size (X,000 / X00,000 / X,000,000). Projects shown: OGSA-DAI, PGPGrid, SunDCG, First Group, IPO, Microsoft, CCA, CG, Autoscreen]
7How do we work?
Heroic Programming!
All-nighters!
Work weekends!
No.
8How do we work?
- Take pride in a professional approach
- Work in small project teams
- Project leader, 1-6 developers, technical reviewers
- Use documented engineering management processes
- Project management based on PRINCE2
- Engineering using agile methods
- Built from experience and industry best practice
- Iterative/staged development techniques
- Requirements triage
- Test-driven development
- Tuned to the leading edge of innovative software
development
9Who does the work?
- Currently around 4 business development staff
- 2 focus on business development
- 2 focus on marketing and publicity
- Currently around 20 engineering staff
- Three full-time project managers, two software architects
- c. 15 consultants and principal consultants
- Staff backgrounds: maths, physics, computer and life sciences
- Over 100 staff-years of experience, over 1/3 from industry
- Typical skills
- Java, C/C++, Visual Basic/C#, Perl, Fortran
- Distributed computing, web services, XML, J2EE, MPI, OpenMP
- Databases, SQL, JDBC, XML-DB
- Software engineering, OO design, UML
10EPCC's early history
- Established in 1990
- focus for interest in parallel computing within Physics and CS
- Early years largely supported by UK Government Parallel Applications Programme
- made lots of money working with large UK corporations to optimise/parallelise their codes
- How did our funding model come about?
- from a belief in the self-funding of University research
- we've shown it can be done but it's very difficult
- it did mean we had to work with industry from the beginning
11EPCC history (continued)
- 1990-1994
- funded by UK Government Parallel Applications Programme
- grew to 65 staff
- many parallelisation projects with UK industry: aerospace, nuclear, oil & gas etc.
- spun out company Quadstone
- 1995-1996
- as Government money dried up so did projects
- had to move from long term projects (18 months) to much shorter projects (3-6 months)
- major problem: project / cost overruns
- nearly had to make many staff redundant
12EPCC history (continued)
- 1997-2000
- successfully moved markets from large-scale industry to SMEs
- opportunities focussed around successful EU TTN project
- projects 3-6 months in duration
- embarked on having a repeatable process
- 2000-now
- over the past few years moved into Grid computing
- continued to work with industry
- wide variety of projects
- OGSA-DAI: data access and integration for the Grid
- Intersim: packet level modelling of differentiated services
- Golden Crumb: automatic mushroom selection in factory
- Cheltenham & Gloucester: data mining for mortgage industry
13How does EPCC work today?
- We have well developed project processes
- Two linked processes
- software development process
- project management process
- Will illustrate software development process using OGSA-DAI as example
- Recently moved to PRINCE2 project management methodology
14The project lifecycle
- Commercial Group identifies clients and initiates discussions
- Following initial discussions CD and technical staff visit company to discuss requirements
- High level design written, timings / costs agreed
- may involve free code survey at this point
- Contract negotiated: fixed price, includes detailed workplan based on design
- Project handed to IS: staff scheduled according to skills
- All projects have
- Project Leader, Applications Consultant, Technical Reviewer
- Regular meetings between IS and CG
- CG act as account manager to company / funder
15OGSA-DAI
- Data Access and Integration for database resources on the Grid
- Aim to deliver application mechanisms that
- Meet the data requirements of Grid applications
- Functionality, performance and reliability
- Reduce development cost of data centric Grid applications
- Provide consistent interfaces to data resources
- Acceptable and supportable by database providers
- Trustable, imposed demand is acceptable, etc.
- Provide a standard framework that satisfies standard requirements
- A base for developing higher-level services
- Data federation / Distributed query processing
- Data mining
- Data visualisation
16OGSA-DAI team
NeSC, Edinburgh
EPCC Team, Edinburgh
NEReSC, Newcastle
IBM Dissemination Team
IBM Development Team, Hursley
17Software Process and Teams
[Diagram: software process and teams, with labels grouped here in the order they appear. REVIEW: Programme Board, Technical Review Board, Technical Reviewer, Users Group, Peer Review and Inspection. Continual process: Reqs., Design, Implement, QA, Ingest. DEVELOPERS: Nightly unit and system tests, Deep track features, Release, Dissem., Testing, Prototype, Additional test cases, System tests based on reqs, Test Cases, Fix Bugs, Support, Training. USERS: Use Cases, Prioritisation, Contribs, Requests]
18Working together
- No more heroes any more
- the lone researcher can get into trouble
- so don't do it!
- use teams even for small projects
- a task leader to keep the bigger picture in mind
- a reviewer as a technical foil for the developer
- distributed extreme programming doesn't work
- be sensible!
- Code needs owners
- and joint ownership doesn't work
- Java packages and CVS modules provide useful boundaries
- buddy system worked well for a team of 10-12, not as well for 5
- we now have 80,000 lines of Java code and 30,000 lines of documentation
19An agile approach to development
- Agility is all
- Grid/HPC environments and problems are complex systems
- complex systems mean big, complex projects
- big, complex projects carry a high risk of failure
- adopting incremental approaches to requirements, design, and implementation helps minimise risk
- delivering small increments regularly is good
- good for quality, for visibility, for morale
- Keep your eyes on the road
- keep an active eye on project risks
- think about what happens if this goes wrong
- just thinking about it reduces the likelihood it'll happen!
20Releasing software
- No release schedule, no releases
- don't timebox research, but do timebox development
- HPC is fun and exciting - beware feature creep!
- "how's the project?"
- "oh, we're 95% there" (and always will be)
- frequent release milestones focus developers
- but don't overspecify what will be released
- OGSA-DAI had the opposite problem
- three months too short
- six months about right
- major/minor/patch/special brew
- set your testing timetable in stone
21Know your requirements
- Requirements, requirements, requirements
- write 'em down! Give 'em numbers!
- remember, requirements aren't just functional!
- whatever they are, they are always testable
- tests on HPC systems may be tricky, but that makes it fun!
- MoSCoW notation is good
- Must, Should, Could, Won't
- how important are Priority 3 requirements again..?
- OGSA-DAI had lots of requirements
- but make sure you can understand their worth
- real users are often better than good ideas
- a user group helps to focus development as
software matures
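The advice on this slide (write requirements down, number them, make them testable, rank them with MoSCoW) can be sketched as a small data structure. This is an illustration only, not how OGSA-DAI recorded its requirements: the `Requirement` fields, the `R-001` numbering scheme, the `release_scope` helper and the four sample requirements are all invented for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """MoSCoW levels; a lower value means more important."""
    MUST = 1
    SHOULD = 2
    COULD = 3
    WONT = 4


@dataclass(frozen=True)
class Requirement:
    number: str         # hypothetical numbering scheme, e.g. "R-001"
    text: str
    priority: Priority
    functional: bool    # requirements aren't just functional
    test: str           # every requirement names how it will be tested


def release_scope(requirements, cutoff=Priority.SHOULD):
    """Return the requirements at or above the cutoff, most important first."""
    chosen = [r for r in requirements if r.priority <= cutoff]
    return sorted(chosen, key=lambda r: (r.priority, r.number))


# Invented sample requirements, purely for illustration.
REQUIREMENTS = [
    Requirement("R-003", "Offer a browser-based admin page", Priority.COULD, True, "manual walkthrough"),
    Requirement("R-001", "Run a query against a relational database", Priority.MUST, True, "system test"),
    Requirement("R-002", "Install in under ten minutes", Priority.SHOULD, False, "timed install"),
    Requirement("R-004", "Support every database vendor", Priority.WONT, True, "not tested this release"),
]

if __name__ == "__main__":
    for req in release_scope(REQUIREMENTS):
        print(req.number, req.priority.name, req.text)
```

Keeping the priority as an ordered value makes the "how important are Priority 3 requirements again?" question a one-line filter: lowering or raising `cutoff` changes what is in scope for a release.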
22Return, recycle, reuse
- Throwaway prototypes never are
- "once I've proved this, I'll junk the code"
- no, you won't (or your grad student won't)
- apply some basic process even to trivial codes
- even reuse of good code is sometimes wrong
- OGSA-DAI started with high ideals
- beware the big ball of mud
- patterns in architecture
- Shantytown
- enables quick exploration of feature territory
- must be built on a strong central foundation
- must include council legislation aka testing
23OGSA-DAI Dashboard
24Can I see your documents please?
- Document! Document! Document!
- Imagine trying to program without a language reference
- structure and stability is good
- Get people who like writing documents to do them
- but get everyone to doc their code
- a single editor can provide guidance
- Good code documentation can be used by the tooling
- Good human documentation will win your users' support
- Make sure you don't underestimate the cost
- code maintenance and documentation takes longer than code development
- make it part of the process
25People power
- Social engineering is the key
- Push decisions down to the developers
- Too many chiefs
- make sure you know what are the key battles to win
- Have a process for change
- or one person will become very unpopular
- developers and managers both think they know what's best
- Understand your teams
- different people like working in different ways
- no one style for management in OGSA-DAI
- Competition is good
- go one better
26The big picture
- Balance the hype
- software engineering is about vision vs effort vs requests
- expectation management is important
- researchers, developers, users and funders are all different
- and all want different things
- the larger the project, the harder it falls
- Listen to your users
- usability is good
- it has to install easily
- don't change your interface
- client tooling helps
- support helpdesk is better
- user groups are interesting
27Software development summary
- Agile methods are very sympathetic
- the Agile founders disliked Rigid Inflexible Processes too!
- Adopt a simple process and toolset
- even lightweight process really pays off
- scoping, requirements and risk analysis up front
- incremental approach to design, develop, test
- learn some basic tools (they're even free!)
- distributed teams are hard to manage strictly
- distributed management is even harder
- Listen to your customers: they always know best
28Project Management
- All technical staff have a line manager and at least one project leader
- Procedures are well documented and have grown up over time
- Recently we have moved to PRINCE2 project management methodology for commercial projects
- seems to work well but is a bit of a culture shock
- We employ staff specifically for project management
- All staff time is logged, planned and actual
- A working day has two blocks of 3 hours
- Staff can bid for time to do research / proposal writing
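As a rough sketch of what logging planned against actual time in 3-hour blocks might look like: the block size and the two blocks per day come from the slide, while the record layout, the `summarise` helper and the project names are assumptions made up for the example, not EPCC's actual time-recording system.

```python
from collections import defaultdict

BLOCK_HOURS = 3      # from the slide: each block is 3 hours
BLOCKS_PER_DAY = 2   # from the slide: two blocks per working day


def summarise(entries):
    """Sum planned and actual blocks per project.

    Each entry is (project, planned_blocks, actual_blocks); this record
    layout is an assumption made for the example.
    """
    totals = defaultdict(lambda: [0, 0])
    for project, planned, actual in entries:
        totals[project][0] += planned
        totals[project][1] += actual
    return {
        project: {
            "planned_hours": planned * BLOCK_HOURS,
            "actual_hours": actual * BLOCK_HOURS,
            # positive means the project used more days than planned
            "overrun_days": (actual - planned) / BLOCKS_PER_DAY,
        }
        for project, (planned, actual) in totals.items()
    }


# Hypothetical log entries; the project names are invented.
LOG = [
    ("project-a", 4, 5),
    ("project-a", 2, 2),
    ("research-bid", 2, 1),
]

report = summarise(LOG)

if __name__ == "__main__":
    for project, figures in report.items():
        print(project, figures)
```

Recording both figures in the same unit is what lets an overrun show up early, which matters when every commercial contract is fixed cost.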
29What is PRINCE2?
- PRojects IN Controlled Environments version 2
- A project management standard produced by the UK's Office of Government Commerce (part of DTI)
- PRINCE2 is a process-based approach for project management providing an easily tailored, and scalable method for the management of all types of projects
- PRINCE2 is a de facto UK PM standard
- becoming mandatory in the public sector (Gov, NHS, Police)
- becoming PM method of choice in business
- Unilever, GlaxoWellcome, Tesco, BT, Sun, TSB, NatWest, Norwich Union, Centrica, Cable & Wireless
- becoming widespread in Europe too
- PRINCE2 is internationally recognised and respected
30What is PRINCE2 not?
- PRINCE2 is not a software engineering method
- but it grew out of an IT environment
- and it fits well with traditional or agile development methods alike
- PRINCE2 will not help you code better
- but it will help you deliver better quality products, on time
- and will stop you falling out with your boss/staff
- PRINCE2 will not tell you how to write software
- but it will leave you alone to write software your way
- PRINCE2 is not a silver bullet
- but it's general, flexible and tailorable and most importantly it's based on common sense
31PRINCE2 in a nutshell
- Projects have a clear Business Case or they don't happen
- "remind me again why we're doing this project?"
- Projects have a beginning, a middle and an end
- clearly defined: they start and they stop, they don't weeble on forever
- Projects run in stages with clearly defined boundaries
- get a clear picture of how we're all doing
- Product-based planning focuses on deliverables not tasks
- think "what do we have to make?"
- Layered management: corporate, board, project, team
- each level has a clearly defined interface with the others
- Management by exception
- if there are no problems, just carry on; management don't meddle
- Change is fundamental: change management is intrinsic
- assume things will change and plan accordingly
32The PRINCE2 process diagram
Corporate or Programme Management
Directing a Project
Project Mandate
Starting up a Project
Initiating a Project
Controlling a Stage
Managing Stage Boundaries
Closing a Project
Managing Product Delivery
Planning
33PRINCE2 Components
- As well as the processes there are several complementary components
- The Business Case
- a key driver: the "Why?" for the project
- either a genuine (commercial) business case or at least a set of compelling reasons
- owned by the Executive
- monitored throughout the project
- if the BC goes away, the project should be stopped
- The Project Organisation
- describes the four management layers
- corporate, board, project, team
- everyone should have a job description
- make roles and responsibilities clear
34PRINCE2 Components (2)
- Plans
- product-based, as discussed above
- write product descriptions for key products
- Controls
- divide the project into Management Stages
- a Stage is as far ahead as you can plan in reasonable detail
- typically a few months
- define reports, meetings etc.
- Tolerances
- allowed variations in time, budget, scope before escalation triggered
- "you have six months, +/- 1 month"
- "you must satisfy these requirements; those are optional this stage"
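Tolerances are what make management by exception workable: a stage carries on untouched while its forecasts stay inside the agreed band, and escalates only when one falls outside it. A minimal sketch of that check follows. The six-months-plus-or-minus-one figure is the slide's own example; the `Tolerance` class, the `needs_escalation` helper and the budget figures are hypothetical and not part of PRINCE2 itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    target: float
    allowed_deviation: float

    def within(self, forecast: float) -> bool:
        """True if the forecast is inside the agreed band around the target."""
        return abs(forecast - self.target) <= self.allowed_deviation


def needs_escalation(tolerances, forecasts):
    """Return the names of measures whose forecast falls outside tolerance."""
    return [
        name
        for name, tolerance in tolerances.items()
        if not tolerance.within(forecasts[name])
    ]


STAGE_TOLERANCES = {
    # from the slide: "you have six months, +/- 1 month"
    "duration_months": Tolerance(target=6, allowed_deviation=1),
    # hypothetical budget tolerance, in staff-days
    "budget_days": Tolerance(target=120, allowed_deviation=10),
}

if __name__ == "__main__":
    forecasts = {"duration_months": 6.5, "budget_days": 135}
    breaches = needs_escalation(STAGE_TOLERANCES, forecasts)
    print("escalate:" if breaches else "carry on", breaches)
```

An empty result means the project manager just carries on; a non-empty one is the trigger to raise the issue with the next management layer.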
35PRINCE2 Components (3)
- Quality
- the project must define methods for QC and test
- quality checks should be built in to the MP process
- Risk
- think about it, monitor it
- one of the best management tools is to ask "what might go wrong?"
- and create plans to handle it if it does
- Configuration Management
- keep track of product versions and histories
- software version control tools are a good way of
implementing this
36PRINCE2 Summary
- PRINCE2 is a powerful, flexible, scalable PM approach
- It's based on industry best practice
- rooted in software development projects
- Provides good, intelligent layers of management control
- Formalises, in a positive way, customer relations
- Can fit easily with agile software development
- It's the only PM approach with internationally recognised qualifications
37Final comments on working with industry
- Wear a tie!
- Remember that the person you're meeting is just as nervous of meeting a mad academic as you are of meeting a rapacious capitalist
- The managing director of the company may be drunk
- Always apply Denis Healey's law of holes: "When in one, stop digging"
- If you're going to deliver late, tell the customer straightaway
- Listen, listen, listen!!! It's the only way to get business
38Questions / discussion