Condor Build

Provided by: Miron1

1
Condor Build & Test: NMI, OMII, ETICS
2
How the Condor Team Got Started in the Build/Test
Business: Prehistory
  • Oracle shamed^H^H^H^H^H^Hinspired us.
  • The Condor team was in the stone age, producing
    modern software to help people reliably automate
    their computing tasks -- with our bare hands.
  • Every Condor release took weeks/months to do.
  • Build by hand on each platform, discover lots of
    bugs introduced since the last release, track
    them down, re-build, etc.

3
What Did Oracle Do?
  • Oracle selected Condor as the resource manager
    underneath their Automated Integration Management
    Environment (AIME)
  • Relied on to perform automated build and
    regression testing of multiple components for
    Oracle's flagship Database Server product.
  • Oracle chose Condor because they liked the
    maturity of Condor's core components.

4
Doh!
  • Oracle used distributed computing to automate
    their build/test cycle, with huge success.
  • If Oracle can do it, why can't we?
  • Use Condor to build Condor!
  • NSF Middleware Initiative (NMI):
  • right initiative at the right time!
  • opportunity to collaborate with others to do for
    production software developers like Condor what
    Oracle was doing for themselves
  • important service to the scientific computing
    community

5
NMI Statement
  • Purpose: to develop, deploy and sustain a set of
    reusable and expandable middleware functions that
    benefit many science and engineering applications
    in a networked environment
  • Program encourages open source software
    development and development of middleware
    standards

6
Why should you care?
  • From our experience, the functionality,
    robustness and maintainability of a
    production-quality software component depends on
    the effort involved in building, deploying and
    testing the component.
  • If it is true for a component, it is definitely
    true for a software stack
  • Doing it right is much harder than it appears
    from the outside
  • Most of us had very little experience in this area

7
Goals of the NMI Build & Test System
  • Design, develop and deploy a complete build
    system (HW and SW) capable of performing daily
    builds and tests of a suite of disparate software
    packages on a heterogeneous (HW, OS, libraries,
    ...) collection of platforms
  • And make it:
  • Dependable
  • Traceable
  • Manageable
  • Portable
  • Extensible
  • Schedulable

8
The Build Challenge
  • Automation: build the component at the push of
    a button!
  • always more to it than just configure & make
  • e.g., ssh to right host, cvs checkout, untar,
    setenv, etc.
  • Reproducibility: build the version we released
    2 years ago!
  • Well-managed, comprehensive source repository
  • Know your externals and keep them around
  • Portability: build the component on
    nodeX.cluster.com!
  • No dependencies on local capabilities
  • Understand your hardware & software requirements
  • Manageability: run the build daily on 15
    platforms and email me the outcome!
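
The automation point above can be illustrated with a minimal Python sketch of a push-button build driver: it runs an ordered list of steps, stops at the first failure, and produces a one-line outcome that could go into a status email. The step names and the stand-in commands are assumptions made for illustration; this is not how the NMI software itself is implemented.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class StepResult:
    name: str
    returncode: int
    output: str


def run_build(steps):
    """Run (name, argv) steps in order; stop at the first failure."""
    results = []
    for name, argv in steps:
        proc = subprocess.run(argv, capture_output=True, text=True)
        results.append(StepResult(name, proc.returncode, proc.stdout + proc.stderr))
        if proc.returncode != 0:
            break  # later steps depend on earlier ones
    return results


def summarize(results):
    """One-line outcome, e.g. for the subject of a status email."""
    failed = [r.name for r in results if r.returncode != 0]
    if failed:
        return "FAILED at " + failed[0]
    return "OK (%d steps)" % len(results)


# Hypothetical stand-in commands so the sketch runs anywhere; a real build
# would list its own checkout, configure, and make commands here.
PY = sys.executable
demo_steps = [
    ("checkout", [PY, "-c", "print('fetched sources')"]),
    ("configure", [PY, "-c", "print('configured')"]),
    ("make", [PY, "-c", "import sys; sys.exit(2)"]),
    ("package", [PY, "-c", "print('never reached')"]),
]

if __name__ == "__main__":
    print(summarize(run_build(demo_steps)))
```

Keeping every step in one explicit list is what makes the build repeatable: the same list can be replayed on another host or against an older source tag.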

9
The Testing Challenge
  • All the same challenges as builds (automation,
    reproducibility, portability, manageability),
    plus:
  • Flexibility
  • test our RHEL4 binaries on RHEL5!
  • run our new tests on our old binaries
  • important to decouple build & test functions
  • making tests just a part of a build -- instead of
    an independent step -- makes it
    difficult/impossible to:
  • run new tests against old builds
  • test one platform's binaries on another platform
  • run different tests at different frequencies
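
One way to picture the decoupling described above is to treat builds and test suites as separate records, so that any stored build can be paired with any suite on any platform. The sketch below is a hypothetical data model, assuming names such as `Build`, `Suite`, and `plan_test_runs`; none of them come from the NMI software.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Build:
    build_id: str
    platform: str   # platform the binaries were built on


@dataclass(frozen=True)
class Suite:
    name: str
    frequency: str  # e.g. "nightly" or "weekly"


def plan_test_runs(builds, suites, test_platforms, frequency):
    """Pair every stored build with every suite due at this frequency,
    on every requested test platform."""
    due = [s for s in suites if s.frequency == frequency]
    return [(b.build_id, s.name, p)
            for b, s, p in product(builds, due, test_platforms)]


builds = [Build("old-release", "RHEL4"), Build("nightly", "RHEL4")]
suites = [Suite("core", "nightly"), Suite("stress", "weekly")]
runs = plan_test_runs(builds, suites, ["RHEL4", "RHEL5"], "nightly")
```

Because a test run only refers to a build by its identifier, new suites can be run against old builds, and RHEL4 binaries can be scheduled on RHEL5, without rebuilding anything.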

10
Eating Our Own Dogfood
  • What Did We Do?
  • We built the NMI Build & Test Lab on top of
    Condor, DAGMan, and other distributed computing
    technologies to automate the build, deploy, and
    test cycle.
  • To support it, we've had to construct and manage
    a dedicated, heterogeneous distributed computing
    facility.
  • Opposite extreme from typical cluster --
    instead of 1000s of identical CPUs, we have a
    handful of CPUs each for 40 platforms.
  • Much harder to manage! You try finding a
    sysadmin tool that works on 40 platforms!
  • We're just another big Condor user
  • If Condor sucks, we feel the pain.
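
As a rough illustration of how DAGMan can order a build-then-test cycle across platforms, the Python sketch below writes a DAGMan input file using the standard `JOB` and `PARENT ... CHILD` statements. The job names, the `.sub` submit-file names, and the single shared fetch step are assumptions made for this example; the files the NMI software actually generates may look quite different.

```python
def make_dag(platforms):
    """Return DAGMan input text: one fetch job, then build -> test per platform."""
    lines = ["JOB fetch fetch.sub"]
    for p in platforms:
        # Each JOB line names a node and its (hypothetical) submit file.
        lines.append(f"JOB build_{p} build_{p}.sub")
        lines.append(f"JOB test_{p} test_{p}.sub")
    for p in platforms:
        # Builds wait for the fetch; each test waits for its own build.
        lines.append(f"PARENT fetch CHILD build_{p}")
        lines.append(f"PARENT build_{p} CHILD test_{p}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_dag(["rhel4", "rhel5"]), end="")
```

Platforms stay independent of one another in this layout, so a failed build on one platform blocks only that platform's tests.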

11
NMI Build & Test Facility
[Figure: facility architecture diagram. Labels: INPUT (Customer Source Code, Customer Build/Test Scripts, Spec File), NMI Build & Test Software, DAG, Condor Queue, build/test jobs, Distributed Build/Test Pool, results, MySQL Results DB, Web Portal, OUTPUT (Finished Binaries)]
12
Numbers
  • 100 CPUs
  • 39 HW/OS Platforms
  • 34 OS
  • 9 HW Arch
  • 3 Sites
  • 100 GB of results per day
  • 1400 Builds/tests per month
  • 350 Condor jobs per day

13
Condor Build & Test
  • Automated Condor Builds
  • Two (sometimes three) separate Condor versions,
    each automatically built using NMI on 13-17
    platforms nightly
  • Stable, developer, special release branches
  • Automated Condor Tests
  • Each nightly build's output becomes the input to
    a new NMI run of our full Condor test suite
  • Ad-Hoc Builds & Tests
  • Each Condor developer can use NMI to submit
    ad-hoc builds & tests of their experimental
    workspaces or CVS branches to any or all platforms

14
(No Transcript)
15
More Condor Testing Work
  • Advanced Test Suite
  • Using binaries from each build, we deploy an
    entire self-contained Condor pool on each test
    machine
  • Runs a battery of Condor jobs and tests to verify
    critical features
  • Currently >150 distinct tests
  • each executed for each build, on each platform,
    for each release, every night
  • Flightworthy Initiative
  • Ensuring continued core Condor scalability,
    robustness
  • NSF funded, like NMI
  • Producing new tests all the time
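
To get a feel for the scale of "each test, each platform, each release, every night", the arithmetic can be sketched as below. It assumes the tests run on the same 13-17 platforms and 2-3 release branches quoted for the nightly builds, that every test runs everywhere, and it uses 150 as a floor for the test count, so the results are rough lower-bound estimates rather than measured figures.

```python
def nightly_test_runs(tests, platforms, releases):
    """Individual test executions per night if every test runs on
    every platform for every release."""
    return tests * platforms * releases


low = nightly_test_runs(150, 13, 2)   # fewest platforms and releases
high = nightly_test_runs(150, 17, 3)  # most platforms and releases

if __name__ == "__main__":
    print(low, high)  # 3900 7650
```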

16
NMI Build & Test Customers
  • NMI Build & Test Facility was built to serve all
    NMI projects
  • Who else is building and testing?
  • Globus
  • NMI Middleware Distribution
  • many grid tools, including Condor & Globus
  • Virtual Data Toolkit (VDT) for the Open Science
    Grid (OSG)
  • 40 components
  • Soon: TeraGrid, NEESgrid, others

17
Build & Test Beyond NMI
  • We want to integrate with other, related software
    quality projects, and share build/test
    resources...
  • an international (US/Europe/China) federation of
    build/test grids
  • Offer our tools as the foundation for other B&T
    systems
  • Leverage others' work to improve our own B&T
    service

18
OMII-UK
  • Integrating software from multiple sources
  • Established open-source projects
  • Commissioned services & infrastructure
  • Deployment across multiple platforms
  • Verify interoperability between platforms &
    versions
  • Automatic Software Testing: vital for the Grid
  • Build Testing: cross-platform builds
  • Unit Testing: local verification of APIs
  • Deployment Testing: deploy & run package
  • Distributed Testing: cross-domain operation
  • Regression Testing: compatibility between
    versions
  • Stress Testing: correct operation under real
    loads
  • Distributed Testbed
  • Need breadth & variety of resources, not power
  • Needs to be a managed resource & process

19
NMI/OMII-UK Collaboration
  • Phase I: OMII-UK developed automated builds &
    tests using the NMI Build & Test Lab at
    UW-Madison
  • Phase II: OMII-UK deployed their own instance of
    the NMI Build & Test Lab at Southampton
    University
  • Our lab at UW-Madison is well and good, but some
    collaborators want/need their own local
    facilities.
  • Phase III (in progress): Move jobs freely between
    UW and OMII-UK B&T labs as needed.

20
Next: ETICS
[Figure: project partners and their roles; partner names not transcribed]
  • Build system, software configuration, service
    infrastructure, dissemination, EGEE, gLite,
    project coord.
  • Software configuration, service infrastructure,
    dissemination
  • Web portals and tools, quality process,
    dissemination, DILIGENT
  • NMI Build & Test Framework, Condor, distributed
    testing tools, service infrastructure
  • Test methods and metrics, unit testing tools, EBIT

21
ETICS Project Goals
  • ETICS will provide a multi-platform environment
    for building and testing middleware and
    applications for major European e-Science
    projects
  • Strong point is automation: of builds, of tests,
    of reporting, etc. The goal is to simplify
    complex software management tasks
  • One button to generate finished package (e.g.,
    RPMs) for any chosen component
  • ETICS is developing a higher-level web service
    and DB to generate B&T jobs -- and use multiple,
    distributed NMI B&T Labs to execute & manage them
  • This work complements the existing NMI Build &
    Test system and is something we want to integrate
    & use to benefit other NMI users!

22
ETICS Web Interface
23
OMII-Japan
  • What They're Doing
  • Providing a service that offers on-demand
    autobuild and test systems for Grid middleware
    on an on-demand virtual cluster. Developers can
    build and test their software immediately by
    using these autobuild and test systems
  • Underlying B&T Infrastructure is NMI Build & Test
    Software

24
This Was a Lot of Work, But It Got Easier Each
Time
  • Deployments of the NMI B&T Software with
    international collaborators taught us how to
    export Build & Test as a service.
  • Tolya Karp: International B&T Hero
  • Improved (i.e., wrote) NMI install scripts
  • Improved configuration process
  • Debugged and solved a myriad of details that
    didn't work in new environments

25
What This Means For You
  • NMI B&T Lab deployment experience led to improved
    packaging and improved portability
  • We now have the unique ability to give you not only
    source code, but a whole production build & test
    infrastructure to go along with it
  • and we have done it for a number of users
    already

26
New Condor & NMI Users
  • Yahoo
  • First industrial user to deploy NMI B&T Framework
    to build/test custom Condor contributions
  • Hartford Financial
  • Deploying it as we speak

27
What's to Come
  • More US & international collaborations
  • OMII-Europe
  • More Industrial User/Developers
  • New Features
  • Becky Gietzel: parallel testing!
  • Major new feature: multiple co-scheduled
    resources for individual tests
  • Going beyond multi-platform testing to
    cross-platform parallel testing
  • UW-Madison B&T Lab: ever more platforms
  • it's time to make the doughnuts
  • Questions?