VDT and Interoperable Testbeds

1
VDT and Interoperable Testbeds
  • Rob Gardner
  • University of Chicago

2
Outline
  • VDT Status and Plans
  • VDT Middleware
  • VDT Team
  • VDT release description
  • VDT and the LCG project
  • Interoperable Grids Status and Plans
  • GLUE working group
  • Grid operations
  • Distributed facilities and resource monitoring
  • ATLAS-kit deployment
  • WorldGrid iVDGL-DataTAG grid interoperability
    project, ATLAS SC2002

3
VDT Middleware
  • Joint GriPhyN and iVDGL deliverable
  • Basic middleware for US LHC program
  • VDT 1.1.5 in use by US CMS testbed
  • US ATLAS testbed is installing VDT with WorldGrid
    components for EU interoperability
  • Release structure:
  • introduces new middleware components from GriPhyN,
    iVDGL, EDG, and other grid developers and working
    groups
  • provides a framework for interoperability software
    and schema development (e.g., GLUE)

4
Team
  • VDT Group
  • GriPhyN (development) and iVDGL (packaging,
    configuration, testing, deployment)
  • Led by Miron Livny of the University of
    Wisconsin-Madison
  • Alain Roy (CS staff, Condor team)
  • Scott Koranda (LIGO, University of Wisconsin,
    Milwaukee)
  • Saul Youssef (ATLAS, Boston University)
  • Scott Gose (iVDGL/Globus, Argonne Lab)
  • New iVDGL hire (CS) from Madison starting
    December 1
  • Plus a community of participants
  • Dantong Yu, Jason Smith (Brookhaven): mkgridmap,
    post-install configuration
  • Patrick McGuigan (UT Arlington): valuable
    testing/install feedback, PIPPY
  • Alan Tackett, Bobby Brown (Vanderbilt):
    installation feedback, documentation

5
VDT
  • Basic Globus and Condor, plus EDG software (e.g.,
    GDMP)
  • Plus lots of extras
  • Pacman
  • VO management
  • Test harness
  • Glue schema
  • Virtual Data Libraries
  • Virtual Data Catalog
  • Language and interpreter
  • Server and Client

6
VDT Releases
  • Current version: VDT 1.1.5, released October 29,
    2002
  • Major recent upgrades (since 1.1.2)
  • Patches to Globus 2.0, including the OpenSSL
    0.9.6g security update.
  • A new and improved Globus job manager created by
    the Condor team that is more scalable and robust
    than the one in Globus 2.0. This job manager has
    been integrated into the Globus 2.2 release.
  • Condor 6.4.3 and Condor-G 6.4.3.
  • New software packages, including
  • FTSH (The fault tolerant shell) version 0.9.9
  • EDG mkgridmap (including perl modules that it
    depends on)
  • EDG CRL update
  • DOE and EDG CA signing policy files, so you can
    interact with Globus installations and users that
    use CAs from the DOE and EDG.
  • A program to tell you what version of the VDT has
    been installed, vdt-version.
  • Test programs so that you can verify that your
    installation works.
  • The VDT can be installed anywhere--it no longer
    needs to be installed into the /vdt directory.
  • VDT configuration
  • Sets up Globus and GDMP daemons to run
    automatically
  • Configures Condor to work as a personal Condor
    installation or a central manager
  • Configures Condor-G to work
  • Enables the Condor job manager for Globus, and a
    few other basic Globus configuration steps.
  • Does some of the configuration for GDMP.

7
Deployment with Pacman
  • Packaging and post-install configuration: Pacman
  • A key piece, required not only for middleware but
    also for applications and higher-level toolkits
  • Tools to easily manage installation and
    environment
  • fetch, install, configure, add to login
    environment, update
  • Sits over top of many software packaging
    approaches (rpm, tar.gz, etc.)
  • Uses a dependency hierarchy, so one command can
    drive the installation of a complete environment
    of many packages (see the sketch below)
  • Packages organized into caches hosted at various
    sites
  • Distribute responsibility for support
  • Has greatly helped in testing and installation of
    the VDT's many new features
  • Made it possible to quickly set up grids and
    application packages
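
To illustrate the dependency-hierarchy idea behind a one-command Pacman
installation, a minimal Python sketch (an illustration only, not Pacman's
own code; the package names and the install step are made up):

    # Illustration of the dependency hierarchy that lets one command
    # drive a complete installation.  Names are hypothetical.
    deps = {
        "VDT-Server": ["Globus-Server", "Condor", "GDMP"],
        "Globus-Server": [],
        "Condor": [],
        "GDMP": ["Globus-Server"],
    }

    def install(pkg, done=None):
        """Install pkg after all of its dependencies (depth-first)."""
        done = set() if done is None else done
        if pkg in done:
            return
        for dep in deps[pkg]:
            install(dep, done)
        print("fetch + install + configure + add to login environment:", pkg)
        done.add(pkg)

    install("VDT-Server")   # one command pulls in the whole environment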

8
VDT and LCG Project
  • VDT and EDG being evaluated for LCG-1 testbed
  • EDG:
  • A collected, tagged, and supported set of
    middleware and application software packages and
    procedures from the European DataGrid project,
    available as RPMs with a master location.
  • Includes application software for deployment and
    testing on EDG sites
  • Most deployments expect most/all packages to be
    installed with a small set of uniform
    configurations
  • The base layer of software and protocols is common
  • Globus: X509 certificates, GSI authentication,
    GridFTP, MDS LDAP monitoring and discovery
    framework, GRAM job submission
  • Authorization extensions: LDAP VO service
  • Condor: Condor-G job scheduling, matchmaking
    (ClassAds), Directed Acyclic Graph job/task
    dependency manager (DAGMan); see the Condor-G
    sketch below
  • File movement and storage management: GDMP,
    GridFTP
  • Possible solution: VDT plus EDG WP1, WP2
  • If adopted, PIs of GriPhyN and iVDGL, and US CMS
    and US ATLAS computing managers will need to
    define a response for next steps for support
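
As an illustration of this common base layer, a minimal sketch of a
Condor-G submission driven from Python. The gatekeeper contact is a
placeholder, a valid grid proxy is assumed, and the keywords follow the
classic "globus universe" submit syntax of that era:

    import subprocess
    import textwrap

    # Minimal sketch of a Condor-G submission.  The gatekeeper host is a
    # placeholder and a valid GSI proxy is assumed.
    submit = textwrap.dedent("""\
        universe        = globus
        globusscheduler = gatekeeper.example.org/jobmanager-condor
        executable      = /bin/hostname
        output          = job.out
        error           = job.err
        log             = job.log
        queue
    """)

    with open("hostname.sub", "w") as f:
        f.write(submit)

    # condor_submit hands the job to Condor-G, which forwards it via GRAM.
    subprocess.run(["condor_submit", "hostname.sub"], check=True)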

9
Grid Operations
  • Operations areas
  • registration authority
  • VO management
  • Information infrastructure
  • Monitoring
  • Trouble tracking
  • Coordinated Operations
  • Policy
  • Full time effort
  • Leigh Grundhoefer (IU)
  • New hire (USC)
  • Part time effort
  • Ewa Deelman (USC)
  • Scott Koranda (UW)
  • Nosa Olmo (USC)
  • Dantong Yu (BNL)

10
Distributed Facilities Monitoring
  • VO-centric Nagios: Sergio Fantinel, Gennaro
    Tortonne (DataTAG)
  • VO-centric Ganglia: Catalin Dumitrescu (U of
    Chicago)
  • Ganglia
  • Cluster resource monitoring package from UC
    Berkeley
  • Local cluster and meta-clustering capabilities
  • Meta-daemon storage of machine sensors for CPU
    load, memory, I/O (see the polling sketch below)
  • Organizes sensor data hierarchically
  • Collects information about job usage
  • Assigns users to VOs by querying the Globus job
    manager
  • Tool for policy development
  • express, monitor and enforce policies for usage
    according to VO agreements
  • Deployment
  • Both packages deployed on US and DataTAG ATLAS
    and CMS sites
  • Nagios plugin work by Shawn McKee: sensors for
    disk and I/O usage
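
A minimal sketch of the kind of polling this monitoring relies on: the
Ganglia gmond daemon publishes its sensor data as XML over TCP (port 8649
by default), which a script can read and summarize. The host name below is
a placeholder:

    import socket
    import xml.etree.ElementTree as ET

    def read_gmond(host="gmond.example.org", port=8649):
        """Read the XML dump that a Ganglia gmond daemon publishes."""
        chunks = []
        with socket.create_connection((host, port), timeout=10) as sock:
            while True:
                data = sock.recv(8192)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks)

    root = ET.fromstring(read_gmond())
    for host in root.iter("HOST"):
        metrics = {m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")}
        print(host.get("NAME"), "load_one =", metrics.get("load_one"))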

11
Screen Shots
12
Glue Project
report from Ruth Pordes
  • Background
  • Joint effort of iVDGL Interoperability Group and
    DataTAG WP4
  • Led by Ruth Pordes (Fermilab, also PPDG
    Coordinator)
  • Initiated by HICB (HEP Intergrid Coordination
    Board)
  • Goals
  • Technical
  • Demonstrate that each Grid Service or Layer can
    Interoperate
  • Provide basis for interoperation of applications
  • Identify differences in protocols and
    implementations that prevent interoperability
  • Identify gaps in architecture and design that
    prevent interoperability
  • Sociological
  • Learn to work across projects and boundaries
    without explicit mandate or authority but for the
    longer term good of the whole.
  • Intent to expand from 2 continents to Global -
    inclusive not exclusive
  • Strategic
  • Any Glue code, configuration and documents
    developed will be deployed and supported through
    EDG and VDT release structure
  • Once interoperability is demonstrated and part of
    the ongoing culture of global grid middleware
    projects, Glue should not be needed
  • Provide short term experience as input to GGF
    standards and protocols
  • Prepare way for movement to new protocols - web
    and grid services (OGSA)

13
Glue Security
  • Authentication - X509 Certificate Authorities,
    Policies and Trust
  • DOE Science Grid SciDAC project CA is trusted by
    the European Data Grid and vice versa
  • Experiment testbeds (ATLAS,  CMS,  ALICE, D0 and
    BaBar) use cross-trusted certificates
  • Users are starting to understand that they need
    multiple certificates as well
  • Agreed-upon mechanisms for communicating new CAs
    and CRLs have been shown to work, but more
    automation for revocation is clearly needed
  • Authorization
  • Initial authorization mechanisms are in place
    everywhere using the Globus gridmapfiles
  • Various supporting procedures are used to create
    gridmapfiles from LDAP databases of certificates
    or other means (see the sketch below)
  • Identified requirement for more control over
    access to resources at time of request for use.
    But no accepted or interoperable solutions are in
    place today
  • Virtual Organization Management or a Community
    Authorization Service is under active discussion;
    see the PPDG SiteAA mail archives (just one
    mailing list of several):
    http://www.ppdg.net/pipermail/ppdg-siteaa/2002
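
For illustration, a minimal sketch of the grid-mapfile format those
procedures produce (the quoted subject DN, then the local account). The
DNs, the account, and the membership list here are made up rather than
pulled from a real VO LDAP server:

    # Stand-in for a VO membership query (e.g. an LDAP lookup, as EDG
    # mkgridmap performs); DNs and the mapped account are fabricated.
    members = [
        ("/C=US/O=Example Grid/OU=People/CN=Alice Physicist", "usatlas"),
        ("/C=IT/O=Example Grid/OU=People/CN=Bruno Fisico", "usatlas"),
    ]

    with open("grid-mapfile", "w") as f:
        for dn, account in members:
            # grid-mapfile syntax: quoted subject DN, then the local account
            f.write('"%s" %s\n' % (dn, account))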

14
Other Glue Work Areas
  • GLUE Schema (for resource discovery, job
    submission)
  • EDG and the MDS LDAP schema and information were
    initially very different
  • Commitment made up front to move to common
    resource descriptions.
  • The effort has been under way since February:
    weekly phone meetings and much email
  • GLUE Schema compute and storage information is
    released in V1 in MDS 2.2 and will be in EDG V2.0,
    defined with UML and LDIF (see the query sketch
    below)
  • Led to better understanding by all participants
    of CIM goals and to collaboration with CIM schema
    group through the GGF
  • File transfer and storage
  • Interoperability tests using GridFTP and SRM V1.0
    within the US have started with some success
  • Joint demonstrations
  • Common submission to testbeds based on VDT and
    EDG in a variety of ways
  • ATLAS Grappa to iVDGL sites (US ATLAS, CMS, LIGO,
    SDSS) and EDG (JDL on UI)
  • CMS-MOP
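
As an illustration of how published GLUE compute-element information can
be consumed, a sketch of an MDS (LDAP) query using python-ldap. The GIIS
host and VO name are placeholders, and the attribute names are GLUE
1.x-style examples:

    import ldap  # python-ldap

    # MDS 2.x GIIS/GRIS servers listen on port 2135 and publish under
    # o=grid; host, VO name and attribute list below are illustrative.
    conn = ldap.initialize("ldap://giis.example.org:2135")
    conn.simple_bind_s()  # anonymous bind

    entries = conn.search_s(
        "mds-vo-name=local,o=grid",
        ldap.SCOPE_SUBTREE,
        "(objectClass=GlueCE)",
        ["GlueCEUniqueID", "GlueCEStateFreeCPUs"],
    )

    for dn, attrs in entries:
        print(dn)
        for name, values in attrs.items():
            print("  %s: %s" % (name, b", ".join(values).decode()))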

15
ATLAS-kit
  • ATLAS-kit
  • RPMs based on Alessandro DeSalvo's work,
    distributed to DataTAG and EDG sites (release
    3.2.1)
  • Luca Vacarossa packaged a version for VDT sites
    with Pacman
  • Distributed as part of ScienceGrid cache
  • ATLAS-kit-verify invokes ATLSIM, does one event
  • New release
  • 4.0.1 in preparation by AD
  • Distribution to US sites to be done by Yuri
    Smirnov (new ATLAS iVDGL hire, started November
    1)
  • Continued work with Flavia Donno and others

16
WorldGrid (http://www.ivdgl.org/worldgrid)
  • Collaboration between US and EU grid projects
  • Shared use of Global resources across
    experiment and Grid domains
  • Common submission portals: ATLAS-Grappa,
    EDG-Genius, CMS MOP master
  • VO-centric grid monitoring: Ganglia- and
    Nagios-based
  • Infrastructure development project
  • Common information index server with Globus, EDG
    and GLUE schema
  • 18 sites, 6 countries, 130 CPUs
  • Interoperability components (EDG schema and
    information providers for VDT servers, UI and JDL
    for EDT sites)
  • First steps towards policy instrumentation and
    monitoring
  • Packaging
  • with Pacman (VDT sites) and RPMs/lcfg (DataTAG
    sites)
  • ScienceGrid
  • ATLAS, CMS, SDSS and LIGO application suites

17
WorldGrid Site
  • VDT installation and startup
  • Packaging and installation, configuration
  • Gridmap file generation
  • GIIS(s) registration
  • Site configuration, WorldGrid-WorkerNode testing
  • EDG information providers and testing
  • Ganglia sensors, instrumentation
  • Nagios plugins, display clients
  • EDG packaged ATLAS and CMS code
  • Sloan applications
  • Testing with Grappa-ATLAS submission (Joint EDT
    and US sites)

18
WorldGrid at IST2002 and SC2002
19
Grappa
  • Grappa
  • Web-based interface for Athena job submission to
    Grid resources
  • First one for ATLAS
  • Based on XCAT Science Portal technology developed
    at Indiana
  • Components
  • Built from Jython scripts using java-based grid
    tools
  • Framework for web-based job submission
  • Flexible, user-definable scripts saved as
    'notebooks'
  • Interest from GANGA team to work collaboratively
    on grid interfaces
  • IST2002 and SC2002 demos

20
XCAT Science Portal
  • Portal framework for creating science portals
    (application-specific web portals that provide an
    easy and intuitive interface for job execution on
    the Grid)
  • Users compose notebooks to customize the portal
    (e.g., the Athena Notebook)
  • Jython scripts (the user notebooks)
  • flexible python scripting interface
  • easy incorporation of java-based toolkits
  • Java toolkit
  • IU XCAT Technology (component technology)
  • CoG (Java implementation of Globus; see the
    Jython sketch below)
  • Chimera (Java Virtual Data toolkit-GriPhyN)
  • HTML form interface
  • runs over Jakarta Tomcat server using https
    (secure web protocols)
  • Java integration increases system independence
    and robustness
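
A minimal Jython sketch of the kind of call such notebooks make through
the Java CoG kit (run under Jython with the CoG jars on the classpath; the
gatekeeper contact is a placeholder and a valid grid proxy is assumed):

    # Jython: Java CoG classes imported directly into Python syntax.
    from org.globus.gram import GramJob

    # A simple GRAM RSL string; executable and contact are placeholders.
    rsl = "&(executable=/bin/hostname)(stdout=hostname.out)"
    job = GramJob(rsl)

    # Submit to the gatekeeper using the default proxy credential.
    job.request("gatekeeper.example.org/jobmanager-condor")
    print "submitted, job handle:", job.getIDAsString()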

21
Athena Notebook
  • Jython implementation of java-based grid toolkits
    (CoG, etc)
  • Job Parameter input forms (Grid Resources, Athena
    JobOptions, etc)
  • Web-based framework for interactive job
    submission
  • --- integrated with ---
  • Script-based framework for interactive or
    automatic (e.g., cron-job) job submission
  • Remote Job Monitoring (for both interactive and
    cron-based jobs)
  • Atlfast and Atlsim job submission
  • Visual Access to Grid Resources
  • Compute Resources
  • MAGDA Catalogue
  • System health monitors: Ganglia, Nagios, Hawkeye,
    etc.
  • Chimera Virtual Data toolkit -- tracking of job
    parameters

22
Grappa communications flow
[Diagram: a web browsing machine (JavaScript; Netscape/Mozilla/Internet
Explorer/PalmScape) and a script-based submission path (interactive or
cron-job, via http/JavaScript and the Cactus framework) both reach the
Grappa portal machine (XCAT on a Tomcat server, over https). The portal
uses CoG for submission and monitoring to compute resources A..Z, copies
input files and data to storage (data disk, HPSS), and registers file
locations and metadata with MAGDA, whose catalogue can be browsed over
http (spider).]
23
Grappa Portal
  • Grappa is not
  • Just a GUI front end to external grid tools
  • Just a GUI linking to external web services
  • But rather
  • An integrated java implementation of grid tools
  • The portal itself does many tasks
  • Job scheduling
  • Data transfer
  • Parameter tracking

24
Grappa Milestones
  • Design proposals: Spring 2001
  • 1st Atlsim prototype: Fall 2001
  • 1st ATLAS Software Week demo: March 2002
  • Selected (May 2002) for US Atlas SC2002 Demo
  • 1st submission across entire US Atlas testbed
    (Feb 2002)
  • 1st large-scale job submission, 50M events (April
    2002)
  • Integration with MAGDA (May 2002)
  • Registration of metadata with MAGDA (June 2002)
  • Resource Weighted Scheduling (July 2002)
  • Script based production cron system (July 2002)
  • GriPhyN-Chimera VDL Integration (Fall 2002)
  • EDG Compatibility plus DataTAG integration (Fall
    2002)

25
Spare Slides about Grappa Follow
26
Various Grappa/Athena modes
  • Has been run using locally installed libraries
  • Has been run using AFS libraries
  • Has been run with static, boxed versions of
    Atlfast and Atlsim (where we bring along
    everything to the remote compute node)
  • This can translate to input data as well
  • Do we bring input data to the executable?
  • Bring the executable to the data?
  • Bring everything along?
  • Many possibilities

27
A Quick Grappa Tour
  • Demo of running atlfast demo-production
  • Interactive mode
  • Automatic mode
  • GRAM contact monitoring
  • Links to resources external to the portal
  • Magda metadata queries
  • Ganglia cluster monitoring

28
Typical User session
Start up the portal on a selected machine. Start up a
web browser. Configure/select testbed resources.
Configure input files. Submit the job.
29
User Session
Open the monitoring window. Auto refresh (user
configurable). Monitor/cancel jobs.
30
User Session
Monitor cluster health. The new Ganglia version
creates an additional level, combining different
clusters into a metacluster.
31
User Session
Browse the MAGDA catalogue. Search for the personal
files you would like. Search for physics collections
(based on metadata).
32
Production Running
  • Configure cron job
  • Automatic job submission
  • Writing to the MAGDA cache
  • Automatic subnotebook naming
  • Set up cron script location and timing
  • The script interacts with the same portal as the
    web-based interface (see the sketch below)
  • Use interactive mode to monitor production
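
A hypothetical sketch of a cron-driven wrapper in that spirit;
submit_notebook() is a made-up stand-in for the portal's script interface,
not Grappa's actual API:

    import time

    def submit_notebook(notebook, subfolder):
        """Made-up stand-in for the portal's script-based submission call."""
        print("submitting %s as %s" % (notebook, subfolder))

    # One subnotebook per cron invocation, named by timestamp so each run
    # shows up as its own subfolder in the web view.
    stamp = time.strftime("%Y%m%d-%H%M")
    submit_notebook("atlfast-production", "atlfast-%s" % stamp)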

33
Production Monitoring
Configure and test scripts using command-line tools.
Command-line submission. Automatic cron submission.
View text log files.
34
Production Monitoring
Access the portal via the web. Cron submissions appear
as new subfolders. Click on a subfolder to check what
was submitted. Monitor job status.
35
Production Monitoring
Select jobs or groups of jobs to monitor. Auto refresh
(user configurable). Cancel-job button.
36
Production Monitoring
Browse the MAGDA catalogue. Auto registration of
files. Check metadata (currently available as a
command-line tool). Search for the files you would
like. Search for physics collections.
37
Metadata in MAGDA
Metadata is published along with data files. MAGDA
registers the metadata. Metadata browsing is available
as a command-line tool to check individual files.
38
Chimera Virtual Data Toolkit
  • Provides tracking of all job parameters
  • Browseable parameter catalogue
  • Simplified methods for data re-creation (see the
    sketch below), e.g. for
  • processes that crash
  • data that is lost
  • data retrieval slower than re-creation
  • Condor-G job scheduling
  • And much more
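
To make the re-creation criteria concrete, a small illustrative decision
helper (the function name and cost estimates are made up; Chimera itself
records the derivation and replays it):

    def should_recreate(output_exists, est_retrieval_s, est_recreation_s):
        """Re-derive a dataset if it is missing/lost, or if fetching a
        replica would take longer than re-running the recorded derivation."""
        if not output_exists:
            return True
        return est_retrieval_s > est_recreation_s

    # Example: a lost file, or a 2-hour transfer vs. a 30-minute re-run.
    print(should_recreate(False, 0, 0))        # True: data lost
    print(should_recreate(True, 7200, 1800))   # True: re-creation is faster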

39
Grappa US Atlas Testbed
  • Currently about 60 CPUs available
  • 6 condor pools
  • IU, OU, UTA, BNL, BU, LBL
  • 2 standalone machines
  • ANL, UMICH
  • Grappa/testbed production rate (cron-based):
    achieved 15M atlfast events/day

40
Grappa DC1 -- phase 1
  • Atlfast-(demo)-production testing successful
  • Simple atlfast demo with several pythia options
  • Large scale submission across the grid
  • Interactive and automatic submission demonstrated
  • Files and metadata incorporated into MAGDA
  • Atlsim production testing initially successful
  • Atlsim run in both boxed and local modes.
  • Only one atlsim mode tested

41
Grappa DC1 -- phase 2
  • Atlsim notebook upgrade
  • Could make notebook tailor-made for phase2
  • Launching mechanism for atlsim
  • Atlsim does a lot that grappa could do
  • vdc queries
  • data transfer (input or output)
  • leave it in atlsim for now -- or --
  • incorporate some pieces into grappa
  • Would require additional manpower to
    define/incorporate/test atlsim production notebook