1
Achieving the Vision: Grid2003 and Beyond
Paul Avery, University of Florida, avery@phys.ufl.edu
2
  • Grid2003: running since Oct. 2003
  • 27 sites (U.S., Korea)
  • 2100-2800 CPUs
  • 700-1100 concurrent jobs
  • 10 applications

[Map: Grid2003 site locations across the U.S. and Korea]
http://www.ivdgl.org/grid2003
3
Grid2003: A Collaborative Effort
  • Participating sites in U.S. Trillium Grid
    projects
  • PPDG (Particle Physics Data Grid)
  • GriPhyN
  • iVDGL (International Virtual Data Grid
    Laboratory)
  • (US-ATLAS, US-CMS, BTeV, LIGO, SDSS, Computer
    Science)
  • US-ATLAS and US-CMS effort
  • Fermilab, LBL, Argonne
  • U. New Mexico, U. Texas Arlington
  • International sites affiliated with LHC
    experiments
  • Kyungpook National University (Korea, CMS)
  • New sites
  • University of Buffalo (CCR)

4
Grid2003: A Federated Approach
  • Federation example from US-LHC testbeds
  • Local responsibility for facilities, but
    reporting to US-LHC projects
  • Systems and support, local resources w/
    well-defined interfaces
  • General grid-wide services provided by (some)
    sites
  • Six distinct Virtual Organizations (VOs) within
    Grid2003
  • US-ATLAS
  • US-CMS
  • BTeV
  • LIGO
  • SDSS
  • iVDGL

5
Organization in Grid2003 Federation
  • Grid sites: autonomy, control, agreements,
    policies
  • Set up and manage systems (mix of local and Grid
    use)
  • Install and configure middleware on head nodes
  • Automate central monitoring, validation,
    diagnosis
  • Grid system services
  • Collaborative approach to bringing up cross-site
    services (VO management, monitoring,
    configuration management)
  • Interfaces well defined through VDT and services
  • Robust against single points of failure
  • Grid application groups
  • 10 applications in several domains
  • End-to-end operations, diagnosis and production
    services

6
Middleware Packaging and Distribution
  • Virtual Data Toolkit (VDT) from GriPhyN
  • Globus, Condor, GriPhyN Chimera, Pegasus, DAGMan
  • Pacman from iVDGL
  • Meta-packaging and distribution tool
  • VO management scripts from EDG
  • Mapping user accounts across multiple VOs (a
    minimal sketch follows this list)
  • Schema and information providers
  • From joint DataTAG/EDG/Trillium GLUE project
  • MonALISA monitoring framework from Caltech
  • Netlogger monitoring package from DOE Science
    Grid
  • Upgrades
  • Upgraded from VDT 1.1.9 to VDT 1.1.11 during the
    project
  • Upgraded from MDS 2.2 to MDS 2.4
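
The EDG scripts named above are not shown on the slide. Purely as an illustration of the account-mapping idea, here is a minimal Python sketch that merges per-VO membership lists (the six Grid2003 VO names) into a Globus-style grid-mapfile; the file paths, pool-account naming, and first-VO-wins merge policy are assumptions, not the EDG implementation.

# Illustrative sketch only: merge per-VO lists of certificate DNs into a
# Globus-style grid-mapfile ("DN" local-account, one mapping per line).
# Paths and the pool-account scheme below are hypothetical.

VO_SOURCES = {
    "usatlas": "/etc/vo-lists/usatlas.txt",
    "uscms":   "/etc/vo-lists/uscms.txt",
    "btev":    "/etc/vo-lists/btev.txt",
    "ligo":    "/etc/vo-lists/ligo.txt",
    "sdss":    "/etc/vo-lists/sdss.txt",
    "ivdgl":   "/etc/vo-lists/ivdgl.txt",
}

def build_gridmap(sources=VO_SOURCES,
                  out_path="/etc/grid-security/grid-mapfile"):
    entries = {}
    for vo, path in sources.items():
        with open(path) as f:
            for n, dn in enumerate(line.strip() for line in f):
                if dn and dn not in entries:          # first VO wins on clashes
                    entries[dn] = f"{vo}{n + 1:03d}"  # pool account, e.g. uscms001
    with open(out_path, "w") as out:
        for dn, account in sorted(entries.items()):
            out.write(f'"{dn}" {account}\n')          # grid-mapfile line format

if __name__ == "__main__":
    build_gridmap()

A real deployment would pull the DN lists from each VO's membership service and apply site policy to overlapping memberships; the sketch shows only the shape of the DN-to-account mapping.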

7
Applications Run on Grid2003
  • High energy physics
  • US-ATLAS analysis (DIAL), US-ATLAS simulation
    (GCE)
  • US-CMS simulation (MOP)
  • BTeV simulation
  • Gravity waves
  • LIGO blind search for continuous sources
  • Digital astronomy
  • SDSS cluster finding (maxBcg)
  • Bioinformatics
  • Bio-molecular analysis (SnB)
  • Genome analysis (GADU/Gnare)
  • CS Demonstrators
  • Job Exerciser, GridFTP Demo, NetLogger-grid2003

8
Grid2003: A Necessary Step
  • Learning how to cope with large scale
  • Interesting failure modes as scale increases
  • Enormous human burden, barely possible on SC2003
    timescale
  • Previous experience from Grid testbeds critical
  • Learning how to operate a Grid
  • Add sites, recover from errors, provide
    information, update software, test
    applications, ...
  • Need tools, services, procedures, documentation,
    organization
  • Need reliable, intelligent, skilled people
  • Learning how to delegate responsibilities
  • Multiple levels: project, VO, service, site,
    application
  • Essential for future growth
  • Grid2003 experience critical for building
    useful Grids
  • See Grid2003 Project Lessons for details

9
Grid2003: A SUCCESS Story!
  • Much larger than originally planned
  • More sites, CPUs, simultaneous jobs
  • More applications (10) in more diverse areas
  • Able to accommodate a new institution and
    application
  • U. Buffalo
  • Survived updates of critical software
  • VDT, MDS, MonALISA
  • Still operational after 2.5 months
  • US-CMS using it for production simulations
  • Twice the resources available in US-CMS alone
    (next slide)

10
US-CMS Production
[Chart: US-CMS production jobs on Grid2003, split between USCMS and non-USCMS resources]
11
Lesson 1: Building Stuff Matters
  • Building something brings out the best in people
  • (Similar to a large HEP detector)
  • Cooperation
  • Willingness to invest time
  • Striving for excellence!
  • Grid development requires significant deployments
  • CMS testbed: debugging Globus, Condor (early
    2002)
  • ATLAS testbed: early development of Grid tools
  • SDSS, LIGO, CMS: virtual data tools
  • Powerful training mechanism
  • Good starting point for new institutions

12
Lesson 2: Packaging Matters
  • VDT and Pacman
  • Simple installation, configuration of Grid tools
    (+ applications)
  • Hugely important for first testbeds in 2002
  • Major advances over 13 VDT releases
  • Great improvements expected in Pacman 3
  • Packaging is a strategic issue!
  • More than a convenience crucial to our future
    success
  • Packaging → uniformity + automation → lower
    barriers to scaling
  • Automation is the next frontier
  • Reduce FTE overhead, communication traffic
  • Automate installation, configuration, testing,
    validation
  • Automate software updates, enable remote
    installation, etc. (a sketch follows this list)
  • Develop a complete Grid2003 installation in
    Pacman 3?
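
The project's actual update tooling is not shown on the slide. As a sketch only of the automation loop these bullets describe, here is a minimal Python driver that pushes a Pacman-installed release to each site and re-runs validation; the site names, the package label "iVDGL:Grid3", and the "site-verify.sh" test command are assumptions.

# Illustrative sketch only: run a software update + validation pass over a
# list of grid sites. Site names, the Pacman package label, and the
# validation command are hypothetical stand-ins.
import subprocess

SITES = ["head1.site-a.example.edu", "head2.site-b.example.edu"]

def run_on(site, command):
    """Run a command on a site's head node via ssh; return (ok, output)."""
    result = subprocess.run(["ssh", site, command],
                            capture_output=True, text=True, timeout=3600)
    return result.returncode == 0, result.stdout + result.stderr

def update_and_validate(site):
    # 1. Pull the packaged release with Pacman (the meta-packaging tool).
    ok, log = run_on(site, "pacman -get iVDGL:Grid3")
    if not ok:
        return site, "install failed"
    # 2. Re-run site validation before returning the site to production.
    ok, log = run_on(site, "./site-verify.sh")
    return site, "ok" if ok else "validation failed"

if __name__ == "__main__":
    for s in SITES:
        print("%s: %s" % update_and_validate(s))

The point of the sketch is the slide's argument: once installation is a single packaged command, per-site FTE overhead reduces to driving and checking that command.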

13
Grid2003 and Beyond (1)
  • Continuing commitment of Grid2003 stakeholders
  • Deploy Functional Demonstration Grids:
    Grid2004, Grid2005, ...
  • Continuing evolution of Functional Demonstration
    Grids
  • New release every 6-12 months, increasing
    functionality and scale
  • Continuing commitment to Grid and related R&D
  • CS research, VDT improvements (GriPhyN, PPDG)
  • Security (PPDG)
  • Advanced monitoring (MonALISA/GEMS, MDS, ...)
  • Collaborative tools, e.g. VRVS, AG, ...

14
Grid2003 and Beyond (2)
  • Continuing development of new tools, services
  • Grid enabled analysis
  • UltraLight infrastructures: CPU + storage +
    optical networks
  • Continuing development and exploitation of
    networks
  • National: HENP WG on Internet2, National Lambda
    Rail
  • International: SCIC, AMPATH, world data transfer
    speed records

15
Chimera Virtual Data System
  • Virtual Data Language (VDL)
  • Describes virtual data products
  • Virtual Data Catalog (VDC)
  • Used to store VDL
  • Abstract Job Flow Planner
  • Creates a logical DAG (dependency graph)
  • Concrete Job Flow Planner
  • Interfaces with a Replica Catalog
  • Provides a physical DAG (submission file) to
    Condor-G; see the sketch after the diagram note
  • Generic and flexible
  • As a toolkit and/or a framework
  • In a Grid environment or locally

[Diagram: VDL (XML) → Virtual Data Catalog → Abstract Planner → DAX (XML) → Concrete Planner + Replica Catalog → DAG → DAGMan; virtual data used in CMS production via MCRunJob]
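
Chimera's own code and VDL syntax are not excerpted here. As a conceptual Python sketch of the two planning stages, the following links derivations into a logical DAG by matching declared inputs to outputs (the abstract step), then emits DAGMan-style JOB/PARENT lines in a valid execution order (the concrete step). The derivation and file names are hypothetical, and the output is DAGMan-like text rather than real Condor-G submit files.

# Illustrative sketch only: abstract planning (logical DAG from data
# dependencies) followed by a concrete step emitting DAGMan-style text.
from graphlib import TopologicalSorter  # Python 3.9+

# Each derivation: name -> (input files, output files), as a VDL
# derivation would declare. Names below are made up for the example.
DERIVATIONS = {
    "simulate":    ((),               ("events.raw",)),
    "digitize":    (("events.raw",),  ("events.digi",)),
    "reconstruct": (("events.digi",), ("events.reco",)),
    "analyze":     (("events.reco",), ("plots.out",)),
}

def logical_dag(derivations):
    """Abstract step: derive job dependencies by matching inputs to outputs."""
    producer = {f: name for name, (_, outs) in derivations.items() for f in outs}
    return {name: {producer[f] for f in ins if f in producer}
            for name, (ins, _) in derivations.items()}

def concrete_plan(deps):
    """Concrete step: emit DAGMan-style JOB/PARENT lines in executable order."""
    order = TopologicalSorter(deps).static_order()   # dependencies first
    lines = [f"JOB {job} {job}.sub" for job in order]
    for job, parents in deps.items():
        lines += [f"PARENT {p} CHILD {job}" for p in sorted(parents)]
    return "\n".join(lines)

if __name__ == "__main__":
    print(concrete_plan(logical_dag(DERIVATIONS)))

The real concrete planner also consults the Replica Catalog so that already-materialized files can prune the DAG and be located physically; the sketch omits that lookup.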
16
National Light Rail Footprint
  • Started in 2003
  • Initial: 4 × 10 Gb/s
  • Future: 40 × 10 Gb/s

17
UltraLight
Unified infrastructure: computing, storage,
networking
  • 10 Gb/s network
  • Caltech, UF, FIU, UM, MIT
  • SLAC, FNAL, BNL
  • Int'l partners
  • Cisco, Level(3), Internet2

18
Grid2003 and Beyond (3)
  • Continuing commitment to international
    collaboration
  • Close coordination with LCG
  • Constant participation in LHC production
    computing exercises
  • Development of new international partners
    (Brazil, Korea, ...)
  • GLORIAD, ITER, ...
  • Continuing commitment to multi-disciplinary
    activities
  • HEP, CS, LIGO, Astronomy, Biology, Coastal
    Engineering, ...
  • Continuing evolution of interactions w/ funding
    agencies
  • Partnership of DOE (labs) and NSF (universities)
  • Close interaction of Directorates within NSF
    (e.g., CHEPREO)
  • Continuing commitment to coordinated outreach
  • QuarkNet, GriPhyN, iVDGL, PPDG, CHEPREO, CMS,
    ATLAS
  • Jan. 29-30: Needs Assessment Workshop in Miami
  • Digital Divide efforts (Feb. 15-20 Rio workshop)

19
An Inter-Regional Center for High Energy Physics
Research and Educational Outreach (CHEPREO) at
Florida International University
  • E/O Center in Miami area
  • iVDGL Grid Activities
  • CMS Research
  • AMPATH network (S. America)

Funded September 2003
20
Is Grid2003 a Path to Open Science Grid?
  • Yes (previous slides)
  • but not the whole story
  • Security
  • User account management
  • Storage management
  • Cluster management
  • Accounting
  • Database integration
  • Optical network integration
  • Heterogeneity (IA64, G5, other Linux flavors)
  • MPI-type applications
  • More (applications, manpower, computing
    resources)
  • We need collaborators!