Title: Trillium, Tier2 Centers and Grid3
1 Trillium, Tier2 Centers and Grid3
NSF Tier2 Meeting, Arlington, VA, July 9, 2004
Paul Avery, University of Florida, avery@phys.ufl.edu
2 U.S. Trillium Grid Partnership
- Trillium: PPDG, GriPhyN, iVDGL
- Particle Physics Data Grid: $12M (DOE) (1999-2004)
- GriPhyN: $12M (NSF) (2000-2005)
- iVDGL: $14M (NSF) (2001-2006)
- Basic composition (150 people)
- PPDG: 4 universities, 6 labs
- GriPhyN: 12 universities, SDSC, 3 labs
- iVDGL: 18 universities, SDSC, 4 labs, foreign partners
- Expts: BaBar, D0, STAR, JLab, CMS, ATLAS, LIGO, SDSS/NVO
- Complementarity of projects
- GriPhyN: CS research, Virtual Data Toolkit (VDT) development
- PPDG: End-to-end Grid services, monitoring, analysis
- iVDGL: Grid laboratory deployment using VDT
- Experiments provide frontier challenges
- Unified entity when collaborating internationally
3 Trillium Science Drivers
- ATLAS and CMS experiments at CERN LHC
- 100s of Petabytes, 2007 - ?
- High Energy and Nuclear Physics expts
- 1 Petabyte (1000 TB), 1997 - present
- LIGO (gravity wave search)
- 100s of Terabytes, 2002 - present
- Sloan Digital Sky Survey
- 10s of Terabytes, 2001 - present
- Future Grid resources
- Massive CPU (PetaOps)
- Large distributed datasets (>100 PB)
- Global communities (1000s)
4 Goal: Peta-scale Virtual-Data Grids for Global Science
[Figure labels: users (Production Team, Workgroups, Single Researcher); Interactive User Tools; Virtual Data Tools; Request Planning and Scheduling Tools; Request Execution and Management Tools; Resource Management Services; Security and Policy Services; Other Grid Services; Distributed resources (code, storage, CPUs, networks); Raw data source; GriPhyN]
- PetaOps
- Petabytes
- Performance
5 LHC: Petascale Global Science
- Complexity: Millions of individual detector channels
- Scale: PetaOps (CPU), 100s of Petabytes (Data)
- Distribution: Global distribution of people and resources
BaBar/D0 Example (2004): 700 Physicists, 100 Institutes, 35 Countries
CMS Example (2007): 5000 Physicists, 250 Institutes, 60 Countries
6 Global LHC Data Grid Hierarchy
10s of Petabytes/yr by 2007-8; 1000 Petabytes in < 10 yrs?
[Figure: tiered data flow, top to bottom, with the link speed between levels]
CMS Experiment, Online System
0.1 - 1.5 GBytes/s
Tier 0: CERN Computer Center
10-40 Gb/s
Tier 1
2.5-10 Gb/s
Tier 2
1-2.5 Gb/s
Tier 3: Physics caches
Tier 4: PCs
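As a rough illustration of what the link speeds quoted above imply, the following sketch estimates how long it would take to move a dataset across each tier boundary. The 1 PB dataset size (decimal convention), the assumption of full link utilization, and the assignment of each speed range to a tier boundary (read off the order of the slide text) are illustrative assumptions, not figures from the talk; the function and variable names are hypothetical.

```python
# Link speeds in Gb/s (low, high) as quoted on the slide. Which tier boundary
# each figure belongs to is inferred from the order of the slide text.
TIER_LINKS_GBPS = {
    "Tier 0 -> Tier 1": (10.0, 40.0),
    "Tier 1 -> Tier 2": (2.5, 10.0),
    "Tier 2 -> Tier 3": (1.0, 2.5),
}

PETABYTE = 1e15  # bytes, decimal convention (assumption)
SECONDS_PER_DAY = 86_400


def transfer_days(dataset_bytes: float, link_gbps: float) -> float:
    """Days needed to move dataset_bytes over a link_gbps link at full utilization."""
    seconds = dataset_bytes * 8 / (link_gbps * 1e9)
    return seconds / SECONDS_PER_DAY


def transfer_table(dataset_bytes: float = PETABYTE) -> dict:
    """Map each tier boundary to (best_case_days, worst_case_days)."""
    return {
        name: (transfer_days(dataset_bytes, high), transfer_days(dataset_bytes, low))
        for name, (low, high) in TIER_LINKS_GBPS.items()
    }


if __name__ == "__main__":
    for name, (best, worst) in transfer_table().items():
        print(f"{name}: {best:.1f} to {worst:.1f} days per PB")
```

Under these assumptions a fully used 10 Gb/s link needs a little over nine days per petabyte, which suggests why data is distributed across tiers instead of being fetched from the center on demand.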
7 Tier2 Centers
- Tier2 facility
- 20-40% of Tier1?
- 1 FTE support: commodity CPU and disk, no hierarchical storage
- Essential university role in extended computing infrastructure
- Validated by 3 years of experience with proto-Tier2 sites
- Functions
- Physics analysis
- Simulation
- Experiment software
- Support smaller institutions
- Official role in Grid hierarchy (U.S.)
- Sanctioned by MOU (ATLAS, CMS, LIGO)
- Local P.I. with reporting responsibilities
- Selection by collaboration via careful process
8 Analysis by Globally Distributed Teams
- Non-hierarchical: Chaotic analyses and productions
- Superimpose significant random data flows
9 International Virtual Data Grid Laboratory
[Map of iVDGL sites: SKC, Boston U, Buffalo, UW Milwaukee, Michigan, UW Madison, BNL, Fermilab, LBL, Argonne, PSU, Iowa, Chicago, J. Hopkins, Indiana, Hampton, Caltech, ISI, Vanderbilt, UCSD, UF, Austin, FIU, Brownsville]
10 Roles of iVDGL Institutions
- U Florida: CMS (Tier2), Management
- Caltech: CMS (Tier2), LIGO (Management)
- UC San Diego: CMS (Tier2), CS
- Indiana U: ATLAS (Tier2), iGOC (operations)
- Boston U: ATLAS (Tier2)
- Harvard: ATLAS (Management)
- Wisconsin, Milwaukee: LIGO (Tier2)
- Penn State: LIGO (Tier2)
- Johns Hopkins: SDSS (Tier2), NVO
- Chicago: CS, Coord./Management, ATLAS (Tier2)
- Vanderbilt: BTeV (Tier2, unfunded)
- Southern California: CS
- Wisconsin, Madison: CS
- Texas, Austin: CS
- Salish Kootenai: LIGO (Outreach, Tier3)
- Hampton U: ATLAS (Outreach, Tier3)
- Texas, Brownsville: LIGO (Outreach, Tier3)
- Fermilab: CMS (Tier1), SDSS, NVO
- Brookhaven: ATLAS (Tier1)
11 iVDGL Goals
- Deploy and operate a Grid laboratory
- Support research mission of data intensive experiments
- Provide computing people at university proto-Tier2 sites
- Operate Grid laboratory for CS technology development
- Prototype and deploy a Grid Operations Center (iGOC)
- Integrate Grid software tools
- Into computing infrastructures of the experiments
- Support delivery of Grid technologies
- Harden the Virtual Data Toolkit (VDT) and middleware technologies developed by GriPhyN and other Grid projects
- Education and Outreach
- Provide tools and mechanisms for underrepresented groups and remote regions to participate in international science projects
- Collaborate on joint projects with other E/O efforts
12 Trillium Grid Tools: Virtual Data Toolkit
[Figure: VDT build and test flow. Labels: Sources (CVS); Contributors (VDS, etc.); NMI; Build and Test Condor pool (37 computers); Build; Test; Binaries; Package; Patching; GPT src bundles; RPMs; Pacman cache; VDT. Note on figure: "Use NMI processes later"]
13 Trillium Collaborative Relationships: Internal and External
[Figure labels: Computer Science Research; Techniques and software; Virtual Data Toolkit; Tech Transfer; Larger Science Community; Partner Physics projects; Partner Outreach projects; Requirements; Prototyping and experiments; Production Deployment; U.S. Grids; Int'l; Outreach]
- Other linkages
- Work force
- CS researchers
- Industry
Globus, Condor, NMI, iVDGL, PPDG, EU DataGrid, LHC Experiments, QuarkNet, CHEPREO
14 Grid2003: An Operational National Grid
- 28 sites: Universities and national labs
- 2800 CPUs, 400-1300 jobs
- Running since October 2003
- Applications in HEP, LIGO, SDSS, Genomics
[Map label: Korea]
http://www.ivdgl.org/grid2003
15 Grid2003: Three Months Usage
16 Production Simulations on Grid2003
US-CMS Monte Carlo Simulation
Used 1.5 × US-CMS resources
[Plot legend: Non-USCMS, USCMS]
17 Education and Outreach
18 Grids and the Digital Divide, Rio de Janeiro, Feb. 16-20, 2004
- Background
- World Summit on Information Society
- HEP Standing Committee on Inter-regional Connectivity (SCIC)
- Themes
- Global collaborations, Grids and addressing the Digital Divide
- Next meeting: 2005 (Korea)
http://www.uerj.br/lishep2004
19 iVDGL, GriPhyN Education / Outreach
- Basics
- $200K/yr
- Led by UT Brownsville
- Workshops, portals
- Partnerships with CHEPREO, QuarkNet,
20 36 students!
21 CHEPREO: Center for High Energy Physics Research and Educational Outreach, Florida International University
- Physics Learning Center
- CMS Research
- iVDGL Grid Activities
- AMPATH network (S. America)
Funded September 2003: $4M initially (3 years)
22 UUEO: A New Initiative
- Meeting April 8 in Washington DC
- Brought together 40 outreach leaders (including NSF)
- Proposed Grid-based framework for common E/O effort
23 Extra Slides
24 Outreach: QuarkNet-Trillium Virtual Data Portal
- More than a web site
- Organize datasets
- Perform simple computations
- Create new computations and analyses
- View and share results
- Annotate and enquire (metadata)
- Communicate and collaborate
- Easy to use, ubiquitous
- No tools to install
- Open to the community
- Grow and extend
Initial prototype implemented by graduate student Yong Zhao and M. Wilde (U. of Chicago)
25 Large Hadron Collider (LHC) at CERN
- 27 km Tunnel in Switzerland and France
[Figure labels: TOTEM, CMS, ALICE, LHCb, ATLAS]
Search for Origin of Mass and Supersymmetry (2007 - ?)
26 GriPhyN Achievements
- Virtual Data paradigm to express science processes
- Unified language (VDL) to express general data transformation
- Advanced planners, executors, monitors, predictors, fault recovery, to make the Grid like a workstation
- Virtual Data Toolkit (VDT)
- Tremendously simplified installation and configuration of Grids
- Close partnership with and adoption by multiple sciences: ATLAS, CMS, LIGO, SDSS, Bioinformatics, EU Projects
- Broad education and outreach program (UT Brownsville)
- 25 graduate, 2 undergraduate; 3 CS PhDs by end of 2004
- Virtual Data for QuarkNet Cosmic Ray project
- Grid Summer School 2004, 3 MSIs participating
27 Analysis by Globally Distributed Teams
- Non-hierarchical: Chaotic analyses and productions
- Superimpose significant random data flows
28 Virtual Data Toolkit: Tools in VDT 1.1.12
- Globus Alliance
- Grid Security Infrastructure (GSI)
- Job submission (GRAM)
- Information service (MDS)
- Data transfer (GridFTP)
- Replica Location (RLS)
- Condor Group
- Condor/Condor-G
- DAGMan
- Fault Tolerant Shell
- ClassAds
- EDG and LCG
- Make Gridmap
- Cert. Revocation List Updater
- Glue Schema/Info provider
- ISI and UC
- Chimera related tools
- Pegasus
- NCSA
- MyProxy
- GSI OpenSSH
- LBL
- PyGlobus
- NetLogger
- Caltech
- MonALISA
- VDT
- VDT System Profiler
- Configuration software
- Others
- KX509 (U. Mich.)
29 VDT Growth (1.1.14 Currently)
[Figure: timeline of VDT releases, listed here in version order]
- VDT 1.0: Globus 2.0b, Condor 6.3.1
- VDT 1.1.3, 1.1.4 and 1.1.5: pre-SC 2002
- VDT 1.1.7: Switch to Globus 2.2
- VDT 1.1.8: First real use by LCG
- VDT 1.1.11: Grid2003
- VDT 1.1.14: May 10
30 Grid2003: Broad Lessons
- Careful planning and coordination essential to build Grids
- Community investment of time/resources
- Operations team needed to operate Grid as a facility
- Tools, services, procedures, documentation, organization
- Security, account management, multiple organizations
- Strategies needed to cope with increasingly large scale
- Interesting failure modes as scale increases
- Delegation of responsibilities to conserve human resources
- Project, Virtual Org., Grid service, site, application
- Better services, documentation, packaging
- Grid2003 experience critical for building useful Grids
- Frank discussion in Grid2003 Project Lessons doc
31 Grid2003 -> Open Science Grid
- Build on Grid2003 experience
- Persistent, production-quality Grid, national and international scope
- Ensure U.S. leading role in international science
- Grid infrastructure for large-scale collaborative scientific research
- Create large computing infrastructure
- Combine resources at DOE labs and universities to effectively become a single national computing infrastructure for science
- Provide opportunities for educators and students
- Participate in building and exploiting this grid infrastructure
- Develop and train scientific and technical workforce
- Transform the integration of education and research at all levels
http://www.opensciencegrid.org
32 Grid References
- Grid2003: www.ivdgl.org/grid2003
- Globus: www.globus.org
- PPDG: www.ppdg.net
- GriPhyN: www.griphyn.org
- iVDGL: www.ivdgl.org
- LCG: www.cern.ch/lcg
- EU DataGrid: www.eu-datagrid.org
- EGEE: egee-ei.web.cern.ch
2nd Edition: www.mkp.com/grid2
33 2004 Grid Summer School
- First of its kind in the U.S.
- (EU had one in Summer 2003)
- Marks new direction for Trillium
- First attempt to systematically teach Grid technologies
- First attempt to gather relevant materials in one place
- Today: Students in CS and Physics
- Later: Students, postdocs, junior and senior scientists
- Reaching a wider audience
- Put materials on the web for direct access
- Build online Grid courses (www.cnx.rice.edu)
- Create Grid book (online and print) with Georgia Tech
- New funding opportunities
- NSF: new large-scale training and education programs