Title: Harvey B Newman, Professor of Physics
1 HENP Grids and Networks: Global Virtual Organizations
- Harvey B Newman, Professor of Physics
- LHCNet PI, US CMS Collaboration Board Chair
- Drivers of the Formation of the Information Society
- April 18, 2003
2 Computing Challenges: Petabytes, Petaflops, Global VOs
- Geographical dispersion of people and resources
- Complexity of the detector and the LHC environment
- Scale: Tens of Petabytes per year of data
5000 Physicists, 250 Institutes, 60 Countries
Major challenges associated with:
- Communication and collaboration at a distance
- Managing globally distributed computing and data resources
- Cooperative software development and physics analysis
New Forms of Distributed Systems: Data Grids
3 Next Generation Networks for Experiments: Goals and Needs
Large data samples explored and analyzed by thousands of globally dispersed scientists, in hundreds of teams
- Providing rapid access to event samples, subsets and analyzed physics results from massive data stores
- From Petabytes in 2003, 100 Petabytes by 2008, to 1 Exabyte by 2013
- Providing analyzed results with rapid turnaround, by coordinating and managing the large but LIMITED computing, data handling and NETWORK resources effectively
- Enabling rapid access to the data and the collaboration
- Across an ensemble of networks of varying capability
- Advanced integrated applications, such as Data Grids, rely on seamless operation of our LANs and WANs
- With reliable, monitored, quantifiable high performance
4 LHC Collaborations
[Maps of the CMS and ATLAS collaborations, with US participation highlighted]
The US provides about 20-25% of the author list in both experiments
5 US LHC Institutions
[Map: US CMS, US ATLAS and Accelerator institutions]
6 Four LHC Experiments: The Petabyte to Exabyte Challenge
- ATLAS, CMS, ALICE, LHCb: Higgs and New particles; Quark-Gluon Plasma; CP Violation
Data stored: 40 Petabytes/Year and UP; CPU: 0.30 Petaflops and UP
0.1 Exabyte (2008) to 1 Exabyte (2013?) for the LHC Experiments (1 EB = 10^18 Bytes)
7 LHC: Higgs Decay into 4 muons (Tracker only)
1000X LEP Data Rate
10^9 events/sec; selectivity 1 in 10^13 (like finding 1 person among a thousand world populations)
8 LHC Data Grid Hierarchy
CERN/Outside Resource Ratio 1:2; Tier0/(Σ Tier1)/(Σ Tier2) 1:1:1
- Experiment -> Online System: PByte/sec
- Online System -> Tier 0 +1 (CERN): 100-1500 MBytes/sec; CERN: 700k SI95, 1 PB Disk, Tape Robot, HPSS
- Tier 0 -> Tier 1 centers (2.5-10 Gbps): FNAL (200k SI95, 600 TB), IN2P3 Center, INFN Center, RAL Center
- Tier 1 -> Tier 2 centers: 2.5-10 Gbps
- Tier 2 -> Tier 3 (Institutes, 0.25 TIPS each): 2.5-10 Gbps
- Tier 3 -> Tier 4 (Physics data cache, Workstations): 0.1-10 Gbps
Physicists work on analysis channels; each institute has 10 physicists working on one or more channels
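As a rough cross-check of the rates above (not on the original slide), the 100-1500 MBytes/sec into Tier 0 can be translated into annual volumes, assuming roughly 10^7 live seconds of data taking per year, a common rule of thumb rather than a figure from the talk:

    # Back-of-envelope only: annual volume implied by the Tier 0 ingest rates
    # above, assuming ~1e7 live seconds of data taking per year (an assumed
    # rule of thumb, not a number from the slide).
    live_seconds = 1e7
    for rate_mb_per_s in (100, 1500):
        petabytes = rate_mb_per_s * 1e6 * live_seconds / 1e15
        print(f"{rate_mb_per_s} MB/s -> ~{petabytes:.0f} PB/year")
    # 100 MB/s -> ~1 PB/year; 1500 MB/s -> ~15 PB/year, i.e. the
    # "Tens of Petabytes per year" scale quoted earlier in the talk.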
9 Transatlantic Net WG (HN, L. Price): Bandwidth Requirements
[Table of installed bandwidth requirements; a maximum link occupancy of 50% is assumed]
See http://gate.hep.anl.gov/lprice/TAN
10 History: One Large Research Site
Much of the Traffic: SLAC -> IN2P3/RAL/INFN, via ESnet, France, Abilene and CERN
Current Traffic 400 Mbps (ESnet Limitation); Projections: 0.5 to 24 Tbps by 2012
11 Progress: Max. Sustained TCP Throughput on Transatlantic and US Links
- 8-9/01: 105 Mbps with 30 Streams SLAC-IN2P3; 102 Mbps in 1 Stream CIT-CERN
- 11/5/01: 125 Mbps in One Stream (modified kernel) CIT-CERN
- 1/09/02: 190 Mbps for One stream shared on 2 x 155 Mbps links
- 3/11/02: 120 Mbps Disk-to-Disk with One Stream on a 155 Mbps link (Chicago-CERN)
- 5/20/02: 450-600 Mbps SLAC-Manchester on OC12 with 100 Streams
- 6/1/02: 290 Mbps Chicago-CERN, One Stream on OC12 (mod. kernel)
- 9/02: 850, 1350, 1900 Mbps Chicago-CERN with 1, 2, 3 GbE Streams on an OC48 Link
- 11-12/02 (FAST): 940 Mbps in 1 Stream SNV-CERN; 9.4 Gbps in 10 Flows SNV-Chicago
Also see http://www-iepm.slac.stanford.edu/monitoring/bulk/ and the Internet2 E2E Initiative: http://www.internet2.edu/e2e
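One way to see why these records required multiple streams, modified kernels and very clean paths is the standard loss-limited TCP throughput approximation (Mathis et al.), throughput ≈ C·MSS/(RTT·√p). The sketch below is illustrative only; the MSS, RTT and loss rate are assumed values, not measurements from the slide.

    from math import sqrt

    def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.22):
        """Approximate loss-limited throughput of a single TCP Reno stream."""
        return c * mss_bytes * 8 / (rtt_s * sqrt(loss_rate))

    # Assumed example: 1460-byte segments, 170 ms transatlantic RTT,
    # one packet lost per ten million (p = 1e-7)
    bw = mathis_throughput_bps(1460, 0.170, 1e-7)
    print(f"~{bw / 1e6:.0f} Mbps per stream")   # ~265 Mbps: hence multiple
    # streams, jumbo frames and modified stacks for Gbps-scale transfers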
12 DataTAG Project
[Network map: New York, StarLight, STAR-TAP, Abilene, ESnet, CALREN2 and Geneva, linked by a 2.5 to 10G wave triangle]
- EU-Solicited Project: CERN, PPARC (UK), Amsterdam (NL), and INFN (IT), plus US (DOE/NSF: UIC, NWU and Caltech) partners
- Main Aims:
- Ensure maximum interoperability between US and EU Grid Projects
- Transatlantic Testbed for advanced network research
- 2.5 Gbps Wavelength Triangle from 7/02; to a 10 Gbps Triangle by Early 2003
13 FAST (Caltech): A Scalable, Fair Protocol for Next-Generation Networks, from 0.1 To 100 Gbps (SC2002, 11/02)
Highlights of FAST TCP:
- Standard Packet Size
- 940 Mbps single flow/GE card
- 9.4 petabit-m/sec
- 1.9 times LSR
- 9.4 Gbps with 10 flows
- 37.0 petabit-m/sec
- 6.9 times LSR
- 22 TB in 6 hours with 10 flows
- Implementation
- Sender-side (only) mods
- Delay (RTT) based
- Stabilized Vegas
[Chart: bandwidth-distance products for the SC2002 runs (1, 2 and 10 flows) over Sunnyvale-Geneva, Baltimore-Geneva and Baltimore-Sunnyvale paths, compared with earlier Internet2 LSR marks (29.3.00 multiple streams; 9.4.02 1 flow; 22.8.02 IPv6)]
URL: netlab.caltech.edu/FAST
Next: 10GbE; 1 GB/sec disk to disk
C. Jin, D. Wei, S. Low and the FAST Team and Partners
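For orientation, the sketch below shows the general shape of a delay-based (Vegas-style) window update of the kind FAST uses: the window grows while the measured RTT stays near the propagation delay and backs off as queueing delay appears. It is a simplified illustration with made-up constants, not the published FAST TCP algorithm or its tuned parameters.

    def update_window(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
        """One delay-based window update per RTT (w in packets, RTTs in seconds).
        Equilibrium is reached when roughly alpha packets are queued in the path."""
        target = (base_rtt / rtt) * w + alpha
        return min(2 * w, (1 - gamma) * w + gamma * target)   # at most doubles per update

    # Toy usage: the RTT creeps up as the bottleneck queue builds
    w, base_rtt = 10.0, 0.170
    for rtt in (0.170, 0.172, 0.175, 0.180):
        w = update_window(w, base_rtt, rtt)
    print(round(w, 1))   # window ramps up quickly, capped at doubling per step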
14 FAST TCP: Baltimore/Sunnyvale
- RTT estimation: fine-grain timer
- Fast convergence to equilibrium
- Delay monitoring in equilibrium
- Pacing: reducing burstiness
- Measurements:
- Std Packet Size
- Utilization averaged over > 1 hr
- 3000 km Path
- 8.6 Gbps; 21.6 TB in 6 Hours
- Fair Sharing, Fast Recovery
[Plot: average utilization of 88-95% in runs with 1, 2, 7, 9 and 10 flows]
15 10GigE Data Transfer Trial: Internet2 LSR 2003
On Feb. 27-28, a Terabyte of data was transferred in 3700 seconds by S. Ravot of Caltech, between the Level3 PoP in Sunnyvale (near SLAC) and CERN, through the TeraGrid router at StarLight, from memory to memory, as a single TCP/IP stream at an average rate of 2.38 Gbps (using large windows and 9 kB Jumbo frames). This beat the former record by a factor of 2.5, and used the US-CERN link at 99% efficiency.
[Images: 10GigE NIC; European Commission]
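A quick check of why "large windows" were essential (the RTT below is an assumed typical Sunnyvale-CERN value, not stated on the slide):

    # TCP window needed to keep a single 2.38 Gbps stream full over a long path
    rate_bps = 2.38e9
    rtt_s = 0.180                      # assumed transatlantic round-trip time
    window_bytes = rate_bps * rtt_s / 8
    print(f"Required window ~ {window_bytes / 2**20:.0f} MiB")   # ~51 MiB,
    # far beyond default TCP buffer sizes, hence the tuned "large windows"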
16 TeraGrid (www.teragrid.org): NCSA, ANL, SDSC, Caltech, PSC
[Network map: DTF Backplane of 4 x 10 Gbps linking Caltech, San Diego, Starlight/NW Univ, ANL, UIC, Univ of Chicago, Ill Inst of Tech, NCSA/UIUC, Urbana and Indianapolis (Abilene NOC); OC-48 (2.5 Gb/s, Abilene), multiple 10 GbE (Qwest), multiple 10 GbE (I-WIRE Dark Fiber), and multiple carrier hubs in Chicago]
A Preview of the Grid Hierarchy and Networks of the LHC Era. Higgs Study at Caltech, and Production with FAST TCP, is a Flagship TeraGrid Application.
Source: Charlie Catlett, Argonne
17 National Light Rail Footprint
- NLR:
- Buildout Started November 2002
- Initially 4 10G Wavelengths
- To 40 10G Waves in Future
Transition now to optical, multi-wavelength R&E networks: US, Europe and Intercontinental (US-China-Russia) Initiatives. HEP is the universally recognized leading application for initial NLR use.
18 HENP Major Links: Bandwidth Roadmap (Scenario) in Gbps
Continuing the Trend: 1000 Times Bandwidth Growth Per Decade. We are Learning to Use and Share Multi-Gbps Networks Efficiently. HENP is leading the way towards future networks and dynamic Grids.
19 HENP Lambda Grids: Fibers for Physics
- Problem: Extract Small Data Subsets of 1 to 100 Terabytes from 1 to 1000 Petabyte Data Stores
- Survivability of the HENP Global Grid System, with hundreds of such transactions per day (circa 2007), requires that each transaction be completed in a relatively short time.
- Example: Take 800 secs to complete the transaction. Then:
  Transaction Size (TB)    Net Throughput (Gbps)
  1                        10
  10                       100
  100                      1000 (Capacity of Fiber Today)
- Summary: Providing Switching of 10 Gbps wavelengths within 3-5 years, and Terabit Switching within 5-8 years, would enable Petascale Grids with Terabyte transactions, as required to fully realize the discovery potential of major HENP programs, as well as other data-intensive fields.
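The throughput column follows directly from size over time; a minimal check of the table's arithmetic (decimal terabytes assumed):

    for size_tb in (1, 10, 100):
        gbps = size_tb * 1e12 * 8 / 800 / 1e9   # bits moved in 800 seconds
        print(f"{size_tb} TB in 800 s -> {gbps:.0f} Gbps")
    # 1 TB -> 10 Gbps, 10 TB -> 100 Gbps, 100 TB -> 1000 Gbps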
20 Emerging Data Grid User Communities
- NSF Network for Earthquake Engineering Simulation (NEES)
- Integrated instrumentation, collaboration, simulation
- Grid Physics Network (GriPhyN)
- ATLAS, CMS, LIGO, SDSS
- Access Grid; VRVS: supporting group-based collaboration
- And:
- Genomics, Proteomics, ...
- The Earth System Grid and EOSDIS
- Federating Brain Data
- Computed MicroTomography
- Virtual Observatories
21 Particle Physics Data Grid Collaboratory Pilot (2001-2003)
The Particle Physics Data Grid Collaboratory Pilot will develop, evaluate and deliver vitally needed Grid-enabled tools for data-intensive collaboration in particle and nuclear physics. Novel mechanisms and policies will be integrated with Grid Middleware, experiment-specific applications and computing resources to provide effective end-to-end capability.
- A Strong Computer Science/Physics Partnership
- Reflected in a groundbreaking DOE MICS/HENP Partnership
- Driving Forces: Now-running and future experiments; ongoing CS projects; leading-edge Grid developments
- Focus on End-to-end services
- Integration of Grid Middleware with Experiment-specific Components
- Security: Authenticate, Authorize, Allocate
- Practical Orientation: Monitoring, instrumentation, networks
22 PPDG Mission and Foci Today
- Mission: Enabling new scales of research in experimental physics and experimental computer science
- Advancing Grid Technologies by addressing key issues in architecture, integration, deployment and robustness
- Vertical Integration of Grid Technologies into the Application frameworks of Major Experimental Programs
- Ongoing as Grids, Networks and Applications Progress
- Deployment, hardening and extensions of Common Grid services and standards
- Data replication, storage and job management, monitoring and task execution-planning
- Mission-oriented, Interdisciplinary teams of physicists, software and network engineers, and computer scientists
- Driven by demanding end-to-end applications of experimental physics
23 PPDG Accomplishments (I)
- First-Generation HENP Application Grids and Grid Subsystems
- Production Simulation Grids for ATLAS and CMS; STAR Distributed Analysis Jobs: up to 30 TBytes, 30 Sites
- Data Replication for BaBar: Terabyte Stores systematically replicated from California to France and the UK
- Replication and Storage Management for STAR and JLAB: Development and Deployment of Standard APIs, and Interoperable Implementations
- Data Transfer, Job and Information Management for D0: GridFTP integrated with SAM; Condor-G job scheduler and MDS resource discovery, all integrated with SAM
- Initial Security Infrastructure for Virtual Organizations
- PKI certificate management, policies and trust relationships (using DOE Science Grid and Globus)
- Standardizing Authorization mechanisms: standard callouts for Local Center Authorization for Globus, EDG
- Prototyping secure credential stores
- Engagement of site security teams
24 PPDG Accomplishments (II)
- Data and Storage Management
- Robust data transfer over heterogeneous networks using standard and next-generation protocols: GridFTP, bbcp, GridDT, FAST TCP
- Distributed Data Replica management: SRB, SAM, SRM
- Common Storage Management interface and services across diverse implementations: SRM - HPSS, Jasmine, Enstore
- Object Collection management in diverse RDBMSs: CAIGEE, SOCATS
- Job Planning, Execution and Monitoring
- Job scheduling based on resource discovery and status: Condor-G and extensions for strategy and policy
- Retry and Fault Tolerance in response to error conditions: hardened gram, gass-cache, ftsh, Condor-G
- Distributed monitoring infrastructure for system tracking, resource discovery, resource and job information: MonALISA, MDS, Hawkeye
- Prototypes and Evaluations
- Grid-enabled physics analysis tools and prototypical environments
- End-to-end troubleshooting and fault handling
- Cooperative Monitoring of Grid, Fabric, Applications
25 WorldGrid EU-US Interoperation: One of 24 Demos at SC2002
Collaborating with iVDGL and DataTAG on international grid testbeds will lead to easier deployment of experiment grids across the globe.
26 PPDG Collaborators Participated in 24 SC2002 Demos
27 CAIGEE: Progress in Interfacing Analysis Tools to the Grid
- CAIGEE: CMS Analysis - an Integrated Grid Enabled Environment
- Lightweight, functional, making use of existing software as far as possible (AFAP)
- Plug-in Architecture based on Web Services
- Expose Grid Views of the Global System to physicists at various levels of detail, with Feedback
- Supports Data Requests, Preparation, Production, Movement, and Analysis of Physics Object Collections
- Initial Target: US-CMS physicists in California (CIT, UCSD, Riverside, Davis, UCLA)
- Expand to Include FIU, UF, FSU, UERJ
- Future: the Whole of US CMS, and CMS
28 CAIGEE Draft Architecture
- Multiplatform, Light Client
- Object Collection Access
- Interface to (O)RDBMSs
- Use of Web Services
[Diagram: CAIGEE Architecture]
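Purely as an illustration of the "light client talking to Web Services" idea above (not CAIGEE's actual interfaces), here is a minimal client that asks a hypothetical object-collection service what collections it can serve; the endpoint URL and XML layout are invented for this sketch:

    import urllib.request
    import xml.etree.ElementTree as ET

    def list_collections(endpoint="http://example.org/caigee/collections"):
        """Fetch an XML listing of object collections from a (hypothetical) service."""
        with urllib.request.urlopen(endpoint) as resp:
            tree = ET.parse(resp)
        return [c.get("name") for c in tree.findall(".//collection")]

    # A deployed service would return named physics object collections; this
    # call only succeeds if something is actually listening at the endpoint.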
29 PPDG Common Technologies, Tools and Applications: Who Uses What?
30 PPDG Inter-Project Collaborations and Interactions
- US Physics Grid Projects: Virtual Data Toolkit - GriPhyN, iVDGL, PPDG (the Trillium)
- EU-US Physics Grids: LHC Computing Grid (ATLAS, CMS); European Data Grid (BaBar, D0, ATLAS, CMS); HEP Intergrid Coordination Board
- SciDAC project interactions:
- Earth System Grid II: SRM
- DOE Science Grid: CA, RA
- A High Performance DataGrid Toolkit: Globus Toolkit
- Storage Resource Management for Data Grid Applications: SRM
- Security and Policy for Group Collaborations: GSI, CAS
- Scientific Data Management Center: STAR SDM
- Bandwidth Estimation: Measurement Methodologies and Applications: IEPM-BW
- A National Computational Infrastructure for Lattice Gauge Theory: TJNAF/LQCD
- Distributed Monitoring Framework: NetLogger, Glue Schema
31 PPDG-CP and GriPhyN: Virtual Data Grids
- Users' View of a PVDG (PPDG-CP Proposal, April 2000)
32 HENP Data Grids Versus Classical Grids
- The original Computational and Data Grid concepts are largely stateless, open systems, known to be scalable
- Analogous to the Web
- The classical Grid architecture has a number of implicit assumptions
- The ability to locate and schedule suitable resources within a tolerably short time (i.e. resource richness)
- Short transactions with relatively simple failure modes
- HEP Grids are Data Intensive and Resource-Constrained
- 1000s of users competing for resources at 100s of sites
- Resource usage governed by local and global policies
- Long transactions; some long queues
- Need Real-time Monitoring and Tracking
- Distributed failure modes -> Strategic task management
33 Layered Grid Architecture
- Collective: Coordinating multiple resources - ubiquitous infrastructure services, app-specific distributed services
- Resource: Sharing single resources - negotiating access, controlling use
- Connectivity: Talking to things - communication (Internet protocols) and security
- Fabric: Controlling things locally - access to, and control of, resources
"The Anatomy of the Grid: Enabling Scalable Virtual Organizations", Foster, Kesselman, Tuecke, Intl. J. High Performance Computing Applications, 15(3), 2001
34 Current Grid Challenges: Secure Workflow Management and Optimization
- Maintaining a Global View of Resources and System State
- Coherent end-to-end System Monitoring
- Adaptive Learning: new algorithms and strategies for execution optimization (increasingly automated)
- Workflow: a Strategic Balance of Policy Versus Moment-to-moment Capability to Complete Tasks
- Balance High Levels of Usage of Limited Resources Against Better Turnaround Times for Priority Jobs
- Goal-Oriented Algorithms: Steering Requests According to (Yet to be Developed) Metrics
- Handling User-Grid Interactions: Guidelines; Agents
- Building Higher-Level Services, and an Integrated, Scalable User Environment for the Above
35 HENP Grid Architecture: Layers Above the Collective Layer
- Physicists' Application Codes
- Reconstruction, Calibration, Analysis
- Experiments' Software Framework Layer
- Modular and Grid-aware Architecture, able to interact effectively with the lower layers (above)
- Grid Applications Layer (Parameters and algorithms that govern system operations)
- Policy and priority metrics
- Workflow evaluation metrics
- Task-Site Coupling proximity metrics
- Global End-to-End System Services Layer
- Workflow monitoring and evaluation mechanisms
- Error recovery and long-term redirection mechanisms
- System self-monitoring, steering, evaluation and optimisation mechanisms
- Monitoring and Tracking of Component performance
36 Distributed System Services Architecture (DSSA): CIT/Romania/Pakistan
- Agents: Autonomous, Auto-discovering, self-organizing, collaborative, adaptive
- Station Servers (static) host mobile Dynamic Services
- Servers interconnect dynamically and form a robust fabric in which mobile agents travel, with a payload of (analysis) tasks (sketched below)
- Adaptable to Web services: JINI/JavaSpaces and WSDL/UDDI; OGSA Integration/Migration planned
- Adaptable to Ubiquitous Working Environments
Managing Global Data-Intensive Systems Requires a New Generation of Scalable, Intelligent Software Systems
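A toy sketch of the station-server/agent pattern described above, with hypothetical class and service names (the real DSSA work was Java/JINI-based; this only shows the shape of the idea):

    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class StationServer:
        """Static station server hosting dynamically registered services."""
        name: str
        services: Dict[str, Callable[[str], str]] = field(default_factory=dict)

        def register(self, service_name: str, handler: Callable[[str], str]) -> None:
            self.services[service_name] = handler

    @dataclass
    class Agent:
        """Mobile agent carrying a task payload across the station fabric."""
        payload: str

        def run(self, route: List[StationServer]) -> None:
            for station in route:                        # agent travels the fabric
                for name, handler in station.services.items():
                    print(station.name, name, handler(self.payload))

    caltech = StationServer("CIT")
    caltech.register("echo", lambda task: f"accepted {task}")
    Agent("analysis-task-001").run([caltech])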
37 MonALISA: A Globally Scalable Grid Monitoring System
- Deployed on the US CMS Test Grid: CERN, Bucharest, Taiwan, Pakistan
- Agent-based Dynamic information / resource discovery mechanism
- Talks with Other Monitoring Systems: MDS, Hawkeye
- Implemented in:
- Java/Jini with SNMP
- WSDL / SOAP with UDDI
- For a Global Grid Monitoring Service
38 MONARC SONN: 3 Regional Centres Learning to Export Jobs (Optimized by Day 9)
[Simulation snapshot: CERN (30 CPUs), Caltech (25 CPUs) and NUST (20 CPUs), linked at 1 MB/s (150 ms RTT), 1.2 MB/s (150 ms RTT) and 0.8 MB/s (200 ms RTT), with average efficiencies <E> of 0.73, 0.83 and 0.66]
Simulations for Strategy Development: Self-Learning Algorithms for Optimization are Key Elements of Globally Scalable Grids
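The SONN scheduler itself is not reproduced here; the sketch below is a much simpler illustration of the same idea: a centre keeps learned estimates of turnaround at the other centres and gradually routes exported jobs to whichever looks fastest. The CPU counts come from the slide; the toy load model and learning rule are invented.

    import random

    sites = {"CERN": 30, "CALTECH": 25, "NUST": 20}   # CPUs per centre (from slide)
    est_time = {s: 1.0 for s in sites}                # optimistic initial estimates
    alpha = 0.2                                       # learning rate

    def observed_turnaround(site):
        # Toy model: more CPUs -> faster turnaround, plus random load noise
        return 100.0 / sites[site] * random.uniform(0.8, 1.2)

    for job in range(200):
        target = min(est_time, key=est_time.get)      # export to best-looking site
        t = observed_turnaround(target)
        est_time[target] = (1 - alpha) * est_time[target] + alpha * t

    print({s: round(v, 2) for s, v in est_time.items()})
    # After a few hundred jobs the estimates settle and most exports go to the
    # centre with the most CPUs, loosely mirroring the "learning by Day 9" plot.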
39 Grids and Open Standards: the Move to OGSA
[Chart: increased functionality and standardization over time, from Custom solutions, to the Globus Toolkit (de facto standards: GGF GridFTP, GSI; X.509, LDAP, FTP), to the Open Grid Services Architecture (GGF OGSI, plus OASIS, W3C; multiple implementations, including the Globus Toolkit) built on Web services, with App-specific Services on top]
40 OGSA Example: Reliable File Transfer Service
- A standard substrate: the Grid service
- Standard interfaces and behaviors to address key distributed system issues
- Refactoring and extension of the Globus Toolkit protocol suite
[Diagram: multiple Clients issue requests to manage file transfer operations; the File Transfer service holds Internal State and drives the data transfer operations]
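A hypothetical sketch of the client/service interaction in the diagram (not the actual OGSA RFT interface): the client submits a transfer request, the service keeps internal state, and the client polls until the transfer is done. Class, method and state names, and the file URLs, are invented.

    import time

    class ReliableTransferService:
        """Stand-in for a Grid service that manages third-party file transfers."""
        def __init__(self):
            self._state = {}                      # the service's internal state

        def submit(self, src, dst):
            tid = len(self._state) + 1
            self._state[tid] = "Active"
            return tid

        def status(self, tid):
            self._state[tid] = "Done"             # a real service would track progress
            return self._state[tid]

    svc = ReliableTransferService()
    tid = svc.submit("gsiftp://site-a.example/run1.root",
                     "gsiftp://site-b.example/cache/run1.root")
    while svc.status(tid) != "Done":
        time.sleep(1)
    print("transfer", tid, "complete")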
41 2003 ITR Proposals: Globally Enabled Analysis Communities
- Develop and build Dynamic Workspaces
- Construct Autonomous Communities Operating Within Global Collaborations
- Build Private Grids to support scientific analysis communities
- e.g. Using Agent-Based Peer-to-peer Web Services
- Drive the democratization of science via the deployment of new technologies
- Empower small groups of scientists (Teachers and Students) to profit from and contribute to international big science
42 Private Grids and P2P Sub-Communities in Global CMS
43 GECSR
- Initial targets are the global HENP collaborations, but GECSR is expected to be widely applicable to other large-scale collaborative scientific endeavors
- Giving scientists from all world regions the means to function as full partners in the process of search and discovery
The importance of Collaboration Services is highlighted in the Cyberinfrastructure report of Atkins et al. (2003)
44 A Global Grid Enabled Collaboratory for Scientific Research (GECSR)
- A joint ITR proposal from:
- Caltech (HN PI, JB CoPI)
- Michigan (CoPI, CoPI)
- Maryland (CoPI)
- and Senior Personnel from:
- Lawrence Berkeley Lab
- Oklahoma
- Fermilab
- Arlington (U. Texas)
- Iowa
- Florida State
- The first Grid-enabled Collaboratory: Tight integration between
- Science of Collaboratories
- A Globally scalable working environment
- A Sophisticated Set of Collaborative Tools (VRVS, VNC; Next-Gen)
- An Agent-based monitoring and decision support system (MonALISA)
45 14,000 Hosts; 8,000 Registered Users in 64 Countries; 56 (7 I2) Reflectors; Annual Growth 2 to 3X
46 Building Petascale Global Grids: Implications for Society
- Meeting the challenges of Next Generation Petabyte-to-Exabyte Grids, and Terascale data transactions over Gigabit-to-Terabit Networks, will transform research in science and engineering
- These developments will create the first truly global virtual organizations (GVO)
- They could also be the model for the data-intensive business processes of future corporations
- If these developments are successful, this could lead to profound advances in industry, commerce and society at large
- By changing the relationship between people and persistent information in their daily lives
- Within the next five to ten years
- HENP is leading these developments, together with leading computing scientists
47 Networks, Grids, HENP and WAN-in-Lab
- The current generation of 2.5-10 Gbps network backbones arrived in the last 15 Months in the US, Europe and Japan
- Major transoceanic links also at 2.5-10 Gbps in 2003
- Capability Increased 4 Times, i.e. 2-3 Times Moore's Law
- Reliable, high End-to-end Performance of network applications (large file transfers; Grids) is required. Achieving this requires:
- A Deep understanding of Protocol Issues, for efficient use
- Getting high-performance (TCP) toolkits into users' hands
- End-to-end monitoring: a coherent approach
- Removing Regional and Last Mile Bottlenecks and Compromises in Network Quality: these are now On the critical path, in all regions
- HENP is working in Concert with Internet2, TERENA, AMPATH, APAN, DataTAG, the Grid projects and the Global Grid Forum to solve these problems
48 ICFA Standing Committee on Interregional Connectivity (SCIC)
- Created by ICFA in July 1998 in Vancouver, Following ICFA-NTF
- CHARGE:
- Make recommendations to ICFA concerning the connectivity between the Americas, Asia and Europe (and the network requirements of HENP)
- As part of the process of developing these recommendations, the committee should:
- Monitor traffic
- Keep track of technology developments
- Periodically review forecasts of future bandwidth needs, and
- Provide early warning of potential problems
- Create subcommittees when necessary to meet the charge
- Representatives: Major labs, ECFA, ACFA, NA Users, S. America
- The chair of the committee should report to ICFA once per year, at its joint meeting with laboratory directors (Feb. 2003)
49 SCIC in 2002-3: A Period of Intense Activity
- Formed WGs in March 2002; 9 Meetings in 12 Months
- Strong Focus on the Digital Divide
- Presentations at Meetings and Workshops (e.g. LISHEP, APAN, AMPATH, ICTP and ICFA Seminars)
- HENP more visible to governments in the WSIS Process
- Five Reports Presented to ICFA on Feb. 13, 2003; see http://cern.ch/icfa-scic
- Main Report: Networking for HENP - H. Newman et al.
- Monitoring WG Report - L. Cottrell
- Advanced Technologies WG Report - R. Hughes-Jones, O. Martin et al.
- Digital Divide Report - A. Santoro et al.
- Digital Divide in Russia Report - V. Ilyin
50 SCIC Work in 2003
- Continue Digital Divide Focus
- Improve and Systematize Information in Europe, in Cooperation with TERENA and SERENATE
- More in-depth information on Asia, with APAN
- More in-depth information on South America, with AMPATH
- Begin Work on Africa, with ICTP
- Set Up HENP Networks Web Site and Database
- Share Information on Problems, Pricing and Example Solutions
- Continue and, if Possible, Strengthen Monitoring Work (IEPM)
- Continue Work on Specific Improvements
- Brazil and So. America; Romania; Russia; India; Pakistan; China
- An ICFA-Sponsored Statement at the World Summit on the Information Society (12/03 in Geneva), prepared by SCIC and CERN
- Watch Requirements: the Lambda Grid and Analysis revolutions
- Discuss, and Begin to Create, a New Culture of Collaboration