1 LHC Experiments and the PACI: A Partnership for Global Data Analysis
- Harvey B. Newman, Caltech
- Advisory Panel on CyberInfrastructure
- National Science Foundation
- November 29, 2001
- http://l3www.cern.ch/newman/LHCGridsPACI.ppt
2 Global Data Grid Challenge
- Global scientific communities, served by
networks with bandwidths varying by orders of
magnitude, need to perform computationally
demanding analyses of geographically distributed
datasets that will grow by at least 3 orders of
magnitude over the next decade, from the 100
Terabyte to the 100 Petabyte scale from 2000 to
2007
3 The Large Hadron Collider (2006-)
- The Next-Generation Particle Collider
- The largest superconducting installation in the world
- Bunch-bunch collisions at 40 MHz, each generating 20 interactions
- Only one in a trillion may lead to a major physics discovery
- Real-time data filtering: Petabytes per second to Gigabytes per second (rough arithmetic below)
- Accumulated data of many Petabytes/Year
Large data samples explored and analyzed by
thousands of globally dispersed scientists, in
hundreds of teams
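To make the filtering numbers concrete, a rough back-of-the-envelope estimate is sketched below in Python. The 40 MHz crossing rate, the 20 interactions per crossing and the few hundred MBytes/sec recorded rate come from these slides; the ~1 MB of raw data per interaction is an illustrative assumption, not a figure from the talk.

```python
# Rough scale of the LHC real-time filtering problem. The ~1 MB of raw
# data per interaction is an assumed, illustrative figure; the rates are
# taken from the slides.
CROSSING_RATE_HZ = 40e6          # bunch-bunch collisions at 40 MHz
INTERACTIONS_PER_CROSSING = 20   # each crossing generates ~20 interactions
RAW_BYTES_PER_INTERACTION = 1e6  # assumption: ~1 MB of raw data per interaction

raw_rate = CROSSING_RATE_HZ * INTERACTIONS_PER_CROSSING * RAW_BYTES_PER_INTERACTION
stored_rate = 200e6              # ~100-400 MBytes/sec actually recorded

print(f"Raw data rate    : ~{raw_rate / 1e15:.1f} PB/s")
print(f"Recorded rate    : ~{stored_rate / 1e6:.0f} MB/s")
print(f"Reduction factor : ~{raw_rate / stored_rate:.0e}")
```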
4 Four LHC Experiments: The Petabyte to Exabyte Challenge
- ATLAS, CMS, ALICE, LHCb: Higgs and New Particles; Quark-Gluon Plasma; CP Violation
- Data stored: 40 Petabytes/Year and UP
- CPU: 0.30 Petaflops and UP
- 0.1 to 1 Exabyte (1 EB = 10^18 Bytes) (2007) (2012?) for the LHC Experiments
5 Evidence for the Higgs at LEP at M ~ 115 GeV; The LEP Program Has Now Ended
6 LHC: Higgs Decay into 4 Muons; 1000X the LEP Data Rate
10^9 events/sec, selectivity: 1 in 10^13 (1 person in a thousand world populations)
7 LHC Data Grid Hierarchy
CERN/Outside Resource Ratio ~1:2; Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1
- Online System: ~PByte/sec from the Experiment; 100-400 MBytes/sec to Tier 0
- Tier 0 (+1), CERN: 700k SI95, 1 PB Disk, Tape Robot, HPSS
- Tier 1 (2.5 Gbits/sec links): FNAL (200k SI95, 600 TB), IN2P3 Center, INFN Center, RAL Center
- Tier 2 (2.5 Gbps links)
- Tier 3: Institutes (0.25 TIPS each); physicists work on analysis channels; each institute has 10 physicists working on one or more channels
- Tier 4: Workstations and physics data caches (100-1000 Mbits/sec links)
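As a rough illustration of what these link speeds mean in practice, the sketch below (Python, with the nominal rates read off the hierarchy above and a purely illustrative 1 TB sample) estimates how long a dataset takes to move down one level of the hierarchy.

```python
# Nominal link rates between tiers, read off the hierarchy above, and the
# time a purely illustrative 1 TB sample would take at each rate.
LINK_RATE_BPS = {
    "Online System -> Tier 0 (CERN)": 400e6 * 8,  # 100-400 MBytes/sec (upper end)
    "Tier 0 -> Tier 1":               2.5e9,      # 2.5 Gbits/sec
    "Tier 1 -> Tier 2":               2.5e9,      # 2.5 Gbps
    "Tier 3 -> Tier 4 workstations":  1e9,        # 100-1000 Mbits/sec (upper end)
}

DATASET_BYTES = 1e12  # 1 TB, illustrative

for link, rate_bps in LINK_RATE_BPS.items():
    hours = DATASET_BYTES * 8 / rate_bps / 3600
    print(f"{link:32s}: {hours:4.1f} h per TB at the nominal rate")
```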
8 TeraGrid: NCSA, ANL, SDSC, Caltech
- A Preview of the Grid Hierarchy and Networks of the LHC Era
- StarLight Int'l Optical Peering Point (see www.startap.net)
(Network map: the DTF backplane (40 Gbps) links Pasadena (Caltech), San Diego (SDSC), Urbana (NCSA/UIUC) and the Chicago area (ANL, Starlight/NW Univ, UIC, Univ of Chicago, Ill Inst of Tech), with Abilene OC-48 (2.5 Gb/s) connectivity via Indianapolis (Abilene NOC), multiple 10 GbE links over Qwest and I-WIRE dark fiber, and multiple carrier hubs. Solid lines are in place and/or available in 2001; dashed I-WIRE lines are planned for Summer 2002.)
Source: Charlie Catlett, Argonne
9 Current Grid Challenges: Resource Discovery, Co-Scheduling, Transparency
- Discovery and Efficient Co-Scheduling of Computing, Data Handling, and Network Resources
- Effective, Consistent Replica Management
- Virtual Data: Recomputation Versus Data Transport Decisions (see the sketch after this list)
- Reduction of Complexity in a Petascale World
- GA3: Global Authentication, Authorization, Allocation
- VDT: Transparent Access to Results (and Data When Necessary)
- Location Independence of the User Analysis, Grid, and Grid-Development Environments
- Seamless Multi-Step Data Processing and Analysis: DAGMan (Wisc), MOP/IMPALA (FNAL)
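The recomputation-versus-transport decision referenced above is, at its core, a cost comparison. The toy rule below illustrates the idea; the cost model, function name and example numbers are illustrative, not the actual GriPhyN/Virtual Data Toolkit logic.

```python
# Toy "virtual data" decision: re-derive a data product locally or fetch a
# stored replica, whichever is estimated to be faster. Cost model, names and
# numbers are illustrative only.

def recompute_or_transfer(cpu_seconds_needed, free_local_cpus,
                          replica_size_bytes, path_bandwidth_bps):
    """Return 'recompute' or 'transfer' based on rough time estimates."""
    recompute_time = cpu_seconds_needed / max(free_local_cpus, 1)
    transfer_time = replica_size_bytes * 8 / path_bandwidth_bps
    return "recompute" if recompute_time < transfer_time else "transfer"

# Example: 50 CPU-hours of reprocessing on 20 free CPUs, versus pulling a
# 200 GB replica over a 155 Mbps transatlantic path.
print(recompute_or_transfer(cpu_seconds_needed=50 * 3600,
                            free_local_cpus=20,
                            replica_size_bytes=200e9,
                            path_bandwidth_bps=155e6))
```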
10 CMS Production: Event Simulation and Reconstruction
- Worldwide Production at 12 Sites, using common production tools (IMPALA) and GDMP
(Status table: Simulation and Digitization, with and without pile-up, at CERN, FNAL, Moscow, INFN, Caltech, UCSD, UFL, Imperial College, Bristol, Wisconsin, IN2P3 and Helsinki; sites range from fully operational through in progress to not yet operational, and the production is Grid-enabled and automated.)
11 US CMS TeraGrid Seamless Prototype
- Caltech/Wisconsin Condor/NCSA Production
- Simple Job Launch from Caltech
- Authentication Using Globus Security Infrastructure (GSI)
- Resources Identified Using Globus Information Infrastructure (GIS)
- CMSIM Jobs (Batches of 100, 12-14 Hours, 100 GB Output) Sent to the Wisconsin Condor Flock Using Condor-G (see the submit sketch below)
- Output Files Automatically Stored in NCSA Unitree (GridFTP)
- ORCA Phase: Read-in and Process Jobs at NCSA
- Output Files Automatically Stored in NCSA Unitree
- Future: Multiple CMS Sites; Storage Also in Caltech HPSS, Using GDMP (With LBNL's HRM)
- Animated Flow Diagram of the DTF Prototype: http://cmsdoc.cern.ch/wisniew/infrastructure.html
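For concreteness, a job launch of the kind described above can be sketched with a classic Condor-G "globus universe" submit description; the gatekeeper host, executable and file names below are placeholders rather than the prototype's actual configuration.

```python
# Sketch of a Condor-G job launch in the style described above. GSI
# authentication is assumed (a proxy created with grid-proxy-init); the
# gatekeeper host and file names are placeholders.
import subprocess

submit_description = """\
universe        = globus
globusscheduler = gatekeeper.example.wisc.edu/jobmanager-condor
executable      = run_cmsim.sh
output          = cmsim.out
error           = cmsim.err
log             = cmsim.log
queue
"""

with open("cmsim.sub", "w") as f:
    f.write(submit_description)

subprocess.run(["condor_submit", "cmsim.sub"], check=True)
# Output files would then be copied to mass storage (e.g. NCSA Unitree via
# GridFTP) as in the prototype.
```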
12 Baseline BW for the US-CERN Link: HENP Transatlantic WG (DOE+NSF)
Transoceanic Networking Integrated with the TeraGrid, Abilene, Regional Nets and Continental Network Infrastructures in US, Europe, Asia, South America
US-CERN Plans: 155 Mbps to 2 x 155 Mbps this Year; 622 Mbps in April 2002; DataTAG 2.5 Gbps Research Link in Summer 2002; 10 Gbps Research Link in 2003
13 Transatlantic Net WG (HN, L. Price): Bandwidth Requirements
Installed BW: a Maximum Link Occupancy of 50% is Assumed (worked example below)
The Network Challenge is Shared by Both Next- and Present-Generation Experiments
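The 50% maximum-occupancy assumption maps directly onto installed bandwidth: a link must be provisioned at roughly twice the sustained throughput it is expected to carry. A small worked example (with an illustrative traffic figure) follows.

```python
# Installed bandwidth implied by a sustained-throughput requirement, under
# the working group's assumption of at most 50% link occupancy.
MAX_OCCUPANCY = 0.5

def installed_bw_mbps(required_sustained_mbps):
    return required_sustained_mbps / MAX_OCCUPANCY

# Illustrative figure: sustaining ~300 Mbps of transatlantic traffic
# already calls for a ~622 Mbps (OC-12) installed link.
print(installed_bw_mbps(300))   # -> 600.0
```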
14 Internet2 HENP Networking WG Mission
- To help ensure that the required
  - National and international network infrastructures,
  - Standardized tools and facilities for high performance and end-to-end monitoring and tracking, and
  - Collaborative systems
  are developed and deployed in a timely manner, and used effectively to meet the needs of the US LHC and other major HENP Programs, as well as the general needs of our scientific community.
- To carry out these developments in a way that is broadly applicable across many fields, within and beyond the scientific community
- Co-Chairs: S. McKee (Michigan), H. Newman (Caltech); with thanks to R. Gardner and J. Williams (Indiana)
15 Grid R&D Focal Areas for NPACI/HENP Partnership
- Development of Grid-Enabled User Analysis Environments
  - CLARENS (IGUANA) Project for Portable Grid-Enabled Event Visualization, Data Processing and Analysis
  - Object Integration backed by an ORDBMS, and File-Level Virtual Data Catalogs
- Simulation Toolsets for Systems Modeling, Optimization
  - For example the MONARC System
- Globally Scalable Agent-Based Realtime Information Marshalling Systems
  - To face the next-generation challenge of Dynamic Global Grid design and operations
  - Self-learning (e.g. SONN) optimization
  - Simulation ("Now-Casting") enhanced to monitor, track and forward-predict site, network and global system state
- 1-10 Gbps Networking development and global deployment
  - Work with the TeraGrid, STARLIGHT, Abilene, the iVDGL GGGOC, HENP Internet2 WG, Internet2 E2E, and DataTAG
- Global Collaboratory Development, e.g. VRVS, Access Grid
16 CLARENS: a Data Analysis Portal to the Grid
Steenberg (Caltech)
- A highly functional graphical interface, Grid-enabling the working environment for non-specialist physicists' data analysis
- Clarens consists of a server communicating with various clients via the commodity XML-RPC protocol; this ensures implementation independence (a minimal client sketch follows below)
- The server is implemented in C++ to give access to the CMS OO analysis toolkit
- The server will provide a remote API to Grid tools:
  - Security services provided by the Grid (GSI)
  - The Virtual Data Toolkit: Object collection access
  - Data movement between Tier centers using GSI-FTP
  - CMS analysis software (ORCA/COBRA)
- Current prototype is running on the Caltech Proto-Tier2
- More information at http://heppc22.hep.caltech.edu, along with a web-based demo
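Since Clarens speaks commodity XML-RPC, any language with an XML-RPC library can act as a client. The sketch below uses Python's standard library against a placeholder server URL; apart from the standard introspection call, any analysis-specific method names are assumptions, since the remote API is defined by the Clarens server itself.

```python
# Hypothetical Clarens-style XML-RPC client using Python's standard library.
# The server URL is a placeholder; system.listMethods() is the standard
# XML-RPC introspection call, available if the server implements it.
import xmlrpc.client

server = xmlrpc.client.ServerProxy("http://clarens.example.org:8080/clarens")

print(server.system.listMethods())

# An analysis request would then be an ordinary remote call, e.g. (method
# name assumed, not part of any published Clarens API):
# histogram = server.analysis.plot("muon_pt", "pt > 10")
```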
17 Modeling and Simulation: the MONARC System
- Modelling and understanding current systems, their performance and limitations, is essential for the design of future large-scale distributed processing systems.
- The simulation program developed within the MONARC (Models Of Networked Analysis At Regional Centers) project is based on a process-oriented approach to discrete event simulation. It is based on Java(TM) technology and provides a realistic modelling tool for such large-scale distributed systems (a minimal sketch of the discrete-event approach is given below).
SIMULATION of Complex Distributed Systems
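To illustrate the discrete event simulation idea (generically, in event-scheduling style; the real MONARC tool is a process-oriented Java system), the short Python sketch below simulates jobs arriving at a regional centre with a fixed number of CPUs. All parameters are made up.

```python
# Minimal discrete-event simulation of jobs at a regional centre:
# an event queue ordered by time, with "arrival" and "done" events.
import heapq
import random

random.seed(1)
events = []                              # (time, kind) heap
busy_cpus, max_cpus, waiting = 0, 30, 0
t, completed = 0.0, 0

# Schedule 200 Poisson-like job arrivals (mean 5 time units apart).
arrival = 0.0
for _ in range(200):
    arrival += random.expovariate(1 / 5.0)
    heapq.heappush(events, (arrival, "arrival"))

while events:
    t, kind = heapq.heappop(events)
    if kind == "arrival":
        if busy_cpus < max_cpus:
            busy_cpus += 1
            heapq.heappush(events, (t + random.expovariate(1 / 120.0), "done"))
        else:
            waiting += 1
    else:                                # a job finished
        completed += 1
        if waiting:                      # start a queued job on the freed CPU
            waiting -= 1
            heapq.heappush(events, (t + random.expovariate(1 / 120.0), "done"))
        else:
            busy_cpus -= 1

print(f"completed {completed} jobs by t = {t:.0f}")
```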
18 MONARC SONN: 3 Regional Centres Learning to Export Jobs (Day 9)
(Simulation snapshot, Day 9: CERN (30 CPUs), CALTECH (25 CPUs) and NUST (20 CPUs), with mean efficiencies <E> = 0.73, 0.83 and 0.66, connected by links of 1 MB/s at 150 ms RTT, 1.2 MB/s at 150 ms RTT, and 0.8 MB/s at 200 ms RTT.)
19 Maximizing US-CERN TCP Throughput (S. Ravot, Caltech)
- TCP Protocol Study: Limits
  - We determined precisely the parameters which limit the throughput over a high-bandwidth, long-delay (170 msec) network (the window/RTT arithmetic is sketched below)
  - How to avoid intrinsic limits and unnecessary packet loss
- Methods Used to Improve TCP
  - Linux kernel programming in order to tune TCP parameters
  - We modified the TCP algorithm
  - A Linux patch will soon be available
- Result: the Current State of the Art for Reproducible Throughput
  - 125 Mbps between CERN and Caltech
  - 135 Mbps between CERN and Chicago
- Status: Ready for Tests at Higher BW (622 Mbps) in Spring 2002
Congestion window behavior of a TCP connection over the transatlantic line; reproducible 125 Mbps between CERN and Caltech/CACR
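The limit being tuned around is the classic window/RTT bound: a TCP connection cannot sustain more than (window size)/(round-trip time). The arithmetic below uses the 170 ms RTT and the 622 Mbps target from this slide; the 64 KB default window is a typical value assumed here for illustration.

```python
# TCP throughput is bounded by window / RTT, so the window must cover the
# bandwidth-delay product of the path.
RTT_S = 0.170                      # measured US-CERN round trip (this slide)
DEFAULT_WINDOW_BYTES = 64 * 1024   # typical default without window scaling (assumed)
TARGET_RATE_BPS = 622e6            # the 622 Mbps link planned for Spring 2002

default_throughput = DEFAULT_WINDOW_BYTES * 8 / RTT_S
required_window = TARGET_RATE_BPS * RTT_S / 8

print(f"Default 64 KB window over 170 ms RTT : ~{default_throughput / 1e6:.1f} Mbps ceiling")
print(f"Window needed for 622 Mbps over 170 ms: ~{required_window / 1e6:.1f} MB")
```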
20 Agent-Based Distributed System: JINI Prototype (Caltech/Pakistan)
- Includes Station Servers (static) that host mobile Dynamic Services
- Servers are interconnected dynamically to form a fabric in which mobile agents travel, with a payload of physics analysis tasks
- Prototype is highly flexible and robust against network outages
- Amenable to deployment on leading-edge and future portable devices (WAP, iAppliances, etc.)
- "The system for the travelling physicist"
- The Design and Studies with this prototype use the MONARC Simulator, and build on the SONN studies; see http://home.cern.ch/clegrand/lia/
21 Globally Scalable Monitoring Service
(Architecture diagram: Farm Monitors register with Lookup Services, which clients and other services use for discovery via a Proxy; monitoring data are gathered by push and pull methods (rsh, ssh, existing scripts, SNMP); the RC Monitor Service provides a component factory, GUI marshaling, code transport, and RMI data access.)
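As an illustration only (the actual prototype is Jini/RMI-based and written in Java), the register/discover/pull pattern in the diagram can be sketched as follows; all class and method names here are invented for the sketch.

```python
# Illustrative register/discover/pull monitoring pattern (not the actual
# Jini/RMI prototype): farm monitors register with a lookup service, and a
# client discovers them and pulls current values.
import random
import time

class FarmMonitor:
    def __init__(self, name):
        self.name = name
    def pull(self):
        # Stand-in for rsh/ssh/script/SNMP collection on a real farm.
        return {"cpu_load": round(random.random(), 2), "t": time.time()}

class LookupService:
    def __init__(self):
        self._services = {}
    def register(self, monitor):      # registration (via a proxy, in the real system)
        self._services[monitor.name] = monitor
    def discover(self, name):
        return self._services.get(name)

lookup = LookupService()
for farm in ("CERN", "FNAL", "Caltech"):
    lookup.register(FarmMonitor(farm))

client_view = {name: lookup.discover(name).pull() for name in ("CERN", "FNAL", "Caltech")}
print(client_view)
```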
22 Examples
- GLAST meeting: 10 participants connected via VRVS (and 16 participants in audio only)
- VRVS: 7300 Hosts, 4300 Registered Users in 58 Countries, 34 Reflectors (7 in I2), Annual Growth 250%
- US CMS will use the CDF/KEK remote control room concept for Fermilab Run II as a starting point. However, we will (1) expand the scope to encompass a US-based physics group and US LHC accelerator tasks, and (2) extend the concept to a Global Collaboratory for realtime data acquisition and analysis.
23 Next Round of Grid Challenges: Global Workflow Monitoring, Management, and Optimization
- Workflow Management: Balancing Policy Versus Moment-to-Moment Capability to Complete Tasks
  - Balance High Levels of Usage of Limited Resources Against Better Turnaround Times for Priority Jobs
  - Goal-Oriented, According to (Yet to be Developed) Metrics
- Maintaining a Global View of Resources and System State
  - Global System Monitoring, Modeling, Quasi-Realtime Simulation; Feedback on the Macro- and Micro-Scales
- Adaptive Learning: new paradigms for execution optimization and Decision Support (eventually automated)
- Grid-Enabled User Environments
24 PACI, TeraGrid and HENP
- The scale, complexity and global extent of the LHC Data Analysis problem are unprecedented
- The solution of the problem, using globally distributed Grids, is mission-critical for frontier science and engineering
- HENP has a tradition of deploying new, highly functional systems (and sometimes new technologies) to meet its technical and ultimately its scientific needs
- HENP problems are mostly "embarrassingly parallel", but potentially overwhelming in their data- and network-intensiveness
- HENP/Computer Science synergy has increased dramatically over the last two years, focused on Data Grids
  - Successful collaborations in GriPhyN, PPDG, EU Data Grid
- The TeraGrid (present and future) and its development program are scoped at an appropriate level of depth and diversity
  - to tackle the LHC and other Petascale problems, over a 5-year time span
  - matched to the LHC time schedule, with full operations in 2007
25 Some Extra Slides Follow
26 Computing Challenges: the LHC Example
- Geographical dispersion: of people and resources
- Complexity: the detector and the LHC environment
- Scale: Tens of Petabytes per year of data
5000 Physicists, 250 Institutes, 60 Countries
Major challenges associated with: communication and collaboration at a distance; network-distributed computing and data resources; remote software development and physics analysis; R&D: New Forms of Distributed Systems: Data Grids
27 Why Worldwide Computing? Regional Center Concept Goals
- Managed, fair-shared access for Physicists everywhere
- Maximize total funding resources while meeting the total computing and data handling needs
- Balance proximity of datasets to large central resources, against regional resources under more local control
  - Tier-N Model
- Efficient network use: higher throughput on short paths
  - Local > regional > national > international
- Utilizing all intellectual resources, in several time zones
  - CERN, national labs, universities, remote sites
  - Involving physicists and students at their home institutions
- Greater flexibility to pursue different physics interests, priorities, and resource allocation strategies by region
  - And/or by Common Interests (physics topics, subdetectors, ...)
- Manage the System's Complexity
  - Partitioning facility tasks, to manage and focus resources
28 HENP-Related Data Grid Projects
- Funded Projects
  - PPDG I: USA, DOE, $2M, 1999-2001
  - GriPhyN: USA, NSF, $11.9M + $1.6M, 2000-2005
  - EU DataGrid: EU, EC, 10M, 2001-2004
  - PPDG II (CP): USA, DOE, $9.5M, 2001-2004
  - iVDGL: USA, NSF, $13.7M + $2M, 2001-2006
  - DataTAG: EU, EC, 4M, 2002-2004
- About to be Funded
  - GridPP: UK, PPARC, >15M?, 2001-2004
- Many national projects of interest to HENP
  - Initiatives in US, UK, Italy, France, NL, Germany, Japan, ...
  - EU networking initiatives (Géant, SURFNet)
  - US Distributed Terascale Facility ($53M, 12 TFlops, 40 Gb/s network), in final stages of approval
29 Network Progress and Issues for Major Experiments
- Network backbones are advancing rapidly to the 10 Gbps range; Gbps end-to-end data flows will soon be in demand
- These advances are likely to have a profound impact on the major physics Experiments' Computing Models
- We need to work on the technical and political network issues
  - Share technical knowledge of TCP windows, multiple streams, OS kernel issues; provide a User Toolset
- Getting higher bandwidth to regions outside W. Europe and US: China, Russia, Pakistan, India, Brazil, Chile, Turkey, etc.
  - Even to enable their collaboration
- Advanced integrated applications, such as Data Grids, rely on seamless, transparent operation of our LANs and WANs
  - With reliable, quantifiable (monitored), high performance
- Networks need to become part of the Grid(s) design
- New paradigms of network and system monitoring and use need to be developed, in the Grid context
30 Grid-Related R&D Projects in CMS: Caltech, FNAL, UCSD, UWisc, UFl
- Installation, Configuration and Deployment of Prototype Tier2 Centers at Caltech/UCSD and Florida
- Large-Scale Automated Distributed Simulation Production
- DTF TeraGrid (Micro-)Prototype: CIT, Wisconsin Condor, NCSA
- Distributed Monte Carlo Production (MOP): FNAL
- MONARC Distributed Systems Modeling: Simulation system; applications to Grid Hierarchy management
  - Site configurations, analysis model, workload
  - Applications to strategy development, e.g. inter-site load balancing using a Self-Organizing Neural Net (SONN)
- Agent-based System Architecture for Distributed Dynamic Services
- Grid-Enabled Object-Oriented Data Analysis
31 MONARC Simulation System Validation
(Validation against the CMS Proto-Tier1 Production Farm at FNAL and the CMS Farm at CERN.)
32 MONARC SONN: 3 Regional Centres Learning to Export Jobs (Day 0)
(Simulation snapshot, Day 0: CERN (30 CPUs), CALTECH (25 CPUs) and NUST (20 CPUs), connected by links of 1 MB/s at 150 ms RTT, 1.2 MB/s at 150 ms RTT, and 0.8 MB/s at 200 ms RTT.)
33 US CMS Remote Control Room for LHC
34 Bandwidth-Greedy Grid-Enabled Object Collection Analysis for Particle Physics (SC2001 Demo)
Julian Bunn, Ian Fisk, Koen Holtman, Harvey Newman, James Patton
(Demo diagram: a Denver client holding a Tag database of 140,000 small objects issues requests to two Tier2 servers holding full event databases of 100,000 and 40,000 large objects; database files are returned over parallel tuned GSI FTP.)
The object of this demo is to show grid-supported
interactive physics analysis on a set of 144,000
physics events. Initially we start out with
144,000 small Tag objects, one for each event, on
the Denver client machine. We also have 144,000
LARGE objects, containing full event data,
divided over the two tier2 servers.
- Using the local Tag event database, the user plots event parameters of interest
- The user selects a subset of events to be fetched for further analysis
- Lists of matching events are sent to Caltech and San Diego
- The Tier2 servers begin sorting through their databases, extracting the required events
- For each required event, a new large virtual object is materialized in the server-side cache; this object contains all tracks in the event
- The database files containing the new objects are sent to the client using Globus FTP, and the client adds them to its local cache of large objects
- The user can now plot event parameters not available in the Tag
- Future requests take advantage of previously cached large objects in the client
http://pcbunn.cacr.caltech.edu/Tier2/Tier2_Overall_JJB.htm
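The client-side flow of the demo (select on the local Tag, fetch only the missing large objects, reuse the cache) can be sketched as below; the fetch function is a stand-in for the parallel tuned GSI FTP transfer, and all names and data are illustrative.

```python
# Sketch of the demo's client-side flow: select events from the local Tag
# database, then materialize large event objects, fetching from the Tier2
# servers only what is not already in the local cache. Names and the
# fetch_via_gsiftp() stand-in are illustrative, not the demo's actual code.

local_cache = {}   # event_id -> large event object (tracks etc.)

def fetch_via_gsiftp(server, event_ids):
    """Placeholder for the parallel tuned GSI FTP transfer of database files."""
    return {eid: {"server": server, "tracks": []} for eid in event_ids}

def analyze(tag_db, cut, tier2_servers):
    # 1. Plot/select on the small Tag objects held locally.
    selected = [eid for eid, tag in tag_db.items() if cut(tag)]
    # 2. Fetch only the events not already cached, split across the servers.
    missing = [eid for eid in selected if eid not in local_cache]
    for i, server in enumerate(tier2_servers):
        share = missing[i::len(tier2_servers)]
        local_cache.update(fetch_via_gsiftp(server, share))
    # 3. Full event data is now available locally for detailed plots.
    return [local_cache[eid] for eid in selected]

tag_db = {eid: {"pt": eid % 100} for eid in range(1000)}
events = analyze(tag_db, cut=lambda t: t["pt"] > 90, tier2_servers=["Caltech", "SanDiego"])
print(len(events), "events materialized;", len(local_cache), "cached")
```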