Globally Distributed Computing and Networking for Particle Physics Event Analysis

Transcript and Presenter's Notes



1
Globally Distributed Computing and Networking for
Particle Physics Event Analysis
  • Julian Bunn
  • California Institute of Technology
  • Center for Advanced Computing Research
  • Friday October 13th 2006

Work supported by NSF and DOE
2
CERN's Large Hadron Collider
5000 Physicists/Engineers, 300 Institutes, 70 Countries
Detectors around the ring:
  • ATLAS and CMS: pp, general purpose
  • ALICE: heavy ions
  • LHCb: B-physics
  • TOTEM
27 km tunnel under Switzerland and France
3
(No Transcript)
4
The Great Questions of Particle Physics and
Cosmology
  1. Where are the Higgs particles, and what is the
    mysterious Higgs field?
  2. Where does the pattern of particle families and
    masses come from?
  3. Why do neutrinos and quark flavours oscillate?
  4. Is there Supersymmetry?
  5. Does Dark Matter exist, and what is it?
  6. Why is gravity so weak?
  7. Why is any matter left in the universe?
  8. Are there extra space-time dimensions?
  9. What is the Dark Energy?

5
Higgs Discovery Reach: Many Modes
MH = 115-800 GeV covered with 10 fb-1 or less

CMS
6
(No Transcript)
7
Closing CMS for the first time (July 2006)
8
Complexity of LHC Events: Higgs decay
9
Creation of Black Holes (Don't worry)
10
Magnet Test and Cosmic Challenge
  • A first test of major parts of CMS in situ:
    hardware and software
  • Magnet commissioning and field mapping
  • Cryogenics, electrical and control systems
  • Field map with 40 Gauss precision in the Tracker
  • Field ramped to 4 Tesla
  • HCAL performance in magnetic field
  • Muons
  • Alignment with cosmic rays
  • Crossing between endcaps and barrel
  • Distortions as the field is increased
  • Reproducibility after endcap removal
  • Data Acquisition System (DAQ)
  • Event building
  • Databases and networks
  • Integration with Run Control, L1 Trigger etc.
  • Reconstruction and Visualization Software

Figure: cosmic ray bending in the B-field, comparing real data and simulation
11
Run 2605 / Event 3981 / B = 3.8 T / 27.08.06
CERN PRESS RELEASE, 13 September 2006: Mammoth CMS
magnet reaches full field at CERN. Tests show the CMS
detector will be ready for data. All major
subsystems operated together for the first time,
with 90% overall running efficiency.
12
Distributed Computing Architecture and Infrastructure
13
LHC Data Grid Hierarchy
  • Online System: custom fast electronics, pipelined, buffered, FPGAs;
    ~PByte/sec off the Experiment
  • Tier 0 + 1 (CERN Center): PBs of disk and a tape robot;
    150-1500 MBytes/sec from the Online System
  • Tier 1 (>10 centers, e.g. FNAL, IN2P3, INFN, RAL): 10-40 Gbps links from CERN
  • Tier 2 (100 centers): 10 Gbps links from Tier 1
  • Tier 3 (institutes, with physics data caches): 1-10 Gbps links; physicists
    work on analysis channels, each institute having ~10 physicists working on
    one or more channels
  • Tier 4 (workstations/laptops): 1 to 10 Gbps
  • Tens of Petabytes by 2008, an Exabyte 5-7 years later; 100 Gbps data networks
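To get a feel for these rates and volumes, here is a rough, illustrative calculation of my own (decimal units, ignoring protocol overhead) of how long a single 10 Gbps tier link needs to move one Petabyte:

    # Time to move 1 PByte over a single 10 Gbps link running flat out
    # (decimal units, ignoring protocol overhead)
    petabyte_bits = 1e15 * 8
    link_bps = 10e9
    days = petabyte_bits / link_bps / 86400
    print(f"{days:.1f} days")   # ~9.3 days per PByte per 10 Gbps link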
14
Tier2 Cluster
  • dCache servers
  • 40 Woodcrest nodes ordered
  • 30 1U Dual-Core Opteron nodes
  • Interactive Analysis Cluster
  • Backup server
  • Force10 UltraLight connection
  • 1U Dual Xeon cluster nodes
  • 2 Foundry switches for the cluster network
15
LHC Tier2 Performance Milestones
  • 1.5 TByte/day upload to the FNAL Tier1
  • 10 TByte/day download from the FNAL Tier1
    (> 120 MByte/sec; a quick check follows below)
  • Install 40 Woodcrest nodes in the Fall to bring CPU
    capacity to 500,000 SI2k
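The parenthetical rate in the download milestone follows directly from the daily volume; a minimal check in Python, assuming binary TBytes (1 TByte = 2**40 bytes):

    # Sustained rate implied by the 10 TByte/day download milestone
    daily_bytes = 10 * 2**40                 # assuming binary TBytes
    rate = daily_bytes / 86400 / 2**20       # MBytes per second
    print(f"{rate:.0f} MByte/sec")           # ~121 MByte/sec, consistent with "> 120"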

16
LHC Data Analysis: Essential Components
  • Data Processing: all data needs to be
    reconstructed, first into fundamental components
    like tracks and energy deposition, and then into
    physics objects like electrons, muons, hadrons,
    neutrinos, etc.
  • Raw -> Reconstructed -> Summarized (a toy sketch
    of this chain follows the list)
  • Simulation follows the same path, and is critical to
    understanding the detectors and the underlying physics.
  • Data Discovery: we must be able to locate events
    of interest (Databases)
  • Data Movement: we must be able to move discovered
    data as needed for analysis or reprocessing
    (Networks)
  • Data Analysis: we must be able to apply our
    analysis to the reconstructed data
  • Collaborative Tools: vital to sustaining global
    collaborations
  • Policy and Resource Management: we must be able
    to share, manage and prioritise in a resource-
    scarce environment
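The Raw -> Reconstructed -> Summarized chain can be pictured as a two-pass transformation; the toy Python sketch below is purely illustrative (the record contents and function names are invented for this example, not CMS software):

    # Toy two-pass chain: raw detector data -> reconstructed quantities
    # -> summarized physics objects (illustrative only; real reconstruction
    # runs in experiment frameworks such as CMSSW)
    def reconstruct(raw_event):
        # first pass: raw signals -> tracks and energy deposition
        return {"tracks": raw_event["hits"], "energy": sum(raw_event["calo"])}

    def summarize(reco_event):
        # second pass: reconstructed quantities -> compact analysis objects
        return {"n_tracks": len(reco_event["tracks"]),
                "total_energy": reco_event["energy"]}

    raw_events = [{"hits": [101, 102, 103], "calo": [12.5, 7.5]}]
    print([summarize(reconstruct(ev)) for ev in raw_events])
    # [{'n_tracks': 3, 'total_energy': 20.0}]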

17
Networking for Physics
18
NSF's UltraLight Project
Led by Caltech, with UF, FIU, UMich, SLAC, FNAL, MIT,
CERN, UERJ (Rio), USP, NLR, CENIC, UCAID, TransLight,
UKLight, NetherLight, UvA, UCLondon, KEK,
Taiwan, Kreonet, Cisco
  • Delivering the network as an integrated, managed
    resource: a hybrid (packet-switched plus dynamic
    optical paths) experimental network
  • Leveraging transatlantic R&D network
    partnerships
  • 10 GbE across the US and the Atlantic with FNAL,
    BNL, UltraScience Net, ESnet, I2, NLR, CERN,
    CHEPREO, WHREN/LILA, TransLight, NetherLight,
    GLORIAD etc.; extensions to Korea, Brazil,
    India, Japan and Taiwan
  • End-to-end monitoring, dynamic tracking,
    optimization and bandwidth provisioning
  • Working closely with the US LHCNet, ESnet, FNAL
    and BNL teams to provide high-performance, robust,
    scalable production services to the US LHC
    Program

19
HENP Bandwidth Roadmap for Major Links (in Gbps)
1000 times bandwidth growth per decade,
paralleled by the ESnet roadmap for data-intensive
sciences
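As a quick check of my own, "1000 times per decade" corresponds to roughly a doubling of link bandwidth every year:

    # 1000x per decade expressed as an annual growth factor
    annual_growth = 1000 ** (1 / 10)
    print(round(annual_growth, 2))   # ~2.0 (1.995), i.e. bandwidth doubles yearly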
20
Demonstrations of high-performance physics
dataset transfers: SC2005 Bandwidth Challenge
151 Gb/s peak, > 100 Gb/s sustained for hours
  • 22 10G waves
  • 64 10G switch ports in 2 fully populated Cisco
    6509Es
  • 43 Neterion 10G NICs
  • 70 nodes with 280 cores
  • 200 SATA disks
  • 40 Gbps -> StorCloud
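For context, the 151 Gb/s peak can be set against the nominal capacity of the waves listed above; this rough estimate is my own and assumes all 22 waves were 10 Gb/s and available simultaneously:

    # Peak throughput relative to the nominal capacity of the provisioned waves
    waves, per_wave_gbps, peak_gbps = 22, 10, 151
    capacity_gbps = waves * per_wave_gbps                  # 220 Gb/s nominal
    print(f"{peak_gbps / capacity_gbps:.0%} of nominal")   # ~69%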

21
Data Analysis on the Grid
22
Grid Analysis Environment (GAE)
  • Physics Analysis using the Grid
  • (Grid = Distributed Computing with Certificates)
  • The Acid Test for Grids; crucial for the LHC
    experiments
  • Large, diverse, distributed community of users
  • Support for 100s to 1000s of analysis and
    production tasks, shared among dozens of sites
  • Operates in a resource-limited and
    policy-constrained environment
  • Widely varying task requirements and priorities;
    time-dependent workloads and task deadlines
  • Dominated by collaboration policy and strategy
    (resource usage and priorities)
  • Need for priority schemes, robust authentication
    and security
  • Requires a scalable global system with real-time
    monitoring, task and workflow tracking
  • Decisions often require an end-to-end system view

23
GAE Architecture
ROOT, CMSSW, IGUANA, IE, Firefox
  • Analysis Clients talk standard protocols to the
    Clarens data/services Portal.
  • Simple Web service API allows simple or complex
    clients to benefit from this architecture.
  • Typical clients: ROOT, Web Browser, CMSSW
  • The Clarens Portal hides the complexity of the
    Grid
  • Key features: Global Scheduler, Catalogs,
    Monitoring, and Grid-wide Execution service.

Architecture diagram: Analysis Clients (ROOT, other applications, web browsers)
speak HTTP, SOAP or XML-RPC to the Clarens Grid Services Web Server. Behind
Clarens sit the Scheduler, the Catalogs (Metadata, Virtual Data, Replica), the
Fully-Abstract, Partially-Abstract and Fully-Concrete Planners, Data Management,
MonALISA Monitoring, the Grid Execution Priority Manager, and the Grid-Wide
Execution Service (BOSS, Condor).
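Because Clarens exposes its services over standard protocols such as XML-RPC, an analysis client can be very small. The sketch below uses Python's standard xmlrpc.client; the portal URL and the catalog/scheduler method names are illustrative assumptions, not the actual Clarens API:

    # Minimal sketch of an analysis client talking XML-RPC to a Clarens-style
    # data/services portal (endpoint and method names are hypothetical)
    import xmlrpc.client

    portal = xmlrpc.client.ServerProxy("https://tier2.example.org/clarens/xmlrpc")

    files = portal.catalog.find("dataset=/Higgs/HZZ4l/RECO")     # hypothetical catalog call
    job_id = portal.scheduler.submit({"exe": "analyse.py", "inputs": files})
    print("submitted job", job_id)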
24
Clarens (Python and Java)
Architecture diagram: HTTP clients reach Clarens across the (WAN) network via two
implementations. (P)Clarens runs inside the Apache Web Server through MOD_PYTHON,
with an XML-RPC engine handling XML-RPC and GET requests; JClarens runs inside
the Tomcat Web Server, with an /xmlrpc servlet and AXIS handling SOAP. Both
provide the same core services and utilities: Service Management, Databases,
Remote File Access, PKI Security, VO Management, Configuration, Discovery and
Process Management.
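The server side follows the same web-service pattern. As a minimal illustration only (the real (P)Clarens ran inside Apache via MOD_PYTHON, and the service name below is hypothetical), Python's standard library can host an XML-RPC service like this:

    # Toy XML-RPC service host registering a "remote file access"-style call
    import os
    from xmlrpc.server import SimpleXMLRPCServer

    def file_ls(path):
        """Hypothetical remote-file-access service: list a server-side directory."""
        return os.listdir(path)

    server = SimpleXMLRPCServer(("0.0.0.0", 8080), allow_none=True)
    server.register_function(file_ls, "file.ls")
    server.register_introspection_functions()   # lets clients discover methods
    server.serve_forever()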
25
HotGrid Science Portals (NVO and LHC)
From this: the "power user" route to big-iron computing. Write a proposal,
learn Globus, learn MPI, learn PBS, port code to Itanium, get a certificate,
get logged in, wait 3 months for an account... and now do some science.
To this: graduated access. A web form gets you some science, a HotGrid
certificate more science, and a strong certificate or your own account full use.
  • Graduated Access
  • Anonymous
  • HotGrid
  • Strong
  • Own Account
The user fills in a form with the HotGrid CA (name, address, a short project
description and a passphrase), obtains a HotGrid certificate, the certificate
is installed on TG, and the Gateway can be used immediately, with restricted
time, resources, etc.
26
Analysis using Grid Portals
  • Community portals offer convenient interfaces to
    specialised services
  • The Clarens Pythia Portal allows particle
    physicists to use the Pythia event-generation
    software.
  • No local software is needed, and powerful backend
    computers can handle lengthy simulations.

Portal features shown: Grid Certificate, Remote File Access, Remote Execution
27
NVO NESSSI Portal
Large Clusters and Supercomputers
28
Collaboration Systems
29
The EVO/VRVS Collaboration System
Caltech
  • On-demand audio/video conferencing with shared
    desktops etc.
  • VRVS is in daily use by many communities in particle
    physics and other sciences.
  • The new system, EVO, is fully distributed and
    supports many different clients, protocols and
    devices.

Map of participating sites: Manchester, Tokyo, CERN, Rio, FIU, Pakistan,
Stanford, Ottawa
30
Monitoring and Control of the Global System
31
MonALISA Monitoring during Bandwidth Challenge
Monitoring NLR, Abilene/HOPI, LHCNet, USNet,
TeraGrid, PWave, SCInet, Gloriad, JGN2, WHREN,
other international R&E networks, and 14,000 Grid
nodes at 250 sites
32
MonALISA What is it?
  • A set of globally distributed agents and servers
    that provide real time monitoring and decision
    support for OSG, EVO/VRVS and an increasing
    number of other communities.
  • Design and development began at Caltech in 1999,
    based on studies of distributed computing
    configurations (MONARC project)
  • MonALISA hosts a variety of services on each of
    its server instances
  • Each server processes 5000 messages/sec
  • System is scalable and robust (no single point of
    failure)
  • All servers around the world can be addressed in
    less than one second
  • Primarily used for monitoring Grids, and in
    particular
  • Compute clusters, individual nodes (load, memory,
    faults etc.)
  • Local and wide area networks (router counters and
    information via SNMP)
  • Application level (via counters and callouts in
    end user applications)
  • Increasingly used for automatic decision making
    and support
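As a purely illustrative sketch (not the MonALISA API), the node-level metric publishing described above can be pictured as a small agent pushing samples to a collector; the collector address is hypothetical:

    # Tiny monitoring agent: sample host load and push it over UDP
    # (illustration only; Unix-only os.getloadavg, hypothetical collector)
    import json, os, socket, time

    COLLECTOR = ("monitor.example.org", 8884)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    for _ in range(5):                           # a few samples for illustration
        load1, _, _ = os.getloadavg()
        metric = {"node": socket.gethostname(), "load1": load1, "ts": time.time()}
        sock.sendto(json.dumps(metric).encode(), COLLECTOR)
        time.sleep(1)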

33
MonALISA Architecture
Layered architecture (top to bottom):
  • Regional or global high-level services, repositories and clients (HL services)
  • Proxies: secure and reliable communication, dynamic load balancing,
    scalability and replication, AAA for clients
  • MonALISA services (agents): a distributed system for gathering and analyzing
    information based on mobile agents, with customized aggregation, triggers
    and actions
  • Network of JINI-Lookup Services (secure and public): distributed dynamic
    registration and discovery, based on a lease mechanism and robust
    event propagation
34
Dynamic Network Path Allocation and Automated Dataset Transfer
Diagram: a user issues ">bbcopy A/fileX B/path/"; the system reports the optical
path as available, configures the interfaces and starts the data transfer, with
real-time monitoring throughout. It detects errors and automatically recreates
the path in less than the TCP timeout (< 1 second). Components shown: the
Internet (regular IP path), the MonALISA Distributed Service System
(monitor and control), the application and data on end hosts A and B, an OS
agent, LISA agents that set up the network interfaces, TCP stack, kernel
parameters and routes (e.g. telling the application to use eth1.2), and a
TL1-controlled optical switch providing the active light path.
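The control flow in the diagram can be summarised in Python; the helper functions below are stand-in stubs for the MonALISA/LISA agents and the TL1-driven optical switch, not the real bbcopy implementation:

    # Stub-based sketch of automated transfer over a dynamically allocated path
    def request_optical_path(src, dst):
        # stand-in: agents would negotiate a light path and interface here
        return {"src": src, "dst": dst, "iface": "eth1.2"}

    def configure_host_interfaces(path):
        # stand-in: LISA tunes NICs, TCP stack, kernel parameters, routes
        print("application told to use", path["iface"])

    def copy_dataset(src, dst, path):
        print(f"copying {src} -> {dst} over {path['iface']}")

    src, dst = "A/fileX", "B/path/"
    path = request_optical_path(src, dst)
    configure_host_interfaces(path)
    try:
        copy_dataset(src, dst, path)
    except ConnectionError:
        # on error, recreate the path in less than the TCP timeout (< 1 second)
        path = request_optical_path(src, dst)
        configure_host_interfaces(path)
        copy_dataset(src, dst, path)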
35
The Last Slide: Distributed Computing and
Networking for Particle Physics
  • Data Processing: the LHC tiered hierarchy is well
    tested and scaled for LHC data rates. Ten years
    ago the demands seemed "off the wall", but now
    they are more mundane.
  • Data Discovery: event catalogues and globally
    distributed stores are working, but we're not
    quite at the easy-to-use stage.
  • Data Movement: once we identify the datasets, we
    have no problem moving them in the WAN at line
    rates.
  • Data Analysis: we know what we want to analyse
    (events), and the methods are not much different
    from what we did in previous experiments, but it
    is the scale that's different. The Acid Test,
    Grid-Enabled Analysis, is in progress.
  • Collaborative Tools: in daily use.
  • Policy and Resource Management: a nascent field.
    Lots of progress with authentication,
    certificates and all that palaver, but little
    idea of, e.g., how to set and apply policies to
    grid resources.
  • Exciting times ahead as we put all this to the
    real test!