Transcript and Presenter's Notes

Title: High Energy Physics and Data Grids


1
  • High Energy Physics and Data Grids

Paul Avery, University of Florida
http://www.phys.ufl.edu/avery/   avery@phys.ufl.edu
US/UK Grid Workshop, San Francisco, August 4-5, 2001
2
Essentials of High Energy Physics
  • Better name → Elementary Particle Physics
  • Science: Elementary particles, fundamental forces

Particles: leptons, quarks
Forces: strong → gluon; electroweak → γ, W±, Z0; gravity → graviton
  • Goal → a unified theory of nature
  • Unification of forces (Higgs, superstrings, extra dimensions, ...)
  • Deep connections to large scale structure of
    universe
  • Large overlap with astrophysics, cosmology,
    nuclear physics

3
HEP Short History & Frontiers
Distance, energy and cosmological time scales at each frontier:
  • 1900...: Quantum Mechanics, Atomic physics (10^-10 m, ~10 eV, >300,000 y)
  • 1940-50: Quantum Electrodynamics
  • 1950-65: Nuclei, Hadrons; Symmetries, Field theories (10^-15 m, MeV - GeV, ~3 min)
  • 1965-75: Quarks, Gauge theories (10^-16 m, >>GeV, ~10^-6 sec)
  • 1970-83 (SPS): Electroweak unification, QCD (10^-18 m, ~100 GeV, ~10^-10 sec)
  • 1990 (LEP): 3 families, Precision Electroweak
  • 1994 (Tevatron): Top quark, origin of masses (10^-19 m, ~10^2 GeV, ~10^-12 sec)
  • 2007 (LHC): Higgs? Supersymmetry?
  • The next step: Grand Unified Theories? Proton decay? (underground experiments; 10^-32 m, ~10^16 GeV, ~10^-32 sec)
  • ??: Quantum Gravity? Superstrings? The origin of the Universe (Planck scale: 10^-35 m, ~10^19 GeV, ~10^-43 sec)
4
HEP Research
  • Experiments are primarily accelerator-based
  • Fixed target, colliding beams, special beams
  • Detectors
  • Small, large, general purpose, special purpose
  • ...but a wide variety of other techniques
  • Cosmic rays, proton decay, g-2, neutrinos, space missions
  • Increasing scale of experiments and laboratories
  • Forced on us by ever higher energies
  • Complexity, scale, costs → large collaborations
  • International collaborations are the norm today
  • Global collaborations are the future (LHC)

LHC discussed in next few slides
5
The CMS Collaboration
Number of Laboratories
  Member States: 58
  Non-Member States: 50
  USA: 36
  Total: 144
Number of Scientists
  Member States: 1010
  Non-Member States: 448
  USA: 351
  Total: 1809
1809 physicists and engineers, 31 countries, 144 institutions
6
CERN LHC site
(Aerial view of the LHC ring, with the four experiments: ATLAS, CMS, ALICE, LHCb)
7
High Energy Physics at the LHC
Compact Muon Solenoid at the LHC (CERN)
Smithsonian standard man (for scale)
8
Collisions at LHC (2007??)
Proton-Proton collisions
  Bunches/beam: 2835
  Protons/bunch: 10^11
  Beam energy: 7 TeV (7 x 10^12 eV)
  Luminosity: 10^34 cm^-2 s^-1
  Bunch crossing rate: 40 MHz (every 25 nsec)
  Collision rate: ~10^9 Hz (average ~20 collisions/crossing)
  New physics rate: ~10^-5 Hz
  Selection: 1 in 10^13
(Diagram: protons collide; partons (quarks, gluons) interact and produce particles)
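To make the rate arithmetic concrete, a small back-of-the-envelope check (a sketch using only the numbers quoted above) shows how the 40 MHz crossing rate and ~20 collisions per crossing give the ~10^9 Hz collision rate, and just how rare the new-physics events are.

```python
# Back-of-the-envelope rates, using only the numbers quoted on this slide.
crossing_rate_hz = 40e6          # 40 MHz bunch-crossing rate (every 25 ns)
collisions_per_crossing = 20     # average overlapping pp collisions per crossing
new_physics_rate_hz = 1e-5       # quoted rate of "new physics" events

collision_rate_hz = crossing_rate_hz * collisions_per_crossing
print(f"pp collision rate ~ {collision_rate_hz:.0e} Hz")   # ~10^9 Hz, as quoted

# Relative to the crossing rate the selectivity is ~1 in 4x10^12, of the same
# order as the "1 in 10^13" quoted above; relative to individual collisions it
# is roughly another order of magnitude harder.
print(f"1 new-physics event per {crossing_rate_hz / new_physics_rate_hz:.0e} crossings")
print(f"1 new-physics event per {collision_rate_hz / new_physics_rate_hz:.0e} collisions")
```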
9
HEP Data
  • Scattering is the principal technique for gathering data
  • Collisions of beam-beam or beam-target particles
  • Typically caused by a single elementary interaction
  • But also background collisions → obscures physics
  • Each collision generates many particles (an "Event")
  • Particles traverse the detector, leaving an electronic signature
  • Information is collected and put into mass storage (tape)
  • Each event is independent → trivial computational parallelism
  • Data Intensive Science
  • Size of raw event record: 20 KB - 1 MB
  • 10^6 - 10^9 events per year
  • 0.3 PB per year (2001): BaBar (SLAC)
  • 1 PB per year (2005): CDF, D0 (Fermilab)
  • 5 PB per year (2007): ATLAS, CMS (LHC)

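The petabyte-per-year figures follow directly from event counts and event sizes; a small illustrative calculation using the numbers above:

```python
# Rough yearly data volume: (events per year) x (raw event size).
events_per_year = 1e9          # upper end of the 10^6 - 10^9 range quoted above
event_size_bytes = 1e6         # ~1 MB raw event record (upper end of 20 KB - 1 MB)

raw_volume_pb = events_per_year * event_size_bytes / 1e15
print(f"raw data volume ~ {raw_volume_pb:.1f} PB/year")
# ~1 PB/year, the same order as the per-experiment totals quoted above.
```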
10
Data Rates From Detector to Storage
Physics filtering chain:
  40 MHz (1000 TB/sec) from the detector
  Level 1 Trigger (special hardware)
  75 KHz (75 GB/sec)
  Level 2 Trigger (commodity CPUs)
  5 KHz (5 GB/sec)
  Level 3 Trigger (commodity CPUs)
  100 Hz (100 MB/sec) raw data to storage
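Dividing each bandwidth by the corresponding rate is a useful sanity check: after Level 1 the event record is about 1 MB, consistent with the HEP Data slide. A minimal sketch of that check:

```python
# Bandwidth / rate = effective event size at each stage of the trigger chain.
stages = [
    ("Detector output", 40e6, 1000e12),   # 40 MHz, 1000 TB/s
    ("After Level 1",   75e3, 75e9),      # 75 kHz, 75 GB/s
    ("After Level 2",   5e3,  5e9),       # 5 kHz,  5 GB/s
    ("After Level 3",   100,  100e6),     # 100 Hz, 100 MB/s to storage
]

for name, rate_hz, bytes_per_sec in stages:
    size_mb = bytes_per_sec / rate_hz / 1e6
    print(f"{name:16s} {size_mb:8.1f} MB/event")
```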
11
LHC Data Complexity
  • Events resulting from beam-beam collisions
  • Signal event is obscured by ~20 overlapping, uninteresting collisions in the same crossing
  • CPU time does not scale from previous generations

(Comparison figure: 2000 vs. 2007)
12
Example Higgs Decay into 4 Muons
40M events/sec, selectivity: 1 in 10^13
13
LHC Computing Challenges
  • Complexity of LHC environment and resulting data
  • Scale: Petabytes of data per year (100 PB by 2010); millions of SpecInt95s of CPU
  • Geographical distribution of people and resources

1800 physicists, 150 institutes, 32 countries
14
Transatlantic Net WG (HN, L. Price): Tier0 - Tier1 BW Requirements

Installed BW in Mbps. Maximum link occupancy 50%; work in progress.
15
Hoffmann LHC Computing Report 2001
  • Tier0 - Tier1 link requirements
  • (1) Tier1 → Tier0 data flow for analysis: 0.5 - 1.0 Gbps
  • (2) Tier2 → Tier0 data flow for analysis: 0.2 - 0.5 Gbps
  • (3) Interactive collaborative sessions (30 peak): 0.1 - 0.3 Gbps
  • (4) Remote interactive sessions (30 flows peak): 0.1 - 0.2 Gbps
  • (5) Individual (Tier3 or Tier4) data transfers: 0.8 Gbps (limit to 10 flows of 5 Mbytes/sec each)
  • TOTAL per Tier0 - Tier1 link: 1.7 - 2.8 Gbps (summed in the sketch below)
  • Corresponds to 10 Gbps baseline BW installed on the US-CERN link
  • Adopted by the LHC experiments (Steering Committee Report)

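The quoted total is just the sum of the five component ranges; the short check below reproduces the 1.7 - 2.8 Gbps figure from the numbers listed above.

```python
# Sum the per-link bandwidth components quoted above (low, high) in Gbps.
components_gbps = [
    (0.5, 1.0),   # (1) Tier1 -> Tier0 data flow for analysis
    (0.2, 0.5),   # (2) Tier2 -> Tier0 data flow for analysis
    (0.1, 0.3),   # (3) interactive collaborative sessions
    (0.1, 0.2),   # (4) remote interactive sessions
    (0.8, 0.8),   # (5) individual Tier3/Tier4 data transfers
]

low = sum(lo for lo, hi in components_gbps)
high = sum(hi for lo, hi in components_gbps)
print(f"total per Tier0-Tier1 link: {low:.1f} - {high:.1f} Gbps")   # 1.7 - 2.8 Gbps
```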
16
LHC Computing Challenges
  • Major challenges associated with
  • Scale of computing systems
  • Network-distribution of computing and data
    resources
  • Communication and collaboration at a distance
  • Remote software development and physics analysis

Result of these considerations: Data Grids
17
Global LHC Data Grid Hierarchy
Tier0: CERN
Tier1: National Lab
Tier2: Regional Center (University, etc.)
Tier3: University workgroup
Tier4: Workstation
  • Key ideas
  • Hierarchical structure
  • Tier2 centers
  • Operate as unified Grid

18
Example CMS Data Grid
CERN/Outside Resource Ratio ~1:2
Tier0 : (Σ Tier1) : (Σ Tier2) ~ 1:1:1

Experiment → Online System: ~PBytes/sec
Online System → Tier 0+1 (CERN Computer Center, >20 TIPS, HPSS): 100 MBytes/sec
  (Bunch crossing every 25 nsec; ~100 triggers per second; each event ~1 MByte)
Tier 0+1 → Tier 1 centers (France, Italy, UK, USA): 2.5 Gbits/sec
Tier 1 → Tier 2 centers: 2.5 Gbits/sec
Tier 2 → Tier 3 (institutes, ~0.25 TIPS each, physics data cache): 622 Mbits/sec
Tier 3 → Tier 4 (workstations, other portals): 100 - 1000 Mbits/sec
Physicists work on analysis channels; each institute has ~10 physicists working on one or more channels
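To get a feel for these link speeds, the sketch below estimates a bulk transfer time under stated assumptions: one day of raw data at 100 MBytes/sec moved over a single 2.5 Gbps Tier0 - Tier1 link at the ~50% occupancy used as a working figure earlier.

```python
# Illustrative transfer-time estimate for a Tier0 -> Tier1 link (assumed numbers).
raw_rate_bytes_per_s = 100e6                 # 100 MBytes/sec raw data to storage
seconds_per_day = 86400
daily_volume_bytes = raw_rate_bytes_per_s * seconds_per_day   # ~8.6 TB/day

link_gbps = 2.5                              # Tier0 - Tier1 link speed from the diagram
occupancy = 0.5                              # ~50% maximum link occupancy (working figure)
usable_bytes_per_s = link_gbps * 1e9 / 8 * occupancy

hours = daily_volume_bytes / usable_bytes_per_s / 3600
print(f"one day of raw data (~{daily_volume_bytes/1e12:.1f} TB) "
      f"takes ~{hours:.0f} h on one 2.5 Gbps link at 50% occupancy")
```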
19
Tier1 and Tier2 Centers
  • Tier1 centers
  • National laboratory scale: large CPU, disk, tape resources
  • High speed networks
  • Many personnel with broad expertise
  • Central resource for a large region
  • Tier2 centers
  • New concept in the LHC distributed computing hierarchy
  • Size ~ (national lab x university)^1/2 (see the sketch after this list)
  • Based at a large university or small laboratory
  • Emphasis on small staff, simple configuration & operation
  • Tier2 role
  • Simulations, analysis, data caching
  • Serve a small country, or a region within a large country

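Reading "Size ~ (national lab x university)^1/2" as a geometric mean, a toy calculation with purely hypothetical capacity numbers shows where a Tier2 lands between a Tier1 lab and a university group.

```python
import math

# Hypothetical capacities (arbitrary units, e.g. SpecInt95 of CPU) -- illustrative only.
national_lab_capacity = 100_000
university_capacity = 1_000

# Geometric mean: Tier2 ~ sqrt(national lab x university)
tier2_capacity = math.sqrt(national_lab_capacity * university_capacity)
print(f"Tier2 ~ {tier2_capacity:,.0f} units")   # 10,000: ~10x a university, ~1/10 of a lab
```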
20
LHC Tier2 Center (2001)
(Schematic: a WAN connection and hi-speed channel feed a router and a Gigabit Ethernet switch, which fans out to Fast Ethernet switches for the worker nodes, with >1 RAID disk array and a tape unit for storage.)
21
Hardware Cost Estimates
(Cost chart; component timescales of 1.1, 1.4, 2.1 and 1.2 years.)
  • Buy late, but not too late: phased implementation (see the cost sketch after this list)
  • R&D Phase: 2001-2004
  • Implementation Phase: 2004-2007
  • R&D to develop capabilities and the computing model itself
  • Prototyping at increasing scales of capability & complexity

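The "buy late" argument rests on hardware price/performance improving exponentially during the R&D phase. The sketch below assumes an illustrative halving time of 1.5 years (an assumption, not a figure taken from the cost chart) and compares the cost of a fixed capacity bought in 2001, 2004 and 2006.

```python
# Illustrative "buy late" comparison (assumed numbers, not from the slide's cost chart).
halving_time_years = 1.5        # assumed price/performance halving time
cost_per_unit_2001 = 1.0        # relative cost of one unit of capacity in 2001

def relative_cost(year):
    """Relative cost of one unit of capacity, assuming exponential improvement."""
    return cost_per_unit_2001 * 0.5 ** ((year - 2001) / halving_time_years)

for year in (2001, 2004, 2006):
    print(f"{year}: {relative_cost(year):.2f}x the 2001 cost per unit of capacity")
# Capacity bought near the end of the implementation phase is several times cheaper,
# which is why purchases are phased rather than made up front.
```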
22
HEP Related Data Grid Projects
  • Funded projects
  • GriPhyN (USA): NSF, $11.9M + $1.6M
  • PPDG I (USA): DOE, $2M
  • PPDG II (USA): DOE, $9.5M
  • EU DataGrid (EU): 9.3M
  • Proposed projects
  • iVDGL (USA): NSF, $15M + $1.8M (UK)
  • DTF (USA): NSF, $45M + $4M/yr
  • DataTag (EU): EC, 2M?
  • GridPP (UK): PPARC, >15M
  • Other national projects
  • UK e-Science (>100M for 2001-2004)
  • Italy, France, (Japan?)

23
(HEP Related) Data Grid Timeline
24
Coordination Among Grid Projects
  • Particle Physics Data Grid (US, DOE)
  • Data Grid applications for HENP
  • Funded 1999, 2000 ($2M)
  • Funded 2001-2004 ($9.4M)
  • http://www.ppdg.net/
  • GriPhyN (US, NSF)
  • Petascale Virtual-Data Grids
  • Funded 9/2000 - 9/2005 ($11.9M + $1.6M)
  • http://www.griphyn.org/
  • European Data Grid (EU)
  • Data Grid technologies, EU deployment
  • Funded 1/2001 - 1/2004 (9.3M)
  • http://www.eu-datagrid.org/
  • HEP in common
  • Focus: infrastructure development & deployment
  • International scope
  • Now developing a joint coordination framework

GridPP, DTF, iVDGL → very soon?
25
Data Grid Management
26
(Diagram: PPDG links the data-management efforts of BaBar, D0, CDF, CMS, ATLAS and nuclear physics with the Condor, Globus, SRB and HENP Grand Challenge (HENP GC) teams and their respective user communities.)
27
EU DataGrid Project
28
PPDG and GriPhyN Projects
  • PPDG focus on today's (evolving) problems in HENP
  • Current HEP: BaBar, CDF, D0
  • Current NP: RHIC, JLAB
  • Future HEP: ATLAS, CMS
  • GriPhyN focus on tomorrow's solutions
  • ATLAS, CMS, LIGO, SDSS
  • Virtual data, Petascale problems (Petaflops, Petabytes)
  • Toolkit, export to other disciplines, outreach/education
  • Both emphasize
  • Application sciences as drivers
  • CS/application partnership (reflected in funding)
  • Performance
  • Explicitly complementary

29
PPDG Multi-site Cached File Access System
(Diagram: a primary site (data acquisition, tape, CPU, disk, robot) serves several satellite sites (tape, CPU, disk, robot) and university sites (CPU, disk, users), coordinated by resource discovery, matchmaking, co-scheduling/queueing, tracking/monitoring, and problem trapping & resolution.)
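A minimal sketch of the cached-file-access idea behind such a system (hypothetical class, method and path names; not PPDG's actual interfaces): a university or satellite site serves a file from its local disk cache when it can, and otherwise pulls it from the primary site and caches it.

```python
import shutil
from pathlib import Path

class CachedFileAccess:
    """Toy multi-site cached file access: local cache first, then the primary site."""

    def __init__(self, cache_dir: str, primary_site_dir: str):
        self.cache = Path(cache_dir)
        self.primary = Path(primary_site_dir)   # stands in for the remote primary site
        self.cache.mkdir(parents=True, exist_ok=True)

    def open(self, logical_name: str):
        cached = self.cache / logical_name
        if not cached.exists():                 # cache miss: fetch from the primary site
            cached.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(self.primary / logical_name, cached)
        return cached.open("rb")                # cache hit, or the newly cached copy

# Usage (paths are placeholders):
# site = CachedFileAccess("/data/cache", "/mnt/primary-site")
# with site.open("run42/event-001.raw") as f:
#     data = f.read()
```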
30
GriPhyN PetaScale Virtual-Data Grids
(Architecture diagram: production teams, individual investigators and workgroups, at the scale of ~1 Petaflop and ~100 Petabytes, work through interactive user tools. These drive request planning & scheduling tools, request execution & management tools, and virtual data tools, which sit on resource management services, security and policy services, and other Grid services. Underneath are transforms, distributed resources (code, storage, CPUs, networks) and the raw data source.)
31
Virtual Data in Action
  • Data request may
  • Compute locally
  • Compute remotely
  • Access local data
  • Access remote data
  • Scheduling based on
  • Local policies
  • Global policies
  • Cost (see the sketch below)

(Hierarchy: major facilities & archives, regional facilities & caches, local facilities & caches)
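A minimal sketch of the kind of cost-based decision a planner could make for each data request (all names and cost numbers are hypothetical, not an actual Grid scheduler): compare the estimated cost of computing locally, computing remotely, or accessing local or remote data, subject to policy.

```python
# Toy cost model for a data request: pick the cheapest option that policy allows.
def plan_request(options, policy_allows):
    """options: {name: estimated_cost}; policy_allows: callable name -> bool."""
    allowed = {name: cost for name, cost in options.items() if policy_allows(name)}
    return min(allowed, key=allowed.get) if allowed else None

# Hypothetical cost estimates (e.g. seconds of wall-clock time) for one request.
options = {
    "compute_locally": 120.0,        # recompute from local inputs
    "compute_remotely": 300.0,       # recompute at a regional facility
    "access_local_data": 10.0,       # result already cached locally
    "access_remote_data": 90.0,      # fetch the result over the network
}

# Example local policy: this workstation may not run long local computations.
print(plan_request(options, policy_allows=lambda name: name != "compute_locally"))
# -> "access_local_data"
```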
32
GriPhyN Goals for Virtual Data
Explore the concept of virtual data and its applicability to data-intensive science
  • Transparency with respect to location
  • Caching, catalogs, in a large-scale, high-performance Data Grid
  • Transparency with respect to materialization (see the sketch after this list)
  • Exact specification of algorithm components
  • Traceability of any data product
  • Cost of storage vs CPU vs networks
  • Automated management of computation
  • Issues of scale, complexity, transparency
  • Complications: calibrations, data versions, software versions, ...

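A minimal sketch of materialization transparency (hypothetical names, not GriPhyN's actual tools): a catalog records, for each data product, the exact transformation and inputs that produce it; a request returns the cached product if it is already materialized, and otherwise re-runs the transformation and records the new copy.

```python
# Toy virtual-data catalog: every product is defined by (transformation, inputs),
# so it can be returned from cache or re-materialized on demand.
class VirtualDataCatalog:
    def __init__(self):
        self.recipes = {}        # product name -> (function, input names)
        self.materialized = {}   # product name -> concrete value (the "cache")

    def define(self, name, transform, inputs=()):
        self.recipes[name] = (transform, inputs)   # exact, traceable specification

    def request(self, name):
        if name in self.materialized:              # transparency: cached copy suffices
            return self.materialized[name]
        transform, inputs = self.recipes[name]
        value = transform(*[self.request(i) for i in inputs])   # recursive derivation
        self.materialized[name] = value
        return value

# Usage with made-up transformations:
catalog = VirtualDataCatalog()
catalog.define("raw", lambda: [1.0, 2.0, 3.0])
catalog.define("calibrated", lambda raw: [x * 1.1 for x in raw], inputs=("raw",))
print(catalog.request("calibrated"))   # materializes "raw", then "calibrated"
```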
33
Data Grid Reference Architecture