1 The High Energy Physics Community Grid Project
Inside D-Grid, ACAT 07
Torsten Harenberg, University of Wuppertal
harenberg@physik.uni-wuppertal.de
2 D-Grid organisational structure
3 Technical infrastructure
4 HEP Grid efforts since 2001
(Timeline figure, 2001 → today: EDG → EGEE → EGEE 2 → EGEE 3 ?; GridKa / GGUS; D-Grid Initiative: DGI → DGI 2 → ???; HEP CG → ???)
5 LHC groups in Germany
- ALICE: Darmstadt, Frankfurt, Heidelberg, Münster
- ATLAS: Berlin, Bonn, Dortmund, Dresden, Freiburg, Gießen, Heidelberg, Mainz, Mannheim, München, Siegen, Wuppertal
- CMS: Aachen, Hamburg, Karlsruhe
- LHCb: Heidelberg, Dortmund
6 German HEP institutes participating in WLCG
- WLCG: Karlsruhe (GridKa, Uni), DESY, GSI, München, Aachen, Wuppertal, Münster, Dortmund, Freiburg
7 HEP CG participants
- Participants: Uni Dortmund, TU Dresden, LMU München, Uni Siegen, Uni Wuppertal, DESY (Hamburg, Zeuthen), GSI
- Associated partners: Uni Mainz, HU Berlin, MPI f. Physik München, LRZ München, Uni Karlsruhe, MPI Heidelberg, RZ Garching, John von Neumann Institut für Computing, FZ Karlsruhe, Uni Freiburg, Konrad-Zuse-Zentrum Berlin
8 HEP Community Grid
- WP 1: Data management (dCache)
- WP 2: Job monitoring and user support
- WP 3: Distributed data analysis (Ganga)
- → A joint venture between physics and computer science
9 WP 1: Data management (coordination: Patrick Fuhrmann)
- An extensible metadata catalogue for semantic data access
  - Central service for gauge theory
  - DESY, Humboldt Uni, NIC, ZIB
- A scalable storage element
  - Using dCache on multi-scale installations
  - DESY, Uni Dortmund E5, FZK, Uni Freiburg
- Optimized job scheduling in data-intensive applications
  - Data and CPU co-scheduling
  - Uni Dortmund CEI, E5
10 WP 1 Highlights
- Establishing a metadata catalogue for gauge theory
  - Production service of a metadata catalogue with > 80,000 documents
  - Tools to be used in conjunction with the LCG data grid
  - Well established in international collaboration
  - http://www-zeuthen.desy.de/latfor/ldg/
- Advancements in data management with new functionality
  - dCache could become the quasi standard in WLCG
  - Good documentation and an automatic installation procedure provide usability from small Tier-3 installations up to Tier-1 sites
  - High throughput for large data streams, optimization on quality and load of disk storage systems, giving high-performance access to tape systems
11 dCache-based scalable storage element
- Thousands of pools, >> 1 PB disk storage, >> 100 file transfers/sec, < 2 FTEs
- Single host, 10 TB, zero maintenance
- dCache project well established
- New since HEP CG:
  - Professional product management, i.e. code versioning, packaging, user support and test suites
12 dCache principle
13 dCache connection to the Grid world
14 dCache: achieved goals
- Development of the xRoot protocol for distributed analysis (see the sketch below)
- Small sites: automatic installation and configuration (dCache in 10 mins)
- Large sites (> 1 Petabyte):
  - Partitioning of large systems
  - Transfer optimization from / to tape systems
  - Automatic file replication (freely configurable)
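The slides carry no code; as a hedged illustration of what distributed analysis over the xrootd protocol against a dCache door can look like from PyROOT, here is a minimal sketch. The door host, file path and tree name are hypothetical placeholders, not values from the deck.

```python
import ROOT

# Hypothetical dCache xrootd door and file path; replace with a real site endpoint.
url = "root://dcache-door.example.de:1094//pnfs/example.de/data/user/events.root"

# TFile::Open dispatches root:// URLs to ROOT's xrootd client, so a file on a
# remote dCache pool is read much like a local one.
f = ROOT.TFile.Open(url)
if f and not f.IsZombie():
    tree = f.Get("events")            # hypothetical TTree name
    print("entries:", tree.GetEntries())
    f.Close()
```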
15 dCache Outlook
- Current usage
  - 7 Tier-1 centres with up to 900 TB on disk (per centre) plus tape system (Karlsruhe, Lyon, RAL, Amsterdam, FermiLab, Brookhaven, NorduGrid)
  - 30 Tier-2 centres, including all of US CMS, planned for US ATLAS
- Planned usage
  - dCache is going to be included in the Virtual Data Toolkit (VDT) of the Open Science Grid: the proposed storage element in the USA
  - The planned US Tier-1 will break the 2 PB boundary by the end of the year
16 HEP Community Grid
- WP 1: Data management (dCache)
- WP 2: Job monitoring and user support
- WP 3: Distributed data analysis (Ganga)
- → A joint venture between physics and computer science
17 WP 2: Job monitoring and user support (coordination: Peter Mättig, Wuppertal)
- Job monitoring and resource usage visualizer
  - TU Dresden
- Expert system classifying job failures
  - Uni Wuppertal, FZK, FH Köln, FH Niederrhein
- Online job steering
  - Uni Siegen
18 Job monitoring and resource usage visualizer
19 Integration into GridSphere
20 Job Execution Monitor in LCG
- Motivation
  - 1000s of jobs each day in LCG
  - Job status unknown while running
  - Manual error detection: slow and difficult
  - GridICE, ...: service/hardware-based monitoring
- Conclusion
  - Monitor the job while it is running → JEM
  - Automatic error detection needed → expert system (a minimal sketch follows below)
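To make the idea concrete, here is a minimal, purely illustrative sketch of script-level job monitoring: a wrapper runs a job's commands step by step and reports progress and failures while the job is running. It is not the actual JEM implementation; all names and the reporting channel are assumptions.

```python
import subprocess
import sys
import time

def run_monitored(commands, report):
    """Run a job's commands one by one and report status while the job runs.

    Illustrative only: the real JEM instruments shell/Python job scripts in far
    more detail and forwards its data to a monitoring service.
    """
    for step, cmd in enumerate(commands, start=1):
        report(f"step {step}/{len(commands)} started: {cmd}")
        start = time.time()
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        status = "ok" if result.returncode == 0 else f"failed (rc={result.returncode})"
        report(f"step {step} {status} after {time.time() - start:.1f}s")
        if result.returncode != 0:
            # In JEM, such output would be handed to an expert system for classification.
            report(f"stderr: {result.stderr.strip()[:200]}")
            sys.exit(result.returncode)

if __name__ == "__main__":
    # report() could forward messages to a monitoring backend; here it just prints.
    run_monitored(["echo preparing input", "echo running analysis"], report=print)
```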
21 JEM: Job Execution Monitor
22 JEM status
- Monitoring part ready for use
- Integration into GANGA (ATLAS/LHCb distributed analysis tool) ongoing
- Connection to GGUS planned
- http://www.grid.uni-wuppertal.de/jem/
23 HEP Community Grid
- WP 1: Data management (dCache)
- WP 2: Job monitoring and user support
- WP 3: Distributed data analysis (Ganga)
- → A joint venture between physics and computer science
24 WP 3: Distributed data analysis (coordination: Peter Malzacher, GSI Darmstadt)
- GANGA: distributed analysis @ ATLAS and LHCb
- Ganga is an easy-to-use frontend for job definition and management
  - Python, IPython or GUI interface
  - Analysis jobs are automatically split into subjobs which are sent to multiple sites in the Grid
  - Data management for input and output; distributed output is collected
  - Allows simple switching between testing on a local batch system and large-scale data processing on distributed resources (Grid)
- Developed in the context of ATLAS and LHCb
- Implemented in Python (a minimal usage sketch follows below)
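As a hedged illustration of the Python interface described above, a job definition at the Ganga prompt might look roughly like this; the executable, splitter arguments and backend choice are placeholders that depend on the local Ganga and Grid setup, not values taken from the deck.

```python
# Entered at the Ganga (IPython-based) prompt, where Job, Executable,
# ArgSplitter, LCG and Local are available in the session namespace.
j = Job(name="hepcg-demo")                     # hypothetical job name
j.application = Executable(exe="echo")         # stand-in for a real analysis executable
j.splitter = ArgSplitter(args=[["run1"], ["run2"], ["run3"]])  # one subjob per argument list
j.backend = LCG()                              # submit the subjobs to the LCG Grid
# j.backend = Local()                          # or test on the local machine first
j.submit()
```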
25 GANGA schema
26 PROOF schema
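The PROOF schema slide is a diagram only; for orientation, here is a hedged PyROOT sketch of the parallel-analysis idea behind PROOF. The tree name, file URL and selector are hypothetical, and a real setup would point TProof::Open at a cluster master rather than a local PROOF-Lite session.

```python
import ROOT

# Start a local PROOF-Lite session (workers on this machine); on a cluster one
# would pass the master's address to TProof::Open instead of "lite://".
proof = ROOT.TProof.Open("lite://")

chain = ROOT.TChain("events")                  # hypothetical tree name
chain.Add("root://dcache-door.example.de//pnfs/example.de/data/events.root")  # hypothetical file
chain.SetProof()                 # route the Process() call through the PROOF workers
chain.Process("MySelector.C+")   # hypothetical TSelector, compiled via ACLiC on the workers
```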
27 HEP CG summary
- Physics departments: DESY, Dortmund, Dresden, Freiburg, GSI, München, Siegen, Wuppertal
- Computer science: Dortmund, Dresden, Siegen, Wuppertal, ZIB, FH Köln, FH Niederrhein
- Joined together in D-Grid
- Germany's contribution to HEP computing: dCache, monitoring, distributed analysis
- The effort will continue: the 2008 start of LHC data taking is a challenge for the Grid concept → new tools and developments needed