Mihnea Dulea, IFIN-HH - PowerPoint PPT Presentation

1
Efficient Handling and Processing of PetaByte-Scale Data
for the Grid Centers within the FR Cloud (HaPPSDaG)
1ST JOINT SYMPOSIUM CEA-IFA
PROJECT PRESENTATION - FIRST YEAR PROGRESS REPORT
M. Dulea, National Institute for Nuclear Physics
and Engineering 'Horia Hulubei' (IFIN-HH)
Mihnea Dulea, IFIN-HH
2
OVERVIEW
  • Computing support for LHC
  • Project topics
  • Project objectives and work planning
  • Framework agreements
  • General information
  • Project teams and infrastructure
  • First year results

3
COMPUTING SUPPORT for LHC - LCG
  • LHC COMPUTING GRID
  • LCG is a widely distributed array of computing
    resources that provides the computing support
    required for the storage, processing, simulation
    and analysis of the data gathered by the four
    major experiments performed at LHC.
  • It consists of more than 140 computing centres
    and federations of centres from 35 countries.
  • The resource centres are classified according
    to their size and functionality as Tier-0 (CC
    at CERN), Tier-1 (11 centres), and Tier-2.
  • The centres are interconnected through a
    high-speed network (GEANT2 in EU).
  • Current and 2012-2014 activity related to LHC.

4
COMPUTING SUPPORT - FR
  • ATLAS FRENCH CLOUD
  • Grid sites:
  • CC-IN2P3 (Tier-1)
  • Tier-2 centres ... (many)
  • GRIF (Grille de Recherche d'Ile de France):
    computing grid in the Paris region, a joint
    initiative of CEA/IRFU and labs of CNRS/IN2P3
    (6 sites)
  • The sites are interconnected through dedicated
    10 Gbps links connected to the FR NREN, RENATER
    (Réseau national de télécommunications pour la
    technologie, l'enseignement et la recherche)
  • FR Cloud includes foreign grid centres from
    China, Japan, Romania

5
COMPUTING SUPPORT - RO
  • ROMANIAN TIER-2 FEDERATION RO-LCG
  • Grid sites:
  • IFIN-HH, 5 Grid sites (resource centres)
  • ISS - Inst. for Space Sciences (2 sites)
  • UPB - Univ. 'Politehnica' of Bucharest
  • ITIM - NIRD in Molecular Isotopic
    Technologies - Cluj
  • UAIC - Alex. Ioan Cuza University - Iasi
  • The sites are connected to the 10 Gbps
    backbone of the RO NREN, the Romanian
    Educational and Research Network RoEduNet
  • 4 grid sites currently support the ATLAS VO:
    RO-07-NIPNE, RO-02-NIPNE (IFIN-HH),
    RO-14-ITIM (Cluj), RO-16-UAIC (Iasi)

6
PROJECT TOPIC
  • Computing support for LHC experiments: provision
    of grid resources and services
  • The overall support of LCG deployment and
    operation is provided from other funds (e.g.
    CONDEGRID project in RO).

HAPPSDAG addresses specific ATLAS issues in
order to optimize resource usage
7
ATLAS ISSUES
  • Generic requirements regarding:
  • - data transfer from Tier-1 to the associated
    Tier-2 sites (CC-IN2P3 -> RO-LCG)
  • - transfer of large files from SE to WN for each
    analysis job; consider many simultaneous jobs
  • - transfer of log and results files from WN to
    SE; immediate transfer of log file to UI
  • RO specific needs at the beginning of the
    project:
  • - analysis of the causes of the lower performance
    of RO-LCG sites before Oct. 2010
  • - elaborate and test technical solutions for
    performance improvement
  • - ensure better communication and coordination
    between the RO sites and the FR-cloud partners
  • - general measures for improving Tier1 - Tier2
    interaction
  • - elaborate general guidelines regarding the
    improvement in efficiency of the grid centers
    which are associated to ATLAS clouds

Figure (Grid cluster): transfer paths from/to the
Storage Element (SE)

8
PROJECT OBJECTIVES
Strategic objective: provide means for improving
the processing and handling of large data sets at
the Tier2 centers which participate in the
computing support of the ATLAS experiment at the
LHC (RO - case study).
Specific objectives and partner contributions:
  • Improve communication and coordination between
    GRIF/IN2P3 and RO sites (RO/FR)
  • Testing and improving the quality of the FR - RO
    data link for large dataset transfers (RO/FR)
  • Implementation of specific measures for
    increasing ATLAS job load and storage
    performance on sites (RO)
  • Improving large dataset transfer between FR - RO
    and data analysis (RO/FR)
  • Contributing to grid monitoring and technical
    support within FR-cloud (RO)
  • Training regarding grid monitoring and support
    (FR -> RO)
  • Dissemination (RO/FR)

9
PLANNING of WORK
  • Stage 1 (01.10.2010 - 10.12.2010): Analysis of
    Tier1-Tier2 communication
  • Stage 2 (01.01.2011 - 30.09.2011): Studies and
    software tools for monitoring and operation of
    the FR Cloud - RO grid connection and job
    loading. Testing of data handling and
    processing.
  • Stage 3 (01.10.2011 - 30.09.2012): Methods and
    procedures for improving the performance of the
    RO sites within the FR Cloud

10
FRAMEWORK AGREEMENTS
  • General Cooperation Agreement for Scientific
    Research between CEA and IFA, signed in
    December 2009
  • - Field of cooperation: Technologies for
    Information and Health
  • - Topic proposed for 2010: Grid Technologies
  • Joint Call for proposals of joint R&D projects
    (May 2010)
  • - IFIN-HH and IRFU submitted a proposal for
    a Joint Research and Development Project
  • Cooperation Agreement in the Field of Scientific
    Research (AS) between CEA and IFIN-HH
    (01.10.2010)
  • - General Coordinators: Gerard Cognet (FR),
    Ioan Ursu (RO)
  • - leading and coordinating the cooperation
    activities
  • Project Agreement (CEA, IFIN-HH)

11

GENERAL INFORMATION
  • RO Contract no. C1-06/2010, between IFA and
    IFIN-HH
  • Start date: 01/10/2010
  • Duration: 24 months
  • Funding of the RO part of the project: 400 000
    lei (about 94 000 EUR)
  • Funding of the FR part of the project: 133 000
    EUR

BUDGET                                        2010 RO (lei)   2010 CEA (Eur)  2011 RO (lei)   2011 CEA (Eur)  2012 RO (lei)   2012 CEA (Eur)
Manpower                                      25.333          6.000           120.133         48.000          82.000          22.000
Travels                                       8.000           4.000           3.200           14.000          8.000           14.000
Others (Romanian engineer staying at Saclay)  -               5.000           -               10.000          -               10.000
Others (French guests staying in Romania)     0               -               10.000          -               10.000          -
Others (equipment)                            0               -               40.000          -               40.000          -
Others (indirect costs)                       6.667           -               26.667          -               20.000          -
Total                                         40.000          15.000          200.000         72.000          160.000         46.000
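
The budget lines can be cross-checked against the yearly totals and the stated overall funding. The following is a minimal Python sketch with the figures copied from the budget table; the dictionary keys are illustrative names, and the assignment of each "Others" line to the RO or CEA columns is an assumption inferred from the column totals.

```python
# Budget figures per year (2010, 2011, 2012), copied from the budget table.
# Assumption: the split of the "Others" lines between RO and CEA columns
# is inferred from the totals row, not stated explicitly on the slide.
ro_lei = {
    "manpower": [25_333, 120_133, 82_000],
    "travels": [8_000, 3_200, 8_000],
    "french_guests": [0, 10_000, 10_000],
    "equipment": [0, 40_000, 40_000],
    "indirect_costs": [6_667, 26_667, 20_000],
}
cea_eur = {
    "manpower": [6_000, 48_000, 22_000],
    "travels": [4_000, 14_000, 14_000],
    "engineer_at_saclay": [5_000, 10_000, 10_000],
}


def yearly_totals(rows):
    """Sum each year's column over all budget lines."""
    return [sum(column) for column in zip(*rows.values())]


ro_totals = yearly_totals(ro_lei)
cea_totals = yearly_totals(cea_eur)
print("RO (lei): ", ro_totals, "overall", sum(ro_totals))
print("CEA (Eur):", cea_totals, "overall", sum(cea_totals))
```

With these figures the yearly sums reproduce the Total row, and the overall sums match the 400 000 lei and 133 000 funding figures quoted on the slide.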
12
PROJECT TEAMS
  • Project coordinators: Jean-Pierre Meyer (FR),
    Mihnea Dulea (RO)
  • Technical correspondents: Pierrick Micout (FR),
    Gabriel Stoicea (RO)
  • FR team (CEA/IRFU)
  • Eric LANÇON
  • Pierrick MICOUT
  • Christine LEROY
  • Frédéric SCHAER
  • Zoulikha GEORGETTE
  • Adelino GOMEZ
  • RO team (IFIN-HH)
  • Serban Constantinescu
  • Mihai Ciubancan
  • Ionut Traian Vasile
  • Camelia Mihaela Visan

13
Centre for Informational Technologies (CTI) -
IFIN-HH
INFRASTRUCTURE at CTI/DPETI
1200 (grid) + 960 (HPC) cores, 270 TB
14
ANALYSIS of NETWORK INFRASTRUCTURE
  • Objective: identify the weak points of the FR-RO
    data connection and adopt measures for
    improving the transfer capacity of large
    datasets.
  • Network structure: complex, with various owners
    and administrators -> more difficult to act

Section      Centres                   Administrator  Owner        Location
IFIN-HH LAN  RO-02-NIPNE, RO-07-NIPNE  CTI/DPETI      IFIN-HH      Magurele
IFIN - UPB   UPB                       ICOMM          IFIN-HH      UPB
RoEduNet     RO-14-ITIM, RO-16-UAIC    AARNIEC        MECTS        Romania
GEANT2       In 34 EU states           DANTE          EU NRENs     EU
RENATER      GRIF, IN2P3               GIP RENATER    GIP RENATER  France
  • Activities (RO/FR):
  • Testing connectivity and transport capacity with
    various tools
  • Finding routing paths and points of data traffic
    delay
  • Comparing performances of the RO-CERN link with
    those of RO-IN2P3
  • Conclusions: a) performance degradation at the
    RoEduNet / GEANT2 interface;
    b) bottlenecks on some of the RoEduNet
    routers
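
The two kinds of analysis listed above (locating the slowest section of a path and the hop where delay accumulates) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, and every throughput and RTT value below is a made-up placeholder, not a measurement from the project.

```python
def find_bottleneck(section_mbps):
    """Return (section, throughput) for the slowest section of a path.

    The end-to-end capacity of a path is bounded by its slowest section.
    """
    name = min(section_mbps, key=section_mbps.get)
    return name, section_mbps[name]


def hop_delays(cumulative_rtt_ms):
    """Convert traceroute-style cumulative RTTs into per-hop increments."""
    delays = []
    previous = 0.0
    for rtt in cumulative_rtt_ms:
        # Clamp at zero: jitter can make a later hop answer faster.
        delays.append(max(rtt - previous, 0.0))
        previous = rtt
    return delays


# Placeholder values only (Mbps per network section along the RO-FR path).
path = {
    "IFIN-HH LAN": 1000.0,
    "IFIN - UPB": 1000.0,
    "RoEduNet": 300.0,
    "GEANT2": 900.0,
    "RENATER": 950.0,
}
bottleneck = find_bottleneck(path)
delays = hop_delays([1.0, 2.5, 30.0, 42.0])
print(bottleneck)
print(delays)
```

The hop with the largest increment is the first candidate to raise with the administrator of that section.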

15
IMPROVING POINT-TO-POINT TRAFFIC PERFORMANCES
  • Requires close collaboration with network
    administrators along the RO-FR path
  • Example: following bandwidth capacity and traffic
    analysis, a RoEduNet router was found to be
    responsible for a bottleneck. AARNIEC's
    intervention raised the available bandwidth to
    700 Mbps (see the figure on the slide).

Permanent monitoring required
16
MONITORING TOOLS for DATA TRANSFER and STORAGE
PERFORMANCE - 1
  • Development of software tools for monitoring
    SE traffic (in/out): data sent by daemons
    running on the storage servers are added to a
    database, with a web interface for display
  • The tools developed at IFIN-HH are also useful to
    the FR partners for monitoring the RO sites.
  • Traffic from/to WNs and from/to external network
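
The slides do not show how the monitoring daemons are implemented. As a rough sketch of the general idea (sample the interface byte counters on a storage server, turn two samples into a rate, store the result in a database), the Python below parses text in the Linux `/proc/net/dev` format and writes to SQLite. The table name `se_traffic`, the host name and the counter values are all assumptions made for the example.

```python
import sqlite3


def parse_counters(proc_net_dev_text, interface):
    """Extract (rx_bytes, tx_bytes) for one interface from /proc/net/dev text."""
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # header lines carry no interface name
        name, fields = line.split(":", 1)
        if name.strip() == interface:
            values = fields.split()
            # Field 0 is received bytes, field 8 is transmitted bytes.
            return int(values[0]), int(values[8])
    raise ValueError(f"interface {interface!r} not found")


def rate_mbps(bytes_before, bytes_after, seconds):
    """Average rate in Mbps between two byte-counter samples."""
    return (bytes_after - bytes_before) * 8 / seconds / 1e6


def store_sample(conn, host, timestamp, in_mbps, out_mbps):
    """Append one traffic sample to the (hypothetical) se_traffic table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS se_traffic "
        "(host TEXT, ts INTEGER, in_mbps REAL, out_mbps REAL)"
    )
    conn.execute(
        "INSERT INTO se_traffic VALUES (?, ?, ?, ?)",
        (host, timestamp, in_mbps, out_mbps),
    )
    conn.commit()


# Two samples taken 10 s apart; the counter values are invented.
sample_t0 = "  eth0: 1000000 10 0 0 0 0 0 0 2000000 20 0 0 0 0 0 0"
sample_t1 = "  eth0: 626000000 90 0 0 0 0 0 0 127000000 80 0 0 0 0 0 0"
rx0, tx0 = parse_counters(sample_t0, "eth0")
rx1, tx1 = parse_counters(sample_t1, "eth0")

conn = sqlite3.connect(":memory:")  # a real daemon would use a shared database
store_sample(conn, "se01.example.org", 1300000000,
             rate_mbps(rx0, rx1, 10), rate_mbps(tx0, tx1, 10))
print(conn.execute("SELECT * FROM se_traffic").fetchall())
```

A web front end would then only need to query this table by host and time range to draw the in/out traffic plots.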

Figure annotations: max at 5 Gbps; max at > 3 Gbps
17
MONITORING TOOLS for DATA TRANSFER and STORAGE
PERFORMANCE - 2
  • Traffic on gateway (in/out); SE external
    throughput
  • Monitoring groups of running or pending jobs

18
MONITORING TOOLS for DATA TRANSFER and STORAGE
PERFORMANCE - 3
  • Accounting of running or pending jobs on CE or
    CREAM-CE
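
As an illustration of what such accounting amounts to, the sketch below counts jobs per VO and state in Python. The job records are hypothetical; a real tool would obtain them from the batch system behind the CE, which the slides do not describe.

```python
from collections import Counter


def job_accounting(jobs):
    """Count jobs per (vo, state); each job is a dict with 'vo' and 'state' keys."""
    return Counter((job["vo"], job["state"]) for job in jobs)


# Hypothetical job list used only to exercise the function.
jobs = [
    {"vo": "atlas", "state": "running"},
    {"vo": "atlas", "state": "running"},
    {"vo": "atlas", "state": "pending"},
    {"vo": "ifops", "state": "running"},
]
counts = job_accounting(jobs)
for (vo, state), number in sorted(counts.items()):
    print(vo, state, number)
```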

19
IMPROVEMENT of SITE MONITORING and TECHNICAL
SUPPORT
  • Implementation of IFIN-HH's own SAM (Service
    Availability Monitoring) system, which uses the
    IFIN-HH grid infrastructure and a new monitoring
    VO, ifops. Results are published using Nagios.
  • Early notification of technical staff leads to
    improved availability of grid services
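
The probes used by the IFIN-HH SAM system are not described in the slides. For orientation only, the Python sketch below follows the general Nagios plugin convention (exit code 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN, with a one-line status message) for a simple TCP reachability check. The function names and the thresholds are assumptions.

```python
import socket
import time

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes


def classify(elapsed_s, warn_s, crit_s):
    """Map a connection time to a Nagios status code."""
    if elapsed_s >= crit_s:
        return CRITICAL
    if elapsed_s >= warn_s:
        return WARNING
    return OK


def check_tcp(host, port, warn_s=1.0, crit_s=5.0):
    """Try a TCP connection and return (status_code, message)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=crit_s):
            elapsed = time.monotonic() - start
    except OSError as exc:
        return CRITICAL, f"CRITICAL - {host}:{port} unreachable ({exc})"
    status = classify(elapsed, warn_s, crit_s)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    return status, f"{label} - {host}:{port} answered in {elapsed:.3f}s"
```

A wrapper script would call `check_tcp` with the service host and port, print the message and exit with the returned code, so that Nagios can notify the technical staff when the status changes.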

Monitoring of CREAM-CE, tbit03.nipne.ro
20
IMPROVEMENT and TESTS of SE-WN THROUGHPUT
  • Adding more resources (WNs) doesn't always mean
    better results. Scalability is required.
  • Improvement of file transfer speed from SE to WN,
    required by analysis jobs (4-6 files of 2-4 GB
    each)
  • Replacing transfers to disk servers over the
    Network File System (NFS) protocol with new DPM
    (Disk Pool Manager) disk storage servers.
  • Higher transfer speed -> no job exceeds the time
    limit -> no cancellation
  • Tests of the new configuration
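
To get a feeling for the load implied by these figures, the Python sketch below estimates how long 70 jobs need to stage their input when they share the SE bandwidth. The per-job file counts and sizes come from the slide; the 5 Gbps aggregate rate, the decimal definition of GB and the assumption of a sustained rate are illustrative choices, not project measurements.

```python
def staging_time_s(files_per_job, file_gb, jobs, aggregate_gbps):
    """Time for all jobs to stage their input over a shared link.

    Assumes 1 GB = 1e9 bytes and that the aggregate rate is sustained.
    """
    total_bits = files_per_job * file_gb * 1e9 * 8 * jobs
    return total_bits / (aggregate_gbps * 1e9)


# Lower and upper bounds for 70 quasi-simultaneous analysis jobs.
best = staging_time_s(files_per_job=4, file_gb=2, jobs=70, aggregate_gbps=5)
worst = staging_time_s(files_per_job=6, file_gb=4, jobs=70, aggregate_gbps=5)
print(f"between {best:.0f} s and {worst:.0f} s")
```

Under these assumptions the staging phase alone takes roughly 15 to 45 minutes, which shows why a slow SE-WN path can push jobs past their time limit.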

Time representation of the transfer speed (in
Mbps) for 70 quasi-simultaneous jobs
21
GLOBAL IMPROVEMENT of EFFICIENCY
  • Mean efficiency of ATLAS job execution in 2011:
    91%

Monthly number of ATLAS jobs and number of ATLAS
events processed in RO-LCG
22
TRAINING REGARDING MONITORING AND TECHNICAL
SUPPORT
  • 20.06.11 - 04.07.11: training stay of C. Visan
    at CEA/IRFU, in preparation for later
    participation in monitoring and support
    activities for FR Cloud sites.
  • Topics:
  • - CEA/IRFU monitoring methods at site, VO and
    project levels; EGI/WLCG and LHC monitoring
    (Christine Leroy, Pierrick Micout)
  • - grid site usage (Georgette Zoulikha)
  • - NAGIOS installation/configuration on virtual
    machines (Frederic Schaer)
  • - job submission through Pathena (PanDA Athena),
    at LAL-Orsay (Laurent Duflot)
  • - CACTI site monitoring (Victor Mendoza,
    Université Pierre et Marie Curie (UPMC))
  • - instructions for site and job monitoring in
    ADCoS (ATLAS Distributed Computing Operations
    Shift) and for the support team of FR Cloud
    (Squad) (Sabine Crepe)

23
MOBILITY
  • Kick-off meeting (15-16.11.2010, Saclay)
  • Participation at the RO-LCG 2010 Conference,
    Bucharest (Christine Leroy, Sabine
    Crepe - IN2P3)
  • Participation of Gabriel Stoicea in the spring
    meeting of LCG-France (30-31.05.2011)
  • Training - monitoring and support (20.06.11 -
    04.07.11, Saclay), C.M. Visan

24
BENEFITS
  • CEA/IRFU
  • The results of the project contribute to global
    improvement of FR Cloud efficiency
  • Joint elaboration of general guidelines for
    interaction between grid centres in ATLAS
    clouds
  • Use of the FR-RO interaction as a representative
    case study for sharing best practices with
    smaller sites
  • IFIN-HH
  • General efficiency improvement of the activity
    of the RO sites
  • Better integration and visibility in the
    framework of the computing support for the
    ATLAS collaboration
  • High-level training of RO technical staff

25
PROSPECTS
  • Further development of methods and procedures
    for improving the performance of the RO sites
    within the FR Cloud
  • General guidelines regarding the improvement in
    efficiency of the grid centers which are
    associated to ATLAS clouds
  • HAPPSDAG workshop and technical meeting in
    Bucharest (28-30.11.2011)
  • Participation of IFIN-HH in site and job
    monitoring in ADC (ATLAS Distributed Computing)
    shifts or in the monitoring team of FR Cloud.
  • Dissemination of results

26
THANK YOU FOR YOUR ATTENTION ! Questions?