1
The Italian eInfrastructure (Internet
and Grids): current achievements and INFN plans
for future evolution
  • Comitato Tecnico Scientifico del GARR
  • 15 July 2004
  • Mirco Mazzucato
  • INFN-Padova
  • mirco.mazzucato_at_pd.infn.it

2
Content
  • The Italian eInfrastructure evolution
  • INFN Grid, Grid.it, S-PACI, …
  • The lessons from the LHC stress test of Grid
    technologies as they are implemented in the
    following Grid software Releases
  • Globus toolkit v.2.x
  • DataGrid v2.x (EU-DataGrid is the European
    project that has just ended)
  • LCG v2 (LHC Computing Grid project for the HEP
    experiments at the LHC collider at CERN: a
    selection of services from Globus, Condor,
    DataGrid, DataTAG)
  • INFN-Grid/GRID.IT v2 (customization of LCG v2 by
    INFN within the italian grid project, GRID.IT)
  • Issues for the future
  • The next steps

3
Early Grid R&D in Italy: the INFN-GRID Project
  • First national Grid project approved in Europe
    beg. 2000
  • Focused on the preparation of the INFN LHC comp.
    infrastructure
  • The size of the project: 20 Italian sites, 100
    people, 50 FTEs
  • Budget devoted to the development of the LHC
    Regional Computing Centers and related
    collaborative Grid infrastructure
  • …but since the beginning the development of the
    middleware in INFN Grid was conceived as being of
    general use and has taken into account the
    requirements of other sciences
  • Biology (PD) and Earth Observation(Esrin-ESA-Frasc
    ati)
  • It is a successful example of collaboration
    between physicists, software engineers, computing
    professionals and computer scientists (CS depts.
    of the Universities of VE, PD, BO, CT, TO, …) and
    Italian industry
  • Datamat SpA and Nice have been major contributors
    to the joint development of the Italian DataGrid
    middleware components
  • INFN Grid has been and is the national container
    for INFN to coordinate the contribution to all EU
    and International Grid projects and to the GGF
    standardization
  • Early R&D in Italy includes work done at ISUFI
    (University of Lecce); see S-PACI

4
The INFN Grid project and the Italian
eInfrastructure
  • INFN Condor on WAN (started 1996, operational in
    1998)
  • Integrated the CPU resources of 20 sites into a
    national pool with 6 checkpoint domains
  • National testbed to evaluate Globus services in
    1999
  • INFN-GRID, INFN special project, (February
    2000-)
  • National Grid infrastructure driven by INFN
    experiments: 2-3 M€/year, plus 22 M€ for T1/T2
  • DATAGRID, CERN-coordinated EU project, 3 years
    duration, 10 M€ (2001-2003)
  • European integration and new M/W services for
    HEP, Biology, EO
  • DataTAG, CERN-coordinated EU project, 2 years
    duration, 4 M€ (2002-2003)
  • Optical networking (1 TB in 0.5 hours) and
    interoperability with US Grids (GLUE)
  • Grid.IT, national project, 3 years (2003-2005),
    MIUR funds 8.1 M€
  • Towards a national production eInfrastructure
  • eBusiness, eIndustry, eGovernment, eScience and
    Technology (BIGEST) Italian Grid Initiative
    (2003-)
  • Coordination of all national eInfrastructure
    activities
  • The Italian grid infrastructure in the new EU
    project EGEE (2004-), 32 M€: INFN, S-PACI,
    ENEA... link with CINECA for DEISA
  • The new production EU eInfrastructure for all
    Sciences and beyond
  • LCG, the world-wide Grid for the LHC experiments
    (2002-)

5
The national Grid.it eInfrastructure
  • In Grid.it, INFN is responsible for the R&D and
    creation of a national Grid infrastructure and
    for studying and prototyping a national Grid
    Operation Service (GOS)
  • The generalization of the infrastructure support
    to other Sciences from INFN is a model
    successfully established in the past with the
    research network (INFNET - GARR)
  • Resources are provided by INFN and major Italian
    Centers (S-Paci, Naples Campus Grid....)
  • The GOS supports several Italian science
    applications and the operation of the Italian
    infrastructure, also in the context of the new
    European infrastructure project EGEE
  • The Italian eScience Grid.it infrastructure
    currently supports
  • Astrophysics
  • Biology
  • Computational Chemistry
  • Geophysics
  • Earth Observation
  • but other sciences are joining thanks to new MIUR
    funds

6
Grid.IT Production Grid Operations Portal
  • User documentation
  • Site managers documentation
  • Software repository
  • Monitoring
  • Trouble tickets system
  • Knowledge base

http://grid-it.cnaf.infn.it
7
Get your personal certificate
A clear, simple and automated procedure allows
all Italian institutions to set up a
Registration Authority and get INFN certificates
8
How to register to a VO
9
Grid groups within the Grid.it support system
Trouble Ticketing System
http://helpdesk.oneorzero.com
10
Grid services supported by Grid.it
[Diagram: Grid.it service deployment — User Interface, Grid Monitoring (GridICE), VO servers (CMS, ATLAS, INAF), Information Index, Resource Broker and BDII, hosted across INFN-Milano, INFN-Padova, INFN-Napoli and INFN-CNAF; each site runs a Computing Element (GIIS, GRIS, GRAM) and a Storage Element (GIIS, GRIS, SRM) in front of its Worker Nodes, with an RLS service]
11
12
GridICE
  • A monitoring system developed for the INFN Grid
    Infrastructure and adopted by LCG
  • Selects grid entities (resources and services)
    per VOs and sites
  • Automatic discovery based on Grid Information
    Service (Globus/MDS2.x, BDII) and Glue schema
    extensions.
  • Layered architecture
  • Measurement service (local monitoring interfaced
    to the GIS )
  • Publisher service
  • Data collector service with auto-discovery
    feature
  • Data Analyzer: detection and notification
    service (ongoing)
  • Presentation service via web interface
  • Modularity, flexibility and interoperability
  • Ongoing activities
  • Integration of network resource monitoring

13
GridICE components
[Diagram: GridICE components — the Monitoring Server sends an LDAP query to the GIIS server (step 1) and obtains the available CE/SEs (step 2); it then sends an LDAP query to each resource's GRIS (step 3) and obtains CE IDs and Worker Nodes (step 4); steps 3-4 are repeated for every CE/SE. Results are stored in SQL and shown by a web graphics/presentation layer]
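The four-step discovery flow above can be sketched in Python; the hostnames, dictionary layout and attribute names are illustrative stand-ins for the real LDAP queries and Glue schema:

```python
# Sketch of the GridICE discovery flow: the monitoring server first asks
# the GIIS for the list of CE/SE endpoints (steps 1-2), then queries each
# resource's GRIS for its details (steps 3-4, repeated per resource).

def query_giis(giis):
    """Steps 1-2: one query to the GIIS returns the available CE/SEs."""
    return giis["services"]

def query_gris(gris):
    """Steps 3-4: a per-resource query returns CE IDs and worker nodes."""
    return {"ce_ids": gris["ce_ids"], "worker_nodes": gris["worker_nodes"]}

def collect(giis, gris_index):
    inventory = {}
    for endpoint in query_giis(giis):                           # done once
        inventory[endpoint] = query_gris(gris_index[endpoint])  # repeated
    return inventory

giis = {"services": ["ce01.example.infn.it", "se01.example.infn.it"]}
gris_index = {
    "ce01.example.infn.it": {"ce_ids": ["ce01:2119/jobmanager-pbs-long"],
                             "worker_nodes": ["wn001", "wn002"]},
    "se01.example.infn.it": {"ce_ids": [], "worker_nodes": []},
}
print(collect(giis, gris_index))
```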
14
Data presentation (3)
15
Resources
16
Services
17
Grid Service monitoring
18
General User Interface: the GENIUS portal,
jointly developed by INFN and Nice
  • Based on WEB portal architecture
  • Support for generic applications
  • Basic requirement: transparent Grid access
  • It must be accessed from everywhere and by
    everything (desktop, laptop, PDA, WAP phone).
  • It must be redundantly secure at all levels: 1)
    secure for web transactions, 2) secure for user
    credentials, 3) secure for user authentication,
    4) secure at the VO level.
  • All available grid services must be incorporated
    in a logical way, just one mouse click away.
  • Its layout must be easily understandable and user
    friendly.

19
GENIUS: how it works
[Diagram: a web browser anywhere connects to the GENIUS portal (EnginFrame + Apache on a local workstation), which drives the WMS User Interface — from Roberto Barbera]
20
GENIUS interfaced to 100 Grid services
Roberto Barbera
21
Graphic job description
In collaboration with DATAMAT, Italy
Roberto Barbera
Roberto Barbera
22
GENIUS PDA version (1)
Roberto Barbera
23
GENIUS PDA version (2)
Roberto Barbera
24
Italian Grid now (Site/resource map)
INFN: CMS T2 + T2/3, Atlas T2 + T2/3, Alice T2 + T2/3, LHCb T2 + T2/3, Babar, VIRGO
T2 (~150 nodes, 50 TB), T3 (10-15 nodes), T1 CNAF (~800 nodes, 220 TB disk, 1600 TB tape MSS)
grid.it resources: INFN (15-25 nodes), INAF (5-10 nodes), INGV (NEC computers), BIO (tbd), general-purpose resources (8-15 nodes)
[Map of the national Grid (Internet) with sites: TRENTO, UDINE, MILANO, PADOVA, TORINO, LNL, PAVIA, TRIESTE, FERRARA, GENOVA, PARMA, CNAF, BOLOGNA, PISA, FIRENZE, S.Piero, PERUGIA, LNGS, ROMA, L'AQUILA, ROMA2, LNF, SASSARI, NAPOLI, BARI, LECCE, SALERNO, CAGLIARI, COSENZA, PALERMO, CATANIA, LNS]
Tot. ~1400 nodes, ~2800 processors
25
The new INFN national computing facility at
CNAF(BO)
  • 1250 kVA power generator with a 5,000 l oil tank
    to be safe against power cuts.
  • 800 kVA uninterruptible power supply with
    batteries lasting ~10 minutes at nominal power.
  • 570 kW cooling system.
  • 1000 m² computing room.
  • GARR Giga-PoP with multiple 2.5 Gbps backbone
    lines and 1 Gbps Wide Area Network access
  • CPU: 800 1U nodes, 1.6K Intel processors, 1.3
    MKSI2K
  • Disk: 220 TB
  • Tape library: 1.6 PB
  • Will grow ×4 to meet the LHC experiments'
    requirements in 2007

26
Global Grid services view in Grid-it/EDG 2.0/LCG-2
  • Users: WMS-UI, Genius; CA services, VOMS, policy
    and accounting, monitoring
  • Applications and WMS-API
  • Collective services: Grid scheduler (RB), replica
    manager (RLS), …
  • Resource or core services: GRAM, GSI, GRIS,
    basic data access (SRM)
  • Grid resources and local services layer: Computing
    Element, Storage Element, network, local
    authorization service, local policy service,
    local scheduler
  • Network layer
27
Grid.it services were stressed by INFN in the LHC
expts' 2004 Data Challenges (10^4 simultaneous
jobs, 10^6 files, 20 TB of data moved)
  • Grid job submission INFN/EDG WMS/RB
  • RJS (Remote Job Submission) to specific
    computers
  • The user submits a job on a WAN computing system
    providing the address
  • RJS to Grid domains (sets of computers) via a
    Grid scheduler, without knowing the destination
    computers: matchmaking + optimization
  • Needs a Grid Information System: LCG BDII
  • Data Management
  • Data replica management
  • Remote data location via file catalogue and
    metadata catalogue (RLS, Replica Location
    Service); Web Services interface
  • Performance issues
  • Data transfer and access,
  • GridFTP (globus-url-copy) provides
    high-performance, reliable data transfer
  • The RM provides optimized transfer
  • Storage Resource Management (SRM) for
  • Storage allocation
  • File pinning, …
  • Grid User Interface
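As a rough sketch of what Grid job submission looks like from the user side, the following Python builds a minimal JDL-style job description. The attribute names follow EDG JDL conventions, but the file names and LFN are invented, and actual submission would go through the WMS user interface (only hinted at in a comment):

```python
# Minimal sketch of Grid job submission through the WMS: the user writes a
# job description (JDL) and hands it to the Resource Broker, which performs
# the matchmaking. File names and the LFN below are purely illustrative.

def make_jdl(executable, arguments, input_data=None):
    """Build a simple JDL job description as a string."""
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "job.out";',
        'StdError = "job.err";',
        'OutputSandbox = {"job.out", "job.err"};',
    ]
    if input_data:  # lets the RB match CEs close to the data
        lfns = ", ".join(f'"{lfn}"' for lfn in input_data)
        lines.append(f"InputData = {{{lfns}}};")
    return "\n".join(lines)

jdl = make_jdl("/bin/hostname", "-f",
               input_data=["lfn:/grid/cms/sample.root"])
print(jdl)
# Submission itself would be delegated to the WMS user interface, e.g.:
#   subprocess.run(["edg-job-submit", "job.jdl"])
```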

28
Grid services tests
  • General Services
  • Security based on Globus GSI (Grid Security
    Infrastructure)
  • Standard protocols: X.509 certificates, PKI,
    GSS-API, …
  • Login once (credential delegation)
  • VO oriented Authentication/Authorization tools
    (VOMS)
  • Monitoring System
  • VO oriented Policy and accounting system
  • VO oriented User Support systems

29
Grid Information Service
  • The Information Service plays a fundamental role
    since resource discovery and decision making are
    based upon the information service
    infrastructure. Basically an IS is needed to
    collect and organize, in a coherent manner,
    information about grid resources and status and
    make it available to the consumer entities.
  • Resource schema: a conceptual model of grid
    resources to be used as the base schema of the
    Grid Information Service for discovery and
    monitoring purposes.
  • The GLUE schema aims to provide standards for
  • Computing Service information model
  • Storage Manager Service information model
  • Network Service information model (connectivity
    between Grid Domains)
  • New astronomical catalogue model (just started
    within Grid.it as collaboration between INFN and
    INAF)
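To make the schema idea concrete, here is a hedged sketch of parsing a much-simplified LDIF fragment for a GLUE Computing Element entry; the host name and values are invented, and real GlueCE entries carry many more attributes than shown:

```python
# The GLUE computing-service model is published through LDAP as
# attribute/value pairs. This parses one simplified, illustrative
# Computing Element entry in LDIF form.

SAMPLE_LDIF = """\
dn: GlueCEUniqueID=ce01.example.infn.it:2119/jobmanager-pbs-long
GlueCEUniqueID: ce01.example.infn.it:2119/jobmanager-pbs-long
GlueCEInfoTotalCPUs: 64
GlueCEStateFreeCPUs: 12
GlueCEStateWaitingJobs: 3
"""

def parse_ldif_entry(text):
    """Turn 'attr: value' lines into a dict (single-valued attrs only)."""
    entry = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(": ")
        entry[key] = value
    return entry

ce = parse_ldif_entry(SAMPLE_LDIF)
print(ce["GlueCEStateFreeCPUs"])
```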

30
GLUE WHO and WHEN
  • GLUE (Grid Laboratory Uniform Environment),
    promoted by EU-DataTAG and US-iVDGL in the
    context of the High Energy and Nuclear Physics
    InterGrid Joint Technical Board (HI-JTB,
    http://www.hicb.org/)
  • GLUE is a collaborative joint effort towards a
    standards-based global Grid infrastructure for
    the HEP experiments, focusing on interoperability
    issues between US and EU HEP Grid-related
    projects.
  • First results on the Grid resource schema and on
    user authentication/authorization management,
    with contributions from DataGrid, Globus, PPDG
    and GriPhyN
  • GLUE Schema activity started in April 2002
  • Objective: define a common schema to represent
    Grid resources, in order to support the activity
    of discovery and monitoring

31
Included in GT2 and EDG 2.0 release
32
Included in GT2 and EDG 2.0 release
33
34
INFN/DataGrid Information Directory Information
Tree
35
Real Grid job submission is performed via the
Workload Management Service (developed by INFN
within EU-DataGrid)
  • The user interacts with Grid via a Workload
    Management System (not directly with GRAM)
  • The goal of the WMS is distributed scheduling
    and resource management in a Grid environment.
  • What does it allow Grid users to do?
  • To submit their jobs via a Job description
    language
  • To execute them
  • selecting the CE
  • or leaving WMS to optimize according to data and
    CPU availability
  • To get information about their status
  • To retrieve their output
  • The WMS tries to optimize the usage of resources
    using matchmaking and re-scheduling
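The matchmaking idea can be illustrated with a toy broker: filter the CEs that satisfy the job's requirements, then pick the best one by a rank expression. The CE records and attribute names here are invented, not the real Glue/ClassAd attributes:

```python
# Toy match-making step: filter candidate CEs by the job's requirements,
# then rank them (here: most free CPUs wins). All CE data is invented.

ces = [
    {"id": "ce01.example.it", "free_cpus": 4,  "max_wall_minutes": 2880, "has_data": True},
    {"id": "ce02.example.it", "free_cpus": 30, "max_wall_minutes": 60,   "has_data": False},
    {"id": "ce03.example.it", "free_cpus": 11, "max_wall_minutes": 2880, "has_data": True},
]

def match(ces, requirements, rank):
    candidates = [ce for ce in ces if requirements(ce)]  # requirements filter
    if not candidates:
        return None  # job stays queued for re-scheduling
    return max(candidates, key=rank)  # optimization via the rank expression

best = match(
    ces,
    requirements=lambda ce: ce["max_wall_minutes"] >= 720 and ce["has_data"],
    rank=lambda ce: ce["free_cpus"],
)
print(best["id"])  # → ce03.example.it
```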

36
GUI APIs
37
WMS Components
  • The WMS is currently composed of the following parts
  • User Interface (UI): the access point for the user
    to the WMS
  • Resource Broker (RB): the broker of Grid
    resources, responsible for finding the best
    resources to which to submit jobs
  • Job Submission Service (JSS): provides a
    reliable submission system
  • Information Index (II): a specialized Globus
    GIIS (LDAP server) used by the Resource Broker as
    a filter on the Information Service (IS) to
    select resources
  • Logging and Bookkeeping service (LB): stores job
    info, available for users to query

38
WMS Architecture in EDG v2.x
[Diagram: WMS architecture in EDG v2.x — on the RB node, the Network Server hands jobs to the Workload Manager, which consults the Matchmaker/Broker (fed by the Information Service with CE and SE characteristics and status, and by the RLS); a Job Adapter prepares jobs for the Job Controller (CondorG), backed by the RB storage, with a Log Monitor and the Logging & Bookkeeping service]
39
WMS release 2 functionalities
  • User APIs
  • GUI
  • Support for interactive jobs
  • Job checkpointing
  • Support for parallel jobs
  • Support for automatic output data upload and
    registration
  • VOMS (VO Membership Service) support
  • Support for job dependencies (via DAGman)
  • Lazy scheduling: a job (node) is bound to a
    resource (by the RB) just before it can be
    submitted (i.e. when it is free of dependencies)
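The lazy-scheduling behaviour for dependent jobs can be sketched as a toy DAG walker: a node is bound to a resource only once all of its parents are done. The node names and the trivial assign function are invented:

```python
# Sketch of "lazy scheduling" over a job DAG (in the spirit of the
# DAGMan-driven behaviour described above): each node is bound to a
# resource only when it is free of dependencies.

def run_dag(deps, assign):
    """deps: node -> set of parent nodes. assign(node) binds it to a CE."""
    done, bound_to = set(), {}
    pending = set(deps)
    while pending:
        ready = [n for n in pending if deps[n] <= done]  # parents finished
        for node in ready:
            bound_to[node] = assign(node)  # late binding happens only now
            done.add(node)
            pending.discard(node)
    return bound_to

deps = {
    "fetch": set(),
    "simulate": {"fetch"},
    "reconstruct": {"simulate"},
    "merge": {"simulate"},
}
print(run_dag(deps, assign=lambda n: "ce01.example.it"))
```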

40
WMS Future activities
  • New functionalities
  • Support for job partitioning (available soon)
  • Use of job checkpointing and DAGMan mechanisms
  • Original job partitioned in sub-jobs which can be
    executed in parallel
  • At the end, each sub-job must save a final state,
    which is then retrieved by a job aggregator
    responsible for collecting the results of the
    sub-jobs and producing the overall output
  • Grid Accounting DGAS (testing in progress)
  • Based upon a computational economy model
  • Users pay in order to execute their jobs on the
    resources, and the owners of the resources earn
    credits by executing the user jobs
  • Starts by providing detailed accounting of
    resource usage
  • Scheduling optimization
  • VO-based resource access policy support: PBOX
    (development in progress). Grid resource sharing
    requires
  • to deploy VO-wide policies.
  • to respect local site policies.
  • to specify policies relating to the behavior of
    the grid as a whole.
  • The RB will take decisions (matchmaking) on the
    basis of VO/user policies
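A toy version of the computational-economy model behind DGAS can make the idea concrete; the account names and prices are invented, and the real system starts from detailed usage records rather than a flat price:

```python
# Toy computational-economy ledger: users spend credits to run jobs,
# resource owners earn them. Everything here is illustrative.

class Ledger:
    def __init__(self):
        self.balance = {}

    def credit(self, account, amount):
        self.balance[account] = self.balance.get(account, 0.0) + amount

    def charge_job(self, user, site, cpu_hours, price_per_cpu_hour):
        cost = cpu_hours * price_per_cpu_hour
        self.credit(user, -cost)   # the user pays...
        self.credit(site, +cost)   # ...the resource owner earns
        return cost

ledger = Ledger()
ledger.credit("alice", 100.0)
ledger.charge_job("alice", "INFN-CNAF", cpu_hours=8, price_per_cpu_hour=2.5)
print(ledger.balance)  # → {'alice': 80.0, 'INFN-CNAF': 20.0}
```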

41
Grid Storage Element Interfaces
  • First version of the SE in the LCG DC
  • A disk server with GridFTP and NFS server protocols
  • Future SE version
  • SRM interface
  • Management and control
  • SRM (with possible evolution)
  • POSIX-like file I/O
  • File access
  • Open, read, write
  • Not real POSIX (e.g. rfio)

[Diagram: SE interface stack — storage management through the SRM interface; file access through a POSIX-like file I/O API over rfio, dcap, chirp and aio; backends include dCache, NeST, Castor MSS and plain disk]
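The separation the slide draws, between an SRM-like management/control layer (space allocation, file pinning) and a POSIX-like but not fully POSIX access layer, can be sketched as a small in-memory class; everything here is illustrative, not the real SRM API:

```python
# Toy Storage Element with two layers: SRM-like management/control
# (allocate space, pin files) and POSIX-like file access (write/read).

class StorageElement:
    def __init__(self, capacity_gb):
        self.free_gb = capacity_gb
        self.pinned = set()
        self.files = {}

    # --- management/control layer (SRM-like) ---
    def allocate(self, size_gb):
        if size_gb > self.free_gb:
            raise RuntimeError("no space available")
        self.free_gb -= size_gb

    def pin(self, path):
        self.pinned.add(path)  # keep the file staged on disk

    # --- file-access layer (POSIX-like open/read/write) ---
    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]

se = StorageElement(capacity_gb=100)
se.allocate(10)
se.write("/grid/vo/sample.dat", b"payload")
se.pin("/grid/vo/sample.dat")
print(se.free_gb, se.read("/grid/vo/sample.dat"))
```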
42
Data Management services
  • A Replica Location Service (RLS) is a distributed
    registry service that records the locations of
    data copies and allows discovery of replicas
  • Maintains mappings between logical identifiers
    and target names
  • Physical targets: map to the exact locations of
    replicated data
  • Logical targets: map to another layer of logical
    names, allowing storage systems to move data
    without informing the RLS
  • Provides optimization of replica access
  • RLS was designed and implemented in a
    collaboration between the Globus project and the
    DataGrid project
  • Different interfaces
  • WMS interacts with RLS to optimize job scheduling
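The two-level logical/physical mapping can be sketched as follows; the LFNs and SURLs are invented:

```python
# Sketch of the RLS two-level mapping: a logical name can map either
# directly to physical replica locations or to further logical names,
# so storage systems can move data without updating every reference.

logical = {
    "lfn:/grid/cms/run42.root": ["lfn:/grid/cms/run42.v2.root"],  # logical target
}
physical = {
    "lfn:/grid/cms/run42.v2.root": [
        "srm://se01.cnaf.infn.it/cms/run42.v2.root",
        "srm://se02.pd.infn.it/cms/run42.v2.root",
    ],
}

def resolve(lfn):
    """Follow logical aliases until physical replicas are found."""
    if lfn in physical:
        return physical[lfn]
    replicas = []
    for alias in logical.get(lfn, []):
        replicas.extend(resolve(alias))
    return replicas

print(resolve("lfn:/grid/cms/run42.root"))
```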

43
WMS-DM architecture/interaction
[Diagram: the User Interface uses the Resource Broker, which communicates with the Information Service, the Replica Location Service and the Replica Manager client (replica optimisation); the DM side draws on the Storage Element Monitor, the Storage Element and the Network Monitor]
44
General Grid services
  • General Services
  • Security
  • Login once
  • Based on Public Key Infrastructure
  • VO oriented Authentication/Authorization tools
  • VO oriented Policy and Accounting
  • Monitoring System
  • User Support System

45
User interacts with CA, VO and Resource Providers
  • Certificates are issued by a set of well-defined
    Certification Authorities (CAs).
  • Authorization is granted at the VO level.
  • Each VO has its own VOMS server.
  • It contains (group / role / capability) triples
    for each member of the VO.
  • RPs evaluate the authorization granted by the VO
    to a user and map it into local credentials to
    access resources
  • CA policies and procedures → mutual trust

[Diagram: the user sends a cert-request to the CA, which signs the certificate and publishes cert/CRL updates; the VO Manager administers user membership, roles and capabilities; agreements link CA, VO and Resource Provider; the Resource Provider maps the Grid credential into a local credential; the RB and services mediate access to resources]
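The mapping step at the Resource Provider, from VOMS (group/role/capability) triples to a local account, can be sketched in the spirit of a grid-mapfile; the groups, roles and account names below are invented:

```python
# Sketch of RP-side authorization: the VOMS attribute triples presented
# by the user are mapped to a local account. First matching rule wins.

MAPPINGS = [  # ((group prefix, role, capability), local account)
    (("/cms/higgs", "production", None), "cmsprod"),
    (("/cms", None, None), "cms001"),
]

def map_to_local(voms_triples):
    """Return the local account for the user's VOMS triples, or None."""
    for (group, role, cap), account in MAPPINGS:
        for g, r, c in voms_triples:
            if g.startswith(group) and role in (None, r) and cap in (None, c):
                return account
    return None  # not authorized at this site

user_attrs = [("/cms/higgs", "production", None)]
print(map_to_local(user_attrs))  # → cmsprod
```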
46
Near future: VO-oriented policy system PBOX
  • Policy examples
  • Users belonging to group /vo/a may only submit 10
    jobs a day.
  • Users belonging to group /vo/b should have their
    jobs submitted on the max-priority queue.
  • A given user is banned from the CNAF site.
  • Requirements: the system should
  • Be VO-based and distributed.
  • Be highly configurable and able to define and
    enforce previously unknown types of policies.
  • Leave total control on local sites to local
    admins.
  • Be capable of expressing policies requiring a
    global view of the grid.
  • Be compliant with existing protocols and not
    require their redesign.
  • Objective: help the Workload Management System in
    Grid resource selection.
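A toy policy-decision sketch in the spirit of PBOX, using the policy examples above; the data layout, DNs and function names are invented, not the actual PBOX interfaces:

```python
# Toy PDP: evaluate site-level policies (local admins keep control)
# and VO-level policies (quotas, queue overrides); the enforcement
# point would act on the returned decision.

def pdp_decide(user, vo_policies, site_policies):
    if user["dn"] in site_policies["banned"]:  # local site policy wins
        return {"allow": False, "reason": "banned at site"}
    limit = vo_policies["daily_job_limit"].get(user["group"])
    if limit is not None and user["jobs_today"] >= limit:
        return {"allow": False, "reason": "daily quota exceeded"}
    queue = vo_policies["queue_override"].get(user["group"], "normal")
    return {"allow": True, "queue": queue}

vo_policies = {
    "daily_job_limit": {"/vo/a": 10},
    "queue_override": {"/vo/b": "max-priority"},
}
site_policies = {"banned": {"/C=IT/O=Example/CN=Some User"}}

user = {"dn": "/C=IT/O=Example/CN=Alice", "group": "/vo/b", "jobs_today": 3}
print(pdp_decide(user, vo_policies, site_policies))
```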

47
PBOX
[Diagram: the user and the VO Admin interact with the VO PBOX, which holds a Policy Decision Point (PDP) and Policy Enforcement Point (PEP); the Resource Broker queries it when brokering; Policy Communication Interfaces (PCI) connect the VO PBOX to Grid or farm PBOXes, each with its own PEP and PDP, managed by the Site Admin]
48
Conclusions
  • The first generation of Grid services is ready
    for production Grids and already in use in LCG in
    the EU, Grid.IT in Italy and Grid3 in the US.
  • They are still evolving towards more
    functionality, robustness and security
  • Applications such as the LHC experiments' Data
    Challenges indicate clear directions for the
    evolution needed to satisfy those communities
  • Major basic services are still completely missing
    or lack very important functionality required by
    user communities
  • Metadata catalogues, user-defined collections,
    reliable data and metadata replication services,
    a policy framework, …
  • The next major step is now towards bringing Grid
    and Web Services together
  • The Web Services Resource Framework (WSRF) is the
    newly proposed standard for modeling stateful
    resources with Web Services
  • WS-Notification
  • Provides a publish-subscribe messaging capability
    for Web Services
  • WS-Resource Framework: transactions, …
  • However, implementations of basic WSRF services
    are still missing, and there are no measurements
    of performance
  • The LHC experiments (10^3 people) need to have a
    fully operational infrastructure in place for
    2007. We should concentrate on providing basic
    services, especially those missing