eScience

1
eScience & Grid Computing Graduate Lecture
  • 5th November 2012
  • Robin Middleton PPD/RAL/STFC
  • (Robin.Middleton@stfc.ac.uk)

I am indebted to the EGEE, EGI, LCG and
GridPP projects and to colleagues therein for
much of the material presented here.
2
eScience Graduate Lecture
A high-level look at some aspects of computing
for particle physics today
  • What is eScience, what is the Grid ?
  • Essential grid components
  • Grids in HEP
  • The wider picture
  • Summary

3
What is eScience ?
  • also e-Infrastructure, cyberinfrastructure,
    e-Research,
  • Includes
  • grid computing (e.g. WLCG, EGEE, EGI, OSG,
    TeraGrid, NGS)
  • computationally and/or data intensive; highly
    distributed over a wide area
  • digital curation
  • digital libraries
  • collaborative tools (e.g. Access Grid)
  • other areas
  • Most UK Research Councils active in e-Science
  • BBSRC
  • NERC (e.g. climate studies, NERC DataGrid -
    http://ndg.nerc.ac.uk/ )
  • ESRC (e.g. NCeSS - http://www.merc.ac.uk/ )
  • AHRC (e.g. studies in collaborative performing
    arts)
  • EPSRC (e.g. MyGrid - http://www.mygrid.org.uk/ )
  • STFC (e.g. GridPP - http://www.gridpp.ac.uk/ )

4
eScience year 2000
  • Professor Sir John Taylor, former (1999-2003)
    Director General of the UK Research Councils,
    defined eScience thus:
  • "science increasingly done through distributed
    global collaborations enabled by the internet,
    using very large data collections, terascale
    computing resources and high performance
    visualisation."
  • Further quotes from Professor Taylor:
  • "e-Science is about global collaboration in key
    areas of science, and the next generation of
    infrastructure that will enable it."
  • "e-Science will change the dynamic of the way
    science is undertaken."

5
What is Grid Computing ?
  • Grid Computing
  • term coined in the 1990s as a metaphor for making
    computer power as easy to access as the electric
    power grid (Foster & Kesselman, "The Grid:
    Blueprint for a New Computing Infrastructure")
  • combines computing resources from multiple
    administrative domains
  • CPU and storage; loosely coupled
  • serves the needs of one or more virtual
    organisations (e.g. LHC experiments)
  • different from
  • Cloud Computing (e.g. Amazon Elastic Compute
    Cloud - http://aws.amazon.com/ec2/ )
  • Volunteer Computing (SETI@home, LHC@home -
    http://boinc.berkeley.edu/projects.php )

6
Essential Grid Components
  • Middleware
  • Information System
  • Workload Management & Portals
  • Data Management
  • File transfer
  • File catalogue
  • Security
  • Virtual Organisations
  • Authentication
  • Authorisation
  • Accounting

7
Information System
  • At the heart of the Grid
  • Hierarchy of BDII (LDAP) servers
  • GLUE information schema
  • (http://www.ogf.org/documents/GFD.147.pdf)
  • LDAP (Lightweight Directory Access Protocol)
  • tree structure
  • DN: Distinguished Name
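
To make the LDAP tree structure concrete, here is a minimal Python sketch of how a Distinguished Name encodes a path through the tree. The DN shown is a hypothetical example, not taken from a real BDII, and the parser is deliberately simplified: it assumes no escaped commas inside values.

```python
def parse_dn(dn):
    """Split a DN into (attribute, value) pairs, most specific component first.

    Simplification (assumption): values contain no escaped commas.
    """
    pairs = []
    for rdn in dn.split(","):
        attr, _, value = rdn.strip().partition("=")
        pairs.append((attr, value))
    return pairs


def tree_path(dn):
    """Return the DN's components ordered from the root of the tree downwards."""
    return [f"{attr}={value}" for attr, value in reversed(parse_dn(dn))]


# Hypothetical entry: a computing element published below a site, below the grid root.
example_dn = "GlueCEUniqueID=ce01.example.org,Mds-Vo-name=EXAMPLE-SITE,o=grid"

for depth, component in enumerate(tree_path(example_dn)):
    print("  " * depth + component)
```

Reading the DN from right to left walks from the root of the directory down to the individual entry, which is why a hierarchy of servers can each publish one subtree.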

8
Workload Management System (WMS)
  • For example, composed of the following parts:
  • User Interface (UI): access point for the user
    to the WMS
  • Resource Broker (RB): the broker of Grid
    resources, responsible for finding the best
    resources to which to submit jobs
  • Job Submission Service (JSS): provides a
    reliable submission system
  • Information Index (BDII): a server (based on
    LDAP) which collects information about Grid
    resources, used by the Resource Broker to rank
    and select resources
  • Logging and Bookkeeping services (LB): store job
    information, available for users to query
  • However, you are much more likely to use a portal
    to submit work
Example JDL

    Executable = "gridTest";
    StdError = "stderr.log";
    StdOutput = "stdout.log";
    InputSandbox = {"/home/robin/test/gridTest"};
    OutputSandbox = {"stderr.log", "stdout.log"};
    InputData = "lfn:testbed0-00019";
    DataAccessProtocol = "gridftp";
    Requirements = other.Architecture == "INTEL" && \
        other.OpSys == "LINUX" && other.FreeCpus > 4;
    Rank = other.GlueHostBenchmarkSF00;
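
The Requirements and Rank attributes of the JDL drive the Resource Broker's matchmaking: Requirements filters the candidate resources, Rank orders the survivors. The Python sketch below imitates that idea only; it is not the broker's actual algorithm. The resource records are invented for illustration, and the free-CPU threshold is assumed to mean "more than four".

```python
# Hypothetical resource records, as a broker might assemble them from the
# information index. All names and numbers are made up for illustration.
resources = [
    {"name": "ce-a", "Architecture": "INTEL", "OpSys": "LINUX",
     "FreeCpus": 12, "GlueHostBenchmarkSF00": 380},
    {"name": "ce-b", "Architecture": "INTEL", "OpSys": "LINUX",
     "FreeCpus": 2, "GlueHostBenchmarkSF00": 900},
    {"name": "ce-c", "Architecture": "INTEL", "OpSys": "LINUX",
     "FreeCpus": 40, "GlueHostBenchmarkSF00": 520},
    {"name": "ce-d", "Architecture": "SPARC", "OpSys": "SOLARIS",
     "FreeCpus": 64, "GlueHostBenchmarkSF00": 700},
]


def requirements(other):
    """Mirror of the JDL Requirements expression (threshold is an assumption)."""
    return (other["Architecture"] == "INTEL"
            and other["OpSys"] == "LINUX"
            and other["FreeCpus"] > 4)


def rank(other):
    """Mirror of the JDL Rank expression: higher benchmark value is preferred."""
    return other["GlueHostBenchmarkSF00"]


def match(candidates):
    """Keep resources that satisfy the requirements, best rank first."""
    suitable = [r for r in candidates if requirements(r)]
    return sorted(suitable, key=rank, reverse=True)


print([r["name"] for r in match(resources)])
```

Note that ce-b has the highest benchmark value but is filtered out before ranking because it has too few free CPUs.
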
9
Portals - Ganga
  • Job Definition & Management
  • Implemented in Python
  • Extensible: plug-ins
  • Used by ATLAS, LHCb & non-HEP
  • http://ganga.web.cern.ch/ganga/index.php

10
Data Management
  • Storage Element (SE)
  • more than one implementation
  • all are accessed through the SRM (Storage Resource
    Manager) interface
  • DPM: Disk Pool Manager (disk only)
  • secure authentication via GSI, authorisation via
    VOMS
  • full POSIX ACL support with DN (userid) and VOMS
    groups
  • disk pool management (direct socket interface)
  • storage name space (aka storage file catalog)
  • DPM can act as a site-local replica catalog
  • SRMv1, SRMv2.1 and SRMv2.2
  • gridFTP, rfio
  • dCache (disk & tape): developed at DESY
  • ENSTORE: developed at Fermilab
  • CASTOR: developed at CERN
  • CERN Advanced STORage manager
  • HSM: Hierarchical Storage Manager
  • disk cache & tape

11
File Transfer Service
  • File Transfer Service is a data movement fabric
    service
  • multi-VO service, balance usage of site resources
    according to VO and site policies
  • uses SRM and gridFTP services of a Storage
    Element (SE)
  • Why is it needed ?
  • For the user, the service it provides is the
    reliable point to point movement of Storage URLs
    (SURLs) among Storage Elements
  • For the site manager, it provides a reliable and
    manageable way of serving file movement requests
    from their VOs
  • For the VO manager, it provides the ability to
    control requests coming from users (re-ordering,
    prioritization, ...)
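
The two ideas on this slide, re-ordering requests by policy and making point-to-point copies reliable, can be illustrated with a small Python sketch. This is not how the File Transfer Service is implemented; the class, the priority table, the SURLs and the deliberately flaky copy function are all hypothetical.

```python
import heapq
import itertools


class TransferQueue:
    """Toy transfer queue: per-VO priority ordering plus retry on failure."""

    def __init__(self, vo_priority):
        self.vo_priority = vo_priority      # lower number is served first
        self._heap = []
        self._counter = itertools.count()   # tie-breaker keeps FIFO order within a VO

    def submit(self, vo, source_surl, dest_surl):
        entry = (self.vo_priority[vo], next(self._counter), vo, source_surl, dest_surl)
        heapq.heappush(self._heap, entry)

    def run(self, copy, max_attempts=3):
        """Serve all requests in priority order, retrying failed copies."""
        results = []
        while self._heap:
            _, _, vo, src, dst = heapq.heappop(self._heap)
            ok = False
            for attempt in range(1, max_attempts + 1):
                if copy(src, dst):
                    ok = True
                    break
            results.append((vo, src, dst, ok, attempt))
        return results


def make_flaky_copy():
    """Return a stand-in copy function that fails the first attempt for each file."""
    seen = set()

    def copy(src, dst):
        if src in seen:
            return True
        seen.add(src)
        return False

    return copy


queue = TransferQueue({"atlas": 0, "lhcb": 1})
queue.submit("lhcb", "srm://se1.example.org/lhcb/f1", "srm://se2.example.org/lhcb/f1")
queue.submit("atlas", "srm://se1.example.org/atlas/f2", "srm://se2.example.org/atlas/f2")
for row in queue.run(make_flaky_copy()):
    print(row)
```

Although the LHCb request was submitted first, the ATLAS request is served first because of the (assumed) priority table, and both succeed on the second attempt.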

12
File Catalogue
  • LFC: LCG File Catalogue - a file location
    service
  • Glossary
  • LFN: Logical File Name; GUID: Global Unique
    ID; SURL: Storage URL
  • Provides a mapping from one or more LFNs to the
    physical location of a file
  • Authentication & authorisation are via a grid
    certificate
  • Provides very limited metadata: size, checksum
  • Experiments usually have a metadata catalogue
    layered above LFC
  • e.g. AMI: ATLAS Metadata Interface
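
The LFN, GUID and SURL relationship can be sketched as two lookup tables. The Python below is an illustration of the mapping only, not the LFC's schema or API; the class and method names are hypothetical, as are the file names and storage URLs.

```python
import uuid


class FileCatalogue:
    """Toy catalogue: several LFNs may share one GUID; one GUID may have several replicas."""

    def __init__(self):
        self._guid_of = {}    # LFN -> GUID
        self._replicas = {}   # GUID -> list of SURLs

    def register(self, lfn, surl):
        """Register a new file under a freshly generated GUID."""
        guid = str(uuid.uuid4())
        self._guid_of[lfn] = guid
        self._replicas[guid] = [surl]
        return guid

    def add_alias(self, existing_lfn, new_lfn):
        """Give the same file a second logical name."""
        self._guid_of[new_lfn] = self._guid_of[existing_lfn]

    def add_replica(self, lfn, surl):
        """Record another physical copy of the file."""
        self._replicas[self._guid_of[lfn]].append(surl)

    def lookup(self, lfn):
        """Return all physical locations (SURLs) for a logical name."""
        return list(self._replicas[self._guid_of[lfn]])


catalogue = FileCatalogue()
catalogue.register("/grid/example/run1.dat", "srm://se1.example.org/data/run1.dat")
catalogue.add_replica("/grid/example/run1.dat", "srm://se2.example.org/data/run1.dat")
catalogue.add_alias("/grid/example/run1.dat", "/grid/example/latest.dat")
print(catalogue.lookup("/grid/example/latest.dat"))
```

Because both logical names resolve to the same GUID, a lookup through the alias returns both replicas.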

13
Grid Security
  • Based around X.509 certificates - Public Key
    Infrastructure (PKI)
  • issued by Certificate Authorities
  • forms a hierarchy of trust
  • Glossary
  • CA: Certificate Authority
  • RA: Registration Authority
  • VA: Validation Authority
  • How it Works
  • User applies for a certificate with a public key at
    an RA
  • RA confirms the user's identity to the CA, which in
    turn issues the certificate
  • User can then digitally sign a contract using the
    new certificate
  • User identity is checked by the contracting party
    with the VA
  • VA receives information about issued certificates
    from the CA
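
The hierarchy of trust can be illustrated by walking issuer links from a user certificate up to a trusted root, with the validation step modelled as a revocation check. This Python sketch models the bookkeeping only: it performs no cryptographic signature verification, and the certificate fields, names and serial numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    subject: str
    issuer: str
    serial: int


def is_trusted(cert, issued, trusted_roots, revoked_serials):
    """Follow issuer links from cert to a trusted self-signed root.

    issued maps subject name -> Certificate. Any revoked certificate
    on the path, a missing issuer, or a loop makes the result False.
    """
    seen = set()
    current = cert
    while True:
        if current.serial in revoked_serials:
            return False
        if current.subject in trusted_roots and current.issuer == current.subject:
            return True
        if current.subject in seen:
            return False
        seen.add(current.subject)
        parent = issued.get(current.issuer)
        if parent is None:
            return False
        current = parent


root = Certificate("Example Root CA", "Example Root CA", 1)
grid_ca = Certificate("Example Grid CA", "Example Root CA", 2)
user = Certificate("CN=some user", "Example Grid CA", 3)
issued = {c.subject: c for c in (root, grid_ca, user)}

print(is_trusted(user, issued, {"Example Root CA"}, revoked_serials=set()))
print(is_trusted(user, issued, {"Example Root CA"}, revoked_serials={2}))
```

Revoking the intermediate CA's certificate invalidates every certificate issued beneath it, which is the practical consequence of the trust hierarchy.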

14
Virtual Organisations
  • Aggregation of groups (& individuals) sharing use
    of (distributed) resources to a common end under
    an agreed set of policies
  • a semi-informal structure orthogonal to normal
    institutional allegiances
  • e.g. A HEP Experiment
  • Grid Policies
  • Acceptable use; Grid Security; New VO
    registration
  • http://proj-lcg-security.web.cern.ch/proj-lcg-security/security_policy.html
  • VO specific environment
  • experiment libraries, databases, ...
  • resource sites declare which VOs they will support

15
Security - The Three As
  • Authentication
  • verifying that you are who you say you are
  • your Grid Certificate is your passport
  • Authorisation
  • knowing who you are, validating what you are
    permitted to do
  • e.g. submit analysis jobs as a member of LHCb
  • e.g. VO capability to manage production software
  • Accounting (auditing)
  • local logging of what you have done (your jobs!)
  • aggregated into grid-wide repository
  • provides
  • usage statistics
  • information source in event of security incident
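
The three steps can be shown in sequence with a short Python sketch. It assumes the certificate itself has already been verified, so authentication is reduced to recognising the DN. The membership table, role names, actions and DNs are all hypothetical.

```python
from collections import Counter

# Hypothetical VO membership: certificate DN -> (VO, role).
MEMBERS = {
    "/C=UK/O=eScience/CN=alice": ("lhcb", "user"),
    "/C=UK/O=eScience/CN=bob": ("lhcb", "production"),
}

# Hypothetical policy: which actions each role may perform.
PERMITTED = {
    "user": {"submit_analysis"},
    "production": {"submit_analysis", "install_software"},
}

audit_log = []


def request(dn, action):
    """Authenticate, authorise and account for one request; return True if allowed."""
    membership = MEMBERS.get(dn)                      # authentication: is this DN known?
    allowed = (membership is not None
               and action in PERMITTED[membership[1]])  # authorisation: may the role do this?
    audit_log.append((dn, action, allowed))           # accounting: record every attempt
    return allowed


def usage_by_dn(log):
    """Usage statistics: number of permitted actions per DN."""
    return Counter(dn for dn, _, allowed in log if allowed)


print(request("/C=UK/O=eScience/CN=alice", "submit_analysis"))
print(request("/C=UK/O=eScience/CN=alice", "install_software"))
print(request("/C=UK/O=eScience/CN=mallory", "submit_analysis"))
print(usage_by_dn(audit_log))
```

Refused attempts are logged as well as permitted ones, since the log doubles as the information source after a security incident.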

16
Grids in HEP
  • LCG, EGEE & EGI Projects
  • GridPP
  • The LHC Computing Grid
  • Tiers 0,1,2
  • The LHC OPN
  • Experiment Computing Models
  • Typical data access patterns
  • Monitoring
  • Resource provider's view
  • VO view
  • End-user view

17
LCG → EGEE → EGI
  • LCG: LHC Computing Grid
  • Distributed Production Environment for Physics
    Data Processing
  • World's largest production computing grid
  • In 2011: >250,000 CPU cores, 15 PB/yr, 8000
    physicists, 500 institutes
  • EGEE: Enabling Grids for E-sciencE
  • Starts from LCG infrastructure
  • Production Grid in 27 countries
  • HEP, BioMed, CompChem, Earth Science, ...
  • EU Support

18
GridPP
  • Phase 1: 2001-2004
  • Prototype (Tier-1)
  • Phase 2: 2004-2008
  • From Prototype to Production
  • Production (Tier-1 & 2)
  • Phase 3: 2008-2011
  • From Production to Exploitation
  • Reconstruction, Monte Carlo, Analysis
  • Phase 4: 2011-2014
  • routine operation during LHC running
  • Integrated within the LCG/EGI framework
  • UK Service Operations (LCG/EGI)
  • Tier-1 & Tier-2s
  • HEP Experiments
  • @ LHC, FNAL, SLAC
  • GANGA (LHCb & ATLAS)
  • Working with NGS & informing the UK NGI for EGI

Tier-1 Farm Usage
19
LCG: The LHC Computing Grid
  • Worldwide LHC Computing Grid -
    http://lcg.web.cern.ch/lcg/
  • Framework to deliver distributed computing for
    the LHC experiments
  • Middleware / Deployment
  • (Service/Data Challenges)
  • Security (operations & policy)
  • Applications (Experiment) Software
  • Distributed Analysis
  • Private Optical Network
  • Experiments → Resources → MoUs
  • Coverage
  • Europe → EGI
  • USA → OSG
  • Asia → Naregi, Taipei, China
  • Other

20
LHC Computing Model
The LHC Computing Centre
CERN Tier 0
21
LHCOPN: Optical Private Network
  • Principal means to distribute LHC data
  • Primarily linking Tier-0 and Tier-1s
  • Some Tier-1 to Tier-1 Traffic
  • Runs over leased lines
  • Some resilience
  • Mostly based on 10 Gigabit technology
  • Reflects Tier architecture

22
LHC Experiment Computing Models
  • General (ignoring experiment specifics)
  • Tier-0 (@ CERN)
  • 1st pass reconstruction (including initial
    calibration)
  • RAW data storage
  • Tier-1
  • Re-processing & some centrally organised analysis
  • Custodial copy of RAW data, some ESD, all AOD,
    some SIMU
  • Tier-2
  • (chaotic) user analysis & simulation
  • some AOD (depends on local requirements)
  • Event sizes determine disk buffers at experiments
    & Tier-0
  • Event datasets
  • formats (RAW, ESD, AOD, etc)
  • (adaptive) placement (near analysis) & replicas
  • Data streams: physics specific, debug,
    diagnostic, express, calibration
  • CPU & storage requirements
  • Simulation

23
Typical Data Access Patterns
Typical LHC particle physics experiment: one year
of acquisition and analysis of data
  • Access rates (aggregate, average)
  • 100 Mbytes/s (2-5 physicists)
  • 500 Mbytes/s (5-10 physicists)
  • 1000 Mbytes/s (50 physicists)
  • 2000 Mbytes/s (150 physicists)
  • Data volumes
  • Raw Data: 1000 Tbytes
  • Reco-V1, Reco-V2: 1000 Tbytes each
  • ESD-V1.1, ESD-V1.2, ESD-V2.1, ESD-V2.2: 100
    Tbytes each
  • AOD: 9 sets of 10 TB each
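
As a worked check on the scale implied by this slide, the Python sketch below adds up the listed volumes (one raw set, two reconstruction passes, four ESD versions, nine AOD sets) and estimates how long one pass over a dataset takes at a given aggregate rate. Decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) are an assumption, and pairing the 1000 TB figure with the 100 Mbytes/s rate is purely illustrative.

```python
# (number of copies, size of each in TB), as listed on the slide.
volumes_tb = {
    "Raw Data": (1, 1000),
    "Reco": (2, 1000),
    "ESD": (4, 100),
    "AOD": (9, 10),
}

total_tb = sum(count * size for count, size in volumes_tb.values())
print(f"Total for one year: {total_tb} TB")


def days_to_read(size_tb, rate_mb_per_s):
    """Days needed to stream size_tb terabytes at rate_mb_per_s (decimal units assumed)."""
    seconds = (size_tb * 1e12) / (rate_mb_per_s * 1e6)
    return seconds / 86400


# Illustrative pairing only: a 1000 TB dataset read at 100 Mbytes/s.
print(f"{days_to_read(1000, 100):.1f} days")
```

The total comes to 3490 TB, and a single pass over 1000 TB at 100 Mbytes/s takes roughly 116 days, which is why the smaller derived formats are the ones accessed by many physicists.
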
24
Monitoring
  • A resource provider's view

25
Monitoring
  • Virtual Organisation specifics

26
Monitoring - Dashboards
  • Virtual Organisation view
  • e.g. ATLAS dashboard

27
Monitoring - Dashboards
  • For the end user
  • available through dashboard

28
The wider Picture
  • What some other communities do with Grids
  • The ESFRI projects
  • Virtual Instruments
  • Digital Curation
  • Clouds
  • Volunteer Computing
  • Virtualisation

29
What are other communities doing with grids ?
  • Astronomy & Astrophysics
  • large-scale data acquisition, simulation, data
    storage/retrieval
  • Computational Chemistry
  • use of software packages (incl. commercial) on
    EGEE
  • Earth Sciences
  • Seismology, Atmospheric modeling, Meteorology,
    Flood forecasting, Pollution
  • Fusion (build up to ITER)
  • Ion Kinetic Transport, Massive Ray Tracing,
    Stellarator Optimization.
  • Computer Science
  • collect data on Grid behaviour (Grid Observatory)
  • High Energy Physics
  • four LHC experiments, BaBar, D0, CDF, Lattice
    QCD, Geant4, SixTrack, ...
  • Life Sciences
  • Medical Imaging, Bioinformatics, Drug discovery
  • WISDOM: drug discovery for neglected / emergent
    diseases (malaria, H5N1, ...)

30
ESFRI Projects (European Strategy Forum on
Research Infrastructures)
  • Many are starting to look at their e-Science
    needs
  • some at a similar scale to the LHC (petascale)
  • project design study stage
  • http://cordis.europa.eu/esfri/

Cherenkov Telescope Array
31
Virtual Instruments
  • Integration of scientific instruments into the
    Grid
  • remote operation, monitoring, scheduling,
    sharing
  • GridCC - Grid enabled Remote Instrumentation with
    Distributed Control and Computation
  • CR: build workflows to monitor & control remote
    instruments in real time
  • CE, SE, ES, IS & SS: as in a normal grid
  • Monitoring services
  • Instrument Element (IE)
  • interfaces for remote control & monitoring
  • CMS run control includes an IE but not really
    exploited (yet)!
  • DORII: Deployment Of Remote Instrumentation
    Infrastructure
  • Consolidation of GridCC with EGEE,
    g-Eclipse, Open MPI, VLab
  • The Liverpool Telescope - robotic
  • not just remote control, but fully autonomous
  • scheduler operates on basis of observing database
  • (http://telescope.livjm.ac.uk/)

32
Digital Curation
  • Preservation of digital research data for future
    use
  • Issues
  • media; data formats; metadata; data management
    tools; reading (FORTRAN); ...
  • digital curation lifecycle -
    http://www.dcc.ac.uk/digital-curation/what-digital-curation
  • Digital Curation Centre - http://www.dcc.ac.uk/
  • NOT a repository !
  • strategic leadership
  • influence national (international) policy
  • expert advice for both users and funders
  • maintains suite of resources and tools
  • raise levels of awareness and expertise

33
JADE (1978-86)
  • New results from old data
  • new & improved theoretical calculations & MC
    models; optimised observables
  • better understanding of Standard Model (top, W,
    Z)
  • re-do measurements: better precision, better
    systematics
  • new measurements, but at (lower) energies not
    available today
  • new phenomena: check at lower energies
  • Challenges
  • rescue data from (very) old media; resurrect old
    software; data management; implement modern
    analysis techniques
  • but luminosity files were lost and only recovered
    from an ASCII printout found in an office cleanup
  • Since 1996
  • 10 publications (as recent as 2009)
  • 10 conference contributions
  • a few PhD theses
  • (ack: S. Bethke)

34
What is HEP doing about it ?
  • ICFA Study Group on Data Preservation and Long
    Term Analysis in High Energy Physics -
    https://www.dphep.org/
  • 5 workshops so far; intermediate report to ICFA
  • Available at arXiv:0912.0255
  • Initial recommendations: December 2009
  • Blueprint for Data Preservation in High Energy
    Physics to follow

35
Grids, Clouds, Supercomputers, ...
(Ack: Bob Jones, former EGEE Project Director)
  • Grids
  • Collaborative environment
  • Distributed resources (political/sociological)
  • Commodity hardware (also supercomputers)
  • (HEP) data management
  • Complex interfaces (bug not feature)
  • Supercomputers
  • Expensive
  • Low latency interconnects
  • Applications peer reviewed
  • Parallel/coupled applications
  • Traditional interfaces (login)
  • Also supercomputer grids (DEISA, TeraGrid)
  • Clouds
  • Proprietary (implementation)
  • Economies of scale in management
  • Commodity hardware
  • Virtualisation for service provision and
    encapsulating application environment
  • Details of physical resources hidden
  • Simple interfaces (too simple?)
  • Volunteer computing
  • Simple mechanism to access millions of CPUs
  • Difficult if (much) data involved
  • Control of environment → check
  • Community building: people involved in science
  • Potential for huge amounts of real work

36
Clouds / Volunteer Computing
  • Clouds are largely commercial
  • Pay for use
  • Interfaces from grids exist
  • absorb peak demands (e.g. before a conference!)
  • CernVM images exist
  • Volunteer Computing
  • LHC@Home
  • SixTrack: study particle orbit stability in
    accelerators
  • Garfield: study behaviour of gas-based detectors

37
Virtualisation
  • Virtual implementation of a resource, e.g. a
    hardware platform
  • a current buzzword, but not new: IBM launched
    VM/370 in 1972!
  • Hardware virtualisation
  • one or more virtual machines running an operating
    system within a host system
  • e.g. run Linux (guest) in a virtual machine (VM)
    with Microsoft Windows (host)
  • independent of hardware platform: migration
    between (different) platforms
  • run multiple instances on one box: provides
    isolation (e.g. against rogue s/w)
  • Hardware-assisted virtualisation
  • not all machine instructions are virtualisable
    (e.g. some privileged instructions)
  • h/w-assist traps such instructions and provides
    hardware emulation of them
  • Implementations
  • Xen, VMware, VirtualBox, Microsoft Virtual PC, ...
  • Interest to HEP ?
  • the above + opportunity to tailor to experiment
    needs (e.g. libraries, environment)
  • CernVM: CERN specific Linux environment -
    http://cernvm.cern.ch/portal/
  • CernVM-FS: network filesystem to access
    experiment specific software
  • Security: certificate to assure origin/validity
    of VM

38
Summary
  • What is eScience about and what are Grids
  • Essential components of a Grid
  • middleware
  • virtual organisations
  • Grids in HEP
  • LHC Computing Grid
  • A look outside HEP
  • examples of what others are doing