Antun Balaz, antun.balaz@scl.rs - PowerPoint PPT Presentation

1
High Performance Cluster and Grid Computing
  • Antun Balaz, antun.balaz@scl.rs
  • Scientific Computing Laboratory
  • Institute of Physics Belgrade
  • Serbia

Introduction to High Performance and Grid
Computing
2
Overview
  • Introduction to clusters
  • High performance computing
  • Grid computing paradigm
  • Ingredients for Grid development
  • Introduction to Grid middleware

3
Parallel computing
  • Splitting a problem into smaller tasks that are
    executed concurrently
  • Why?
  • Absolute physical limits of hardware components
    (speed of light, electron speed, ...)
  • Economic reasons: more complex means more expensive
  • Performance limits: doubling the frequency does not
    double the performance
  • Large applications demand too much memory and time
  • Advantages: increased speed, optimized
    resource utilization
  • Disadvantages: complex programming models,
    difficult development
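
A minimal sketch of the idea on this slide, assuming a toy problem (a sum of squares) that is not in the original talk. The range is split into smaller tasks, the tasks are executed concurrently, and the partial results are combined. The helper names `split_range` and `partial_sum_of_squares` are hypothetical. A thread pool is used only to keep the sketch self-contained; in CPython, CPU-bound work would normally go to a process pool or to MPI ranks to gain real speed.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, parts):
    """Split [0, n) into at most `parts` contiguous (start, stop) tasks."""
    step = -(-n // parts)  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def partial_sum_of_squares(bounds):
    """One small task: sum of squares over a sub-range."""
    start, stop = bounds
    return sum(k * k for k in range(start, stop))


N = 100_000
tasks = split_range(N, 4)

# Execute the tasks concurrently and combine the partial results.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_result = sum(pool.map(partial_sum_of_squares, tasks))

# Reference: the same computation done serially.
serial_result = sum(k * k for k in range(N))
print(parallel_result == serial_result)
```

The decomposition step (`split_range`) and the combination step (`sum`) are where the programming complexity named under "Disadvantages" tends to appear in real applications.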

4
Parallelism levels
  • CPU
  • Multiple CPUs
  • Multiple CPU cores
  • Threads (time sharing)
  • Memory
  • Shared
  • Distributed
  • Hybrid (virtual shared memory)
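
The two memory models listed above can be contrasted with a small sketch. It is an illustration only: both halves run as threads in one process, and the "distributed" half merely emulates message passing with a queue (real distributed memory would involve separate processes or nodes, for example via MPI). All names are hypothetical.

```python
import queue
import threading

DATA = list(range(1, 101))
CHUNKS = [DATA[i::4] for i in range(4)]  # four interleaved slices

# Shared memory: every worker updates the same variable, guarded by a lock.
shared = {"total": 0}
lock = threading.Lock()


def shared_worker(chunk):
    partial = sum(chunk)
    with lock:  # protect the shared state from concurrent updates
        shared["total"] += partial


workers = [threading.Thread(target=shared_worker, args=(c,)) for c in CHUNKS]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Distributed-memory style: workers own private data and only send messages.
mailbox = queue.Queue()


def message_worker(chunk, outbox):
    outbox.put(sum(chunk))  # the only communication is an explicit message


workers = [threading.Thread(target=message_worker, args=(c, mailbox))
           for c in CHUNKS]
for w in workers:
    w.start()
for w in workers:
    w.join()
message_total = sum(mailbox.get() for _ in CHUNKS)

print(shared["total"], message_total)
```

In the shared variant the difficulty is synchronization (the lock); in the message-passing variant it is deciding what to send and when.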

5
Parallel architectures (1)
  • Vector machines
  • CPU processes multiple data sets
  • shared memory
  • advantages: performance, ease of programming
  • issues: scalability, price
  • examples: Cray SV, NEC SX, Athlon/3DNow!,
    Pentium IV/SSE/SSE2
  • Massively parallel processors (MPP)
  • large number of CPUs
  • distributed memory
  • advantages: scalability, price
  • issues: performance, programming difficulty
  • examples: Connection Machine CM-1 and CM-2, GAPP
    (Geometric Arithmetic Parallel Processor)

6
Parallel architectures (2)
  • Symmetric Multiprocessing (SMP)
  • two or more processors
  • shared memory
  • advantages: price, performance, ease of
    programming
  • issues: scalability
  • examples: UltraSPARC II, Alpha ES, generic
    Itanium, Opteron, Xeon, ...
  • Non-Uniform Memory Access (NUMA)
  • Solves the scalability issue of SMP
  • hybrid memory model
  • advantages: scalability
  • issues: price, performance, programming
    difficulty
  • examples: SGI Origin/Altix, Alpha GS, HP
    Superdome

7
Clusters
  • Poor man's supercomputer: "Collection of
    interconnected stand-alone computers working
    together as a single, integrated computing
    resource" (R. Buyya)
  • Cluster consists of
  • Nodes
  • Network
  • OS
  • Cluster middleware
  • Standard components
  • Avoiding expensive proprietary components

8
Cluster classification
  • High performance clusters (HPC)
  • Parallel, tightly coupled applications
  • High throughput clusters (HTC)
  • Large number of independent tasks
  • High availability clusters (HA)
  • Mission critical applications
  • Load balancing clusters
  • Web servers, mail servers, ...
  • Hybrid clusters
  • Example: HPC + HA

9
Beowulf clusters
  • 1994
  • T. Sterling and D. Becker
  • NASA Goddard Space Flight Center
  • Frontend
  • Access machine
  • Job management system (JMS) and monitoring server
  • Shared storage: NFS (directory /home)
  • Nodes
  • Multiple private networks
  • Local storage (/scratch)
  • Private networks
  • High speed / low latency

10
From clusters to Grids
  • Many distributed computing resources (clusters)
    exist, even in Serbia
  • Problem 1: they cannot be used by end users
    transparently
  • Problem 2: even when users are granted access
    to several clusters, they tend to neglect the
    smaller ones
  • Problem 3: distribution of input/output data,
    sharing of data between clusters
  • To overcome such problems, the Grid paradigm was
    introduced

11
Unifying concept: Grid
Resource sharing and coordinated problem solving
in dynamic, multi-institutional virtual
organizations.
12
Effective policy governing access within a
collaboration
13
What problems Grid addresses
  • Too hard to keep track of authentication data
    (ID/password) across institutions
  • Too hard to monitor system and application status
    across institutions
  • Too many ways to submit jobs
  • Too many ways to store and access files/data
  • Too many ways to keep track of data
  • Too easy to leave dangling resources lying
    around (robustness)

14
Requirements
  • Security
  • Monitoring/Discovery
  • Computing/Processing Power
  • Moving and Managing Data
  • Managing Systems
  • System Packaging/Distribution
  • Secure, reliable, on-demand access to data,
    software, people, and other resources (ideally
    all via a Web Browser!)

15
Why Grid security is hard (1)
  • Resources being used may be valuable and the
    problems being solved sensitive
  • Both users and resources need to be careful
  • Dynamic formation and management of user groups
  • Large, dynamic, unpredictable
  • Resources and users are often located in distinct
    administrative domains: cannot assume
    cross-organizational trust agreements
  • Different mechanisms and credentials

16
Why Grid security is hard (2)
  • Interactions are not just client/server, but
    service-to-service on behalf of the user
  • Requires delegation of rights from user to service
  • Services may be dynamically instantiated
  • Standardization of interfaces to allow for
    discovery, negotiation and use
  • Implementation must be broadly available and
    applicable
  • Standard, well-tested, well-understood
    protocols, integrated with a wide variety of tools
  • Policies from sites, user communities and users
    need to be combined
  • Varying formats
  • Want to hide as much as possible from
    applications!

17
Grids and VOs (1)
  • Virtual organizations (VOs) are groups of Grid
    users (authenticated through digital
    certificates)
  • VO Management Service (VOMS) serves as a central
    repository for user authorization information,
    providing support for sorting users into a
    general group hierarchy, keeping track of their
    roles, etc.
  • VO Manager, according to VO policies and rules,
    authorizes authenticated users to become VO
    members

18
Grids and VOs (2)
  • Resource centers (RCs) may support one or more
    VOs, and this is how users are authorized to use
    computing, storage and other Grid resources
  • VOMS allows a flexible approach to authentication
    and authorization (AA) on the Grid
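
The authorization rule described on these two slides can be sketched as follows. This is a toy model, not the VOMS API: the classes, the function `is_authorized`, and the example certificate subject, VO and site names are all hypothetical. It only encodes the stated logic that a user may use a resource center when the user is a member of a VO that the resource center supports.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganization:
    """Toy VO: maps certificate subjects to the roles they hold."""
    name: str
    members: dict = field(default_factory=dict)

    def add_member(self, subject, roles=()):
        # In the slides this decision is taken by the VO Manager.
        self.members[subject] = set(roles)


@dataclass
class ResourceCenter:
    """Toy RC: knows only which VOs it supports."""
    name: str
    supported_vos: set = field(default_factory=set)


def is_authorized(subject, vo, rc):
    """A user may use an RC if the RC supports a VO the user belongs to."""
    return vo.name in rc.supported_vos and subject in vo.members


# Hypothetical example data.
physics_vo = VirtualOrganization("example-physics-vo")
physics_vo.add_member("/C=XX/O=Example/CN=Alice", roles=["user"])

site_a = ResourceCenter("site-a", supported_vos={"example-physics-vo"})
site_b = ResourceCenter("site-b", supported_vos={"another-vo"})

alice_on_a = is_authorized("/C=XX/O=Example/CN=Alice", physics_vo, site_a)
alice_on_b = is_authorized("/C=XX/O=Example/CN=Alice", physics_vo, site_b)
bob_on_a = is_authorized("/C=XX/O=Example/CN=Bob", physics_vo, site_a)
print(alice_on_a, alice_on_b, bob_on_a)
```

Keeping membership in the VO and support decisions in the resource center means neither side needs a per-user agreement with the other.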

19
User view of the Grid
  • User Interface
  • Grid services

20
Ingredients for GRID development
  • The right balance of push and pull factors is needed
  • Supply side
  • Technology: inexpensive HPC resources (Linux
    clusters)
  • Technology: network infrastructure
  • Financing: domestic, regional, EU, donations
    from industry
  • Demand side
  • Need for novel eScience applications
  • Hunger for number crunching power and storage
    capacity

21
Supply side - clusters
  • The cheapest supercomputers: massively parallel
    PC clusters
  • This is possible due to
  • Increase in PC processor speed (> Gflop/s)
  • Increase in networking performance (1 Gb/s)
  • Availability of stable OS (e.g. Linux)
  • Availability of standard parallel libraries (e.g.
    MPI)
  • Advantages
  • Widespread choice of components/vendors, low
    price (by factor 5-10)
  • Long warranty periods, easy servicing
  • Simple upgrade path
  • Disadvantages
  • Good knowledge of parallel programming is
    required
  • Hardware needs to be adjusted to the specific
    application (network topology)
  • More complex administration
  • Trade-off: brain power vs. purchasing power
  • The next step is GRID
  • Distributed computing, computing on demand
  • Should do for computing the same as the Internet
    did for information (UK PM, 2002)

22
Supply side - network
  • Needed at all scales
  • World-wide
  • Pan-European (GEANT2)
  • Regional (SEEREN2, ...)
  • National (NREN)
  • Campus-wide (WAN)
  • Building-wide (LAN)
  • Remember: it is the end-user-to-end-user
    connection that matters

23
GÉANT2: Pan-European IP R&E network
24
GÉANT2 Global Connectivity
25
Future development: regional network
26
Supply side - financing
  • National funding (Ministries responsible for
    research)
  • Lobby government to commit to Lisbon targets
  • Level of financing should follow an
    increasing trend (as a % of GDP)
  • Seek financing for clusters and network costs
  • Bilateral projects and donations
  • Regional initiatives
  • Networking (HIPERB)
  • Action Plan for R&D in SEE
  • EU funding
  • FP6: IST priority, eInfrastructures and GRIDs
  • FP7
  • CARDS
  • Other international sources (NATO, ...)
  • Donations from industry (HP, SUN, ...)

27
Demand side - eScience
  • Usage of computers in science
  • Trivial: text editing, elementary visualization,
    elementary quadrature, special functions, ...
  • Nontrivial: differential eq., large linear
    systems, searching combinatorial spaces, symbolic
    algebraic manipulations, statistical data
    analysis, visualization, ...
  • Advanced: stochastic simulations, risk
    assessment in complex systems, dynamics of
    systems with many degrees of freedom, PDE
    solving, calculation of partition
    functions/functional integrals, ...
  • Why is the use of computation in science growing?
  • Computational resources are more and more
    powerful and available (Moore's law)
  • Standard approaches are having problems:
    experiments are more costly, theory more difficult
  • Emergence of new fields/consumers: finance,
    economy, biology, sociology
  • Emergence of new problems with unprecedented
    storage and/or processor requirements

28
Demand side - consumers
  • Those who study
  • Complex discrete time phenomena
  • Nontrivial combinatorial spaces
  • Classical many-body systems
  • Stress/strain analysis, crack propagation
  • Schrödinger eq., diffusion eq.
  • Navier-Stokes eq. and its derivatives
  • Functional integrals
  • Decision-making processes with incomplete
    information
  • Who can deliver? Those with
  • Adequate training in mathematics/informatics
  • Stamina needed for complex problem solving
  • Answer: rocket scientists (natural sciences and
    engineering)

29
Scenario
[Diagram: job submission scenario. Labels: User interface, Input sandbox, Output sandbox, Computing Element, Logging and bookkeeping, publish state. Job executable: /bin/hostname. Output files: stdout.txt, stderr.txt.]
A worker node is allocated by the local jobmanager
  • Standard input stream is read from a file
  • Standard output and error streams are redirected
    into files
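
The stream handling in this scenario can be imitated locally. The sketch below is not Grid middleware; it only shows, under stated assumptions, what "standard input is read from a file, standard output and error are redirected into files" means for a single job. The function `run_job` and the file name `stdin.txt` are hypothetical, and a small Python one-liner stands in for /bin/hostname so that the example also runs on systems without that binary.

```python
import socket
import subprocess
import sys
import tempfile
from pathlib import Path


def run_job(command, workdir, stdin_name="stdin.txt",
            stdout_name="stdout.txt", stderr_name="stderr.txt"):
    """Run `command` in `workdir` with its standard streams bound to files."""
    workdir = Path(workdir)
    stdin_path = workdir / stdin_name
    stdin_path.touch(exist_ok=True)  # empty input if nothing was staged in
    with open(stdin_path, "rb") as fin, \
            open(workdir / stdout_name, "wb") as fout, \
            open(workdir / stderr_name, "wb") as ferr:
        result = subprocess.run(command, cwd=workdir,
                                stdin=fin, stdout=fout, stderr=ferr)
    return result.returncode


# Assumption: portable stand-in for the /bin/hostname job from the slide.
HOSTNAME_JOB = [sys.executable, "-c",
                "import socket; print(socket.gethostname())"]

# The temporary directory plays the role of the job's working area.
with tempfile.TemporaryDirectory() as sandbox:
    exit_code = run_job(HOSTNAME_JOB, sandbox)
    job_stdout = (Path(sandbox) / "stdout.txt").read_text().strip()

print(exit_code, job_stdout)
```

On a real Grid the two output files would travel back to the user in the output sandbox; here they are simply read before the temporary directory is removed.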