Grid Computing: an introduction - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
Grid Computing: an introduction
  • José C. Cunha, DI-FCT/UNL

2
Distributed and Parallel Computing
  • Distributed Computing
  • Parallel Computing
  • Grid Computing

3
(No Transcript)
4
Distributed Computing
  • Physically distributed computations and data
  • Goals
  • Adapt to geographical application distribution
  • Provide appropriate levels of transparency
  • Geographical distribution (LAN or WAN)
  • Users / Access / Processing / Archiving Sites
  • Availability and Reliability
  • Fault tolerance / Redundancy

5
Transparency
  • Depends on the layer
  • Failure
  • Communication (message, RPC, memory)
  • Design choices can be revised
  • Interactions: events, uncertainty, causality
  • Loose / tight interactions / collaboration
  • Pessimistic / optimistic choices (DBs)
  • Sometimes there is no choice
  • mobility, disconnected operation

6
Transparency and Virtualisation
  • Transparency and Awareness
  • The concept of transparency has been revised as
    time passes
  • Raw hardware, Assembly, High-Level Languages,
    etc....Operating Systems,...., Text editors and
    processing tools....
  • The Grid is one of the current revisions....

7
(No Transcript)
8
Parallel Computing
  • Goal: to reduce execution time, compared to
    sequential execution
  • Computer System Architectures
  • Supercomputers
  • Shared / Distributed memory multiprocessors
  • LANs and Clusters of PCs
  • Parallel Programming requires:
  • Decompose the application into parts
  • Launch tasks in parallel processes
  • Plan the cooperation between tasks
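The three steps above can be sketched in a short, illustrative Python example (all names are ours, not from the presentation): decompose the work, launch tasks in parallel processes, and plan their cooperation via a final reduction.

```python
from multiprocessing import Pool

def partial_sum(chunk):
    # One task: work on its part of the decomposed data
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, n_tasks=4):
    # 1. Decompose the application (here: the data) into parts
    chunks = [data[i::n_tasks] for i in range(n_tasks)]
    # 2. Launch tasks in parallel processes
    with Pool(n_tasks) as pool:
        partials = pool.map(partial_sum, chunks)
    # 3. Plan the cooperation between tasks (here: a simple reduction)
    return sum(partials)

if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1000))))
```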

9
(No Transcript)
10
In 2006... there are increased reasons to exploit
parallelism
11
Classical Application Areas
  • Science and Engineering
  • Fluid Dynamics
  • Particle Systems in Physics
  • Weather Forecast and Climate
  • Simulation of VLSI systems
  • Parallel Search in Databases
  • Artificial Intelligence
  • ...

12
Great Application Successes
The development of scalable massively parallel
computers was motivated largely by a set of Grand
Challenge applications (courtesy Prof. David
Walker)
  • Climate modelling to understand the Earth's
    climate and predict future changes
  • Computational fluid dynamics to design aerospace
    vehicles and cars
  • Numerical turbulence to develop realistic fluid
    and particle simulations of plasma turbulence to
    optimise performance of fusion devices
  • Rational drug design to discover / design new
    drugs with simulations of macro-molecular
    structure

13
  • But the application profiles have changed

14
Evolution of Application Characteristics
  • Complex models / simulations
  • Large volumes of input / generated data
  • Difficult interpretation and classification
  • High degree of User interaction
  • Offline / online data processing / visualisation
  • Distinct user interfaces
  • Computational steering
  • Multidisciplinary
  • Heterogeneous models / components
  • Interactions among multiple users /
    collaboration
  • Require parallel and distributed processing

15
Heterogeneous Components
  • Sequential, Parallel, Distributed Problem Solvers
    (simulators, mathematical packages, etc.)
  • Tools for data / result processing,
    interpretation and visualisation
  • Online access to scientific data sets and
    databases
  • Interactive (online) computational steering

16
(No Transcript)
17
Ambitious application requirements
  • Distinct operation modes (offline/online)
  • Distinct user interfaces
  • User / Agent driven control
  • Dynamic modification of operation modes and
    interactions
  • Multiple users concurrently join ongoing
    experiments with distinct roles (observers,
    controllers)

18
Complex cycle of user activities
  • Problem specification
  • Configuration of the environment
  • Component selection (simulation, control,
    visualisation) and configuration
  • Component activation and mapping
  • Initial set up of simulation parameters
  • Start of execution, possibly with monitoring,
    visualisation and steering
  • Analysis of intermediate / final results

19
  • Requirements
  • To meet more complex applications
  • To ease the cycle of application development,
    deployment and control
  • To integrate heterogeneous components into an
    environment
  • To allow transparent access to parallel and
    distributed resources
  • To support collaborative modeling and simulation

20
Problem-solving perspective
  • Integrated environments for solving a class of
    related problems in an application domain

21
Problem-Solving Environments
  • A different approach:
  • specific methods for each problem domain are
    encapsulated in components (libraries, packages,
    OO class repositories)
  • development and runtime support tools are also
    made available
  • Application components and computational tools
    are integrated into a single unified environment
    (PSE)
  • Easy to use by the end-user

22
PSE Functionalities
  • Support for problem specification
  • Resource management
  • Execution support services

23
PSEs Users and Developers
  • End-users (scientists, engineers, etc.)
  • Solve a particular problem in a specific
    application domain
  • Perform experiments
  • PSE Developers
  • Develop new algorithms and techniques
  • Integrate them into components and place them in
    component repositories
  • Develop tools to support problem specification
    and application composition
  • Develop tools to help the user choose the best
    solutions and to locate the resources
  • Use the services and interfaces built by the
    System Developers

24
PSEs Problem Specification / Solution
  • Use either
  • Visual programming environment to link software
    components
  • High-level language specification
  • Recommender systems can be used to help the user
    choose the best way to solve the problem and
    locate the required software
  • They are very important to enable use of a
    complex computing environment

25
Software Components
  • Components specified in terms of their input and
    output interfaces
  • User ignores internal details of components
  • Components can be interconnected within a visual
    development environment

26
Plug and Play Components (courtesy Prof. David
Walker)
  • Can link the output of one component to the input
    of another.
  • Store components in a repository.

See Triana for an example: http://triana.co.uk/
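The plug-and-play idea above can be sketched in a few lines; the `Component` class, `link` helper and repository entries are illustrative assumptions, not Triana's actual API. Components declare how they are run, and the output of one is linked to the input of the next.

```python
class Component:
    """A reusable software component with an input -> output interface."""
    def __init__(self, name, func):
        self.name = name
        self.func = func

    def run(self, value):
        return self.func(value)

def link(*components):
    """Connect components: the output of one feeds the input of the next."""
    def pipeline(value):
        for c in components:
            value = c.run(value)
        return value
    return pipeline

# A toy component repository
repository = {
    "double": Component("double", lambda x: 2 * x),
    "inc": Component("inc", lambda x: x + 1),
}

workflow = link(repository["double"], repository["inc"])
print(workflow(10))  # double(10) -> 20, inc(20) -> 21
```

The user never sees the components' internals, only their interfaces, which is the point made on the previous slide.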
27
An Example
28
Impact of PSEs in many areas (1990-1999-2000...)
  • Fully developed PSEs in the Industry, e.g.
    Automotive, Aerospace
  • Many applications in Science and Engineering
  • Design optimisation
  • Application behavior studies (parameter sweeping)
  • Rapid prototyping
  • Decision support
  • Process control
  • Emerging areas: Education, Environment, Health,
    Finance
  • A new profile of end-user, beyond the scientist
    and engineer

29
(No Transcript)
30
Computer technology advances
31
Storage evolution (Carl Kesselman)
  • Storage density doubles every 12 months
  • Dramatic growth in online data (1 petabyte = 1000
    terabytes = 1,000,000 gigabytes)
  • 2000: 0.5 petabyte
  • 2005: 10 petabytes
  • 2010: 100 petabytes
  • 2015: 1000 petabytes?
  • Transforming entire disciplines in the physical
    and, increasingly, biological sciences etc.

32
Networks evolution (Carl Kesselman)
  • Network vs. computer performance
  • Computer speed doubles every 18 months
  • Network speed doubles every 9 months
  • Difference: an order of magnitude per 5 years
  • 1986 to 2000
  • Computers: x 500
  • Networks: x 340,000
  • 2001 to 2010
  • Computers: x 60
  • Networks: x 4000
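A quick arithmetic check of the doubling claims above (an illustrative sketch, not from the slide): doubling every 18 vs. every 9 months means networks gain roughly a factor of ten on computers every 5 years.

```python
def growth(doubling_months, total_months):
    """Growth factor for a quantity that doubles every `doubling_months`."""
    return 2 ** (total_months / doubling_months)

computers = growth(18, 60)   # ~10x over 5 years
networks = growth(9, 60)     # ~100x over 5 years
# Relative gain of networks over computers in 5 years:
print(round(networks / computers, 1))  # ~10x, the "order of magnitude"
```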

Moore's Law vs. storage improvements vs. optical
improvements. Graph from Scientific American
(Jan-2001) by Cleo Vilett; source: Vinod Khosla,
Kleiner, Caufield and Perkins.
33
Enabling factors
  • The Internet
  • Broadband communications (e.g. optical-based)
  • Faster processors / HPC using standard / open OS
  • World Wide Web infrastructure and services

34
Major Phases
  • Networking: TCP/IP
  • Communications: Internet and e-mail
  • Information: the Web
  • Computing: the Grid

35
Computing milestones
  • Mainframes, time-sharing, Unix, minicomputers:
    1960-70
  • PCs, commercial Unix, Crays, Workstations, MPPs:
    1980s
  • Clusters, PVM, Linux, PDAs, Open Source, P2P:
    1990s
  • Globus Project, 1G Grids: 1995-2000
  • 2G Grids, OGSI/OGSA, 3G Grids: 2000-05
  • Mainframes, 1960s: focus on efficient
    exploitation of shared resources, virtual machine
    monitor concept, time-sharing
  • Minicomputers, microcomputers, desktops,
    1970-80s: large dissemination of computing power
  • Client-server computing, 1990s: distributed
    functionality to the endpoints, the clients
  • Developments in networks and interconnects,
    1980-90s: rise of the commercial Internet
  • Now is the Grid's time

36
Communication milestones
  • Packet switching, e-mail, ARPAnet, LANs/Ethernet,
    TCP/IP: 1960-70-80
  • Internet era, broadband, WWW, wireless: 1990s
  • Fibre channel, Gigabit Ethernet, Web services,
    XML: 1995-2000
  • Internet: from a military project (DARPA) to
    academic NSF projects
  • 1969: ARPAnet had 4 nodes
  • mid-1970s: around 30 university, military, and
    gov. sites
  • 1974/78: TCP/IP (Transmission Control Protocol /
    Internet Protocol)
  • 1983: ARPAnet had hundreds of nodes
  • NSFnet, 1980s: scientific communication network
    to access NSF supercomputer centers
  • mid-1980s: NSF and DARPA joined efforts; the
    IETF (Internet Engineering Task Force) shaped the
    modern Internet

37
Communication milestones
  • WWW, mid-late 1980s: goal to share information
  • HyperText Markup Language (HTML): a standard to
    create/organise docs
  • HyperText Transfer Protocol (HTTP), browsers and
    servers: to link and access docs online,
    transparently
  • W3C (World Wide Web Consortium), mid-1990s: new
    standards for information interchange (XML etc.)
  • SONET/DWDM (Synchronous Optical Network / Dense
    Wavelength Division Multiplexing) optical
    technology, late 1990s - early 2000s:
  • Provides broadband connectivity and services at
    reasonable prices
  • Corporate WANs at 155 Mbps, vs. 56 Kbps in the
    USA in the mid-1980s
  • at 2.5 Gbps since the mid-1990s
  • OC-768 (about 40 Gbps)
  • A single fiber in the range of 1 Tbps using
    high-density DWDM, but still out of reach for
    individual organisations

38
Communication milestones
  • Recent past
  • Theoretical WAN performance doubled every 9-12
    months, supported by optical technology
  • But Commercial user-available bandwidth (BW) has
    grown at a much slower rate

39
How about the theoretical maximum communication
speed?
  • 1. In general, the max. available speed is not
    affordable by the end-user
  • Communication speed/cost will continue to
    increase
  • But quality high-speed BW will hardly ever be
    free
  • Providers must react to the continuous pressure
    for more multiplexing of wavelengths in DWDM
    products
  • And profits are in order:
  • the individual end-user will hardly afford the
    theoretical performance
  • Cf. the actual cost of a bottle of mineral water
  • 2. Plus the effects of the overheads due to the
    communication protocol layers
  • 3. And it all depends on the application
    profile: is it CPU-bound or I/O-bound?
  • A grid job splits into multiple components which
    are spread over the grid
  • → they need to locate one another
  • → to establish communication connections
  • → to send data
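Point 3 above can be made concrete with a toy cost model (the `job_time` helper and all its numbers are assumptions for illustration, not measurements): splitting spreads CPU time over nodes, but adds the locate/connect setup and data-transfer costs just listed, so a CPU-bound job speeds up while an I/O-bound one barely changes.

```python
def job_time(cpu_s, data_mb, n_nodes, bw_mbps=100.0, setup_s=0.5):
    """Toy estimate of grid job time: compute shrinks with nodes,
    communication (locate/connect setup + data transfer) does not."""
    comm = n_nodes * setup_s + data_mb * 8 / bw_mbps
    return cpu_s / n_nodes + comm

# CPU-bound job (1 h of CPU, 10 MB of data): splitting pays off
print(job_time(3600, 10, 1) / job_time(3600, 10, 8))   # speedup ~8x
# I/O-bound job (10 s of CPU, 10 GB of data): transfer dominates
print(job_time(10, 10000, 1) / job_time(10, 10000, 8))  # speedup ~1x
```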

40
History of sharing
  • 1965: MIT's Multics operating system (multi-user
    time-sharing system)
  • A computer facility should operate like a power
    company or water company
  • Late 1960s - early 1970s: when computers were
    first linked by data communication networks, the
    ARPAnet supported early experiments on exploiting
    unused remote machine cycles
  • 1973: the Xerox PARC worm program replicated
    itself on about 100 Ethernet-connected computers
  • Each worm used idle resources to perform a
    computation
  • Could replicate and send clones to other nodes
  • Since the 1990s: parallel and distributed
    computing
  • Widely available PCs and workstations
  • High-speed networks such as Gigabit Ethernet
  • Clusters for HPC

41
History of sharing
  • Clusters motivated interest in
  • Aggregating distributed resources to solve
    complex problems via parallel computing and also
    to support reliability via redundancy
  • 2002: NSF installed the TeraGrid, a
    transcontinental (virtual) supercomputer: HPC
    clusters at 4 sites (NCSA/ANL in Illinois,
    Caltech/SDSC in California)
  • aimed at problems with requirements in the TFLOPS
    range

42
(No Transcript)
43
Modern applications demanding more ambitious
goals
  • Enable heavy applications in science and
    engineering
  • Complex simulations with visualisation and
    steering
  • Access and analysis of large remote datasets
  • Access to remote data sources and special
    instruments (satellite data, particle
    accelerators)
  • distributed in wide-area networks, and
  • accessed through collaborative and
    multi-disciplinary PSE, via Web Portals.

44
(No Transcript)
45
The Grid
  • Treat CPU cycles and software like commodities.
  • Enable the coordinated use of geographically
    distributed resources in the absence of central
    control and existing trust relationships.
  • Computing power is produced much like utilities
    such as power and water are produced for
    consumers.
  • Users will have access to power on demand
  • "When the network is as fast as the computer's
    internal links, the machine disintegrates across
    the net into a set of special-purpose appliances"
  • Gilder Technology Report, June 2000

This slide is courtesy of Professor Jack Dongarra
46
US Software Infrastructure, 1998
The Grid is a computational and network
infrastructure providing pervasive, uniform, and
reliable access to distributed resources.
  • Globus provides core services for grid-enabled
    computing: http://www.globus.org/

47
Concept of a Grid
  • Gathers a diversity of resources, distributed at
    large-scale
  • supercomputers and parallel machines, and
    clusters
  • massive storage systems
  • databases and data sources
  • special devices
  • Provides globally unified access to virtual
    resources
  • Transient: to support experiments (computation,
    data, scientific instruments)
  • Persistent (databases, catalogues, archives)
  • Collaboration spaces

48
What is a Grid Computing System?
  • A virtualised computing environment
  • Enabling dynamic runtime selection, sharing, and
    aggregation of geographically distributed
    autonomous resources
  • Based on availability, capability, performance
    and cost
  • Based on an application's or organisation's
    requirements
  • Relies on a highly interconnected networking
    infrastructure

49
(No Transcript)
50
Related concepts
  • Virtualisation
  • IBM's VM: allows several OSs to run
    simultaneously on one large computer (Virtual
    Machine Monitor)
  • Generic approach to
  • Allow logical access to types of remote,
    heterogeneous, and distributed resources
  • As if they were a single larger homogeneous
    resource, locally available
  • Applies to computation, storage, and network
    resources and to any other LOGICAL RESOURCE
  • Dynamically adjust resource mappings to match
    application demands

51
Virtualisation
  • The logical functions of the server, storage and
    network resources are separated from their
    physical functions and representations
    (processors, memories, I/O devices, switches).
  • Resources are aggregated into pools
  • Elements from the pools are allocated,
    provisioned, managed, manually or automatically,
    to meet application demands

52
Virtualisation examples
  • Processes
  • Server
  • Network
  • Storage
  • Data center: groups of servers, storage, and
    network resources can be reallocated on the fly
  • Software resources

53
Cluster computing
  • Aggregate processors locally in parallel-based
    configurations, integrate them and provide access
    as a single unified resource
  • Central resource manager and scheduler
  • Centralised control and knowledge of system and
    user states
  • Typically owned by single organisation

54
Clusters vs. Grids
  • Clusters: focus on the data center, a single
    organisation
  • Grids: focus on geographically distributed,
    multi-organisation, utility-based (outsourced)
    networking

55
Changing perspectives - Grid Views
  • The Grid. Use distributed hardware and software
    infrastructure → reliable, pervasive, inexpensive
    access to computational resources irrespective of
    physical location or access point.
  • The Consumer Grid. Services and resources
    anywhere. Issues of dynamic resource discovery,
    trust, and digital reputation.
  • Application Service Provider. Provide or sell
    computational or data services via the Web.
  • Virtual Organisation. A group of people or
    institutions with some common purpose that need
    to share resources.

56
Grids: Towards uniform and standard large-scale
computing environments
  • Analogy to the Electrical Power Grid
  • Simple local interface
  • Transparency
  • Pervasive access
  • Secure
  • Dependable
  • Efficient
  • Inexpensive
  • The Computational, Data, and Interaction Grids
  • Not really true (yet!?)

57
The Transparent Grid
  • Transparency: The user is not aware (and doesn't
    care) what computing resources are used to solve
    their problem
  • Similarly, in an electrical grid we ignore the
    source of the power
  • Heterogeneity
  • Resource discovery
  • Scheduling

Distributed computing issues
58
(No Transcript)
59
EGEE: Enabling Grids for E-science in Europe,
www.eu-egee.org, an EU IST project
60
The Grid metaphor
61
(No Transcript)
62
the future Grid!
63
The Pervasive Grid
  • Pervasive: The Grid can be accessed from any
    networked device, e.g. laptop, mobile phone, PDA,
    etc.
  • In the electrical analogy, any appliance can
    access power through a standard interface, e.g. a
    wall socket.
  • Standard interfaces
  • Protocols
  • Legacy software

64
The transparent grid access
65
Grid is an evolving field
  • Multiple views, perspectives
  • Concepts, models and architectures still being
    defined and tested
  • Applications still emerging
  • Wide variety of interests

66
The main questions
  • Grid benefits, challenges, status and directions
  • Grid architectures
  • Portal and UI, User and node security, Brokers,
    Schedulers, Data managers, Job and resource
    managers
  • Standardisation efforts
  • Architecture: OGSA/OGSI (Open Grid Services
    Architecture / Infrastructure)
  • Execution models: Workflows, Events, Transactions
  • System services: Security, Monitoring, Billing
    and Accounting, Implementation (Globus Toolkit)
  • Economics
  • Grid deployment
  • Local, national, and global grids

67
Applications and benefits
  • The Grid can be seen as an evolution of
  • Parallel and Distributed computing
  • The Web
  • And Virtualisation concepts
  • As such, the Grid will probably improve existing
    application types, and will enable new types of
    applications

68
Applications example
  • Virtual access to special instruments
  • electron microscopes, particle accelerators, wind
    tunnels,
  • coupled with remote supercomputers, DBs,
  • to enable
  • interactive use,
  • online scenario comparisons,
  • and collaborative data analysis

69
Applications example
  • Virtual access to distributed supercomputing
  • For complex computations
  • Migrate CPU-bound operations to more powerful
    remote computing resources supported by large
    virtual supercomputers, assembled to solve
    problems too large to fit on a single computer
    system

70
Applications example
  • Collaborative engineering
  • Design of complex systems
  • Based on highly interactive environments
  • Relying on high-bandwidth access to shared
    virtual spaces, supporting
  • Interactive manipulation of shared datasets
  • Management of complex simulations

71
Applications example
  • Parameter studies
  • Rapid, large-scale parametric studies
  • A single program is run many times
  • To explore a multidimensional parameter space
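A parametric study of this kind can be sketched in a few lines (the `model` function and parameter names are illustrative stand-ins for the single program being swept); on a grid, each point of the space could be dispatched as an independent job.

```python
from itertools import product

def model(alpha, beta):
    # Stand-in for the single program run at each parameter point
    return alpha ** 2 + beta

# The multidimensional parameter space
alphas = [0.1, 0.2, 0.3]
betas = [1, 2]

# One run per point of the space (6 runs here); on a grid each
# (alpha, beta) pair would be a separate job
results = {(a, b): model(a, b) for a, b in product(alphas, betas)}
print(len(results))
```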

72
Summary: Grid applications
  • Distributed supercomputing: for computational
    science and engineering
  • High-capacity throughput: large-scale
    simulation, chip design, and parameter studies
  • Content sharing: digital contents
  • Data-intensive: drug design, particle physics,
    stock prediction, etc.
  • On-demand: real-time medical instrumentation,
    mission-critical
  • Collaborative: e-science, e-engineering, design,
    data exploration, education, e-learning
  • Remote software access/renting: services (ASP
    and Web services)
  • Utility/service-oriented computing

73
Question: Is this just an academic exercise? No!
  • Real applications needs
  • Solve new or larger problems by aggregating
    available resources at large-scale
  • for bigger, longer experiments, and more accurate
    models
  • Easier access to remote resources
  • a large diversity of computation, data and
    information services
  • Increased levels of interaction for increased
    productivity and capability to analyse and react
  • enable coordinated resource sharing and
    collaboration across virtual organisations

74
Applications and User Profiles
  • Computational Grids
  • provide a single point of access to a
    high-performance computing service
  • Scientific Data Grids
  • Access large datasets with optimized data
    transfers and interactions for data processing
  • Virtual Organisations and Interactions
  • Access to virtual environments for resource
    sharing, user interaction and collaboration
  • Real-time interactions for decision support
  • Information and Knowledge services
  • Access large geographically distributed data
    repositories, e.g. for data mining applications

75
Grid benefits
  • Resource sharing
  • Transparent access to remote resources
  • Efficient exploitation of resources: reduce
    execution time of large-scale data processing,
    support load smoothing across the network,
    exploit time and workload differences
  • Enable the concept of a virtual data center
  • Access to remote DB and software
  • Reduce the local services needed
  • On-demand aggregation of resources, to meet
    dynamic needs (including real-time response)
  • Fault-tolerance and dependability

76
Ultimate goal
  • Allows an organisation to
  • Integrate and share heterogeneous pools of
    resources (physical and logical)
  • Presenting them as one large, cohesive, virtual,
    transparent computing system
  • In order to deliver agreed services at specified
    levels of quality (application functionality,
    efficiency and performance)

77
Grid mechanisms
  • To enable online discovery and access to
    distributed resources
  • And online collaboration

78
Grid ideas
  • Internet: a network of communication
  • Grid: a network of cooperation / computation
  • The Grid relies on the ability to negotiate
    resource-sharing among partners (providers and
    consumers) and to use the resulting resource pool
    for some specific application goal

79
Grid Views
80
View - Computational Grids
  • Service-oriented view
  • NetSolve: an example

81
View: Grids as Frameworks for Application Service
Providers
  • Application Service Provider: Provide or sell
    computational services via a web interface.
  • Provide remote services such as compute cycles,
    specific applications, or storage.
  • Selective outsourcing: certain functions are
    performed remotely.
  • Application hosting: remote sites act as
    application servers.
  • Browser-based computing: online applications
    accessible through a web site.
  • (Courtesy Prof. David Walker)

82
An Example: NetSolve as a Scientific ASP
  • A client-server system for remote solutions of
    complex scientific problems
  • On request performs computational tasks on a set
    of servers
  • Searches for computational resources on a
    network, chooses the best one available, and
    returns the answers to the user.
  • Based on agents or resource brokers
  • Developed by Professor Jack Dongarra and
    colleagues at University of Tennessee, Knoxville
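The agent/resource-broker idea can be sketched as follows; the data structures and the selection rule here are our illustrative assumptions, not NetSolve's actual protocol. The agent knows the available servers, picks the best one for a request, and the client sees only the answer.

```python
# Toy view of the servers the agent knows about
servers = [
    {"name": "S1", "load": 0.9, "speed": 10.0},
    {"name": "S2", "load": 0.2, "speed": 5.0},
    {"name": "S3", "load": 0.1, "speed": 2.0},
]

def choose_server(server_list):
    """Agent role: pick the server with the best effective capacity."""
    return max(server_list, key=lambda s: s["speed"] * (1 - s["load"]))

def solve(problem, data):
    """Client role: send a request; get back the chosen server + answer."""
    server = choose_server(servers)
    # In NetSolve the computation would run remotely on `server`;
    # here we just run it locally for illustration
    return server["name"], problem(data)

print(solve(sum, [1, 2, 3]))  # S2 wins: 5.0 * (1 - 0.2) = 4.0
```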

83
NetSolve: The Big Picture (David Walker)
Diagram: clients (Matlab, Mathematica, C, Fortran,
Java, Excel) send requests to agent(s) holding a
schedule database; the agents dispatch work to
servers S1-S4, and answers return to the client.
84
Data grids
  • Aggregate underused/unused storage
  • Into a larger virtual data store
  • For improved performance and reliability and for
    increased capacity

85
  • Storage
  • a file or a DB can span multiple physical
    devices
  • a unifying distributed file system can solve
    this problem
  • storage hierarchy:
  • primary (attached to a CPU)
  • secondary (in hard disks, such as RAID)
  • tertiary (in near-real-time accessible media,
    such as tape)
  • distributed:
  • using mountable network file systems such as the
    Network File System (NFS), Distributed File
    System (DFS) or General Parallel File System
    (GPFS)
  • DB management software can federate a group of
    individual DBs and files to build a larger DB

86
  • Grid file systems can manage automatic file or
    data set replication
  • for performance and reliability
  • Applications may require different semantics for
    synchronous replication of data files, and so
    require specific data placement decisions
    exploiting locality of access; this may
    critically affect the resulting performance
  • → an intelligent grid data scheduler can
    consider not only the computational requirements
    of an application but also its data requirements,
    based on usage patterns and replication needs
  • and then can schedule jobs closer to the data
  • and/or on processors with direct SAN access to
    storage devices
  • → need to revise traditional scheduling
    strategies and models, typically based on
    computational requirements only
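The data-aware scheduling idea above can be sketched with a toy cost model (the `best_site` helper, the site capacities and the WAN bandwidth are assumptions for illustration): a site is ranked by compute time plus the cost of moving the job's dataset to it, so jobs land closer to their data.

```python
def transfer_cost(job_data_gb, site, replicas, wan_gb_per_s=0.1):
    """Zero if the site already holds a replica of the job's dataset."""
    if site in replicas:
        return 0.0
    return job_data_gb / wan_gb_per_s

def best_site(job_cpu_hours, job_data_gb, sites, replicas):
    """Pick the site minimising compute time + data-transfer time."""
    def estimated_time(site):
        compute = job_cpu_hours * 3600 / sites[site]
        return compute + transfer_cost(job_data_gb, site, replicas)
    return min(sites, key=estimated_time)

sites = {"A": 100, "B": 400}   # relative compute capacity
replicas = {"A"}               # only site A holds the dataset

# A data-heavy job goes to the slower site A, because it has the data
print(best_site(1, 500, sites, replicas))
```

A purely computational scheduler would always pick the faster site B; weighing data requirements reverses that choice for data-heavy jobs.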

87
View Scientific Data Grids
  • EU DataGrid projects
  • Large-scale environment for accessing and
    analysing large amounts of data
  • High energy physics, Biology, Earth observation
  • Petabytes of data (1 PB = 1,000,000 GB)
  • Thousands of researchers
  • Scalable storage of datasets: replicated,
    catalogued, distributed across distinct sites

88
Distributed Computing Grid Experiences in the CMS
Data Challenge
A. Fanfani, Dept. of Physics and INFN, Bologna
  • Introduction to the LHC and CMS
  • CMS production on the Grid
  • CMS Data Challenge

89
Large Hadron Collider (LHC)
bunch-crossing rate: 40 MHz
~20 p-p collisions per bunch crossing → p-p
collisions at ~10^9 evt/s (GHz)
90
CMS detector
91
CMS Data Acquisition
Bunch crossing: 40 MHz; 1 event is ~1 MB in size
→ ~GHz event rate (~PB/sec)
Online system, Level 1 Trigger (special hardware):
  • a multi-level trigger to
  • filter out uninteresting events
  • reduce data volume

75 kHz (75 GB/sec) after Level 1, reduced to
100 Hz (100 MB/sec) at data recording
Offline analysis
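The rates above can be cross-checked with simple arithmetic (an illustrative helper, not CMS software): at ~1 MB per event, each trigger level's event rate directly fixes its data rate.

```python
def data_rate_mb_per_s(event_rate_hz, event_size_mb=1.0):
    """Data rate = event rate x event size (events of ~1 MB)."""
    return event_rate_hz * event_size_mb

print(data_rate_mb_per_s(75_000))  # Level 1: 75 kHz -> 75,000 MB/s = 75 GB/s
print(data_rate_mb_per_s(100))     # recording: 100 Hz -> 100 MB/s
```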
92
CMS Computing
  • Large numbers of events will be available when
    the detector starts collecting data
  • Large-scale distributed computing and data access
  • Must handle petabytes per year
  • Tens of thousands of CPUs
  • Tens of thousands of jobs
  • Heterogeneity of resources:
  • hardware, software, architecture and personnel
  • Physical distribution of the CMS Collaboration

93
CMS Computing Hierarchy
1 PC ≈ PIII 1 GHz
  • Online system → offline farm at the CERN
    Computer Center (Tier 0, ~10K PCs), receiving
    recorded data (~PB/sec off the detector,
    ~100 MB/sec recorded):
  • Filter raw data
  • Data reconstruction
  • Data recording
  • Distribution to Tier-1
  • Tier 1: Regional Centers in Italy, at Fermilab,
    in France, ... (~2K PCs each, ~2.4 Gbits/sec
    links):
  • Permanent data storage and management
  • Data-heavy analysis
  • Re-processing
  • Simulation
  • Regional support
  • Tier 2: Tier-2 Centers (~500 PCs,
    ~0.6-2 Gbits/sec):
  • Well-managed disk storage
  • Simulation
  • End-user analysis
  • Tier 3: institute workstations (InstituteA,
    InstituteB, ...; ~100-1000 Mbits/sec)
94
View - Virtual Organisations
  • Resource sharing and collaboration between
    dynamically changing collections of individuals
    and organisations
  • e.g. a consortium of companies collaborating in
    the design of a new product
  • Sharing design data, Collaborative simulations,
    etc
  • e.g. Scientists collaborating in common
    experiments via a distributed virtual laboratory

95
Example: Collaborative Immersive Visualisation
  • Scientific simulations, experiments, and
    observations generate vast amounts of data that
    often overwhelm data management, analysis, and
    visualization capabilities.
  • Observer appears to be in the same space as the
    visualised data and can navigate within the
    visualisation space relative to the data.
  • Important in interpreting and extracting insights
    from the data.
  • Several observers can co-exist in the same
    visualisation space - ideal for remote
    collaboration.
  • CAVE: a fully immersive environment; systems
    with stereoscopic projections onto 3 walls and
    the floor.
  • ImmersaDesk, or stereoscopic workstation:
    projects stereoscopic images onto a single flat
    panel display.

96
CAVE
97
(No Transcript)
98
Virtual organisations (VO)
  • A set of entities (individuals and institutions)
    defining a set of resource sharing and access
    rules
  • Highly controlled sharing
  • What is shared
  • Who is allowed access
  • Conditions to allow such sharing

99
Keys
  • Resource sharing and problem-solving in dynamic
    multi-institutional VOs
  • Service providers
  • Application
  • Storage
  • Machine-cycles (computation)
  • Collaboration in industry consortia

100
Commercial, IT, data center applications
  • First grid generations had limitations, namely
    for database interoperability
  • This has motivated approaches for
    business-centric solutions, developed by
    commercial software and DB suppliers

101
Commercial and financial
  • Enabling
  • Data-mining, pattern-detection, scenario-modeling
    processes
  • Applied to banks, credit card processing,
    financial institutions
  • Improve the financial transaction flow, better
    understanding of customer profitability, and risk
    modeling done in real time (knowledge-based
    analysis and simulation are common in financial
    firms)

102
Financial applications
  • Instead of
  • Manually subdivide algorithms
  • Run them on separate machines
  • Manually merge and integrate the results
  • Exploit grid tools to do the same, more or less
    automatically, in a virtualised environment

103
Business goals
  • Improve
  • Utilisation
  • Responsiveness
  • Reduce IT costs

104
Traditionally
  • Business applications
  • Dedicated platforms of servers, and storage
    devices associated with each server
  • Not able to share resources
  • Not exploiting abilities to predict, anticipate,
    and exploit expected levels of processing loads
  • Design for excess capacity to handle excess peak
    loads
  • Higher overall costs

105
Virtualisation of resources
  • Exploit
  • Synergistic integration
  • Economies of scale
  • Load smoothing
  • Due to the sharing and aggregation of distributed
    resources
  • And the delivery of services in a highly
    transparent way to the end-user
  • Several solutions
  • Dedicated local Clusters
  • Grids

106
Cost savings
  • Cluster computing
  • Aggregating processors in parallel-based
    configurations
  • Cost reductions in IT costs and costs of
    operations, confirmed.
  • Enterprise grids
  • Middleware-based, to exploit unused CPU cycles →
    avoiding growth/expansion costs
  • Expected savings.

107
Expectations
  • 2005-06: the Grid will become commercially viable
  • Early adoption for enterprise applications, at
    single-site and multi-site
  • Exploitation of solutions from Web services and
    utility computing
  • By 2005, a significant 50% of companies were
    already aware of the IT utility model for
    outsourcing (IT services from Service Providers
    as a commodity)
  • A significant share of companies have some form
    of utility computing, and a significant share of
    IT services are being delivered from offshore
    centers
  • Uncertainties remain about cost, security, and
    integration with existing IT systems

108
Grid for enterprises
  • Obtain computing services over networks from
    remote Service Providers
  • Aggregate an organisation's dispersed set of
    independent resources into one unified virtual
    environment

109
Data grids
  • Connection
  • connect DBs at different locations in a single
    company
  • Significant savings in finding information →
    staff efficiency gains
  • Requires large investment in broadband links
    to connect remote data centers

110
Cluster Computational Grid
  • Processing power for HPC
  • Big saving in processing time → efficiency and
    savings in R&D costs
  • No initial impact on broadband until cluster
    computing evolves to an enterprise grid

111
  • Cluster/Local Grid
  • few homogeneous processors connected in a data
    center on a LAN or SAN (Storage Area Network) →
    more a cluster than a grid
  • under the same OS and a central administration
  • Enterprise or IntraGrids
  • heterogeneous processors and OS, geo distributed
    and interconnected by Intranet links (or
    high-quality high-throughput, high-security
    communications)
  • owned by different departments of a single
    organisation
  • may be structured as a hierarchy cluster of
    clusters

112
Enterprise Grid
  • Processing power and data connection within a
    single company; links R&D centers at different
    geographic locations
  • Efficiency due to processing power and access to
    data
  • Savings on R&D times and time-to-market
  • Investment in broadband links requires very high
    speed due to the large amount of data transmitted

113
Partner Grid
  • Processing power and connection for multiple
    companies
  • Savings in design time, R&D time, and
    time-to-market
  • More efficient collaboration between partners in
    a supply chain relationship
  • Significant investment in secure,
    high-performance, broadband links between the
    companies

114
  • Enterprise Grids
  • require policies and operations to control
    actual use of grid resources, based on
    priorities, and kinds of applications
  • also requiring security control across distinct
    departments
  • Global Grids
  • crosses organisation borders
  • more critical security
  • allows sharing, trading, brokering resources
    over global pools

115
Web Services
  • Provide secure Internet access to new services
    for consumers and business
  • Develops closely alongside cluster and data grids
  • Big gain in productivity
  • savings in cost of offering services and
    time-to-market new services
  • requires a data grid-like structure to provide
    rapid updating of information
  • Large spending on broadband to link data centers
  • Significant spending on software and integration
    services
  • Example: Bank of America over the Internet

116
How to evolve to a Grid?
  • Transform individual components (computers,
    storage, networks) into an aggregated, virtual
    pool of resources, to be allocated and monitored
    automatically
  • Provide defined business services on the basis of
    specified goals and priorities; develop and
    automate policies and service-level objectives to
    manage the needed applications and resources
  • Build an enterprise grid infrastructure and use
    open-source and vendors' proprietary tools
  • Enable these tools to comply with new standards,
    and combine components together.

117
How?
  • Concept of outsourcing
  • Delegate the provision of a service to an
    external, reliable, and trusted supplier
  • Install the concept of utility computing
  • → Expected as a major trend in the 2010s
  • Virtualisation of resources
  • Dynamically manage and adjust a logical pool of
    resources and their mappings to share the
    physical infrastructure

118
Virtualisation without limit
  • → Application software and licenses
  • Specific business software may be installed on a
    few designated grid processors and be shared
    among clients.
  • possibly limiting the number of concurrent users
    → virtual licenses
  • Compare with installing the same licensed
    software in thousands of servers

119
Grid requirements include
  • Online negotiation of access to services: who,
    what, why, when, how
  • Establishment of applications and systems able to
    deliver multiple qualities of service
  • Autonomic management of infrastructure elements
  • Dynamic formation and management of virtual
    organisations
  • Open, extensible, evolvable infrastructure

120
(No Transcript)
121
More Complex Applications and Environments
  • Large number of components
  • Complex interactions
  • Dynamic configuration

122
Software Engineering Challenges
  • Suitable levels of flexibility in all stages of
    the software lifecycle
  • Application specification and design
  • Program transformation and refinement
  • Simulation
  • Code generation
  • Configuration and deployment
  • Coordination and control of the execution

123
Issues - 1
  • Clear separation and representation of concepts
  • Computation and interaction
  • Structure and behaviour
  • Specification of multiple components
  • Enabling alternative mappings
  • Varying degrees of automated processing
  • Supported by pattern and template repositories
    with relevant attributes

124
Issues - 2
  • Mapping the programming models into the
    underlying computing platforms
  • Interacting with resource descriptions and
    discovery services
  • For flexible configuration and deployment
  • Coordination of distributed execution
  • Allowing workflow descriptions
  • With adaptability and dynamic reconfiguration

125
[Diagram: a development methodology. Component-Based Development and
Software Architecture draw on repositories of skeletons, templates,
and patterns, and on an abstract description language to specify,
design, and compose structure, behaviour, computation, and
interaction; mappings serve to verify, analyse, evaluate, and
predict; programming levels (models) use resource description and
discovery to deploy and configure onto Grid execution environments,
which control, coordinate, execute, and reconfigure.]
126
Global conceptual layers
  • Software architectures
  • Coordination models
  • Resource management
  • Execution, monitoring and control
  • Support infrastructures

127
(No Transcript)
128
1 - Software Architectures
  • Specification of components, their composition
    and interactions
  • Modeling and reasoning on global structure and
    behavior
  • Specification languages
  • for structure and behavior
  • incremental refinement and dynamic composition

129
2 - Coordination models
  • Represent and manage interaction patterns among
    components
  • Communication and cooperation models
  • Consistency guarantees
  • Abstract, logical, dynamic organisation models
  • Dynamic application structure, interaction
    patterns and operation modes

130
Handle dynamic characteristics
  • Looking at the past
  • Fault tolerance, Load balancing, Task spawning
  • At present and in the future
  • Changes in the configuration and availability of
    resources, variations of characteristics and
    behaviour
  • Changes at the application level user control of
    a dynamic experiment
  • Flexibility to build PSEs
  • Mobility of agents and devices

131
3 - Resource management
  • Configuration of parallel and distributed virtual
    machines
  • Resource discovery, scheduling, and reservation
  • Execution and monitoring at local and large
    scales
  • Quality of service

132
  • Need to be fair and efficient in
  • locating software resources
  • negotiating for use of resources
  • scheduling components on distributed resources to
    achieve
  • Minimum execution time
  • Maximum throughput
  • Need to be able to monitor resource usage and
    level of availability
  • Need of Resource Specification Languages
  • A difficult problem in dynamic environments.
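As a toy illustration of the "minimum execution time" goal above, a greedy Longest-Processing-Time-first heuristic can be sketched in Python. This is a classic scheduling approximation, not a grid-specific algorithm; the task durations are assumed known, which in a real grid they rarely are.

```python
def schedule_lpt(tasks, n_workers):
    """Longest-Processing-Time-first: assign each task (longest
    first) to the currently least-loaded worker, to approximately
    minimise the makespan (overall completion time)."""
    loads = [0.0] * n_workers
    assignment = [[] for _ in range(n_workers)]
    for duration in sorted(tasks, reverse=True):
        w = loads.index(min(loads))   # pick the least-loaded worker
        loads[w] += duration
        assignment[w].append(duration)
    return assignment, max(loads)     # the plan and its makespan

plan, makespan = schedule_lpt([7, 5, 4, 3, 2, 2], 2)
print(makespan)  # → 12
```

Real grid schedulers must additionally cope with unknown durations, heterogeneous node speeds, and resources that appear and disappear, which is why the slide calls this a difficult problem.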

133
New challenges
  • New problem-solving strategies with adaptive
    behaviour
  • Awareness to Quality of Service factors
  • Management at intermediate layers
  • By intermediate agents / planners
  • Contract negotiation
  • Dynamic revision of plans
  • Reconfiguration
  • Specify, compose, develop, and understand dynamic
    distributed large-scale applications: models,
    languages, and tools

134
Two Views of Components
  • A component as an executable that runs on a
    certain specified machine.
  • A component can be viewed as a contract. It says:
    "If you give me these inputs, then I'll give you
    these outputs."
  • In the second case the component is not tied to
    any particular executable. Problem specification
    is separate from service provision.
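The second view can be sketched in Python: the contract is an interface, and any provider that honours it (local or remote) can be bound to the request. The `MatrixSolver` name and the toy computation are illustrative assumptions, not from the original.

```python
from typing import Protocol

class MatrixSolver(Protocol):
    """The contract: given these inputs, these outputs come back.
    Nothing here ties the component to a particular executable."""
    def solve(self, a: list[float], b: list[float]) -> list[float]: ...

class LocalSolver:
    # One of possibly many providers satisfying the contract;
    # a broker could equally bind the request to a remote service.
    def solve(self, a, b):
        return [x + y for x, y in zip(a, b)]   # toy stand-in computation

def run(solver: MatrixSolver):
    # Client code depends only on the contract, not on the executable.
    return solver.solve([1.0, 2.0], [3.0, 4.0])

print(run(LocalSolver()))  # → [4.0, 6.0]
```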

135
Binding Service Requests to Resources
  • In a fully transparent system the scheduler would
    decide where components execute based on
  • Availability and performance of resources
  • Cost and time constraints
  • This is a hard problem.
  • A possible solution is to supply hints about
    where a component can run, e.g., in the
    component's XML specification.

136
High Level View of Network Computing
  • Services are advertised on the network
  • A service typically consists of
  • A component that actually provides the service,
    and
  • An agent that mediates access to the service.
  • Scheduler must be able to locate services and
    then schedule use.
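The component/agent split above can be sketched as a minimal in-memory registry. Real grids use directory services such as LDAP for this; all names below are illustrative.

```python
class Agent:
    """Mediates access to a service component, the second element
    a service typically consists of."""
    def __init__(self, component):
        self.component = component
    def invoke(self, *args):
        # A real agent would also authenticate, meter, and queue calls.
        return self.component(*args)

registry = {}   # advertised services, keyed by name

def advertise(name, component):
    registry[name] = Agent(component)

def locate(name):
    # The scheduler locates a service before scheduling its use.
    return registry.get(name)

advertise("square", lambda x: x * x)
print(locate("square").invoke(6))  # → 36
```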

137
Service-oriented architecture
  • Defines how two entities interact so that one
    performs a unit of work for the other
  • - the unit of work is a service
  • - service interactions are defined in a
    description language
  • - each interaction is self-contained, loosely
    coupled, and independent of other interactions
  • - applications are assembled as collections
    of services, each with different functions
  • and are exposed as services on the network, to
    be (re)used
  • . different users can communicate with the
    services differently
  • an intermediate layer between providers and
    consumers
  • - building applications is to identify the
    required components, find them, and glue them
    together

138
Service Providers and Brokers
  • NetSolve is an example of an ASP (Application
    Service Provider) offering numerical software,
    still limited to a client-server style.
  • Trend to network-based computing paradigm.
  • Nodes offer sets of computing services with known
    advertised interfaces.
  • Software seen as a pay-as-you-go service
    rather than a product that you buy once
  • → Computational Economies
  • Open Service-Oriented Architectures
  • Shifting paradigms from master-slave to tighter
    cooperation models

139
Grids Key components
  • Resource management
  • Security
  • Data management
  • Services management

140
Grid types
  • Space scale Local, metropolitan, regional,
    national, global
  • Time scale logically aggregate resources for
    long or short periods of time
  • Crossing borders Resources can span a single or
    multiple organisations, or a service provider
    space

141
Very complex systems
  • Aim at providing unifying abstractions to the
    end-user
  • Large-scale universe of distributed,
    heterogeneous, and dynamic resources
  • Critical aspects
  • Distributed
  • Large-scale
  • Multiple administrative domains
  • Security and access control
  • Heterogeneity
  • Dynamic

142
Layers of a Grid Architecture
  • User Interfaces, Applications, PSEs
  • Programming Models, Development Tools and
    Environments
  • Grid middleware Services and Resource
    Management
  • Heterogeneous Resources and Infrastructure

143
Elements of a Grid Architecture
  • Applications, User interfaces, Grid portals and
    PSEs
  • Models, tools and environments for application
    composition, programming and deployment
  • Grid operating environment (middleware)
  • Services and resource management, discovery and
    scheduling
  • Information registration and querying
  • Authentication, Security
  • Computation, data management, and communication
  • Monitoring, Quality of Service
  • Heterogeneous resources and infrastructure

144
Grid tools(1)
  • Infrastructure include hardware and software
    components (file systems, resource managers,
    messaging systems, security applications,
    certificate authorities, file transfer
    mechanisms)
  • Middleware software plug-ins that facilitate
    using the Grid
  • open source Globus GT 3 - first implementation
    of OGSI, as a set of services and software
    libraries
  • based on a security model plus a mechanism
    for hierarchically collecting data about the grid
  • includes support for
  • security
  • information infrastructure
  • resource management
  • data management
  • communication
  • fault detection
  • portability

145
Grid tools(2)
  • Directory services to discover available
    services, and to define and monitor the grid
    topology
  • generally based on the Lightweight Directory
    Access Protocol LDAP
  • and Domain Name Server (DNS)
  • Schedulers and load balancers ensure job
    completion under priority, deadline or urgency
    constraints and distribute tasks and data across
    systems to reduce the chance of bottlenecks
  • Developer tools for file transfer,
    communications, environment control, ranging from
    utilities to APIs
  • Security: authenticate and authorise; control
    who/what can access a grid's resources. Includes
  • message integrity
  • message confidentiality
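The message-integrity item, for instance, can be illustrated with a standard HMAC check. This is a generic sketch, not the Globus implementation; in a real grid the shared key would come from the security infrastructure rather than a constant.

```python
import hmac
import hashlib

# Assumed shared secret between sender and receiver (illustrative).
KEY = b"shared-grid-secret"

def sign(message: bytes) -> str:
    # Sender attaches this tag to the message.
    return hmac.new(KEY, message, hashlib.sha256).hexdigest()

def verify(message: bytes, tag: str) -> bool:
    # Receiver recomputes the tag; a mismatch means tampering.
    return hmac.compare_digest(sign(message), tag)

tag = sign(b"submit job 42")
print(verify(b"submit job 42", tag))   # untampered → True
print(verify(b"submit job 99", tag))   # tampered   → False
```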

146
Grid architecture concepts
  • Influenced by the Globus Toolkit
  • a de facto standard for security, information
    discovery, resource and data management,
    communication, fault detection, and portability
  • Driven by the Global Grid Forum (GGF)
  • An industry advisory group for community-driven
    development of new standards
  • Grid architectures depend heavily on existing
    Internet protocols and services (for
    communication, routing, name resolution...)

147
Grid logical hierarchy
  • L1 Grid fabric resources (computers, storage,
    networks, special devices) → managed by a local
    RM with a local policy, and interconnected in a
    LAN, MAN, or WAN
  • L2 Security infrastructure (authentication,
    secure connectivity, access to resources)
  • L3 Core Grid middleware (job management, storage
    access, accounting) → uniform access to the
    fabric resources, hiding partitioning,
    distribution, and load-balancing
  • L4 User-level middleware resource aggregators
    (scheduling services and resource brokers)
  • L5 Grid programming environments and tools
    (languages, libraries, compilers, and support
    tools)
  • L6 Applications (commercial, scientific,
    engineering)

148
GGF layered architecture
  • Fabric: controlling things locally
  • Connectivity: talking to things; communication
    (Internet protocols) and security
  • Resource: sharing single resources; negotiating
    access, controlling use
  • Collective: coordinating multiple resources;
    ubiquitous infrastructure services,
    application-specific distributed services
  • Applications: putting things to work

149
Critical Grid Issues
  • Security: when resources are shared across
    organisation boundaries, security is an important
    issue.
  • Dependability: the Grid must be robust and
    resilient to failure.
  • Efficiency: resources should not be wasted; good
    load balancing is needed.
  • Cost: for broad impact, the Grid should be
    inexpensive.
  • Portability: Grid applications should be able to
    run on a wide range of hardware.

150
Functional perspective of a Grid
  • a) Grid Portal UI
  • interface to launch applications
  • with transparent access to resources and
    services
  • b) Grid Security
  • b1) USERs view
  • - provides authentication, authorization, data
    confidentiality, data integrity, and
    availability, from the users view
  • - a single sign-on, run-anywhere, uniform
    authentication service
  • - a user job requires on-the-fly confidential
    message-passing services
  • or may require a long-lived service
  • - user must be allowed to check availability of
    such security services

151
  • provide security across organisation borders
  • with support for local control over access
    rights and mapping
  • uniform authentication, authorization and
    message-protection
  • with delegation of credentials for computations
    involving multiple geo distributed resources
  • usually relies on public key technology
  • b2) SYSTEMs view
  • - the user needs to be authenticated, but remote
    resources do too!
  • - secure (authenticated and confidential)
    communication between internal grid components
  • - a Certificate Authority establishes the
    identity of users and grid resources

152
  • c) Broker and Directory
  • the user's request to launch an application →
  • requires identifying suitable resources
  • based on the application's parameters
  • →
  • -- informs about available resources and their
    working status
  • -- allows defining and monitoring the grid
    topology/resources
  • → supported by a Directory mechanism (LDAP
    and/or DNS)
  • d) Scheduler
  • to coordinate the concurrent execution of job
    components
  • in a simple case
  • - selection of suitable processor
  • - grid request to send the job code and data to
    the selected processor

153
  • in general cases
  • - a scheduler must dynamically react to grid
    load
  • by getting measurement information obtained by
    grid monitoring and resource management
  • scheduler strategies
  • - simple round-robin (cf. the default in PVM)
  • - usually, try to find most appropriate
    processor(s)
  • - hierarchical scheduling
  • metascheduler submit a job to a cluster
    scheduler
  • cluster scheduler manages a cluster as a
    single resource and uses an internal scheduling
    strategy

154
  • schedulers also monitor job progression
  • - to automatically resubmit to other nodes, in
    case of losses
  • - to check for job completion (e.g., with
    timeouts)
  • some use a resource static reservation system
  • -- a calendar-based mechanism (like in old batch
    processing)
  • managing pools of resources
  • - processors automatically report their
    availability to grid management
  • → allows reassignment of jobs to such
    processors
  • - local nodes may report the start of local
    NON-GRID work
  • → ends the node's availability for grid work
  • may cause unpredictable completion times
  • -- suggests the use of DEDICATED grid resources
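The resubmission behaviour above can be sketched as: try a node, and if it does not complete the job within a timeout, resubmit on the next one. `run_on` is a hypothetical dispatch call and the node names are illustrative; a real scheduler would consult the monitoring layer instead.

```python
def run_with_resubmission(job, nodes, timeout_s=5.0):
    """Monitor job progression: if a node does not return a result
    within the timeout, automatically resubmit on the next node."""
    for node in nodes:
        result = run_on(node, job, timeout_s)   # hypothetical dispatch
        if result is not None:                  # completed in time
            return node, result
    raise RuntimeError(f"{job} failed on all {len(nodes)} node(s)")

def run_on(node, job, timeout_s):
    # Toy stand-in: the node named "down" never completes its jobs.
    return None if node == "down" else f"{job} done"

print(run_with_resubmission("job-7", ["down", "n1", "n2"]))
```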

155
  • e) Grid data management
  • reliable and secure method for moving files and
    data
  • f) Grid job/resource management
  • Grid Resource Allocation (GRAM)
  • f1) keeps track of available grid resources,
    node capacities and current utilisation levels,
    and of current grid users
  • → passes this information to the Scheduler, for
    deciding where to submit jobs
  • → also uses it to monitor the grid's
    unpredictable incidents: outages, congestion
  • → and for administration: overall usage
    patterns, statistics, logging resource usage for
    accounting purposes
  • f2) services to launch a job on a set of
    resources
  • to check its status
  • to get the results when the job is complete