Principles of High Performance Computing ICS 632

Provided by: henrica

1
Principles of High Performance Computing (ICS 632)
  • Introduction to Grid Computing

2
Grid Computing
  • Term coined by Ian Foster in the mid-90s
  • Vision for large-scale computing
  • Analogy with the Power Grid
  • Availability
  • Standard Interface
  • Distributed
  • Market

4
On 10/07/2008
Big companies
New companies
5
Science Today
  • Massive computer simulation
  • Massive computerized data analysis

6
Multiple CPUs
  • The single-CPU system that solves today's problems
    is not going to happen for a long time!
  • and by then, we'll have bigger problems anyway
  • Solution: use multiple CPUs

SGI Altix (up to 512 CPUs)
Dual-Xeon Motherboard
7
Multiple Computers
  • Adding CPUs to a single computer becomes very
    expensive
  • How about multiple computers together?
  • Linux Clusters (60% of Top-500 list)

Blue Gene: 30K computers
8
Beyond the machine room?
  • Need more capacity than available at (most)
    single sites
  • Everyone would like a 10K-node 100GHz cluster
  • Very expensive (cooling, power)
  • More economical to have multiple sites
  • Need to locate available resources now
  • Data/Instruments are inherently distributed

9
Grid Computing: A Brief History
  • Early 90s
  • Gigabit testbeds, metacomputing
  • Mid to late 90s
  • Early experiments (e.g., I-WAY), academic
    software projects (e.g., Globus, Legion),
    application experiments
  • 2005
  • Hundreds of application communities and projects
  • Major infrastructure deployments
  • Significant technology base (esp. Globus
    Toolkit™)
  • Major industrial interest and involvement
  • Global Grid Forum: 400 organizations, 50
    countries

10
Grid Computing: A Definition
  • Definition: resource sharing and coordinated
    problem solving in dynamic, multi-institutional
    virtual organizations

11
The TeraGrid
10,000 processors, 1 PetaByte of storage
12
Grids that I use
13
Desktop Grids (SETI@home)
  • Detect any alien signals received through
    the Arecibo radio telescope
  • Uses the idle cycles of computers to analyze
    the data generated from the telescope
  • Over 500,000 active participants, most of whom
    run a screensaver on a home PC
  • Over a cumulative 20 TeraFlop/sec
  • TeraGrid: 40 TeraFlop/sec
  • Cost: $700K!!
  • TeraGrid: > $100M
  • Companies: United Devices
  • Intranet solutions
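
A quick back-of-the-envelope calculation makes the cost comparison on this slide concrete. This is only a sketch: it assumes the two cost figures are in US dollars, and it treats the TeraGrid cost as a lower bound because the slide only says "more than 100M".

```python
# Cost per TeraFlop/sec, using the figures quoted on the slide.
# Assumptions: costs are in US dollars; the TeraGrid cost is a lower bound.

def cost_per_teraflop(cost_dollars: float, teraflops: float) -> float:
    """Return the dollars spent per TeraFlop/sec of capacity."""
    return cost_dollars / teraflops

seti = cost_per_teraflop(700_000, 20)           # SETI@home
teragrid = cost_per_teraflop(100_000_000, 40)   # TeraGrid (at least)

print(f"SETI@home: ${seti:,.0f} per TeraFlop/sec")
print(f"TeraGrid:  at least ${teragrid:,.0f} per TeraFlop/sec")
print(f"Ratio:     at least {teragrid / seti:.0f}x")
```

Under these assumptions the desktop grid is about 70 times cheaper per TeraFlop/sec, although the two systems are not interchangeable: the desktop grid only suits independent tasks.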

14
Domain-specific VOs (CMS)
1800 Physicists, 150 Institutes, 32 Countries
100 PB of data by 2010; 50,000 CPUs
15
Domain-specific VOs (CMS)
16
Domain-Specific Grids (Grid3)
  • Grid2003: An Operational Grid (Oct 2003)
  • 28 sites (3K CPUs)
  • 7 VOs (each for a physics application)

17
Domain-Specific Grids (Grid3)
18
The Big Question
  • At some level, all these applications share
    common needs
  • find resources
  • acquire resources
  • locate and move data
  • start/monitor computation
  • all securely and conveniently
  • Can a single software infrastructure support all
    of the above?

19
The Globus Alliance
  • http://www.globus.org
  • Development of Grid protocols and services
  • Protocol-mediated access to remote resources
  • "On the Grid" means speaking Intergrid protocols
  • Mostly (extensions to) existing protocols
  • Development of Grid APIs and SDKs
  • Interfaces to Grid protocols and services
  • Facilitate application development by supplying
    higher-level abstractions

20
Globus Toolkit (GTK)
  • A software toolkit addressing key technical
    problems in the development of Grid-enabled
    tools, services, and applications
  • Offers a modular set of orthogonal services
  • Implements standard Grid protocols and APIs
  • Available under liberal open source license
  • Large community of developers and users
  • Commercial support

21
GTK Services
  • GTK services span four main areas
  • Security
  • Resource Management
  • Data Management
  • Information Services
  • Version 2.4 released in 2003
  • Garnered a large scientific user community and
    became the de-facto standard

22
Globus Security
  • Usual concepts
  • Authentication: establishing identity
  • Authorization: establishing rights
  • Key Features
  • easy to use
  • single sign-on
  • delegation
  • mutual user-resource authentication
  • integration with local systems (Kerberos, AFS,
    ...)
  • Can be called directly by developers
  • Is integrated as part of most Globus SDKs
  • Typically (mostly) invisible to the user

23
Create Processes at A and B that Communicate and Access Files at C
[Figure: a Globus resource manager and a computer at Site A (Kerberos) and at Site B (Unix); a Globus FTP server and a storage system at Site C (Kerberos)]
24
Globus Security
  • Globus Security Infrastructure (GSI)
  • Extensions to standard protocols and APIs
  • Standards: SSL/TLS, X.509 and CA, GSS-API
  • Extensions for single sign-on and delegation
  • Uses well-known PKI technology
  • A private key is used to encrypt (sign) data.
  • The matching public key can decrypt (verify) data
    encrypted with the private key.
  • All in an X.509 certificate:
  • Someone's subject name (user ID)
  • Their public key
  • A signature from a Certificate Authority (CA)
  • Certificates for users and resources
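
The certificate idea on this slide can be sketched with textbook RSA. This is a toy illustration only: the key is far too small to be secure, the certificate is a plain dictionary rather than a real X.509 structure, and the subject name and user key are made up for the example.

```python
import hashlib

# Toy RSA key for the CA (textbook-sized, NOT secure): n = 61 * 53.
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # CA private exponent (needs Python 3.8+)

def digest(subject: str, public_key: int) -> int:
    """Hash the certificate contents down to an integer below N."""
    data = f"{subject}|{public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def ca_sign(subject: str, public_key: int) -> dict:
    """The CA transforms the digest with its private key: that is the signature."""
    return {"subject": subject, "public_key": public_key,
            "signature": pow(digest(subject, public_key), D, N)}

def verify(cert: dict) -> bool:
    """Anyone holding the CA public key (E, N) can check the signature."""
    recovered = pow(cert["signature"], E, N)
    return recovered == digest(cert["subject"], cert["public_key"])

cert = ca_sign("/O=Grid/CN=Jane Doe", public_key=12345)      # hypothetical user
forged = dict(cert, signature=(cert["signature"] + 1) % N)   # tampered signature
print(verify(cert), verify(forged))  # prints: True False
```

The same check works for users and for resources, which is what makes mutual authentication possible: each side verifies the other's certificate against a CA it trusts.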

25
Globus Resource Mngmt
  • Goal: allow users and applications to find and
    utilize grid resources
  • Requirements
  • Allows for programs to be started on grid
    resources
  • must provide an interface to local mechanisms
    (fork, PBS, SGE, Condor, etc.)
  • Allows for resources to be described in some sort
    of language
  • Allows for reservation, co-allocation, etc.
  • Note that there are many hard policy issues
    here, which are not addressed by Globus SDKs

26
Globus Resource Mngmt
  • GRAM: Globus Resource Allocation Manager
  • implemented as part of a gatekeeper daemon that
    sits on top of the local resource manager
  • receives requests and starts local processes
  • uses HTTP, integrated with GSI
  • RSL: Resource Specification Language
  • Specifies resource requirements
  • Specifies job configuration
  • Uses LDAP-like syntax
  • Support for reservation and co-allocation (GARA)
  • A few other things
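
To give a feel for the LDAP-like syntax of RSL, here is a minimal sketch that renders a job description as a conjunction of (attribute=value) relations. The helper `to_rsl` is hypothetical and not part of any Globus SDK, and the attribute names and values are illustrative.

```python
def to_rsl(job: dict) -> str:
    """Render a job description as '&(attr=value)(attr=value)...'."""
    def render(value) -> str:
        if isinstance(value, (list, tuple)):
            return " ".join(render(v) for v in value)
        text = str(value)
        # Quote values containing characters that would confuse a parser.
        return f'"{text}"' if any(c in text for c in ' ()=&|') else text
    return "&" + "".join(f"({key}={render(value)})" for key, value in job.items())

# Illustrative job: resource requirements and job configuration together.
job = {"executable": "/bin/hostname", "arguments": ["-f"], "count": 4}
print(to_rsl(job))  # prints: &(executable=/bin/hostname)(arguments=-f)(count=4)
```

A string like this is what a client would hand to the gatekeeper, which then maps it onto the local resource manager (fork, PBS, SGE, Condor, etc.).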

27
Globus Data Mngmt
  • Goal: provide all functionality needed for
    supporting a community of users that use/produce
    large data collections
  • Often termed DataGrid
  • convenient terminology
  • not a separate infrastructure
  • Requirements
  • Efficient protocol for large data transfers
  • Ways to replicate data and locate replicas

28
Globus Data Mngmt
  • Data transfers with GridFTP
  • Extension to the well-supported FTP protocol
  • Integrated with GSI
  • Third-party transfers
  • Partial file access
  • Striping and interface to MPI I/O
  • Parallel transfers
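
Partial file access and parallel transfers combine naturally: split the file into byte ranges and move each range on its own stream. The sketch below only imitates this in memory with threads; it does not speak GridFTP, and the function names are made up for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def split_ranges(size: int, streams: int) -> list:
    """Divide [0, size) into contiguous (offset, length) blocks, one per stream."""
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges

def parallel_copy(source: bytes, streams: int = 4) -> bytes:
    """Copy `source` using several concurrent partial reads."""
    dest = bytearray(len(source))

    def fetch(block):
        offset, length = block
        # Each stream reads and writes only its own byte range.
        dest[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(fetch, split_ranges(len(source), streams)))
    return bytes(dest)

data = bytes(range(256)) * 1000
print(split_ranges(10, 3))                     # prints: [(0, 4), (4, 3), (7, 3)]
print(parallel_copy(data, streams=8) == data)  # prints: True
```

On a wide-area link, several streams help because a single TCP connection often cannot fill a high-bandwidth, high-latency path on its own.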

29
Globus Data Management
  • Metadata catalog: describes data
  • Replica catalog: file locations (LDAP)

[Figure: Application, Metadata Catalog, Replica Catalog, and replica selection among six replicas; Globus provides APIs]
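
The two catalogs can be pictured as two lookups followed by a selection step. The sketch below uses plain dictionaries in place of the LDAP-backed catalogs; the logical file name, host names, and bandwidth estimates are all hypothetical, and picking the replica with the highest estimated bandwidth is just one possible selection policy.

```python
from urllib.parse import urlparse

# Metadata catalog: logical file name -> descriptive attributes.
metadata_catalog = {
    "lfn://cms/run42.dat": {"experiment": "cms", "run": 42},
}

# Replica catalog: logical file name -> physical locations (hypothetical hosts).
replica_catalog = {
    "lfn://cms/run42.dat": [
        "gsiftp://site-a.example.org/data/run42.dat",
        "gsiftp://site-b.example.org/store/run42.dat",
    ],
}

# Estimated bandwidth to each host in MB/s (made-up numbers).
bandwidth = {"site-a.example.org": 12.0, "site-b.example.org": 85.0}

def find_files(**attributes) -> list:
    """Return the logical names whose metadata matches every given attribute."""
    return [lfn for lfn, meta in metadata_catalog.items()
            if all(meta.get(k) == v for k, v in attributes.items())]

def select_replica(lfn: str) -> str:
    """Pick the physical copy on the host with the best estimated bandwidth."""
    return max(replica_catalog[lfn],
               key=lambda url: bandwidth.get(urlparse(url).hostname, 0.0))

for lfn in find_files(experiment="cms", run=42):
    print(lfn, "->", select_replica(lfn))
```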
30
Globus Information Services
  • Goal: allow decision making
  • Need information to answer:
  • What resources are available?
  • What is their state?
  • Which ones should I use?
  • Challenges
  • Information is always old
  • Distributed state is never coherent
  • Scalability over tens of thousands of resources

31
Globus Information Services
  • Globus provides two components
  • Resource description services
  • Supplies information about a resource
  • Resource index services
  • Supplies information which was gathered from
    resource description services
  • Provides naming and indexing for fast retrieval
  • Uses LDAP
  • With two protocols
  • Grid resource registration protocol
  • Grid resource enquiry protocol
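
A minimal sketch of an index service with the two protocols follows. It assumes a soft-state design in which resources re-register periodically and stale entries simply expire; the class, its method names, and the resource attributes are hypothetical, and an in-memory dictionary stands in for LDAP.

```python
import time

class ResourceIndex:
    """Toy index service: resources register periodically; stale entries expire."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.entries = {}  # name -> (description, time of last registration)

    def register(self, name, description, now=None):
        """Registration protocol: a resource announces or refreshes itself."""
        stamp = time.time() if now is None else now
        self.entries[name] = (description, stamp)

    def enquire(self, now=None, **wanted):
        """Enquiry protocol: live resources matching the attributes, with data age."""
        now = time.time() if now is None else now
        result = {}
        for name, (description, stamp) in self.entries.items():
            age = now - stamp
            if age <= self.ttl and all(description.get(k) == v for k, v in wanted.items()):
                result[name] = {**description, "age_seconds": age}
        return result

index = ResourceIndex(ttl_seconds=60)
index.register("clusterA", {"os": "linux", "free_cpus": 16}, now=0)
index.register("clusterB", {"os": "linux", "free_cpus": 4}, now=50)
print(index.enquire(now=70, os="linux"))  # only clusterB is still fresh
```

Returning the age of each entry makes the first challenge explicit: the information is always old, so the caller has to decide how much staleness it can tolerate.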

32
GT2: Why is it good?
  • Good technical solutions for key problems
  • Has enabled the first generation of production
    grid systems and applications
  • Has provided reference implementations that
    interface to many systems
  • Has garnered industrial support
  • Has created a community of developers who
  • build on top of Globus services
  • add to Globus services

33
GT2: Why is it bad?
  • Protocol deficiencies, e.g.
  • Heterogeneous basis: HTTP, LDAP, FTP, custom
  • No standard means of invocation, notification,
    error propagation, authorization, termination, ...
  • Significant missing functionality
  • e.g., interfaces to classes of resources
  • databases, instruments, etc.
  • requires the development of specialized
    interfaces, as was done for, e.g., batch systems
  • Little work on total system properties, e.g.
  • Dependability, end-to-end QoS, ...
  • Reasoning is made difficult by
    protocol/implementation heterogeneity

34
Evolution of Business
  • We see something that already happened to
    programming languages
  • FORTRAN for science
  • COBOL for business
  • But scientists want to manipulate records
  • And businesses want to do forecasting
  • Today: modern language designers do not make a
    distinction between scientific languages and
    business languages.
  • Distributed computing done by scientists is
    resembling distributed computing done by
    businesses, and increasingly so.

35
Business Grid Computing
  • Walmart
  • 423 TBytes
  • data from 1,387 discount stores, 1,615
    Supercenters, 542 Sam's Clubs, and 75
    Neighborhood Markets in the United States, plus
    1,520 more stores worldwide.
  • Real time computing action
  • Amazon.com
  • Processes several GBytes of data per second
  • Linux clusters
  • eBay
  • 2 data centers, 5 planned
  • Google
  • Pixar, Dreamworks

36
Business Grid Computing
  • Grid Computing useful for businesses
  • Other intriguing applications
  • On-line gaming
  • File-sharing applications
  • Question: could many non-scientific applications
    require the same software infrastructure as
    scientific applications?
  • Should we sell GTK to industry???

37
Web Services!!
  • Increasingly popular standards-based framework
    for accessing network applications
  • W3C standardization; Microsoft, IBM, Sun, others
  • WSDL: Web Services Description Language
  • Interface Definition Language for Web services
  • SOAP: Simple Object Access Protocol
  • XML-based RPC protocol; common WSDL target
  • WS-Inspection
  • Conventions for locating service descriptions
  • UDDI: Universal Description, Discovery, and Integration
  • Directory for Web services
  • Clearly provides a lot of the things we need to
    achieve the goals of Grid computing and to
    satisfy the technology requirements
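
To make "XML-based RPC" concrete, the sketch below builds and parses a SOAP 1.1 style envelope with the Python standard library. The service namespace, the operation name, and its parameters are hypothetical; a real client would normally generate this code from a WSDL description instead of writing it by hand.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:jobs"                             # hypothetical service namespace

def build_request(operation: str, **params) -> bytes:
    """Wrap a remote call and its parameters in Envelope/Body elements."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")

def parse_request(message: bytes):
    """Recover the operation name and parameters on the receiving side."""
    body = ET.fromstring(message).find(f"{{{SOAP_NS}}}Body")
    call = body[0]
    operation = call.tag.split("}", 1)[1]   # strip the '{namespace}' prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params

message = build_request("submitJob", executable="/bin/date", count=2)
print(parse_request(message))
```

Everything travels as text, so the integer parameter comes back as the string "2"; typing the parameters is the job of the schema referenced from the WSDL.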

38
Four Fundamental Concepts
  • Naming and bindings
  • Ways to reference a service
  • Information model
  • Ways to find out information about a service
  • Lifecycle
  • Ways to create and destroy services
  • Done by factories
  • Services have time-to-live
  • Notification
  • Ways to be notified when something happens
  • With these simple concepts, it is possible to
    re-implement all the functionality of GTK2
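
Three of these concepts (naming, lifecycle, notification) fit in a few lines. The sketch below is a toy model, not the OGSA interfaces: the class and method names are invented, and time is passed in explicitly so the behavior is easy to follow.

```python
class Factory:
    """Toy factory: creates services with a time-to-live and reports lifecycle events."""

    def __init__(self, kind):
        self.kind = kind
        self.expiry = {}      # service handle -> expiration time
        self.listeners = []   # callbacks used for notification
        self.count = 0

    def subscribe(self, callback):
        """Notification: register a callback(event, handle)."""
        self.listeners.append(callback)

    def create(self, now, lifetime):
        """Lifecycle: create a service and return a handle that names it."""
        self.count += 1
        handle = f"{self.kind}-{self.count}"
        self.expiry[handle] = now + lifetime
        self._notify("created", handle)
        return handle

    def keepalive(self, handle, now, lifetime):
        """Extend the lifetime of a service the client still needs."""
        self.expiry[handle] = now + lifetime

    def sweep(self, now):
        """Destroy every service whose lifetime has run out."""
        for handle in [h for h, t in self.expiry.items() if t <= now]:
            del self.expiry[handle]
            self._notify("destroyed", handle)

    def _notify(self, event, handle):
        for callback in self.listeners:
            callback(event, handle)


events = []
miners = Factory("miner")
miners.subscribe(lambda event, handle: events.append((event, handle)))
miner = miners.create(now=0, lifetime=10)
miners.keepalive(miner, now=8, lifetime=10)  # client is still interested
miners.sweep(now=15)                         # still alive: expires at 18
miners.sweep(now=20)                         # no further keepalive: destroyed
print(events)  # prints: [('created', 'miner-1'), ('destroyed', 'miner-1')]
```

The time-to-live means a crashed or forgetful client cannot leak services forever: anything that stops receiving keepalives is eventually reclaimed.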

39
Data Mining for Bioinformatics
[Figure, repeated on slides 39 to 48: User Application, Community Registry, Mining Factory, Compute Service Provider, Database Factory, Storage Service Provider, and Database Services for BioDB 1 ... BioDB n. Credit: Ian Foster]
  • "I want to create a personal database containing
    data on E. coli metabolism"

40
Data Mining for Bioinformatics
  • "Find me a data mining service, and somewhere to
    store data"

41
Data Mining for Bioinformatics
  • "Handles for Mining and Database factories"

42
Data Mining for Bioinformatics
  • "Create a data mining service with initial
    lifetime 10"
  • "Create a database with initial lifetime 1000"

43
Data Mining for Bioinformatics
  • Same two requests as on the previous slide
  • A Miner and a Database now appear in the diagram

44
Data Mining for Bioinformatics
  • Two "Query" labels are added to the diagram

45
Data Mining for Bioinformatics
  • Two "Keepalive" labels are added alongside the
    queries

46
Data Mining for Bioinformatics
  • Two "Results" labels replace the queries; both
    keepalives remain

47
Data Mining for Bioinformatics
  • A single "Keepalive" label remains; the Miner and
    the Database are still shown

48
Data Mining for Bioinformatics
  • The Miner is gone; the Keepalive and the Database
    remain
49
GT3 OGSA Globus
  • GT3 Core
  • Implements Grid service interfaces and behaviors
  • Reference implementation of evolving standard
  • Java first, C soon, C#?
  • GT3 Base Services
  • Evolution of current Globus Toolkit capabilities
  • Backward compatible
  • Many other Grid services on top
  • Not too often that Academia really follows
    Industry :)

[Figure: layered stack, top to bottom: Other Grid Services, GT3 Data Services, GT3 Base Services, GT3 Core]
50
WSRF
  • The WS community critiqued OGSI
  • Too much stuff in one specification
  • Does not work well with current WS and XML tools
  • WSDL2.0 incompatible with OGSI extensions of
    WSDL1.0
  • Web Services Resource Framework
  • Re-factoring of OGSI to exploit new development
    in WS technology
  • Implemented in GTK 4.x

51
Evolution
52
Production Grids
  • Let's look at one production Grid
  • TeraGrid
  • Material from the "State of the TeraGrid"
    presentation by Charlie Catlett
  • Other material from www.teragrid.org

53
TeraGrid
55
Roaming Usage?
  • Roaming Usage
  • Users request a grid resource
  • They may end up running anywhere
  • Specific Usage
  • Users request a particular resource
  • Roaming is only a small portion of the workload
  • Means that users don't really buy into the grid
    idea
  • Probably due to the fact that contention for
    resources isn't super high right now
  • But still close to 100% utilization
  • It takes time for users to truly trust this grid
    thing

59
Tons of Hardware Resources
  • http://www.teragrid.org/userinfo/hardware/specs.php
  • Indiana
  • 96-node IA32 cluster
  • NCSA
  • 2x512-proc SGI shared memory machine
  • 632-node IA64
  • 1.2K-node Condor pool
  • 1280-node Xeon cluster
  • PSC
  • 2090-node Cray XT3
  • 765-node Alphaserver cluster
  • Purdue
  • 2.6K-node Condor pool
  • etc.....
  • Storage, Visualization, Common Software Stack

60
Grid Applications
  • Applications that run over multiple resources are
    not large MPI applications
  • Network latency would be prohibitive
  • Heterogeneity is sort of annoying
  • Co-scheduling may be difficult
  • Many current Grid applications fit into the
    hybrid parallelism category
  • See Scheduling lecture
  • Note that many of these Grids just support many
    applications which in themselves are not really
    Grid applications
  • e.g., users just want to use a cluster
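
The latency argument can be made concrete with a very simple model: each iteration computes for a while, then waits for a number of sequential message exchanges. All numbers below are hypothetical and chosen only for illustration, and the model ignores bandwidth and any overlap of communication with computation.

```python
def parallel_efficiency(compute_s: float, messages: int, latency_s: float) -> float:
    """Fraction of each iteration spent computing rather than waiting on messages."""
    communication_s = messages * latency_s
    return compute_s / (compute_s + communication_s)

# Hypothetical numbers, for illustration only.
COMPUTE_PER_ITERATION = 0.010   # 10 ms of computation per iteration
MESSAGES_PER_ITERATION = 100    # sequential message exchanges per iteration
LAN_LATENCY = 50e-6             # 50 microseconds inside one cluster
WAN_LATENCY = 50e-3             # 50 milliseconds between distant sites

for name, latency in [("single cluster", LAN_LATENCY), ("multi-site grid", WAN_LATENCY)]:
    eff = parallel_efficiency(COMPUTE_PER_ITERATION, MESSAGES_PER_ITERATION, latency)
    print(f"{name}: {eff:.1%} of the time spent computing")
```

With these made-up values a tightly coupled code spends about two thirds of its time computing inside one cluster, and well under one percent when spread across sites, which is why loosely coupled applications are the ones that actually run across multiple resources.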

61
Conclusion
  • Grid Computing is not only a buzzword
  • It's real and grids are in place
  • But a lot of Grid solutions are just old
    products rehashed
  • Affix a "Grid Inside" sticker
  • Many different notions of grids
  • The current state of the software infrastructure
    is an ok engineering solution, which could be
    vastly improved
  • But it's usable and used
  • Not clear where industry/academia will drive it
  • The new thing is Cloud Computing
  • Virtual Machine Technology, leasing of resources
  • In the next lecture we'll talk about systems
    issues for grid computing
  • These are independent of the technology trend of
    the moment