Title: Grids
1Grids
- darcy.quesnel@canarie.ca
2A new way of doing science
- Science has been in vivo: test tubes, wet labs, and big instruments
- Science is increasingly in silico: large, compute-intensive, data-intensive simulations on distributed resources
- More and more science is network and computationally based
- The US and Europe have announced new and renewed funding for grid-related research
3Many Grid-Related Research Projects
LHC
ATLAS
4Large Hadron Collider Computing Grid
[Diagram: LHC computing tier hierarchy]
- Data rates into the hierarchy: PBytes/sec, then 100 MBytes/sec to the Offline Processor Farm (20 TIPS), then 100 MBytes/sec to Tier 0
- Tier 0: CERN Computer Centre; 622 Mbits/sec per channel to Tier 1
- Tier 1: FermiLab (4 TIPS), France Regional Centre, Italy Regional Centre, Germany Regional Centre; 622 Mbits/sec to Tier 2
- Tier 2: 622 Mbits/sec per channel to the institutes
- Institutes (0.25 TIPS): physics data cache; 1 MBytes/sec to Tier 4
- Tier 4: Physicist workstations
- Canada (through TRIUMF) is working on being a tier 1.5 site
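The rates in this hierarchy mix megabits and megabytes per second, which makes them hard to compare at a glance. The sketch below converts between the two and estimates how long a dataset would take to move over a given link. The 1 TB dataset size is an illustrative assumption, not a figure from the slides; decimal units (1 MByte = 10^6 bytes) and a fully sustained link are also assumed.

```python
def mbits_to_mbytes(mbits_per_sec: float) -> float:
    """Convert megabits/sec to megabytes/sec (8 bits per byte)."""
    return mbits_per_sec / 8.0


def transfer_hours(size_bytes: float, mbytes_per_sec: float) -> float:
    """Hours needed to move size_bytes at a sustained rate, ignoring overhead."""
    seconds = size_bytes / (mbytes_per_sec * 1e6)
    return seconds / 3600.0


tier_link = mbits_to_mbytes(622)                      # the 622 Mbits/sec links
dataset = 1e12                                        # hypothetical 1 TB dataset
hours_on_tier_link = transfer_hours(dataset, tier_link)
hours_to_workstation = transfer_hours(dataset, 1.0)   # the 1 MBytes/sec link

print(f"622 Mbits/sec is {tier_link:.2f} MBytes/sec")
print(f"1 TB over a 622 Mbits/sec link: {hours_on_tier_link:.1f} hours")
print(f"1 TB over a 1 MBytes/sec link: {hours_to_workstation:.1f} hours")
```

The gap between the two results suggests why data is cached at the institutes rather than pulled to workstations on demand.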
6Canadian Forestry Grid
- SAFORAH (System of Agents for Forest Observation Research with Automation Hierarchies)
- SAFORAH connects five locations across the country to support the monitoring of Canada's forests
- Together, all five locations will generate 40 terabytes (TB) of data per month
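To put 40 TB per month in network terms, the following sketch turns a monthly volume into an average sustained rate. It assumes a 30-day month, decimal terabytes (10^12 bytes), and perfectly even production over time; real traffic would be burstier.

```python
SECONDS_PER_DAY = 86_400


def average_rate(tb_per_month: float, days_in_month: int = 30):
    """Return (MBytes/sec, Mbits/sec) needed to sustain a monthly data volume."""
    bytes_per_month = tb_per_month * 1e12
    bytes_per_sec = bytes_per_month / (days_in_month * SECONDS_PER_DAY)
    return bytes_per_sec / 1e6, bytes_per_sec * 8 / 1e6


mbytes_per_sec, mbits_per_sec = average_rate(40)
print(f"{mbytes_per_sec:.1f} MBytes/sec, {mbits_per_sec:.0f} Mbits/sec on average")
```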
7Folding@home
http://www.stanford.edu/group/pandegroup/Cosm/
http://members.ud.com/vypc/cancer/
- This "virtual supercomputer" uses peer-to-peer technology to make unprecedented amounts of processing power available to medical researchers, to accelerate the development of improved treatments and drugs that could potentially cure diseases
- Rapid new discoveries in cancer research
8ALTA Cosmic Ray eScience
- The ALTA project is a collaborative scientific research project involving the University of Alberta Center for Subatomic Research and over 50 high schools across Canada in the area of cosmic ray detection
- Teachers and students actively contribute to the physics research while learning about an exciting area of modern science
- Distributed computing at schools will be required to analyze data from sensors in near real time
- Will allow researchers to gain a deeper understanding of the deepest reaches of space and time
9Neptune eScience Grid
- Joint US-Canadian project to build a large undersea dark fiber network off the west coast of the USA and Canada
- Undersea network will connect instrumentation devices, robotic submarines, sensors, undersea cameras, etc.
- All devices available to students and researchers connected to CA*net 4 and Internet 2 networks
- Neptune will be used to gather research data in a variety of fields: seismology, sea vents, fish migrations and population, deep sea aquatic life, etc.
- Distributed computing and data storage devices on CA*net 4 and Internet 2 will be used to analyze and store data
10Amateurs discover Supernovas
http://www.nytimes.com/2002/11/07/technology/circuits/07astr.html?todaysheadlines
NASA and amateur scientists nightly harvest about 1,000 images, which are shared with other amateur astronomers over the Internet. Together, they analyze the pictures for previously undiscovered supernovas, the remains of collapsed stars.
Over 58 supernovas have been discovered.
While most amateur astronomers use computers to enhance a hobby, the advances in technology are also blurring the distinctions between professionals and sophisticated amateurs.
11Sloan Digital SkyServer
- http://skyserver.sdss.org/en/
- Large database of astronomical data and images
- Available to scientists, students and public
- XML and Java web services interfaces
12Faulkes Telescope
- Provides UK schools with access to a research-class telescope in Hawaii
- Provides an exciting resource for teachers to use via the Web
- Provides a real-time experience of astronomy through live use of a telescope
- Allows students to participate in real research programs, mentored by professional astronomers
- Provides other public interest groups, such as amateurs, access to high quality astronomical data
- http://www.faulkes-telescope.com/
13What is the Grid?
- "Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations."
- The Anatomy of the Grid, Foster, et al.
- Coordinates resources that are not subject to centralized control
- Uses standard, open, general-purpose protocols and interfaces, and
- Delivers nontrivial qualities of service.
- What is the Grid? Foster
14Grid Middleware Responsibilities
- Security Infrastructure
- Single sign-on, delegation, message protection (integrity)
- Authentication (identity), authorization (rights), policy
- Information Management
- Soft-state, registration, discovery, selection, monitoring
- Resource Management
- Remote service invocation, reservation, allocation, co-allocation
- Resource specification
- Data Management
- High-performance, remote data access
- Cataloguing, replication, staging
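The soft-state idea under Information Management can be illustrated with a small sketch: a resource stays discoverable only while it keeps re-registering, so crashed or disconnected resources drop out without any explicit deregistration. The class name, the 30-second lifetime, and the resource names below are all hypothetical; this is a toy model and not the interface of any Globus information service.

```python
import time
from typing import Callable, Dict, List


class SoftStateRegistry:
    """Toy registry: an entry disappears unless refreshed within `ttl` seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._expires: Dict[str, float] = {}

    def register(self, name: str) -> None:
        """Register or refresh a resource; both simply reset its expiry time."""
        self._expires[name] = self.clock() + self.ttl

    def discover(self) -> List[str]:
        """Return the names of resources whose registration has not expired."""
        now = self.clock()
        self._expires = {n: t for n, t in self._expires.items() if t > now}
        return sorted(self._expires)


# Demonstration with a fake clock so the example is deterministic.
fake_now = [0.0]
registry = SoftStateRegistry(ttl=30.0, clock=lambda: fake_now[0])
registry.register("cluster-a")
registry.register("cluster-b")

fake_now[0] = 20.0
registry.register("cluster-a")    # cluster-a refreshes; cluster-b stays silent

fake_now[0] = 40.0
alive = registry.discover()       # cluster-b lapsed at t=30, cluster-a is good until t=50
print(alive)
```

Because expiry is the default, the registry needs no failure detector; the cost is periodic refresh traffic from every live resource.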
15Grid Security Infrastructure
[Diagram: Grid Security Infrastructure]
- User: single sign-on via "grid-id" and generation of proxy cred., or retrieval of proxy cred. from online repository
- Site A (Kerberos): Computer
- Site B (Unix): Computer
- Site C (Kerberos): Storage system
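The single sign-on flow pictured here rests on short-lived proxy credentials derived from the user's long-lived identity. The sketch below models only the bookkeeping: a proxy chains back to its issuer, cannot outlive it, and still resolves to the original identity. It deliberately contains no cryptography; the `Credential` class, the subject names, and the lifetimes are illustrative assumptions, not the GSI data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    subject: str
    issuer: Optional["Credential"]   # None for the user's long-lived identity
    not_after: float                 # expiry time in seconds on a shared clock


def make_proxy(parent: Credential, lifetime: float, now: float) -> Credential:
    """Derive a proxy from `parent`; the proxy can never outlive its parent."""
    return Credential(
        subject=parent.subject + "/CN=proxy",
        issuer=parent,
        not_after=min(parent.not_after, now + lifetime),
    )


def is_valid(cred: Optional[Credential], now: float) -> bool:
    """A credential is usable only if every link in its chain is unexpired."""
    while cred is not None:
        if now >= cred.not_after:
            return False
        cred = cred.issuer
    return True


def identity_of(cred: Credential) -> str:
    """Walk the chain back to the identity that originally signed on."""
    while cred.issuer is not None:
        cred = cred.issuer
    return cred.subject


# Hypothetical user who signs on once, then delegates to a remote job.
user = Credential("/O=Grid/CN=Alice", None, not_after=1_000_000.0)
proxy = make_proxy(user, lifetime=12 * 3600, now=0.0)
delegated = make_proxy(proxy, lifetime=24 * 3600, now=3600.0)

print(identity_of(delegated), is_valid(delegated, now=40_000.0))
```

Note that the delegated proxy asks for 24 hours but is capped at the 12-hour expiry of the proxy that issued it, which limits the damage if a delegated credential is stolen.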
16Computational Grids
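[Diagram: computational grid resource management]
- Application produces RSL
- RSL specialization, using Queries and Info from the Information Service, yields Ground RSL
- Ground RSL is broken down into simple ground RSL
- Simple ground RSL goes to GRAM instances, each in front of a local resource manager: LSF, Condor, NQE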
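The flow from RSL to simple ground RSL can be sketched as a small refinement step: an abstract request is combined with information about free capacity and split into concrete per-resource requests. The `&(attribute=value)` rendering and the `resourceManagerContact` attribute follow the general shape of Globus RSL as best understood here, but the greedy splitting policy, the function names, and the host names are assumptions made purely for illustration.

```python
from typing import Dict, List


def to_rsl(attrs: Dict[str, object]) -> str:
    """Render attributes as a conjunction in RSL-like syntax."""
    return "&" + "".join(f"({key}={value})" for key, value in attrs.items())


def specialize(abstract: Dict[str, object],
               free_nodes: Dict[str, int]) -> List[Dict[str, object]]:
    """Greedily split an abstract request across resources with free nodes."""
    remaining = int(abstract["count"])
    ground: List[Dict[str, object]] = []
    for contact, free in free_nodes.items():
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            request = dict(abstract)
            request["count"] = take
            request["resourceManagerContact"] = contact
            ground.append(request)
            remaining -= take
    if remaining:
        raise ValueError("not enough free nodes to satisfy the request")
    return ground


# Hypothetical answer from an information service: free nodes per resource.
info = {"lsf.example.org": 4, "condor.example.org": 8}
abstract_request = {"executable": "/bin/hostname", "count": 6}

simple_requests = [to_rsl(r) for r in specialize(abstract_request, info)]
multi_request = "+" + "".join(f"({r})" for r in simple_requests)
for line in simple_requests:
    print(line)
```

Each simple request names exactly one resource manager, so it can be handed to the GRAM in front of that resource without further interpretation.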
17Data Grids
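[Diagram: data grid replica selection]
- Application sends an Attribute Specification to the Metadata Catalog and receives a Logical Collection and Logical File Name
- Replica Catalog maps the logical name to Multiple Locations
- Replica Selection uses Performance Information and Predictions from MDS and NWS to choose the Selected Replica
- Data is moved over the GridFTP Control Channel and GridFTP Data Channel
- Storage shown: Disk Cache, Tape Library, Disk Array, Disk Cache
- Replica Location 1, Replica Location 2, Replica Location 3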
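The replica selection step in this picture amounts to a lookup followed by a ranking: resolve the logical file name to its physical copies, then pick the copy with the best predicted transfer performance. The sketch below assumes an in-memory catalog and a table of bandwidth predictions; the file name, hosts, and numbers are invented for illustration, and real deployments would query the catalog and forecasting services instead.

```python
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical replica catalog: logical file name -> physical locations.
replica_catalog: Dict[str, List[str]] = {
    "run42/events.dat": [
        "gsiftp://site1.example.org/cache/run42/events.dat",
        "gsiftp://site2.example.org/tape/run42/events.dat",
        "gsiftp://site3.example.org/array/run42/events.dat",
    ],
}

# Hypothetical bandwidth forecasts in MBytes/sec, keyed by host.
predicted_rate: Dict[str, float] = {
    "site1.example.org": 12.0,
    "site2.example.org": 3.5,
    "site3.example.org": 40.0,
}


def select_replica(logical_name: str,
                   catalog: Dict[str, List[str]],
                   predictions: Dict[str, float]) -> str:
    """Return the replica URL whose host has the highest predicted rate."""
    candidates = catalog[logical_name]
    return max(
        candidates,
        key=lambda url: predictions.get(urlparse(url).hostname or "", 0.0),
    )


selected = select_replica("run42/events.dat", replica_catalog, predicted_rate)
print(selected)
```

Hosts with no forecast default to a rate of zero, so an unmeasured replica is chosen only when nothing better is known.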
18Grid Programming
- Globus
- MPICH-G2
- grid-enabled message passing
- CoG Kits
- Java, Python, Perl
- Portals
- GridPort, GPDK, GridLab
- Condor-G
- High-throughput task broker
- Cactus
- Grid-aware numerical software framework
19Open Grid Services Architecture
- Service-oriented, wide-area architecture that virtualizes resources
- From Web Services
- Standard interface definition mechanisms (WSDL)
- Multiple protocol bindings (SOAP)
- Multiple implementations (Java, C, Python, etc.)
- Location transparency (WS-Inspection)
- From Globus
- Grid Service semantics for interaction
- Management of transient services
- Factory, discovery, notification, lifetime management
- Reliable and secure transport
20Grid and Web Services Convergence
Grid
Web
However, despite enthusiasm for OGSI, adoption within the Web community turned out to be problematic
21Open Grid Services Architecture
Domain-Specific Services
Program Execution
Data Services
Core Services
Open Grid Services Infrastructure
WS-Resource Framework
Web Services Messaging, Security, Etc.
22WS-Resource Framework
- WS-Resource Framework
- WS-ResourceProperties
- Describes how to associate stateful resources with web services
- WS-ResourceLifetime
- Allows a requestor to destroy a resource
- WS-BaseFaults
- WS-ServiceGroup
- WS-RenewableReferences
- WS-Notification
- Family of specifications
- Uses
- WS-Addressing, WS-Security, etc.
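The central idea in this list, a stateless service paired with stateful resources that a requestor can inspect and destroy, can be shown in a few lines. In the sketch below the service object holds no per-client state of its own; every call names the resource it acts on. The counter example, method names, and use of a UUID as the resource key are assumptions for illustration and do not reproduce the WSRF message formats.

```python
import uuid
from typing import Dict


class CounterService:
    """Stateless front end; all state lives in resources addressed by key."""

    def __init__(self) -> None:
        self._resources: Dict[str, Dict[str, int]] = {}

    def create(self) -> str:
        """Create a new stateful resource and return its key."""
        key = str(uuid.uuid4())
        self._resources[key] = {"value": 0}
        return key

    def add(self, key: str, amount: int) -> None:
        """Operate on the resource named by `key`."""
        self._resources[key]["value"] += amount

    def get_property(self, key: str, name: str) -> int:
        """Read one property of a resource."""
        return self._resources[key][name]

    def destroy(self, key: str) -> None:
        """Destroy the resource at the requestor's request."""
        del self._resources[key]


service = CounterService()
first = service.create()
second = service.create()
service.add(first, 5)
service.add(second, 2)
first_value = service.get_property(first, "value")
service.destroy(first)
print(first_value, service.get_property(second, "value"))
```

Keeping the key outside the service is what lets two clients share one service endpoint while working on entirely separate state.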
23Global Grid Forum
- Community forum made up of researchers and practitioners
- Publishes best practices (not really standards or specifications)
- Areas
- Applications and programming models
- Architecture
- Data
- Security
- Information systems and performance
- P2P
- Scheduling and resource management
24What Are Portals?
- Portals are
- User-friendly
- Allows for a well-known client metaphor
- User-customizable
- User preferences w.r.t. presentation, content, etc.
- Middle tier between user and resources/services
- Allows the client tier to be thin
25What Are Portals?
Customizable content and layout
Content presented in portlets
Portlets themselves are customizable
26What Are Grid Portals?
- Grid Portals, like grid client tools, deal with
- Security (credential management)
- With the extra wrinkle that your private key does not reside on the portal
- Information management
- Discover resources and services
- Monitor status of resources and jobs
- Resource management
- Job submission and control
- Data management
- File staging, etc.
- We will focus on grid portals w.r.t. resource management, job submission and monitoring
27GridPort
28GridSphere
- Product of GridLab WP 4
- GridSphere is their implementation of a portlet container
- They also supply the GridLab web application, which is an archive of portlets and supporting class libraries that can be contained by GridSphere
- Tracking portlets spec.
29What are Portlets?
- The Portlet API provides a means for aggregating content sources and application front-ends and addresses security and personalization
- Portlets are extensions of servlets designed to be aggregated in the context of a web page
- Java Community Process JSR-168
- An evolving spec not yet ratified
- Various implementations
- Jetspeed (from the Apache Jakarta project)
- IBM WebSphere
- GridLab GridSphere
- Oracle 9i Portal
- BEA WebLogic Portal
- and many more
30NRC Internal High Performance Computing Grid
- NRC Institute of Information Technology is working with 5 other NRC institutes on a science-based project in multi-scale modelling
- Tracking (and contributing to) the GridLab GridSphere project
31NRC Internal High Performance Computing Grid
32Nuclear Magnetic Resonance
33University of Regina SPARRO Project
- Simulating particle beams for subatomic physics research
- Working with Jefferson Lab experiments
- Using GridPort with some additions
34University of Regina SPARRO Project
35University of Victoria Physics Grid Testbed
- Data grid oriented problems with large data sets from the Atlas Experiment at CERN's LHC
- Working with Atlas Canada and the CERN Atlas Experiment
- Using GridPort
36University of Victoria Physics Grid Testbed
37Some Grid Projects
- Enabling Grids for E-science in Europe (EGEE)
- Development and deployment of a service grid infrastructure
- DOEGrids (www.doesciencegrid.org)
- Deployment of Grid for DoE science labs (and partner universities)
- Access Grid (www.accessgrid.org)
- High-performance, multi-site video-conferencing with shared spaces
- Globus Project (www.globus.org)
- Research and development of Grid Middleware
- GridLab (www.gridlab.org)
- Development of Grid Application Toolkit (funded by the European Union)
38Canadian Grid Groups
- Grid Research Centre at U Calgary
- Rob Simmonds, Brian Unger, etc.
- Atlas Canada
- Randy Sobie, U Victoria; Bryan Carron, U Alberta; Michel Vetterli, TRIUMF; etc.
- International Grid Testbed
- Bryan Carron, U Alberta; Wade Hong, Carleton U; etc.
- NRC Internal HPC Programme / Multi-Scale Modelling Project
- Roger Impey, etc.
- SPARRO at U Regina
- Edward Brash, etc.
- UQAM
- Omar Cherkaoui, etc.
- Grid Group at U Windsor
- You!
39Grid Canada
- Informational and consultative functions
- giis.gridcanada.ca
- Information service, resources attached to the Grid Canada testbed
- www.gridcanada.ca/ca/
- Grid Canada Certificate Authority
- Trusted member of the EU Grid PMA
- Working with others to establish trust (including U.S. and Asia-Pacific Grid projects)
- myproxy.gridcanada.ca
40Grid Canada: Certificate Authority
- Working with others internationally towards a Grid Policy Management Authority to establish trust between Certificate Authorities
- Formally recognized by the European Data Grid, allowing researchers with Atlas Canada to authenticate with compute and storage resources involved in the EDG project
- Engaging grid-related projects in Canada to meet their needs
- Atlas Canada
- NRC's Grid Infrastructure Project
- WestGrid
41Grid Canada: User Functionality
- Making access to grid functionality user-friendly
- Experience with services like the Certificate Authority has shown the need for alternate ways of delivering functionality such as requesting a certificate
- CANARIE and NRC HPC are implementing a suite of functionality that can be deployed on the Grid Canada web site
- Engaging the Globus and GridLab projects in deploying their software and providing feedback
- Planning a centralized broker for access to distributed resources
42Grid Canada: GridX1
- Making available a set of truly on-demand, always-available, large-scale computational and storage resources
- Established minimum criteria for inclusion in the production grid
- Resources must make available a dedicated queue for production use only
- Compute resources must have at least 10 nodes
- Well-connected
- Have chosen the Atlas Data Challenge, an international project associated with the European Data Grid, as the first target application
- Have identified resources at University of Victoria, University of Alberta and NRC Sussex for use in the production grid
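The minimum criteria for the production grid read naturally as an admission check. The sketch below encodes them as a predicate over a resource description. The slides do not quantify "well-connected", so it is modelled here as a simple yes/no flag; the `Resource` fields and the sample sites are hypothetical and do not describe the actual GridX1 resources.

```python
from dataclasses import dataclass
from typing import List

MIN_NODES = 10  # "at least 10 nodes" from the criteria


@dataclass
class Resource:
    name: str
    nodes: int
    dedicated_production_queue: bool
    well_connected: bool   # assumption: judged externally, recorded as a flag


def eligible(resource: Resource) -> bool:
    """Apply the three stated criteria for joining the production grid."""
    return (
        resource.dedicated_production_queue
        and resource.nodes >= MIN_NODES
        and resource.well_connected
    )


# Hypothetical candidate sites.
candidates: List[Resource] = [
    Resource("site-a", nodes=64, dedicated_production_queue=True, well_connected=True),
    Resource("site-b", nodes=8, dedicated_production_queue=True, well_connected=True),
    Resource("site-c", nodes=32, dedicated_production_queue=False, well_connected=True),
]

admitted = [r.name for r in candidates if eligible(r)]
print(admitted)
```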
43Grid Middleware
- Virtual Data Toolkit (VDT)
- NSF Middleware Initiative (NMI)
- Both are GT2.4-based
- Try out GT4 alpha release from Globus
- Condor-G, MPICH-G2, etc.