Title: TeraGrid Overview: National Scale Cyberinfrastructure for Science
1 TeraGrid Overview: National Scale
Cyberinfrastructure for Science
Opinions are Cobb's alone
However, many slides stolen from collaborators
However, many slides borrowed from
collaborators
- John W. Cobb
- TeraGrid Project (ORNL Resource Provider PI)
- Oak Ridge National Laboratory
- May 11, 2006
Discussion concerns TeraGrid at large.
2 What is an Elephant?
The Blind Men and the Elephant - Indian
folktale retold in poetic form by John Godfrey Saxe
3 What is the TeraGrid? (And what is CI?)
It was six men of Indostan, To learning much
inclined, Who went to see the elephant, (Though
all of them were blind), That each by
observation Might satisfy his mind.
It's a Grid!
WWW.TERAGRID.ORG
It's a Network!
It's a Common Software Environment!
They are HPC Centers!
And More!
- Viz
- Facilities
- Data collections
It's Apps and Support!
It's Storage!
4 TeraGrid Resources
The first approached the elephant, And happening
to fall Against his broad and sturdy side, At
once began to bawl: "God bless me! But the
elephant Is very like a wall!"
PetaFlops coming soon!
5 TeraGrid Network (10-30 Gbps)
The second, feeling of the tusk, Cried "Ho! What
have we here, So very round and smooth and
sharp? To me 'tis very clear, This wonder of an
elephant Is very like a spear!"
6 TeraGrid Support and Apps
The third approached the animal, And happening to
take The squirming trunk within his hands, Thus
boldly up and spake: "I see," quoth he, "the
elephant Is very like a snake!"
- Common Support Model
- Allocations proposal and review: POPS (PACI legacy)
- Common ticketing systems (repurposed from Centers program)
- Common account creation system
- Common usage reporting
- TeraGrid roaming: ability to run on any TeraGrid resource
- Coordinated security and incident playbook
- Advanced support via ASTA program
- Outreach to new communities via Science Gateways Program
- Outreach to new communities via EOT program
- LORA: Learn Once, Run Anywhere
7 TeraGrid Support and Apps
See Scott at 11:00
8 User Needs Analysis Drives Staff Effort
See Pancake at 10:30
[Chart: Overall Score (depth of need) versus Partners in Need (breadth of need). Categories: Data, Grid Computing, Science Gateways.]
Capabilities plotted: Remote File Read/Write, High-Performance File Transfer, Coupled Applications and Co-scheduling, Grid Portal Toolkits, Grid Workflow Tools, Batch Metascheduling, Global File System, Client-Side Computing Tools, Batch Scheduled Parameter Sweep Tools, Advanced Reservations
- GPFS pilot at SDSC, NCSA, ANL
- TGCP GridFTP in production
- Manual co-scheduler in production; GUR pilot
- Evaluating portal toolkits from SGW partners
- Evaluating workflow toolkits in SGW projects
- Moab pilot; evaluating other approaches (e.g. GRMS); implementing information services to support schedulers
- GPFS pilot
- GridShell pilot
- Manual advanced reservations in production
Results of in-depth discussions with 16 TeraGrid user
teams (August 2004).
A separate, wide-audience TeraGrid user survey was
conducted in 2005.
9 Storage
The fourth reached out an eager hand, And felt
about the knee. "What most this wondrous beast is
like Is mighty plain," quoth he; "'Tis clear enough
the elephant Is very like a tree."
- Globus Toolkit GridFTP deployed at all sites. Transfers files from site to site for stage-in, stage-out, and multi-site runs. Performance varies from site to site. Sustained 22 Gb/s transfer rates demonstrated at SCxy to win Bandwidth Challenges. Hundreds of MB/s are not unrealistic for novice use of the TeraGrid-provided tuned transfer tool (TGCP).
- GPFS WAN service (at SDSC) in production. 500 TB of spinning storage available to the wide area. Seeing 7 GB/s across the machine room, 2-3 GB/s SDSC-NCSA. Demonstrating that, in many cases, access to global file systems is faster than access to local storage.
- Lustre WAN pilot
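The rates quoted on this slide are easier to compare with a little arithmetic. The Python sketch below estimates how long a hypothetical 1 TB dataset would take to move at each rate. The dataset size, the reading of "Gb/s" as gigabits and "MB/s" and "GB/s" as bytes, and the use of 100 MB/s as the novice figure are assumptions made for illustration, not numbers from the talk. Protocol overhead and disk limits are ignored.

```python
# Rough transfer-time estimates for the rates quoted on the Storage slide.
# Assumptions (not from the talk): decimal units, "Gb/s" = gigabits per second,
# "MB/s" and "GB/s" = bytes per second, and a hypothetical 1 TB dataset.

RATES_BYTES_PER_S = {
    "GridFTP bandwidth-challenge rate (22 Gb/s)": 22e9 / 8,
    "GPFS across the machine room (7 GB/s)": 7e9,
    "GPFS WAN SDSC-NCSA, low end (2 GB/s)": 2e9,
    "Novice TGCP transfer (assumed 100 MB/s)": 100e6,
}

DATASET_BYTES = 1e12  # hypothetical 1 TB dataset


def transfer_seconds(size_bytes, rate_bytes_per_s):
    """Return the ideal transfer time in seconds, ignoring all overhead."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_s


def format_duration(seconds):
    """Format a duration in seconds as 'Hh MMm SSs', rounded to the second."""
    minutes, secs = divmod(round(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:d}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    for label, rate in RATES_BYTES_PER_S.items():
        elapsed = transfer_seconds(DATASET_BYTES, rate)
        print(f"{label}: {format_duration(elapsed)}")
```

Under these assumptions, 1 TB takes a little under three hours at 100 MB/s and about six minutes at the 22 Gb/s demonstration rate.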
10 It's a Common Software Environment
The fifth, who chanced to touch the ear, Said
"E'en the blindest man Can tell what this
resembles most Deny the fact who can, This
marvel of an elephant Is very like a fan."
- Common TeraGrid Software and Services (CTSS) installed on all RP resources.
- Users can reasonably expect certain advanced infrastructure services to be present at any TeraGrid resource
- CTSS V2 in production now. (http://www.teragrid.org/userinfo/guide_software_exp.html)
- CTSS V3 in production in Summer of 2006
- CTSS V3 includes Globus Toolkit 4.0
- CTSS evolves to support inter-grid collaboration (VDT for OSG)
11 It's a Grid
The sixth no sooner had begun About the beast to
grope, Than seizing on the swinging tail That
fell within his scope, "I see," quoth he, "the
elephant Is very like a rope."
- Traditional Grid tools deployed
- Support roaming (pick your CPU after you develop your code)
- Pick the best platform from the architecture portfolio for your code (or parts of your code; e.g. SCEC) http://www.teragrid.org/news/news05/terashake.html
- In some instances (algorithms with high latency tolerance) conduct cross-site runs (e.g. NEKTAR)
- Remote resource utilization
- Inter-grid VO operation
12 How Useful is it?
And so these men of Indostan Disputed loud and
long, Each in his own opinion Exceeding stiff and
strong. Though each was partly right, All were in
the wrong.
We are useful CI, not a platypus
- Results of 2005 User Survey: High
13 Facility Integration: Spallation Neutron Source
- Large science facility
- Large cost: $1.4 billion
- Large user base: 2000 users expected at steady state
- Large science impact
- Materials
- Nano-Technology
- Biology
- Earth Science
- Chemistry
- Polymer Science
- Challenge: how to apply CI to leverage science output. Value proposition.
14 SNS Construction Site
15 SNS Construction Complete
16 SNS Construction Complete 2:05
17 SNS Construction Complete 4/28 2:04
18 SNS Construction Complete 3:29
19 SNS Construction Complete 3:49 ?
20 SNS CD4 Results
- DOE Level 1 milestones met
- Protons on target
- Neutron production efficiency
- Facility readiness
- Due June 2006, achieved 4/28/2006
- Project completed on schedule and on budget!
- 4/28 morning full of gremlins. First beam on target shifted from 8:00 AM to 2:06 PM
- Planned shakedown run of 4 days completed in 2 hours
- Champagne celebration at 5:00 PM. It was a good day!
21 Portal Data Browser Showing CD-4 Data
[Screenshot labels: MCA data, ISAW plot, NeXus tags, NeXus files, metadata]
22 SNS Operational Ramp-up
23 SNS Instrument Profile
24 Models for CI development
- Question: how to initiate CI build-out?
- Answers should address:
- How do these arrangements scale?
- Ignition
- Level of investment before reaching self-sustaining stage
- Where to book investment costs
- Possible answers:
- NSF pays all freight (never self-sustaining)
- Pure grassroots (no ignition)
- Federated cost sharing (who are the partners? What is the split?)
- Combinations of the above and others
25 Role of Networking for CI (opinion)
- Use case requirements should drive network deployment plans, but this is often unavoidably inverted.
- Context:
- National glass infrastructure access is now possible in research and education via buying clubs
- Many RON efforts exist. The future may see RONs linked with national-scale integration (for R&E)
- Rich networks will need to show useful applications. (Technology-push scenario, not demand-pull)
- Tipping point: anticipating what is possible and needed before it is all known for sure.
- Next steps (my guess): storage at the ends of the pipes
26 Questions?
cobbjw_at_ornl.gov
http://www.teragrid.org
http://www.sns.gov