Title: Overview and Access to PACI Resources
Overview and Access to PACI Resources
- John Towns
- Division Director, Scientific Computing
- NCSA and the Alliance
- jtowns@ncsa.edu
PACI Resources at a Glance
- Alliance Resources
- NCSA
- SGI Origin2000
- HP X-Class (Exemplar)
- NT Supercluster
- Boston University
- SGI Origin2000
- Univ of New Mexico
- IBM SP
- RoadRunner cluster
- Los Lobos cluster
- Univ of Kentucky
- HP N-Class cluster
- MHPCC
- IBM SP
- Univ of Wisconsin
- Condor flock
- NPACI Resources
- SDSC
- IBM SP
- Cray T90
- Cray T3E
- SUN HPC10000
- Univ of Texas
- Cray SV1
- Cray T3E
- Univ of Michigan
- IBM SP
- Caltech
- HP X-Class (Exemplar)
The Big Machines in PACI
- NCSA SGI Origin2000 Array
- 1528 MIPS R10k processors, 618 GB RAM, 4.3 TB scratch disk, 680 Gflop/s peak
- 1 x 56-proc (195MHz), 14GB RAM
- 1 x 64-proc (195MHz), 16GB RAM
- 1 x 128-proc (195MHz), 64GB RAM
- 4 x 128-proc (195MHz), 32GB RAM
- 3 x 128-proc (250MHz), 64GB RAM
- 1 x 128-proc (250MHz), 72GB RAM
- 2 x 256-proc (250MHz), 128GB RAM
- SDSC IBM SP (Blue Horizon)
- 1152 IBM Power3/222MHz processors in 144 nodes, 512 GB RAM, 5.0 TB scratch disk, 1.0 Tflop/s peak
- Each node has
- 8 Power3 processors
- 4 GB RAM
PACI Vector Machines
- NPACI Vector Resources
- SDSC
- Cray T90
- 14 Cray Vector processors
- 4 GB RAM
- 24 Gflop/s peak
- Univ of Texas
- Cray SV1
- 16 Cray CMOS Vector processors
- 16 GB RAM
- 19.2 Gflop/s peak
PACI Shared Memory Systems
- Alliance SMP systems
- NCSA
- HP X-Class (Exemplar)
- 64 PA-8000/180MHz processors
- 16 GB RAM
- 46 Gflop/s peak
- Univ of Kentucky
- HP N-Class
- 96 PA-8500/440MHz processors
- 12 x 8-proc systems
- 96 GB RAM total
- 96 Gflop/s peak
- Boston University
- Origin2000
- 192 MIPS R10000/195MHz processors
- 36 GB RAM
- 74.8 Gflop/s peak
- NPACI SMP systems
- SDSC
- SUN HPC10000
- 64 UltraSPARC II processors
- 64 GB RAM
- 51 Gflop/s peak
- Caltech
- HP X-Class (Exemplar)
- 256 PA-8000/180MHz processors
- 64 GB RAM
- 185 Gflop/s peak
PACI MPP Resources
- Alliance MPP systems
- Univ of New Mexico
- IBM SP
- 96 IBM Power2/66MHz processors
- 6.0 GB RAM
- 25 Gflop/s peak
- Maui High Performance Computing Center (MHPCC)
- IBM SP
- 32 IBM P2SC/160MHz processors
- 24 GB RAM
- 20 Gflop/s peak
- NPACI MPP systems
- SDSC
- Cray T3E
- 272 DEC Alpha 21164 processors
- 34 GB RAM
- 154 Gflop/s peak
- Univ of Texas
- Cray T3E
- 88 DEC Alpha 21164 processors
- 11 GB RAM
- 34 Gflop/s peak
- Univ of Michigan
- IBM SP
- 64 IBM Power2/160MHz processors
- 64 GB RAM
- 30 Gflop/s peak
PACI PC and Workstation Clusters
- Alliance PC and workstation clusters
- NCSA
- NT Supercluster
- 256 PentiumIII/550MHz processors
- 32 PentiumII/330MHz processors
- 72 GB RAM
- 151 Gflop/s peak
- University of Wisconsin
- Condor flock (a minimal submit-file sketch follows this list)
- 700 processors of various types
- 64 MB - 1 GB RAM per processor
- University of New Mexico
- RoadRunner Linux cluster
- 128 PentiumII/450MHz processors
- 32 GB RAM
- 56 Gflop/s peak
- Los Lobos Linux cluster
- 512 PentiumIII/733MHz processors
- 256 GB RAM
- 375 Gflop/s peak
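For context on how work reaches a Condor flock: jobs are described in a submit file and handed to the scheduler. A minimal sketch, assuming a hypothetical program and file names (real flocks add requirements and rank expressions):

  # sketch.submit -- hypothetical minimal Condor submit description
  universe   = vanilla     # plain serial job, no checkpointing
  executable = my_sim      # hypothetical user program
  arguments  = -n 1000
  output     = sim.out
  error      = sim.err
  log        = sim.log
  queue 1                  # submit one instance

Submitted with condor_submit sketch.submit and monitored with condor_q; Condor matches the job to any idle machine in the flock that meets its requirements.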
Getting a Small Allocation
- Similar processes for Alliance and NPACI
- Alliance StartUp accounts
- Up to 10,000 SUs on any Alliance resource (see the worked example after this list)
- Online form with brief project description
- Submit at any time; approximately 30-day turn-around
- NPACI Expedited accounts
- Up to 5,000 SUs on most systems
- Up to 100 SUs on Cray T90
- Up to 200 SUs on Cray SV1
- Printable forms available to complete and submit
- Submit any time
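For scale: assuming the common convention that one SU corresponds to roughly one processor-hour (exact charging formulas vary by site and machine), a 10,000-SU StartUp award covers, for example, a 16-processor job running for about 625 hours (16 x 625 = 10,000).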
Getting Moderate Allocations
- Similar process for Alliance and NPACI
- Both are peer-review processes
- Alliance Allocations Board (AAB)
- 10,001-100,000 SUs on any Alliance resource
- Meets quarterly, with allocations active on the first of January, April, July, and October
- NPACI Partnership Resource Allocation Committee (PRAC)
- 5,001-50,000 SUs on most NPACI resources
- 101-2,000 SUs on Cray T90
- 201-4,000 SUs on Cray SV1
- Meets twice per year with allocations active on
the first of January and July
Getting Large Allocations
- Single process for PACI program
- Jointly reviewed by AAB and PRAC
- National Resource Allocation Committee (NRAC)
- >100,000 SUs on any Alliance resource
- >50,000 SUs on most NPACI resources
- >2,000 SUs on Cray T90
- >4,000 SUs on Cray SV1
- Meets twice per year with allocations active on
the first of April and October
The Alliance VMR
- An evolving, persistent Alliance computational grid
- Connects the PACS ACR sites, giving the user the perception of ONE machine room
What is the Alliance Virtual Machine Room?
- Infrastructure
- Supercomputers, networks, visualization resources, data archives and databases, instruments, etc.
- Middleware
- Primarily Globus components
- Grid services
- Security infrastructure, grid information sources, resource management, job submission and control, data management, etc.
- Portal interfaces and portal services
- User Portal, Chemical Sciences Portal, etc.
- Support services
- Consulting and helpdesk
VMR Deployment Areas
- VMR Operations
- Establish a distributed operations support team for the VMR
- Establish VMR Policies and Procedures
- Storage
- Tie together storage archive resources within the VMR while adding new capabilities
- Account Management
- Account creation and management
- Usage reporting for allocated projects
- Grid Security Infrastructure
- Deploy PKI/GSI authentication services
- Interface to local policies and mechanisms
- Globus Installation and Maintenance
- Deploy appropriate Globus components
- User Services
- Prepare users for new VMR technologies
- Deploy these technologies and nurture use
- User Portal
- User interface to VMR
- Portal services to be leveraged
VMR Operations
- VMR Resource Monitoring
- Critical resource information monitored by central VMR site management
- Tools and mechanisms to monitor system resources
- 24x7 Operations
- Management policies and procedures
- Central VMR web site
- Common Helpdesk
- Central VMR trouble ticket system
- Base System Documentation and System Admin Support
- Links to local system documentation for each participating site
- System admin policies and procedures
- VMR Systems Software Repository
- Current set of software necessary for a site to participate in the VMR
Storage
- Access to established archives
- Secure FTP server (gsiftp) installed at all sites (a hedged transfer example follows this list)
- Common command-line interface to local archive or LES archive server
- Will be testing Distributed UniTree disk caches
- In house at NCSA, testing stability
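As a sketch of the command-line model (host and path names are hypothetical; globus-url-copy is the Globus client for gsiftp:// URLs in toolkit releases of this period, though the exact client bundled for the VMR may differ):

  # authenticate once via GSI, then move a file into a site archive
  grid-proxy-init
  globus-url-copy file:///scratch/run01.dat \
      gsiftp://archive.example-site.edu/home/jdoe/run01.dat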
Account Management
- Infrastructure
- Transaction-based info exchange
- Maintain local identities
- Account/Allocation management
- Alliance Distinguished Name (DN) creation
- Account creation/removal centrally managed
- Usage reporting
- Regular reporting from some sites of usage against Alliance allocations
Grid Security Infrastructure
- Certificate Authority (CA)
- Certificate request, creation, expiration
- Certificate management
- GSI deployment
- Globus, GSI FTPd and sshd
- Admin guide
- Client deployment (a command sketch follows this list)
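A sketch of the user-side GSI workflow with the standard Globus tools (the CA signing and local policy steps described on this slide happen around the first command):

  # generate a key pair and a certificate request to send to the CA
  grid-cert-request
  # after the signed certificate is installed, create a short-lived proxy
  grid-proxy-init
  # inspect the proxy: subject DN, issuer, time remaining
  grid-proxy-info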
Globus Installation and Maintenance
- Globus installation
- Globus v1.1.2 is the current version used in the VMR
- Globus capabilities
- Submit jobs to remote systems (a hedged example follows this list)
- MDS information services
- Initial infrastructure in place
- Data being published by all sites
- Hardware resources
- Hardware status info
- Queue status info
- Globus job info
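Two hedged examples of these capabilities (the resource contact string and MDS host are illustrative; the MDS port and base DN depend on the deployment):

  # submit a 4-way job to a remote VMR machine via GRAM, streaming output back
  globusrun -o -r modi4.ncsa.uiuc.edu/jobmanager \
      '&(executable=/bin/hostname)(count=4)'

  # MDS is an LDAP directory, so published resource data can be
  # queried with a generic LDAP client
  ldapsearch -h mds.example-site.edu -p 2135 -b "o=Grid" "(objectclass=*)"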
VMR User Support
- GSI Documentation
- Collaboration on GSI user tools
- Feedback and testing
- VMR Friendly User list generated
Alliance User Portal, aka MyGrid
- A prototype was shown at SC99
- Recent work has focused on component technologies
- Applicable to other portal efforts!
- Also working on component applications
- Working with SDSC/NPACI on a PACI User Portal
- User Portal intended to be the interface to the VMR computing environment
Component Technologies
- MyProxy security
- Credential delegation and authentication for actions on behalf of the grid user (a hedged usage sketch follows this list)
- Focus on authentication
- File transfer facilities
- Primary concern is moving (large) data files securely
- Initial Java bean interface using GSSFTP
- Job submission
- Mostly app-specific today
- Initial general framework prototype using plug-ins developed
- Search engine and documentation
- Bought AltaVista license
- Currently indexing all HPC documentation at all partner sites
- Using this to re-vamp NCSA HPC documentation
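A sketch of the MyProxy flow, assuming a hypothetical server name (flag spellings vary slightly across MyProxy versions):

  # user, from a trusted machine: deposit a delegated credential
  myproxy-init -s myproxy.example-site.edu -l jdoe
  # portal, later, acting on the user's behalf: retrieve a short-lived proxy
  myproxy-get-delegation -s myproxy.example-site.edu -l jdoe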
Component Applications
- Systems/job status information
- What is the state of every machine in the VMR?
- What is the state of every job in the VMR?
- Definition of XML DTD formats (a hypothetical sketch follows this list)
- Currently implemented for Origin2000
- Mass store status
- Usage statistics
- Network link status
- Currently implemented for NCSA archive
- Consulting access
- On-line help with desktop sharing and phone call
- Trying out WebEx
- Allocations data access
- Interface to check allocation status from User Portal
- Direct access to centralized database backend
- Proposal process
- Interface to manage proposal process from User Portal
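The slides do not reproduce the DTDs themselves; purely as a hypothetical sketch of the kind of status document such a DTD would describe:

  <!-- hypothetical, not the actual VMR DTD -->
  <!ELEMENT machineStatus (name, site, state, jobsRunning, jobsQueued)>
  <!ELEMENT name (#PCDATA)> <!ELEMENT site (#PCDATA)>
  <!ELEMENT state (#PCDATA)> <!-- e.g. up | down | degraded -->
  <!ELEMENT jobsRunning (#PCDATA)> <!ELEMENT jobsQueued (#PCDATA)>

  <machineStatus>
    <name>Origin2000 Array</name> <site>NCSA</site> <state>up</state>
    <jobsRunning>42</jobsRunning> <jobsQueued>17</jobsQueued>
  </machineStatus>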
User Portal Today
- [Screenshot slides showing the current User Portal interface]
Component Applications - Ideas and things to integrate
- Usage analysis
- Use detailed job log info (JPMDB) for analysis of usage
- Standard views
- Ad hoc queries (a hypothetical query sketch follows this list)
- Individual user and project
- Network status information
- Use information from various probes in the network
- AMP, Surveyor, OCXmon, Network Weather Service, etc.
- Provide info on network link status
- up/down, current traffic, current latency, etc.
- Allocation/account management tools
- Build on database backend and the ALPO project
- Proposal submission, review, and award
- Add/remove users, check allocation status
- Provide end-of-project report
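For flavor, an ad hoc query of the kind envisioned here might look like the following (the table and column names are invented; the actual JPMDB schema is not described on these slides):

  -- SUs charged per project for one user, over a date range
  SELECT project, SUM(cpu_hours) AS sus_charged
  FROM   job_log
  WHERE  user_name  = 'jdoe'
    AND  start_time >= '2000-01-01'
  GROUP  BY project;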
More Ideas
- Access to other information sources
- Abstract databases
- News, calendars, discussion groups, ...
- Web search engines
- Electronic scientific notebooks
- Collaboration tools
- Frameworks for user extensibility
- Job submission for specific applications
- Job monitoring hooks