Overview and Access to PACI Resources

Transcript and Presenter's Notes
1
Overview and Access to PACI Resources
  • John Towns
  • Division Director, Scientific Computing
  • NCSA and the Alliance
  • jtowns@ncsa.edu

2
PACI Resources at a Glance
  • Alliance Resources
  • NCSA
  • SGI Origin2000
  • HP X-Class (Exemplar)
  • NT Supercluster
  • Boston University
  • SGI Origin2000
  • Univ of New Mexico
  • IBM SP
  • RoadRunner cluster
  • Los Lobos cluster
  • Univ of Kentucky
  • HP N-Class cluster
  • MHPCC
  • IBM SP
  • Univ of Wisconsin
  • Condor flock
  • NPACI Resources
  • SDSC
  • IBM SP
  • Cray T90
  • Cray T3E
  • SUN HPC10000
  • Univ of Texas
  • Cray SV1
  • Cray T3E
  • Univ of Michigan
  • IBM SP
  • Caltech
  • HP X-Class (Exemplar)

3
The Big Machines in PACI
  • NCSA SGI Origin2000 Array
  • 1528 MIPS R10k processors, 618 GB RAM, 4.3 TB
    scratch disk, 680 Gflop/s peak
  • 1 x 56-proc (195MHz), 14GB RAM
  • 1 x 64-proc (195MHz), 16GB RAM
  • 1 x 128-proc (195MHz), 64GB RAM
  • 4 x 128-proc (195MHz), 32GB RAM
  • 3 x 128-proc (250MHz), 64GB RAM
  • 1 x 128-proc (250MHz), 72GB RAM
  • 2 x 256-proc (250MHz), 128GB RAM
  • SDSC IBM SP (Blue Horizon)
  • 1152 IBM Power3/222MHz processors in 144 nodes,
    512 GB RAM, 5.0 TB scratch disk, 1.0 Tflop/s
    peak
  • Each node has
  • 8 Power3 processors
  • 4 GB RAM
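
The headline peak figures can be sanity-checked as processors x clock x
flops retired per cycle. A minimal sketch in Python; the flops-per-cycle
values (4 for both Power3 and PA-8000, each having two fused multiply-add
units) are architectural facts not stated on the slide:

    # Theoretical peak = processors x clock (Hz) x flops per cycle.
    def peak_gflops(procs, clock_mhz, flops_per_cycle):
        return procs * clock_mhz * 1e6 * flops_per_cycle / 1e9

    # SDSC Blue Horizon: 1152 Power3/222MHz, 4 flops/cycle (2 FMAs)
    print(peak_gflops(1152, 222, 4))  # ~1023 Gflop/s, i.e. ~1.0 Tflop/s
    # NCSA HP X-Class: 64 PA-8000/180MHz, 4 flops/cycle
    print(peak_gflops(64, 180, 4))    # ~46 Gflop/s, matching slide 5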

4
PACI Vector Machines
  • NPACI Vector Resources
  • SDSC
  • Cray T90
  • 14 Cray Vector processors
  • 4 GB RAM
  • 24 Gflop/s peak
  • Univ of Texas
  • Cray SV1
  • 16 Cray CMOS Vector processors
  • 16 GB RAM
  • 19.2 Gflop/s peak

5
PACI Shared Memory Systems
  • Alliance SMP systems
  • NCSA
  • HP X-Class (Exemplar)
  • 64 PA-8000/180MHz processors
  • 16 GB RAM
  • 46 Gflop/s peak
  • Univ of Kentucky
  • HP N-Class
  • 96 PA-8500/440MHz processors
  • 12 x 8-proc systems
  • 96 GB RAM total
  • 96 Gflop/s peak
  • Boston University
  • Origin2000
  • 192 MIPS R10000/195MHz processors
  • 36 GB RAM
  • 74.8 Gflop/s peak
  • NPACI SMP systems
  • SDSC
  • SUN HPC10000
  • 64 UltraSPARC II processors
  • 64 GB RAM
  • 51 Gflop/s peak
  • Caltech
  • HP X-Class (Exemplar)
  • 256 PA-8000/180MHz processors
  • 64 GB RAM
  • 185 Gflop/s

6
PACI MPP Resources
  • Alliance MPP systems
  • Univ of New Mexico
  • IBM SP
  • 96 IBM Power2/66MHz processors
  • 6.0 GB RAM
  • 25 Gflop/s peak
  • Maui High Performance Computing Center (MHPCC)
  • IBM SP
  • 32 IBM P2SC/160MHz processors
  • 24 GB RAM
  • 20 Gflop/s peak
  • NPACI MPP systems
  • SDSC
  • Cray T3E
  • 272 DEC Alpha 21164 processors
  • 34 GB RAM
  • 154 Gflop/s peak
  • Univ of Texas
  • Cray T3E
  • 88 DEC Alpha 21164 processors
  • 11 GB RAM
  • 34 Gflop/s peak
  • Univ of Michigan
  • IBM SP
  • 64 IBM Power2/160MHz processors
  • 64 GB RAM
  • 30 Gflop/s peak

7
PACI PC and Workstation Clusters
  • Alliance PC and workstation clusters
  • NCSA
  • NT Supercluster
  • 256 PentiumIII/550MHz processors
  • 32 PentiumII/330MHz processors
  • 72 GB RAM
  • 151 Gflop/s peak
  • University of Wisconsin
  • Condor flock
  • 700 processors of various types
  • 64 MB - 1 GB RAM per processor
  • University of New Mexico
  • RoadRunner Linux cluster
  • 128 PentiumII/450MHz processors
  • 32 GB RAM
  • 56 Gflop/s peak
  • LosLobos Linux cluster
  • 512 PentiumIII/733MHz processors
  • 256 GB RAM
  • 375 Gflop/s peak

8
Getting a Small Allocation
  • Similar processes for Alliance and NPACI
  • Alliance StartUp accounts
  • Up to 10,000 SUs on any Alliance resource
  • Online form with brief project description
  • Submit at any time; approximately 30-day
    turn-around
  • NPACI Expedited accounts
  • Up to 5,000 SUs on most systems
  • Up to 100 SUs on Cray T90
  • Up to 200 SUs on Cray SV1
  • Printable forms available to complete and submit
  • Submit any time
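
As rough guidance for sizing a request, a service unit (SU) is
conventionally on the order of one CPU-hour, though each site defines it
precisely; under that assumption a 10,000-SU StartUp award translates to
wall-clock time as sketched here:

    # Assumes 1 SU ~ 1 CPU-hour (a convention, not a slide-stated fact).
    def wallclock_hours(su_award, processors):
        """Hours a parallel job can run before exhausting the award."""
        return su_award / processors

    for procs in (16, 64, 128):
        print(procs, "procs ->", wallclock_hours(10_000, procs), "hours")
    # 16 -> 625.0, 64 -> 156.25, 128 -> 78.125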

9
Getting Moderate Allocations
  • Similar process for Alliance and NPACI
  • Both are peer-review processes
  • Alliance Allocations Board (AAB)
  • 10,001-100,000 SUs on any Alliance resource
  • Meets quarterly with allocations active on the
    first of January, April, July, and October
  • NPACI Partnership Resource Allocation Committee
    (PRAC)
  • 5,001-50,000 SUs on most NPACI resources
  • 101-2,000 SUs on Cray T90
  • 201-4,000 SUs on Cray SV1
  • Meets twice per year with allocations active on
    the first of January and July

10
Getting Large Allocations
  • Single process for PACI program
  • Jointly reviewed by AAB and PRAC
  • National Resource Allocation Committee (NRAC)
  • >100,000 SUs on any Alliance resource
  • >50,000 SUs on most NPACI resources
  • >2,000 SUs on Cray T90
  • >4,000 SUs on Cray SV1
  • Meets twice per year with allocations active on
    the first of April and October
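
Taken together, slides 8-10 define a simple threshold table: the size of
the request determines the review body. A sketch encoding the Alliance
side, with the cutoffs taken directly from the slides:

    # Alliance allocation tiers: StartUp (online form), AAB (quarterly
    # peer review), NRAC (joint AAB/PRAC review, twice per year).
    def alliance_review_body(sus_requested):
        if sus_requested <= 10_000:
            return "StartUp (online form, ~30-day turn-around)"
        if sus_requested <= 100_000:
            return "Alliance Allocations Board (AAB)"
        return "National Resource Allocation Committee (NRAC)"

    print(alliance_review_body(8_000))    # StartUp
    print(alliance_review_body(250_000))  # NRAC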

11
The Alliance VMR
  • An Evolving, Persistent Alliance Computational
    Grid
  • Connects the PACS and ACRS Sites

Give the User the Perception Of ONE Machine Room
12
What is the Alliance Virtual Machine Room?
  • Infrastructure
  • Supercomputers, networks, visualization
    resources, data archives and databases,
    instruments, etc.
  • Middleware
  • Primarily Globus components
  • Grid services
  • Security infrastructure, grid information
    sources, resource management, job submission and
    control, data management, etc.
  • Portal interfaces and portal services
  • User Portal, Chemical Sciences Portal, etc.
  • Support services
  • Consulting and helpdesk

13
VMR Deployment Areas
  • VMR Operations
  • Establish a distributed operations support team
    for the VMR
  • Establish VMR Policies and Procedures
  • Storage
  • Tie together storage archive resources within the
    VMR while adding new capabilities
  • Account Management
  • Account creation and management
  • Usage reporting for allocated projects
  • Grid Security Infrastructure
  • Deploy PKI/GSI authentication services
  • Interface to local policies and mechanisms
  • Globus Installation and Maintenance
  • Deploy appropriate Globus components
  • User Services
  • Prepare users for new VMR technologies
  • Deploy these technologies and nurture use
  • User Portal
  • User interface to VMR
  • Portal services to be leveraged

14
VMR Operations
  • VMR Resource Monitoring
  • Critical resource information monitored by
    central VMR site management
  • Tools and mechanisms to monitor system resources
  • 24x7 Operations
  • Management policies and procedures
  • Central VMR web site
  • Common Helpdesk
  • Central VMR trouble ticket system
  • Base System Documentation and System Admin
    Support
  • Links to local system documentation for each
    participating site
  • System admin policies and procedures
  • VMR Systems Software Repository
  • Current set of software necessary for a site to
    participate in the VMR

15
Storage
  • Access to established archives
  • Secure FTP server (gsiftp) installed at all sites
  • Common command line interface to local archive or
    LES archive server
  • Will be testing Distributed UniTree disk caches
  • In-house at NCSA; testing stability
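
The "common command line interface" above suggests a thin dispatcher in
front of heterogeneous archives. The sketch below is purely illustrative:
the host names and the gsiftp client invocation are hypothetical
stand-ins, not the actual Alliance tooling.

    # Illustrative dispatcher: send a file to the local archive or to
    # the LES archive server. Host names and the 'gsiftp' client
    # command are invented placeholders.
    import subprocess

    LOCAL_ARCHIVE = "archive.site.example.edu"   # hypothetical
    LES_SERVER = "les.alliance.example.edu"      # hypothetical

    def archive_put(path, use_les=False):
        host = LES_SERVER if use_les else LOCAL_ARCHIVE
        subprocess.run(["gsiftp", host, "put", path], check=True)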

16
Account Management
  • Infrastructure
  • Transaction-based info exchange
  • Maintain local identities
  • Account/Allocation management
  • Alliance Distinguished Name (DN) creation
  • Account creation/removal centrally managed
  • Usage reporting
  • Regular reporting from some sites of usage
    against Alliance allocations

17
Grid Security Infrastructure
  • Certificate Authority (CA)
  • Certificate request, creation, expiration
  • Certificate management
  • GSI deployment
  • Globus, GSI FTPd and sshd
  • Admin guide
  • Client deployment
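
One routine CA duty listed above, tracking certificate expiration,
reduces to reading X.509 validity fields. A minimal modern sketch using
Python's cryptography package (the era-appropriate tools were OpenSSL and
the Globus grid-cert utilities):

    # Report how many days remain before a PEM-encoded cert expires.
    from datetime import datetime
    from cryptography import x509

    def days_until_expiry(pem_path):
        with open(pem_path, "rb") as f:
            cert = x509.load_pem_x509_certificate(f.read())
        return (cert.not_valid_after - datetime.utcnow()).days

    print(days_until_expiry("usercert.pem"))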

18
Globus Installation and Maintenance
  • Globus installation
  • Globus v1.1.2 is current version used in VMR
  • Globus capabilities
  • Submit jobs to remote system
  • MDS information services
  • Initial infrastructure in place
  • Data being published by all sites
  • Hardware resources
  • Hardware status info
  • Queue status info
  • Globus job info
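
Remote submission in Globus 1.x went through the globusrun command and an
RSL (Resource Specification Language) string. A hedged sketch: the
contact string is a placeholder, and the exact flags should be checked
against the v1.1.2 documentation.

    # Submit a job to a remote Globus gatekeeper; '-o' streams output
    # back, '-r' names the resource contact, and the final argument is
    # the RSL job description.
    import subprocess

    def submit(contact, executable, count=1):
        rsl = "&(executable=%s)(count=%d)" % (executable, count)
        subprocess.run(["globusrun", "-o", "-r", contact, rsl], check=True)

    submit("gatekeeper.site.example.edu", "/bin/hostname", count=4)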

19
VMR User Support
  • GSI Documentation
  • Collaboration on GSI user tools
  • Feedback and testing
  • VMR Friendly User list generated

20
Alliance User Portal aka MyGrid
  • A prototype was shown at SC99
  • Recent work has focused on component technologies
  • Applicable to other portal efforts!
  • Also working on component applications
  • Been working with SDSC/NPACI on PACI User Portal
  • User Portal intended to be interface to VMR
    computing environment

21
Component Technologies
  • MyProxy security
  • Credential delegation and authentication for
    actions on behalf of the grid user
  • Focus on authentication
  • File transfer facilities
  • Primary concern is moving (large) data files
    securely
  • Initial Java bean interface using GSSFTP
  • Job submission
  • Mostly app-specific today
  • Initial general framework prototype using
    plug-ins developed
  • Search engine and documentation
  • Bought AltaVista license
  • Currently indexing all HPC documentation at all
    partner sites
  • Using this to re-vamp NCSA HPC documentation
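
The "general framework prototype using plug-ins" for job submission can
be pictured as a small registry mapping resource types to submission
handlers. All names below are invented for illustration, not the actual
portal code:

    # Minimal plug-in registry: handlers register themselves per
    # resource type and the framework dispatches on that key.
    SUBMITTERS = {}

    def register(resource_type):
        def wrap(fn):
            SUBMITTERS[resource_type] = fn
            return fn
        return wrap

    @register("origin2000")
    def submit_origin(job):
        return "batch script for %s on the Origin2000 array" % job["name"]

    @register("condor")
    def submit_condor(job):
        return "condor_submit description for %s" % job["name"]

    def submit(resource_type, job):
        return SUBMITTERS[resource_type](job)

    print(submit("condor", {"name": "param-sweep"}))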

22
Component Applications
  • Systems/job status information
  • What is the state of every machine in the VMR?
  • What is the state of every job in the VMR?
  • Definition of XML DTD formats
  • Currently implemented for Origin2000
  • Mass store status
  • Usage statistics
  • Network link status
  • Currently implemented for NCSA archive
  • Consulting access
  • On-line help with desktop sharing and phone call
  • Trying out WebEx
  • Allocations data access
  • Interface to check allocation status from User
    Portal
  • Direct access to centralized database backend
  • Proposal process
  • Interface to manage proposal process from User
    Portal
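
The XML DTD work above implies status documents along the following
lines; the real DTD is not reproduced on the slide, so every element and
attribute name here is invented for illustration:

    <!-- Hypothetical shape of a VMR machine/job status document -->
    <machine name="o2k.site.example.edu" type="Origin2000" state="up">
      <load procs-total="128" procs-busy="121"/>
      <job id="4711" user="jdoe" procs="64" state="running"/>
    </machine>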

23-27
User Portal Today
  • [Screenshots of the User Portal interface; images not preserved in
    this transcript]
28
Component Applications - Ideas and things to
integrate
  • Usage analysis
  • Use detailed job log info (JPMDB) for analysis of
    usage
  • Standard views
  • Ad Hoc queries
  • Individual user and project
  • Network status information
  • Use information from various probes in network
  • AMP, Surveyor, OCXmon, Network Weather Service,
    etc.
  • Provide info on network link status
  • up/down, current traffic, current latency, etc.
  • Allocation/account management tools
  • Build on database backend and ALPO project
  • Proposal submission, review, and award
  • Add/remove users, check allocation status
  • Provide end of project report
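
The "standard views" and "ad hoc queries" over detailed job logs amount
to ordinary aggregation. A self-contained sketch against an invented
JPMDB-like schema (the real JPMDB layout is not described here):

    # In-memory stand-in for a job log; schema and data are invented.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE jobs (user TEXT, project TEXT, procs INT, hours REAL)")
    db.executemany("INSERT INTO jobs VALUES (?,?,?,?)", [
        ("alice", "astro01", 128, 12.0),
        ("bob",   "astro01",  64,  3.5),
        ("alice", "chem02",  256,  1.0),
    ])

    # Standard view: CPU-hours charged per project.
    for row in db.execute(
            "SELECT project, SUM(procs*hours) FROM jobs GROUP BY project"):
        print(row)

    # Ad hoc query: largest single job per user.
    for row in db.execute("SELECT user, MAX(procs) FROM jobs GROUP BY user"):
        print(row)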

29
More Ideas
  • Access to other information sources
  • Abstract databases
  • News, Calendars, discussion groups,...
  • Web search engines
  • Electronic scientific notebooks
  • Collaboration tools
  • Frameworks for user extensibility
  • Job submission for specific applications
  • Job monitoring hooks