University of Virginia HPC Institute - PowerPoint PPT Presentation

Slides: 33
Provided by: martyhu
1
University of Virginia HPC Institute
  • Marty Humphrey
  • Assistant Professor
  • Department of Computer Science
  • University of Virginia
  • Charlottesville, VA USA

2
Grids on .NET
Legacy Grids
Web services Grids
GridFTP.NET
GGF OGSA HPC Profile
GRAM.NET
GGF OGSA Data Profile
(TeraGrid06)
GridFTP Web service
WWF, too!
3
Grids on .NET: Recent Papers
  • W. Zhang, D. Del Vecchio, G. Wasson, and M.
    Humphrey. Flexible and Secure Logging of Grid
    Data Access. 7th IEEE/ACM International
    Conference on Grid Computing (Grid 2006), Sept
    28-29, 2006, Barcelona, Spain.
  • J. Feng, L. Cui, G. Wasson, and M. Humphrey.
    Policy-Directed Data Movement in Grids. 12th
    International Conference on Parallel and
    Distributed Systems (ICPADS 2006), Minneapolis,
    MN, July 12-15, 2006.
  • W. Zhang, D. Del Vecchio, G. Wasson, and M.
    Humphrey. Integrating Legacy Authorization
    Systems into the Grid: A Case Study Leveraging
    AzMan and ADAM. 2006 International Conference on
    Computational Science (ICCS 2006). Reading, UK.
    May 28-31, 2006.
  • J. Feng, L. Cui, G. Wasson, and M. Humphrey.
    Toward Seamless Grid Data Access: Design and
    Implementation of GridFTP on .NET. Proceedings of
    the 2005 Grid Workshop (Associated with
    Supercomputing 2005). Nov 13-14, 2005. Seattle,
    WA.
  • M. Humphrey, G. Wasson, Y. Kiryakov, S-M. Park,
    D. Del Vecchio, N. Beekwilder, and J. Gray.
    Alternative Software Stacks for OGSA-based Grids.
    Proceedings of Supercomputing 2005, Seattle, WA,
    Nov 12-18, 2005.

4
Our CCS Cluster: Scientist mode
  • Modest size: 4 nodes + 1 head node
  • AMD Athlon 64 3700+, 250GB HD, 2GB RAM, Shuttle,
    on-board GigE
  • Newegg.com (today): $3,898.78
  • 13.79 GFlops
  • standalone AD domain

5
OGSA HPC Profile: History
  • GGF14 Chicago Jul 14 2005
  • Minimal Web Services BOF (aka WS-Management)
    Newhouse, Theimer, Humphrey, Tollefsrud
  • GGF15 Boston Oct 6 2005
  • UVa update on WS-Management use for OGSA
  • Specific technical thoughts on the support of
    dual stacks (suspended given rumored
    reconciliation)
  • GGF16 (Athens, Feb 2006) An Evolutionary
    Approach to Realizing the Grid Vision (Theimer,
    Parastatidis, Hey, Humphrey, Fox)
  • Incremental steps: pragmatic, small and
    application-domain-specific functional subsets of
    the overall vision
  • Scope: the use of computational resources on the
    Internet to run programs.

6
OGSA HPC Profile: GGF17 BOF
  • Objective: the profile and protocol
    specifications needed to realize the vertical use
    case of batch job scheduling of
    scientific/technical applications  
  • Evolutionary approach
  • A simple base case will be defined that we expect
    to be universally implemented by all batch job
    scheduling clients and schedulers
  • All additional functionality will be defined in
    terms of optional extensions (which are
    anticipated to be widely applicable)
  • simple living

7
GGF OGSA WG Deliverables
8
Base Case
  • High-throughput compute cluster used only within
    the enterprise
  • User requests
  • Submit a job with specification of resource
    requirements → unique jobID or fault
  • Query a specific job for its current state
  • Cancel a specific job
  • List jobs
  • State diagram: queued, running, finished
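
The four base-case requests and the three-state diagram can be sketched as a toy in-memory scheduler. This is only an illustration of the interface shape, not the profile's actual protocol: the class name, method names, the `cpus` requirement key, and the use of `ValueError` as the "fault" are all assumptions made for the example.

```python
from enum import Enum
from itertools import count


class JobState(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    FINISHED = "finished"


class BaseCaseScheduler:
    """Toy scheduler exposing submit, query, cancel, and list (hypothetical API)."""

    def __init__(self):
        self._ids = count(1)
        self._jobs = {}  # jobID -> (JobState, resource requirements)

    def submit(self, requirements):
        # Returns a unique jobID, or raises ValueError to stand in for a fault.
        if requirements.get("cpus", 0) < 1:
            raise ValueError("fault: at least one CPU must be requested")
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = (JobState.QUEUED, dict(requirements))
        return job_id

    def query(self, job_id):
        return self._jobs[job_id][0]

    def cancel(self, job_id):
        # In this sketch a cancelled job simply moves to the finished state.
        _, requirements = self._jobs[job_id]
        self._jobs[job_id] = (JobState.FINISHED, requirements)

    def list_jobs(self):
        return sorted(self._jobs)

    def start(self, job_id):
        # Scheduler-internal transition queued -> running (not a user request).
        state, requirements = self._jobs[job_id]
        if state is not JobState.QUEUED:
            raise ValueError(f"fault: {job_id} is {state.value}, not queued")
        self._jobs[job_id] = (JobState.RUNNING, requirements)
```

A client would call `submit({"cpus": 2})`, keep the returned jobID, and later pass it to `query` or `cancel`.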

9
Base Case: Out of Scope
  • Data access issues
  • Programs are assumed to be pre-installed
  • Creation and management of user security
    credentials
  • No need for directory services beyond something
    like DNS
  • Management of the system resources

10
Common Cases
  • Purpose of enumerating common cases: use as the
    basis for creating appropriate extension
    mechanisms
  • 13 cases

11
13 Common Cases
  • Exposing existing schedulers' functionality
  • Condor, Globus, LSF, Maui, Microsoft-CCS, PBS,
    SGE, etc.
  • Polling vs. notification
  • notification call-back messages for
    significant changes in the state of a job
  • What are the semantics of message delivery?
  • At-Most-Once and Exactly-Once Submission
    Guarantees
  • The base use case allows the possibility that a
    client can't directly tell whether its job
    submission request has been successfully received
    by a job scheduler or not
  • Types of Data Access
  • non-transparent staging of data between
    independent storage systems.
  • explicitly supports transparent data access
    within a virtual organization or across a
    federated set of organizations
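
One common way to approach the submission-guarantee case is a client-generated token that the scheduler uses to recognize retries. The sketch below assumes that technique; it is not taken from the profile, and `DedupingScheduler`, `submit_with_retry`, and the use of `ConnectionError` for a lost reply are hypothetical names chosen for illustration.

```python
import uuid


class DedupingScheduler:
    """Remembers submission tokens so a resent request never creates a second job."""

    def __init__(self):
        self._by_token = {}  # submission token -> jobID
        self._next = 1

    def submit(self, token, job_spec):
        # A retried request with a known token returns the original jobID.
        if token in self._by_token:
            return self._by_token[token]
        job_id = f"job-{self._next}"
        self._next += 1
        self._by_token[token] = job_id
        return job_id


def submit_with_retry(scheduler, job_spec, attempts=3):
    token = str(uuid.uuid4())  # generated once, reused on every retry
    last_error = None
    for _ in range(attempts):
        try:
            return scheduler.submit(token, job_spec)
        except ConnectionError as err:  # reply lost; resending is safe
            last_error = err
    raise last_error
```

Deduplication alone gives at-most-once behavior; combining it with retries until a reply arrives moves toward exactly-once, provided the scheduler keeps its token table across failures.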

12
13 Common Cases (cont.)
  • Types of Programs to Install/Provision/Run
  • users may have programs that require explicit
    installation of some form.
  • Multiple Organization Use Cases
  • Submission of jobs requires additional security
    support (e.g., foreign credential)
  • Data currently resides outside of the enterprise in
    question
  • Additional sandboxing of non-local users
  • Extended Resource Descriptions
  • allow arbitrary resource types whose semantics
    are not understood by the HPC infrastructure
  • accounting information returned for a job

13
13 Common Cases (cont.)
  • Extended Client/System Administrator Operations
  • users may wish to modify the requirements for a
    job after it has already been submitted
  • Arrays of jobs
  • system administrators: suspension/resumption of
    jobs and migration of jobs among compute nodes
  • Extended Scheduling Policies
  • shortest/smallest-job-first, weighted-fair-share
    scheduling, etc.
  • multiple submission queues, job submission
    quotas, and various forms of SLAs, such as
    guarantees on how quickly a job will be scheduled
    to run.
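
As a small illustration of one of the policies named above, shortest-job-first can be sketched with a priority queue. The function name and the assumption that each job carries a runtime estimate in seconds are choices made for this example only.

```python
import heapq


def shortest_job_first(jobs):
    """Return job IDs in dispatch order.

    jobs: iterable of (job_id, estimated_runtime_seconds) in submission order.
    Ties on runtime are broken by submission order.
    """
    heap = [(runtime, order, job_id)
            for order, (job_id, runtime) in enumerate(jobs)]
    heapq.heapify(heap)
    dispatch_order = []
    while heap:
        _, _, job_id = heapq.heappop(heap)
        dispatch_order.append(job_id)
    return dispatch_order
```

A weighted-fair-share policy would replace the runtime key with a per-user usage measure; the queue structure stays the same.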

14
13 Common Cases (cont.)
  • Parallel Programs and Workflows of Programs
  • instantiate such programs (e.g., MPI) across
    multiple compute nodes in a suitable manner,
    including provision of information that will
    allow the various program instances to find each
    other within the cluster
  • Programs may have execution dependencies on each
    other.
  • Advanced Reservations and Interactive Use Cases
  • reserve resources for use at a specific future
    time
  • communicate in real time with external client
    users
  • Cycle Scavenging
  • batch job scheduler dispatches jobs to machines
    that have dynamically indicated to it that they
    are currently available for running guest jobs.
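
Execution dependencies between programs in a workflow amount to a directed acyclic graph, so a valid run order is a topological sort. A minimal sketch using Python's standard `graphlib` module (Python 3.9 or later) follows; the function name and the example program names are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_order(dependencies):
    """Return one valid run order for a workflow.

    dependencies maps each program name to the set of programs it must wait for.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

For example, `execution_order({"render": {"simulate"}, "simulate": {"preprocess"}, "preprocess": set()})` lists `preprocess` before `simulate` and `simulate` before `render`.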

15
13 Common Cases (cont.)
  • Multiple Schedulers
  • submit work to the whole of the computing
    infrastructure without having to manually select
    which facility to submit to

16
OGSA HPC Profile: Current Issues (led by Marvin
Theimer)
  • BES
  • WSRF-isms
  • Arrays
  • Extensibility of activity state model
  • Separate user vs. management interface?
  • JSDL
  • All CPUs must be the same?
  • Only job requirements or also resource
    capabilities?
  • Remove data staging?
  • Information model: CIM?
  • Use of WS-Management?

17
UVa OGSA HPC Reference Implementation
  • BES
  • CreateActivityFromJSDL
  • QueryActivityStatus
  • CancelActivity
  • QueryScheduler
  • GetActivityJSDLDocument
  • JSDL documents for job descriptions
  • (binary name, working dir,
    args, stdin/stdout/stderr, etc.)
  • (min/max processors, etc.)
  • State model: pending → running → done (also
    unknown)
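
To make the job-description fields above concrete, here is a hedged sketch that builds a JSDL-like document with the Python standard library. The namespace URIs and element names follow a reading of JSDL 1.0 and should be checked against the specification; the output is not schema-validated, the function name and parameters are hypothetical, and this is not the UVa implementation.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as recalled from JSDL 1.0; verify against the spec before use.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(executable, args, working_dir, stdout, min_cpus, max_cpus):
    """Return a JSDL-like job description as an XML string."""
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)

    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")

    # Application section: binary name, args, stdout, working dir.
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in args:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    ET.SubElement(posix, f"{{{POSIX}}}Output").text = stdout
    ET.SubElement(posix, f"{{{POSIX}}}WorkingDirectory").text = working_dir

    # Resources section: min/max processors.
    res = ET.SubElement(desc, f"{{{JSDL}}}Resources")
    cpus = ET.SubElement(res, f"{{{JSDL}}}TotalCPUCount")
    ET.SubElement(cpus, f"{{{JSDL}}}UpperBoundedRange").text = str(max_cpus)
    ET.SubElement(cpus, f"{{{JSDL}}}LowerBoundedRange").text = str(min_cpus)

    return ET.tostring(job, encoding="unicode")
```

A client would pass the resulting string as the payload of a request such as CreateActivityFromJSDL.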

18
OGSA HPC Reference Implementation: UVa Design
Decisions
  • Not clear how to specify more than one task in
    an activity in JSDL
  • JSDL does not handle password, just username
  • Return is EPR AND separate job ID
  • GetActivityJSDLDocument not currently
    implemented (I think)
  • BES management activities not implemented

19
HPC Profile Reference Implementation
wincluster1
Remote Execution Web service
client
BES/JSDL
Job Submission GUI
GridFTPd
GridFTP GUI
29
Demo Recap
  • Server-side
  • Exposed BES-like Web service via WSE 3.0
  • Connected via (undocumented) .NET interface to
    Wincluster scheduler
  • Implemented JSDL-like capabilities
  • Client-side
  • Generated JSDL-like document
  • Consumed BES-like Web service via WSE 3.0
  • Retrieved output via GridFTP GUI

30
FY07 Plans
  • Continue development of OGSA HPC Profile
  • Protocol definition/standardization with Marvin
    Theimer
  • Reference implementation: .NET 3.0 WCF, WF (?)
  • Perhaps Win XP, Linux
  • Web service-based GridFTP
  • .NET 3.0 WCF, WS-Security, etc.
  • GGF OGSA Data profile

31
Collaboration with HPC Institutes
  • Please download and use our software
  • GridFTP.NET, GRAM.NET (short-term), HPC Profile
    (long-term)
  • Tell us (GridFTP) Requirements
  • Security: Authorization? Authentication? Logging?
  • Let's discuss after lunch in the open session!

32
Summary
  • Grids on .NET
  • Legacy Grids: present
  • Web services Grids: future
  • Downloads (http://www.cs.virginia.edu/gsw2c/GridTools)
  • GridFTP.NET, GRAM.NET (now)
  • OGSA HPC Reference Implementation (soon)
  • We're looking for collaboration between the
    institutes