Title: Scheduling
1Scheduling and Resource Management: GGF3 Area
- Jennifer Schopf and Bill Nitzberg, co-directors
- www.cs.nwu.edu/jms/sched-wg/
- sched-wg_at_gridforum.org
- GGF 3 Meeting
- Frascati, Italy
- 7-10 Oct 2001
2Meeting Summary: Scheduling and Resource Management, GGF3 Area
3Active Groups
- Ready for review
- 10 Actions for Superscheduling: in review
- Advance Reservation API: in review
- Scheduler Attributes WG (13 people)
- Standardizing "Run my job"
- BOF: Scheduling Command Line Interface (15 people)
- BOF: DRM Application API (20 people)
- WG: Grid Resource Management Protocol (31 people)
- Others
- WG: Scheduling Dictionary (no meeting)
- BOF: Scheduling Optimization (23 people)
4WG Scheduler Attributes (13 people)
- Presented revised document
- Minor revisions suggested
- Suggestion to look at Data Grid attributes (perhaps as next step)
- Document is ready to enter GGF publication process
5Grid Resource Management Protocol WG (31 people)
- Presented revised document (SchedWD 12.1) focused on requirements
- Good feedback and discussion on the overall space of issues
- Next steps
- Better grouping: capabilities vs. requirements of the protocol
- Include both requirements and non-requirements
- Identify "low hanging fruit" capabilities and draft protocols for them
- Perhaps a name change (to something more specific), as the protocol is not intended for administrative management of resource managers (e.g., defining policy)
6BOF Distributed Resource Management Application API (20 people)
- Proposed charter
- Standard API for "run my job"
- Base API on existing work from existing DRM systems (Intel, Sun, Veridian, Globus, ...)
- Timeframe: 9 months
- Discussion and comments were positive
- Good focus if it's quick, simple, and based on existing systems
- Should be easily extensible for web services (portals)
- Next steps
- Finalize charter via sched-wg email list
- Work via email and telecons to develop 1st draft
7BOF Scheduling Command Line Interface (15 people)
- Presented qprep document
- Standard command line for "Run my job"
- Discussion
- User focused
- Compared with POSIX and Globus approaches
- Next steps
- Finalize charter via sched-wg email
- Expand involvement
8BOF Scheduling Optimization (23 people)
- Proposal for a Research Group
- Enable "better" "use of" resources
- Issues
- Metrics/constraints for "better"
- Algorithms
- Information gathering
- Long-term goal: better optimization for real schedulers
- Next steps
- Discuss charter via sched-wg email
9Scheduling Area: Objectives and Progress
10High-level Overview: Solve Grid Resource Management
- Who? -- Developers
- What? -- Agreements / standards
- Capabilities, general protocols, APIs
- Why? -- Interoperability
- Reserving, allocating, using resources
- Managing resources (owner's point of view)
- Support co-scheduling diverse resources
- Enable "better" "use of" resources
11Charter
- Look at what is done today, gather requirements
- ...refining protocols, interactions, etc.
- ...work to standardize APIs and protocols
12Process
- Review working document (< 5 minutes)
- Focus on understanding the document (revisions) rather than correcting it
- Gather discussion items during/after review
- Prioritize and discuss each item
- Next steps (last 5 minutes)
- Revised documents posted to web site within 2 weeks.
13GGF-3 Agenda
- Monday
- 12:00-1:30 Grid Resource Management WG
- Tuesday
- 10:00-11:00 Scheduling Command Line API BOF
- 11:30-13:30 Scheduling Optimization BOF
- 14:30-15:30 Scheduling Attributes WG
- 16:00-17:30 Distributed Resource Management Application API BOF
- Not being presented here
- Scheduling Dictionary WG
14Grid Resource Management Working Group (GRM)
- www.cs.northwestern.edu/jms/sched-wg/grm-wg.html
- sched-wg_at_gridforum.org
- Interim Chair: Bill Nitzberg, bill_at_computer.org
- Grid Resource Management Protocol Requirements
- Authors: Karl Czajkowski, Volker Sander
- SchedWD 12.1
15History -- Grid RM Protocol Requirements
- Child of Advance Reservation API work
- Charter expanded from Advance Reservation to Resource Management
- GGF-2 consensus: you basically have to do most of a general RM protocol to do Advance Reservations anyway
- Charter focused to start with Requirements
16Why Protocol? -- Interoperability
- Command Line standards
- API (Application Programmer Interface)
- Protocol
- Language/Syntax
17A Protocol can have Multiple APIs, e.g., TCP/IP
- TCP/IP APIs include BSD sockets, Winsock, System V streams, ...
- The protocol provides interoperability: programs using different APIs can exchange information
- I don't need to know the remote user's API
[Diagram: two applications, one using the WinSock API and one using the Berkeley Sockets API, communicate over the TCP/IP protocol (reliable byte streams).]
Slide courtesy of Globus Tutorial -- www.globus.org
18An API can have Multiple Protocols, e.g., Message Passing Interface
- MPI provides portability: any correct program compiles and runs on a platform
- Does not provide interoperability: all processes must link against same SDK
- E.g., MPICH and LAM versions of MPI
Slide courtesy of Globus Tutorial -- www.globus.org
19Grid RM: Perhaps a Document Series
- Grid RM Protocol Requirements
- Grid RM Protocol Operations
- Capabilities and semantics
- Leverage Other Grid Services Standards
- Transport services
- Security
- Language
- Grid RM Protocol Bindings
20Requirements: SchedWD 12.1
- Control
- Extensibility
- Notification
- Reliability
- Protocol timers
- Negotiation
- Hierarchy
- Secure messaging
- Security language
- Resource language
21Requirement 1: Control
- Remote resources (different administrative domains)
- Coordination
- Wording may imply accounting for resource consumption; check on this. Are we considering this? How does the accounting piece interface here?
- Consumption: reserving and utilizing, which is meant here?
- Consumption: negotiation of resources vs. negotiation of capabilities
22Charter question
- Clarify that this is a "client's running job" focus, not a management interface; this is resource access and assignment
- "Run my job" from the resource consumer's perspective
23Requirement 2: Extensibility
- Bill says "tastes great, less filling" (of course we want it extensible)
- There will be issues in charging and accounting that will be difficult to address
- For example, a switch in accounting methodology from charge per CPU hour to congestion-based charging
- So we might want to be clear if this is not included in what we mean by extensibility
24Requirement 3: Notification
- Asynchronous as well as synchronous
25Requirement 4: Reliability
- What do we mean by "reliable semantics"?
- One possible meaning: the default does what you think it should
- Another is that at every point where the protocol can fail, the failure is recognized and taken care of in the proper way (i.e., you don't submit a job twice, or not at all)
- "Reliable semantics" is a confusing term; "reliability semantics" might be the better way to say this (as a first guess)
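One way to read "you don't submit a job twice, or not at all" is as idempotent submission: the client picks a request ID once and reuses it on every retry, so a lost reply cannot create a second job. The sketch below is only an illustration of that reading; `JobService`, `submit_reliably`, and the `fail_first` knob for simulating lost replies are hypothetical names, not part of SchedWD 12.1.

```python
import uuid


class JobService:
    """Toy resource manager that deduplicates submissions by request ID."""

    def __init__(self):
        self._jobs_by_request = {}  # request_id -> job_id
        self._next_job = 1

    def submit(self, request_id, script):
        # A retried request with a known ID returns the original job
        # instead of creating a second one.
        if request_id in self._jobs_by_request:
            return self._jobs_by_request[request_id]
        job_id = f"job-{self._next_job}"
        self._next_job += 1
        self._jobs_by_request[request_id] = job_id
        return job_id


def submit_reliably(service, script, attempts=3, fail_first=0):
    """Retry with one request ID so a lost reply cannot duplicate the job."""
    request_id = str(uuid.uuid4())
    for attempt in range(attempts):
        job_id = service.submit(request_id, script)
        if attempt < fail_first:
            continue  # simulate a reply lost in transit; the client retries
        return job_id
    # The client must report that the outcome is unknown rather than guess.
    raise RuntimeError("submission outcome unknown after retries")
```

If every attempt fails, the client raises instead of silently assuming success or failure, which is the "or not at all" half of the requirement.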
26Requirement 5: Protocol timers
- Low level enough that Bill wants to skip over it :)
27Requirement 6: Negotiation
28Requirement 7: Hierarchy
- Simple version: if you end up stacking RMs on top of each other, it would be nice if they all talked the same protocol
29Requirement 8: Secure messaging
- How much of this should be a requirement as opposed to a suggestion?
- (Keep-alives in particular)
- What about data protection?
- Integrity: can't forge it
- Protection (confidentiality): can't see it
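The integrity half of this distinction ("can't forge it") can be illustrated with a keyed message authentication code. This is a minimal sketch using Python's standard `hmac` module with a shared secret; the choice of HMAC-SHA256 and the function names `sign` and `verify` are assumptions made for illustration, not something the requirements document prescribes. Note that the message itself stays readable, so this gives integrity without confidentiality.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Return an authentication tag for the message under a shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; False means forged or altered."""
    return hmac.compare_digest(sign(key, message), tag)
```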
30Requirement 9: Security language
- Requirement vs. "supports"?
- Need a broker to be able to delegate
- What about binding of authorization to a collection of resources?
- Maybe a single capability that is a collection of services
- This may already be under discussion in the Security Area
- This binding needs to be determined
31Requirement 10: Resource language
- What do we mean by "structured language"?
- This needs to be clarified
- It should not imply a strict hierarchy, for example
32Other topics in the document
- Monitoring and notification
- Reliable protocol semantics
- Soft-state/keep-alive support
- Generalizing resource types
- Generalizing acquisition modes
- Service guarantees
- Access schedule
- Delayed commitment
- Dynamic binding
33Other topics in the document (2)
- Embedded resource language
- Communication model
34- Authors have been asked to restructure doc
- Qualities of the protocol
- Capabilities that the protocol should support
- Stuff we should work on right away, stuff that
can wait a bit.
35Delayed commit - 4.3, page 5
- Why is this a requirement?
- This has come up in past meetings
- Saying the protocol has to be confirmed may be a simplifying assumption; this should be clarified
- Two examples
- Fancy dinner reservation, which is confirm or lose it
- Airline: you can cancel it, but it is confirmed automatically
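The two examples can be modelled as one reservation object with an optional confirmation deadline. The sketch below is a hypothetical illustration (the class, its fields, and the numeric clock are all assumptions): with a deadline it behaves like the dinner reservation, and without one it behaves like the airline booking.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reservation:
    resource: str
    confirm_by: Optional[float] = None  # None: confirmed automatically
    confirmed: bool = False
    cancelled: bool = False

    def __post_init__(self):
        if self.confirm_by is None:
            self.confirmed = True  # "airline" style: no explicit commit step

    def confirm(self, now: float) -> bool:
        """"Dinner" style: confirm before the deadline or lose the request."""
        if self.cancelled or self.confirmed:
            return self.confirmed
        if now <= self.confirm_by:
            self.confirmed = True
        else:
            self.cancelled = True
        return self.confirmed

    def cancel(self) -> None:
        """Either style can be revoked explicitly by the client."""
        self.confirmed = False
        self.cancelled = True

    def is_held(self, now: float) -> bool:
        """True while the resource is still set aside for this request."""
        if self.cancelled:
            return False
        if self.confirmed:
            return True
        return now <= self.confirm_by
```

Making the deadline optional keeps both behaviours in one protocol message, which is one possible answer to whether confirmation must be a hard requirement.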
36Delayed commit - 2
- How does accounting tie into delayed or revocable requests?
- Rephrased: how much do we need to worry about accounting issues in this protocol?
- We probably need to identify when in the protocol you can cancel a request
- Keep in mind that with multi-resource requests, partial acquisition is likely
37- With language, a multi-request may be nested inside, and many of these issues can be hidden
38Quality of info discussion
- It will become important to allow the collection of metadata about requests
- For example, how often a request is granted (monitoring info)
- Quality info: e.g., on-time airline example, 5-star system for hotels
- Express this quality in the language?
39Next Steps
- Another draft, additional authors(?)
- Sched-wg_at_gridforum.org for comments
- New version in 2 weeks, telecon shortly thereafter (we hope)
- 1) Which of these are required vs. suggested? Minimal set?
- 2) What about use cases or examples?
40Scheduling command line API (SCLA) BOF
- Joe Werne
- Proposed new working group
41Scheduling Command Line API (SCLA) BOF
- Proposed New Working Group (SRM Area)
- Led by J. Werne and J. Schopf
- Standardizing a command line interface to
schedulers
42U.S. DoD Uniform Command-Line Interfaces for Job Submission and Data Archiving: http://www.pstoolkit.org
- Joe Werne, Michael Gourlay, Chris Meyer, Chris Bizon
- Colorado Research Associates Division, NorthWest Research Associates, Inc.
- Aram Kevorkian (Chair, Metacomputing Working Group), Bill Asbury, Winfried Bernhard, Anthony DelSorbo, Mark Dotson, Joseph Robichaux (ASC), Virginia Bedford (ARSC), Bradford Blasing (AHPCRC), Dan Duffy, Rebecca Fahey, Jeff Hensley, David Sanders (ERDC), Mitch Murphy (MHPCC), John Skinner, Ray Sheppard (NAVO), Steve Thompson (ARL)
- DoD High Performance Computing Modernization Office (HPCMO)
- and the users who have provided valuable feedback
43DoD HPCMP Challenge Program
(an incubator for grid-related problem solving)
- Multi-platform and multi-center compatibility
- Run-time data transfer and migration off-line
- Automated and Semi-automated error recovery
- Automated job preparation and submission
- Remote post-processing and visualization
- Distance collaboration
44The Problem
Site A
Code A
Site B
Code B
Code C
Site C
45The Solution
[Figure: archive sits between your code and Sites A, B, and C.]
archive is a routine that translates from your code to the native syntax and semantics, because archive figures that you have better things to do with your time, like science!
46DoD HPCMO Metacomputing Working Group Initiative: Goals
- Provide translation tools to all DoD users in the form of a uniform command-line interface.
- Write nimble translation tools that are modular, maintainable, and reusable.
- Shepherd implementation on all major U.S. supercomputer centers, not just the DoD centers.
47DoD HPCMO Metacomputing Working Group Initiative: Methods 1. Open Source
- Enhance uniformity
- Permit user implementation and modification
- Maximize community support
- Avoid duplicated effort
48DoD HPCMO Metacomputing Working Group Initiative: Methods 2. Perl
- Advanced text-processing capability
- Ample support from open-source community
- POD: man pages, HTML, LaTeX, FrameMaker
- www.perldoc.org
49Tools for Uniform SuperComputing (TUSC): qprep
qprep simplifies and unifies job submission to queues.
qprep is not a replacement for existing queuing systems (e.g., PBS, NQS); rather, qprep is a translator between them.
50Tools for Uniform SuperComputing (TUSC): qprep
qprep will work in one of two ways, both of which are intended to be familiar to users:
1. Command-line arguments
    qprep nodes=512 walltime=2400 script
2. Pseudo-comment directives in script preamble
    #PSTQ nodes=512
    #PSTQ walltime=2400
qprep will edit the script, translating the preamble to the native queuing system. To the extent possible, our plan is to implement qprep via a translation table.
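The translation-table idea can be sketched in a few lines. This is not qprep itself (which the slides describe as Perl); it is a hypothetical Python illustration that assumes directives are written as `name=value` on preamble lines beginning with `#PSTQ` (the punctuation is an assumption, since the slide text does not preserve it). The table maps each uniform directive to a PBS-style directive line; the `#PBS -l`, `-q`, and `-N` options are standard PBS ones, but the table and the `translate` function are invented for this example.

```python
# Hypothetical translation table: uniform directive -> PBS-style line.
PBS_TABLE = {
    "nodes": "#PBS -l nodes={value}",
    "walltime": "#PBS -l walltime={value}",
    "queue": "#PBS -q {value}",
    "jobname": "#PBS -N {value}",
}


def translate(script_text, table=PBS_TABLE, marker="#PSTQ"):
    """Rewrite marker lines via the table; pass all other lines through."""
    out = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith(marker):
            out.append(line)
            continue
        directive = stripped[len(marker):].strip()
        name, _, value = directive.partition("=")
        template = table.get(name)
        if template is None:
            # Fail loudly rather than silently dropping a user's request.
            raise KeyError(f"no translation for directive {name!r}")
        out.append(template.format(value=value))
    return "\n".join(out)
```

Supporting another queuing system would then mean supplying a different table rather than new code, which is the appeal of the table-driven approach.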
51Tools for Uniform SuperComputing (TUSC): qprep
qprep prototype code is currently running on T3E, O2K, O3K (+ Compaq) systems at six supercomputer centers: ERDC, NAVO, ASC, AHPCRC, PSC, SDSC.
The qprep specification for the more general routine (which currently does not exist) has grown from:
1. User feedback
2. Review of NQS, PBS, LSF, GRD functionality
The qprep specification contains 39 directives for: job environment (4), reporting (5), job control (3), qprep control (4), limit (11), job dependence (11), pass through (1)
52Tools for Uniform SuperComputing (TUSC): qprep
JOB ENVIRONMENT DIRECTIVES
    account=account_string
    username=userlist            user_at_host,user_at_host,...
    exportvars=y                 export shell variables to script
    shell=shell_name
REPORTING DIRECTIVES
    stderr=file
    stdout=file
    keep=stderr,stdout
    mail=a,b,e,r,t               abort, begin, end, rerun, routed
    mailto=userlist
53Tools for Uniform SuperComputing (TUSC): qprep
JOB CONTROL DIRECTIVES
    queue=name
    jobname=name
    checkpoint=(n, s, or number) n: never; s: at queue shutdown; number: every (number) minutes
qprep CONTROL DIRECTIVES
    silent=y                     do not write job identifier when submitting to queue
    outscript=outfile            by default, outfile is filename.pst
    submit=y
    erase=y                      if submit=y, erase translated script after submission
54Tools for Uniform SuperComputing (TUSC): qprep
LIMIT DIRECTIVES
    nodes=number
    cpuspernode=number
    walltime=time                maximum walltime for script
    processcputime=time          maximum CPU time for any single process in script
    totalcputime=time            total CPU time used by all single processes in script
    procfilesize=size            maximum total size of files for single process
    totalfilesize=size           maximum total size of all files for all processes
    tapemt(abcdefgh)=number      maximum number of tape drives in the device class
    nice=number                  nice level for script when resources are shared
    processmemory=number         maximum memory used by a single process (shared)
    totalmemory=number           total memory used by all processes in script (shared)
55Tools for Uniform SuperComputing (TUSC): qprep
JOB-DEPENDENCE DIRECTIVES
    synccount=number             synchronize (number) jobs executed by 1st job
    syncwith=number              synchronize with job in which synccount is set
    after=jobid,jobid,...        run after specified jobs have begun
    afterok=jobid,jobid,...      run after specified jobs have ended without errors
    afternotok=jobid,jobid,...   run after specified jobs have ended with errors
    afterany=jobid,jobid,...     run after specified jobs have ended, with or without errors
    dependson=number             run after (number) before dependencies satisfied
    before=jobid,jobid,...       permit specified jobs after current job begins
    beforeok=jobid,jobid,...     permit specified jobs after current ends without errors
    beforenotok=jobid,jobid,...  permit specified jobs after current ends with errors
    beforeany=jobid,jobid,...    permit specified jobs after current ends, with or without errors
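The four "after" directives differ only in which state the named jobs must have reached. The sketch below is a hypothetical evaluator for that rule, not qprep code: the function name, the dictionary layout, and the four job states ("queued", "running", "ok", "failed") are assumptions chosen for illustration.

```python
def may_start(deps, states):
    """Return True if every dependency of a job is satisfied.

    deps:   {"after": [...], "afterok": [...], "afternotok": [...], "afterany": [...]}
    states: jobid -> "queued", "running", "ok", or "failed"
    """
    started = {"running", "ok", "failed"}
    ended = {"ok", "failed"}
    checks = {
        "after": lambda s: s in started,       # named jobs have begun
        "afterok": lambda s: s == "ok",        # ended without errors
        "afternotok": lambda s: s == "failed", # ended with errors
        "afterany": lambda s: s in ended,      # ended, either way
    }
    for kind, jobids in deps.items():
        check = checks[kind]
        # Unknown job IDs are treated as still queued.
        if not all(check(states.get(j, "queued")) for j in jobids):
            return False
    return True
```

The "before" directives express the same relations from the other job's side, so a scheduler could normalize them into "after" entries on the dependent jobs.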
56Tools for Uniform SuperComputing (TUSC): qprep
EXTRA DIRECTIVES
    passthrough=arguments        pass quote-delimited commands directly to underlying queuing system
58Tools for Uniform SuperComputing (TUSC): qprep
qprep implementation will follow along lines of archive
- qprep will be written in two layers
- qprep.pm will be a Perl module containing subroutines that are site-independent.
- local.pm will be a Perl module which contains subroutines that depend on the local system.
- local.pm will intentionally contain as little code as possible and ample comment statements to facilitate implementation at a new site.
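59Example local.pm (from archive routine)
    # Copy a file into the archive path; return command output and exit status.
    sub local_put {
        my ($host, $path, $file) = @_;
        my ($line, $status);
        $line = `/bin/cp -f $file $path 2>&1`;
        $status = $?;
        return ($line, $status);
    }
    # Report whether the archive directory exists.
    sub directory_exists {
        my ($host, $path) = @_;
        my ($direxists);
        if ( -d $path ) { $direxists = "true"; } else { $direxists = "false"; }
        return ($direxists);
    }
    # Return the size in bytes of an archived file.
    sub get_file_size {
        my ($host, $path, $file) = @_;
        my ($size);
        $size = -s "$path$file";
        return ($size);
    }
    # Placeholder: a site with a real archival system would migrate the file here.
    sub local_migrate {
        my ($host, $path, $file) = @_;
        print "local_migrate COMMENT Doing nothing.\n";
        print "If this were a real archival system, this would migrate your file.\n";
        return (0);
    }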
60Learn more at www.pstoolkit.org
62PST Schedule
Over the next 12 months, tools will be implemented, released, and refined based on user input, and the number of centers that support PST will increase, aided by the open source philosophy.
- May 18, 2001: Initial release of archive man page, establish www.pstoolkit.org and email groups
- June 18, 2001: Advertise PST at DoD HPC UGC
- July 15, 2001: Initial release of qprep man page.
- October, 2001: Implement archive on ERDC, NAVO, ASC, ARL, AHPCRC, MHPCC, SDSC, PSC
- December, 2001: qprep version 1.0 release.
- May, 2002: Initial release of MD and SET layers.
- October, 2002: Full release of TUSC, MD, and SET layers.
- ACT and PEP layer codes will be released as contributed by users.
63Notes and Next Steps
- What about Globusrun?
- Doesn't allow users to define own
- Globus may not address the attributes issues addressed here
- Not all of these sites will be running a Globus gatekeeper, so using the Globus code is probably the wrong way to go about this
64What about the POSIX standards?
- Standard doesn't go far enough (possibly)
- Written before SMP nodes existed, and therefore ambiguous
- What are the 5-6 things needed that aren't in the POSIX standard? Agree on those, and these could get added in.
- However, LSF does not follow the POSIX standard
65People to coordinate with
- Fabrizio Pacini (Datamat), fpacini_at_datamat.it
- Mike Russell (U of Chicago, Cactus), russell_at_cs.uchicago.edu
- Jenny Schopf (for a better Globus contact), jms_at_mcs.anl.gov
66Next Step
- Write charter, determine if there are people to
help
67Scheduling Optimization BOF
- Vincenzo DiMartino and Marco Mililotti
- Proposed New Research Group
68BOF Scheduling Optimization
- Proposed Research Group on scheduling optimization techniques
- SRM-OPT: Scheduling and Resource Management Area
- (area chairs: Bill Nitzberg, Jennifer Schopf)
- Discussion coordinators: Vincenzo Di Martino, Marco Mililotti
69BOF Scheduling Optimization
- BOF goal
- To gather interest and requirements and ...
- to start a Research Group on scheduling optimization
- Open to anyone in this room
- Please specify your level of future involvement in the R.G. in the BOF participant list
70Tentative program (subject to change on the fly)
- 15 min: discussion on the meaning of scheduling optimization
- 30 min: discussion on technicalities to obtain optimal scheduling
- 15 min: discussion on interaction with the other WGs and RGs and cross-fertilization
- 30 min: discussion on Research Group milestones and organization: active members and interested members
- 30 min: GGAS experience presentation
71What is the meaning of scheduling optimization? (15 min)
- The art of running as many jobs as possible with the minimum usage of resources
- How to avoid resource request conflicts between jobs
- How to maximize total Grid computing throughput without penalizing the single job/user
- To negotiate different cost/performance to predictable users
- To keep the geographical area network load at a minimum
72Scheduling: new techniques and practice
Evolutionary Algorithms
- Genetic Algorithms
- Genetic Programming
- Evolutionary programming
Tabu search
- Reactive tabu search
Swarm Intelligence
- Agent-based systems
- Particle swarm optimization
- Ant colony
73Tightly bound WGs and RGs
All the SRM WGs and RGs
- Performance WG
- Application RG
- Architecture WG
74Milestones and Research Group activity plan
- October 2001 >> E-mail distribution list and www site
- November 2001 >> Research group draft document
- December 2001 >> Prototype software for R.G. repository
- Two-month R.G. progress revision process
- GGF meeting and activity synchronization
75Notes/suggestions
- Common vocabulary
- What are constraints
- What are the metrics
- High level optimization topics
- What are the constraints? How do we do smart
choices? What do users want?
76Output of an RG/WG: papers - what papers will be discussed here?
- Possibilities
- There are many different approaches to scheduling - taxonomy
- What are the constraints that are possible for optimization?
- How should constraints be described?
- How to capture/suggest flexibility to a current scheduling system
77Layered approach
- First goal - identify other groups
- Paper: list of capabilities that are required to do good optimization, what constraints to be used - dynamic, etc.
- Paper: what is being used in the current scheduling systems? How does this relate to research in optimization?
- Longer term goal
- Development of a simulator to test environment
- Needs constraints, language, etc.
- Longest term goal: get better optimization into functioning schedulers
78Other possibilities - information
- What do you assume is being furnished by the users?
- Paper: what information is needed by the optimization technique being used
- Paper: what metadata is supplied by the resources (nodes, hw, nw) in a grid? (what's currently being reported?)
- Paper: how are jobs being described? How should they be described in the future?
79Suggestion
- Paper comparing current schedulers' optimization techniques
- This needs classification of techniques, perhaps
- Two views - compare user-optimized vs. system-optimized
80Potential focus?
- Different resources will have different schedulers/optimization techniques - how to decide between them
- Given a single user running a job - might want to let the user know what the choice in selecting this is (if a choice is available)
81Next Steps
- Define a charter
- Identify which paper topics should be looked at first
- Since we can't do this on the fly - who wants to do it in email?
- Bruno Volckaert has a related paper.
- 8-10 people would like to discuss further on the mailing list
82Scheduling Attributes Working Group (SG)
- http://www.cs.northwestern.edu/jms/sched-wg/sa-wg.html
- sched-wg_at_gridforum.org
- Chair: Uwe.Schwiegelshohn_at_udo.edu
- Attributes for Communication between Scheduling Instances
- Authors: Uwe Schwiegelshohn, Ramin Yahyapour
- SchedWD 10.5
90Notes
- Should we include required job length in the document, or is this part of the resource description?
- Attendees (including Uwe): 13
- Number who read the paper: 1
91Migration Attribute
- Does this mean within the local management sphere or to the outside?
- Answer: outside (as this is only dealing with scheduler-to-scheduler interactions)
- Migratable means "I can stop the job and package it"
- What about "I can receive a packaged job"?
- May need 2 attributes: "I can pack" and "I can unpack and run"
- What about stop, move, restart someplace else in the system?
- This is not migration; this is checkpointing and restarting
- Perhaps it should be broken down as
- Checkpoint and continue
- Checkpoint and stop, or stop and checkpoint
- Checkpoint, migrate, restart
92Migration Attribute, cont.
- You may not have to capture all state to migrate (e.g., you could forward messages in flight)
- Checkpointing implies you can turn the system off (and the checkpoint is still OK); this is not necessarily true with migration.
93Notes
- High level scheduler vs. low level scheduler?
- Defined by behavior (a la client/server)
- What's the data model?
- Some attributes are intrinsic; other attributes may be derived by combining the attributes of lower level schedulers
- How you combine the attributes is not the subject of this document
94Notes
- Do we lose anything by restricting ourselves to a static hierarchy (e.g., all schedulers can be represented as a DAG)?
95Data Grid Scheduling Attributes
- Document appears close enough for CPU-type scheduling; perhaps it should be done
- It doesn't appear complete for data-type attributes
- For example
- Guaranteed data transfer completion (reliability), e.g., after the job is done, the stage-out will definitely happen
- May need some timeframe (e.g., in less than 200 years)
- There are probably others
96Notes
- How about the opposite of guaranteed completion?
- Perhaps it just doesn't set the attribute
- Perhaps this assumption should be added to the document
97Next Steps
- Add
- restart attribute
- Assumption that lack of assertion of an attribute is equivalent to asserting the negative of the attribute
- This document enters formal document process
- Then
- Architecture that makes use of this
- API (maybe protocol) that uses this
- Identify what existing schedulers provide
- Will need a prototype to know if this is useful
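The proposed assumption (an attribute that is not asserted counts as its negative) and the suggested split of migration into "I can pack" and "I can unpack and run" can be sketched together. Everything named here is hypothetical: the class, the attribute strings, and `can_migrate` are illustrations, not identifiers from SchedWD 10.5.

```python
class SchedulerAttributes:
    """Set of attributes a scheduling instance asserts about itself."""

    def __init__(self, *asserted):
        self._asserted = frozenset(asserted)

    def has(self, name):
        # An attribute that was never asserted is treated as negated.
        return name in self._asserted


def can_migrate(source, target):
    """Migration needs a source that can pack and a target that can unpack."""
    return source.has("can_pack") and target.has("can_unpack_and_run")
```

With this closed-world reading, a scheduler that says nothing about migration is simply never chosen as a migration source or target, so no separate "cannot migrate" attribute is needed.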
98BOF - DRMAA: Distributed Resource Management Application API
- Proposed co-chairs: John Tollefsrud, john.tollefsrud_at_eng.sun.com, Sun; Bill Nitzberg, bill_at_computer.org, Veridian
- www.cs.northwestern.edu/jms/sched-wg
- sched-wg_at_gridforum.org
99Proposed Scope: "Run a Job" API (Steps from Ten Actions when SuperScheduling, GGF SchedWD 8.5, J.M. Schopf, July 2001)
- Phase 1: Resource Discovery
- Step 1: Authorization Filtering
- Step 2: Application requirement definition
- Step 3: Minimal requirement filtering
- Phase 2: System Selection
- Step 4: Gathering information (query)
- Step 5: Select the system(s) to run on
- Phase 3: Run job
- Step 6: (optional) Make an advance reservation
- Step 7: Submit job to resources
- Step 8: Preparation Tasks
- Step 9: Monitor progress (maybe go back to 4)
- Step 10: Find out J is done
- Step 11: Completion tasks
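The three phases can be read as a small pipeline: filter, select, submit. The sketch below is a simplified illustration only. The resource dictionaries, the single "nodes" requirement, the least-loaded selection rule, and the `query_load` and `submit` callbacks are all assumptions, and the optional reservation, preparation, monitoring, and completion steps are left out.

```python
def run_a_job(resources, user, requirements, query_load, submit):
    """Walk the discovery, selection, and submission phases for one job."""
    # Phase 1: resource discovery
    authorized = [r for r in resources if user in r["users"]]      # step 1
    candidates = [r for r in authorized
                  if r["nodes"] >= requirements["nodes"]]           # steps 2-3
    if not candidates:
        raise LookupError("no resource meets the minimal requirements")
    # Phase 2: system selection
    loads = {r["name"]: query_load(r["name"]) for r in candidates}  # step 4
    chosen = min(candidates, key=lambda r: loads[r["name"]])        # step 5
    # Phase 3: run job (steps 6 and 8-11 omitted in this sketch)
    return submit(chosen["name"], requirements)                     # step 7
```

Passing `query_load` and `submit` in as callables keeps the pipeline independent of any particular information service or resource manager.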
100Why API? -- Code Re-Use
- Command Line standards
- Script re-use
- API (Application Programmer Interface)
- Code re-use
- Protocol
- Interoperability
- Language/Syntax
- Re-use and interoperability
116Proposed Focus
- Timing: API defined in 9 months
- E.g., Jul 02: DRMAA v1.0 GWD submitted for review
- Standardize existing systems
- Don't invent something new
117Next Steps
- Solicit participation <-- YOU ARE HERE
- Especially resource management vendors and application developers
- Finalize charter, milestones, ...
- Discuss via sched-wg email list
- Submit to GGF chair to form DRMAA working group
- Draft/standardize API
118Proposed Charter
- Develop an API specification for the submission and control of jobs to one or more Distributed Resource Management (DRM) systems.
- The scope of this specification is all the high-level functionality which is necessary for an application to consign a job to a DRM system, including common operations on jobs like termination or suspension.
- The objective is to facilitate the direct interfacing of applications to today's DRM systems by application builders, portal builders, and Independent Software Vendors (ISVs).
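To make the charter's scope concrete, here is a hypothetical sketch of what an application-facing interface covering submission, suspension, and termination might look like. It is not the DRMAA specification, which did not yet exist at the time of this meeting; `DrmSession`, `InMemoryDrm`, and every method name and state string are invented for illustration.

```python
from abc import ABC, abstractmethod
from itertools import count


class DrmSession(ABC):
    """Hypothetical interface an application would code against."""

    @abstractmethod
    def submit(self, command, args):
        """Consign a job to the DRM system and return its job ID."""

    @abstractmethod
    def suspend(self, job_id):
        """Pause a running job."""

    @abstractmethod
    def terminate(self, job_id):
        """Stop a job permanently."""

    @abstractmethod
    def status(self, job_id):
        """Return the job's current state as a string."""


class InMemoryDrm(DrmSession):
    """Stand-in backend; a real one would wrap an existing DRM system."""

    def __init__(self):
        self._ids = count(1)
        self._state = {}

    def submit(self, command, args):
        job_id = str(next(self._ids))
        self._state[job_id] = "running"
        return job_id

    def suspend(self, job_id):
        if self._state[job_id] == "running":
            self._state[job_id] = "suspended"

    def terminate(self, job_id):
        self._state[job_id] = "terminated"

    def status(self, job_id):
        return self._state[job_id]
```

An application or portal written against `DrmSession` would not need to change if the backend were swapped for another DRM system, which is the code re-use argument made for choosing an API.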
119Notes
- Good focus if it's quick, simple, and based on existing systems
- If it drags along, then maybe we should do a more in-depth process
- Is this applicable from a web service point of view?
- XML? SOAP objects? Or some such
120Scheduling Working Group: A Brief History
121Advance Reservation and Co-Scheduling Workshop, May 1999
- Defined reservation
- Resource, start, end, duration
- Enumerated desired capabilities
- de-coupled from job submission
- unique printable reservation ID
- query/response - returns list of available slots
- hard and soft reservations
- Enumerated harder stuff to put off until later, e.g., guarantee, cost model
122Grid Forum 1 (NASA Ames), June 1999
- Initial Charter
- Solve Grid Resource Management
- Three focus areas
- Advance reservations
- Super scheduling
- Resource specification (semantics tokens)
123Grid Forum 2 (Northwestern), October 1999
- Refined charter
- Requested
- lists of tokens from different groups
- architecture pictures of existing systems
- Discussed "What is X?"
- e.g., job, scheduler
124Grid Forum 3 (UCSD), March 2000
- Adopted charter and refocused
- Decided not to work on architecture
- Developed Super-scheduler Model (10 steps)
- Gave overviews of advance reservation systems (GARA, Maui, PBS, LSF)
- Commitments to draft several SchedRFCs
125Grid Forum 4 (Microsoft), July 2000
- Changed "Sched RFC" to "Sched Working Document"
- Revised working document drafts
- Query Interface
- Resource Acquisition Steps
- Security Requirements
- Advance Reservation API
- Scheduling Information
- Suggested new working documents
- 10 Steps Run a job API
126Grid Forum 5 (Boston), October 2000
- Generic Grid Resource Description combined to become the new advance reservation API document
- Ten Steps for Superscheduling: ever closer to done
- Security Requirements of the Scheduling Working Group passed on to Security as a usage scenario
- Grid Query and Reservation Interface
127GGF-1 (Amsterdam), March 2001
- Three documents discussed
- Ten Actions for SuperScheduling, Arch/Framework, J. Schopf
- basically done, only minor edits suggested
- in progress, refocusing, to be discussed in telecon
- Advance Reservation API, API, A. Roy, V. Sander
- basically done, only minor edits suggested
- Lower Level Scheduling Attributes, Syntax/Language, U. Schwiegelshohn
- New areas suggested for attention
- Advance reservation Protocol, Protocol, A. Roy, V. Sander, K. Czajkowski, J. Karpovich
- Scheduling Dictionary, Syntax/Language, J. Schopf
128GGF-2 (Washington, DC), July 2001
- Scheduling Dictionary
- Collecting words
- New co-chairs: Mary Roehrig, Wolfgang Ziegler, and Jennifer Schopf
- Scheduling Attributes
- Presented and discussed draft document
- Advance Reservation Protocol
- Presented draft document
- Refocused to Grid Resource Mgmt Protocol
- Decided to attack requirements first.
129GGF-3 (Frascati, IT), October 2001
- WG Grid Resource Management
- WG Scheduling Attributes
- WG Scheduling Dictionary
- New co-chairs, no GGF3 meeting
- 3 potential new activities (BOFs)
- Scheduling Command Line Interface
- Scheduling Optimization
- Distributed Resource Management Application API