The Bologna Batch System: Flexible Policy with Condor

www.cs.wisc.edu/condor

Slides: 25
Provided by: peter732
1
The Bologna Batch System: Flexible Policy with
Condor
2
The Bologna Batch System
  • Custom batch scheduling system for local users at
    INFN in Bologna, Italy.
  • Istituto Nazionale di Fisica Nucleare (National
    Institute for Nuclear Physics)
  • Dr. Paolo Mazzanti initiated the idea.
  • Implemented on a small subset of machines within
    the larger nationwide INFN Condor pool
  • INFN Condor Pool: 300 CPUs
  • INFN-Bologna Condor: 100 CPUs
  • Bologna Batch System: 50 CPUs

3
Where We Started
  • Basic Condor Policy
  • Opportunistic resources
  • Jobs only run when machines are otherwise idle
  • Jobs can be preempted for machine owners or
    higher-priority users
  • Fair-share across INFN pool
  • Highest priority user in the pool gets first
    crack at a given resource
  • The more you use, the worse your priority becomes
  • Some problems
  • Long-running vanilla jobs (with no checkpointing)
    were frequently preempted before running to
    completion
  • Users dislike waiting for a resource if they only
    want to run a short job
  • High-priority users from other INFN sites running
    on local resources while lower-priority local
    users wait.

4
BBS Policy Requirements
  • Prioritize local work
  • Share resources, but run outside jobs as backfill
  • Treat local servers as dedicated resources for
    local jobs, but opportunistic resources for
    other jobs.
  • Run outside Condor jobs only if the server is
    idle.
  • Run local batch jobs regardless of other system
    load or console activity.
  • Preempt outside Condor jobs to allow local batch
    jobs to run, but don't preempt local jobs for
    outside work.

5
BBS Policy Requirements
  • Ensure resource availability for both short and
    long-running jobs
  • Prioritize short batch jobs so that they are
    never kept waiting by long batch jobs.
  • Prevent long batch jobs from being preempted or
    starved by short jobs.
  • Never waste resources
  • No idle CPUs when jobs are waiting to run!
  • No preemption of vanilla jobs!
  • Preemption is ideal if you can checkpoint, but here
    we can't

6
A Contradiction!
  • No way to guarantee resource availability for
    short or long jobs without reserving some CPUs
    for each
  • ...But no way to avoid idle CPUs without allowing
    them to start any kind of job
  • If CPUs reserved for short jobs are used for long
    jobs, they become unavailable to run short jobs.
  • If CPUs reserved for short jobs are not used for
    long jobs, they're being wasted when there are no
    short jobs to run.
  • What to do, what to do

7
A Solution!
  • Allow resources to be temporarily overcommitted
  • We treat one CPU as two
  • On a two-CPU machine, define four Condor VMs
    (virtual machines): two for short jobs and two
    for long jobs.
  • Allow jobs to be suspended rather than preempted
  • Think of it as checkpointing to swap
  • OR allow jobs to be de-prioritized temporarily
  • If memory is adequate, allow suspended long
    jobs to continue running at a poor OS priority
    and steal cycles whenever active short jobs are
    busy doing I/O.

8
Everybody wins!
  • Short jobs start right away on dedicated short
    VMs
  • Long jobs aren't preempted by short jobs, but
    rather suspend temporarily or run at a lower
    priority.
  • Outside jobs run only when no Bologna jobs are
    waiting.
  • All CPUs available to all types of jobs.
  • No idle CPUs when jobs are waiting.

9
Okay, how?
  • Flipside of flexibility is complexity!
  • It's pretty cool that Condor allows you to
    combine dedicated and opportunistic scheduling in
    one system, but it takes a bit of work to get it
    all set up
  • Luckily for y'all, we've already done the hard
    part, and now you can copy it.

10
Copy it from where?
  • Bologna Batch System document
  • http://www.cs.wisc.edu/pfc/bbs.doc
  • A detailed walk-through of the specific policies
    and the necessary Condor configuration to make
    each one work.
  • Line by line examples of how we implemented each.
  • What's in it? Let's take a look

11
First Step: No hand waving!!
  • Bologna Batch Jobs are specially-designated jobs
    which may run only on specially-designated
    Bologna Batch Servers.
  • Only users in Bologna may submit Bologna Batch
    Jobs.
  • Bologna Batch Jobs must be vanilla-universe jobs
    (and therefore are not capable of checkpointing
    and resuming), and thus once they start they must
    not be preempted for other jobs.
  • Bologna Batch Servers prefer Bologna Batch Jobs
    over other Condor jobs, and will start Bologna
    Batch Jobs regardless of system load or console
    activity.
  • There are two types of Bologna Batch Jobs,
    short-running and long-running. Bologna Batch
    Jobs are assumed to be short-running unless they
    are explicitly labeled as long-running when they
    are submitted.
  • A short-running Bologna Batch Job must not be
    forced to wait for the completion of a
    long-running Bologna Batch Job before starting.
  • When short and long-running Bologna Batch Jobs
    are running simultaneously on the same physical
    machine, the short-running job processes should
    run at a lower (better) OS priority than the
    long-running jobs.
  • A short-running Bologna Batch Job may only run
    for one hour, after which point it should be
    killed and removed from the queue.
  • Bologna Batch Jobs have priority over other
    Condor jobs. This means two things: other jobs
    must never preempt Bologna Batch Jobs, and
    Bologna Batch Jobs must always immediately
    preempt other jobs.

12
Review
  • Job
  • Requirements
  • Machine
  • START
  • PREEMPT
  • RANK
  • WANT_SUSPEND
  • JOB_RENICE_INCREMENT
  • PREEMPTION_REQUIREMENTS
  • STARTD_EXPRS, SUBMIT_EXPRS

13
Requirement 1, Bologna Batch Jobs are
specially-designated jobs which may run only on
specially-designated Bologna Batch Servers.
  • To identify the servers, place into the local Condor
    config:
  • BolognaBatchServer = True
  • STARTD_EXPRS = $(STARTD_EXPRS) BolognaBatchServer
  • Users identify Bologna Batch Jobs by inserting the
    following line into their job submit description
    files:
  • +BolognaBatchJob = True
  • Now that Bologna Batch Jobs and Servers can identify
    one another, users ensure that Bologna Batch Jobs
    run only on Bologna Batch Servers by specifying a
    job requirement:
  • Requirements = (BolognaBatchServer == True)
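
The job-side requirement can be illustrated with a toy model in Python. This is only a sketch: ClassAds are represented as plain dicts, the machine names are hypothetical, and real Condor matchmaking evaluates far more than this one attribute.

```python
# Toy model: a machine ClassAd is a plain dict (an assumption for illustration).
def job_requirements(machine_ad: dict) -> bool:
    """Models Requirements = (BolognaBatchServer == True)."""
    # If the attribute is missing, the ClassAd comparison is UNDEFINED,
    # which does not satisfy Requirements; that is modeled here as False.
    return machine_ad.get("BolognaBatchServer") is True

bbs_machine = {"Name": "bbs01", "BolognaBatchServer": True}  # hypothetical
other_machine = {"Name": "pool17"}                           # hypothetical

print(job_requirements(bbs_machine))    # True
print(job_requirements(other_machine))  # False
```

A machine that never advertises BolognaBatchServer simply fails the job's requirement, so Bologna Batch Jobs cannot land outside the designated servers.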

14
Requirement 2, Only users in Bologna may submit
Bologna Batch Jobs.
  • Each Bologna Batch Server double-checks the
    origin of a job claiming to be a Bologna Batch
    Job:
  • IsBBJob = ( TARGET.BolognaBatchJob =?= True \
        && TARGET.SUBMIT_SITE_DOMAIN == $(SUBMIT_SITE_DOMAIN) )
  • SUBMIT_SITE_DOMAIN is an attribute that INFN
    defines on all machines, and which they
    previously configured the Condor schedd to
    automatically add to each job's classad.
    Individual Condor users are not able to override
    it:
  • SUBMIT_SITE_DOMAIN = "$(UID_DOMAIN)"
  • SUBMIT_EXPRS = $(SUBMIT_EXPRS) SUBMIT_SITE_DOMAIN
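
The origin check can be sketched in Python as follows. The dict-based ClassAds, the `UNDEFINED` sentinel, the helper names, and the `*.example` domains are all assumptions made for illustration; only the attribute names come from the slides.

```python
UNDEFINED = object()  # stand-in for the ClassAd UNDEFINED value

def meta_equal(a, b) -> bool:
    """Model of =?=: a strict comparison that never yields UNDEFINED."""
    if a is UNDEFINED or b is UNDEFINED:
        return a is b
    return type(a) is type(b) and a == b

def is_bb_job(job_ad: dict, server_domain: str) -> bool:
    """Models IsBBJob as evaluated on a Bologna Batch Server."""
    claims_bb = meta_equal(job_ad.get("BolognaBatchJob", UNDEFINED), True)
    domain = job_ad.get("SUBMIT_SITE_DOMAIN", UNDEFINED)
    # A plain == on a missing attribute is UNDEFINED and cannot satisfy the
    # policy; ClassAd == also ignores case when comparing strings.
    same_site = isinstance(domain, str) and domain.lower() == server_domain.lower()
    return claims_bb and same_site

local = {"BolognaBatchJob": True, "SUBMIT_SITE_DOMAIN": "bo.example"}
forged = {"BolognaBatchJob": True, "SUBMIT_SITE_DOMAIN": "elsewhere.example"}
plain = {}

for ad in (local, forged, plain):
    print(is_bb_job(ad, "bo.example"))
```

A job that sets BolognaBatchJob itself but was submitted from another site fails the check, because the schedd, not the user, supplies SUBMIT_SITE_DOMAIN.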

15
Requirement 3, BB Jobs must be vanilla-universe
jobs, and thus once they start they must not be
preempted
  • Next we modified each Bologna Batch Server's
    WANT_SUSPEND_VANILLA and PREEMPT expressions,
    which Condor uses to decide when to suspend or
    preempt a vanilla job, so that INFN's default
    preemption policy would only affect non-Bologna
    Batch Jobs.
  • IsNotBBJob = ( $(IsBBJob) =!= True )
  • WANT_SUSPEND_VANILLA = ( $(IsNotBBJob) && ($(WANT_SUSPEND_VANILLA)) )
  • PREEMPT = ( $(IsNotBBJob) && ($(PREEMPT)) )

16
Requirement 4, Bologna Batch Servers prefer
Bologna Batch Jobs over other Condor jobs, and
will start Bologna Batch Jobs regardless of
system load or console activity
  • RANK = $(IsBBJob)
  • INFN_START = ( (LoadAvg - CondorLoadAvg) < 0.3 \
        && KeyboardIdle > (15 * 60) \
        && TotalCondorLoadAvg < 1.0 )
  • START = ( $(IsBBJob) || ($(INFN_START)) )
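
To see how the two halves of START interact, here is a small Python sketch. The thresholds mirror the slide; the sample machine values and the dict representation are made up for illustration.

```python
def infn_start(m: dict) -> bool:
    """Models INFN_START: the machine must look idle to its owner."""
    return ((m["LoadAvg"] - m["CondorLoadAvg"]) < 0.3
            and m["KeyboardIdle"] > 15 * 60
            and m["TotalCondorLoadAvg"] < 1.0)

def start(is_bb_job: bool, m: dict) -> bool:
    """Models START: BB jobs always start; others need an idle machine."""
    return is_bb_job or infn_start(m)

# Hypothetical machine states.
busy = {"LoadAvg": 1.4, "CondorLoadAvg": 0.0,
        "KeyboardIdle": 30, "TotalCondorLoadAvg": 0.0}
idle = {"LoadAvg": 0.05, "CondorLoadAvg": 0.0,
        "KeyboardIdle": 3600, "TotalCondorLoadAvg": 0.0}

print(start(True, busy))   # True
print(start(False, busy))  # False
print(start(False, idle))  # True
```

The server behaves as a dedicated resource for Bologna Batch Jobs and as an opportunistic one for everything else.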

17
Requirement 5, There are two types of Bologna
Batch Jobs, short-running and long-running.
Bologna Batch Jobs are assumed to be
short-running unless they are explicitly labeled
as long-running when they are submitted.
  • Declare long-running jobs by placing the
    following into the submit file:
  • +LongRunningJob = True
  • Then, in the config file, take advantage of
    meta-operators:
  • IsLongBBJob = ( $(IsBBJob) && TARGET.LongRunningJob =?= True )
  • IsShortBBJob = ( $(IsBBJob) && TARGET.LongRunningJob =!= True )
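
The reason for the meta-operators is that a job which never mentions LongRunningJob must still classify as short. A minimal Python sketch of that defaulting behavior, with a hypothetical `classify` helper and dict-based job ads:

```python
def classify(job_ad: dict, is_bb_job: bool) -> str:
    """Returns 'long', 'short', or 'other' for a candidate job."""
    if not is_bb_job:
        return "other"
    # "=?= True" holds only for a literal True. "=!= True" holds for
    # False and for a missing attribute alike, so short is the default.
    return "long" if job_ad.get("LongRunningJob") is True else "short"

print(classify({"LongRunningJob": True}, True))   # long
print(classify({"LongRunningJob": False}, True))  # short
print(classify({}, True))                         # short
print(classify({"LongRunningJob": True}, False))  # other
```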

18
Requirement 6, A short-running Bologna Batch
Job must not be forced to wait for the completion
of a long-running Bologna Batch Job before
starting.
  • Declare more Virtual Machines than there are
    actual CPUs (dual CPU 2 short VMs, 4 long)
  • NUM_SHORT_RUNNING_VMS = 2
  • IsShortRunningVM = (VirtualMachineID <= $(NUM_SHORT_RUNNING_VMS))
  • IsLongRunningVM = (VirtualMachineID > $(NUM_SHORT_RUNNING_VMS))
  • Change the START expression:
  • SHORT_RUNNING_VM_START = ( $(IsShortBBJob) \
        || ( $(IsNotBBJob) && $(INFN_START) ) )
  • LONG_RUNNING_VM_START = $(IsLongBBJob)
  • START = ( ( $(IsShortRunningVM) && $(SHORT_RUNNING_VM_START) ) \
        || ( $(IsLongRunningVM) && $(LONG_RUNNING_VM_START) ) )
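
The per-VM START logic can be checked with a short Python sketch. It assumes VM IDs are numbered from 1 and that a VM counts as short-running when its ID is at most NUM_SHORT_RUNNING_VMS; the job kinds ('short', 'long', 'other') are labels invented for this example.

```python
NUM_SHORT_RUNNING_VMS = 2

def start(vm_id: int, kind: str, infn_start: bool) -> bool:
    """kind is 'short', 'long' (BB jobs) or 'other' (outside Condor jobs)."""
    is_short_vm = vm_id <= NUM_SHORT_RUNNING_VMS
    # Short VMs take short BB jobs, or outside jobs when the machine is idle.
    short_vm_start = kind == "short" or (kind == "other" and infn_start)
    # Long VMs take only long BB jobs.
    long_vm_start = kind == "long"
    return (is_short_vm and short_vm_start) or (not is_short_vm and long_vm_start)

for vm_id in (1, 2, 3, 4):
    row = {kind: start(vm_id, kind, infn_start=True)
           for kind in ("short", "long", "other")}
    print(vm_id, row)
```

Under these assumptions, VMs 1 and 2 accept short and outside jobs, while VMs 3 and 4 accept only long jobs, so a long job can never occupy a slot a short job needs.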

19
Requirement 7, When short and long-running BB
Jobs are running simultaneously on the same
physical machine, the short-running job processes
should run at a lower (better) OS priority
  • JOB_RENICE_INCREMENT = ( 5 + ( 10 * ( LongRunningJob =?= True \
        || BolognaBatchJob =!= True ) ) )
  • If LongRunningJob is true in the job classad, the
    expression evaluates to (5 + (10 * 1)), or 15.
    If LongRunningJob is undefined or false in the
    job classad, but BolognaBatchJob is true, the
    expression evaluates to (5 + (10 * 0)), or 5. If
    neither is defined, the expression evaluates to
    (5 + (10 * 1)), or 15.
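
The three cases described above can be reproduced with a few lines of Python. This is a sketch that models the job classad as a dict and relies on a boolean counting as 0 or 1 in arithmetic, as the slide's own worked values do.

```python
def job_renice_increment(job_ad: dict) -> int:
    """Models 5 + 10 * (LongRunningJob =?= True || BolognaBatchJob =!= True)."""
    is_long = job_ad.get("LongRunningJob") is True      # =?= True
    not_bb = job_ad.get("BolognaBatchJob") is not True  # =!= True
    return 5 + 10 * int(is_long or not_bb)

print(job_renice_increment({"BolognaBatchJob": True, "LongRunningJob": True}))  # 15
print(job_renice_increment({"BolognaBatchJob": True}))                          # 5
print(job_renice_increment({}))                                                 # 15
```

Only short Bologna Batch Jobs get the smaller increment, so they win the CPU whenever they share a physical machine with long or outside jobs.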

20
Requirement 8, A short-running Bologna Batch
Job may only run for one hour, after which point
it should be killed and removed from the queue.
  • Extend the PREEMPT and SHORT_RUNNING_VM_START
    expressions in the config file:
  • PREEMPT = ( ( $(IsNotBBJob) && ($(PREEMPT)) ) \
        || ( $(IsShortBBJob) && ($(ActivityTimer) > 60 * 60) ) )
  • SHORT_RUNNING_VM_START = ( ( $(IsShortBBJob) \
        && (RemoteWallClockTime < 60 * 60) =!= False ) \
        || ( $(IsNotBBJob) && ($(INFN_START)) ) )
  • To remove the job from the queue, add to the job ad:
  • Periodic_Remove = ( LongRunningJob =!= True \
        && (RemoteWallClockTime > 60 * 60) )
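
A rough Python model of the removal rule, assuming the intent stated in the requirement: a job not labeled long-running is removed once its accumulated wall-clock time exceeds one hour. The dict-based job ad and the default of 0 seconds for a job that has not run yet are assumptions.

```python
ONE_HOUR = 60 * 60  # seconds

def periodic_remove(job_ad: dict) -> bool:
    """True when the job should be removed from the queue."""
    not_long = job_ad.get("LongRunningJob") is not True  # =!= True
    # Assumption: a job that has never run has no RemoteWallClockTime yet.
    return not_long and job_ad.get("RemoteWallClockTime", 0) > ONE_HOUR

print(periodic_remove({"RemoteWallClockTime": 1800}))                          # False
print(periodic_remove({"RemoteWallClockTime": 3700}))                          # True
print(periodic_remove({"RemoteWallClockTime": 3700, "LongRunningJob": True}))  # False
```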

21
Requirement 9, Bologna Batch Jobs have priority
over other Condor jobs: other jobs must never
preempt BB Jobs, and BB Jobs must always
immediately preempt other jobs.
  • RANK is already dealt with; now for priority preemption:
  • INFN_PREEMPTION_REQUIREMENTS = ( $(StateTimer) > (2 * (60 * 60)) \
        && RemoteUserPrio > SubmittorPrio * 1.2 )
  • PREEMPTION_REQUIREMENTS = \
        ( ( BolognaBatchServer =!= True && $(INFN_PREEMPTION_REQUIREMENTS) ) \
        || ( BolognaBatchServer =?= True \
        && ( BolognaBatchJob =!= True \
        && ( TARGET.BolognaBatchJob =?= True \
        || $(INFN_PREEMPTION_REQUIREMENTS) ) ) ) )
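
The decision that Requirement 9 describes can be written out as a small Python function. This is a sketch of the stated requirement, not a transcription of the negotiator expression: the function and parameter names are invented, and it assumes the running job's BolognaBatchJob attribute is visible when the candidate is evaluated.

```python
def infn_preemption_ok(state_timer: float, running_prio: float,
                       candidate_prio: float) -> bool:
    """Models the INFN default: claimed for over two hours, and the running
    user's priority value is more than 20% worse than the candidate's."""
    return state_timer > 2 * 60 * 60 and running_prio > candidate_prio * 1.2

def may_preempt(on_bbs: bool, running_is_bb: bool,
                candidate_is_bb: bool, infn_ok: bool) -> bool:
    if not on_bbs:
        return infn_ok          # ordinary machines keep the INFN policy
    if running_is_bb:
        return False            # BB jobs are never preempted
    return candidate_is_bb or infn_ok  # BB candidates preempt immediately

print(may_preempt(True, True, True, True))     # False
print(may_preempt(True, False, True, False))   # True
print(may_preempt(True, False, False, False))  # False
print(may_preempt(False, False, False, infn_preemption_ok(8000, 3.0, 2.0)))  # True
```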

22
Wrap condor_submit to make it easy for
users: bbs_submit_short / bbs_submit_long
  • #!/bin/sh
  • _CONDOR_APPEND_REQ_VANILLA='(BolognaBatchServer == True)'
  • export _CONDOR_APPEND_REQ_VANILLA
  • condor_submit -a '+BolognaBatchJob = True' \
        -a 'should_transfer_files = IF_NEEDED' \
        -a 'when_to_transfer_output = ON_EXIT' \
        -a 'universe = vanilla' \
        -a 'periodic_remove = ( LongRunningJob =!= True && (RemoteWallClockTime > 60 * 60) )' \
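
As a rough illustration of what the two wrappers have in common, the Python sketch below only builds the condor_submit argument list and prints it; it does not run anything. The function name is hypothetical, and the assumptions that the long variant appends `+LongRunningJob = True` and that user arguments are passed through at the end are guesses, since the slide's script is cut off.

```python
import shlex

def bbs_submit_args(long_running: bool, user_args: list) -> list:
    """Builds an argument list resembling the wrapper's condor_submit call."""
    args = [
        "condor_submit",
        "-a", "+BolognaBatchJob = True",
        "-a", "should_transfer_files = IF_NEEDED",
        "-a", "when_to_transfer_output = ON_EXIT",
        "-a", "universe = vanilla",
        "-a", "periodic_remove = ( LongRunningJob =!= True "
              "&& (RemoteWallClockTime > 60 * 60) )",
    ]
    if long_running:
        # Assumption: the long wrapper labels the job as long-running.
        args += ["-a", "+LongRunningJob = True"]
    # Assumption: remaining user arguments (e.g. the submit file) come last.
    return args + list(user_args)

print(shlex.join(bbs_submit_args(True, ["myjob.sub"])))
```

Building the list rather than a single string keeps each `-a` value intact as one argument, which matters because the appended submit lines contain spaces and parentheses.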

23
Simple for Users
  • Although policy is complicated, the interface for
    users is kept simple
  • Users call bbs_submit_long or bbs_submit_short,
    just as they would condor_submit
  • Short jobs start quickly, but those that run for
    > 1 hour are killed.
  • Long jobs will run to completion...
  • bbs_submit_* scripts automatically add the
    appropriate classad attributes to the job to take
    advantage of the long or short running VMs on
    Bologna Batch Servers.

24
Any Questions?
  • Email me at condor-admin@cs.wisc.edu.
  • Check the Bologna Batch System document at
    http://www.cs.wisc.edu/pfc/bbs.doc
  • Thanks!