Title: Data Grid Scheduler Belle Demo
1 Data Grid Scheduler Belle Demo
- KEK Grid Workshop
- Dec 2004
- Lyle Winton, University of Melbourne
2 Data Grid Scheduling
- User Interface
- How to distribute our task, breaking it into smaller distributed jobs?
- easily divisible tasks, basic workflow
- events, files, groups of files, runs
- simple technique of parameter sweep (eg. X = file1, file2)
- Resource Manager
- Which resources are usable? (RSL, GSI)
- What resources are available? (MDS, GRIS)
- Scheduling
- Which resource combination is best?
- Dispatch
- Execute the job (via low level tools, eg. Globus)
- Monitor the job
- Check for failure, retry, reschedule, and report
- Output retrieval
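The parameter sweep mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the scheduler's own code: the `Job` class, the `sweep` function and the file names are hypothetical, and it assumes one job per value of a single swept parameter.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """One unit of work produced by splitting a task (hypothetical structure)."""
    job_id: int
    env: dict = field(default_factory=dict)  # parameters exposed to the job script


def sweep(param_name, values):
    """Expand a single swept parameter into one job per value."""
    return [Job(job_id=i, env={param_name: value})
            for i, value in enumerate(values, start=1)]


if __name__ == "__main__":
    # Illustrative file names only.
    for job in sweep("X", ["file1", "file2"]):
        print(job.job_id, job.env)
```

Each resulting job carries its own parameter value, so the jobs can be scheduled, dispatched and retried independently of one another.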
3 Data Grid Scheduling
- Task -> Job1, Job2 ...
- Job1 -> input file replica 1, replica 2 ...
- Job1 input replica -> CPU resource 1 ...
- Need to determine what/where is best
- we are using several metrics
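The chain above (task to jobs, job to input replicas, replica to CPU resources) amounts to a search over combinations. The sketch below enumerates every (replica, CPU resource) pair for one job and keeps the cheapest under a caller-supplied cost function. All names, including `best_placement` and the example sites and costs, are assumptions made for illustration; the slides do not say how the metrics are combined.

```python
from itertools import product


def best_placement(replicas, cpu_resources, cost):
    """Return the (replica, cpu) pair with the lowest cost.

    cost is any callable taking (replica, cpu) and returning a number;
    lower means better. Raises ValueError if there are no candidates.
    """
    candidates = list(product(replicas, cpu_resources))
    if not candidates:
        raise ValueError("no replica/CPU combination available")
    return min(candidates, key=lambda pair: cost(*pair))


if __name__ == "__main__":
    # Hypothetical transfer costs between storage sites and compute sites.
    transfer_cost = {
        ("storage-a", "cpu-a"): 1, ("storage-a", "cpu-b"): 5,
        ("storage-b", "cpu-a"): 4, ("storage-b", "cpu-b"): 2,
    }
    choice = best_placement(["storage-a", "storage-b"], ["cpu-a", "cpu-b"],
                            lambda r, c: transfer_cost[(r, c)])
    print(choice)
```

Exhaustive enumeration is only reasonable while the number of replicas and resources stays small, as it does on a testbed of a handful of sites.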
4 Data Grid Scheduling
- GQSched (Grid Quick Scheduler) v2004, status: testing for production
- Parameter sweep over files and collections
- Job descriptions are simple shell scripts (CSH, SH, familiar to physicists)
- Parameters are implemented as environment variables (easy to use)
- Data Grid Enabled
- Access to SRB, GlobusRC, GSIFTP, GASS, HTTP etc.
- Scheduling based on metrics for CPU Resource and Data Resource combinations
- previous failures of job on resource
- nearness of physical file locations (replicas, SRB copy location)
- resource availability
- Extra features
- Pre- and post-processing for preparation/collation of data and job status checks
- Dynamic job specification (user can modify job specs during scheduling)
- Creation and clean-up of unique job execution area
- Private network friendly staging of files for specific resources (3 stage jobs)
- Automatic retry and resubmit of jobs
- Reporting of file access errors and job errors
- Merging of RSL requirements for Resources and Jobs
- Automatic checking and creation of Grid proxy
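Two of the items above, automatic retry and the "previous failures of job on resource" metric, fit together naturally. The following is a minimal sketch of that interaction, assuming a hypothetical `submit` callable that reports success or failure; it is not GQSched's actual retry logic, and it ignores the other metrics (file nearness and availability).

```python
from collections import Counter


def run_with_retry(job_id, resources, submit, max_attempts=3):
    """Try a job on the least-failed resource, rescheduling after each failure.

    submit(job_id, resource) must return True on success, False on failure.
    Returns (resource, failures) on success, or (None, failures) if every
    attempt failed. failures maps resource name to failure count.
    """
    if not resources:
        raise ValueError("no resources to schedule on")
    failures = Counter()  # previous failures of this job, per resource
    for _ in range(max_attempts):
        # Prefer the resource with the fewest recorded failures; on a tie,
        # min() returns the first one, so the caller's ordering is kept.
        resource = min(resources, key=lambda r: failures[r])
        if submit(job_id, resource):
            return resource, dict(failures)
        failures[resource] += 1
    return None, dict(failures)
```

For example, with resources `["a", "b"]` and a `submit` that only succeeds on `"b"`, the job is tried on `"a"` first, fails once, and is then resubmitted to `"b"`.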
5 Data Grid Scheduling
- gqsched myresources myscript.csh
#!/bin/csh -f
Param MYFILE GridFile srb/anusf/home/ljw563.anusf/proc1/.mdst
StageIn MYFILE
StageIn recon.conf event.conf particle.conf
echo Processing Job $JOBID on $MYFILE on host `hostname`
basfexec -v b20020424_1007 << EOF
path create main
module register user_ana
path add_module main user_ana
initialize
histogram define myana.hbook
process_event $MYFILE_LOCAL 0
terminate
EOF
echo Finished JobID $JOBID .
StageOut output.mdst srb/anusf/home/ljw563.anusf/procout1/
StageOut myana.hbook myana.JOBID.hbook
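Because parameters reach the job script as environment variables, launching one sweep job comes down to running the script with a per-job environment. The sketch below shows that mechanism only. It assumes the variable names MYFILE and JOBID from the script above; the `run_job` helper and the file name are hypothetical, and a local subprocess stands in for the Grid dispatch step.

```python
import os
import subprocess
import sys


def run_job(command, params):
    """Run command with params added to the environment; return its stdout."""
    env = dict(os.environ)  # start from the caller's environment
    env.update({name: str(value) for name, value in params.items()})
    result = subprocess.run(command, env=env, capture_output=True,
                            text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # A tiny Python child process stands in for the csh job script.
    child = [sys.executable, "-c",
             "import os; print(os.environ['JOBID'], os.environ['MYFILE'])"]
    print(run_job(child, {"JOBID": 1, "MYFILE": "example.mdst"}), end="")
```

Passing parameters through the environment keeps the job script an ordinary shell script that can also be run by hand after setting the same variables.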
6 Australian EHEP testbed
- Uni. Adelaide CS group
- 2 Xeon 2.6GHz (IBM), 70 GB disk
- ANUSF/APAC (at ANU)
- 2 Xeon 2.6GHz (IBM), 70 GB disk
- Uni. Melbourne EPP group
- 1 P4 Intel 1.7GHz, 70 GB disk
- 8 P3 workstations, PBS
- Uni. Melbourne GridBus/CS
- 2 Xeon 2.6GHz (IBM), 70 GB disk
- Uni. Sydney HEP group
- 2 Xeon 2.6GHz (IBM), 70 GB disk
- SRB storage resources
- ANUSF Petabyte Storage
- Uni. Melb EPP, 70 GB disk
7 Australian EHEP testbed
- GrangeNet
- 10 Gb research link connecting many east coast institutions
- Analysis of MC events of interest: B0 -> D- D KS
KS - .