Distributed Monte Carlo Instrument Simulations at ISIS

Slides: 32
Provided by: mcnsi

1
Distributed Monte Carlo Instrument Simulations at ISIS
Tom Griffin, ISIS Facility / University of Manchester
2
Introduction
  • What is Distributed Computing?
  • The software we use
  • VITESS Specifics
  • McStas Specifics
  • Conclusions

3
What do I mean by Distributed Grid?
  • A way of speeding up large, compute-intensive tasks
  • Break large jobs into smaller chunks
  • Send these chunks out to (distributed) machines
  • Distributed machines do the work
  • Collate and merge the results
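
The pattern above can be tried on a single PC. This is a minimal sketch, assuming a local thread pool stands in for the distributed machines and a toy Monte Carlo estimate of pi stands in for the real simulation. The names `run_chunk` and `estimate_pi` are illustrative only and are not part of any grid software.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_chunk(n_samples, seed):
    """Work done on one 'machine': count random points inside the quarter circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(total_samples, n_chunks):
    """Break the job into chunks, run them separately, then merge the results."""
    chunk_size = total_samples // n_chunks
    seeds = range(1, n_chunks + 1)  # a different seed for every chunk
    # The pool is only a stand-in for sending chunks out to other machines.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        hits = list(pool.map(run_chunk, [chunk_size] * n_chunks, seeds))
    # Collate: every chunk has the same size, so the hits can simply be summed.
    return 4.0 * sum(hits) / (chunk_size * n_chunks)


if __name__ == "__main__":
    print(estimate_pi(200_000, 8))
```

Because each chunk has its own fixed seed, repeating the run gives the same merged answer, which makes a split run easy to check.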

4
Spare Cycles Concept
  • Typical PC usage is about 10%
  • Most PCs not used at all after 5pm
  • Even with 'heavily used' (Outlook, Word, IE)
    PCs, the CPU is still grossly underutilised
  • Everyone wants a fast PC!
  • Can we use (steal?) their unused CPU cycles?
  • SETI@home, World Community Grid
    (www.worldcommunitygrid.org)

5
Possible Software Implementations
  • Toolkit e.g. COSM
    • Low-level toolkit: source-code-level integration
    • So time-consuming work for each application
  • Entropia DC Grid
    • Trial run at ISIS two years ago. Some success
    • Company bought out and in limbo (?)
  • United Devices Grid MP
    • What we're currently using
    • Quite expensive
  • Condor
    • Free (academic research project)
    • In our experience 2 yrs ago, not reliable with
      Windows

6
The United Devices System
  • Server hardware
    • We use two dual-Xeon servers; 280 client
      licenses
    • Could (will) easily cope with more clients
  • Software
    • Servers run RedHat Linux Advanced Server / DB2
    • Clients available for Windows, Linux, SPARCs and
      Macs
  • Programming
    • MGSI Web Services interface: XML, SOAP
    • Accessed with C and Java classes etc
  • Management Console
    • Web browser based
    • Can manage services, jobs, devices etc

7
Visual Introduction to the Grid
8
Suitable / Unsuitable Applications
  • CPU Intensive
  • Low to moderate memory use
  • Not too much file output
  • Coarse-grained
  • Command line / batch driven
  • Licensing issues?

9
Objects within the Grid
  • Program
    • McStas
  • Job
    • wish_simulation
  • Jobstep
  • Workunit
    • sent to a Device
  • Data Set
  • Data

10
How to write Grid Programs
  • Fairly easy to write
  • Interface to grid via Web Services
  • So far used C, Java, Perl, Fortran, C
  • Think about how to split your data and merge
    results
  • Wrap and upload your executable
  • Write the application service
  • Pre and Post processing
  • Use the Grid

11
Wrapping Your Executable
  • Executable and any DLLs etc
  • Standard data files
  • Compression
  • Encryption
  • Capture screen output
  • Set Environmental Variables
  • Command Line

12
Application Service
  • Pre-processing
    • Partition data
    • Package data partitions
  • Log in to the Grid server
  • Create a Job and Job Step
  • Create a Data Set
  • Create Datas and upload data packages
  • Create Workunits
  • Set the Job running
  • Post-Processing
    • Retrieve results
    • Merge results

13
Monte Carlo Speed-up Ideas
  • Two scenarios
    • Single large simulation run
      • Split the neutrons into smaller numbers and
        execute separately
      • Merge results in some way
    • Many smaller runs
      • Parameter scan

14
VITESS Splitting It
  • Easy mode of operation: fixed executables and data
    files
  • Executables held on server
  • Split command line into bits: divide Ncount
  • Vary the random seed
  • Create data packages
  • Upload data packages
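
Dividing Ncount and varying the seed can be sketched as below. This is a minimal sketch under stated assumptions: `split_run` and `base_seed` are hypothetical names, and the rule of giving chunk i the seed `base_seed + i` is only one simple way to make every seed different. It is not taken from the Submit program.

```python
def split_run(ncount, n_chunks, base_seed=1):
    """Return one (neutrons, seed) pair per chunk.

    The chunk sizes add up to ncount exactly, and every chunk gets a
    different seed so that the chunks are statistically independent.
    """
    if n_chunks < 1 or ncount < n_chunks:
        raise ValueError("need at least one neutron per chunk")
    base, extra = divmod(ncount, n_chunks)
    chunks = []
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)  # spread any remainder
        chunks.append((size, base_seed + i))
    return chunks


if __name__ == "__main__":
    for size, seed in split_run(30_000_000, 20)[:3]:
        print(size, seed)
```

When Ncount does not divide evenly, the chunk sizes differ by at most one neutron, so treating the chunks as equal when merging remains a good approximation.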

15
VITESS Running It
  • Use GUI to create instrument, then 'Save As Command'
  • Parameter directory set to '.'
  • Submit program parses bat file
  • Substitutes V and P
  • Removes header and footer
  • Creates many new bat files with different --Zs
    and

16
VITESS Running It
  • Submit program creates many bat files

    C:\My_GRID\VITESSE\VITESSE\build>Vitess-Submit.exe example_job example.bat req_files 20
    logging in to https://bruce.nd.rl.ac.uk:18443/mgsi/rpc_soap.fcgi as tom....
    Adding Vitesse dataset....
    Adding Vitesse datas....
    3e+007 neutrons split into 20 chunks, of -n1500000 neutrons
    Total number of Vitesse 'runs' 20
    Uploading data for run 1...
    Uploading data for run 2...
    .
    .
    Uploading data for run 19...
    Uploading data for run 20...
    Adding Vitesse datas to system....
    Adding job....
    Adding jobstep....
    Turning on automatic workunit generation....
    Closing jobstep....
    All done
    Your job_id is 4878

17
VITESS Monitoring It
  • Web Interface

18-21
(No Transcript)
VITESS Merging It
  • Download the chunks
  • Merge data files
    • DetectedNeutrons.dat: concatenate
    • vpipes trajectories count rate
  • Two classes of files
    • 1D
      • Values: sum, divide by number of chunks
      • Errors: square, sum and divide
    • 2D: sum / number of chunks
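
One way to read the merge rules above, written out as formulas. This is a sketch under assumptions: all N chunks use the same neutron count, the chunks are statistically independent, and v_{i,j} and sigma_{i,j} denote the value and error in bin j of chunk i. The slide says only "square, sum and divide" for the errors; the square root below is an assumption taken from standard propagation of independent errors.

```latex
\bar{v}_j = \frac{1}{N}\sum_{i=1}^{N} v_{i,j},
\qquad
\bar{\sigma}_j = \frac{1}{N}\sqrt{\sum_{i=1}^{N} \sigma_{i,j}^{2}}
```

With equal errors in every chunk this gives an error smaller by a factor of sqrt(N) than that of a single chunk, as expected when N times as many neutrons are simulated.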

23
VITESS Advantages and Problems
  • Many times faster: linear increase
  • Needs verification runs (x3)
  • Typically 11 times faster, potentially 30 times
  • 12-hour runs in 1 hour!
  • Very large simulations reach random limits

24
VITESS Some Results
176 hours 59 hours
6hrs 20mins
25
McStas Splitting It
  • Different executable for every run
  • Executable must be uploaded at run time
  • Split n into chunks
  • or run many instances (parameter scan)
  • Create data (and executable) packages
  • Upload packages

26
McStas Running It
  • Use McGui to create and compile executable
  • Create input file for Submit program

27
McStas Running It
  • Large run
  • Submit program breaks up n
  • Uploads new command line, data and executable
  • Parameter Scan
  • Send each run to a separate machine
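
A parameter scan amounts to one command line per scan point. The sketch below builds those command lines, assuming the compiled McStas instrument accepts `--ncount=`, `--seed=`, `--dir=` and `name=value` arguments (check the executable's `--help` to confirm). The function name `scan_commands`, the executable name `wish.exe`, the parameter `slit_width` and the `scan_NNN` directory names are hypothetical.

```python
def scan_commands(executable, scan_param, values, ncount, fixed=None, base_seed=1):
    """Build one command line (as an argument list) per scan point."""
    fixed = fixed or {}
    commands = []
    for i, value in enumerate(values):
        cmd = [
            executable,
            f"--ncount={ncount}",
            f"--seed={base_seed + i}",   # assumed: a distinct seed per run
            f"--dir=scan_{i:03d}",       # hypothetical output directory name
        ]
        # Parameters held constant over the scan, then the scanned one.
        cmd += [f"{name}={val}" for name, val in fixed.items()]
        cmd.append(f"{scan_param}={value}")
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in scan_commands("wish.exe", "slit_width", [0.01, 0.02, 0.03], 1000000):
        print(" ".join(cmd))
```

Each argument list can then be packaged as its own workunit, so that every scan point runs on a separate machine.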

28
McStas Merging It
  • Many output files, so a separate merge program
  • PGPLOT and Matlab implemented
  • Very similar
  • PGPLOT
    • 1D intensities: sum and divide. Errors: square,
      sum and divide. Events: sum
    • 2D intensities: sum and divide. Errors: square,
      sum and divide. Events: sum
  • Matlab
    • 1D: same maths, different format
    • 2D: virtually the same
  • Metadata: leave untouched
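
The merge arithmetic can be sketched bin by bin as below. This is a minimal sketch, not the actual merge program: it assumes every chunk was run with the same neutron count, that each chunk is given as three equal-length lists (intensity, error, events), and that the errors are combined in quadrature, so a square root is taken before dividing. The function name `merge_chunks` is hypothetical.

```python
import math


def merge_chunks(chunks):
    """Merge per-chunk (intensity, error, events) lists, bin by bin."""
    n = len(chunks)
    if n == 0:
        raise ValueError("nothing to merge")
    n_bins = len(chunks[0][0])
    merged_i, merged_e, merged_n = [], [], []
    for b in range(n_bins):
        # Intensities: sum and divide by the number of chunks.
        merged_i.append(sum(c[0][b] for c in chunks) / n)
        # Errors: square, sum, take the root (assumed), then divide.
        merged_e.append(math.sqrt(sum(c[1][b] ** 2 for c in chunks)) / n)
        # Events: plain sum.
        merged_n.append(sum(c[2][b] for c in chunks))
    return merged_i, merged_e, merged_n


if __name__ == "__main__":
    chunk_a = ([2.0, 4.0], [3.0, 0.0], [10, 20])
    chunk_b = ([4.0, 8.0], [4.0, 0.0], [30, 40])
    print(merge_chunks([chunk_a, chunk_b]))
```

A 2D monitor can be handled the same way by flattening its rows into one list before merging and reshaping afterwards.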

29
McStas Advantages and Problems
  • Security: do we trust users?
  • 100 times faster?
  • Linux version much faster than Windows?
  • How do we merge certain fields?
    • values '1.44156e+006 10459.9 30748'
    • statistics 'X0=3.5418 dX=1.52975
      Y0=0.000822474 dY=1.0288'
  • Some issue related to randomness of moderator
    file

30
Future Developments - Expansion
  • Expansion
  • Proposal accepted for an additional 400 licenses
  • Giving us a total of 480
  • Change in licensing model

50k
45k
  • Bottom Line Costs
  • Setup, server licenses, 80 client licenses
    support 18k CMSD

50k
  • Total 250k

83k
31
Conclusions
  • Both run well under Grid MP
  • Submit and retrieve: a few hours' work
  • Merge: a bit more
  • Needs to merge more output formats?
  • Issues with very large simulations
  • More info on Grid MP at www.ud.com