Title: PowerPoint Presentation
1 Geant4 medical simulations in a distributed
computing environment
S. Guatelli, A. Mantero, J. Moscicki, M. G. Pia
4th Workshop on Geant4 Bio-medical Developments
Geant4 Physics Validation
INFN Genova, 13-20 July 2005
2 Outline
- Problem: how to obtain a quick response
- Brief introduction to DIANE
- How to parallelize a Geant4 application
- Project
- Parallelization of two medical physics Geant4
applications
- Brachytherapy
- IMRT application
- Study of the performance
- Using a dedicated cluster
- Using the GRID
- First results
- Work in progress
3 Execution time - Brachy
- Number of events for sufficient statistics in a
dosimetric study of a single brachytherapy
source: 20 M events
- Execution time of 20 M events on a Pentium IV, 3
GHz: 16650 s (about 5 h)
- Clinical use: quick response means order of
minutes
4 Execution times - IMRT
- Number of events for sufficient statistics: 10^9
events
- Execution time of 10^9 events:
822890 s = 228 h = 9 and a half days
- Quick response required for clinical use
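The gap between these sequential run times and a clinical "order of minutes" can be put in numbers. The sketch below is only a back-of-the-envelope estimate: the 10-minute target is a hypothetical reading of "order of minutes", and the helper `ideal_cpus` assumes a perfect split of the work with no overhead, which a real system will not reach.

```python
import math

# Sequential figures quoted on the slides.
BRACHY_EVENTS = 20_000_000
BRACHY_SECONDS = 16650.0
IMRT_SECONDS = 822890.0

# Hypothetical target: "order of minutes" taken here as 10 minutes.
TARGET_SECONDS = 10 * 60


def ideal_cpus(sequential_seconds, target_seconds):
    """CPUs needed if the job split perfectly, with zero overhead (idealised)."""
    return math.ceil(sequential_seconds / target_seconds)


brachy_ms_per_event = BRACHY_SECONDS / BRACHY_EVENTS * 1e3
brachy_cpus = ideal_cpus(BRACHY_SECONDS, TARGET_SECONDS)
imrt_cpus = ideal_cpus(IMRT_SECONDS, TARGET_SECONDS)

print(f"Brachy: {BRACHY_SECONDS / 3600:.1f} h sequential, "
      f"{brachy_ms_per_event:.2f} ms/event, at least {brachy_cpus} CPUs")
print(f"IMRT:   {IMRT_SECONDS / 86400:.1f} days sequential, "
      f"at least {imrt_cpus} CPUs")
```

Under these assumptions the brachytherapy case needs at least 28 CPUs and the IMRT case at least 1372, which is why a small dedicated cluster may suit the former while the latter points towards larger distributed resources.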
5 Speed adequate for clinical use
- Parallelisation: transparent configuration in
sequential or parallel mode
- Access to distributed computing resources:
transparent access to the GRID through an
intermediate software layer
6 Project
- Parallelization of the Geant4 IMRT and
Brachytherapy applications
- Parallelization through DIANE
- Performance test
- Run on a single machine
- Run on a dedicated cluster
- Run on the GRID
7 DIANE DIstributed ANalysis Environment
Previous studies for parallelization of a Geant4
based medical application
Geant4 Simulation and Anaphe Analysis on a
dedicated Beowulf Cluster, S. Chauvie et al., IRCC
Torino, Siena 2002
- speed OK
- but expensive hardware investment and maintenance
IMRT
Alternative strategy: DIANE
Transparent access to a distributed computing
environment
- Parallelisation
- Access to the GRID
8 DIANE DIstributed ANalysis Environment
Hide complex details of underlying technology
- Parallel cluster processing
- make fine tuning and customisation easy
- transparently using GRID technology
- application independent
Developed by J. Moscicki, CERN
http://cern.ch/DIANE
9 Practical example
- How to dianize the Geant4 application
- Look at the Geant4 extended example ExDIANE
in the parallel directory
- Completely transparent to the user: same G4 code
- Documentation specific to Geant4 applications is
available at www.cern.ch/diane/
10 Run through DIANE
# -*- python -*-
Application = "G4Analysis"
WorkerInitData = {
    'G4ApplicationComponentName' : "G4MedLinac",
    'initMacroFile' : """
/control/verbose 1
/run/verbose 1
/control/saveHistory
/run/initialize
/tracking/storeTrajectory 1
/Jaws/X1/DistanceFromAxis -5.0 cm
/Jaws/X2/DistanceFromAxis 5.0 cm
/Jaws/Y1/DistanceFromAxis -5.0 cm
/Jaws/Y2/DistanceFromAxis 5.0 cm
/Jaws/update
/energy 6.0 MeV
/sourceType 0.127 MeV
"""
}
JobInitData = {
    'runParams' : {
        'seed' : 0,
        'eventNumber' : 100000,
        'macroFileTemplate' : "/run/beamOn ",
    }
}
Example of a macro file
11 Run in parallel mode on a dedicated cluster
- Type the command
- Diane.startjob j macrofileName.mac w2_at_cluster wmsIPLIST
- IPLIST is the name of a file containing the list of the
names of the machines the user intends to use
12 Practical Example
- example: Geant4 simulation with analysis
- the total number of events of the simulation is
divided into tasks
- each task produces a file with histograms
- job result: sum of the histograms produced by the tasks
- master-worker model
- client starts a job
- workers perform tasks and produce histograms
- master integrates the results
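The task splitting and histogram summation described above can be sketched in a few lines of Python. This is an illustration of the master-worker pattern only, not DIANE code: `split_events`, `run_task`, `merge` and `run_job` are hypothetical names, the "simulation" is a toy random fill standing in for a Geant4 run, and a local thread pool stands in for the remote workers.

```python
import random
from concurrent.futures import ThreadPoolExecutor

N_BINS = 10  # toy histogram size, chosen only for the illustration


def split_events(total_events, n_tasks):
    """Divide total_events into n_tasks chunks whose sizes differ by at most one."""
    base, extra = divmod(total_events, n_tasks)
    return [base + (1 if i < extra else 0) for i in range(n_tasks)]


def run_task(task_id, n_events, seed=0):
    """Stand-in for one simulation task: fill a histogram with n_events toy values."""
    rng = random.Random(seed + task_id)  # distinct seed per task
    histogram = [0] * N_BINS
    for _ in range(n_events):
        histogram[min(int(rng.random() * N_BINS), N_BINS - 1)] += 1
    return histogram


def merge(histograms):
    """Master step: bin-by-bin sum of the histograms returned by the tasks."""
    return [sum(bins) for bins in zip(*histograms)]


def run_job(total_events, n_tasks, n_workers):
    """Client view: split the job, let the workers run the tasks, merge the results."""
    chunks = split_events(total_events, n_tasks)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        histograms = list(pool.map(run_task, range(n_tasks), chunks))
    return merge(histograms)


result = run_job(total_events=10_000, n_tasks=8, n_workers=4)
print(result, sum(result))
```

Because every task uses its own seed, the merged histogram holds exactly the requested number of events regardless of how many workers process the tasks.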
13 Resources of this project
- Dedicated cluster of 4 PCs in Genova (Pentium
IV, 3 GHz)
- For preliminary tests
- Dedicated cluster of 30 PCs (biprocessors) (Xeon,
2.8 GHz)
- Thanks to H.C. Lee, Academia Sinica Computing
Center, Taiwan
- LSF cluster at CERN
- To study a real case of running on a cluster
shared with other users
- GRID
- Run on a distributed computing environment
14 Running on a single CPU
- To study the efficiency of DIANE
- Plot of
- with respect to the number of events
- Execution time using DIANE means running the
dianized simulation sequentially, dividing
the job into tasks
- Useful study to optimize the number of events per
task
Figure (preliminary): overhead of DIANE, Brachy-Iridium source simulation
15 Preliminary results
- Running on a dedicated cluster
- Divide the total number of events in tasks
- Dispatch the tasks on more workers
- Execution times of the Brachy application with respect to the
number of CPUs used
- No merging of the output files of the simulations
16 Preliminary results
- Running on a dedicated cluster
- Efficiency
- The efficiency is higher with a higher number of
tasks
Preliminary
N is the number of CPUs
17 Performance Optimization
- Why is the efficiency higher with a higher number
of tasks?
- With few tasks the execution time is longer because
one task is still running after the others have ended
- Splitting the job into more tasks improves the
balance of execution times among the workers
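The load-balancing argument can be illustrated with a small scheduling model. This is a sketch under stated assumptions, not a measurement: the efficiency is taken as the common definition T_1 / (N * T_N), with N the number of CPUs (the deck's own formula did not survive extraction), one of the four workers is assumed to run at half speed (for example because it is shared), and `makespan` and `efficiency` are hypothetical helper names.

```python
import heapq


def makespan(n_tasks, total_work, speeds):
    """Finish time when each free worker pulls the next equal-sized task."""
    task_work = total_work / n_tasks
    # Heap of (time at which the worker becomes free, worker index).
    free_at = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(n_tasks):
        t, w = heapq.heappop(free_at)       # earliest available worker
        done = t + task_work / speeds[w]    # slower worker takes longer
        finish = max(finish, done)
        heapq.heappush(free_at, (done, w))
    return finish


def efficiency(n_tasks, total_work, speeds):
    """Assumed definition: T_1 / (N * T_N), with T_1 measured at speed 1.0."""
    t_1 = total_work / 1.0
    return t_1 / (len(speeds) * makespan(n_tasks, total_work, speeds))


SPEEDS = [1.0, 1.0, 1.0, 0.5]  # hypothetical cluster: one worker at half speed
TOTAL_WORK = 120.0             # arbitrary time units on a speed-1.0 CPU

for n in (4, 40):
    print(n, makespan(n, TOTAL_WORK, SPEEDS), round(efficiency(n, TOTAL_WORK, SPEEDS), 3))
```

With 4 tasks the job ends only when the slow worker finishes its single large task (60 time units); with 40 tasks the fast workers absorb most of the work and the job ends after 36 time units, so the efficiency in this model rises from 0.5 to about 0.83.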
18 Work in progress
- Optimization of the method to merge the output
files of the tasks
- At present the merging introduces a
significant overhead on the results
- We found problems with adding histograms with PI
- A. Pfeiffer and L. Moneta are helping with this
last task
- Last step: running on the GRID
- Refine the results