Title: CG Architecture
Experiments with Distributed Training of Neural Networks on the Grid
Maciej Malawski (1), Marian Bubak (1,2), Elzbieta Richter-Was (3,4), Grzegorz Sala (3,5), Tadeusz Szymocha (3)
(1) Institute of Computer Science AGH, Mickiewicza 30, 30-059 Kraków, Poland
(2) Academic Computer Centre CYFRONET, Nawojki 11, 30-950 Kraków, Poland
(3) Institute of Nuclear Physics, Polish Academy of Sciences, Kraków, Poland
(4) Institute of Physics, Jagiellonian University, Kraków, Poland
(5) Faculty of Physics and Applied Computer Science AGH, Kraków, Poland
bubak,malawski_at_agh.edu.pl, elzbieta.richter-was_at_cern.ch, sala_at_fatcat.ftj.agh.edu.pl, Tadeusz.Szymocha_at_ifj.edu.pl
- Challenges
- Neural network training is a highly compute-intensive task, so it may need High Performance Computing
- Finding the optimal configuration may be time consuming: many experiments with various parameters may need High Throughput Computing
- Target application
- High Energy Physics
- Discrimination between signal and background events coming from the particle detector (simulation)
- ROOT and Athena as basic data analysis tools
- Why neural networks
- Once trained, they are efficient and accurate
- Applicable for classification and prediction
- Proven in a wide range of applications
- Solution: The Grid
- Distributing the computation over a cluster of machines can significantly reduce computation time.
- Utilizing resources (multiple clusters) available on the Grid can make this task less time consuming for the researcher.
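One common way to distribute training over a cluster is synchronous data parallelism: each worker computes a partial gradient on its own shard of the training set, and the partial gradients are summed before the weights are updated. The sketch below illustrates that idea on a single sigmoid neuron in plain Python. It is an illustration under stated assumptions, not necessarily the algorithm investigated by the authors; the toy data set, the function names and the learning rate are all hypothetical, and the workers are simulated by a sequential loop.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient(weights, samples):
    """Sum of log-loss gradients of a single sigmoid neuron over `samples`."""
    grad = [0.0] * len(weights)
    for features, label in samples:
        z = sum(w * x for w, x in zip(weights, features))
        error = sigmoid(z) - label
        for i, x in enumerate(features):
            grad[i] += error * x
    return grad

def split(samples, n_workers):
    """Round-robin split of the data set into one shard per worker."""
    return [samples[i::n_workers] for i in range(n_workers)]

def distributed_step(weights, shards, learning_rate):
    """One synchronous step: partial gradients per shard, summed, then one update."""
    # On a real cluster each partial gradient would be computed by a separate worker.
    partials = [gradient(weights, shard) for shard in shards]
    total = [sum(column) for column in zip(*partials)]
    n_samples = sum(len(shard) for shard in shards)
    return [w - learning_rate * g / n_samples for w, g in zip(weights, total)]

def predict(weights, features):
    z = sum(w * x for w, x in zip(weights, features))
    return 1 if sigmoid(z) > 0.5 else 0

# Hypothetical toy data: (features including a bias term, label); 1 = signal, 0 = background.
DATA = [([1.0, 0.2, 0.1], 0), ([1.0, 0.9, 0.8], 1), ([1.0, 0.1, 0.3], 0),
        ([1.0, 0.8, 0.7], 1), ([1.0, 0.3, 0.2], 0), ([1.0, 0.7, 0.9], 1)]

weights = [0.0, 0.0, 0.0]
shards = split(DATA, 3)
for _ in range(200):
    weights = distributed_step(weights, shards, learning_rate=0.5)
```

Because the summed partial gradients equal the full-batch gradient, a step with three shards gives the same weights as a step on one machine, up to floating-point rounding. The gain is in wall-clock time, since the per-shard work can run concurrently.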
- Observation
- Training of neural networks on the Grid requires many repeated tasks:
- job preparation,
- submission,
- monitoring of status,
- gathering results.
- Performing them manually is time consuming for the researcher
- Therefore, tools that automate such tasks can facilitate the whole process considerably.
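The four repeated tasks listed above (job preparation, submission, status monitoring, gathering results) can be sketched as a small driver loop. This is a minimal sketch, not the authors' tool: `LocalBackend` is a hypothetical stand-in that runs jobs in local threads instead of calling any real Grid middleware, and `run_training` is a placeholder whose score is invented purely so the example runs.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

def prepare_jobs(param_grid):
    """Job preparation: one job description per parameter combination."""
    names = sorted(param_grid)
    combos = itertools.product(*(param_grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

def run_training(params):
    """Placeholder for one training run; the score is made up, not a real result."""
    return {"params": params,
            "score": params["hidden_units"] * params["learning_rate"]}

class LocalBackend:
    """Hypothetical stand-in for a Grid submission service, backed by local threads."""

    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._futures = {}

    def submit(self, job_id, params):
        self._futures[job_id] = self._pool.submit(run_training, params)

    def status(self, job_id):
        return "DONE" if self._futures[job_id].done() else "RUNNING"

    def result(self, job_id):
        return self._futures[job_id].result()  # blocks until the job has finished

def run_experiments(param_grid, backend):
    jobs = prepare_jobs(param_grid)
    for job_id, params in enumerate(jobs):            # submission
        backend.submit(job_id, params)
    statuses = {job_id: backend.status(job_id)        # one monitoring snapshot
                for job_id in range(len(jobs))}
    results = [backend.result(job_id)                 # gathering results
               for job_id in range(len(jobs))]
    return statuses, results

# Hypothetical parameter grid for a set of classification experiments.
GRID = {"hidden_units": [5, 10, 20], "learning_rate": [0.01, 0.1]}
statuses, results = run_experiments(GRID, LocalBackend())
best = max(results, key=lambda r: r["score"])
```

Keeping the backend behind `submit`, `status` and `result` means the same driver could be pointed at a different submission mechanism without changing the experiment logic.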
- Our Goals
- Develop tools that facilitate use of the Grid for multiple classification experiments
- Investigate and validate algorithms for distributed neural network training
- Allow seamless integration with data analysis tools such as ROOT
- Testbed for our experiments: EGEE project
- Virtual Organization for Central Europe
- Grid sites: CYFRONET Kraków, PSNC Poznan, KFKI Budapest, CESNET Prague, TU Kosice
- Support for MPI applications