AstroGrid-D WP 5: Resource Management for Grid Jobs

1
AstroGrid-D WP 5: Resource Management for Grid Jobs
  • Report by
  • Rainer Spurzem
  • (ZAH-ARI)
  • spurzem@ari.uni-heidelberg.de
  • and T. Brüsemeister, J. Steinacker

2
Meeting 13:10-14:30 WG5
  • Meeting of WG5 and friends: GridWay discussion
    (together with Ignacio Llorente). Expected list
    of topics:
  • The present GridWay installation in Heidelberg:
    solutions and problems. Which use cases work, and
    how? Demos or screenshots if available.
  • How about more than one GridWay installation in
    AstroGrid-D simultaneously at different sites?
  • Cooperation of information system and job
    submission (in general, or in the special cases of
    our AstroGrid-D information system and GridWay)?
  • Miscellaneous (data staging postponed to the next
    session)

3
Meeting 13:10-14:30 WG5
  • GridWay
  • Lightweight metascheduler on top of GT2.4/GT4
  • Central server architecture
  • Support of the GGF DRMAA standard API for job
    submission and management
  • Simple round-robin/flooding scheduling algorithm,
    but extensible
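The round-robin dispatch mentioned in the last bullet can be sketched in a few lines; the job and site names below are invented for the example:

```python
from collections import deque

# Hypothetical sketch of simple round-robin dispatch: each job is
# assigned to the next resource in circular order, regardless of load.
def round_robin_dispatch(jobs, resources):
    ring = deque(resources)
    schedule = []
    for job in jobs:
        host = ring[0]
        ring.rotate(-1)          # advance to the next resource
        schedule.append((job, host))
    return schedule

assignments = round_robin_dispatch(
    ["job1", "job2", "job3", "job4"],
    ["siteA", "siteB", "siteC"],
)
# job4 wraps around to siteA again
```

A real metascheduler replaces the fixed ring with live resource state, which is where the extensible scheduling hooks come in.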

4
Meeting 13:10-14:30 WG5
A practical example with screenshots:
[Diagram: GridWay as scheduler/broker on
hydra.ari.uni-heidelberg.de; matchmaking between the
information system and the GT4 resources; job status
monitored with gwps]
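The matchmaking step between information system and broker can be sketched as a filter over resource records; the field names (cpus, modules, free_queue_slots) are invented stand-ins for the queue-status and hardware attributes a real information system would publish:

```python
# Illustrative matchmaking sketch: keep only resources that satisfy
# every requirement of the job. Hosts and fields are invented.
resources = [
    {"host": "siteA", "cpus": 8,  "modules": {"mpich"},          "free_queue_slots": 0},
    {"host": "siteB", "cpus": 64, "modules": {"mpich", "grape"}, "free_queue_slots": 5},
]

def match(job, resources):
    """Return hosts that satisfy every requirement of the job."""
    return [
        r["host"] for r in resources
        if r["cpus"] >= job["cpus"]          # enough CPUs
        and job["modules"] <= r["modules"]   # required modules installed
        and r["free_queue_slots"] > 0        # queue not full
    ]

candidates = match({"cpus": 32, "modules": {"grape"}}, resources)
```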
5
Meeting 13:10-14:30 WG5
Our View (Thanks Hans-Martin)
6
Meeting 13:10-14:30 WG5
  • D5.1: central resource broker with queue
  • Present use: GridWay, throughway (pass-through),
    round-robin
  • More installations useful?
  • Questions:
  • Parameters needed by GridWay from the information
    system (queue status, module availability, data
    availability, hardware)
  • When is it feasible to have a real brokerage?
    How?

7
Meeting 15:15-17:00 Use Cases
  • Porting use cases onto the grid: NBODY6.
    Astrophysical case for direct N-body: star
    clusters, galactic nuclei, black holes,
    gravitational wave generation
  • Special hardware: GRAPE, MPRACE (FPGA), future
    technologies (HT, Xtoll, GRAPE-DR)
  • GRAPE in the grid: AstroGrid-D, international
  • DEISA

8
Meeting 15:15-17:00 Use Cases
9
N-Body Gravitational Waves @ ARI: Peter Berczik,
Ingo Berentzen, Jonathan Downing, Miguel Preto,
Gabor Kupi, Christoph Eichhorn
David Merritt (RIT, USA), in VESF/LSC collaboration
on gravitational wave modelling from dense star
clusters
Pau Amaro-Seoane (AEI, Potsdam, D)
G. Schäfer, A. Gopakumar (Univ. Jena, D)
M. Benacquista (UT Brownsville, USA)
Further collaborations: Sverre Aarseth (IoA
Cambridge, UK), Seppo Mikkola (U Turku, FIN)
Jun Makino and colleagues in Tokyo: support and
cooperation over many years
10
Globular Cluster ω Centauri (Central Region)
Ground Based View
11
Detection of Gravitational Waves?
Was Einstein right?
12
Example: the VIRGO detector in Cascina near Pisa,
Italy
13
Basic idea of any GRAPE N-body code
[Diagram: for N particles, all N² pairwise forces
are computed by direct summation]
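The underlying computation is the direct summation of all pairwise gravitational forces, which costs O(N²) operations per step; a minimal sketch (G = 1, arbitrary units, with a softening length eps to avoid the r = 0 singularity):

```python
import math

# Direct-summation sketch of the O(N^2) pairwise force loop that
# GRAPE hardware accelerates. Units are arbitrary (G = 1).
def accelerations(pos, mass, eps=1e-4):
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = dx[0]**2 + dx[1]**2 + dx[2]**2 + eps**2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += mass[j] * dx[k] * inv_r3
    return acc

# Two equal unit masses one unit apart: accelerations are equal
# and opposite, with magnitude close to 1.
a = accelerations([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 1.0])
```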
14
Hardware - GRAPE
128 Gflops for a price of about 5K USD; memory for
up to 128K particles (GRAPE6a PCI board)
  • GRAPE6a, -BL: PCI boards for PC clusters
  • PROGRAPE-4: FPGA-based board from RIKEN (Hamada)
  • GRAPE7: new FPGA-based board from Tokyo Univ.
    (Fukushige)
  • GRAPE-DR: new board from Makino et al., NAOJ
  • MPRACE1,2: FPGA boards from Univ. Mannheim/GRACE
    (Kugel et al.)
15
(No Transcript)
16
ARI 32-node GRAPE6a clusters
Cluster 1 (funding: NSF/NASA/RIT)
  • 32 dual-Xeon 3.0 GHz nodes
  • 32 GRAPE6a
  • 14 TB RAID
  • Infiniband link (10 Gb/s)
  • Speed: 4 Tflops
  • N up to 4M
  • Cost: 500K USD
Cluster 2 (funding: Volkswagen/Baden-Württemberg)
  • 32 dual-Xeon 3.2 GHz nodes
  • 32 GRAPE6a
  • 32 FPGA
  • 7 TB RAID
  • Dual-port Infiniband link (20 Gb/s)
  • Speed: 4 Tflops
  • N up to 4M
  • Cost: 380K EUR

Infiniband: dual 20 Gb/s
17
ARI-ZAH RIT GRAPE6a clusters
Performance Analysis (3.2 Tflop/s) Harfst et
al. 2007, New Astron.
18
(No Transcript)
19
Hardware
20-24
(Slides 20-24: images only, no transcript)
25
Meeting 15:15-17:00 Use Cases
Software: high-accuracy integrators for systems
with long-range forces and (gravothermal) relaxation
  • S.J. Aarseth, S. Mikkola (ca. 20,000 lines)
  • Hierarchical block time steps
  • Ahmad-Cohen neighbour scheme
  • Kustaanheimo-Stiefel and chain regularization
    for bound subsystems of N<6 (quaternions!)
  • 4th-order Hermite scheme (predictor/corrector)
  • Bulirsch-Stoer (for KS)
  • NBODY6 (Aarseth 1999)
  • NBODY6++ (Spurzem 1999) using MPI/shmem, copy
    algorithm
  • Parallel binary integration in progress
  • Parallel GRAPE use (Harfst, Gualandris, Merritt,
    Spurzem, Berczik, Portegies Zwart, 2007)
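The hierarchical block time-step scheme listed above can be sketched in a few lines: a particle's natural time step is rounded down to a power of two of a maximum step, so that groups of particles share common synchronization times. The quantization rule is the standard one; the numbers are illustrative:

```python
import math

# Quantize a natural time step dt down to the nearest block step
# dt_max * 2**(-k), as in hierarchical block time-step schemes.
def block_step(dt, dt_max=1.0):
    k = math.ceil(-math.log2(dt / dt_max))
    return dt_max * 2.0 ** (-k)

# Natural steps 0.3, 0.11, 0.7 land on block levels 0.25, 0.0625, 0.5.
steps = [block_step(dt) for dt in (0.3, 0.11, 0.7)]
```

Because every block step divides every larger one, particles on level k are always synchronized with particles on coarser levels, which is what makes grouped force evaluations (and GRAPE pipelining) efficient.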

26
Meeting 15:15-17:00 Use Cases
High-accuracy integrators: record with GRAPE
cluster at 2 million particles!
  • Harfst, Gualandris, Merritt, Spurzem, Berczik
  • Baumgardt, Heggie, Hut; Baumgardt, Makino
Compiled by D.C. Heggie, via www.maths.ed.ac.uk.
Larger N needed!
27
Meeting 15:15-17:00 Use Cases
ARI cluster: 3.2 Tflop/s sustained
Harfst, Gualandris, Merritt, Spurzem, Portegies
Zwart, Berczik, New Astron. 2007.
Parallel PP on GRAPE6a cluster
28
Visualisation
With S. Dominiczak, W. Frings, John von Neumann
Institute for Computing (NIC), FZ Jülich. Google
for "xnbody".
29
Meeting 15:15-17:00 Use Cases
  • Xnbody visualization with FZ Jülich (Unicore)
  • NBODY6 use case in AstroGrid-D (Globus GT4.0)
  • Simple JSDL job: OK
  • Parallel job with GRAPE/MPRACE request: in
    progress in AstroGrid-D
  • Participation in international networks, like
    MODEST, AGENA (EGEE)
  • Goal: share and load-balance GRAPE/MPRACE
    resources in an international grid-based framework

30
International GRAPE-Grid Collaboration
Meeting 15:15-17:00 Use Cases
Members of AstroGrid-D: ARI-ZAH, Univ. Heidelberg,
D; Main Astron. Obs., Kiev, UA.
Candidates: Univ. Amsterdam, NL; Obs. Astroph.
Marseille, F; Fessenkov Obs., Almaty, KZ
31
Meeting 15:15-17:00 Use Cases
  • NBODY6 requirements:
  • Fortran 77 with cpp preprocessor and make
  • Data access for job chain
  • Staging of binary and ASCII input/output
  • Optional:
  • Parallel runs (PBS, mpich-mpif77, mpirun,
    others)
  • GRAPE hardware
  • xnbody direct visualization and interaction
    interface
  • Future:
  • GridMPI, runs across sites
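The "data access for job chain" requirement can be illustrated with a toy sketch in which each job's output directory is staged in as the next job's input. Plain local directory copies stand in for real grid transfers, and run_chain, the stage directories, and the file names are invented for the example:

```python
import shutil
import tempfile
from pathlib import Path

# Toy "hand data through" staging pattern for a job chain: the output
# of stage i is staged in as the input of stage i+1. Appending a tag
# to the file stands in for an actual NBODY6 run.
def run_chain(initial_input: str, steps: int) -> str:
    work = Path(tempfile.mkdtemp())
    (work / "stage0").mkdir()
    (work / "stage0" / "input.dat").write_text(initial_input)
    for i in range(steps):
        src, dst = work / f"stage{i}", work / f"stage{i + 1}"
        dst.mkdir()
        shutil.copy(src / "input.dat", dst / "input.dat")   # stage-in
        data = (dst / "input.dat").read_text()
        (dst / "input.dat").write_text(data + f"+step{i}")  # "run" the job
    return (work / f"stage{steps}" / "input.dat").read_text()

result = run_chain("ic", 2)
```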

32
Meeting 17:30-18:30 WG5 with WG3
  • Common workgroup meeting of WG3 (Distributed Data
    Management) with WG5 (Resource Management for
    Grid Jobs). Expected list of topics:
  • How can we improve data staging together? Which
    steps, what is needed, action items, people?
  • Further interaction with other WGs, e.g. WG7 user
    interfaces, WG6 data streaming, WG1 system
    integration
  • Next deliverables: 5.4-5.8, others...
  • Open discussion on sustainability,
    internationality, EGEE, follow-up project,
    breakout ideas, guided by last year's goals

33
Meeting 17:30-18:30 WG5 with WG3
  • How can we improve data staging together? Which
    steps, what is needed, action items, people?
  • Use the AstroGrid-D file management system?

34
WP5 Resource Management for Grid Jobs: Tasks
  • Task V-1: Specification of Requirements and
    Architecture
  • AIP (8), ARI-ZAH (6), ZIB (6), AEI (2), MPE
    (2), MPA (1)
  • Start Sep. 05; Deliverable D5.1 Oct. 2006.
    COMPLETED
  • Task V-2: Development of Grid-Job Management
    (Feb. 07)
  • ZIB (24), ARI-ZAH (12), MPA (5)
  • Start June 06; Deliverables D5.2 Feb. 2007,
    D5.6 June 2008
  • D5.2 COMPLETED
  • Task V-4: Adaptation of User and Programmer
    Interfaces (May 07)
  • AIP (18), ARI-ZAH (12), AEI (5), MPE (4),
    MPA (1)
  • Start Dec. 06; Deliverables D5.4 May 2007,
    D5.7 Sep. 2008. PENDING
  • Task V-3: Development of Link to Robotic
    Telescopes, Requests (Feb. 07). AIP (17), ZIB
    (6). Start Sep. 06; Deliverables D5.3 Feb. 2007,
    D5.5 Oct. 2007, D5.8 Sep. 2008. IN PROGRESS

35
Meeting 17:30-18:30 WG5 with WG3
Next steps in WG-5 / WG-3
  • Short term:
  • Improve the deployment by pushing the
    implementation of modules for at least 2-5
    pioneer use cases (this year). D5.4, 5.7
  • Demonstrate the ability to deploy and run these
    use cases on more than one resource using GridWay
    (this year). D5.4, 5.7
  • Use first primitive data staging (handing data
    through)
  • Note: useful document "GridGateWay", 2007-10-05,
    by HMA et al.
  • Middle term:
  • Enable GridWay as AstroGrid-D job manager (May
    08). D5.6
  • Solve the problem of how to handle data
    management together with GridWay (Aug 08). TA II-5
  • Increase the number of use cases and prospective
    users. D5.4
  • Improve international impact / compatibility
    issues, e.g. with EGEE

36
WG5 Current Status: Job Management
Decision in favour of the Job Submission Description
Language (JSDL), which is supported by the Open Grid
Forum (OGF).
[Diagram: GUI → JSDL → jsdlproc → RSL/XML → GT4.0]
(GT4.2 is currently under development and will
support JSDL directly)
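A minimal JSDL job description can be generated with standard XML tooling. The namespace URIs below are the published OGF JSDL 1.0 namespaces; the executable path and argument are placeholders for an NBODY6 run, not the actual AstroGrid-D job description:

```python
import xml.etree.ElementTree as ET

# OGF JSDL 1.0 namespaces (core and POSIX application extension).
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"

# Build the minimal JobDefinition/JobDescription/Application skeleton.
job = ET.Element(f"{{{JSDL}}}JobDefinition")
desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
app = ET.SubElement(desc, f"{{{JSDL}}}Application")
posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
ET.SubElement(posix, f"{{{POSIX}}}Executable").text = "/usr/local/bin/nbody6"
ET.SubElement(posix, f"{{{POSIX}}}Argument").text = "input.dat"

xml_text = ET.tostring(job, encoding="unicode")
```

A converter like the jsdlproc step in the diagram would consume such a document and emit RSL/XML for GT4.0.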
37
WG5 Current Status: Scheduler/Broker
  • GridWay
  • Lightweight metascheduler on top of GT2.4/GT4
  • Central server architecture
  • Support of the GGF DRMAA standard API for job
    submission and management
  • Simple round-robin/flooding scheduling algorithm,
    but extensible

38
WG5 Current Status: Scheduler/Broker
[Diagram: GridWay as scheduler/broker on
hydra.ari.uni-heidelberg.de; matchmaking between the
information system and the GT4 resources; job status
monitored with gwps]
39
WG5 Current Status: Robotic Telescopes
STELLA-I
  • First steps accomplished toward the integration
    into AstroGrid-D
  • Adopted the Remote Telescope Markup Language
    (RTML) and developed a first description of
    STELLA-I
  • This description can contain dynamic information,
    e.g. about weather
  • Developed a generic transformation from RTML to
    RDF, which we can upload to the AstroGrid-D
    information service
  • (For this we modified the program OwlMap from
    the FRESCO project)
  • The user can use SPARQL queries to find
    appropriate telescopes
  • SPARQL queries can also be implemented in tools
    like the Grid-Resource Map

Robotic telescopes STELLA-I and STELLA-II on
Tenerife (Canary Islands)
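What the SPARQL telescope query does can be imitated with a toy triple store: match every (predicate, object) pattern against RDF-style triples and keep the subjects that satisfy all of them. The predicate names and the weather flag are invented for the example, not the actual AstroGrid-D RDF vocabulary:

```python
# RDF-style (subject, predicate, object) triples, as the RTML-to-RDF
# transformation might publish them. All values are illustrative.
triples = [
    ("stella1", "type", "RoboticTelescope"),
    ("stella1", "site", "Tenerife"),
    ("stella1", "weatherOK", "true"),
    ("robotel", "type", "RoboticTelescope"),
    ("robotel", "weatherOK", "false"),
]

def query(patterns):
    """Return subjects that satisfy every (predicate, object) pattern."""
    subjects = {s for s, _, _ in triples}
    for pred, obj in patterns:
        subjects &= {s for s, p, o in triples if p == pred and o == obj}
    return sorted(subjects)

# Find robotic telescopes whose dynamic weather information is good.
usable = query([("type", "RoboticTelescope"), ("weatherOK", "true")])
```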
40
WG5 Next Steps: Robotic Telescopes
  • Next steps:
  • RTML descriptions of STELLA-II, RoboTel and
    other robotic telescopes
  • Develop a system that adds dynamic weather
    information
  • Develop a transformation from RTML to the
    telescope-specific language for AIP-operated
    telescopes, to be able to send observation
    requests in RTML
  • Provide access through AstroGrid-D by applying
    grid security mechanisms and VO management
  • Development of a scheduler for a network of
    robotic telescopes
  • A lot of testing
  • The AIP has a simulator for STELLA and RoboTel