Title: ACI MD GDS
1. ACI MD GDS
- The middleware for GDS
- http://graal.ens-lyon.fr/diet
2. Plan
- Resource reservation in a hierarchical ASP
- Automatic deployment
- DIET in P2P
- DIET vs NetSolve
- VizDIET
- Communications in DIET
- An application for GDS
3. Context
- One long-term idea for Grid computing: renting computational power and
  memory capacity over the Internet (very high potential)
- Need for Problem Solving Environments (PSEs)
  - Applications need more and more memory capacity and computational power
  - Some proprietary libraries or environments need to stay in place
  - Some libraries or applications are difficult to install
  - Some confidential data must not circulate over the net
  - Use of computational servers accessible through a simple interface
  - Need for schedulers
- Moreover
  - Still difficult to use for non-specialists
  - Almost no transparency
  - Security and accounting issues usually not addressed
  - PSEs are often application-dependent
  - Lack of standards (CORBA, JAVA/JINI, sockets, ...) to build the
    computational servers
4. RPC and Grid computing → GridRPC
- A simple idea
  - RPC programming model for the Grid
  - Use of distributed collections of heterogeneous platforms on the Internet
  - For applications that require memory capacity and/or computational power
  - Task-parallelism programming model (synchronous/asynchronous) + data
    parallelism on servers → mixed parallelism
- Needed functionality
  - Load balancing (resource discovery, performance evaluation, scheduling)
  - Fault tolerance
  - Data redistribution
  - Security
  - Interoperability
5. GridRPC
[Diagram: a Client sends a request Op(C, A, B) to the AGENT(s), which
select among the servers S1, S2, S3, and S4]
6. GridRPC (cont.)
- 5 main components
  - Client
    - submits problems to servers
    - gives users interfaces
  - Server
    - solves problems sent by clients
    - runs software
  - Database
    - contains dynamic and static information about software and hardware
      resources
  - Scheduler
    - chooses an appropriate server depending on the problem sent and on
      the information contained in the database
  - Monitor
    - gets information about the status of the computational resources
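The scheduler's role described above can be sketched in a few lines: match the static database entries against the request, then rank the matching servers by an estimate that folds in the monitor's dynamic load. The data layout and selection rule below are illustrative assumptions, not the GridRPC or DIET API:

```python
# Hypothetical sketch of the scheduler component: combine static
# database entries (which server runs which software, benchmark times)
# with the monitor's dynamic load to pick a server.

def choose_server(problem, database):
    """Return the database entry of the best server for `problem`,
    or None when no server offers the required software."""
    candidates = [s for s in database if problem in s["software"]]
    if not candidates:
        return None
    # Smaller predicted time wins: static benchmark scaled by load.
    return min(candidates,
               key=lambda s: s["benchmark"][problem] * (1 + s["load"]))

database = [
    {"name": "S1", "software": {"dgemm"}, "benchmark": {"dgemm": 2.0}, "load": 0.8},
    {"name": "S2", "software": {"dgemm"}, "benchmark": {"dgemm": 3.0}, "load": 0.0},
]
best = choose_server("dgemm", database)  # S2: 3.0 < S1's 2.0 * 1.8 = 3.6
```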
7. DIET - Distributed Interactive Engineering Toolbox -
- Hierarchical architecture for improved scalability
- Distributed information in the tree
- Plug-in schedulers
[Diagram: interconnected Master Agents (MA), each heading a tree of
agents (A) and Local Agents (LA); the server front end answers the
client over a direct connection]
8. FAST - Fast Agents System Timer -
- NWS-based (Network Weather Service, UCSB)
- Computational performance
  - Load, memory capacity, and performance of batch queues (dynamic)
  - Benchmarks and modeling of available libraries (static)
- Communication performance
  - To be able to guess the data redistribution cost between two servers
    (or between clients and servers) as a function of the network
    architecture and dynamic information
  - Bandwidth and latency (hierarchical)
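The two kinds of estimate above (static benchmarks corrected by dynamic load, plus a bandwidth/latency model of the links) combine into a single prediction of where a request should run. The linear models and names below are assumptions for illustration, not the FAST interface:

```python
# Hedged sketch of a FAST-style prediction: static benchmark data
# corrected by dynamic NWS-style measurements, plus the classic
# latency + size/bandwidth communication model.

def compute_time(benchmark_s, cpu_load):
    """Static benchmark time scaled by the dynamically measured load."""
    return benchmark_s * (1 + cpu_load)

def transfer_time(size_mb, latency_s, bandwidth_mb_s):
    """Communication cost: latency plus size over bandwidth."""
    return latency_s + size_mb / bandwidth_mb_s

# Total predicted time to ship 100 MB to a server and solve there:
total = transfer_time(100, 0.01, 50) + compute_time(4.0, 0.5)
# 0.01 + 2.0 + 6.0 = 8.01 s
```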
9. PIF - Propagate Information Feedback -
- Algorithm from distributed-systems research
- Two phases
  - First phase: broadcast
    - Broadcast one message through the tree
  - Second phase: feedback
    - When a node is a leaf (it has no descendants), it sends a feedback
      message to its parent
    - When a parent has received the feedback messages from all its
      descendants, it sends a feedback message to its own parent, and so on
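On a centralized data structure the two phases reduce to a tree traversal: walk down to the leaves, then merge the answers on the way back up. A minimal sketch (the real algorithm runs as message exchanges between distributed agents, not as a local recursion):

```python
# Minimal sketch of the two-phase PIF traversal on a tree. Leaves
# evaluate the request locally (e.g. a FAST interrogation) and each
# parent forwards the merged feedback of all its children.

def pif(tree, node, evaluate):
    """Broadcast a request down from `node`, then aggregate feedback up."""
    children = tree.get(node, [])
    if not children:                      # leaf: answer directly
        return [evaluate(node)]
    feedback = []                         # parent: wait for all children
    for child in children:
        feedback.extend(pif(tree, child, evaluate))
    return feedback

tree = {"MA": ["LA1", "LA2"], "LA1": ["S1", "S2"], "LA2": ["S3"]}
answers = pif(tree, "MA", evaluate=lambda server: (server, len(server)))
```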
10. PIF and DIET - broadcast phase -
[Diagram: the request travels down the tree from the MA]
1. Broadcast the client's request
2. Sequential FAST interrogation for each LA
3. Resource reservation
11. PIF and DIET - feedback phase -
[Diagram: feedback messages travel back up the tree to the MA]
1. The MA chooses the identity of the most appropriate server (or a list
   of appropriate servers)
2. Unused resources are released
14. Server failure and reactivity
[Diagram: a DIET tree with an MA, agents, four LAs, and servers S1-S16;
server S2 misses dead line 1 while S7, S12, and S15 answer]
- Take server failures into account and increase DIET's reactivity
- Timeout at the LA level
  - Dead line 1 = β1 · Call_FAST_time + β2 · nb_server
15. Hierarchical fault tolerance
[Diagram: the same tree; the branch whose LA manages S7 misses dead
line 2, while the answers for S12 and S15 have already arrived]
- No answer after dead line 1
- Dead line 2 = β3 · level_tree
16. Simulation: SimGrid2
- Real experiments or simulations are often used to test or compare
  heuristics
- Simulations enable reproducible scenarios
- SimGrid: a distributed-application simulator for scheduling-algorithm
  evaluation purposes
  - Designed for distributed heterogeneous platforms
  - Event-driven simulation
  - SimGrid resources (processors, network links) are characterized
    either by fixed values or by values taken from a trace
- SimGrid2: a simulator built using SG; this layer implements realistic
  simulations based on the foundational SG and is more application-oriented
  - Simulations are built in terms of communicating agents
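The event-driven core that such simulators rely on can be illustrated independently of SimGrid's actual API: events sit in a priority queue ordered by timestamp, and handling one event may schedule later ones. A toy version, with all names being assumptions:

```python
# Conceptual sketch of event-driven simulation (NOT the SimGrid API):
# pop events in timestamp order; each handler may enqueue new events.

import heapq
import itertools

def simulate(initial_events):
    """initial_events: iterable of (time, handler); handler(time) returns
    an iterable of (delay, handler) follow-up events."""
    counter = itertools.count()           # tie-breaker for equal times
    queue = [(t, next(counter), h) for t, h in initial_events]
    heapq.heapify(queue)
    trace = []
    while queue:
        time, _, handler = heapq.heappop(queue)
        trace.append(time)
        for delay, follow_up in handler(time):
            heapq.heappush(queue, (time + delay, next(counter), follow_up))
    return trace

# A request that takes 2 time units to reach a server, which then
# computes for 5 units before the simulation ends.
done = lambda t: []
compute = lambda t: [(5, done)]
send = lambda t: [(2, compute)]
times = simulate([(0, send)])             # events at t = 0, 2, 7
```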
17. The DIET SimGrid2 simulator
18. Evaluation of the PIF scheduler
19. Conclusion and future work
- Conclusion
  - Benefit from distributed-systems research
  - Fault tolerance in DIET
    - Server failure
    - Branch failure
  - Resource reservation provides good QoS for client requests
  - DIET SimGrid2 simulator
    - Can be reused to validate other algorithms
- Future work
  - Implementation of tools to guarantee the resource reservation
  - Integrate NWS traces into the simulator
  - How to fix deadlines on a given heterogeneous platform?
20. Plan
- Resource reservation in a hierarchical ASP
- Automatic deployment
- DIET in P2P
- DIET vs NetSolve
- VizDIET
- Communications in DIET
- An application for GDS
21. Automatic deployment
- Problem: take the right number of components (resources) and place
  them in the right way, to increase the overall performance of the
  platform
- Motivation: how to deploy DIET on the Grid?
- Foundation: idea given in the article "Scheduling strategies for
  master-slave tasking on heterogeneous processor grids" by C. Banino,
  O. Beaumont, A. Legrand, and Y. Robert
22. Introduction
- Solution
  - Generate a new structure by arranging the resources according to the
    graph that gives the best throughput
  - For a homogeneous platform, the resources should be arranged in a
    binary-tree-type structure
  - For a heterogeneous platform, more resources should be added by
    checking for bottlenecks in the structure
23. Deployment
24. Deployment
- wi (Mflop/s): computing power of node Pi
- bij: capacity of the link between Pi and Pj (links are symmetric and
  bidirectional)
- Sini: size of the incoming request from the client
- Souti: size of the outgoing request (the response)
- alphaini: fraction of time spent on the computation of the incoming
  request
- alphaouti: fraction of time spent on the computation of the outgoing
  request
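With these notations, a node's achievable request rate in steady state is bounded by its compute rate and by its incoming and outgoing link capacities. The sketch below is a hedged illustration of that bound (the exact DIET/steady-state model from the article may differ; `flops_per_request` and all numbers are assumptions):

```python
# Hedged sketch of a steady-state throughput bound for one node, using
# the slide's notation: w (Mflop/s), b_in/b_out (link capacities),
# S_in/S_out (request/response sizes). All numbers are illustrative.

def node_throughput(w, flops_per_request, b_in, S_in, b_out, S_out):
    """Requests/s the node can sustain: the tightest of the compute,
    incoming-link, and outgoing-link constraints."""
    return min(w / flops_per_request,  # compute-bound rate
               b_in / S_in,            # limited by incoming requests
               b_out / S_out)          # limited by outgoing responses

rate = node_throughput(w=100, flops_per_request=10,
                       b_in=50, S_in=2, b_out=50, S_out=5)
# min(10, 25, 10) = 10 requests/s
```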
25. Operations in steady state
- Calculation of the throughput of a node
26. Calculation of the throughput of a graph
27. Example: calculation of the throughput of a graph
[Figure: the throughput along the example graph is the minimum of the
successive stages, e.g. min(20, 12) = 12 and min(12, 77) = 12]
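If, as the min(·,·) annotations in the figure suggest, the stages compose in sequence, the arithmetic generalizes directly: the steady-state throughput of a chain is limited by its slowest stage. A minimal sketch reproducing the slide's numbers:

```python
# The throughput of a chain is the minimum over its stages, as in the
# slide's example: min(20, 12) = 12, and adding a faster 77-req/s stage
# leaves min(12, 77) = 12 unchanged.

def chain_throughput(stage_rates):
    """Steady-state throughput of a chain = its slowest stage."""
    return min(stage_rates)

r1 = chain_throughput([20, 12])       # 12
r2 = chain_throughput([20, 12, 77])   # still 12
```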
28. Homogeneous Structures
- All nodes have the same computing power and link bandwidth
[Figure: candidate topologies - star graph, 2-depth star graph, binary
tree, 2-chain graph, chain graph]
29. Homogeneous Structures - simulation results (with 8 nodes) -
30. Homogeneous Structures - simulation results (with 32 nodes) -
31. Homogeneous Structures - simulation results (for the binary graph) -
32. Heterogeneous Networks
33. Throughput of network
[Figure: example heterogeneous network with node capacities of 25 and 30
and links labeled 1/200, 1/40, and 1/15; resulting throughput R = 2]
34. Throughput of network by adding LAs
[Figure: the same network after adding LAs; the throughput rises from
R = 2 to R = 2.2 and then to R = 2.65]
35. Heterogeneous Network
36. Experimental results
- 1 client with n requests: no steady state (the MA performed well)
- n clients with n requests: no steady state (the MA performed well)
- Pipeline effect: not enough nodes (clients)
- Buffered the requests at the MA
- A new client implementation to produce a steady-state effect
- The MA failed with 960 requests (due to a memory problem)
37. Experimental results
38. Conclusion
- Select the best structure
- Improve the throughput of the network
- Predict the performance of a structure
- Can find the effects on performance when the structure configuration
  is changed
- The bottleneck is not caused at the MA
39. Conclusion
- Homogeneous
  - A binary-tree-type structure is best
  - The number of nodes is proportional to the number of servers
  - A star-graph-type structure is best when there are few nodes and
    more than 60 servers
- Heterogeneous
  - Find the bottleneck
  - Improve the throughput
  - Modeling of DIET
40. Future work
- Calculate the throughput of structures with multiple clients and
  multiple master agents
- Dynamic updating with the use of the GRAS package
- Addition of a timer to the tool to obtain real values for the CORBA
  implementation of DIET
- Check whether the LA and the SeD cause bottlenecks
- Combine scheduling and deployment to increase performance
- Validation of the work by a real deployment
41. Automatic Deployment: first tool
42. Plan
- Resource reservation in a hierarchical ASP
- Automatic deployment
- DIET in P2P
- DIET vs NetSolve
- VizDIET
- Communications in DIET
- An application for GDS
43. DIET in P2P
- Current status
  - Multi-MA available, with connections over JXTA
  - Documentation available
  - Archive available: diet-0.7_beta-dev-jxta.tgz
- TODO list
  - Evaluate the performance
  - Check compliance with the coding standards
  - Integration into the DIET CVS
  - Break the constraint of 1 JXTA component per DIET component
  - Smart algorithms for traversing the MAs?
[Diagram: several MAs linked by JXTA connections, each heading its own
hierarchy of agents (A), LAs, and servers]
44. Plan
- Resource reservation in a hierarchical ASP
- Automatic deployment
- DIET in P2P
- DIET vs NetSolve
- VizDIET
- Communications in DIET
- An application for GDS
45. DIET vs NetSolve
- Deployment scripts
- Use of CVS to update the configuration files
[Figure: experimental setup - clients on paraski; agents and servers on
sunlabs]
46. DIET vs NetSolve
47. DIET vs NetSolve
- TODO list
  - Tests with the asynchronous API
  - Make the client multithreaded
  - Improve the statistics (dispersion index)
  - Improve the deployment scripts (DIET and omniORB configuration files)
  - Explain the NetSolve results
  - Explain the problem with 40 DIET clients
  - Tests on the SPARCs
  - Tests on icluster2?
48. Plan
- Resource reservation in a hierarchical ASP
- Automatic deployment
- DIET in P2P
- DIET vs NetSolve
- VizDIET
- Communications in DIET
- An application for GDS
49. VizDIET
- Each LogManager collects the information from its agent and sends it
  to the LogCentral, located outside the DIET structure
- VizDIET
  - visualization tool written in Java
  - interaction with the platform
50. VizDIET 1.0
- Integration of LogService (LogManager/LogCentral) into the DIET agents
- Message transfer from the agent through the LogManager
  - no storage on disk
- Study: vizPerf vs VizDIET
  - Conclusion: vizPerf is too far removed from the DIET structure
51. VizDIET 2.0
- VizDIET will collect the information about the DIET structure through
  the LogCentral and display it in real time
- VizDIET must also be able to act on the structure, by generating XML
  scripts (red arrow) or by modifying existing XML information (blue
  arrow)
52. Screenshot
53. Plan
- Resource reservation in a hierarchical ASP
- Automatic deployment
- DIET in P2P
- DIET vs NetSolve
- VizDIET
- Communications in DIET
- An application for GDS
54. What's new in DIET communications
- Asynchronous API
  - Finalize the GridRPC compatibility (errors, handles, ...)
  - Bug in diet_wait_and?
  - Memory leaks? (scaling test)
- PadicoTM
  - Compilation/tests
  - Integration into DIET in progress
  - Limits the set of usable platforms
  - Bug when unloading modules (related to the dlopen function,
    depending on the libc used)
55. Plan
- Resource reservation in a hierarchical ASP
- Automatic deployment
- DIET in P2P
- DIET vs NetSolve
- VizDIET
- Communications in DIET
- An application for GDS
56. An application for GDS?
- GriPPS: Grid Protein Pattern Scanning
  - Bioinformatics application
  - Pattern scanning
- Characteristics
  - Repetitive, short, but very numerous tasks
  - Inputs/outputs
    - text files
    - from a few MB to several GB
  - Requires a short response time
57. FAST