Title: Dynamic adaptation of parallel codes: Toward self-adaptable components for the Grid
1 Dynamic adaptation of parallel codes: Toward self-adaptable components for the Grid
- Françoise André, Jérémy Buisson, Jean-Louis Pazat
- IRISA / INSA de Rennes / Université de Rennes 1
2 Our view of the Grid
[Figure labels: Application; Cluster resource (three instances); WAN; Our point of interest]
3 Our view of the Grid
[Figure labels: Cluster resource; Processor resources; Network resource]
4 Our view of the Grid
- Environment that is
- Parallel
- Grid is built up from parallel machines
- Dynamic
- Resource allocation may change dynamically
- Distributed
- Resources are distributed over a network
- Resources are in different administration domains
- Need for a new programming technique
- Parallel self-adaptable distributed components
5 Related work
- Parallel and distributed components / objects exist
- Examples: GridCCM, PARDIS
- Self-adaptable components exist
- Examples: ACEEL, DART
- But there is no parallel and self-adaptable distributed component
6 Principles of parallel components
- Encapsulation of a parallel code
- Collaboration of several communicating processes
- Goal: make it easy to couple parallel codes
7 Principles of dynamic adaptation
- Modification of the executed code
- Reflective programming
- Goal: better fit the allocated resources
Execution flow
1. Event
2. Reaction
8 Dynamic adaptation
- Three key questions
- When should the component adapt?
- How should the component be modified?
- Where can the reaction be executed?
9 Dynamic adaptation
- When should the component adapt?
- Upon reception of an event from a monitor
- According to the policy
[Figure: the Monitor notifies the Decider of events; the Decider interprets the Adaptation policy]
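The "when" question above can be illustrated with a minimal sketch, assuming the adaptation policy is a list of explicit event-based rules. The class names follow the figure labels; the event kinds, the reaction names and the `notify` / `observe` methods are hypothetical and are not taken from the original work.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    kind: str   # hypothetical event name, e.g. "processors_added"
    value: int  # hypothetical payload, e.g. how many processors changed


# Assumption: a policy is a list of (condition, reaction name) rules.
Rule = Tuple[Callable[[Event], bool], str]


class Decider:
    """Interprets the adaptation policy when a monitor notifies an event."""

    def __init__(self, policy: List[Rule]) -> None:
        self.policy = policy
        self.decisions: List[str] = []

    def notify(self, event: Event) -> Optional[str]:
        # The first matching rule wins; no match means "do not adapt".
        for condition, reaction in self.policy:
            if condition(event):
                self.decisions.append(reaction)
                return reaction
        return None


class Monitor:
    """Observes the platform and forwards events to the decider."""

    def __init__(self, decider: Decider) -> None:
        self.decider = decider

    def observe(self, kind: str, value: int) -> Optional[str]:
        return self.decider.notify(Event(kind, value))


policy: List[Rule] = [
    (lambda e: e.kind == "processors_added" and e.value > 0, "spawn_processes"),
    (lambda e: e.kind == "processors_removed" and e.value > 0, "shrink_processes"),
]
decider = Decider(policy)
monitor = Monitor(decider)
chosen = monitor.observe("processors_added", 2)   # matches the first rule
ignored = monitor.observe("network_latency", 5)   # no rule matches
```

In this sketch an event that matches no rule simply leaves the component unchanged, which is one possible reading of "according to the policy".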
10 Dynamic adaptation
- How should the component be modified?
- Executing special code
- Following directives of the policy
[Figure: the Decider interprets the Adaptation policy and requests execution of reactions from the Coordinator; the Coordinator executes the Reaction]
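The "how" question can be sketched as follows, assuming a reaction is an ordinary callable that modifies the component, and that the coordinator queues the requests it receives from the decider and executes them later. `Component`, `grow`, `request` and `execute_pending` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List


class Component:
    """Hypothetical component state that a reaction is allowed to modify."""

    def __init__(self) -> None:
        self.nprocs = 2
        self.behavior = "behavior_1"


Reaction = Callable[[Component], None]


def grow(component: Component) -> None:
    """Special code executed as a reaction: use more processes."""
    component.nprocs += 2
    component.behavior = "behavior_2"


class Coordinator:
    """Executes the reactions requested by the decider."""

    def __init__(self, component: Component, reactions: Dict[str, Reaction]) -> None:
        self.component = component
        self.reactions = reactions
        self.pending: List[str] = []

    def request(self, name: str) -> None:
        # Called by the decider: record the request, do not run it yet.
        if name not in self.reactions:
            raise KeyError(f"unknown reaction: {name}")
        self.pending.append(name)

    def execute_pending(self) -> int:
        # Run queued reactions in request order; return how many ran.
        executed = 0
        while self.pending:
            self.reactions[self.pending.pop(0)](self.component)
            executed += 1
        return executed


component = Component()
coordinator = Coordinator(component, {"grow": grow})
coordinator.request("grow")
before = (component.nprocs, component.behavior)
ran = coordinator.execute_pending()
after = (component.nprocs, component.behavior)
```

Separating `request` from `execute_pending` reflects the idea that a reaction is not run at an arbitrary moment; when it may run is the "where" question.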
11 Dynamic adaptation
- Where can the reaction be executed?
- At the next adaptation point
- Approximated prediction of the next point
- Based on control flow graph
[Figure: control flow of Behavior 1 and Behavior 2 with a Reaction; points are marked either "An adaptation point" or "Not an adaptation point"]
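As an illustration of predicting the next adaptation point from a control flow graph, here is a minimal sketch. It assumes the graph is a dictionary from node to successor nodes and that adaptation points are a set of nodes; the node names are invented. Because the outcome of a branch is not known in advance, the function returns every candidate next point, which is why the prediction is only approximate.

```python
from collections import deque
from typing import Dict, List, Set


def next_adaptation_points(
    cfg: Dict[str, List[str]], adaptation_points: Set[str], current: str
) -> Set[str]:
    """Adaptation points reachable from `current` without crossing another one."""
    found: Set[str] = set()
    seen: Set[str] = set()
    queue = deque(cfg.get(current, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        if node in adaptation_points:
            found.add(node)  # candidate next point: do not explore past it
            continue
        queue.extend(cfg.get(node, ()))
    return found


# Hypothetical control flow graph of an iterative code.
cfg = {
    "init": ["compute"],
    "compute": ["test"],
    "test": ["exchange", "end"],  # unpredictable conditional
    "exchange": ["compute"],
    "end": [],
}
points = {"exchange", "end"}
candidates = next_adaptation_points(cfg, points, "compute")
```

Here the conditional at `test` yields two candidates, so a reaction requested while the code is in `compute` could be executed at either `exchange` or `end`.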
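12 Dynamic adaptation
[Figure: overall architecture, spanning Platform and Component. The Monitor notifies the Decider of events; the Decider interprets the Adaptation policy and requests execution of reactions from the Coordinator; the Coordinator executes the Reaction, which modifies the Behavior.]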
13 Mixing parallelism and adaptation
[Figure: overall architecture, spanning Platform and Component. The Monitor notifies the Decider of events; the Decider interprets the Adaptation policy and requests execution of reactions from the Parallel coordinator; the Parallel coordinator executes the Parallel reaction, which modifies the Parallel behaviors (three instances shown).]
14 Mixing parallelism and adaptation
- Introduction of global adaptation points
- All the processes at the same state
- Need to coordinate all the processes
- Example: SPMD code
- Adaptation point between each phase
[Figure labels: Local adaptation point; Global adaptation points; Not global adaptation points]
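The "same state" criterion for SPMD code can be sketched as follows, assuming each local adaptation point is identified by a pair (iteration, label of the point between two phases). The phase names and both functions are hypothetical.

```python
from typing import List, Tuple

# Assumption: a local adaptation point is (iteration, label after a phase).
Point = Tuple[int, str]


def local_points(iterations: int, phases: List[str]) -> List[Point]:
    """Local adaptation points of one SPMD process, in execution order."""
    return [(i, f"after_{p}") for i in range(iterations) for p in phases]


def is_global_point(positions: List[Point]) -> bool:
    """True when every process stands at the same local adaptation point."""
    return len(positions) > 0 and len(set(positions)) == 1


points = local_points(2, ["compute", "exchange"])
same = is_global_point([points[1]] * 3)                      # all at (0, "after_exchange")
mixed = is_global_point([points[1], points[1], points[2]])   # one process is ahead
```

The `mixed` case is a combination of local adaptation points that does not form a global one, because the processes are not at the same state.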
15 Mixing parallelism and adaptation
- Need for a distributed algorithm for the parallel coordinator
- Only consider globally reachable points
- In the future of all the processes
- Reach an agreement among all the processes
- Choose the same point for all the processes
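One simple way to reach such an agreement is sketched below. It is not the distributed algorithm of the original work: it assumes that points are totally ordered (here as `(iteration, phase index)` tuples) and that every process passes through every later point, as in an iterative SPMD code. Under those assumptions the earliest point in the future of all the processes is the maximum of the points proposed by each process. Threads and a barrier stand in for processes and a collective reduction.

```python
import threading
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # assumption: (iteration, phase index), totally ordered


def run_agreement(next_points: List[Point]) -> List[Optional[Point]]:
    """Each simulated process proposes its next point; all choose the maximum."""
    n = len(next_points)
    proposals: List[Optional[Point]] = [None] * n
    chosen: List[Optional[Point]] = [None] * n
    barrier = threading.Barrier(n)

    def process(rank: int) -> None:
        proposals[rank] = next_points[rank]  # publish own next reachable point
        barrier.wait()                       # every proposal is now visible
        chosen[rank] = max(proposals)        # earliest point in everyone's future

    threads = [threading.Thread(target=process, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return chosen


agreed = run_agreement([(3, 1), (4, 0), (3, 2)])
```

Every simulated process ends up with the same point, `(4, 0)`, which none of them has passed yet. In a message-passing setting the barrier plus `max` would typically be replaced by a max-reduction across processes.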
16 Mixing parallelism and adaptation
- Need to control the non-determinism
- Due to parallelism
- Dynamically insert synchronization statements
- Due to unpredictable conditional instructions
- Force the result of the conditions if possible
- Example: insertion of empty iterations in loops
- Otherwise postpone the decision-making
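One possible reading of "insertion of empty iterations" is sketched below: when the agreed adaptation point lies in an iteration that a process would not otherwise execute, the loop condition is forced to stay true and the extra iterations do no work. The function and its log format are hypothetical, and the sketch assumes the adaptation point sits at the start of an iteration.

```python
from typing import List


def run_loop(work_iterations: int, agreed_iteration: int) -> List[str]:
    """Run a loop of natural length `work_iterations`, padded with empty
    iterations so that the point at the start of `agreed_iteration` is reached."""
    log: List[str] = []
    i = 0
    # The second operand forces the loop condition until the agreed point.
    while i < work_iterations or i <= agreed_iteration:
        if i == agreed_iteration:
            log.append(f"adapt@{i}")   # the reaction would be executed here
        if i < work_iterations:
            log.append(f"work@{i}")    # normal iteration
        elif i < agreed_iteration:
            log.append(f"empty@{i}")   # inserted empty iteration
        i += 1
    return log


short = run_loop(2, 3)  # would normally stop before the agreed point
long = run_loop(5, 3)   # reaches the agreed point on its own
```

The process whose loop is too short executes one empty iteration and then reaches the agreed point; the other process is unaffected apart from executing the reaction.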
17 Experiment
- Experiment
- Iterative SPMD code
- Adaptation points between each iteration
- Increase of the number of processors
- Results
- Negligible time in adaptation points
- Gain thanks to the adaptation
- Expected to scale well
[Figure label: Adaptation]
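To make "increase of the number of processors" concrete, the sketch below shows one thing a reaction might have to do between two iterations: recompute a block distribution of the data for the new number of processes. This is an assumption for illustration only; the slides do not say how the experiment redistributes data, and `block_ranges` is a hypothetical helper.

```python
from typing import List, Tuple


def block_ranges(n_items: int, n_procs: int) -> List[Tuple[int, int]]:
    """Half-open index range [start, end) owned by each process."""
    base, extra = divmod(n_items, n_procs)
    ranges: List[Tuple[int, int]] = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` processes take one additional item.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


before_adaptation = block_ranges(10, 2)  # two processes
after_adaptation = block_ranges(10, 4)   # after two processors are added
```

Comparing the two lists tells each process which index ranges it must send or receive when the reaction is executed at the adaptation point.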
18 Related domains
Man-Machine Interface
- Computation steering
- Notions equivalent to global adaptation points
- Need to execute some special code at the next special point
- Particular use of adaptation mechanisms
- User interface instead of monitors
[Figure: same architecture as on slide 13 (Monitor, Decider, Parallel coordinator, Parallel reaction, Adaptation policy, Parallel behaviors, Platform, Component)]
19 Related domains
- Fault tolerance
- Consider dynamic environment
- Need for a global consistent state
- In the past for fault tolerance
- In the future for dynamic adaptation
- Relation to dynamic adaptation
- An application?
- A complementary feature?
20 Work done
- Design of the overall architecture
- Identification of functional boxes
- Distributed algorithm for the coordinator
- Automated instrumentation by static behavioral reification
- Simple negotiation protocol
- Demonstration prototype
- Ad-hoc mechanisms
- Proof of concept
21 Future work
- Generalizing the approach
- Generic definition of global adaptation points
- Limits of the "same state" definition
- Case of non-SPMD codes
- Expression of the adaptation policy
- Limits of explicit event-based rules
- Need for more sophisticated (intelligent?) policies
- Smoothing measures of resource availability
- Balancing instabilities
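Smoothing measures of resource availability could, for instance, use an exponential moving average, so that a short dip does not trigger an adaptation. This is only a sketch of one candidate technique, not something the slides propose; the sample values, the `alpha` weight and the threshold are invented.

```python
from typing import List, Optional


def smooth(samples: List[float], alpha: float = 0.5) -> List[float]:
    """Exponential moving average: new = alpha * sample + (1 - alpha) * old."""
    smoothed: List[float] = []
    current: Optional[float] = None
    for sample in samples:
        if current is None:
            current = sample  # the first sample initializes the average
        else:
            current = alpha * sample + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# Hypothetical measures of available processors, with one transient dip.
raw = [8, 8, 2, 8, 8]
filtered = smooth(raw, alpha=0.25)

THRESHOLD = 4  # hypothetical level below which the policy would shrink
raw_triggers = any(v < THRESHOLD for v in raw)
filtered_triggers = any(v < THRESHOLD for v in filtered)
```

With these values the raw series crosses the threshold once, while the smoothed series never does, so a policy reading the smoothed measure would ignore the transient dip. The price is a slower response to a lasting change, which is part of balancing instabilities.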
22 Future work
- Collaborative adaptation of components
- Control side-effects
- Avoid adaptation cycles
- Common policy at the level of
- A group of components
- A composite
- The whole application
- Consider full Grid applications
- Not only their components