Organic Grid: self-organizing computational biology on desktop grids

1
Organic Grid: self-organizing computational
biology on desktop grids

A.J.Chakravarti
MathWorks, Inc.
G. Baumgartner
M. Lauria
Dept. of Computer Science and Engineering
Ohio State University

2
Large scale systems for HPC

How can a computation be organized on large-scale,
loosely organized distributed systems?
Challenges: distribution of code and data, load
balancing, and fault tolerance on a highly
dynamic, uncontrollable, and unpredictable
platform
Current approaches (desktop/traditional Grids)
have limitations:
only independent tasks, centralized scheduling,
etc.
Proposal: a distributed, adaptive scheduling scheme
HPC (High Performance Computing)

3
Approach

Problem:
organizing computation on a system composed of
10^5 to 10^6 nodes
Constraints:
Minimal knowledge about the system
Completely decentralized decision process
Solution: design a system in which the pieces of
a computation are autonomous and take independent
decisions based on local knowledge
Take inspiration from models of organization of
complex systems in nature (principle of
emergence)
Elements interacting through simple rules →
complicated patterns

4
Organic Grid principles

Basic concept 1
The tasks that compose an application are
implemented as mobile and autonomous agents
Decentralized organization is achieved through
the agents' autonomous behavior
Every agent follows simple rules for
Communicating data
Deciding which computation to perform
Authentic peer-to-peer approach
Every machine in the system can start a
computation, or contribute to someone else's
computation,
i.e. it can either generate its own population
of agents, or accept someone else's

5
Organic Grid principles

Basic concept 2
Mobile agents organize themselves autonomously
into an overlay network
Overlay network construction is based on minimal
initial knowledge of the underlying system
Every machine has a friend list (neighbor
list):
URLs of a handful of other machines
The overlay network is not prearranged;
it emerges on the fly from the interaction
between friends

6
Organic Grid principles

Basic concept 3
Each agent takes decisions based on locally
obtained knowledge of the system
Agents make use of feedback from the environment
(passive resource monitoring)
Measuring friends' performance provides most of
the information required for decision making
Keeping track of the total execution time for an
assigned subtask → provides an aggregate evaluation
of link bandwidth, computational performance, and
available CPU cycles of friends

7
Challenge

The challenge: creating an agent behavior that
addresses the following problems
Task distribution
Data distribution
Collection of results
Fault Tolerance
Detection of termination

8
Overlay Tree

Every node has a parent and children
Every node has an ancestor list and a children list
Every node has a former children list
Every node has a subtask list

[Figure: overlay tree node and its lists. Labels: executing, previous, waiting, done, (queue), (history of parent), (children list), potential, active, (history of children)]

9
Subtasks

Independent Task Application (ITA)
All subtasks are identical
Each subtask is generally large
Subtasks can be executed independently

10
Request and Send Subtasks

A node requests s subtasks from its parent
when it is idling (it has no computations left to
perform)
At the beginning, it asks its neighbors for subtasks
A parent sends subtasks to a child
when it receives a request (for s subtasks) from the child
and it has subtasks that still need to be performed:
if the size of the subtask list is > s, it sends s subtasks
if the size of the subtask list is < s, it sends all
subtasks in the list
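
The sending rule above can be sketched as follows. This is a minimal illustration, not the Organic Grid implementation; the function name `take_subtasks` and the use of a `deque` for the subtask list are assumptions made for the example.

```python
from collections import deque


def take_subtasks(subtask_queue: deque, s: int) -> list:
    """Remove and return up to s subtasks from the front of the queue.

    If the queue holds more than s subtasks, exactly s are returned;
    otherwise everything that is left is returned.
    """
    count = min(s, len(subtask_queue))
    return [subtask_queue.popleft() for _ in range(count)]


# Hypothetical subtask names, for illustration only.
queue = deque(["t1", "t2", "t3", "t4", "t5"])
first = take_subtasks(queue, 3)   # list is larger than s: send s subtasks
second = take_subtasks(queue, 3)  # list is smaller than s: send all of it
```

After the two calls, `first` holds three subtasks, `second` holds the remaining two, and the parent's list is empty.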

11
Children List

A node has up to c active and p potential children
Active children
They are ranked by the rate R at which each sends
r results
R: average time interval
Potential children
They have not yet been evaluated by the node (i.e.
the parent),
i.e. it has not yet received results from them
When the size of the children list becomes > c
The slowest child is removed from the children list
It is added to the former children list (max size o)
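
A rough sketch of the eviction rule described above. It tracks only evaluated (active) children and ignores the potential list for brevity; the function name `record_burst_time` and the dictionary layout are assumptions for illustration, and o is assumed to be at least 1.

```python
def record_burst_time(children: dict, former: list, child: str,
                      seconds: float, c: int, o: int) -> None:
    """Store the time a child took to deliver r results, then prune.

    children maps child name -> latest time for r results (lower is faster).
    former is the former children list, oldest entry first, capped at o.
    """
    children[child] = seconds
    if len(children) > c:
        # The slowest child is the one with the largest time for r results.
        slowest = max(children, key=children.get)
        del children[slowest]
        former.append(slowest)
        # Drop the oldest entries so the list never exceeds o.
        while len(former) > o:
            former.pop(0)


children, former = {}, []
for name, seconds in [("a", 4.0), ("b", 9.0), ("c", 2.5), ("d", 7.0)]:
    record_burst_time(children, former, name, seconds, c=2, o=1)
```

With c = 2 and o = 1, child "b" is evicted when "c" arrives, and "d" is evicted as soon as it is ranked, replacing "b" in the one-slot former children list.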

12
Report Results

Result-burst size (r)
A node reports results to its parent
when it has collected r results (its own or from
children)
A parent maintains the ranking of its children by
the time taken to obtain r (R1) results from a
child
The time is computed whenever it receives results
from any child
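
The reporting rule can be illustrated with a small buffer that forwards results in bursts of r. This is only a sketch; the class name `ResultBuffer` and the `send_to_parent` callback are hypothetical and stand in for whatever messaging the agents actually use.

```python
class ResultBuffer:
    """Collects results and forwards them to the parent in bursts of r."""

    def __init__(self, r, send_to_parent):
        self.r = r
        self.send_to_parent = send_to_parent  # hypothetical callback
        self.pending = []

    def add(self, result):
        """Add one result (computed locally or received from a child)."""
        self.pending.append(result)
        if len(self.pending) >= self.r:
            burst = self.pending[:self.r]
            self.pending = self.pending[self.r:]
            self.send_to_parent(burst)


sent = []  # stands in for the parent's inbox
buffer = ResultBuffer(r=3, send_to_parent=sent.append)
for result in range(7):
    buffer.add(result)
```

Seven results with r = 3 produce two bursts; the seventh result stays buffered until two more arrive.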

13
Updating Tree

A node periodically informs its parent about its
best-performing child
The parent checks whether that child is in its former children
list
If not, the parent adds the child

14
Self-Adjustment

A node requests t subtasks (t = s at the
beginning) and adjusts t to increase the utilization of its
resources
Once it has finished computing its subtasks (i.e. it is
idling), it
compares the average time to compute a subtask on
this run to that of the previous run
Depending on the comparison,
the node requests i(t), d(t), or t subtasks in the
next request
Functions i(t) and d(t):
constant, linear, or exponential
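
A minimal sketch of the self-adjustment rule. Several points are assumptions rather than facts from the slides: that i and d stand for increase and decrease, that a shorter average time triggers i(t) and a longer one triggers d(t), and the concrete exponential forms (doubling and halving) chosen here.

```python
def next_request_size(t, avg_now, avg_prev, increase, decrease):
    """Pick the number of subtasks to request next.

    avg_now / avg_prev are the average times per subtask on this run
    and on the previous run. Assumed policy: faster -> i(t),
    slower -> d(t), unchanged -> t. At least one subtask is requested.
    """
    if avg_now < avg_prev:
        return increase(t)
    if avg_now > avg_prev:
        return max(1, decrease(t))
    return t


def grow_exponential(t):
    """Hypothetical exponential i(t): double the request size."""
    return t * 2


def shrink_exponential(t):
    """Hypothetical exponential d(t): halve the request size."""
    return t // 2


faster = next_request_size(4, 1.8, 2.0, grow_exponential, shrink_exponential)
slower = next_request_size(8, 2.5, 1.8, grow_exponential, shrink_exponential)
same = next_request_size(4, 2.5, 2.5, grow_exponential, shrink_exponential)
```

A linear variant would swap in functions such as `t + 1` and `t - 1`; passing the functions as arguments keeps the comparison logic independent of that choice.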

15
Fault Tolerance

If the parent of a node becomes inaccessible
The node removes the parent from its ancestor list
The node sends a message to the (a-1)-th node in the
list
This repeats until an ancestor agrees to be the new parent
If the size of the ancestor list reaches 0
The node must find a new parent
The node sends requests to its neighbors (starting again)
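
The recovery procedure above might look like the following sketch. It assumes the ancestor list is ordered oldest first with the current parent last; `is_willing` and `ask_neighbors` are hypothetical callbacks standing in for the real message exchange.

```python
def find_new_parent(ancestors, is_willing, ask_neighbors):
    """Walk back through the ancestor list after the parent fails.

    ancestors: list ordered oldest first, failed parent last (assumed).
    is_willing(node): True if that ancestor accepts this node as a child.
    ask_neighbors(): fallback used once the ancestor list is empty.
    """
    ancestors.pop()  # remove the inaccessible parent
    while ancestors:
        candidate = ancestors[-1]  # the (a-1)-th node in the list
        if is_willing(candidate):
            return candidate
        ancestors.pop()  # this ancestor declined or is unreachable
    return ask_neighbors()


# Hypothetical node names, reusing the labels from the slide.
ancestors = ["old2", "old1", "parent"]
new_parent = find_new_parent(
    ancestors,
    is_willing=lambda node: node == "old2",
    ask_neighbors=lambda: "neighbor",
)
```

Here "old1" declines, so the node re-attaches to "old2"; if every ancestor had declined, the node would start over by asking its neighbors.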

[Figure: fault-recovery diagram. Labels: parent, old1, old2, node]

16
BLAST

Tool for gene sequence similarity search
Step 1: Given a query sequence Q, compile the list
of possible words that form high-scoring word pairs
with words in Q.
Query: PQGKLTVNQ; k-words (k = 3): PQG, QGK,
GKL, ...; neighbors: PKG, PLG, PTG, QPK, QGK,
QLK, ...
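
The k-word part of step 1 is a sliding window over the query. The sketch below shows only that part (the neighbor words require a substitution matrix and are not reproduced here); the function name `k_words` is chosen for this example.

```python
def k_words(query: str, k: int = 3) -> list:
    """Return every length-k substring of the query, in order."""
    return [query[i:i + k] for i in range(len(query) - k + 1)]


words = k_words("PQGKLTVNQ")
# words == ['PQG', 'QGK', 'GKL', 'KLT', 'LTV', 'TVN', 'VNQ']
```

A query of length 9 yields 9 - 3 + 1 = 7 words, the first three of which are the ones listed on the slide.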

17
BLAST

Step 2: Scan the database for exact matches with the
list of words compiled in step 1.
Score each of the words (based on a PAM/BLOSUM
matrix)
List those with score ≥ T (e.g. T = 12): PQG(18),
PEG(15), PRG(14), ..., PSG(13), PQA(12), ...
[Figure label: seeds]

18
BLAST

Step 3: Extend the hits from step 2.
For each word in the list, extend the match in
both directions
until its score no longer increases
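
A simplified sketch of the extension step. It assumes an identity score (a matching residue raises the score, a mismatch does not), so extension stops at the first mismatch in each direction; BLAST itself scores with a substitution matrix and a drop-off rule, which this example does not reproduce. The function name and the subject sequence are hypothetical.

```python
def extend_hit(query: str, subject: str, q_pos: int, s_pos: int, k: int = 3):
    """Extend a k-word hit without gaps while residues keep matching.

    Returns (query_start, subject_start, length) of the extended match.
    """
    left = 0
    while (q_pos - left - 1 >= 0 and s_pos - left - 1 >= 0
           and query[q_pos - left - 1] == subject[s_pos - left - 1]):
        left += 1
    right = k  # the seed itself already matches
    while (q_pos + right < len(query) and s_pos + right < len(subject)
           and query[q_pos + right] == subject[s_pos + right]):
        right += 1
    return q_pos - left, s_pos - left, left + right


# Seed "QGK" sits at query position 1 and subject position 2.
q_start, s_start, length = extend_hit("PQGKLTVNQ", "AAQGKLTAA", 1, 2)
segment = "PQGKLTVNQ"[q_start:q_start + length]
```

In this example the seed cannot grow to the left (P versus A) but gains two residues on the right, giving the segment "QGKLT".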

[Figure labels: query, database]

19
BLAST

Report all alignments with score > threshold

20
Experiments

BLAST (tool for gene sequence similarity search)
Ran the BLAST code on a set of machines across Ohio
(Cluster Ohio)
Overlay network (random configuration)

21
Measurement

Child propagation
Result-burst size
Evaluation for children management
Self-adjustment
Prefetching of subtasks (efficient use of idling)
The number of children

22
Effect of Child Propagation

Propagation enabled: running time 2294 sec.
Propagation disabled: running time 3035 sec.

23
Effect of Varying Result-Burst Size

[Figure panels: result-burst size 1; result-burst size 8]

24
Effect of Task Prefetching

The right amount of prefetching (asking for a
bunch of tasks at a time) helps because it reduces
idle time between computations

25
Effect of Prefetching Ramp Up

Faster nodes are allowed to ask for larger and
larger numbers of tasks
The increase is in response to feedback from the
node (i.e. the amount of time it takes to perform
its computations)
An exponential increase seems to
work better in this case

26
Effect of Number of Children per Node

[Figure: ramp-up time]

27
Related work

Hierarchical
Adaptive master/worker (Heymann et al.)
Two-level scheduling (Santoso et al.)
Peer-to-peer
Bandwidth-centric (Beaumont et al.)
Uniform task distribution (Montresor et al.)
Market economy, auction-based scheduling
G-commerce (Wolski et al.)

28
Conclusions and Future work

Based on preliminary results, this scheme for
decentralized scheduling is promising
With minor modifications, the scheduling can be made
fault tolerant
A working Organic Grid prototype is used to
run a real application (BLAST)
Demonstrated a highly structured application
(Cannon's algorithm for matrix-matrix
multiplication) in addition to ITA

29
Note (check list)