Organic Grid: self-organizing computational biology on desktop grids

Transcript and Presenter's Notes


1
Organic Grid: self-organizing computational
biology on desktop grids
  • A. J. Chakravarti
  • MathWorks, Inc.
  • G. Baumgartner
  • M. Lauria
  • Dept. of Computer Science and Engineering
  • Ohio State University

2
Large scale systems for HPC
  • How to organize a computation on large-scale,
    loosely organized distributed systems?
  • Challenges: distribution of code and data, load
    balancing, and fault tolerance on a highly
    dynamic, uncontrollable, and unpredictable
    platform
  • Current approaches (desktop/traditional Grids)
    have limitations:
  • only independent tasks, centralized scheduling,
    etc.
  • Proposal: a distributed, adaptive scheduling scheme
  • HPC = High Performance Computing

3
Approach
  • Problem:
  • organizing computation on a system composed of
    10^5 to 10^6 nodes
  • Constraints:
  • Minimal knowledge about the system
  • Completely decentralized decision process
  • Solution: design a system in which the pieces of
    a computation are autonomous and make independent
    decisions based on local knowledge
  • Take inspiration from models of organization of
    complex systems in nature (principle of
    emergence)
  • Elements interacting through simple rules →
    complicated patterns

4
Organic Grid principles
  • Basic concept 1:
  • The tasks that compose an application are
    implemented as mobile and autonomous agents
  • Decentralized organization is achieved through
    the agents' autonomous behavior
  • Every agent follows simple rules for:
  • Communicating data
  • Deciding which computation to perform
  • Authentic peer-to-peer approach
  • Every machine in the system can start a
    computation or contribute to someone else's
    computation
  • i.e., it can either generate its own population
    of agents or accept someone else's

5
Organic Grid principles
  • Basic concept 2:
  • Mobile agents organize themselves autonomously
    into an overlay network
  • Overlay network construction is based on minimal
    initial knowledge of the underlying system
  • Every machine has a friend list (neighbor
    list):
  • URLs of a handful of other machines
  • The overlay network is not prearranged
  • it emerges on the fly from the interaction
    between friends

6
Organic Grid principles
  • Basic concept 3:
  • Each agent makes decisions based on locally
    obtained knowledge of the system
  • Agents make use of feedback from the environment
    (passive resource monitoring)
  • Measuring friends' performance provides most of
    the information required for decision making
  • Keeping track of the total execution time for an
    assigned subtask → provides an aggregate evaluation
    of link bandwidth, computational performance, and
    CPU cycles available on friends

7
Challenge
  • The challenge: creating an agent behavior that
    addresses the following problems:
  • Task distribution
  • Data distribution
  • Collection of results
  • Fault Tolerance
  • Detection of termination

8
Overlay Tree
  • Every node has a parent and children
  • Every node has an ancestor list and a children list
  • Every node has a former children list
  • Every node has a subtask list

[Figure: per-node lists, with labels: subtask queue (waiting, executing, done); history of parent (previous); children list (potential, active); history of children.]
9
Subtasks
  • Independent Task Application (ITA):
  • all subtasks are identical
  • each subtask is generally large
  • subtasks can be executed independently

10
Requesting and Sending Subtasks
  • A node requests s subtasks from its parent
  • when it is idle (no computations left to
    perform)
  • At the beginning, it asks its neighbors for subtasks
  • A parent sends subtasks to its children
  • when it receives a request for S subtasks from a child
  • and it has subtasks waiting to be performed:
  • if the size of the subtask list is > S, send S subtasks
  • if the size of the subtask list is < S, send all
    subtasks in the list
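
The sending rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `send_subtasks` and the use of a `deque` as the subtask list are hypothetical, and the case where the list holds exactly S subtasks (which the slide does not spell out) is treated as "send S".

```python
from collections import deque


def send_subtasks(subtask_queue: deque, requested: int) -> list:
    """Hand out `requested` subtasks, or the whole queue if it holds fewer.

    Hypothetical helper: `subtask_queue` stands in for the parent's
    subtask list and `requested` for S in the slide.
    """
    count = min(requested, len(subtask_queue))
    return [subtask_queue.popleft() for _ in range(count)]


# A parent holding five subtasks answers two requests for S = 3.
queue = deque(range(5))
first = send_subtasks(queue, 3)   # list larger than S: send S subtasks
second = send_subtasks(queue, 3)  # list smaller than S: send everything left
```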

11
Children List
  • A node has up to c active and p potential children
  • Active children:
  • They are ranked by the rate R at which each sends
    r results
  • R: average time intervals
  • Potential children:
  • They have not yet been evaluated by the node (i.e.,
    the parent)
  • i.e., the node has not yet received results from them
  • When the size of the children list becomes > c:
  • The slowest child is removed from the children list
  • It is added to the former children list (max size o)
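
A minimal sketch of this bookkeeping, assuming each active child is summarized by a single number (its average time interval per burst of r results, lower is faster). The class `ChildLists` and its field names are hypothetical; only the limits c and o and the eviction rule come from the slide, and dropping the oldest former child when that list overflows is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ChildLists:
    max_active: int                               # c in the slide
    max_former: int                               # o in the slide
    active: dict = field(default_factory=dict)    # child -> average interval (s)
    former: list = field(default_factory=list)    # most recently evicted last

    def record(self, child: str, avg_interval: float) -> None:
        """Store a child's measured interval, evicting the slowest if over c."""
        self.active[child] = avg_interval
        if len(self.active) > self.max_active:
            slowest = max(self.active, key=self.active.get)
            del self.active[slowest]
            self.former.append(slowest)
            while len(self.former) > self.max_former:
                self.former.pop(0)  # assumption: forget the oldest former child

    def ranking(self) -> list:
        """Active children ordered fastest first."""
        return sorted(self.active, key=self.active.get)


lists = ChildLists(max_active=2, max_former=1)
lists.record("a", 5.0)
lists.record("b", 2.0)
lists.record("c", 9.0)   # third child exceeds c = 2; "c" is slowest and is evicted
lists.record("d", 1.0)   # now "a" is slowest; it replaces "c" in the former list
```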

12
Report Results
  • Result-burst size r
  • A node reports results to its parent
  • when it has collected r results (either its own or from
    its children)
  • A parent manages the ranking of its children by:
  • the time taken to obtain r (R1) results from a
    child
  • recalculating that time whenever it receives results from
    any child
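
The result-burst rule can be illustrated with a small buffer that forwards results only once r of them have accumulated. The class name `ResultBuffer` and the `send` callback are hypothetical stand-ins for the agent's messaging layer, which the slides do not describe.

```python
class ResultBuffer:
    """Collect results and forward them to the parent in bursts of r."""

    def __init__(self, burst_size: int, send) -> None:
        self.burst_size = burst_size  # r in the slide
        self.send = send              # hypothetical callback: delivers a burst
        self.pending = []

    def add(self, result) -> None:
        """Store one result (own or from a child); flush when r are pending."""
        self.pending.append(result)
        if len(self.pending) >= self.burst_size:
            self.send(self.pending)
            self.pending = []


# Seven results with r = 3 produce two bursts; one result is still pending.
sent = []
buffer = ResultBuffer(burst_size=3, send=sent.append)
for value in range(7):
    buffer.add(value)
```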

13
Updating Tree
  • A node periodically informs its parent about its
    best-performing child
  • The parent checks whether that child is in its former
    children list
  • If not, the parent adds the child

14
Self-Adjustment
  • A node requests t subtasks (s at the
    beginning), adjusting t to increase the utilization
    of its resources
  • Once it has finished computing its subtasks (i.e.,
    it is idle), it
  • compares the average time to compute a subtask on
    this run with that of the previous run
  • Depending on the comparison,
  • the node requests i(t), d(t), or t subtasks in the
    next request
  • Functions i(t), d(t):
  • Constant, Linear, Exponential
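
One way to read the self-adjustment rule is sketched below. The mapping from the comparison to i(t), d(t), or t is an assumption (a faster run grows the request, a slower run shrinks it, an equal run keeps it), and the exponential choices i(t) = 2t and d(t) = t/2 are just one of the three function families listed; the function name is hypothetical.

```python
def next_request_size(t, prev_avg, curr_avg,
                      increase=lambda t: 2 * t,            # assumed exponential i(t)
                      decrease=lambda t: max(1, t // 2)):  # assumed exponential d(t)
    """Pick the size of the next subtask request.

    t        -- number of subtasks requested last time
    prev_avg -- average seconds per subtask on the previous run
    curr_avg -- average seconds per subtask on this run
    """
    if curr_avg < prev_avg:   # this run was faster: ask for more
        return increase(t)
    if curr_avg > prev_avg:   # this run was slower: ask for fewer
        return decrease(t)
    return t                  # no change: keep the same request size


grown = next_request_size(4, prev_avg=10.0, curr_avg=8.0)
shrunk = next_request_size(4, prev_avg=8.0, curr_avg=10.0)
```

Passing `increase=lambda t: t + 1` would give the linear variant instead.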

15
Fault Tolerance
  • If the parent of a node becomes inaccessible:
  • The node removes the parent from its ancestor list
  • The node sends a message to the (a-1)-th node in the
    list
  • and repeats until an ancestor agrees to be its new parent
  • If the size of the ancestor list reaches 0:
  • The node must find a new parent
  • The node sends requests to its neighbors (starting again)
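
The recovery procedure can be sketched as a walk back through the ancestor list. The layout is an assumption: the list is ordered oldest first with the current parent last, and `accepts` is a hypothetical callback standing in for the message exchange with an ancestor.

```python
from typing import Callable, List, Optional


def recover_parent(ancestors: List[str],
                   accepts: Callable[[str], bool]) -> Optional[str]:
    """Find a new parent after the current one became inaccessible.

    Returns the first ancestor that accepts, or None when the list is
    exhausted and the node must ask its neighbors again.
    """
    ancestors.pop()                  # remove the unreachable parent
    while ancestors:
        candidate = ancestors[-1]    # next most recent ancestor
        if accepts(candidate):
            return candidate
        ancestors.pop()              # candidate refused or is unreachable
    return None


# old1 refuses, old2 accepts and becomes the new parent.
chain = ["old2", "old1", "parent"]
new_parent = recover_parent(chain, accepts=lambda node: node == "old2")
```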

[Figure: a node's ancestor chain before (parent, old1, old2) and after (parent, old2) recovery.]
16
BLAST
  • A tool for gene sequence similarity search
  • Step 1: Given a query sequence Q, compile the list
    of possible words that form high-scoring word
    pairs with words in Q.
  • Query: PQGKLTVNQ; k-words (k = 3): PQG, QGK,
    GKL, ...; neighbors: PKG, PLG, PTG, QPK, QGK,
    QLK, ...
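
The k-word part of Step 1 is simple enough to show directly. The sketch below only enumerates the overlapping k-words of the query; generating the neighbor words would additionally need a substitution matrix and a score threshold, which are not reproduced here. The function name `k_words` is hypothetical.

```python
def k_words(query: str, k: int = 3) -> list:
    """Return every overlapping word of length k in the query sequence."""
    return [query[i:i + k] for i in range(len(query) - k + 1)]


words = k_words("PQGKLTVNQ")
# words == ['PQG', 'QGK', 'GKL', 'KLT', 'LTV', 'TVN', 'VNQ']
```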

17
BLAST
  • Step 2: Scan the database for exact matches with the
    list of words compiled in step 1.
  • Score each of the words (based on a PAM/BLOSUM
    matrix)
  • List those with score > T (e.g. T = 12): PQG(18),
    PEG(15), PRG(14), ..., PSG(13), PQA(12), ...
[Figure label: seeds]
18
BLAST
  • Step 3: Extend the hits from step 2.
  • For each word in the list, extend the word in
    both directions
  • until its score no longer increases
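
The extension rule as stated on the slide can be sketched with a toy scoring scheme. Everything specific here is an assumption: the score is +1 for a match and -1 for a mismatch rather than a PAM/BLOSUM matrix, the extension is ungapped, and it stops at the first residue that would not raise the score. The function name `extend_hit` is hypothetical.

```python
def extend_hit(query: str, subject: str, q_start: int, s_start: int, k: int):
    """Extend a k-word hit in both directions while the score increases.

    Returns (query_start, query_end, score) with query_end exclusive.
    """
    def score(a: str, b: str) -> int:
        return 1 if a == b else -1   # toy scoring, not PAM/BLOSUM

    q_lo, s_lo = q_start, s_start
    q_hi, s_hi = q_start + k, s_start + k
    total = sum(score(a, b)
                for a, b in zip(query[q_lo:q_hi], subject[s_lo:s_hi]))

    # Extend to the right while the next pair raises the score.
    while (q_hi < len(query) and s_hi < len(subject)
           and score(query[q_hi], subject[s_hi]) > 0):
        total += 1
        q_hi += 1
        s_hi += 1

    # Extend to the left in the same way.
    while q_lo > 0 and s_lo > 0 and score(query[q_lo - 1], subject[s_lo - 1]) > 0:
        total += 1
        q_lo -= 1
        s_lo -= 1

    return q_lo, q_hi, total


# The seed "GKL" sits at query index 2 and subject index 3.
start, end, hit_score = extend_hit("PQGKLTVNQ", "AAQGKLTAA", 2, 3, 3)
```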

[Figure: hit extension between query and database sequences]
19
BLAST
  • Report all alignments with score > threshold

20
Experiments
  • BLAST (a tool for gene sequence similarity search)
  • Ran the BLAST code on a set of machines across Ohio
    (Cluster Ohio)
  • Overlay network (random configuration)

21
Measurement
  • Child propagation
  • Result-burst size
  • Evaluation for children management
  • Self-adjustment
  • Prefetching of subtasks (efficient use of idling)
  • The number of children

22
Effect of Child propagation
  • Propagation enabled: running time 2294 sec.
  • Propagation disabled: running time 3035 sec.
23
Effect of varying Result Burst Size
[Figure panels: result burst size 1; result burst size 8]
24
Effect of Task Prefetching
  • The right amount of prefetching (asking for a
    bunch of tasks at a time) helps because it reduces
    idle time between computations

25
Effect of Prefetching Ramp Up
  • Faster nodes are allowed to ask for a larger and
    larger number of tasks
  • The increase is in response to feedback from the
    node (i.e., the amount of time it takes to perform
    its computations)
  • An exponential increase seems to work better in
    this case

26
Effect of Number of Children per Node
[Figure: ramp-up time]
27
Related work
  • Hierarchical:
  • Adaptive Master/Worker: Heymann et al.
  • Two-level scheduling: Santoso et al.
  • Peer-to-Peer:
  • Bandwidth-centric: Beaumont et al.
  • Uniform task distribution: Montresor et al.
  • Market economy, auction-based scheduling:
  • G-commerce: Wolski et al.

28
Conclusions and Future work
  • Based on preliminary results, this scheme for
    decentralized scheduling is promising
  • With small modifications, scheduling can be made
    fault tolerant
  • Working Organic Grid prototype that is used to
    run a real application (BLAST)
  • Demonstrated a very structured application
    (Cannon's algorithm for matrix-matrix
    multiplication) in addition to ITA

29
Note (check list)
  • LASI
  • BLAST
  • Strong mobile agent