Bulk-Synchronous Parallel ML

1
Frédéric Gava
Bulk-Synchronous Parallel ML: Implementation of the Parallel Superposition
2
Background
Parallel programming
3
Projects
  • 2002-2004
  • ACI Grid
  • LIFO, LACL, PPS, INRIA
  • Design of parallel and Grid libraries for OCaml.
  • 2004-2007
  • ACI "Young researchers"
  • LIFO, LACL
  • Production of a programming environment in which
    certified parallel programs can be written and
    safely executed.

4
Outline
  1. The BSML language
  2. Multi-programming (superposition)
  3. Implementation of the superposition
  4. Conclusion and future works

5
The BSML language
6
The BSML "spirit"
  • "Bugs grow faster than Moore's law." (G. Berry)
  • High-level language ⇒ fewer lines of code ⇒ fewer
    bugs
  • Certified library ⇒ fewer bugs
  • "Small is beautiful." (R. H. Bisseling)
  • BSML uses only 5 primitives
  • "Who would drive a non-deterministic car?" (G.
    Berry)
  • Confluence property of the BSML semantics
  • French proverb: "All roads lead to Rome." But
    the best way is to choose the shortest one
  • One can give BSP costs to BSML programs
  • Difference from concurrent programming: costs and
    confluence

7
The BSP model
BSP architecture
  • Characterized by
  • p: number of processors
  • r: processor speed
  • L: cost of a global synchronization
  • g: cost of a communication phase in which each
    processor sends or receives at most 1 word

8
Model of execution
Beginning of super-step i
Local computing on each processor
Global (collective) communications between processors
Global synchronization: exchanged data are available for the next super-step
$\mathrm{Cost}(i) = \left(\max_{0 \le x < p} w_x^i\right) + h_i \times g + L$
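The cost of one super-step can be evaluated directly from this formula. The following is a minimal sketch, assuming that the local work of each processor is given as an array of floats and that h is the largest number of words sent or received by any processor; the function name `superstep_cost` and the labels `~g`, `~l`, `~h` are illustrative and not part of BSML.

```ocaml
(* Cost of one super-step in the BSP model (illustrative helper, not BSML).
   work : local computation time of each of the p processors
   h    : largest number of words sent or received by one processor
   g    : cost of communicating one word
   l    : cost of the global synchronization (L in the slides) *)
let superstep_cost ~g ~l ~h (work : float array) : float =
  (* The slowest processor determines the local computation phase. *)
  let w_max = Array.fold_left max 0.0 work in
  w_max +. (h *. g) +. l

(* Example: 3 processors, 5 words exchanged at most, g = 2, L = 10. *)
let example_cost = superstep_cost ~g:2.0 ~l:10.0 ~h:5.0 [| 3.0; 7.0; 4.0 |]
```

The cost of a whole program is then the sum of the costs of its super-steps.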
9
Example: broadcast
  • Direct broadcast (one super-step)

BSP cost: $p \times n \times g + L$
  • Broadcast with 2 super-steps

BSP cost: $2 \times n \times g + 2 \times L$
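The two broadcast costs can be compared numerically to see when the second synchronization pays off. This is a small sketch under the cost formulas of this slide, with n the size of the broadcast value in words; the function names are hypothetical and the parameter values in the example are made up for illustration.

```ocaml
(* BSP cost of the direct broadcast: one super-step, p * n * g + L. *)
let direct_broadcast_cost ~p ~n ~g ~l = (float_of_int p *. n *. g) +. l

(* BSP cost of the broadcast in two super-steps: 2 * n * g + 2 * L. *)
let two_phase_broadcast_cost ~n ~g ~l = (2.0 *. n *. g) +. (2.0 *. l)

(* True when paying a second synchronization is worth it. *)
let two_phase_is_cheaper ~p ~n ~g ~l =
  two_phase_broadcast_cost ~n ~g ~l < direct_broadcast_cost ~p ~n ~g ~l

(* Made-up parameters: a large message on 16 processors,
   then a one-word message on 2 processors. *)
let large_message = two_phase_is_cheaper ~p:16 ~n:1000.0 ~g:2.0 ~l:500.0
let small_message = two_phase_is_cheaper ~p:2 ~n:1.0 ~g:2.0 ~l:500.0
```

With these formulas, the two-step version wins once n × g × (p − 2) exceeds L, that is, for large values or many processors.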
10
The BSML language
λ-calculus
  • Structured parallelism as an explicit parallel
    extension of ML
  • Functional language with BSP cost predictions
  • Allows the implementation of skeletons
  • Implemented as a parallel library for the
    "Objective Caml" language
  • Using a parallel data structure called parallel
    vector

11
A BSML program
Replicated part
Sequential part
12
Parallel primitives of BSML
  • Asynchronous primitives
  • Creation of a vector (creation of local values)
  • mkpar: (int → α) → α par
  • Parallel pointwise application
  • apply: (α → β) par → α par → β par
  • Synchronous and communication primitives
  • Communications
  • put: (int → α) par → (int → α) par
  • Projection of local values (to be replicated)
  • proj: α par → (int → α)
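To make the four signatures concrete, here is a sequential simulation in plain OCaml. It is only a sketch: it assumes a fixed number of processors `p = 4` and represents a parallel vector as an ordinary array, whereas the real BSML library distributes the vector over the processors and performs actual communications and synchronizations in `put` and `proj`.

```ocaml
(* Sequential simulation of the BSML primitives (assumption: p = 4 and a
   parallel vector is a plain array; this is not the BSML library itself). *)
let p = 4

type 'a par = 'a array

(* mkpar f builds the vector <f 0, ..., f (p-1)>. *)
let mkpar (f : int -> 'a) : 'a par = Array.init p f

(* apply applies, on each processor, the local function to the local value. *)
let apply (fs : ('a -> 'b) par) (vs : 'a par) : 'b par =
  Array.init p (fun i -> fs.(i) vs.(i))

(* put: on processor i, (f_i j) is the value sent to processor j.
   The result on processor j maps a sender i to the value received from i. *)
let put (fs : (int -> 'a) par) : (int -> 'a) par =
  Array.init p (fun j ->
      let received = Array.init p (fun i -> fs.(i) j) in
      fun i -> received.(i))

(* proj turns a parallel vector into a replicated function pid -> value. *)
let proj (v : 'a par) : int -> 'a = fun i -> v.(i)

(* Usage: square the pids, then let each processor read the pid that its
   left neighbour sent. *)
let pids = mkpar (fun i -> i)
let squares = apply (mkpar (fun _ -> fun x -> x * x)) pids
let shifted =
  apply
    (mkpar (fun i -> fun recv -> recv ((i + p - 1) mod p)))
    (put (mkpar (fun i -> fun _dest -> i)))
```

In this simulation `squares` holds the squared pids and `shifted` holds, on each processor, the pid of its left neighbour.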

13
Semantics
Programming model: easy for proofs (Coq)
Natural semantics
Easy for costs
Execution model: makes asynchronous steps appear; close to a real implementation
14
Natural semantics
  • Semantics: a set of axioms and inference rules
  • Easy to understand, makes proofs easier
  • Example

15
Small steps semantics
Local costs
  • Semantics: a set of rewriting rules
  • Using contexts for the strategy
  • Easier understanding of costs and errors
  • Example

Global cost
16
Distributed semantics
  • Semantics: a set of parallel rewriting rules
  • SPMD style

Parallel vector
Distributed evaluation
17
Multi-programming
18
Parallel composition
  • Several programs on the same machine
  • Primitive of parallel composition: superposition
  • Divide-and-conquer BSP algorithms

19
Parallel Superposition
  • super: (unit → α) → (unit → β) → α × β
  • super E1 E2 = (E1 (), E2 ())
  • Fusion of communications/synchronisations using
    super-threads
  • Keep the BSP model
  • Pure functional semantics
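From the point of view of the pure functional semantics, the superposition simply returns the pair of both results. The sketch below captures only that view in plain OCaml; it deliberately ignores what makes the real primitive interesting, namely that the two computations run as super-threads whose communications and synchronizations are fused.

```ocaml
(* Functional view of the superposition only: both delayed computations
   are evaluated and their results paired. The fusion of communications
   and synchronizations done by the real primitive is not modelled. *)
let super (e1 : unit -> 'a) (e2 : unit -> 'b) : 'a * 'b = (e1 (), e2 ())

(* Usage: two independent computations of different types. *)
let sum, len = super (fun () -> 1 + 2) (fun () -> String.length "bsml")
```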

20
Parallel Superposition
21
Implementation of the superposition
22
Semantics (1)
23
Semantics (2)
24
Semantics based implementation
  • The semantics reveals 3 low-level
    primitives
  • Send: sends the data of the communication
    environment
  • Rcv: receives them
  • Wait: lets a super-thread wait for its sibling
  • BSML primitives are thus simple calls to them
    (as in the small-step semantics)
  • Super-threads can be implemented using threads
  • A scheduler for these threads is thus needed for the
    special management of our super-threads
  • The communication environment is just a
    hash table with the pids of the super-threads as keys
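The last point can be pictured with OCaml's standard `Hashtbl` module. This is a hypothetical sketch of such an environment, not the actual BSML implementation: the `message` record and the `register` and `flush` functions are names invented here, and the scheduler and the Send, Rcv and Wait primitives are left out.

```ocaml
(* Hypothetical message format: destination processor and payload. *)
type message = { dest : int; payload : string }

(* Communication environment: pid of a super-thread -> data it wants to
   send at the next synchronization. *)
let env : (int, message list) Hashtbl.t = Hashtbl.create 16

(* A super-thread records its pending messages under its own pid. *)
let register (pid : int) (msgs : message list) : unit =
  Hashtbl.replace env pid msgs

(* When every super-thread is waiting, the whole environment is collected
   (sorted by pid) so that it can be sent in one fused communication
   phase, and the table is emptied for the next super-step. *)
let flush () : (int * message list) list =
  let all = Hashtbl.fold (fun pid msgs acc -> (pid, msgs) :: acc) env [] in
  Hashtbl.reset env;
  List.sort compare all
```

Keying the table by super-thread pid lets the receiving side hand each piece of data back to the matching super-thread.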

25
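Example: prefix computation
scan: (α → α → α) → α par → α par
$\mathrm{scan}\ (\oplus)\ \langle v_0, \ldots, v_{p-1} \rangle = \langle v_0,\ v_0 \oplus v_1,\ \ldots,\ v_0 \oplus v_1 \oplus \cdots \oplus v_{p-1} \rangle$
$\mathrm{scan}\ (\oplus)\ \langle v_0, \ldots, v_m, \ldots \rangle = \langle w_0, \ldots, w_m, \ldots \rangle$
$\mathrm{scan}\ (\oplus)\ \langle \ldots, v_{m+1}, \ldots, v_{p-1} \rangle = \langle \ldots, w_{m+1}, \ldots, w_{p-1} \rangle$
$\langle w_0, \ldots, w_m,\ w_m \oplus w_{m+1},\ \ldots,\ w_m \oplus w_{p-1} \rangle$
$= \langle v_0,\ v_0 \oplus v_1,\ \ldots,\ v_0 \oplus \cdots \oplus v_m,\ v_0 \oplus \cdots \oplus v_{m+1},\ \ldots,\ v_0 \oplus \cdots \oplus v_{p-1} \rangle$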
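The divide-and-conquer structure of this prefix computation can be sketched sequentially in plain OCaml. The code below is an illustration under strong assumptions: the parallel vector is a plain array, `super` is a sequential stand-in that only builds the pair of results, and the helper `scan_range` is a name introduced here. In BSML the two halves would run on disjoint groups of processors, and the last value of the left half would be communicated to the right half.

```ocaml
(* Sequential stand-in for the superposition: it only builds the pair. *)
let super e1 e2 = (e1 (), e2 ())

(* Prefix computation on the slice [lo, hi] of v (hypothetical helper). *)
let rec scan_range op v lo hi =
  if lo = hi then [| v.(lo) |]
  else
    let m = (lo + hi) / 2 in
    (* Both halves are scanned independently, as with the superposition. *)
    let left, right =
      super
        (fun () -> scan_range op v lo m)
        (fun () -> scan_range op v (m + 1) hi)
    in
    (* w_m, the last prefix of the left half, is combined with every
       prefix of the right half. *)
    let wm = left.(Array.length left - 1) in
    Array.append left (Array.map (fun w -> op wm w) right)

let scan op v =
  if Array.length v = 0 then [||]
  else scan_range op v 0 (Array.length v - 1)

(* Usage with an associative operator. *)
let prefix_sums = scan ( + ) [| 1; 2; 3; 4; 5 |]
```

The operator is assumed to be associative; it does not need to be commutative, because `wm` is always applied on the left.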
26
Benchmarks
[Figure: time (s) as a function of the size of the polynomials]
27
Conclusion and future works
28
Conclusion
  • BSML = BSP + ML
  • Superposition: primitive of parallel composition
  • Small-step semantics of the superposition
  • Distributed semantics, also in small-step style
  • Superposition implemented using threads, as in the
    small-step semantics

29
Future works
  • Implementation using continuations
    (transformation of the source code with the help of
    a type checker) and proof of equivalence using
    our semantics
  • Implementation of bigger algorithms for better
    benchmarks of BSML and its superposition
  • Implementation of parallel skeletons (management
    of tasks) using the superposition?
  • BSP model-checking of high-level Petri nets
    (M-nets). The main difficulty: finding a non-trivial
    algorithm, as the concurrent programming
    community does. Possible, but needs more
    theoretical optimisations

30
Thanks for your attention