Title: Problem


1
Problem
  • Parallelize (serial) applications that use files.
  • Examples: compression tools, logging utilities, databases.
  • In general:
      • applications that use files depend on sequential output,
      • serial append is the usual file I/O operation.
  • Goal:
      • perform file I/O operations in parallel,
      • keep the sequential, serial append of the file.

2
Results
  • Cilk runtime support for serial append with good scalability.
  • Three serial append schemes and implementations for Cilk:
      • ported Cheerio, a previous parallel file I/O API (M. DeBergalis),
      • simple prototype (with concurrent linked lists),
      • extension with a more efficient data structure (concurrent
        double-linked skip lists).
  • Parallel bz2 using PLIO.

3
Single Processor Serial Append
FILE (serial append): (no blocks written yet)
[Figure: computation DAG with nodes 1 to 12]
4
Single Processor Serial Append
FILE (serial append): 1 2 3
[Figure: computation DAG with nodes 1 to 12]
5
Single Processor Serial Append
FILE (serial append): 1 2 3 4 5 6 7
[Figure: computation DAG with nodes 1 to 12]
6
Single Processor Serial Append
FILE (serial append): 1 2 3 4 5 6 7 8 9 10 11 12
[Figure: computation DAG with nodes 1 to 12]
7
Single Processor Serial Append
FILE (serial append): 1 2 3 4 5 6 7 8 9 10 11 12
Why not in parallel?!
[Figure: computation DAG with nodes 1 to 12]
8
Fast Serial Append
  • ParalleL file I/O (PLIO) support for Serial Append in Cilk
  • Alexandru Caracas

9
Outline
  • Example: single processor, multiprocessor
  • Semantics: view of the Cilk programmer
  • Algorithm: modification of the Cilk runtime system
  • Implementation
  • Previous work
  • Performance
  • Comparison

10
Multiprocessor Serial Append
FILE (serial append): (no blocks written yet)
[Figure: computation DAG with nodes 1 to 12]
11
Multiprocessor Serial Append
FILE (serial append), blocks written so far: 1 2 7
[Figure: computation DAG with nodes 1 to 12]
12
Multiprocessor Serial Append
FILE (serial append), blocks written so far: 1 2 3 5 7 8 9
[Figure: computation DAG with nodes 1 to 12]
13
Multiprocessor Serial Append
FILE (serial append), blocks written so far: 1 2 3 4 5 6 7 8 9 10
[Figure: computation DAG with nodes 1 to 12]
14
Multiprocessor Serial Append
FILE (serial append): 1 2 3 4 5 6 7 8 9 10 11 12
[Figure: computation DAG with nodes 1 to 12]
15
File Operations
  • open (FILE, mode) / close (FILE).
  • write (FILE, DATA, size):
      • processor writes to its PION (Parallel I/O Node).
  • read (FILE, BUFFER, size):
      • processor reads from a PION.
      • Note: a seek operation may be required.
  • seek (FILE, offset, whence):
      • processor searches for the right PION in the ordered data
        structure.
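
The seek step can be illustrated with a minimal sketch, assuming the PIONs are available as a list of byte counts in serial order. The names `locate` and `sizes` are hypothetical and not part of PLIO, and a flat list with binary search stands in for the ordered data structure used in the talk.

```python
from bisect import bisect_right
from itertools import accumulate

def locate(pion_sizes, offset):
    """Map an absolute file offset to (PION index, offset inside that PION).

    pion_sizes: bytes held by each PION, listed in serial-append order.
    """
    ends = list(accumulate(pion_sizes))      # cumulative end offset of each PION
    total = ends[-1] if ends else 0
    if not 0 <= offset < total:
        raise ValueError("offset outside file")
    i = bisect_right(ends, offset)           # first PION whose end lies past the offset
    start = ends[i - 1] if i else 0
    return i, offset - start

sizes = [4, 0, 6, 2]      # made-up example; the second PION holds no data
print(locate(sizes, 0))   # (0, 0)
print(locate(sizes, 4))   # (2, 0), the empty PION is skipped
print(locate(sizes, 11))  # (3, 1)
```

Recomputing the prefix sums on every call costs linear time; an order-maintenance structure that tracks byte counts would avoid that, which is outside the scope of this sketch.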

16
Semantics
  • View of the Cilk programmer:
      • Write operations preserve the sequential, serial append.
      • Read and seek operations can occur only after the file has
        been closed, or on a newly opened file.

17
Approach (for Cilk)
  • Bookkeeping (to reconstruct the serial append):
      • divide the execution of the computation,
      • keep meta-data (PIONs) about the execution of the
        computation.
  • Observation: in Cilk, steals need to be accounted for during
    execution.
  • Theorem: the expected number of steals is O(P·T∞), where P is the
    number of processors and T∞ is the critical-path length.
  • Corollary (see algorithm): the expected number of PIONs is
    O(P·T∞).
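
Written out, the bound and its corollary might look as follows. The intermediate inequality is an assumption of this note, not a statement from the slides: it supposes one new PION per steal plus one initial PION per processor. The `align*` environment requires `amsmath`.

```latex
\begin{align*}
  \mathbb{E}[\#\text{steals}] &= O(P\,T_\infty), \\
  \#\text{PIONs} &\le \#\text{steals} + P
    && \text{(assumed: one PION per steal, one initial PION per processor)}, \\
  \mathbb{E}[\#\text{PIONs}] &= O(P\,T_\infty + P) = O(P\,T_\infty)
    && \text{since } T_\infty \ge 1 .
\end{align*}
```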

18
PION (Parallel I/O Node)
  • Definition: a PION represents all the write operations to a file
    performed by a processor in between 2 steals.
  • A PION contains:
      • number of data bytes written,
      • victim processor ID,
      • pointer to written data.

[Figure: FILE blocks 1 to 12, covered by PIONs p1, p2, p3, p4]
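
As a rough illustration of the three fields listed above, a PION could be modeled like this. The class layout and the names `victim_id`, `data`, `nbytes` and `write` are hypothetical; the slides do not give the actual record layout used in the Cilk runtime, and an in-memory `bytearray` stands in for the pointer to written data.

```python
from dataclasses import dataclass, field

@dataclass
class PION:
    """Hypothetical Parallel I/O Node: one processor's writes between two steals."""
    victim_id: int                                       # ID of the victim processor
    data: bytearray = field(default_factory=bytearray)   # stands in for "pointer to written data"

    @property
    def nbytes(self) -> int:
        # "number of data bytes written", derived here from the buffer length
        return len(self.data)

    def write(self, chunk: bytes) -> int:
        """Append a chunk to this PION and return the number of bytes accepted."""
        self.data.extend(chunk)
        return len(chunk)

pion = PION(victim_id=2)
pion.write(b"hello ")
pion.write(b"world")
print(pion.nbytes)  # 11
```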
19
Algorithm
  • All PIONs are kept in an ordered data structure.
      • Very simple example: linked list.
  • On each steal operation performed by processor Pi from processor
    Pj:
      • create a new PION pi,
      • attach pi immediately after pj, the PION of Pj, in the ordered
        data structure.

[Figure: ordered list of PIONs containing p1, pj and pk]
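
The steal rule can be sketched with a plain singly linked list. This is a sequential illustration only, assuming no concurrency control; the names `PionNode`, `PionList`, `on_steal` and `serial_contents` are hypothetical. It covers only the rule stated on the slide and ignores any other runtime events the real implementation may have to handle.

```python
class PionNode:
    def __init__(self, owner):
        self.owner = owner    # processor writing into this PION
        self.chunks = []      # data written between two steals
        self.next = None

class PionList:
    """Linked list that keeps PIONs in serial-append order (sequential sketch)."""

    def __init__(self, first_owner):
        self.head = PionNode(first_owner)
        self.current = {first_owner: self.head}   # processor -> PION it writes to

    def on_steal(self, thief, victim):
        """Create a new PION for the thief, linked right after the victim's PION."""
        new = PionNode(thief)
        victim_pion = self.current[victim]
        new.next = victim_pion.next
        victim_pion.next = new
        self.current[thief] = new
        return new

    def write(self, proc, chunk):
        self.current[proc].chunks.append(chunk)

    def serial_contents(self):
        """Concatenate all chunks by walking the PIONs in list order."""
        out, node = [], self.head
        while node is not None:
            out.extend(node.chunks)
            node = node.next
        return b"".join(out)

plist = PionList(first_owner=0)
plist.write(0, b"A")
plist.on_steal(thief=1, victim=0)   # list: p0 -> p1
plist.write(1, b"D")
plist.write(0, b"B")
plist.on_steal(thief=2, victim=0)   # list: p0 -> p2 -> p1
plist.write(2, b"C")
print(plist.serial_contents())      # b'ABCD'
```

Inserting immediately after the victim's PION, rather than at the end of the list, is what places the second thief's data ahead of the first thief's in this example.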
20
Algorithm

[Figure: a new PION pi is created next to the ordered list of PIONs containing p1, pj and pk]
21
Algorithm

[Figure: ordered list of PIONs containing p1, pj, pk and the newly attached pi]
22
Implementation
  • Modified the Cilk runtime system to support the desired
    operations:
      • implemented hooks on the steal operations.
  • Initial implementation:
      • concurrent linked list (easier algorithms).
  • Final implementation:
      • concurrent double-linked skip list.
  • Ported Cheerio to Cilk 5.4.

23
Details of Implementation
  • Each processor has a buffer for the data in its own PIONs:
      • implemented as a file.
  • Data structure to maintain the order of PIONs:
      • linked list, skip list.
  • Meta-data (order maintenance structure of PIONs):
      • kept in memory,
      • saved to a file when the serial append file is closed.
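
How per-processor buffers and ordered meta-data combine into the serial file can be sketched as below. The record format `(processor, offset, length)` and the names `append` and `reconstruct` are assumptions made for illustration; in-memory `io.BytesIO` objects stand in for the per-processor buffer files.

```python
import io

# One append-only buffer per processor (the slides use a file; BytesIO stands in).
buffers = {0: io.BytesIO(), 1: io.BytesIO()}

def append(proc, data):
    """Append data to a processor's buffer; return a (proc, offset, length) record."""
    buf = buffers[proc]
    offset = buf.seek(0, io.SEEK_END)   # seek returns the new absolute position
    buf.write(data)
    return (proc, offset, len(data))

# Writes happen in wall-clock order, which differs from serial-append order.
p0_first = append(0, b"head ")
p1_only = append(1, b"tail")
p0_second = append(0, b"middle ")

# Meta-data: records listed in serial-append order (ordered by hand in this sketch).
metadata = [p0_first, p0_second, p1_only]

def reconstruct(metadata):
    """Read each record's bytes from its processor buffer, in meta-data order."""
    out = bytearray()
    for proc, offset, length in metadata:
        buf = buffers[proc]
        buf.seek(offset)
        out += buf.read(length)
    return bytes(out)

print(reconstruct(metadata))  # b'head middle tail'
```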

24
Skip List
  • Performance similar to search trees: O(log(SIZE)).

[Figure: skip list whose levels each end in NIL]
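
For readers unfamiliar with the structure, here is a minimal sequential skip list sketch. It is neither concurrent nor double-linked, so it is not the structure used in PLIO; the class names, `MAX_LEVEL = 16` and the promotion probability 0.5 are choices made for this illustration.

```python
import random

class SkipNode:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level   # forward[i] is the next node on level i

class SkipList:
    MAX_LEVEL = 16

    def __init__(self, seed=0):
        self.head = SkipNode(None, self.MAX_LEVEL)   # sentinel; None plays the role of NIL
        self.level = 1
        self.rng = random.Random(seed)

    def _random_level(self):
        # Promote a node one level up with probability 1/2 each time.
        level = 1
        while level < self.MAX_LEVEL and self.rng.random() < 0.5:
            level += 1
        return level

    def _predecessors(self, key):
        # For every level, find the last node whose key is smaller than `key`.
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in reversed(range(self.level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key):
        update = self._predecessors(key)
        level = self._random_level()
        self.level = max(self.level, level)
        node = SkipNode(key, level)
        for i in range(level):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def __contains__(self, key):
        candidate = self._predecessors(key)[0].forward[0]
        return candidate is not None and candidate.key == key

    def __iter__(self):
        node = self.head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]

sl = SkipList()
for k in [30, 10, 50, 20, 40]:
    sl.insert(k)
print(list(sl))          # [10, 20, 30, 40, 50]
print(20 in sl, 25 in sl)  # True False
```

The search descends from the highest level and moves right while keys are smaller, which gives the expected logarithmic cost mentioned on the slide.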
25
Double-Linked Skip List
  • Based on skip lists (logarithmic performance).
  • Used in the Cilk runtime support of the advanced PLIO
    implementation for rank order statistics.

[Figure: double-linked skip list whose levels each end in NIL]
26
PLIO Performance
  • No I/O vs. writing 100 MB with PLIO (with linked list).
  • Tests were run on yggdrasil, a 32-processor Origin machine.
  • Parallelism: 32.
  • Legend: black = no I/O, red = PLIO.

27
Improvements and Conclusion
  • Possible improvements:
      • optimization of the algorithm:
          • delete PIONs with no data,
          • cache-oblivious skip list,
      • file system support,
      • experiment with other order maintenance data structures:
          • B-trees.
  • Conclusion:
      • Cilk runtime support for parallel I/O allows serial
        applications dependent on sequential output to be
        parallelized.

28
References
  • Robert D. Blumofe and Charles E. Leiserson.
    Scheduling multithreaded computations by work
    stealing. In Proceedings of the 35th Annual
    Symposium on Foundations of Computer Science,
    pages 356-368, Santa Fe, New Mexico, November
    1994.
  • Matthew S. DeBergalis. A parallel file I/O API
    for Cilk. Master's thesis, Department of
    Electrical Engineering and Computer Science,
    Massachusetts Institute of Technology, June 2000.
  • William Pugh. Concurrent Maintenance of Skip
    Lists. Department of Computer Science,
    University of Maryland, CS-TR-2222.1, June 1990.

29
References
  • Thomas H. Cormen, Charles E. Leiserson, Ronald L.
    Rivest and Clifford Stein. Introduction to
    Algorithms (2nd Edition). MIT Press, Cambridge,
    Massachusetts, 2001.
  • Supercomputing Technology Group, MIT Laboratory
    for Computer Science. Cilk 5.3.2 Reference
    Manual, November 2001. Available at
    http://supertech.lcs.mit.edu/cilk/manual-5.3.2.pdf.
  • bz2 source code. Available at
    http://sources.redhat.com/bzip2.