CS 267: Applications of Parallel Computers Final Project Suggestions - PowerPoint PPT Presentation

Slides: 11
Provided by: kathyy
1
CS 267: Applications of Parallel Computers
Final Project Suggestions
  • James Demmel
  • www.cs.berkeley.edu/demmel/cs267_Spr06

2
Outline
  • Kinds of projects
      • Evaluating and improving the performance of a
        parallel application
          • "Application" could be a full scientific
            application or an important kernel
      • Parallelizing a sequential application
          • Other kinds of performance improvements are
            possible too, e.g. memory hierarchy tuning
      • Devising a new parallel algorithm for some problem
      • Porting a parallel application or systems software
        to a new architecture
  • Examples of previous projects (all online)
  • Upcoming guest lecturers
      • See their previous lectures, or contact them, for
        project ideas
  • Suggested projects

3
CS267 Class Projects from 2004
  • BLAST Implementation on BEE2 - Chen Chang
  • PFLAMELET: An Unsteady Flamelet Solver for
    Parallel Computers - Fabrizio Bisetti
  • Parallel Pattern Matcher - Frank Gennari, Shariq
    Rizvi, and Guille Díez-Cañas
  • Parallel Simulation in Metropolis - Guang Yang
  • A Survey of Performance Optimizations for
    Titanium Immersed Boundary Simulation - Hormozd
    Gahvari, Omair Kamil, Benjamin Lee, Meling Ngo,
    and Armando Solar
  • Parallelization of oopd1 - Jeff Hammel
  • Optimization and Evaluation of a Titanium
    Adaptive Mesh Refinement Code - Amir Kamil, Ben
    Schwarz, and Jimmy Su

4
CS267 Class Projects from 2004 (cont.)
  • Communication Savings With Ghost Cell Expansion
    For Domain Decompositions Of Finite Difference
    Grids - C. Zambrana Rojas and Mark Hoemmen
  • Parallelization of Phylogenetic Tree
    Construction - Michael Tung
  • UPC Implementation of the Sparse Triangular Solve
    and NAS FT - Christian Bell and Rajesh Nishtala
  • Widescale Load Balanced Shared Memory Model for
    Parallel Computing - Sonesh Surana, Yatish Patel,
    and Dan Adkins

5
Planned Guest Lecturers
  • Katherine Yelick (UPC, heart modeling)
  • David Anderson (volunteer computing)
  • Kimmen Sjolander (phylogenetic analysis of
    proteins; SATCHMO; Bonnie Kirkpatrick)
  • Julian Borrill (astrophysical data analysis)
  • Wes Bethel (graphics and data visualization)
  • Phil Colella (adaptive mesh refinement)
  • David Skinner (tools for scaling up
    applications)
  • Xiaoye Li (sparse linear algebra)
  • Osni Marques and Tony Drummond (ACTS Toolkit)
  • Andrew Canning (computational neuroscience)
  • Michael Wehner (climate modeling)

6
Suggested projects (1)
  • Weekly research group meetings on these and
    related topics (see J. Demmel and K. Yelick)
  • Contribute to upcoming ScaLAPACK release (JD)
      • Proposal, talk at www.cs.berkeley.edu/demmel;
        ask me for the latest
      • Performance evaluation of existing parallel
        algorithms
          • Ex: New eigensolvers based on successive band
            reduction
      • Improved implementations of existing parallel
        algorithms
          • Ex: Use UPC to overlap communication and
            computation
      • Many serial algorithms to be parallelized
          • See following slides

7
Missing Drivers in Sca/LAPACK
Problem               Algorithm                 LAPACK     ScaLAPACK
Linear Equations      LU                        xGESV      PxGESV
                      Cholesky                  xPOSV      PxPOSV
                      LDL^T                     xSYSV      missing
Least Squares (LS)    QR                        xGELS      PxGELS
                      QR + pivot                xGELSY     missing
                      SVD/QR                    xGELSS     missing
                      SVD/DC                    xGELSD     missing (intent?)
                      SVD/MRRR                  missing    missing
                      QR + iterative refine.    missing    missing
Generalized LS        LS + equality constr.     xGGLSE     missing
                      Generalized LM            xGGGLM     missing
                      Above + iterative ref.    missing    missing

8
More missing drivers
Problem               Algorithm                 LAPACK       ScaLAPACK
Symmetric EVD         QR / Bisection+Invit      xSYEV / X    PxSYEV / X
                      DC                        xSYEVD       PxSYEVD
                      MRRR                      xSYEVR       missing
Nonsymmetric EVD      Schur form                xGEES / X    missing driver
                      Vectors too               xGEEV / X    missing driver
SVD                   QR                        xGESVD       PxGESVD
                      DC                        xGESDD       missing (intent?)
                      MRRR                      missing      missing
                      Jacobi                    missing      missing
Generalized           QR / Bisection+Invit      xSYGV / X    PxSYGV / X
Symmetric EVD         DC                        xSYGVD       missing (intent?)
                      MRRR                      missing      missing
Generalized           Schur form                xGGES / X    missing
Nonsymmetric EVD      Vectors too               xGGEV / X    missing
Generalized SVD       Kogbetliantz              xGGSVD       missing (intent)
                      MRRR                      missing      missing

9
Suggested projects (2)
  • Contribute to sparse linear algebra (JD, KY)
      • Performance tuning to minimize latency and
        bandwidth costs, both to memory and between
        processors (sparse => few flops per memory
        reference or word communicated)
      • Typical methods (e.g. CG, conjugate gradient) do
        some number of dot products and saxpys for each
        SpMV, so communication cost is O(# iterations)
      • Our goal: make latency cost O(1)!
          • Requires reorganizing algorithms drastically,
            including replacing SpMV by a new kernel
            [Ax, A^2x, A^3x, ..., A^kx], which can be done
            with O(1) messages
      • Projects
          • Study scalability bottlenecks of current CG on
            real, large matrices
          • Optimize [Ax, A^2x, A^3x, ..., A^kx] on
            sequential machines
          • Optimize [Ax, A^2x, A^3x, ..., A^kx] on
            parallel machines
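
The idea of computing Ax, A^2x, ..., A^kx with O(1) messages can be sketched on a toy problem. The Python below is only an illustration under stated assumptions, not the course's implementation: A is taken to be the tridiagonal matrix tridiag(-1, 2, -1), the vector is split into equal blocks that stand in for processors, and "messages" are simply counted rather than sent. The function names apply_stencil, powers_naive, and powers_blocked are hypothetical. The naive version needs one neighbor exchange per SpMV (k rounds in total); the blocked version fetches k ghost entries per side once and then works locally.

```python
def apply_stencil(v):
    """Return A @ v for A = tridiag(-1, 2, -1); entries outside v count as 0."""
    n = len(v)
    return [
        2 * v[i] - (v[i - 1] if i > 0 else 0) - (v[i + 1] if i < n - 1 else 0)
        for i in range(n)
    ]


def powers_naive(x, k):
    """Compute [Ax, ..., A^k x] with k separate SpMVs.

    In a distributed setting each SpMV needs one ghost-cell exchange
    with the neighbors, so we count one communication round per SpMV.
    """
    out, v, rounds = [], list(x), 0
    for _ in range(k):
        v = apply_stencil(v)
        rounds += 1
        out.append(v)
    return out, rounds


def powers_blocked(x, k, nblocks):
    """Compute [Ax, ..., A^k x] with a single ghost-cell exchange.

    Each block copies its owned entries plus up to k ghost entries per
    side (one communication round), then applies the stencil k times
    locally. Errors from the cut edges creep inward one entry per step,
    so after k steps they have not yet reached the owned entries.
    """
    n = len(x)
    size = n // nblocks
    assert size * nblocks == n, "assumes equal-sized blocks"
    assert k <= size, "assumes ghosts come from immediate neighbors only"
    out = [[0] * n for _ in range(k)]
    for p in range(nblocks):
        lo, hi = p * size, (p + 1) * size
        a, b = max(0, lo - k), min(n, hi + k)
        local = x[a:b]  # the one exchange: owned entries plus ghosts
        for j in range(k):
            local = apply_stencil(local)
            out[j][lo:hi] = local[lo - a:hi - a]  # keep only owned entries
    return out, 1
```

The trade-off in this sketch is redundant arithmetic: each block also updates its ghost entries, so fewer messages are bought with extra flops, and the ghost region grows with k.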

10
Suggested projects (3)
  • Evaluate new languages on applications (KY)
      • UPC or Titanium
      • UPC for asynchrony, overlapping communication
        and computation
      • ScaLAPACK in UPC
      • Use UPC-based 3D FFT in your application
      • Optimize existing 1D FFT in UPC, to use 3D
        techniques
  • Porting, evaluating parallel systems software
    (KY)
      • Port UPC to RAMP
      • Port GASNet to Blue Gene, evaluate performance
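
Overlapping communication with computation, mentioned above as a use for UPC, can be illustrated by analogy in Python. This is a sketch under assumptions, not UPC code: a worker thread with a short sleep stands in for a remote fetch of ghost values, and the names fetch_ghosts and smooth_overlapped are hypothetical. The pattern is to start the fetch, do the interior work that needs no remote data, and wait only when the boundary values are required.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_ghosts(left, right, delay=0.05):
    """Stand-in for a remote get; the sleep models network latency."""
    time.sleep(delay)
    return left, right


def smooth_overlapped(u, left_ghost, right_ghost):
    """One 3-point averaging step on a local block u (len(u) >= 2).

    The ghost fetch is started first, interior points are updated while
    it is in flight, and the two boundary points are updated last.
    """
    n = len(u)
    new = [0.0] * n
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_ghosts, left_ghost, right_ghost)
        for i in range(1, n - 1):  # interior work needs no ghosts
            new[i] = (u[i - 1] + u[i] + u[i + 1]) / 3
        gl, gr = pending.result()  # block only when ghosts are needed
    new[0] = (gl + u[0] + u[1]) / 3
    new[n - 1] = (u[n - 2] + u[n - 1] + gr) / 3
    return new
```

If the interior update takes at least as long as the fetch, the latency is hidden entirely; otherwise only the remainder is exposed.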