CS 267: Applications of Parallel Computers Final Project Suggestions

Provided by: kathyy

Learn more at: https://people.eecs.berkeley.edu

1
CS 267: Applications of Parallel Computers
Final Project Suggestions

James Demmel
www.cs.berkeley.edu/demmel/cs267_Spr06

2
Outline

Kinds of projects
  Evaluating and improving the performance of a parallel application
    "Application" could be full scientific application, or important kernel
  Parallelizing a sequential application
    Other kinds of performance improvements possible too, e.g. memory hierarchy tuning
  Devise a new parallel algorithm for some problem
  Porting parallel application or systems software to new architecture
Example of previous projects (all on-line)
Upcoming guest lecturers
  See their previous lectures, or contact them, for project ideas
Suggested projects

3
CS267 Class Projects from 2004

BLAST Implementation on BEE2 - Chen Chang
PFLAMELET: An Unsteady Flamelet Solver for Parallel Computers - Fabrizio Bisetti
Parallel Pattern Matcher - Frank Gennari, Shariq Rizvi, and Guille Díez-Cañas
Parallel Simulation in Metropolis - Guang Yang
A Survey of Performance Optimizations for Titanium Immersed Boundary Simulation - Hormozd Gahvari, Omair Kamil, Benjamin Lee, Meling Ngo, and Armando Solar
Parallelization of oopd1 - Jeff Hammel
Optimization and Evaluation of a Titanium Adaptive Mesh Refinement Code - Amir Kamil, Ben Schwarz, and Jimmy Su

4
CS267 Class Projects from 2004 (cont)

Communication Savings With Ghost Cell Expansion For Domain Decompositions Of Finite Difference Grids - C. Zambrana Rojas and Mark Hoemmen
Parallelization of Phylogenetic Tree Construction - Michael Tung
UPC Implementation of the Sparse Triangular Solve and NAS FT - Christian Bell and Rajesh Nishtala
Widescale Load Balanced Shared Memory Model for Parallel Computing - Sonesh Surana, Yatish Patel, and Dan Adkins

5
Planned Guest Lecturers

Katherine Yelick (UPC, heart modeling)
David Anderson (volunteer computing)
Kimmen Sjolander (phylogenetic analysis of proteins; SATCHMO; Bonnie Kirkpatrick)
Julian Borrill (astrophysical data analysis)
Wes Bethel (graphics and data visualization)
Phil Colella (adaptive mesh refinement)
David Skinner (tools for scaling up applications)
Xiaoye Li (sparse linear algebra)
Osni Marques and Tony Drummond (ACTS Toolkit)
Andrew Canning (computational neuroscience)
Michael Wehner (climate modeling)

6
Suggested projects (1)

Weekly research group meetings on these and related topics (see J. Demmel and K. Yelick)
Contribute to upcoming ScaLAPACK release (JD)
  Proposal, talk at www.cs.berkeley.edu/demmel; ask me for latest
  Performance evaluation of existing parallel algorithms
    Ex: New eigensolvers based on successive band reduction
  Improved implementations of existing parallel algorithms
    Ex: Use UPC to overlap communication, computation
  Many serial algorithms to be parallelized
    See following slides

7
Missing Drivers in Sca/LAPACK

Problem               Algorithm                 LAPACK      ScaLAPACK
Linear Equations      LU                        xGESV       PxGESV
                      Cholesky                  xPOSV       PxPOSV
                      LDL^T                     xSYSV       missing
Least Squares (LS)    QR                        xGELS       PxGELS
                      QR + pivot                xGELSY      missing
                      SVD/QR                    xGELSS      missing
                      SVD/D&C                   xGELSD      missing (intent?)
                      SVD/MRRR                  missing     missing
                      QR + iterative refine.    missing     missing
Generalized LS        LS + equality constr.     xGGLSE      missing
                      Generalized LM            xGGGLM      missing
                      Above + iterative ref.    missing     missing

8
More missing drivers

Problem                         Algorithm               LAPACK      ScaLAPACK
Symmetric EVD                   QR / Bisection+Invit    xSYEV / X   PxSYEV / X
                                D&C                     xSYEVD      PxSYEVD
                                MRRR                    xSYEVR      missing
Nonsymmetric EVD                Schur form              xGEES / X   missing driver
                                Vectors too             xGEEV / X   missing driver
SVD                             QR                      xGESVD      PxGESVD
                                D&C                     xGESDD      missing (intent?)
                                MRRR                    missing     missing
                                Jacobi                  missing     missing
Generalized Symmetric EVD       QR / Bisection+Invit    xSYGV / X   PxSYGV / X
                                D&C                     xSYGVD      missing (intent?)
                                MRRR                    missing     missing
Generalized Nonsymmetric EVD    Schur form              xGGES / X   missing
                                Vectors too             xGGEV / X   missing
Generalized SVD                 Kogbetliantz            xGGSVD      missing (intent)
                                MRRR                    missing     missing

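The driver names in the two tables are templates rather than callable routines. In the standard LAPACK naming scheme the leading "x" stands for a precision letter (S and D for single and double precision real), ScaLAPACK puts a "P" in front of it, and an entry such as "xSYEV / X" abbreviates the pair xSYEV and xSYEVX. The slides do not spell this out, so the sketch below is one reading of the notation. The helper `expand_driver` is hypothetical and not part of either library. It expands only the real precisions, because the complex counterparts of some symmetric (SY) drivers use Hermitian (HE) names instead.

```python
# Hypothetical helper for reading the tables; not part of LAPACK or ScaLAPACK.
REAL_PREFIXES = ("S", "D")  # single and double precision real


def expand_driver(entry):
    """Expand a table entry such as 'xSYEV / X' into real-precision routine names.

    Entries starting with 'Px' are treated as ScaLAPACK templates,
    entries starting with 'x' as LAPACK templates.
    """
    parts = [part.strip() for part in entry.split("/")]
    base = parts[0]
    extras = [p for p in parts[1:] if p]  # e.g. ['X'] for the expert driver

    parallel = base.startswith("Px")
    if not parallel and not base.startswith("x"):
        raise ValueError(f"not a driver template: {entry!r}")

    stem = base[2:] if parallel else base[1:]
    stems = [stem] + [stem + extra for extra in extras]
    lead = "P" if parallel else ""
    return [lead + prec + s for s in stems for prec in REAL_PREFIXES]
```

For example, `expand_driver("xSYEV / X")` yields the LAPACK names SSYEV, DSYEV, SSYEVX and DSYEVX, and `expand_driver("PxGESV")` yields PSGESV and PDGESV. Entries marked "missing" raise `ValueError`, since there is no routine to name.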
9
Suggested projects (2)

Contribute to sparse linear algebra (JD & KY)
  Performance tuning to minimize latency and bandwidth costs, both to memory and between processors (sparse => few flops per memory reference or word communicated)
  Typical methods (e.g. CG, conjugate gradient) do some number of dot products, saxpys for each SpMV, so communication cost is O(# iterations)
  Our goal: Make latency cost O(1)!
    Requires reorganizing algorithms drastically, including replacing SpMV by new kernel [Ax, A^2x, A^3x, ..., A^kx], which can be done with O(1) messages
  Projects
    Study scalability bottlenecks of current CG on real, large matrices
    Optimize [Ax, A^2x, A^3x, ..., A^kx] on sequential machines
    Optimize [Ax, A^2x, A^3x, ..., A^kx] on parallel machines
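To make the operation counts on this slide concrete, here is a minimal sequential sketch of textbook CG in plain Python, together with a naive reference version of the kernel [Ax, A^2x, ..., A^kx]. It is not the course's or the authors' code. The sparse format (a list of rows of (column, value) pairs) and all function names are assumptions chosen for illustration. Each CG iteration does one SpMV, two dot products and three saxpy-style updates. On a parallel machine each dot product would need a global reduction and each SpMV a neighbor exchange, which is why the number of messages grows with the iteration count.

```python
def spmv(A, x):
    """Sparse matrix-vector product; A is a list of rows of (column, value) pairs."""
    return [sum(v * x[j] for j, v in row) for row in A]


def dot(x, y):
    """Dot product (a global reduction in a parallel setting)."""
    return sum(a * b for a, b in zip(x, y))


def axpy(alpha, x, y):
    """Return alpha*x + y, the 'saxpy' operation."""
    return [alpha * a + b for a, b in zip(x, y)]


def cg(A, b, tol=1e-10, max_iter=100):
    """Textbook CG for symmetric positive definite A; returns (x, iterations)."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rr = dot(r, r)
    if rr == 0.0:
        return x, 0
    for it in range(1, max_iter + 1):
        Ap = spmv(A, p)              # one SpMV per iteration
        alpha = rr / dot(p, Ap)      # dot product 1
        x = axpy(alpha, p, x)
        r = axpy(-alpha, Ap, r)
        rr_new = dot(r, r)           # dot product 2
        if rr_new ** 0.5 < tol:
            return x, it
        p = axpy(rr_new / rr, p, r)  # new search direction
        rr = rr_new
    return x, max_iter


def matrix_powers(A, x, k):
    """Naive reference for [Ax, A^2x, ..., A^kx]: k dependent SpMVs in sequence."""
    out = []
    v = x
    for _ in range(k):
        v = spmv(A, v)
        out.append(v)
    return out


def tridiag(n):
    """1D Poisson test matrix: 2 on the diagonal, -1 on the off-diagonals."""
    A = []
    for i in range(n):
        row = []
        if i > 0:
            row.append((i - 1, -1.0))
        row.append((i, 2.0))
        if i < n - 1:
            row.append((i + 1, -1.0))
        A.append(row)
    return A
```

A call such as `cg(tridiag(8), [1.0] * 8)` solves a small test system. Note that `matrix_powers` as written is only a correctness baseline: it still performs k separate SpMVs, so a distributed version built this way would communicate k times. The optimization projects listed above would replace it with a version that fetches the needed remote entries once and then computes all k vectors locally.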