1. Sparse Matrices and Combinatorial Algorithms in Star-P: Status and Plans
April 8, 2005
2. Matlab's sparse matrix design principles
- All operations should give the same results for sparse and full matrices (almost all)
- Sparse matrices are never created automatically, but once created they propagate
- Performance is important -- but usability, simplicity, completeness, and robustness are more important
- Storage for a sparse matrix should be O(nonzeros)
- Time for a sparse operation should be O(flops) (as nearly as possible)
- Star-P dsparse matrices: same principles, some different tradeoffs
4. Sparse matrix operations
- dsparse layout, same semantics as ddense
- For now, only row distribution
- Matrix operators: +, -, max, etc.
- Matrix indexing and concatenation:
  A(1:3, [4 5 2]) = B(:, 7) * C
- A \ b by direct methods
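A minimal Matlab session exercising these operations; in Star-P the same syntax would apply to dsparse matrices (hypothetical data, illustrative only):

    n = 1000;
    A = sprand(n, n, 0.01) + speye(n);  % random sparse matrix, kept nonsingular
    B = sprand(n, n, 0.01);
    C = A + B;                          % elementwise matrix operators
    m = max(A, [], 2);                  % rowwise max
    A(1:3, [4 5 2]) = B(1:3, 1:3);      % indexed assignment
    D = [A B];                          % concatenation
    x = A \ rand(n, 1);                 % linear solve by direct methods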
5. Star-P sparse data structure
- Full:
  - 2-dimensional array of real or complex numbers
  - (nrows x ncols) memory
- Sparse:
  - compressed row storage
  - about (2*nzs + nrows) memory
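To make the storage bound concrete, here is compressed row storage sketched with plain Matlab arrays for a tiny hypothetical matrix (serial Matlab itself stores sparse matrices by compressed columns):

    % A = [ 0 5 0
    %       7 0 8
    %       0 0 9 ]
    rowptr = [1 2 4 5];   % row i occupies entries rowptr(i) : rowptr(i+1)-1
    colind = [2 1 3 3];   % column index of each nonzero, row by row
    vals   = [5 7 8 9];   % numerical values
    % memory: nzs entries each for colind and vals, nrows+1 for rowptr,
    % i.e. about 2*nzs + nrows words
    i = 2;
    cols = colind(rowptr(i):rowptr(i+1)-1);   % columns of row 2: [1 3]
    v    = vals(rowptr(i):rowptr(i+1)-1);     % values of row 2:  [7 8]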
6. Star-P distributed sparse data structure
[Figure: the rows of the matrix are distributed in contiguous blocks across processors P0, P1, P2, ..., Pn]
- Each processor stores:
  - # of local nonzeros
  - range of local rows
  - nonzeros in CSR form
7. Sorting in Star-P
- [V, perm] = sort(V)
- Useful primitive for some sparse array algorithms, e.g. the sparse constructor
- Star-P uses a parallel sample sort
8. The sparse( ) constructor
- A = sparse(I, J, V, nr, nc)
- Input: ddense vectors I, J, V and dimensions nr, nc
- A(I(k), J(k)) = V(k) for each k
- Sort triples (i, j, v) by (i, j)
- Adjust for desired row distribution
- Locally convert to compressed row indices
- Sum values with duplicate indices
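A small serial example of the constructor's semantics, including the duplicate-summing rule (hypothetical data):

    I = [1 3 1 1];
    J = [2 1 2 3];
    V = [10 20 30 40];
    A = sparse(I, J, V, 3, 3);
    % the (1,2) index appears twice, so A(1,2) = 10 + 30 = 40;
    % full(A) is
    %   [ 0 40 40
    %     0  0  0
    %    20  0  0 ]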
9. Sparse matrix times dense vector
- y = A * x
- The first call to matvec caches a communication schedule for matrix A. Later calls to multiply any vector by A use the cached schedule.
- Communication and computation overlap.
10. Sparse linear systems
- x = A \ b
- Matrix division uses MPI-based direct solvers:
  - SuperLU_dist: nonsymmetric, static pivoting
  - MUMPS: nonsymmetric multifrontal
  - PSPASES: Cholesky
- ppsetoption('SparseDirectSolver', 'SUPERLU')
- Iterative solvers implemented in Star-P
- Ongoing work on preconditioners
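Usage sketch; ppsetoption is the call named on this slide, though the exact argument quoting is a reconstruction:

    ppsetoption('SparseDirectSolver', 'SUPERLU');   % select SuperLU_dist
    x = A \ b;                                      % solved by the chosen MPI-based solver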
11. Sparse eigenvalue solvers
- [V, D] = eigs(A), etc.
- Implemented via MPI-based PARPACK (Lehoucq, Maschhoff, Sorensen, Yang)
- Work in progress
12. Application: Fluid dynamics

    function lambda = peigs(A, B, sigma, iter, tol)
    % inverse iteration with shift sigma for the generalized
    % eigenproblem A*x = lambda*B*x
    [m, n] = size(A);
    C = A - sigma * B;
    y = rand(m, 1);
    for k = 1:iter
        q = y ./ norm(y);
        v = B * q;
        y = C \ v;                    % sparse direct solve each iteration
        theta = dot(q, y);
        res = norm(y - theta * q);
        if res < tol
            break
        end
    end
    lambda = 1 / theta;               % eigenvalue of the shifted problem;
                                      % sigma + 1/theta recovers that of (A, B)

- Modeling density-driven instabilities in miscible fluids (Goyal, Meiburg)
- Groundwater modeling, oil recovery, etc.
- Mixed finite difference / spectral method
- Large sparse generalized eigenvalue problem
13. Page ranking example
- Uses power method, i.e. matvec and vector arithmetic
- Page ranking demo from SC03 on a web crawl of mit.edu (170,000 pages)
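A minimal serial power-method sketch of page ranking, assuming G(i,j) = 1 when page j links to page i; the damping factor and tolerance are conventional choices, not taken from the demo:

    n = size(G, 1);
    d = full(sum(G, 1))';              % out-degree of each page
    d(d == 0) = 1;                     % guard against dangling pages
    alpha = 0.85;                      % damping factor
    x = ones(n, 1) / n;
    for k = 1:100
        xnew = alpha * (G * (x ./ d)) + (1 - alpha) / n;
        if norm(xnew - x, 1) < 1e-8, break; end
        x = xnew;
    end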
14. Combinatorial Scientific Computing
- Sparse matrix methods; machine learning; search and information retrieval; adaptive multilevel modeling; geometric meshing; computational biology; . . .
- How will combinatorial methods be used by people who don't understand them in complete detail?
15. Analogy: Matrix division in Matlab
- x = A \ b
- Works for either full or sparse A
- Is A square?
  - no -> use QR to solve least squares problem
- Is A triangular or permuted triangular?
  - yes -> sparse triangular solve
- Is A symmetric with positive diagonal elements?
  - yes -> attempt Cholesky after symmetric minimum degree
- Otherwise
  - -> use LU on A(:, colamd(A))
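A runnable serial sketch of this decision tree for sparse A (simplified: it tests only unpermuted upper triangular form, and real backslash uses cheaper structural tests):

    function x = backslash_sketch(A, b)
    [m, n] = size(A);
    if m ~= n
        [C, R] = qr(A, b);                % sparse least squares via QR
        x = R \ C;
        return
    end
    if nnz(tril(A, -1)) == 0              % upper triangular: back substitution
        x = zeros(n, 1);
        for i = n:-1:1
            x(i) = (b(i) - A(i, i+1:n) * x(i+1:n)) / A(i, i);
        end
        return
    end
    if isequal(A, A') && all(diag(A) > 0)
        p = symamd(A);                    % symmetric minimum degree
        [R, flag] = chol(A(p, p));
        if flag == 0                      % Cholesky succeeded
            x(p, 1) = R \ (R' \ b(p));
            return
        end
    end
    q = colamd(A);                        % general case: LU with colamd
    [L, U, P] = lu(A(:, q));
    x(q, 1) = U \ (L \ (P * b));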
16. Combinatorics in Star-P
- A sparse matrix language is a good start on primitives for combinatorial computing.
- Random-access indexing: A(i,j)
- Neighbor sequencing: find(A(i,:))
- Sparse table construction: sparse(I, J, V)
- Other primitives we are building: sorting, searching, pointer-jumping, connectivity and paths, . . .
17. Matching and depth-first search in Matlab
- dmperm: Dulmage-Mendelsohn decomposition
- Square, full-rank A:
  - [p, q, r] = dmperm(A)
  - A(p,q) is block upper triangular with nonzero diagonal
  - also: strongly connected components of a directed graph
  - also: connected components of an undirected graph
- Arbitrary A:
  - [p, q, r, s] = dmperm(A)
  - maximum-size matching in a bipartite graph
  - minimum-size vertex cover in a bipartite graph
  - decomposition into strong Hall blocks
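A small usage example on a hypothetical 3-by-3 matrix:

    A = sparse([1 1 2 3 3], [1 3 2 1 3], 1, 3, 3);
    [p, q, r] = dmperm(A);
    B = A(p, q);    % block upper triangular with nonzero diagonal;
                    % block k occupies rows and columns r(k) : r(k+1)-1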
18. Connected components (work in progress)
- Sequential Matlab uses depth-first search (dmperm), which doesn't parallelize well
- Shiloach-Vishkin algorithm (see the sketch below):
  - repeat
    - Link every (super)vertex to a random neighbor
    - Shrink each tree to a supervertex by pointer jumping
  - until no further change
- Hybrid SV / search method under construction
- Other potential graph kernels:
  - Breadth-first search (after Husbands, LBNL)
  - Bipartite matching (after Riedy, UCB)
  - Strongly connected components (after Pinar, LBNL)
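A compact serial rendering of the Shiloach-Vishkin loop, built from the array primitives of slide 16; for simplicity it hooks each vertex to its smallest-labeled neighbor rather than a random one:

    function D = sv_components(A)
    % A: sparse symmetric adjacency matrix of an undirected graph.
    % D(v) converges to a canonical label for v's component.
    n = size(A, 1);
    [i, j] = find(A);
    D = (1:n)';                              % every vertex is its own root
    while true
        Dold = D;
        hook = accumarray(i, D(j), [n 1], @min, inf);  % smallest neighbor label
        D = min(D, hook);                    % link step
        while any(D(D) ~= D)                 % shrink by pointer jumping
            D = D(D);
        end
        if isequal(D, Dold), break; end      % no further change
    end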
19. SSCA#2 -- Overview
- Randomly sized cliques
- Most intra-clique edges present
- Random inter-clique edges, gradually 'thinning' with distance
- Integer and character-string edge labels
- Randomized vertex numbers
- Scalable data generator
- Kernel 1: Graph Construction
- Kernel 2: Classify Large Sets (Search)
- Kernel 3: Subgraph Extraction
- Kernel 4: Graph Clustering / Partitioning
20. HPCS Benchmark Spectrum: SSCA#2
21. Scalable Data Generator
- Produce edge tuples containing the start vertex, end vertex, and weight for each directed edge
- Hierarchical nature, with random-sized cliques connected by edges between some of the cliques
- Inter-clique edges are assigned using a random distribution that represents a hierarchical thinning as vertices are further apart
- Weights are either random integer values or randomly selected character strings
- Random permutations to avoid induced locality:
  - vertex numbers (local processor memory locality)
  - parallel layout (global memory locality)
- Data sets must be developed in their entirety before graph construction
- May be parallelized; data tuple vertex naming schemes must be globally consistent
- Data generator will not be timed
- Statistics will be saved for use in later verification stages
22. Kernel 1: Graph Construction
- Construct the multigraph in a format usable by all subsequent kernels
- No subsequent modifications will be permitted to benefit specific kernels
- Graph construction will be timed
23. Kernel 2: Sort on Selected Edge Weights
- Sort on edge weights to determine the vertex pairs with:
  - the largest integer weights
  - a desired string weight
- The two vertex-pair lists will be saved for use in the subsequent kernel:
  - start set SI for integer weights
  - start set SC for character-string weights
- Sorting weights will be timed
24. Kernel 3: Extract Subgraphs
- Extract a series of subgraphs formed by following paths of length k from a start set of initial vertices
- The set of initial vertices will be determined from the previous kernel
- Produce a series of subgraphs consisting of vertices and edges on paths of length k from the vertex pairs in start sets SI and SC
- A possible computational kernel for graph extraction is breadth-first search
- Extracting graphs will be timed
25. Kernel 4: Partition Graph Using a Clustering Algorithm
- Apply a graph clustering algorithm to partition the vertices of the graph into subgraphs no larger than a maximum size, so as to minimize the number of edges that need to be cut, e.g.:
  - Kernighan and Lin
  - Sangiovanni-Vincentelli, Chen, and Chua
- The output of this kernel will be a description of the clusters and the edges that form the interconnecting network
- This step should identify the structure in the graph specified in the data generation phase
- It is important that the implementation of this kernel not use a priori knowledge of the details of the data generator or the statistics collected during graph generation
- Identifying the clusters and the interconnecting network will be timed
26. Concise SSCA#2
- A serial Matlab implementation of SSCA#2
- Coding style is simple and data-parallel
- Our starting point for the Star-P implementation
- Edge labels are integers only, not yet strings
27. Data generation
- Using Lincoln Labs' serial Matlab generator at present
- Data-parallel Star-P generator under construction
- Generate edge triples as distributed dense vectors
28. Kernel 1: Graph construction
- cSSCA2 K1: 30 lines of executable serial Matlab
- Multigraph is a cell array of simple directed graphs
- Graphs are dsparse matrices, created by sparse( )
29. Kernel 2: Search in large sets
- cSSCA2 K2: 12 lines of executable serial Matlab
- Uses max( ) and find( ) to locate edges of max weight:

    ng = length(G.edgeWeights);    % number of graphs
    maxwt = 0;
    for g = 1:ng
        maxwt = max(maxwt, max(max(G.edgeWeights{g})));
    end
    intedges = zeros(0, 2);
    for g = 1:ng
        [intstart, intend] = find(G.edgeWeights{g} == maxwt);
        intedges = [intedges; intstart intend];
    end
30. Kernel 3: Subgraph extraction
- cSSCA2 K3: 25 lines of executable serial Matlab
- Returns subgraphs consisting of vertices and edges within specified distance of specified starting vertices
- Sparse matrix-vector product for breadth-first search (sketched below)
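A serial sketch of breadth-first search by repeated sparse matvec, assuming G(i,j) is nonzero for a directed edge from i to j; this shows the loop structure, not the cSSCA2 code itself:

    function reach = bfs_by_matvec(G, start, k)
    % vertices reachable from 'start' along paths of length <= k
    n = size(G, 1);
    reach = logical(sparse(start, 1, 1, n, 1));   % visited set
    front = reach;                                % current frontier
    for level = 1:k
        next = (G' * double(front)) ~= 0;         % out-neighbors of frontier
        front = next & ~reach;                    % keep unseen vertices only
        if ~any(front), break; end
        reach = reach | front;
    end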
31. Kernel 4: Partitioning / clustering
- cSSCA2 K4: serial prototype exists, not yet parallel
- Different algorithm from Lincoln Labs' executable spec, which uses seed-growing breadth-first search
- cSSCA2 uses spectral methods for clustering, based on eigenvectors (Fiedler vectors) of the graph
- One version seeks small balanced edge cuts (graph partitioning); another seeks clusters with low isoperimetric number (clustering)
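One spectral bisection step, sketched serially for a connected graph; the Laplacian construction is standard, but the eigs options and the median split are conventional choices, not read from the cSSCA2 code:

    function [part1, part2] = fiedler_split(A)
    % A: sparse symmetric adjacency matrix of a connected graph
    n = size(A, 1);
    d = full(sum(A, 2));
    L = spdiags(d, 0, n, n) - A;      % graph Laplacian L = D - A
    [V, D] = eigs(L, 2, 'sa');        % two algebraically smallest eigenpairs
    [ignore, ix] = max(diag(D));      % the nonzero one belongs to the Fiedler vector
    f = V(:, ix);
    part1 = find(f >= median(f));     % balanced split at the median
    part2 = find(f <  median(f));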