Title: Numerical Linear Algebra in the Streaming Model
1. Numerical Linear Algebra in the Streaming Model
- Ken Clarkson - IBM
- David Woodruff - IBM
2. The Problems
- Given an n x d matrix A and an n x d' matrix B, we want estimators for:
  - The matrix product A^T B
  - The matrix X minimizing ||AX - B||
    - A slightly generalized version of least-squares regression
  - Given an integer k, the matrix A_k of rank k minimizing ||A - A_k||
- We use the Frobenius matrix norm: ||C|| = (Σ_{i,j} C_{i,j}^2)^{1/2}, the root of the sum of squares
3. General Properties of Our Algorithms
- One pass over the matrix entries, given in any order (multiple updates to an entry are allowed)
- Maintain compressed versions, or sketches, of the matrices
- Do small work per entry to maintain the sketches
- Output the result using the sketches
- Randomized approximation algorithms
4. Matrix Compression Methods
- In a line of similar efforts:
  - Element-wise sampling [AM01, AHK06]
  - Sketching / random projection: maintain a small number of random linear combinations of rows or columns [S06]
  - Row / column sampling [DK01, DKM04, DMM08]
- These usually need more than 1 pass
- Here: sketching
5. Outline
- Matrix Product
- Regression
- Low-rank approximation
6. Approximate Matrix Product
- A and B have n rows; we want to estimate A^T B
- Let S be an n x m sign (Rademacher) matrix
  - Each entry is +1 or -1 with probability 1/2
  - m is small, to be specified
  - Entries are O(log 1/δ)-wise independent
- Sketches are S^T A and S^T B
- Our estimate of A^T B is A^T S S^T B / m
- Easy to maintain the sketches given updates
  - O(m) time per update; O(mc log(nc)) bits of space for S^T A and S^T B, where c is the total number of columns of A and B
- Output A^T S S^T B / m using fast rectangular matrix multiplication
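A minimal numpy sketch of this estimator (illustrative sizes; for simplicity S is drawn with fully independent entries rather than the O(log 1/δ)-wise independent entries the algorithm uses to save space):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, d2, m = 1000, 20, 30, 200           # illustrative sizes
A = rng.standard_normal((n, d))
B = rng.standard_normal((n, d2))

S = rng.choice([-1.0, 1.0], size=(n, m))  # n x m sign (Rademacher) matrix

# Streaming sketches: an update adding v to A[i, j] adds v * S[i, :]
# to column j of S^T A, which is O(m) work per update.
StA = S.T @ A                             # m x d
StB = S.T @ B                             # m x d2

estimate = StA.T @ StB / m                # estimate of A^T B
err = np.linalg.norm(estimate - A.T @ B)  # Frobenius-norm error
print(err, np.linalg.norm(A) * np.linalg.norm(B))
```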
7. Expected Error and a Tail Estimate
- Using linearity of expectation,
  E[A^T S S^T B / m] = A^T E[S S^T] B / m = A^T B,
  since E[S S^T] = m I_n
- Moreover, for δ, ε > 0, there is m = O(ε^{-2} log(1/δ)) so that
  Pr[ ||A^T S S^T B / m - A^T B|| > ε ||A|| ||B|| ] ≤ δ
  (again, ||C|| = (Σ_{i,j} C_{i,j}^2)^{1/2})
- This tail estimate seems to be new
  - Follows from bounding the O(log 1/δ)-th moment of ||A^T S S^T B / m - A^T B||
- Improves the space of Sarlós's 1-pass algorithm by a log c factor
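A quick empirical check of the tail behavior (hypothetical parameters): the relative error shrinks roughly like m^{-1/2}, consistent with m = O(ε^{-2} log(1/δ)):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 2000, 10
A = rng.standard_normal((n, d))
B = rng.standard_normal((n, d))
exact = A.T @ B
scale = np.linalg.norm(A) * np.linalg.norm(B)

for m in (16, 64, 256, 1024):
    errs = []
    for _ in range(50):
        S = rng.choice([-1.0, 1.0], size=(n, m))
        est = (S.T @ A).T @ (S.T @ B) / m
        errs.append(np.linalg.norm(est - exact) / scale)
    # median relative error should drop about 2x per 4x increase in m
    print(m, np.median(errs))
```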
8. Matrix Product Lower Bound
- Our algorithm is space-optimal for constant δ
  - A new lower bound
- Reduction from a communication game
  - Augmented Indexing, with players Alice and Bob
  - Alice has a random x ∈ {0,1}^s
  - Bob has a random i ∈ {1, 2, ..., s}
    - and also x_{i+1}, ..., x_s
  - Alice sends Bob one message
  - Bob should output x_i with probability at least 2/3
- Theorem [MNSW]: the message must be Ω(s) bits on average
9. Lower Bound Proof
- Set s = Θ(c ε^{-2} log(cn))
- Alice makes matrix U
  - Uses x_1, ..., x_s
- Bob makes matrix U' and B
  - Uses i and x_{i+1}, ..., x_s
- The input to Alg will be A = U + U' and B
  - A and B are n x c/2
- Alice:
  - Runs the streaming matrix product Alg on U
  - Sends the state of Alg to Bob
- Bob continues Alg with A = U + U' and B
- A^T B determines x_i with probability at least 2/3
  - By the choice of U, U', and B
  - Solving Augmented Indexing
- So the space of Alg must be Ω(s) = Ω(c ε^{-2} log(cn)) bits
10. Lower Bound Proof
- U = [U(1); U(2); ...; U(log(cn)); 0s], a stack of blocks
- Each U(k) is an ε^{-2} x c/2 submatrix with entries in {-10^k, +10^k}
  - U(k)_{i,j} = +10^k if the matched entry of x is 0, else U(k)_{i,j} = -10^k
- Bob's index i corresponds to an entry U(k)_{i,j}
- U' is such that A = U + U' = [U(1); U(2); ...; U(k); 0s]
  - U' is determined from x_{i+1}, ..., x_s (it cancels the blocks U(k+1), ..., U(log(cn)))
- A^T B is the i-th row of U(k)
- A ≈ U(k), since the entries of A grow geometrically
- ε^2 ||A||^2 ||B||^2, the squared error, is small, so most entries of the approximation to A^T B have the correct sign
11. Linear Regression
- The problem: min_X ||AX - B||
  - The minimizing X* is X* = A^- B, where A^- is the pseudo-inverse of A
- The algorithm:
  - Maintain S^T A and S^T B
  - Return the X̃ solving min_X ||S^T (AX - B)||
- Main claim: if A has rank k, there is an m = O(k ε^{-1} log(1/δ)) so that with probability at least 1 - δ,
  ||A X̃ - B|| ≤ (1 + ε) ||A X* - B||
  - That is, the relative error of X̃ is small
- A new lower bound (omitted here) shows our space is optimal
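A minimal numpy sketch of the sketched-regression algorithm (illustrative sizes; again an i.i.d. sign matrix stands in for the limited-independence one):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 5000, 10, 400                  # illustrative sizes
A = rng.standard_normal((n, d))
B = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Exact least squares: X* = A^- B
x_star, *_ = np.linalg.lstsq(A, B, rcond=None)

# Sketched least squares: solve min_X ||S^T (A X - B)||
S = rng.choice([-1.0, 1.0], size=(n, m))
x_tilde, *_ = np.linalg.lstsq(S.T @ A, S.T @ B, rcond=None)

# Residuals should satisfy the (1 + eps) relative-error guarantee
print(np.linalg.norm(A @ x_star - B), np.linalg.norm(A @ x_tilde - B))
```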
12. Regression Analysis
- Why should X̃ be good?
- First reduce to showing that ||A(X̃ - X*)|| is small
- Use the normal equation A^T A X* = A^T B
- This implies ||A X̃ - B||^2 = ||A X* - B||^2 + ||A(X̃ - X*)||^2 (spelled out below)
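The identity, written out (X* the exact minimizer, X̃ the sketched one); the cross term vanishes because the normal equation gives A^T (A X* - B) = 0:

```latex
\|A\widetilde{X} - B\|_F^2
  = \|(A X^* - B) + A(\widetilde{X} - X^*)\|_F^2
  = \|A X^* - B\|_F^2 + \|A(\widetilde{X} - X^*)\|_F^2
% the cross term 2 <A X^* - B, A(\widetilde{X} - X^*)> is zero,
% since A^T (A X^* - B) = 0 by the normal equation
```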
13. Regression Analysis, Continued
- Bound ||A(X̃ - X*)||^2 using several tools:
  - The sketched normal equation (S^T A)^T (S^T A) X̃ = (S^T A)^T (S^T B)
  - Our tail estimate for matrix product
  - Subspace JL: for m = O(k ε^{-1} log(1/δ)), S^T approximately preserves the lengths of all vectors in a k-dimensional subspace
- Overall, ||A(X̃ - X*)||^2 ≤ O(ε) ||A X* - B||^2
- Combined with ||A X̃ - B||^2 = ||A X* - B||^2 + ||A(X̃ - X*)||^2, this gives the (1 + ε) guarantee
14. Best Low-Rank Approximation
- For any matrix A and integer k, there is a matrix Ak of rank k that is closest to A among all matrices of rank k
  - Applications: LSI, PCA, recommendation systems, clustering
- The sketch S^T A holds information about A:
  there is a rank-k matrix Ãk in the rowspace of S^T A so that ||A - Ãk|| ≤ (1 + ε) ||A - Ak||
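A small numpy illustration of this fact (illustrative sizes; the best rank-k matrix in the rowspace of S^T A can be found by projecting A onto that rowspace and truncating to rank k):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, k, m = 500, 100, 5, 60              # illustrative sizes
A = rng.standard_normal((n, k)) @ rng.standard_normal((k, d)) \
    + 0.05 * rng.standard_normal((n, d))  # nearly rank-k matrix

S = rng.choice([-1.0, 1.0], size=(n, m))
Q, _ = np.linalg.qr((S.T @ A).T)          # orthonormal basis of rowspace(S^T A)

# Project A onto the rowspace, then truncate to rank k
U, s, Vt = np.linalg.svd(A @ Q, full_matrices=False)
Ak_rowspace = (U[:, :k] * s[:k]) @ Vt[:k] @ Q.T

U2, s2, Vt2 = np.linalg.svd(A, full_matrices=False)
Ak_best = (U2[:, :k] * s2[:k]) @ Vt2[:k]  # true best rank-k (truncated SVD)
print(np.linalg.norm(A - Ak_rowspace), np.linalg.norm(A - Ak_best))
```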
15. Low-Rank Approximation via Regression
- Why is there such an Ãk? Apply the regression results with A → Ak and B → A
- The X̃ minimizing ||S^T (Ak X - A)|| has
  ||Ak X̃ - A|| ≤ (1 + ε) min_X ||Ak X - A||
- But here X* = I, and X̃ = (S^T Ak)^- S^T A
- So the matrix Ak X̃ = Ak (S^T Ak)^- S^T A:
  - Has rank k
  - Is in the rowspace of S^T A
  - Is within 1 + ε of the smallest distance to A of any rank-k matrix
- Problem: this seems to require 2 passes
16. 1-Pass Algorithm
- Suppose R is a d x m sign matrix. By the regression results, transposed, the columnspace of AR contains a good rank-k approximation to A:
  the X minimizing ||AR X - A|| has ||AR X - A|| ≤ (1 + ε) ||A - Ak||
- Apply the regression results with A → AR and B → A, and X̃ minimizing ||S^T (AR X - A)||
- So X̃ = (S^T AR)^- S^T A has
  ||AR X̃ - A|| ≤ (1 + ε) ||AR X* - A|| ≤ (1 + ε)^2 ||A - Ak||
- The algorithm maintains AR and S^T A, and computes
  AR X̃ = AR (S^T AR)^- S^T A
- Compute the best rank-k approximation to AR X̃ in the columnspace of AR (AR X̃ has rank O(k ε^{-2}))
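A minimal end-to-end numpy sketch of the 1-pass algorithm (illustrative sketch sizes mR and mS, i.i.d. sign matrices; not the paper's exact constants):

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, k = 500, 200, 5
mR, mS = 4 * k, 16 * k                    # illustrative sketch sizes
A = rng.standard_normal((n, k)) @ rng.standard_normal((k, d)) \
    + 0.02 * rng.standard_normal((n, d))

R = rng.choice([-1.0, 1.0], size=(d, mR))
S = rng.choice([-1.0, 1.0], size=(n, mS))

# One pass over A: maintain the sketches AR (n x mR) and S^T A (mS x d)
AR = A @ R
StA = S.T @ A

# AR X~ = AR (S^T AR)^- S^T A lies in the columnspace of AR
ARX = AR @ np.linalg.pinv(S.T @ AR) @ StA

# Best rank-k approximation to AR X~ within the columnspace of AR
Q, _ = np.linalg.qr(AR)
U, s, Vt = np.linalg.svd(Q.T @ ARX, full_matrices=False)
Ak_tilde = Q @ (U[:, :k] * s[:k]) @ Vt[:k]

U2, s2, Vt2 = np.linalg.svd(A, full_matrices=False)
Ak = (U2[:, :k] * s2[:k]) @ Vt2[:k]       # true best rank-k
print(np.linalg.norm(A - Ak_tilde), np.linalg.norm(A - Ak))
```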
17. Concluding Remarks
- Space bounds are tight for product and regression
- We get 1 pass and O(k ε^{-2} (n + d ε^{-2}) log(nd)) space for low-rank approximation
  - The first sublinear 1-pass streaming algorithm with relative error
  - A new lower bound (omitted) shows our space is almost optimal
- Total work is O(N k ε^{-4}), where N is the number of non-zero entries of A
  - In the next talk, NDT give a faster algorithm for dense matrices
- Improve the dependence on ε
- Look at tight bounds for multiple passes
18. A Lower Bound
[Figure: the hard instance. Alice encodes her binary string x into matrix A: the top k rows consist of blocks, each k ε^{-1} columns wide, whose sign entries grow geometrically in magnitude from block to block (±10, ±100, ±1000, ±10000, ...); the remaining n - k rows are 0s. Bob holds index i and x_{i+1}, x_{i+2}, ...]
- The error is now dominated by the block of interest
- Bob also inserts a k x k identity submatrix into the block of interest
19. Lower Bound Details
[Figure: the block of interest (k rows by k ε^{-1} columns), a scaled identity submatrix P·I_k alongside it, and 0s elsewhere; the remaining n - k rows are 0s.]
- Bob inserts a k x k identity submatrix, scaled by a large value P
- Show that any rank-k approximation must err on all of the shaded region
- So a good rank-k approximation likely has the correct sign on Bob's entry