Singular Value Decomposition and Data Management - PowerPoint PPT Presentation

Slides: 71
Provided by: George935
Learn more at: https://www.cs.bu.edu
Transcript and Presenter's Notes

1
Singular Value Decomposition and Data Management
2
SVD - Detailed outline
  • Motivation
  • Definition - properties
  • Interpretation
  • Complexity
  • Case studies
  • Additional properties

3
SVD - Motivation
  • problem 1 text - LSI find concepts
  • problem 2 compression / dim. reduction

4
SVD - Motivation
  • problem 1 text - LSI find concepts

5
SVD - Motivation
  • problem 2 compress / reduce dimensionality

6
Problem - specs
  • $10^6$ rows; $10^3$ columns; no updates
  • random access to any cell(s); small error: OK

7
SVD - Motivation
8
SVD - Motivation
9
SVD - Definition
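  • $A_{[n \times m]} = U_{[n \times r]} \, \Lambda_{[r \times r]} \, (V_{[m \times r]})^T$
  • $A$: $n \times m$ matrix (e.g., $n$ documents, $m$ terms)
  • $U$: $n \times r$ matrix ($n$ documents, $r$ concepts)
  • $\Lambda$: $r \times r$ diagonal matrix (strength of each
    'concept') ($r$: rank of the matrix)
  • $V$: $m \times r$ matrix ($m$ terms, $r$ concepts)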

10
SVD - Properties
  • THEOREM [Press+92]: it is always possible to decompose
    matrix $A$ into $A = U \Lambda V^T$, where
  • $U$, $\Lambda$, $V$: unique (*)
  • $U$, $V$: column orthonormal (i.e., columns are unit
    vectors, orthogonal to each other)
  • $U^T U = I$; $V^T V = I$ ($I$: identity matrix)
  • $\Lambda$: singular values, non-negative and sorted in
    decreasing order
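
The properties listed above can be checked numerically. The following is a minimal sketch using NumPy's `np.linalg.svd`; the 4 x 3 matrix is a made-up example (not from the slides), and `full_matrices=False` requests the reduced form, whose factors have min(n, m) columns (equal to the rank r here, because this matrix has full rank).

```python
import numpy as np

# Hypothetical 4 x 3 matrix, chosen only for illustration.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])

# Reduced SVD: U is n x r, lam holds the diagonal of Lambda, Vt is V^T (r x m).
U, lam, Vt = np.linalg.svd(A, full_matrices=False)
r = len(lam)

reconstructs = np.allclose(U @ np.diag(lam) @ Vt, A)   # A = U Lambda V^T
u_orthonormal = np.allclose(U.T @ U, np.eye(r))        # U^T U = I
v_orthonormal = np.allclose(Vt @ Vt.T, np.eye(r))      # V^T V = I
# Singular values are non-negative and sorted in decreasing order.
sorted_nonneg = bool(np.all(lam >= 0) and np.all(np.diff(lam) <= 0))

print(reconstructs, u_orthonormal, v_orthonormal, sorted_nonneg)
```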

11
SVD - Example
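  • $A = U \Lambda V^T$ - example:

[Figure: document-term matrix (terms: data, inf., retrieval, brain, lung; document groups: CS, MD) shown as the product of $U$, $\Lambda$ and $V^T$]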
12
SVD - Example
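  • $A = U \Lambda V^T$ - example:

[Figure: same document-term matrix (terms: data, inf., retrieval, brain, lung; document groups: CS, MD), with the two columns of $U$ labeled CS-concept and MD-concept]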
13
SVD - Example
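  • $A = U \Lambda V^T$ - example:

[Figure: same document-term matrix (terms: data, inf., retrieval, brain, lung; document groups: CS, MD), with $U$ labeled as the doc-to-concept similarity matrix and its columns as CS-concept and MD-concept]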
14
SVD - Example
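  • $A = U \Lambda V^T$ - example:

[Figure: same document-term matrix (terms: data, inf., retrieval, brain, lung; document groups: CS, MD), with the first diagonal entry of $\Lambda$ labeled 'strength' of CS-concept]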
15
SVD - Example
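  • $A = U \Lambda V^T$ - example:

[Figure: same document-term matrix (terms: data, inf., retrieval, brain, lung; document groups: CS, MD), with $V^T$ labeled as the term-to-concept similarity matrix and its first row as CS-concept]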
16
SVD - Example
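  • $A = U \Lambda V^T$ - example:

[Figure: same document-term matrix (terms: data, inf., retrieval, brain, lung; document groups: CS, MD), with $V^T$ labeled as the term-to-concept similarity matrix and its first row as CS-concept]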
17
SVD - Detailed outline
  • Motivation
  • Definition - properties
  • Interpretation
  • Complexity
  • Case studies
  • Additional properties

18
SVD - Interpretation 1
  • 'documents', 'terms' and 'concepts':
  • $U$: document-to-concept similarity matrix
  • $V$: term-to-concept similarity matrix
  • $\Lambda$: its diagonal elements give the 'strength' of each
    concept
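
To make this reading concrete, here is a small sketch on a made-up document-term matrix with the same term names as the example slides (the counts, the document labels `CS1`..`MD3` and the helper `strong_entries` are assumptions for illustration, not the slides' data). Each singular vector pair should pick out one group of documents and the terms they use.

```python
import numpy as np

terms = ["data", "inf.", "retrieval", "brain", "lung"]
# Hypothetical counts: rows 0-3 are CS documents, rows 4-6 are MD documents.
A = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [1, 1, 1, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 3, 3],
    [0, 0, 0, 1, 1],
], dtype=float)
docs = [f"CS{i}" for i in range(1, 5)] + [f"MD{i}" for i in range(1, 4)]

U, lam, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the r non-zero singular values (r = rank of A).
r = int(np.sum(lam > 1e-10))
U, lam, Vt = U[:, :r], lam[:r], Vt[:r]

def strong_entries(vector, labels, tol=1e-8):
    """Labels whose weight in a singular vector is not numerically zero."""
    return [label for label, w in zip(labels, vector) if abs(w) > tol]

for c in range(r):
    print(f"concept {c}: strength {lam[c]:.2f}")
    print("  terms:", strong_entries(Vt[c], terms))    # row c of V^T = column c of V
    print("  docs: ", strong_entries(U[:, c], docs))   # column c of U
```

The signs of the singular vectors are arbitrary, which is why the helper looks at absolute values only.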

19
SVD - Interpretation 2
  • best axis to project on ('best' = min sum of
    squares of projection errors)

20
SVD - Motivation
21
SVD - Interpretation 2
  • SVD gives the best axis to project on ($v_1$):
    minimum RMS error

22
SVD - Interpretation 2
23
SVD - Interpretation 2
  • $A = U \Lambda V^T$ - example:

24
SVD - Interpretation 2
  • $A = U \Lambda V^T$ - example:

[Figure: the factorization, with the first singular value labeled as the variance ('spread') on the $v_1$ axis]

25
SVD - Interpretation 2
  • $A = U \Lambda V^T$ - example:
  • $U \Lambda$ gives the coordinates of the points on
    the projection axis

26
SVD - Interpretation 2
  • More details
  • Q: how exactly is dim. reduction done?

27
SVD - Interpretation 2
  • More details
  • Q: how exactly is dim. reduction done?
  • A: set the smallest singular values to zero
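
A minimal sketch of this step, on synthetic data (50 points in 3-d lying close to a line; the data, the seed and the choice `k = 1` are assumptions for illustration). It zeroes the smallest singular values to get a rank-k approximation, and also shows that the reduced coordinates $U_k \Lambda_k$ coincide with projecting the rows of $A$ on the first $k$ columns of $V$.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 50 points in 3-d, close to the line spanned by (2, 1, 0.5).
t = rng.normal(size=(50, 1))
A = t @ np.array([[2.0, 1.0, 0.5]]) + 0.01 * rng.normal(size=(50, 3))

U, lam, Vt = np.linalg.svd(A, full_matrices=False)

k = 1
lam_trunc = lam.copy()
lam_trunc[k:] = 0.0                      # set the smallest singular values to zero
A_k = U @ np.diag(lam_trunc) @ Vt        # rank-k approximation, still n x m

coords = U[:, :k] * lam[:k]              # U_k Lambda_k: n x k reduced coordinates
coords_via_V = A @ Vt[:k].T              # same coordinates, by projecting on V_k

rel_err = np.linalg.norm(A - A_k) / np.linalg.norm(A)
print(coords.shape, rel_err)
```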

28
SVD - Interpretation 2
29
SVD - Interpretation 2
30
SVD - Interpretation 2
31
SVD - Interpretation 2

32
SVD - Interpretation 2
  • Equivalent:
  • 'spectral decomposition' of the matrix

33
SVD - Interpretation 2
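  • Equivalent:
  • 'spectral decomposition' of the matrix

[Figure: the factorization with labeled parts: $u_1, u_2$ (columns of $U$), $\lambda_1, \lambda_2$ (diagonal of $\Lambda$), $v_1, v_2$ (columns of $V$)]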
34
SVD - Interpretation 2
  • Equivalent:
  • 'spectral decomposition' of the matrix

[Figure: $A$ ($n \times m$) $= \lambda_1 u_1 v_1^T + \lambda_2 u_2 v_2^T + \dots$]
35
SVD - Interpretation 2
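  • 'spectral decomposition' of the matrix:

[Figure: $A$ ($n \times m$) $= \lambda_1 u_1 v_1^T + \lambda_2 u_2 v_2^T + \dots$ ($r$ terms; each $u_i$ is $n \times 1$, each $v_i^T$ is $1 \times m$)]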
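
The spectral decomposition view can be sketched in a few lines: each singular triple contributes one rank-1 matrix (an n x 1 column times a 1 x m row), and the r terms add up to the original matrix. The 2 x 3 matrix below is a made-up example.

```python
import numpy as np

# Hypothetical 2 x 3 matrix, for illustration only.
A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

U, lam, Vt = np.linalg.svd(A, full_matrices=False)

# One rank-1 term per singular value: lambda_i * u_i (n x 1) * v_i^T (1 x m).
rank1_terms = [lam[i] * np.outer(U[:, i], Vt[i]) for i in range(len(lam))]
A_sum = sum(rank1_terms)

print(len(rank1_terms), np.allclose(A_sum, A))
```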
36
SVD - Interpretation 2
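  • approximation / dim. reduction:
  • by keeping the first few terms (Q: how many?)

[Figure: $A \approx \lambda_1 u_1 v_1^T + \lambda_2 u_2 v_2^T + \dots$, keeping only the first few terms; assume $\lambda_1 \ge \lambda_2 \ge \dots$]
To do the mapping you use $V^T$: $X' = V^T X$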
37
SVD - Interpretation 2
  • A (heuristic - [Fukunaga]): keep 80-90% of
    'energy' (= sum of squares of the $\lambda_i$'s)

[Figure: truncated sum of rank-1 terms; assume $\lambda_1 \ge \lambda_2 \ge \dots$]
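
One possible way to apply the energy heuristic is sketched below. The function name `choose_k`, the default threshold of 0.9 and the sample singular values are assumptions for illustration; the input is expected to be sorted in decreasing order, as the SVD returns it.

```python
import numpy as np

def choose_k(singular_values, energy_fraction=0.9):
    """Smallest k whose leading singular values retain the given energy fraction."""
    energy = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(energy) / energy.sum()
    # First index at which the cumulative energy reaches the threshold.
    k = int(np.searchsorted(cumulative, energy_fraction)) + 1
    return min(k, len(energy))

# Hypothetical singular values: two strong concepts plus some noise.
lam = np.array([9.64, 5.29, 1.0, 0.5])
k = choose_k(lam, 0.9)
print(k)
```

Here the first value alone holds about 76% of the energy and the first two about 99%, so the sketch keeps two terms.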
38
SVD - Interpretation 3
  • finds non-zero blobs in a data matrix

39
SVD - Interpretation 3
  • finds non-zero blobs in a data matrix

40
SVD - Interpretation 3
  • Drill: find the SVD, 'by inspection'!
  • Q: rank = ??

[Figure: a data matrix written as a product of three unknown factors, each marked ??]
41
SVD - Interpretation 3
  • A: rank = 2 (2 linearly independent rows/cols)

[Figure: the same matrix; the entries of the factors are still marked ??]
42
SVD - Interpretation 3
  • A: rank = 2 (2 linearly independent rows/cols)

[Figure: candidate factors; are their columns orthogonal??]
43
SVD - Interpretation 3
  • column vectors are orthogonal - but not unit
    vectors

44
SVD - Interpretation 3
  • and the singular values are:

[Figure: the diagonal matrix of singular values, zeros elsewhere]
45
SVD - Interpretation 3
  • A: SVD properties:
  • matrix product should give back matrix $A$
  • matrix $U$ should be column-orthonormal, i.e.,
    columns should be unit vectors, orthogonal to
    each other
  • ditto for matrix $V$
  • matrix $\Lambda$ should be diagonal, with positive values
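
The drill can be replayed on a made-up matrix with two non-zero 'blobs' (a 3 x 3 block and a 2 x 2 block of ones; this matrix is an assumption, the slide's own matrix is not preserved in the transcript). The factors are written down by inspection: each blob gives one indicator vector, scaled to unit length, and the scale factors are collected in the diagonal matrix. The checks are the properties listed above.

```python
import numpy as np

# Hypothetical block matrix with two 'blobs' of ones.
A = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

# By inspection: one unit-length indicator vector per blob.
blob1 = np.array([1, 1, 1, 0, 0]) / np.sqrt(3)
blob2 = np.array([0, 0, 0, 1, 1]) / np.sqrt(2)
U = np.column_stack([blob1, blob2])
V = np.column_stack([blob1, blob2])      # same vectors, since this A is symmetric
Lam = np.diag([3.0, 2.0])                # sqrt(3*3) and sqrt(2*2)

gives_back_A = np.allclose(U @ Lam @ V.T, A)
u_orthonormal = np.allclose(U.T @ U, np.eye(2))
v_orthonormal = np.allclose(V.T @ V, np.eye(2))
matches_library = np.allclose(np.linalg.svd(A, compute_uv=False)[:2], [3.0, 2.0])

print(gives_back_A, u_orthonormal, v_orthonormal, matches_library)
```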

46
SVD - Complexity
  • $O(n \cdot m \cdot m)$ or $O(n \cdot n \cdot m)$ (whichever is
    less)
  • less work, if we just want singular values
  • or if we want first $k$ left singular vectors
  • or if the matrix is sparse [Berry]
  • Implemented in any linear algebra package
    (LINPACK, MATLAB, S-PLUS, Mathematica ...)

47
Optimality of SVD
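  • Def: the Frobenius norm of an $n \times m$ matrix $M$ is
    $\|M\|_F = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{m} M[i,j]^2}$
  • (reminder) The rank of a matrix $M$ is the number
    of independent rows (or columns) of $M$
  • Let $A = U \Lambda V^T$ and $A_k = U_k \Lambda_k V_k^T$ (SVD
    approximation of $A$)
  • $A_k$ is an $n \times m$ matrix, $U_k$ an $n \times k$, $\Lambda_k$ $k \times k$, and $V_k$
    $m \times k$
  • Theorem [Eckart and Young]: among all $n \times m$
    matrices $C$ of rank at most $k$, we have that
    $\|A - A_k\|_F \le \|A - C\|_F$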
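
The Eckart-Young result says that the truncated SVD is the best rank-k approximation in the Frobenius norm. A numerical spot check (not a proof) is sketched below on a random matrix: the error of the truncated SVD equals the square root of the discarded squared singular values, and no competitor built from a random k-dimensional column space does better. The matrix sizes, the seed and the way competitors are generated are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(8, 5))              # hypothetical 8 x 5 data matrix
k = 2

U, lam, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(lam[:k]) @ Vt[:k]

err_svd = np.linalg.norm(A - A_k, "fro")
err_formula = np.sqrt(np.sum(lam[k:] ** 2))   # energy of the discarded terms

# Competitors: best least-squares fit of A inside a random k-dim column space.
competitor_errs = []
for _ in range(200):
    B = rng.normal(size=(8, k))
    coef, *_ = np.linalg.lstsq(B, A, rcond=None)
    C = B @ coef                               # rank at most k
    competitor_errs.append(np.linalg.norm(A - C, "fro"))

print(err_svd, err_formula, min(competitor_errs))
```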

48
Kleinberg's Algorithm
  • Main idea: in many cases, when you search the web
    using some terms, the most relevant pages may not
    contain these terms (or contain them only a few
    times)
  • Harvard: www.harvard.edu
  • Search engines: Yahoo, Google, AltaVista
  • Authorities and hubs

49
Kleinberg's algorithm
  • Problem definition: given the web and a query,
  • find the most 'authoritative' web pages for this
    query
  • Step 0: find all pages containing the query terms
    (root set)
  • Step 1: expand by one move forward and backward
    (base set)

50
Kleinberg's algorithm
  • Step 1: expand by one move forward and backward

51
Kleinberg's algorithm
  • on the resulting graph, give a high score (=
    'authorities') to nodes that many important nodes
    point to
  • give a high importance score ('hubs') to nodes that
    point to good 'authorities'

[Figure: hubs on one side pointing to authorities on the other]
52
Kleinberg's algorithm
  • observations
  • recursive definition!
  • each node (say, the $i$-th node) has both an
    authoritativeness score $a_i$ and a hubness score $h_i$

53
Kleinberg's algorithm
  • Let $E$ be the set of edges and $A$ be the adjacency
    matrix:
  • the $(i,j)$ entry is 1 if the edge from $i$ to $j$ exists
  • Let $h$ and $a$ be $n \times 1$ vectors with the
    'hubness' and 'authoritativeness' scores.
  • Then:

54
Kleinberg's algorithm
  • Then:
  • $a_i = h_k + h_l + h_m$
  • that is
  • $a_i = \sum_j h_j$ over all $j$ such that the edge $(j,i)$
    exists
  • or
  • $a = A^T h$

[Figure: nodes k, l, m pointing to node i]
55
Kleinberg's algorithm
  • symmetrically, for the 'hubness':
  • $h_i = a_n + a_p + a_q$
  • that is
  • $h_i = \sum_j a_j$ over all $j$ such that the edge $(i,j)$
    exists
  • or
  • $h = A a$

[Figure: node i pointing to nodes n, p, q]
56
Kleinberg's algorithm
  • In conclusion, we want vectors $h$ and $a$ such that
  • $h = A a$
  • $a = A^T h$
  • Recall properties
  • C(2): $A_{[n \times m]} \, v_{1\,[m \times 1]} = \lambda_1 \, u_{1\,[n \times 1]}$
  • C(3): $u_1^T A = \lambda_1 v_1^T$

57
Kleinberg's algorithm
  • In short, the solutions to
  • $h = A a$
  • $a = A^T h$
  • are, up to a scaling factor, the left- and right-
    singular vectors of the adjacency matrix $A$ (that is,
    eigenvectors of $A A^T$ and $A^T A$).
  • Starting from random $a$ and iterating, we'll
    eventually converge
  • (Q: to which of all the eigenvectors? why?)

58
Kleinberg's algorithm
  • (Q: to which of all the eigenvectors? why?)
  • A: to the ones of the strongest eigenvalue,
    because of property B(5):
  • B(5): $(A^T A)^k v \approx (\text{constant}) \, v_1$ for large $k$
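
The iteration can be sketched in plain Python. This is a minimal illustration rather than Kleinberg's published procedure: the 5-node edge list is made up, the scores start from all ones instead of random values, and both vectors are rescaled to unit length after every round so that the numbers stay bounded.

```python
def hits(edges, n, iterations=50):
    """Iterate a = A^T h and h = A a on a directed graph given as (src, dst) pairs."""
    h = [1.0] * n
    a = [1.0] * n
    for _ in range(iterations):
        # Authority of i: sum of hub scores of the nodes pointing to i.
        a = [0.0] * n
        for src, dst in edges:
            a[dst] += h[src]
        # Hubness of i: sum of authority scores of the nodes i points to.
        h = [0.0] * n
        for src, dst in edges:
            h[src] += a[dst]
        # Rescale both vectors to unit length.
        norm_a = sum(x * x for x in a) ** 0.5
        norm_h = sum(x * x for x in h) ** 0.5
        a = [x / norm_a for x in a]
        h = [x / norm_h for x in h]
    return h, a

# Hypothetical graph: nodes 0 and 1 point to 2 and 3; node 4 points to 3 only.
edges = [(0, 2), (0, 3), (1, 2), (1, 3), (4, 3)]
h, a = hits(edges, n=5)
print("hubs:       ", [round(x, 3) for x in h])
print("authorities:", [round(x, 3) for x in a])
```

In this toy graph node 3 ends up as the strongest authority (three in-links), and nodes 0 and 1 as the strongest hubs.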

59
Kleinberg's algorithm - results
  • E.g., for the query 'java':
  • 0.328 www.gamelan.com
  • 0.251 java.sun.com
  • 0.190 www.digitalfocus.com ('the java developer')

60
Kleinberg's algorithm - discussion
  • 'authority' score can be used to find 'similar
    pages' to page $p$
  • closely related to 'citation analysis', social
    networks / 'small world' phenomena

61
Google/PageRank algorithm
  • closely related: the Web is a directed graph of
    connected nodes
  • imagine a particle randomly moving along the
    edges (*)
  • compute its steady-state probabilities. That
    gives the PageRank of each page (the importance
    of this page)
  • (*) with occasional random jumps

62
PageRank Definition
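  • Assume a page $A$ and pages $T_1, T_2, \dots, T_m$ that
    point to $A$. Let $d$ be a damping factor, $PR(A)$ the
    PageRank of $A$, and $C(A)$ the out-degree of $A$. Then
    $PR(A) = (1-d) + d \left( \frac{PR(T_1)}{C(T_1)} + \dots + \frac{PR(T_m)}{C(T_m)} \right)$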

63
Google/PageRank algorithm
  • Compute the PR of each page: an identical problem to,
    given a Markov chain, compute the steady-state
    probabilities $p_1 \dots p_5$

[Figure: directed graph on nodes 1-5]
64
Computing PageRank
  • Iterative procedure
  • Also: navigate the web by randomly following links,
    or with prob. $p$ jump to a random page. Let $A$ be the
    adjacency matrix ($n \times n$) and $d_i$ the out-degree of page $i$
  • $\text{Prob}(A_i \to A_j) = p \, n^{-1} + (1-p) \, d_i^{-1} A_{ij}$
  • $A'[i,j] = \text{Prob}(A_i \to A_j)$
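
A minimal sketch of this iterative procedure, in plain Python. The 5-page adjacency matrix, the jump probability `p = 0.15` and the iteration count are assumptions for illustration, and the sketch assumes every page has at least one out-link (pages without out-links would need separate handling). Each entry of the transition matrix is p/n for the random jump plus (1-p)/d_i for following one of the d_i out-links.

```python
def pagerank(adj, p=0.15, iterations=100):
    """Steady-state probabilities of a random surfer with occasional random jumps."""
    n = len(adj)
    out_deg = [sum(row) for row in adj]      # assumed non-zero for every page
    # P[i][j] = Prob(page i -> page j) = p/n + (1-p) * adj[i][j] / d_i
    P = [[p / n + (1 - p) * adj[i][j] / out_deg[i] for j in range(n)]
         for i in range(n)]
    pr = [1.0 / n] * n                       # start from the uniform distribution
    for _ in range(iterations):
        # One step of the random walk: new_pr[j] = sum_i pr[i] * P[i][j]
        pr = [sum(pr[i] * P[i][j] for i in range(n)) for j in range(n)]
    return P, pr

# Hypothetical 5-page web: adj[i][j] = 1 if page i links to page j.
adj = [
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
]
P, pr = pagerank(adj)
print([round(x, 3) for x in pr])
```

In this toy graph page 2, which receives links from every other page, gets the highest score.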

65
Google/PageRank algorithm
  • Let $A$ be the transition matrix (= adjacency
    matrix, row-normalized: sum of each row = 1)

[Figure: directed graph on nodes 1-5 and its transition matrix]
66
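Google/PageRank algorithm
  • $A^T p = p$ (with $A$ row-normalized, the steady-state
    vector $p$ satisfies $p^T A = p^T$)

[Figure: directed graph on nodes 1-5, with the matrix equation written out]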
67
Google/PageRank algorithm
  • $A^T p = p$
  • thus, $p$ is the eigenvector of $A^T$ that corresponds to
    the highest eigenvalue (= 1, since the matrix is
    row-normalized)
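
For a row-normalized transition matrix, the steady-state vector $p$ satisfies $p^T A = p^T$, so it can be read off as the eigenvector of the transposed matrix with eigenvalue 1. The sketch below does this with `np.linalg.eig` on a made-up 3-state chain (the matrix is an assumption for illustration); the eigenvector is rescaled so that its entries sum to 1.

```python
import numpy as np

# Hypothetical 3-state chain; each row sums to 1 (row-normalized).
A = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])

eigvals, eigvecs = np.linalg.eig(A.T)
idx = int(np.argmax(eigvals.real))       # the largest eigenvalue is 1
p = eigvecs[:, idx].real
p = p / p.sum()                          # rescale into a probability vector

print(eigvals.real[idx], p)
```

For this chain the balance equations can also be solved by hand and give p = (4/9, 2/9, 3/9).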

68
Kleinberg/Google - conclusions
  • SVD helps in graph analysis:
  • hub/authority scores: strongest left- and right-
    singular vectors of the adjacency matrix
  • random walk on a graph: steady-state
    probabilities are given by the strongest
    eigenvector of the (transposed) transition matrix

69
Conclusions so far
  • SVD: a valuable tool
  • given a document-term matrix, it finds 'concepts'
    (LSI)
  • ... and can reduce dimensionality (KL)

70
Conclusions cont'd
  • ... and can find fixed-points or steady-state
    probabilities (Google / Kleinberg / Markov chains)
  • ... and can optimally solve over- and
    under-constrained linear systems (least squares)
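
As a small illustration of the least-squares use, the sketch below fits a line to four made-up points (an over-constrained system: 4 equations, 2 unknowns) through the SVD pseudo-inverse $V \Lambda^{-1} U^T$, and compares the result with NumPy's own `np.linalg.lstsq`. The data points are assumptions for illustration, and the division by the singular values assumes none of them is zero.

```python
import numpy as np

# Hypothetical data: fit y = c0 + c1 * x to four points.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.1, 4.9, 7.0])
M = np.column_stack([np.ones_like(x), x])     # 4 x 2 design matrix

U, lam, Vt = np.linalg.svd(M, full_matrices=False)
# Least-squares solution via the pseudo-inverse: V Lambda^{-1} U^T y
coef = Vt.T @ ((U.T @ y) / lam)

coef_ref, *_ = np.linalg.lstsq(M, y, rcond=None)
print(coef, coef_ref)
```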