1 Introduction - PowerPoint PPT Presentation

Slides: 19
Provided by: cis5

Transcript and Presenter's Notes
1 Introduction
  • ICA has been proposed as a useful technique for
    finding meaningful directions in multivariate data
  • The objective function affects the form of the
    potential structure discovered
  • Here, the problem is the partitioning and analysis
    of sparse multivariate data
  • Prior knowledge is used to derive a computationally
    inexpensive ICA

2 Introduction, continued
  • Two complementary architectures
  • Skewness (asymmetry) is the right objective to
    optimize
  • The two tasks will be unified in a single
    algorithm
  • Result: fast convergence, with computational cost
    linear in the number of training points

[Diagram: the two complementary architectures - observed documents
are separated into document prototypes; observed words are separated
into topic-features]
3 Data Representation
  • Vector space representation: document = (t_1,
    t_2, . . . , t_T)^T
  • T = number of words in the dictionary (tens of
    thousands)
  • Elements are binary indicators or frequencies:
    a sparse representation
  • D = term × document matrix (T × N, N = number
    of documents)
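The sparse vector-space representation can be sketched with SciPy's sparse matrices; the toy corpus and stemmed vocabulary below are illustrative assumptions, not data from the slides:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy corpus of stemmed words; a real dictionary has
# tens of thousands of terms.
docs = [["space", "nasa", "orbit"],
        ["god", "church", "bibl"],
        ["nasa", "launch", "shuttl"]]

vocab = sorted({w for d in docs for w in d})       # dictionary (T terms)
index = {w: i for i, w in enumerate(vocab)}

rows, cols, vals = [], [], []
for n, doc in enumerate(docs):                     # column n = document n
    for w in doc:
        rows.append(index[w])                      # row = term id
        cols.append(n)
        vals.append(1)                             # binary indicator (or a frequency)

T, N = len(vocab), len(docs)
D = csr_matrix((vals, (rows, cols)), shape=(T, N)) # T x N term-document matrix
```

Only the nonzero entries are stored, which is what makes the representation practical at dictionary scale.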

4 Preprocessing
  • Assumption: observations are a noisy expansion of
    some denser group of latent topics
  • The number of clusters or topics is set a priori
  • The K-dimensional LSA space is used as the
    topic-concepts subspace
  • PCA may lose important data components (sparsity:
    infrequent but meaningful correlations), but this
    is less of a concern here
  • Reconstruction: D ≈ D_K = U E V^T
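The reconstruction D ≈ D_K = U E V^T is a rank-K truncated SVD; a minimal numerical sketch with toy sizes (a real corpus would use a sparse Lanczos solver rather than a full dense SVD):

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, K = 200, 50, 4                            # terms, documents, topics (a priori)
D = (rng.random((T, N)) < 0.05).astype(float)   # toy sparse binary term-doc matrix

# Rank-K LSA subspace: keep the K largest singular triplets.
U_f, s_f, Vt_f = np.linalg.svd(D, full_matrices=False)
U, E, Vt = U_f[:, :K], np.diag(s_f[:K]), Vt_f[:K, :]

D_K = U @ E @ Vt                                # reconstruction D ~ D_K = U E V^T
rel_err = np.linalg.norm(D - D_K) / np.linalg.norm(D)
```

D_K is the best rank-K approximation of D in the least-squares sense, so rel_err shrinks as K grows.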

5 Prototype Documents from a Corpus
  • Assumption: documents are a noisy linear mixture
    of (independent) document prototypes
  • Number of prototypes = number of topics;
    prototypes reside in LSA space (K dimensions)
  • Data: projection onto the right singular vectors
    with variance normalization,
    X(1) = E^-1 V^T D^T = U^T (a K × T matrix)
  • Task: find the mixing matrix W(1) and source
    documents S(1) so that X(1) = W(1)^T S(1)
    (S(1) a K × T matrix)
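Because D = U E V^T with orthonormal U and V, the projection X(1) = E^-1 V^T D^T reduces algebraically to U^T (and the dual projection E^-1 U^T D to V^T). A quick check on toy data with my own sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, K = 120, 40, 3
D = (rng.random((T, N)) < 0.08).astype(float)

U_f, s_f, Vt_f = np.linalg.svd(D, full_matrices=False)
U, E_inv, Vt = U_f[:, :K], np.diag(1.0 / s_f[:K]), Vt_f[:K, :]

# Variance-normalized projection of D^T onto the right singular vectors:
# X1 = E^-1 V^T D^T, which equals U^T (a K x T matrix).
X1 = E_inv @ Vt @ D.T
# Dual projection of D onto the left singular vectors: X2 = E^-1 U^T D = V^T.
X2 = E_inv @ U.T @ D
```

The identity means both architectures can work directly from the SVD factors without re-projecting the raw data.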

6 Prototype Documents from a Corpus, continued
  • Basis vectors of the topic space are assumed to
    differ; to separate the prototypes, find
    independent components. Words in documents are
    distributed in a positively skewed way
  • The search is restricted to skewed (asymmetric)
    distributions
  • After LSA, the unmixing matrix must be orthogonal
    (W(1)^-1 = W(1)^T)
  • Separating weights on D^T: W(1) E^-1 V^T
7 Prototype Documents from a Corpus, continued
  • Objective: a skewness measure (Fisher skewness)
  • Prior knowledge: small component means; the
    projection variance is restricted to unity.
    Simplified objective: G(s) = E[s^3] (the 3rd-order
    moment)
  • To prevent degenerate solutions, restrict
    w^T w = 1 at stationary points
  • Solve with gradient methods or iteratively
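One way to realize the iterative solution is a FastICA-style fixed point for the unit-variance third-moment objective: update each w toward E[x (w^T x)^2] and re-orthonormalize. The sketch below is my own, under assumed toy skewed sources and my own whitening step, not necessarily the paper's exact derivation:

```python
import numpy as np

def skew_ica(X, n_iter=50, seed=0):
    # Fixed-point search for K orthonormal directions maximizing the
    # third moment E[(w^T x)^3] of whitened data X (K x M samples).
    K, M = X.shape
    rng = np.random.default_rng(seed)
    W = np.linalg.qr(rng.standard_normal((K, K)))[0]  # random orthonormal start
    for _ in range(n_iter):
        S = W @ X                       # current source estimates
        W = (S ** 2) @ X.T / M          # gradient of E[s^3] per row: E[x s^2]
        u, _, vt = np.linalg.svd(W)
        W = u @ vt                      # symmetric re-orthonormalization
    return W

# Toy data: positively skewed independent sources, random mixing, whitening.
rng = np.random.default_rng(2)
S_true = rng.exponential(size=(3, 5000)) - 1.0        # centered, skewness > 0
X = rng.standard_normal((3, 3)) @ S_true              # mixed observations
X = X - X.mean(axis=1, keepdims=True)
d, Ev = np.linalg.eigh(X @ X.T / X.shape[1])
Xw = Ev @ np.diag(d ** -0.5) @ Ev.T @ X               # whitened data

W = skew_ica(Xw)
skews = ((W @ Xw) ** 3).mean(axis=1)                  # per-component 3rd moments
```

Because the sources are positive (positively skewed), the recovered components come out with positive third moments, consistent with the slide's remark that the output sign is relevant.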


8 Prototype Documents from a Corpus, continued
  • Sources are positive, so the skewness is positive
    (the output sign is relevant!)
  • K orthonormal projection directions: a matrix
    iteration
  • Similar to an approximate Newton-Raphson
    optimization (FastICA-type derivation with a
    small additional term)
  • Computational complexity: O(2K^2 T + KT + 4K^3)

9 Topic Features from Word Features
  • Assumption: terms are a noisy linear expansion of
    (independent) concepts (topics)
  • Data compression: X(2) = E^-1 U^T D = V^T
    (a K × N matrix)
  • Task: find the unmixing matrix W(2) and topic
    features S(2) so that X(2) = W(2)^T S(2)
    (S(2) a K × N matrix)
  • This time, use a clustering criterion

10 Topic Features from Word Features, continued
  • Separating weights on D: W(2) E^-1 U^T
  • Objective function (z_kn indicates the class
    of x_n)
  • Stochastic minimization: an EM-type algorithm
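Assuming the clustering criterion behaves like a hard-assignment (k-means-style) EM over the columns of X(2), with z_kn as the class indicators, a minimal sketch on toy data:

```python
import numpy as np

def cluster_em(X, K, n_iter=30, seed=0):
    # Hard-assignment EM sketch: z[n] is the class of column x_n
    # (the z_kn indicators), alternating assignment (E-step) and
    # centroid re-estimation (M-step).
    dim, N = X.shape
    rng = np.random.default_rng(seed)
    C = X[:, rng.choice(N, size=K, replace=False)].T.copy()  # K x dim centroids
    z = np.zeros(N, dtype=int)
    for _ in range(n_iter):
        d2 = ((X.T[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)  # N x K
        z = d2.argmin(axis=1)                                      # E-step
        for k in range(K):                                         # M-step
            if (z == k).any():
                C[k] = X[:, z == k].mean(axis=1)
    return C, z

# Toy data: two well-separated blobs of columns.
rng = np.random.default_rng(4)
X = np.hstack([rng.normal(0.0, 0.1, (2, 40)),
               rng.normal(5.0, 0.1, (2, 40))])
C, z = cluster_em(X, K=2)
```

The paper's criterion is minimized stochastically; the batch loop here is only the simplest stand-in for that EM-type alternation.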


11 Topic Features from Word Features, continued
  • Comparison: the approach resembles a set of
    binary classifiers
  • The algorithm maximizes a skewed, monotonically
    increasing function of topic s_k, so a skewed
    prior is appropriate
  • Variance is normalized after LSA; with independent
    topics, the source components align to orthonormal
    axes
  • Similar to the previous architecture


12 Combining the Tasks
  • A joint optimization problem
  • The information from the linear outputs and from
    the weights is complementary. Topic clustering:
    weight peaks give representative words,
    projections give clustering information. Document
    prototype search: weight peaks give clustering
    information, projections give index terms
  • Review the separating weights on D:
    W(2)^T E^-1 U^T



13 Combining the Tasks, continued
  • Whitening allows inspection but isn't practical;
    instead, normalize the variance along the K
    principal directions: D' = U E^-1 U^T D
  • Find a new unmixing matrix W(2') to maximize
    G(W(2')^T U^T D') = G(W(2')^T X(2)); hence
    W(2') = W(2)
  • Solve the relation W(2)^T U^T S(1) = W(1)^T U^T S(1)
  • Rewrite the objective: concatenate the data
    [U^T, V^T]
  • W(1) = W(2) = W
14 Combining the Tasks, continued
  • Resultant algorithm, O(2K^2(T+N) + K(T+N) + 4K^3):
    Inputs: D, K
    1. Decompose D with the Lanczos algorithm; retain
       the K first singular values; obtain U, E, V.
    2. Let X = [U^T, V^T].
    3. Iterate until convergence.
    Outputs: S ∈ R^(K×(T+N)), W ∈ R^(K×K)
  • S holds the T document prototypes and the N
    topic-features; W holds structure information on
    the identified topics in the corpus
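The listed steps can be sketched end to end. Full SVD stands in for the Lanczos decomposition (fine for a toy matrix), and the convergence loop uses a FastICA-style third-moment update as an assumed stand-in for the paper's exact iteration:

```python
import numpy as np

def unified_skew_ica(D, K, n_iter=50, seed=0):
    # 1. Decompose D; retain the K first singular triplets.
    U_f, s_f, Vt_f = np.linalg.svd(D, full_matrices=False)
    U, Vt = U_f[:, :K], Vt_f[:K, :]
    # 2. Concatenate the projections: X = [U^T, V^T], a K x (T+N) matrix.
    X = np.hstack([U.T, Vt])
    # 3. Iterate a skewness-maximizing fixed point until convergence.
    rng = np.random.default_rng(seed)
    W = np.linalg.qr(rng.standard_normal((K, K)))[0]
    for _ in range(n_iter):
        S = W @ X
        W = (S ** 2) @ X.T / X.shape[1]   # third-moment update E[x s^2]
        u, _, vt = np.linalg.svd(W)
        W = u @ vt                        # keep W orthonormal
    S = W @ X
    return S, W    # S[:, :T]: document prototypes; S[:, T:]: topic features

rng = np.random.default_rng(3)
D = (rng.random((150, 60)) < 0.06).astype(float)   # toy term-document matrix
S, W = unified_skew_ica(D, K=3)
```

The per-iteration cost is dominated by the K × (T+N) matrix products, which is where the claimed linearity in T and N comes from.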

15 Simulations
  • Simulation 1: newsgroup data ('sci.crypt',
    'sci.med', 'sci.space', 'soc.religion.christian')
  • [Table: the 10 most representative words per topic
    selected by the algorithm vs. the 10 most frequent
    words; conformal with human labeling]
  • [Table: Simulation 2, the 10 most representative
    words, using 5 topics (I-V) and 2 document classes
    ('sci.space', 'soc.religion.christian')]
16 Conclusions

[Diagram: dependency structure of the splitting in
simulation 2, relating 'sci.space' and
'soc.religion.christian' to the subtopics church (I),
religion (II), mission (III), design (IV), and
morality (V)]
  • Clustering and keyword identification by an ICA
    variant that maximizes skewness
  • Key assumption: an asymmetrical latent prior
  • The joint problem (D and D^T) is solved by a
    'spatio-temporal' ICA
  • The algorithm is linear in the number of
    documents, O(K^2 N)
  • Fast convergence (3-8 steps)
  • The potential number of topics can be greater than
    indicated by a human labeler: subtopics are
    discovered
  • Hierarchical partitioning is possible (recursive
    binary splits)

17 Further Work

[Plot: document projections onto axes 1, 2, 3;
x = 'sci.crypt', o = 'sci.space', ? = 'sci.med',
'soc.religion.christian']
  • Study links with other methods to improve
    flexibility
  • Or develop a mechanism to allow a more structured
    representation, in a mixed or hierarchical manner
  • For example, build model estimation into the
    algorithm
  • Relax the equal w_k norm assumption