1
Dirichlet Component Analysis: Feature Extraction
for Compositional Data
The 25th International Conference on Machine
Learning (ICML) 2008, Helsinki, Finland
  • Hua-Yan Wang
  • Peking University
  • Qiang Yang
  • Hong Kong University of Science and Technology
  • Hong Qin
  • SUNY at Stony Brook
  • Hongbin Zha
  • Peking University

2
storyline
  • intro
  • general concepts and background
  • a toy example
  • how is our approach motivated
  • DCA
  • how does it work
  • experiment results
  • synthetic and real-world datasets

3
storyline
  • intro
  • general concepts and background
  • a toy example
  • how is our approach motivated
  • DCA
  • how does it work
  • experiment results
  • synthetic and real-world datasets

4
intro
  • Feature extraction (dimensionality reduction) is
    useful in many ways
  • avoid over-fitting of classification / regression
    models
  • improve domain understanding
  • reduce the computational expense of subsequent
    processing
  • facilitate visualization of high-dimensional
    datasets

5
intro
  • We investigate feature extraction for
    compositional data.
  • compositional data = normalized histograms,
    representing the relative proportions of different
    ingredients in an object

positive, constant-sum real vectors, i.e., points in
a simplex (illustrated below)
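A minimal illustration in Python (hypothetical
numbers, not from the talk): normalizing a raw
histogram yields a composition, i.e., a point in the
simplex.

import numpy as np

# Hypothetical ingredient counts for one object (a raw, unnormalized histogram).
counts = np.array([42.0, 7.0, 21.0])

# Normalizing gives a composition: positive entries that sum to 1,
# i.e., a point in the 2-simplex.
composition = counts / counts.sum()
print(composition)        # [0.6 0.1 0.3]
print(composition.sum())  # 1.0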
6
storyline
  • intro
  • general concepts and background
  • a toy example
  • how is our approach motivated
  • DCA
  • how does it work
  • experiment results
  • synthetic and real-world datasets

7
a toy example
  • Suppose we have collected some rock samples.
  • In the lab, these samples are decomposed by some
    chemical approach, and we record the relative
    proportions of 3 major elements A, B, and C in
    each sample.

3 peaks correspond to 3 substances that have fixed
compositions in terms of A, B, and C.
The major patterns (peaks) are explained by
linear combinations of the variables (features).
[Ternary plot over components A, B, and C; each point
is a rock sample, showing three density peaks]
8
a toy example
[Ternary plot over components A, B, and C; each point
is a rock sample, showing three density peaks]
3 peaks correspond to 3 substances that have fixed
compositions in terms of A, B, and C.
The major patterns (peaks) are explained by
linear combinations of the variables (features).
In PCA, we try to explain the major patterns
(variance) separately by individual variables,
instead of by their linear combinations
(diagonalizing the covariance matrix).
[Figure: PCA projection of the same data]
9
a toy example
[Ternary plot over components A, B, and C; each point
is a rock sample, showing three density peaks]
3 peaks correspond to 3 substances that have fixed
compositions in terms of A, B, and C.
The major patterns (peaks) are explained by
linear combinations of the variables (features).
Analogously, is it possible to find a new
representation for this toy example, in which the
major patterns (peaks) are explained separately
by individual variables instead of their linear
combinations?
10
a toy example
[Ternary plot over components A, B, and C; each point
is a rock sample]
Analogously, is it possible to find a new
representation for this toy example, in which the
major patterns (peaks) are explained separately
by individual variables instead of their linear
combinations?
11
a toy example
How?
[Ternary plot over components A, B, and C; each point
is a rock sample]
Sometimes we need to extract features from
compositions such that the new features also have a
natural interpretation as compositions. That is, we
extract new compositions from old compositions.
12
storyline
  • intro
  • general concepts and background
  • a toy example
  • how is our approach motivated
  • DCA
  • how does it work
  • experiment results
  • synthetic and real-world datasets

13
DCA
  • The (N-1)-simplex is denoted by
    $\Delta^{N-1} = \{\, x \in \mathbb{R}^N : x_i > 0,\ \sum_{i=1}^{N} x_i = 1 \,\}$
  • Variables in compositional data are referred to
    as components
  • the family of linear projections that preserve
    the simplex constraint, i.e., that map the simplex
    into a simplex (see the sketch below)
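This family can be written explicitly. A minimal
sketch, assuming the projection is a K x N matrix M
acting as y = Mx (the orientation is an assumption;
the paper may use the transposed, row-vector
convention): nonnegative entries keep each new
component nonnegative, and unit column sums preserve
the total sum of 1.

\[
  \mathcal{M} = \Big\{\, M \in \mathbb{R}^{K \times N} : m_{ji} \ge 0,\ \sum_{j=1}^{K} m_{ji} = 1 \ \text{for all } i \,\Big\},
  \qquad x \in \Delta^{N-1} \Rightarrow Mx \in \Delta^{K-1}.
\]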

14
DCA
  • To avoid degenerate cases, such as a projection
    that collapses every composition onto a single
    new component,
  • we further require the rows of the projection
    matrix to have constant sum (see below)
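In the same (assumed) notation, balance adds constant
row sums. The constant is forced by the constraints:
the N columns each sum to 1, so the total mass is N,
and each of the K rows must sum to N/K.

\[
  \sum_{j=1}^{K} m_{ji} = 1 \ \ \forall i, \qquad
  \sum_{i=1}^{N} m_{ji} = \frac{N}{K} \ \ \forall j, \qquad m_{ji} \ge 0 .
\]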

15
DCA
  • So far, we've identified the family of
    simplex-to-simplex non-degenerate linear
    projections.
  • However, such projections have an awkward
    property due to the simplex constraint

16
DCA
  • So we define a regularization operator to
    compensate for this effect

17
DCA
  • Principal Component Analysis (PCA)
  • Solution space: orthogonal projections
  • Objective: empirical Gaussian variance
  • Dirichlet Component Analysis (DCA)
  • Solution space: balanced rearrangements
  • Objective: empirical Dirichlet precision
    (defined below)
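For reference (standard definitions, not specific to
the paper): the Dirichlet density on the simplex and
its precision s, whose empirical estimate DCA
optimizes. Small s spreads mass toward the corners of
the simplex; large s concentrates it near the mean.

\[
  p(x \mid \alpha) = \frac{\Gamma\big(\sum_{k=1}^{K} \alpha_k\big)}{\prod_{k=1}^{K} \Gamma(\alpha_k)} \prod_{k=1}^{K} x_k^{\alpha_k - 1},
  \qquad s = \sum_{k=1}^{K} \alpha_k .
\]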

18
DCA
  • Dirichlet component analysis
  • Find the balanced rearrangement which, when
    applied to the data together with a regularization
    operator, minimizes the empirical Dirichlet
    precision.
  • optimization: no obvious efficient solution, due
    to
  • the simplex constraint
  • the regularization operator
  • Our current implementation is based on a genetic
    algorithm.

19
DCA
the genetic-algorithm loop (flowchart):
  • random initialization → a population of balanced
    rearrangements
  • apply each candidate to the data → transformed
    data
  • regularization → regularized transformed data
  • estimate the empirical Dirichlet precision (using
    T. Minka's code) → a fitness score for each
    candidate
  • fitness scores serve as weights: sample the
    population and generate new candidates by linear
    combination → the new generation
(a Python sketch of this loop follows)
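A minimal Python sketch, for illustration only: the
Sinkhorn-style balancing, the moment-based precision
estimate (a stand-in for T. Minka's MLE code), and
all names are assumptions, and the regularization
operator from the flowchart is omitted.

import numpy as np

rng = np.random.default_rng(0)

def random_rearrangement(n_old, n_new, iters=200):
    """Random candidate, pushed toward the balanced constraints
    (columns sum to 1, rows sum to n_old/n_new) by Sinkhorn-style
    scaling. An approximation, not the paper's construction."""
    M = rng.random((n_new, n_old))
    for _ in range(iters):
        M /= M.sum(axis=0, keepdims=True)                    # columns sum to 1
        M *= (n_old / n_new) / M.sum(axis=1, keepdims=True)  # rows sum to n_old/n_new
    return M

def dirichlet_precision(Y, eps=1e-9):
    """Moment-based estimate of the Dirichlet precision s = sum(alpha):
    for a Dirichlet, Var(y_k) = m_k (1 - m_k) / (s + 1)."""
    m, v = Y.mean(axis=0), Y.var(axis=0) + eps
    return float(np.mean(m * (1.0 - m) / v - 1.0))

def fitness(M, X):
    Y = X @ M.T                        # apply candidate to data: y = M x
    Y /= Y.sum(axis=1, keepdims=True)  # guard against numerical drift off the simplex
    # NOTE: the paper's regularization operator would be applied to Y here.
    return -dirichlet_precision(Y)     # lower precision = higher fitness

def dca_ga(X, n_new, pop=40, gens=100):
    n_old = X.shape[1]
    population = [random_rearrangement(n_old, n_new) for _ in range(pop)]
    for _ in range(gens):
        scores = np.array([fitness(M, X) for M in population])
        w = np.exp(scores - scores.max())
        w /= w.sum()                   # fitness scores serve as sampling weights
        # Convex combinations of balanced matrices stay balanced
        # (both constraints are linear), so children remain feasible.
        children = []
        for _ in range(pop):
            i, j = rng.choice(pop, size=2, p=w)
            t = rng.random()
            children.append(t * population[i] + (1.0 - t) * population[j])
        population = children
    return max(population, key=lambda M: fitness(M, X))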
20
storyline
  • intro
  • general concepts and background
  • a toy example
  • how is our approach motivated
  • DCA
  • how does it work
  • experiment results
  • synthetic and real-world datasets

21
experiment results (synthetic data)
22
[figure-only slide; no transcript]
23
experiment results (real-world data)
[Figure: DCA result]
24
experiment results (real-world data)
  • bag-of-words data (the 20 Newsgroups dataset)
  • validate the effect of our method in avoiding
    over-fitting of classification models (we use a
    linear SVM), especially when the training set is
    extremely small (see the sketch below)
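A sketch of this protocol (hypothetical throughout:
dca_ga refers to the sketch on the earlier slide, and
the split sizes are assumptions):

from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def evaluate(X, y, M, train_size=20):
    """X: bag-of-words compositions (rows sum to 1); y: newsgroup
    labels; M: a balanced rearrangement, e.g. from dca_ga above."""
    X_dca = X @ M.T                  # DCA features: fewer components, still compositional
    Xtr, Xte, ytr, yte = train_test_split(
        X_dca, y, train_size=train_size, stratify=y)
    clf = LinearSVC().fit(Xtr, ytr)  # linear SVM, as in the talk
    return accuracy_score(yte, clf.predict(Xte))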

25
[figure-only slide; no transcript]
26
Thanks!
27
coming up after lunch
S5 (3rd floor), 2:00–2:25 pm: multiple-instance
learning and learning with missing features,
categorical features