Title: Dirichlet Component Analysis: Feature Extraction for Compositional Data
Slide 1: Dirichlet Component Analysis: Feature Extraction for Compositional Data
The 25th International Conference on Machine
Learning (ICML) 2008, Helsinki, Finland
- Hua-Yan Wang
- Peking University
- Qiang Yang
- Hong Kong University of Science and Technology
- Hong Qin
- SUNY at Stony Brook
- Hongbin Zha
- Peking University
Slide 2: storyline
- intro: general concepts and background
- a toy example: how our approach is motivated
- DCA: how it works
- experiment results: synthetic and real-world datasets
Slide 4: intro
- Feature extraction (dimensionality reduction) is useful in many respects:
  - avoids over-fitting of classification / regression models
  - improves domain understanding
  - reduces the computational expense of subsequent processing
  - facilitates visualization of high-dimensional datasets
Slide 5: intro
- We investigate feature extraction for compositional data.
  - Compositional data are normalized histograms representing the relative proportions of different ingredients in an object: positive, constant-sum, real vectors, i.e. points in a simplex.
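As a concrete sketch of the definition (hypothetical numbers, not from the talk), a composition is just a normalized vector of nonnegative measurements:

```python
import numpy as np

# Hypothetical raw measurements, e.g. masses of three ingredients.
counts = np.array([12.0, 30.0, 58.0])

# Normalizing gives a composition: positive entries with constant sum 1,
# i.e. a point on the 2-simplex.
composition = counts / counts.sum()

print(composition)        # relative proportions of the ingredients
print(composition.sum())  # 1 (up to floating point)
```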
Slide 6: storyline (recap)
Slide 7: a toy example
- Suppose we have collected some rock samples.
- In the lab, these samples are decomposed chemically, and we record the relative proportions of 3 major elements A, B, and C in each sample.
[Figure: ternary plot over elements A, B, and C; each point is a rock sample. Three peaks correspond to three substances that have fixed compositions in terms of A, B, and C. The major patterns (peaks) are explained by linear combinations of the variables (features).]
Slide 8: a toy example
[Figure: the same ternary plot; three peaks correspond to three substances with fixed compositions in terms of A, B, and C, and the major patterns (peaks) are explained by linear combinations of the variables (features).]
In PCA, we try to explain the major patterns (variance) separately by individual variables, instead of by their linear combinations (i.e., by diagonalizing the covariance matrix).
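The point about PCA can be checked directly: diagonalizing the empirical covariance yields new axes along which each major pattern of variance is carried by a single variable. A minimal sketch on synthetic data (my own toy setup, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical correlated 3-D data.
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.5, 0.0],
                                          [0.0, 1.0, 0.3],
                                          [0.0, 0.0, 0.7]])
Xc = X - X.mean(axis=0)                  # center the data

cov = Xc.T @ Xc / (len(Xc) - 1)          # empirical covariance
eigvals, eigvecs = np.linalg.eigh(cov)   # orthogonal eigenvectors

Z = Xc @ eigvecs                         # rotate onto the principal axes
# The covariance of Z is (numerically) diagonal: the variance is now
# explained separately by individual variables.
print(np.round(np.cov(Z.T), 6))
```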
Slide 9: a toy example
[Figure: the same ternary plot of rock samples over elements A, B, and C.]
Analogously, is it possible to find a new representation for this toy example in which the major patterns (peaks) are explained separately by individual variables instead of by their linear combinations?
Slide 10: a toy example
[Figure: the same ternary plot, with the question repeated.]
Slide 11: a toy example
[Figure: the ternary plot, annotated "How?"]
Sometimes we need to extract features for compositions, where the new features themselves have a natural interpretation as compositions. That is, we extract new compositions from old compositions.
Slide 12: storyline (recap)
Slide 13: DCA
- The (N-1)-simplex is denoted by Δ^(N-1).
- Variables in compositional data are referred to as components.
- We consider the family of linear projections that preserve the simplex constraint.
Slide 14: DCA
- To avoid degenerate cases, we further require the rows of the projection matrix to be constant-sum.
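A toy numerical illustration of these two constraints (my own example matrix, not the paper's): a nonnegative matrix whose columns each sum to 1 maps the simplex into a lower-dimensional simplex, and the constant-row-sum condition additionally rules out degenerate maps.

```python
import numpy as np

# Toy projection matrix: nonnegative, every column sums to 1, so
# x -> P @ x maps the 3-simplex into the 1-simplex.
P = np.array([[1.0, 0.5, 0.0, 0.5],
              [0.0, 0.5, 1.0, 0.5]])

x = np.array([0.1, 0.2, 0.3, 0.4])  # a composition over 4 components
y = P @ x

print(y)          # still nonnegative
print(y.sum())    # still sums to 1

# The non-degeneracy condition: rows have a constant sum
# (here N / M = 4 / 2 = 2), ruling out projections that collapse
# the data toward a single vertex.
print(P.sum(axis=1))  # [2. 2.]
```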
Slide 15: DCA
- So far, we've identified the family of non-degenerate simplex-to-simplex linear projections.
- However, such projections have an awkward property due to the simplex constraint.
Slide 16: DCA
- So we define a regularization operator to compensate for this effect.
Slide 17: DCA
- Principal Component Analysis (PCA)
  - Solution space: orthogonal projections
  - Objective: empirical Gaussian variance
- Dirichlet Component Analysis (DCA)
  - Solution space: balanced rearrangements
  - Objective: empirical Dirichlet precision
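For intuition on the objective: the Dirichlet precision is alpha_0 = sum_i alpha_i, and lower precision corresponds to data spread toward the simplex boundary. The paper estimates it with T. Minka's fixed-point code; the moment-matching estimate below is my own simpler stand-in:

```python
import numpy as np

# Moment-matching estimate of the Dirichlet precision alpha_0
# (a stand-in for Minka's fixed-point estimator).
# For Dirichlet(alpha): var_i = m_i * (1 - m_i) / (alpha_0 + 1).
def dirichlet_precision(X):
    m = X.mean(axis=0)
    v = X.var(axis=0) + 1e-12
    return float(np.mean(m * (1 - m) / v - 1))

rng = np.random.default_rng(0)
X = rng.dirichlet([5.0, 5.0, 5.0], size=500)  # true alpha_0 = 15
print(dirichlet_precision(X))                 # roughly 15 for this sample
```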
Slide 18: DCA
- Dirichlet component analysis: find the balanced rearrangement which, when applied to the data together with a regularization operator, minimizes the empirical Dirichlet precision.
- Optimization: there is no obvious efficient solution, due to
  - the simplex constraint
  - the regularization operator
- Our current implementation is based on a genetic algorithm.
Slide 19: DCA
[Flowchart of the genetic algorithm:]
1. Random initialization yields a population of balanced rearrangements.
2. Apply each candidate to the data, giving transformed data.
3. Apply the regularization operator to the transformed data.
4. Estimate the empirical Dirichlet precision (using T. Minka's code), giving a fitness score for each candidate.
5. The fitness scores serve as weights: sample the population and generate new candidates by linear combination, producing the new generation; repeat from step 2.
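The loop above can be sketched roughly as follows. All names are hypothetical: the real method optimizes over balanced rearrangements with the paper's regularization operator and Minka's estimator, while this toy version only keeps candidate columns stochastic and uses a moment-matching precision estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

def dirichlet_precision(Y):
    # Moment-matching stand-in for Minka's Dirichlet estimator.
    m, v = Y.mean(axis=0), Y.var(axis=0) + 1e-12
    return np.mean(m * (1 - m) / v - 1)

def fitness(P, X):
    Y = X @ P.T        # columns of P sum to 1, so rows of Y stay on the simplex
    return -dirichlet_precision(Y)   # DCA minimizes the precision

def random_candidate(M, N):
    P = rng.random((M, N))
    return P / P.sum(axis=0)         # make columns sum to 1

X = rng.dirichlet([2.0, 2.0, 2.0, 2.0], size=100)   # synthetic compositions
population = [random_candidate(3, 4) for _ in range(20)]

for generation in range(30):
    scores = np.array([fitness(P, X) for P in population])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # fitness scores serve as weights
    # Sample parents and generate new candidates by linear combination.
    population = [
        0.5 * (population[i] + population[j])
        for i, j in (rng.choice(len(population), size=2, p=weights)
                     for _ in range(len(population)))
    ]

best = max(population, key=lambda P: fitness(P, X))
```

Convex combinations of column-stochastic matrices remain column-stochastic, so recombination stays inside the (simplified) solution space without re-projection.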
Slide 20: storyline (recap)
Slide 21: experiment results (synthetic data)
Slide 22: (no transcript)
Slide 23: experiment results (real-world data)
[Figure: DCA results on real-world data.]
Slide 24: experiment results (real-world data)
- Bag-of-words data (20 Newsgroups dataset).
- We validate the effect of our method in avoiding over-fitting of classification models (we use a linear SVM), especially when the training set is extremely small.
Slide 25: (no transcript)
Slide 26: Thanks!
Slide 27: after lunch
- Coming up in S5 (3rd floor), 2:00-2:25 pm: multiple instance learning and learning with missing features / categorical features.