L15: Microarray analysis (Classification)

Provided by: Vine95 (https://cseweb.ucsd.edu)

Transcript and Presenter's Notes


1
L15: Microarray analysis (Classification)
2
The Biological Problem
  • Two conditions that need to be differentiated
    (they receive different treatments).
  • E.g., ALL (Acute Lymphocytic Leukemia) vs. AML
    (Acute Myelogenous Leukemia).
  • Possibly, the set of genes over-expressed is
    different in the two conditions.

3
Geometric formulation
  • Each sample is a vector with dimension equal to
    the number of genes.
  • We have two classes of vectors (AML, ALL), and
    would like to separate them, if possible, with a
    hyperplane.
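
To make the geometric picture concrete, here is a minimal sketch (Python/NumPy, with made-up numbers) of how such data can be represented: each sample is a row vector whose dimension is the number of genes, with a label marking its class.

```python
import numpy as np

# Hypothetical expression matrix: 6 samples (rows) x 4 genes (columns).
# Real microarray studies measure thousands of genes per sample.
X = np.array([
    [2.1, 0.3, 5.0, 1.2],   # ALL sample
    [1.9, 0.4, 4.7, 1.0],   # ALL sample
    [2.3, 0.2, 5.2, 1.3],   # ALL sample
    [0.4, 3.1, 1.1, 4.0],   # AML sample
    [0.5, 2.8, 0.9, 4.2],   # AML sample
    [0.3, 3.3, 1.2, 3.9],   # AML sample
])
y = np.array([+1, +1, +1, -1, -1, -1])   # +1 = ALL, -1 = AML (arbitrary coding)

print(X.shape)   # (6, 4): each sample is a vector in gene-expression space
```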

4
Hyperplane properties
  • Given an arbitrary point x, what is the distance
    from x to the plane L?
  • D(x, L) = βᵀx − β0 (the signed distance when ||β|| = 1)
  • When are points x1 and x2 on different sides of
    the hyperplane?
  • Ans: If D(x1, L) · D(x2, L) < 0 (see the sketch below)

(Figure: a point x, the hyperplane L, and its offset β0)
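
A minimal sketch of the two questions above, assuming β has been scaled to unit norm so that βᵀx − β0 is exactly the signed distance; the hyperplane and points below are made up for illustration.

```python
import numpy as np

def signed_distance(x, beta, beta0):
    """D(x, L) = beta^T x - beta0; equal to the geometric distance when
    ||beta|| = 1, and proportional to it otherwise."""
    return beta @ x - beta0

# Hypothetical hyperplane and points.
beta = np.array([0.6, 0.8])     # unit-norm normal vector
beta0 = 1.0
x1 = np.array([2.0, 1.0])
x2 = np.array([0.0, 0.5])

d1 = signed_distance(x1, beta, beta0)
d2 = signed_distance(x2, beta, beta0)

# x1 and x2 lie on different sides of L exactly when the product is negative.
print(d1, d2, d1 * d2 < 0)
```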
5
Separating by a hyperplane
  • Input: A training set of +ve and -ve examples.
  • Recall that a hyperplane is represented by
  • {x : −β0 + β1x1 + β2x2 = 0} or,
  • (in higher dimensions), {x : βᵀx − β0 = 0}
  • Goal: Find a hyperplane that separates the two
    classes.
  • Classification: A new point x is +ve if it lies
    on the +ve side of the hyperplane (D(x, L) > 0),
    -ve otherwise.

6
Hyperplane separation
  • What happens if we have many choices of a
    hyperplane?
  • We try to maximize the distance of the points
    from the hyperplane.
  • What happens if the classes are not separable by
    a hyperplane?
  • We define a function based on the amount of
    mis-classification, and try to minimize it

7
Error in classification
  • Sample function: the sum of distances of all
    misclassified points.
  • Let yi = −1 for +ve example i, and yi = 1 otherwise.
  • The best hyperplane is one that minimizes D(β, β0).
  • Other definitions are also possible (see the sketch below).


(Figure: a hyperplane β with misclassified points x1 and x2)
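
The definition of D(β, β0) appears on the slide as an equation image that is not in the transcript. The sketch below uses one natural reading of "sum of distances of all misclassified points": it sums yi(βᵀxi − β0) over points lying on the wrong side, following the slide's sign convention (yi = −1 for +ve examples), and assumes ||β|| = 1 so the terms are distances.

```python
import numpy as np

def misclassification_error(X, y, beta, beta0):
    """Sum of (unsigned) distances of misclassified points from the hyperplane,
    using the slide's convention y_i = -1 for +ve examples, y_i = +1 for -ve.
    Under this convention a point is misclassified exactly when
    y_i * (beta^T x_i - beta0) > 0, and that quantity is its distance."""
    margins = y * (X @ beta - beta0)
    return margins[margins > 0].sum()

# Hypothetical toy data (rows = points); beta assumed unit norm.
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -0.5], [-2.0, -1.5]])
y = np.array([-1, -1, +1, +1])           # -1 marks the +ve class, per the slide
beta, beta0 = np.array([0.6, 0.8]), 0.0
print(misclassification_error(X, y, beta, beta0))   # 0.0: all points correct
```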
8
Restating Classification
  • The (supervised) classification problem can now
    be reformulated as an optimization problem.
  • Goal: Find the hyperplane (β, β0) that optimizes
    the objective D(β, β0).
  • No efficient algorithm is known for this problem,
    but a simple generic optimization can be applied:
  • Start with a randomly chosen (β, β0).
  • Move to a neighboring (β′, β0′) if D(β′, β0′) <
    D(β, β0) (see the sketch below).
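
A minimal sketch of the generic optimization just described: start from a random (β, β0) and move to a nearby candidate whenever it lowers D. The neighborhood (Gaussian perturbations), step size, and iteration count are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def error(X, y, beta, beta0):
    # Sum of distances of misclassified points (slide 7's convention).
    margins = y * (X @ beta - beta0)
    return margins[margins > 0].sum()

def local_search(X, y, steps=1000, step_size=0.1):
    """Start from a random (beta, beta0); accept a random nearby candidate
    whenever it strictly decreases the error D."""
    beta, beta0 = rng.normal(size=X.shape[1]), 0.0
    best = error(X, y, beta, beta0)
    for _ in range(steps):
        cand_beta = beta + step_size * rng.normal(size=beta.shape)
        cand_beta0 = beta0 + step_size * rng.normal()
        cand = error(X, y, cand_beta, cand_beta0)
        if cand < best:                       # move to the better neighbor
            beta, beta0, best = cand_beta, cand_beta0, cand
    return beta, beta0, best
```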

9
Gradient Descent
  • The function D(β) defines the error.
  • We follow an iterative refinement: in each step,
    refine β so that the error is reduced.
  • Gradient descent is an approach to such iterative
    refinement.

(Figure: the error D(β) plotted as a function of β)
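
As one concrete instance, a sketch of gradient descent applied to the error of slide 7: over the currently misclassified set, the gradient of D with respect to β is Σ yi·xi and with respect to β0 is −Σ yi, so each step moves against that gradient. The random initialization, learning rate, and stopping rule are assumptions for illustration.

```python
import numpy as np

def gradient_descent(X, y, rate=0.1, iters=100):
    """Gradient descent on D(beta, beta0) = sum of y_i*(beta^T x_i - beta0)
    over currently misclassified points (slide 7's sign convention)."""
    rng = np.random.default_rng(0)
    beta = rng.normal(size=X.shape[1])       # random start
    beta0 = 0.0
    for _ in range(iters):
        miss = y * (X @ beta - beta0) > 0     # currently misclassified points
        if not miss.any():
            break                             # nothing left to fix
        grad_beta = (y[miss, None] * X[miss]).sum(axis=0)   # dD/dbeta
        grad_beta0 = -y[miss].sum()                          # dD/dbeta0
        beta -= rate * grad_beta
        beta0 -= rate * grad_beta0
    return beta, beta0
```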
10
Rosenblatt's perceptron learning algorithm
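
The algorithm itself appears on this slide as an image that is not in the transcript; below is a minimal sketch of the standard Rosenblatt update rule (cycle through the examples and, for each misclassified point, nudge (β, β0) toward classifying it correctly). It uses the usual coding yi = +1 for +ve examples, and the learning rate is an arbitrary choice.

```python
import numpy as np

def perceptron(X, y, rate=1.0, max_epochs=100):
    """Rosenblatt-style perceptron learning.
    Here y_i = +1 for +ve examples and -1 for -ve ones, and a point is
    treated as misclassified when y_i * (beta^T x_i - beta0) <= 0."""
    beta = np.zeros(X.shape[1])
    beta0 = 0.0
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(X, y):
            if yi * (xi @ beta - beta0) <= 0:   # misclassified: update
                beta = beta + rate * yi * xi
                beta0 = beta0 - rate * yi
                updated = True
        if not updated:                          # converged on separable data
            break
    return beta, beta0

# Classification rule from the next slide: class 1 if f(x) > 0, else class 2.
def classify(x, beta, beta0):
    return 1 if x @ beta - beta0 > 0 else 2
```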
11
Classification based on perceptron learning
  • Use Rosenblatt's algorithm to compute the
    hyperplane L(β, β0).
  • Assign x to class 1 if f(x) > 0, and to class 2
    otherwise.

12
Perceptron learning
  • If many solutions are possible, it does not choose
    between them.
  • If the data is not linearly separable, it does not
    terminate, and this is hard to detect.
  • The time to convergence is not well understood.

13
Linear Discriminant Analysis
  • Provides an alternative approach to
    classification with a linear function.
  • Project all points, including the means, onto
    the vector β.
  • We want to choose β such that:
  • the difference of the projected means is large, and
  • the variance within each group is small.

14
LDA (contd.)
  • What is the projection of a point x onto β?
  • Ans: βᵀx
  • What is the distance between projected means?

15
LDA (contd.)
Fisher Criterion
16
LDA
Therefore, a simple computation (a matrix inverse)
is sufficient to compute the best separating
hyperplane (see the sketch below).
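
The Fisher criterion and its solution are shown on the slides as equation images; the sketch below implements the standard two-class closed form, β = S_W⁻¹(μ1 − μ2) with S_W the pooled within-class scatter, which is the matrix-inverse computation referred to above. The midpoint threshold β0 is an assumption of mine, not from the slides.

```python
import numpy as np

def fisher_lda(X1, X2):
    """Two-class Fisher LDA: the direction maximizing
    (difference of projected means)^2 / (within-class variance)
    is beta = S_W^{-1} (mu1 - mu2).  Assumes S_W is invertible."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter: sum of the two classes' scatter matrices.
    S1 = (X1 - mu1).T @ (X1 - mu1)
    S2 = (X2 - mu2).T @ (X2 - mu2)
    Sw = S1 + S2
    beta = np.linalg.solve(Sw, mu1 - mu2)     # S_W^{-1}(mu1 - mu2)
    beta0 = beta @ (mu1 + mu2) / 2.0          # midpoint threshold (assumed)
    return beta, beta0
```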
17
Maximum Likelihood discrimination
  • Suppose we knew the distribution of points in
    each class.
  • We can compute Pr(x | i) for all classes i, and
    take the maximum.

18
ML discrimination recipe
  • We know the distribution for each class, but not
    the parameters.
  • Estimate the mean and variance for each class.
  • For a new point x, compute the discrimination
    function gi(x) for each class i.
  • Choose argmax_i gi(x) as the class for x (see the
    sketch below).
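
A minimal sketch of this recipe, assuming each class is modeled with a Gaussian per gene (diagonal covariance, an assumption made here for simplicity): estimate the mean and variance per class, use the Gaussian log-density as gi(x), and return argmax_i gi(x).

```python
import numpy as np

def fit_class_gaussians(X, y):
    """Estimate a per-class mean and (diagonal) variance from training data."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-9)  # avoid zero variance
    return params

def g(x, mean, var):
    """Gaussian log-density, used as the discrimination function g_i(x)."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def ml_classify(x, params):
    """Choose argmax_i g_i(x) as the class of x."""
    return max(params, key=lambda c: g(x, *params[c]))
```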

19
ML discrimination
  • Suppose all the points were in 1 dimension, and
    all classes were normally distributed.

(Figure: two normal class densities with means μ1 and μ2 along the x-axis)
20
ML discrimination (multi-dimensional case)
Not part of the syllabus.
21
Dimensionality reduction
  • Many genes have highly correlated expression
    profiles.
  • By discarding some of the genes, we can greatly
    reduce the dimensionality of the problem.
  • There are other, more principled ways to do such
    dimensionality reduction.
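
A small illustration of the simplest version of this idea, with an assumed correlation cutoff: greedily drop any gene whose expression profile is highly correlated with a gene already kept. PCA, on the following slides, is one of the more principled alternatives.

```python
import numpy as np

def drop_correlated_genes(X, cutoff=0.95):
    """Greedily keep genes (columns of X) whose absolute correlation with
    every previously kept gene is below the cutoff; discard the rest."""
    corr = np.abs(np.corrcoef(X, rowvar=False))  # gene-by-gene correlation
    kept = []
    for g in range(X.shape[1]):
        if all(corr[g, k] < cutoff for k in kept):
            kept.append(g)
    return X[:, kept], kept
```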

22
Principal Components Analysis
  • Consider the expression values of 2 genes over 6
    samples.
  • Clearly, the expression of the two genes is
    highly correlated.
  • Projecting all the genes on a single line could
    explain most of the data.
  • This is a generalization of discarding the gene.

23
PCA
  • Suppose all of the data were to be reduced by
    projecting onto a single line φ through the mean.
  • How do we select the line φ?

(Figure: 2-D data points with mean m and a candidate projection line φ)
24
PCA (contd.)
  • Let each point xk map to x′k. We want to minimize
    the error.
  • Observation 1: Each point xk maps to
    x′k = m + (φᵀ(xk − m)) φ

(Figure: a point xk and its projection x′k onto the line through m in the direction φ)
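
A minimal sketch of this projection, together with the standard way of choosing the line φ (not spelled out in the transcript): φ is taken as the top eigenvector of the covariance matrix, which minimizes the total squared error Σk ||xk − x′k||², and each point is mapped to x′k = m + (φᵀ(xk − m)) φ.

```python
import numpy as np

def pca_first_component(X):
    """Return the mean m and the unit direction phi (top eigenvector of the
    covariance matrix) that minimizes the squared reconstruction error."""
    m = X.mean(axis=0)
    C = np.cov(X - m, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    phi = eigvecs[:, -1]                       # direction of largest variance
    return m, phi

def project_onto_line(X, m, phi):
    """Map each x_k to x'_k = m + (phi^T (x_k - m)) phi."""
    coeffs = (X - m) @ phi                     # scalar coordinate along phi
    return m + np.outer(coeffs, phi)
```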