Gene Shaving - PowerPoint PPT Presentation

About This Presentation
Title:

Gene Shaving

Description:

Title: An example of HDLSS: Microarray data Author: rizem Last modified by: aidong_Home_desktop Created Date: 5/4/2001 2:08:15 AM Document presentation format – PowerPoint PPT presentation

Slides: 14
Provided by: riz84
Learn more at: https://cse.buffalo.edu

Transcript and Presenter's Notes


1
Gene Shaving Applying PCA
  • Use PCA to identify groups of genes that serve
    as the informative genes for classifying
    samples.
  • Gene shaving is also a method for clustering
    genes and sample cells. Unlike classic
    clustering, however, one gene can belong to
    more than one cluster.

2
Features of Gene Groups
  • The genes in each cluster behave in a similar
    manner, which suggests similar or related
    function among genes
  • The cluster centroid shows high variance across
    the samples, which indicates the potential of
    this cluster to distinguish sample classes
  • The groups are as uncorrelated with one another
    as possible, which encourages seeking groups
    with different specificities.

3
Motivation and Details
  • We favor subsets of genes that all behave in a
    similar manner (coherence) and all show large
    variance across the cell lines.
  • Given an expression array, we seek a sequence of
    nested gene clusters S_k of size k. Each S_k has
    the property that the variance of its cluster
    mean is maximum over all clusters of size k.
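
To make the criterion concrete, here is a minimal sketch that scores a candidate cluster by the variance of its mean across samples and finds the best size-k cluster by exhaustive search. The function names are hypothetical, and the brute-force search is only an illustration for tiny arrays; gene shaving itself is a heuristic that avoids this enumeration.

```python
from itertools import combinations

import numpy as np


def cluster_mean_variance(X, rows):
    """Variance across samples of the mean expression of the chosen rows."""
    return X[list(rows)].mean(axis=0).var()


def best_cluster_brute_force(X, k):
    """Exhaustively find the size-k set of rows with maximal cluster-mean variance.

    Illustration only: the number of subsets grows combinatorially with k.
    """
    return max(
        combinations(range(X.shape[0]), k),
        key=lambda rows: cluster_mean_variance(X, rows),
    )


# Toy array: 4 genes (rows) by 4 samples (columns), rows roughly centered.
X = np.array([
    [2.0, -2.0, 2.0, -2.0],   # gene 0: high variance
    [1.8, -2.1, 2.2, -1.9],   # gene 1: coherent with gene 0
    [-2.0, 2.0, -2.0, 2.0],   # gene 2: high variance, opposite sign
    [0.1, 0.0, -0.1, 0.0],    # gene 3: nearly flat
])
print(best_cluster_brute_force(X, 2))
```

Genes 0 and 2 each have high variance, but their average cancels out, so the pair scores poorly. The criterion rewards coherence and variance together.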

4
The gene shaving approach finds the linear
combination of genes having maximal variation
among samples. This linear combination of genes
is viewed as a super gene. The genes having the
lowest correlation with the super gene are
removed (shaved). The process is continued until
the subset of genes contains only one gene. This
process produces a sequence of gene blocks, each
containing genes that are similar to one another
and that display large variance across samples.
  • A statistical approach
  • Identifies subsets of genes with coherent
    expression patterns and large variation across
    conditions
  • A gene may belong to more than one cluster
  • Can be either unsupervised or supervised
5
Gene Shaving Algorithm-1
  • STEP 1. Start with the entire expression data X,
    each row centered to have zero mean.
  • STEP 2. Compute the leading principal component
    of the rows of X.
  • STEP 3. Shave off the proportion alpha (typically
    10%) of the rows having the smallest inner
    product (in absolute value) with the leading
    principal component.
  • STEP 4. Repeat steps 2 and 3 until only one gene
    remains.
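
Steps 1 to 4 can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation. The function names are hypothetical, rows are assumed to be genes and columns samples, inner products are ranked by absolute value (an assumption), and at least one row is dropped per pass so that the loop always terminates.

```python
import numpy as np


def leading_pc(X):
    """Leading principal component of the rows of X (one value per sample)."""
    # For row-centered X = U D V^T, the first right singular vector spans the
    # direction in sample space along which the rows vary most.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[0]


def shave_sequence(X, alpha=0.10):
    """Return nested arrays of row indices, from all genes down to one gene."""
    X = X - X.mean(axis=1, keepdims=True)      # STEP 1: center each row
    idx = np.arange(X.shape[0])
    clusters = [idx.copy()]
    while idx.size > 1:                        # STEP 4: repeat until one gene
        pc = leading_pc(X[idx])                # STEP 2: leading component
        score = np.abs(X[idx] @ pc)            # inner product of each row with it
        n_drop = max(1, int(alpha * idx.size)) # STEP 3: shave a proportion alpha
        keep = np.sort(np.argsort(score)[n_drop:])
        idx = idx[keep]
        clusters.append(idx.copy())
    return clusters


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))                   # 30 genes, 8 samples (synthetic)
clusters = shave_sequence(X)
print([len(c) for c in clusters])
```

Each entry of `clusters` is a subset of the previous one, so the output is the nested sequence of gene blocks described on the earlier slide.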

6
Gene Shaving Iteration
7
Gene Shaving Algorithm-2
  • STEP 5. This produces a sequence of nested gene
    clusters S_N ⊃ S_k1 ⊃ S_k2 ⊃ ... ⊃ S_1,
    where S_k denotes a cluster of k genes.
    Estimate the optimal cluster size.
  • STEP 6. Orthogonalize each row of X with respect
    to the average gene in the optimal cluster.
  • STEP 7. Repeat steps 1-5 above with the
    orthogonalized data to find the second optimal
    cluster. This process is continued until a
    maximum of M clusters are found, with M chosen
    a priori.
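
The orthogonalization in step 6 is a projection: each row loses its component along the cluster's average gene. The sketch below assumes X is already row-centered and that the chosen cluster is given as an array of row indices; `orthogonalize_rows` is a hypothetical name.

```python
import numpy as np


def orthogonalize_rows(X, cluster_rows):
    """Remove from every row of X its projection onto the cluster's average gene."""
    mean_gene = X[cluster_rows].mean(axis=0)   # average gene of the cluster
    norm_sq = mean_gene @ mean_gene
    if norm_sq == 0:                           # nothing to project out
        return X.copy()
    coef = X @ mean_gene / norm_sq             # projection coefficient per row
    return X - np.outer(coef, mean_gene)


rng = np.random.default_rng(1)
X = rng.normal(size=(12, 6))
X = X - X.mean(axis=1, keepdims=True)          # row-centered, as in step 1
rows = np.array([0, 3, 5])                     # pretend this is the chosen cluster
X_next = orthogonalize_rows(X, rows)
```

After this step every row of `X_next` is uncorrelated with the first cluster's average gene, which is what pushes the next pass of shaving toward a different group. Rows are not removed, so a gene can still appear in a later cluster.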

8
Principal Component of the rows
[Figure: the leading principal component Z1 of the
rows of X, labeled as the super-gene.]
9
The Gap estimate of cluster size
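[Formula not recovered. Only the symbols V_B and
V_T survive, presumably the between variance and
the total variance of a cluster.]
We then select as the optimal number of genes
the value of k that produces the largest gap.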
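
A minimal sketch of a gap estimate follows, under stated assumptions. The percent variance of a cluster is taken to be 100 * V_B / V_T, with V_B the variance of the cluster mean across samples and V_T the variance of all entries in the cluster. The reference curve averages the same quantity over data whose rows have been permuted independently. All function names are hypothetical, and the compact shaving loop is included only to keep the example self-contained.

```python
import numpy as np


def nested_clusters(X, alpha=0.10):
    """Compact shaving loop: maps cluster size k to an array of row indices."""
    idx = np.arange(X.shape[0])
    out = {idx.size: idx}
    while idx.size > 1:
        pc = np.linalg.svd(X[idx], full_matrices=False)[2][0]
        n_drop = max(1, int(alpha * idx.size))
        idx = idx[np.sort(np.argsort(np.abs(X[idx] @ pc))[n_drop:])]
        out[idx.size] = idx
    return out


def percent_variance(X, rows):
    """Assumed definition: 100 * V_B / V_T for the cluster formed by `rows`."""
    block = X[rows]
    v_between = block.mean(axis=0).var()   # variance of the cluster mean
    v_total = block.var()                  # variance of all entries in the block
    return 100.0 * v_between / v_total if v_total > 0 else 0.0


def gap_estimate(X, n_perm=3, alpha=0.10, seed=0):
    """Return the cluster size with the largest gap, plus the whole gap curve."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=1, keepdims=True)
    real = {k: percent_variance(X, r) for k, r in nested_clusters(X, alpha).items()}
    perm = {k: 0.0 for k in real}
    for _ in range(n_perm):
        # Shuffle each row independently to destroy coherence between genes.
        Xp = np.array([rng.permutation(row) for row in X])
        for k, r in nested_clusters(Xp, alpha).items():
            perm[k] += percent_variance(Xp, r) / n_perm
    gap = {k: real[k] - perm[k] for k in real}
    return max(gap, key=gap.get), gap


# Synthetic data: 6 coherent, high-variance genes planted among 40.
rng = np.random.default_rng(0)
signal = 3.0 * rng.normal(size=10)
X = rng.normal(size=(40, 10))
X[:6] = signal + 0.1 * rng.normal(size=(6, 10))
k_hat, gap = gap_estimate(X, n_perm=3)
print(k_hat)
```

The sizes visited by the shaving loop depend only on the number of rows and on alpha, so the real and permuted curves are always compared at the same values of k.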
10
Gene Shaving (Cont.)
The first three gene clusters found for the DLCL
data
11
Gene Shaving (Cont.)
Percent of gene variance explained by first j
gene shaving column averages (averages of the
genes in each cluster, j = 1, 2, ..., 10) (solid
curve), and by first j principal components
(broken curve). For the shaving results, the
total number of genes in the first j clusters is
also indicated.
12
Gene Shaving (Cont.)
  • Variance plots for real and randomized data. The
    percent variance explained by each cluster, both
    for the original data, and for an average over
    three randomized versions.
  • Gap estimates of cluster size. The gap curve,
    which highlights the difference between the pair
    of curves, is shown.

13
References
  • "Gene shaving" as a method for identifying
    distinct sets of genes with similar expression
    patterns. T. Hastie, R. Tibshirani, M.B. Eisen,
    A. Alizadeh, R. Levy, L. Staudt, W.C. Chan,
    D. Botstein, and P. Brown. Genome Biology, 2000.
    http://genomebiology.com/2000/1/2/research/0003/B

14