Introduction to the analysis of microarray data by GeneCluster

Provided by: dsch53
Transcript and Presenter's Notes

1
Introduction to the analysis of microarray data
by GeneCluster
  • Guide for the first use of the program

3
GeneCluster Capabilities
  • Dataset Preprocessing/Filtering
  • Data Analysis
  • Marker Selection/Neighborhood Analysis
  • Find Classes of Similarly Expressed Genes
  • Build Predictor Sets of Genes

4
Dataset Format
  • .GCT or .RES files are used for (already
    normalized) gene expression data
  • .GCT files are tab delimited files of expression
    values with simple headers
  • .RES files are the same, but also include
    Presence calls
  • .GCT files can be automatically created in dChip
  • .CLS files are used for class data
  • Separate .CLS files are required for every
    covariate
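
The file layout described above can be sketched in a few lines of Python. This is only an illustration: it assumes the commonly documented GCT layout (a version line, a line with the gene and sample counts, then a header row beginning with Name and Description) and a three-line CLS layout with integer labels. Neither detail is spelled out on the slide, and the gene, sample, and class names are made up.

```python
def parse_gct(text):
    """Parse GCT text into (sample_names, {gene_name: [values]})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Assumed layout: line 1 is a version tag, line 2 holds
    # "<n_genes>\t<n_samples>", line 3 holds the column headers.
    n_genes, n_samples = (int(x) for x in lines[1].split("\t")[:2])
    samples = lines[2].split("\t")[2:]  # skip Name and Description
    data = {}
    for ln in lines[3:]:
        fields = ln.split("\t")
        data[fields[0]] = [float(v) for v in fields[2:]]
    if len(data) != n_genes or len(samples) != n_samples:
        raise ValueError("header counts do not match table size")
    return samples, data


def parse_cls(text):
    """Parse CLS text into (class_names, [label per sample])."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Assumed layout: "<n_samples> <n_classes> 1", then "# name0 name1 ...",
    # then one integer label per sample.
    n_samples, n_classes = (int(x) for x in lines[0].split()[:2])
    class_names = lines[1].lstrip("#").split()
    labels = [int(x) for x in lines[2].split()]
    if len(labels) != n_samples or len(class_names) != n_classes:
        raise ValueError("header counts do not match labels")
    return class_names, labels


# Hypothetical two-gene, three-sample data set.
GCT_TEXT = (
    "#1.2\n2\t3\n"
    "Name\tDescription\ts1\ts2\ts3\n"
    "geneA\tna\t10\t20\t30\n"
    "geneB\tna\t5.5\t6\t7\n"
)
CLS_TEXT = "3 2 1\n# control treated\n0 0 1\n"

samples, data = parse_gct(GCT_TEXT)
class_names, labels = parse_cls(CLS_TEXT)
```

Because each covariate needs its own .CLS file, a second covariate would simply be a second call to `parse_cls` on a different file.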

5
Example .GCT file
6
Example .CLS files
7
Dataset Preprocessing/Filtering
  • Clip values to Min or Max values
  • Keep if Max/Min > ratio threshold
  • Keep if Max-Min > difference threshold
  • Normalize to mean 0, variance 1
  • Scale by minimum value
  • Shift global/row mins
  • Log10 transform
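
As a rough illustration of the filtering steps listed above, the sketch below clips each gene's values, applies the ratio and difference filters, and standardizes the surviving rows. The threshold values, the gene names, and the choice to require both filters at once are assumptions made for the example, not GeneCluster defaults.

```python
import math
import statistics


def clip(row, lo, hi):
    """Clip every value into the range [lo, hi]."""
    return [min(max(v, lo), hi) for v in row]


def passes_variation_filter(row, min_ratio, min_diff):
    """Keep a gene only if Max/Min and Max-Min both exceed their thresholds.

    Assumes the row was clipped to a positive minimum first, so the
    division is safe.
    """
    hi, lo = max(row), min(row)
    return hi / lo > min_ratio and hi - lo > min_diff


def standardize(row):
    """Rescale a row to mean 0 and (population) variance 1."""
    mean = statistics.fmean(row)
    sd = statistics.pstdev(row)
    return [(v - mean) / sd for v in row]


def log10_row(row):
    """Log10-transform a row of positive values."""
    return [math.log10(v) for v in row]


# Hypothetical expression rows and illustrative thresholds.
rows = {"geneA": [5, 100, 400], "geneB": [90, 100, 110]}
kept = {}
for name, row in rows.items():
    clipped = clip(row, lo=20, hi=16000)
    if passes_variation_filter(clipped, min_ratio=3, min_diff=100):
        kept[name] = standardize(clipped)
```

Here `geneB` is dropped because its values barely change across samples, which is the purpose of the two variation filters.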

8
Marker Selection/Neighborhood Analysis
  • For a set of classification markers in the .cls
    file, GeneCluster will look for the genes that
    most highly correlate with the classification.
  • Possible metrics include t-tests and signal-to-noise
    ratios.
  • Permutation tests quantify significance of the
    selected genes.
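
The signal-to-noise metric and the permutation test mentioned above can be sketched as follows. The formula used is the one from Golub et al. 1999, (mean0 - mean1) / (sd0 + sd1); whether GeneCluster computes it in exactly this way is an assumption, and the expression values are invented for the example.

```python
import random
import statistics


def signal_to_noise(values, labels):
    """Signal-to-noise ratio of one gene for two classes labelled 0 and 1."""
    a = [v for v, c in zip(values, labels) if c == 0]
    b = [v for v, c in zip(values, labels) if c == 1]
    noise = statistics.stdev(a) + statistics.stdev(b)
    return (statistics.fmean(a) - statistics.fmean(b)) / noise


def permutation_p_value(values, labels, n_perm=1000, seed=0):
    """Fraction of label shufflings scoring at least as extreme as observed."""
    rng = random.Random(seed)
    observed = abs(signal_to_noise(values, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # class sizes stay the same
        if abs(signal_to_noise(values, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical gene that is low in class 0 and high in class 1.
values = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
score = signal_to_noise(values, labels)
p_value = permutation_p_value(values, labels)
```

A real analysis would score every gene, rank genes by the absolute score, and compare the whole ranked list against the permuted scores rather than testing one gene in isolation.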

9
Top 5 over- and under-expressed genes correlating
with MMC status
10
Find Classes of Similarly Expressed Genes
  • Use k-means clustering or self-organizing map
    (SOM)
  • User can specify exactly how many clusters or let
    GeneCluster decide
  • Genes with similar expression patterns over the
    treatment conditions will cluster together
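
To make the clustering idea concrete, here is a minimal k-means sketch over gene expression profiles. It is not GeneCluster's implementation: the farthest-point initialization is a simplification chosen to keep the example deterministic, and the gene names and values are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two expression profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_profile(rows):
    """Column-wise mean of a list of profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def k_means(profiles, k, max_iter=100):
    """Assign each gene in {name: profile} to one of k clusters."""
    names = list(profiles)
    # Farthest-point initialization: start from the first gene, then keep
    # adding the gene farthest from the centroids chosen so far.
    centroids = [profiles[names[0]]]
    while len(centroids) < k:
        farthest = max(
            names,
            key=lambda n: min(squared_distance(profiles[n], c) for c in centroids),
        )
        centroids.append(profiles[farthest])

    assignment = {}
    for _ in range(max_iter):
        new_assignment = {
            n: min(range(k), key=lambda i: squared_distance(profiles[n], centroids[i]))
            for n in names
        }
        if new_assignment == assignment:
            break  # converged: no gene changed cluster
        assignment = new_assignment
        for i in range(k):
            members = [profiles[n] for n in names if assignment[n] == i]
            if members:  # leave the centroid of an empty cluster unchanged
                centroids[i] = mean_profile(members)
    return assignment


# Hypothetical profiles over three treatment conditions:
# two genes that rise and two genes that fall.
profiles = {
    "geneA1": [1.0, 2.0, 3.0],
    "geneA2": [1.1, 2.1, 2.9],
    "geneB1": [3.0, 2.0, 1.0],
    "geneB2": [2.9, 2.1, 1.0],
}
clusters = k_means(profiles, k=2)
```

Rows are usually standardized before clustering so that genes group by the shape of their expression pattern rather than by absolute intensity.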

11
SOM of filtered Fanconi data
12
Build Predictor Sets of Genes
  • GeneCluster can be used for tumor
    subclassification using supervised learning
  • (Golub et al. 1999; Slonim et al. 2000)
  • GeneCluster selects predictor genes of clinical
    status using a training data set; the resulting
    predictor can then be tested by predicting clinical
    status for a separate test data set
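
The train-then-test workflow above can be illustrated with the weighted-voting scheme described in Golub et al. 1999: each selected gene votes with weight equal to its signal-to-noise score, measured against the midpoint of the two class means. That GeneCluster uses exactly this scheme is an assumption here, and all gene names and values are hypothetical.

```python
import statistics


def train(train_rows, labels, n_genes):
    """Select the n_genes with the largest |signal-to-noise| score.

    Returns {gene: (weight, boundary)} for classes labelled 0 and 1.
    """
    model = {}
    for gene, values in train_rows.items():
        a = [v for v, c in zip(values, labels) if c == 0]
        b = [v for v, c in zip(values, labels) if c == 1]
        mu_a, mu_b = statistics.fmean(a), statistics.fmean(b)
        weight = (mu_a - mu_b) / (statistics.stdev(a) + statistics.stdev(b))
        boundary = (mu_a + mu_b) / 2  # midpoint between the class means
        model[gene] = (weight, boundary)
    top = sorted(model, key=lambda g: abs(model[g][0]), reverse=True)[:n_genes]
    return {g: model[g] for g in top}


def predict(model, sample):
    """Sum the weighted votes; a positive total favours class 0."""
    total = sum(w * (sample[g] - b) for g, (w, b) in model.items())
    return 0 if total > 0 else 1


# Hypothetical training set: three samples per class.
train_rows = {
    "up_in_0": [9.0, 10.0, 11.0, 1.0, 2.0, 3.0],
    "up_in_1": [1.0, 2.0, 1.5, 8.0, 9.0, 8.5],
    "noise": [5.0, 6.0, 4.0, 5.5, 4.5, 6.5],
}
train_labels = [0, 0, 0, 1, 1, 1]
model = train(train_rows, train_labels, n_genes=2)

# Hypothetical test samples, not used in training.
test_a = {"up_in_0": 9.5, "up_in_1": 2.0, "noise": 5.0}
test_b = {"up_in_0": 2.5, "up_in_1": 8.0, "noise": 5.0}
predictions = [predict(model, test_a), predict(model, test_b)]
```

Keeping the test samples out of gene selection matters: choosing predictor genes with the test data included would make the measured accuracy look better than it is.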

13
GeneCluster Resources
  • http://www-genome.wi.mit.edu/cancer/software/genecluster2/gc2.html
  • Tutorial and Instruction Manual in MACF, Smith
    1158
