A New Approach to Analyzing Gene Expression Time Series Data

1
A New Approach to Analyzing Gene Expression Time
Series Data
  • Ziv Bar-Joseph
  • Georg Gerber
  • David K. Gifford
  • Tommi S. Jaakkola
  • Itamar Simon

Learning Seminar Bioinformatics Other Applications
Prof. Nathan Intrator
Presented by Adam Segoli Schubert, May 16, 2005
2
Overview
  • Gene Expression
  • Time Series
  • Statistical Analysis of Time-Series
  • DNA Microarray
  • Gene Expression Time-Series
  • Analyzing Gene Expression Time-Series Data
  • Estimating Unobserved Expression Values and Time
    Points
  • What is a Spline?
  • Using the Splines
  • Parameters Analysis
  • Aligning Time-Series Data
  • Aligning Temporal Data Using Splines
  • Results - Unobserved Data Estimation
  • Results - Aligning Temporal Data
  • References

3
Gene Expression
4
Time-Series
  • A series of values of variables taken in
    successive periods of time
  • Time Points
  • Sampling Intervals (constant / inconstant)
  • A well established area in statistical analysis
    of data is dedicated to the study of time-series

5
Statistical Analysis of Time-Series
  • Two main goals
  • Identifying the nature of the phenomenon
  • Predicting unobserved values of the time-series
    variable

6
DNA Microarray
  • Allows the monitoring of expression levels of
    thousands of genes under a variety of
    conditions.
  • The data of microarray experiments is usually in
    the form of a large matrix.
  • Very Expensive.

7
Gene Expression Time-Series
  • Determined by measuring mRNA levels or protein
    concentrations
  • Commonly very short (4 to 20 samples)
  • Usually unevenly sampled
  • The measuring techniques are extremely
    noise-prone and/or subject to bias in the
    biological measurements.

8
Analyzing Gene Expression Time-Series Data
  • Estimating Unobserved Expression Values and Time
    Points
  • Aligning Time-Series Data

9
Estimating Unobserved Expression Values and Time
Points
  • Row Average or Filling with Zeros
  • Singular Value Decomposition (SVD)
  • Weighted K-Nearest Neighbors
  • Linear Interpolation
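
Two of the baseline methods listed above, row average and linear interpolation, can be sketched in a few lines. This is only an illustration: the function names, the use of `None` for a missing measurement, and the sample values are assumptions made for the example and do not come from the slides.

```python
from typing import List, Optional


def row_average_fill(values: List[Optional[float]]) -> List[float]:
    """Replace each missing value (None) with the mean of the gene's observed values.

    Assumes at least one value in the row is observed.
    """
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def linear_interpolation_fill(times: List[float],
                              values: List[Optional[float]]) -> List[float]:
    """Fill each missing value on the straight line between its nearest observed neighbors.

    Missing values before the first or after the last observation copy the
    nearest observed value. Assumes at least one value is observed.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            # The fraction uses the real sampling times, so uneven sampling is respected.
            frac = (times[i] - times[left]) / (times[right] - times[left])
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


# Hypothetical gene sampled at uneven time points, with one missing measurement.
times = [0.0, 10.0, 30.0, 40.0]
expression = [1.0, None, 4.0, 3.0]

by_row_average = row_average_fill(expression)
by_interpolation = linear_interpolation_fill(times, expression)
```

Row average ignores time entirely, while linear interpolation uses only the two neighboring observations of the same gene.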

10
A New Analysis Approach
  • By using Cubic Splines.

11
What is a Spline?
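  • A special curve defined piecewise by polynomials.
  • Given k points $t_i$, called knots, in an interval
    $[a, b]$ with $a = t_0 < t_1 < \dots < t_{k-1} = b$
  • The parametric curve $S : [a, b] \to \mathbb{R}$ is
    called a Spline of degree n if $S \in C^{n-1}[a, b]$
    and $S|_{[t_i, t_{i+1}]}$ is a polynomial of degree
    at most n for $i = 0, \dots, k-2$
  • A Cubic Spline if n = 3.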

12
Using the Splines
  • We obtain a continuous time formulation by using
    cubic splines to represent gene expression
    curves.
  • Spline control points are uniformly spaced.
  • We constrain spline coefficients of co-expressed
    genes to have the same covariance matrix.
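
As a rough illustration of a continuous-time curve defined by uniformly spaced control points, the sketch below evaluates a uniform cubic B-spline at an arbitrary time. The choice of the uniform B-spline basis, the function names, and the control point values are assumptions made for this example. The slides do not say which spline basis the authors used, and no fitting to data is shown here.

```python
from typing import List


def basis_row(t: float, t_min: float, t_max: float, q: int) -> List[float]:
    """Weights of q uniformly spaced control points at time t (uniform cubic B-spline).

    Requires q >= 4 and t_min <= t <= t_max. At most four weights are non-zero.
    """
    segments = q - 3                       # number of polynomial pieces
    x = (t - t_min) / (t_max - t_min) * segments
    k = min(int(x), segments - 1)          # index of the piece containing t
    u = x - k                              # local parameter in [0, 1]
    local = [
        (1 - u) ** 3 / 6,
        (3 * u ** 3 - 6 * u ** 2 + 4) / 6,
        (-3 * u ** 3 + 3 * u ** 2 + 3 * u + 1) / 6,
        u ** 3 / 6,
    ]
    row = [0.0] * q
    for j, weight in enumerate(local):
        row[k + j] = weight
    return row


def evaluate(t: float, control: List[float], t_min: float, t_max: float) -> float:
    """Value of the spline curve at time t: weighted sum of the control points."""
    row = basis_row(t, t_min, t_max, len(control))
    return sum(w * c for w, c in zip(row, control))


# Hypothetical control points for one gene over the time range [0, 3].
control_points = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
value_mid = evaluate(1.5, control_points, 0.0, 3.0)
weights_mid = basis_row(1.5, 0.0, 3.0, len(control_points))
```

Because the curve is defined for every t in the range, it can be resampled at time points that were never measured. Each call to `basis_row` yields one row of q weights, so stacking the rows for m observation times gives an m x q matrix.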

13
Estimating Unobserved Data Using Splines
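  • Given c gene classes.
  • $Y_i(t)$ - The value of gene i (of class j) as
    observed at time t
  • Can be written as
    $Y_i(t) = s(t)(\mu_j + \gamma_i) + \epsilon_i$,
    where $s(t)$ is the spline basis evaluated at t,
    $\mu_j$ the class average, $\gamma_i$ the
    gene-specific variation and $\epsilon_i$ noise.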

14
Estimating Unobserved Data Using Splines
  • Resampling gene i at any time t, including an
    unobserved time point
  • Estimating Missing Values
  • Averaging of the observed values using the class
    covariance matrix, the class average and
    the gene-specific variation.
  • These quantities are determined by a
    probabilistic model.

15
Estimating Unobserved Data Using Splines
Parameters Analysis
  • $Y_i$ - Vector of observed expression values for
    gene i.
  • $S_i$ - Matrix of size $m \times q$ for m observations.

16
Aligning Time-Series Data
  • Dynamic Time Warping
  • Developed for voice recognition purposes in the
    1970s.
  • Dynamic Programming
  • John Aach and George M. Church
  • Operates on individual genes
17
Aligning Temporal Data Using Splines
  • Operates on a set of genes.
  • Assume we have two spline curves for gene i
  • We define a mapping function $T(s) = t$



18
Aligning Temporal Data Using Splines
  • We define the alignment error for each gene
  • Alignment Limits
  • Starting Point
  • Ending Point

19
Aligning Temporal Data Using Splines
  • We define the error for a set of genes S of size
    n as $E_S = \sum_{i=1}^{n} w_i e_i^2$, where
    $e_i^2$ is the alignment error of gene i
  • $w_i$ - Weighting coefficients that sum to one
  • (uniform / nonuniform).

20
Aligning Temporal Data Using Splines
  • The mapping function ($T(s) = t$) can then be found
    by minimizing this error, using standard
    non-linear optimization techniques.
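
The idea of choosing one mapping for a whole set of genes can be illustrated with a small sketch. Several things here are assumptions made for the example and are not taken from the slides: the mapping is restricted to the linear form T(s) = a*s + b with a > 0, the per-gene error is approximated by a mean squared difference on a sample grid, a plain grid search stands in for a non-linear optimizer, and sine and cosine functions stand in for fitted spline curves.

```python
import math
from typing import Callable, List, Tuple

Curve = Callable[[float], float]


def gene_error(ref: Curve, other: Curve, a: float, b: float,
               s_min: float, s_max: float, t_min: float, t_max: float,
               steps: int = 200) -> float:
    """Mean squared difference between other(s) and ref(T(s)), with T(s) = a*s + b.

    Only the part of [s_min, s_max] that T maps inside [t_min, t_max] is compared.
    """
    start = max(s_min, (t_min - b) / a)   # alignment starting point
    end = min(s_max, (t_max - b) / a)     # alignment ending point
    if end <= start:
        return float("inf")               # the two curves do not overlap
    total = 0.0
    for k in range(steps + 1):
        s = start + (end - start) * k / steps
        total += (other(s) - ref(a * s + b)) ** 2
    return total / (steps + 1)


def set_error(pairs: List[Tuple[Curve, Curve]], weights: List[float],
              a: float, b: float, limits: Tuple[float, float, float, float]) -> float:
    """Weighted sum of the per-gene errors; the weights are expected to sum to one."""
    return sum(w * gene_error(ref, other, a, b, *limits)
               for (ref, other), w in zip(pairs, weights))


def best_linear_mapping(pairs, weights, limits, a_grid, b_grid):
    """Grid search for the (a, b) pair with the smallest set error."""
    best = (float("inf"), None, None)
    for a in a_grid:
        for b in b_grid:
            err = set_error(pairs, weights, a, b, limits)
            if err < best[0]:
                best = (err, a, b)
    return best


# Two hypothetical genes. The second experiment runs on a time scale s
# such that t = 2*s + 0.5 on the reference scale.
pairs = [
    (math.sin, lambda s: math.sin(2 * s + 0.5)),
    (math.cos, lambda s: math.cos(2 * s + 0.5)),
]
weights = [0.5, 0.5]                      # uniform weights
limits = (0.0, 3.0, 0.0, 7.0)             # s_min, s_max, t_min, t_max
best_error, best_a, best_b = best_linear_mapping(
    pairs, weights, limits, a_grid=[1.0, 1.5, 2.0, 2.5], b_grid=[0.0, 0.5, 1.0])
```

Because every gene in the set contributes to the same error, a single mapping is estimated from all of them at once, in contrast to aligning each gene separately. A real implementation would also need to guard against mappings that shrink the overlap region to almost nothing.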

21
Results - Unobserved Data Estimation
  • Comparison of the new approach with
  • Linear Interpolation
  • Spline interpolation using individual genes
  • K-Nearest Neighbors (KNN)
  • k = 20

24
Results - Aligning Temporal Data
  • Aligned three yeast cell-cycle gene expression
    time series

26
Thank You!
  • Any Questions?

27
References
  • C. S. Moller-Levet. Clustering of Gene Expression
    Time-Series Data.
  • N. A. Campbell, J. B. Reece, and L. G. Mitchell.
    Biology. Fifth Edition.
  • J. Aach and G. M. Church. Aligning gene
    expression time series with time warping
    algorithms. Bioinformatics, 17:495-508, 2001.
  • C. de Boor. A practical guide to splines.
    Springer, 1978.
  • P. D'haeseleer, X. Wen, S. Fuhrman, and R.
    Somogyi. Linear modeling of mRNA expression
    levels during CNS development and injury. In
    PSB99, 1999.
  • G. James and T. Hastie. Functional linear
    discriminant analysis for irregularly sampled
    curves. Journal of the Royal Statistical Society,
    to appear, 2001.
  • R. Sharan and R. Shamir. Algorithmic approaches to
    clustering gene expression data. Current Topics
    in Computational Biology, to appear.
  • O. Troyanskaya, M. Cantor, et al. Missing
    value estimation methods for DNA microarrays.
    Bioinformatics, 17:520-525, 2001.