Methylation predictors


1
Methylation predictors
  • Paz Polak

2
Background
  • Methylation is assumed to target the majority of
    CpG sites in the human genome.
  • As a consequence of mutational processes, CpG
    dinucleotides are depleted across the human genome.
  • However, there are regions that are rich in CpGs
    and tend to be less methylated (CpG islands,
    CGIs).
  • As is typical in biology, not all CpG islands stay
    unmethylated, and it is not clear why.

3
Goal of computational methylation prediction
  • Predict from DNA sequence which genomic regions
    are methylated and which are not. In particular,
    we wish to be able to predict which CGIs are
    methylated and which are unmethylated in
    different tissues.

4
Two fundamental papers (1)
  • "Predicting aberrant CpG island methylation",
    Feltus et al., PNAS, 2003
  • Proof of concept: methylated and unmethylated
    CGIs have different intrinsic DNA properties.

5
Two fundamental papers (2)
Methylator: http://bio.dfci.harvard.edu/Methylator/
6
The scheme
  • Since there is no good model of the methylation
    status of CGIs, the common approach is
    statistical.
  • Step 0: Classify regions according to your
    favorite CpG island definition.
  • Step 1: Find a set of regions that are known to
    be methylated or unmethylated.
  • Step 2: Build a classifier that distinguishes
    between methylated and unmethylated CpG islands
    using different genomic parameters.
  • Step 3: Once the discriminating parameters are
    defined, use them to predict methylation status.
  • Step 4: Verify your best predictions in an
    experiment that shows you are right in at
    least 90% of the cases.

7
Step -1: Receive large-scale data sets, somehow
8
Step 0: Define CpG island criteria
  • Gardiner-Garden and Frommer:
  • GC content above 50%,
  • ratio of observed to expected number of CpG
    dinucleotides above 0.6,
  • more than 200 base pairs in length.
  • Takai and Jones:
  • GC content above 55%,
  • ratio above 0.65,
  • more than 500 base pairs in length.
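
The two sets of thresholds above can be compared on a single candidate region with a minimal Python sketch. The function names and the CRITERIA table are illustrative assumptions, not taken from either paper, and the strict "above" comparisons follow the wording of this slide. The sketch tests one fixed region; it does not scan a genome for islands.

```python
# Thresholds as listed on the slide; the dictionary keys are illustrative names.
CRITERIA = {
    "gardiner_garden": {"min_gc": 0.50, "min_ratio": 0.60, "min_length": 200},
    "takai_jones": {"min_gc": 0.55, "min_ratio": 0.65, "min_length": 500},
}


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    observed = seq.count("CG")
    expected = c * g / len(seq)
    return observed / expected


def is_cpg_island(seq, criteria):
    """True if the region passes all three thresholds (strict 'above')."""
    return (
        len(seq) > criteria["min_length"]
        and gc_content(seq) > criteria["min_gc"]
        and cpg_obs_exp(seq) > criteria["min_ratio"]
    )
```

For example, a 300 bp region made only of CG repeats passes the first set of criteria but is too short for the second.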

9
Step 0: Define CpG island criteria
  • The previous criteria are too sharp: they impose
    hard cutoffs.
  • Recently, new criteria that score the strength of
    CpG islands on a continuous scale have been
    suggested:
  • Bock et al. (PLoS Computational Biology, 2007):
    use other types of epigenetic information to
    classify CGIs, for example histone modifications.
    Some types of modifications are indicative of
    transcriptional activation, others of repression.
  • Tanay et al. (PNAS, 2007): a method based on
    conservation.

10
Step 1: Map methylated and unmethylated CGIs
  • Bock et al. (March 2006, PLoS Genetics):
    lymphocytes, HEP (Rakyan et al., 2004)
  • Fang et al. (July 2006, Bioinformatics):
    brain, 30 Mb DNA (Rollins et al., 2006), HEP,
    MethDB
  • Das et al. (July 2006, PNAS): similar to Fang
    et al.

11
Step 2: Build classifier. 2.1: Choose your DNA
attributes
  • DNA sequence properties and patterns
  • Repeat frequency and distribution
  • CpG island frequency and distribution
  • Predicted DNA structure
  • Gene and exon distribution
  • Predicted transcription factor binding sites
  • Evolutionary conservation
  • Single nucleotide polymorphisms (SNPs)

Bock et al.
12
Step 2: Build classifier. 2.1: Choose your
attributes
13
Step 2: Build classifier. 2.1: Choose your
attributes
14
Step 2: Build classifier. 2.1: Choose your
attributes
  • Fang et al. used:
  • GC content
  • CpG ratio
  • TpG
  • Overlap of CGIs with AluY elements
  • 74 TFBSs, using TRANSFAC matrices
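
To illustrate how a binding-site matrix can be turned into a numeric feature, here is a minimal Python sketch that counts matches of a position weight matrix in a sequence. The count matrix is a made-up toy example (not a real TRANSFAC entry), and the uniform background, the threshold, and the forward-strand-only scan are all simplifying assumptions. It is not the scoring procedure used by Fang et al.

```python
import math

# Hypothetical 4-position count matrix, one dict per motif position.
# It is a toy example that favours the motif "TGAC", not a TRANSFAC entry.
TOY_COUNTS = [
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 1, "C": 1, "G": 17, "T": 1},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 17, "G": 1, "T": 1},
]


def log_odds_matrix(counts, background=0.25):
    """Convert per-position base counts to log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values())
        matrix.append(
            {base: math.log2((n / total) / background) for base, n in column.items()}
        )
    return matrix


def count_hits(sequence, matrix, threshold):
    """Count forward-strand windows whose summed log-odds score reaches the threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = 0
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits += 1
    return hits
```

The hit count per CGI (one number per matrix) could then serve as one attribute in the feature vector passed to the classifier.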

15
The top 4 discriminating TFBS
Fang et al. (2006)
16
TFBSs of known neural regulators are found three
times more often in unmethylated CGIs
  • AP1: TF family that regulates gene expression in
    neural cells. Its members can only bind to
    unmethylated sites.
  • KROX/Egr: regulates genes related to
    neural plasticity. Known to bind to unmethylated
    sites.
  • ZF5: expressed in neural tissues. Whether it
    prefers methylated sites is not known.
  • FOXM

17
Das et al.: 17 features are enough
18
Step 2: Build classifier. 2.2: Choose your method
  • Bock et al.: SVM (support vector machine)
  • Fang et al.: compared different methods but
    chose SVM
  • Das et al.: similar to Fang et al.

19
Step 3: Train and predict. How good is your
prediction?
  • TN: true negatives
  • TP: true positives
  • FN: false negatives
  • FP: false positives
  • SP: specificity
  • SE: sensitivity
  • ACC: accuracy
  • CC: correlation coefficient
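
The measures listed above can be computed from the four confusion-matrix counts, as in the minimal Python sketch below. It assumes that the correlation coefficient is the Matthews correlation coefficient, which is a common choice for binary classifiers; the slide does not say which definition is meant. The function names are illustrative.

```python
import math


def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Compute SE, SP, ACC and CC from confusion-matrix counts."""
    # Denominator of the Matthews correlation coefficient (assumed definition of CC).
    cc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "SE": safe_div(tp, tp + fn),                  # sensitivity: TP / (TP + FN)
        "SP": safe_div(tn, tn + fp),                  # specificity: TN / (TN + FP)
        "ACC": safe_div(tp + tn, tp + tn + fp + fn),  # accuracy
        "CC": safe_div(tp * tn - fp * fn, cc_denominator),
    }
```

For instance, with 40 true positives, 45 true negatives, 5 false positives and 10 false negatives, the sketch gives SE = 0.8, SP = 0.9 and ACC = 0.85.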

20
Step 3: Train and predict. Bock et al.
21
Step 3: Train and predict. Fang et al.
22
Comparison