PowerPoint Poster Template - PowerPoint PPT Presentation

About This Presentation
Title:

PowerPoint Poster Template

Description:

Novel Peptide Identification using ESTs and Genomic Sequence USHUPO 2006 Nathan J. Edwards1, Xue Wu2, Chau-Wen Tseng2 1Center for Bioinformatics & Computational ... – PowerPoint PPT presentation

Transcript and Presenter's Notes

Novel Peptide Identification using ESTs and Genomic Sequence
USHUPO 2006
Nathan J. Edwards1, Xue Wu2, Chau-Wen Tseng2
1Center for Bioinformatics & Computational Biology
2Department of Computer Science
University of Maryland, College Park
Introduction
Traditional tandem mass spectrometry search
engines only identify peptides from protein
sequence databases, such as IPI and Swiss-PROT.
These databases are incomplete. We construct an
exhaustive human putative peptide sequence
database using ESTs and genomic sequence. This
peptide sequence database makes it possible to
identify novel protein isoforms that would
otherwise be missed.

Novel Peptide and Protein Isoforms
Novel Splice Form. e-value: 10^-6.82. Gene:
LIME1. Chromosome: 20. Dataset: Peptide Atlas
raftflow (von Haller, et al.). Evidence: 10s of
ESTs, mRNA. Feature: Novel 3' splice site.

Exhaustive Peptide Sequence
  • Self Corrected ESTs
      • 6 frame translation.
      • Open reading frames of at least 150 nucleotides.
      • Amino-acid 30-mers observed at least twice.
  • Genome Corrected ESTs
      • Map EST sequence to genome.
      • Correct ESTs using genomic sequence.
      • 6 frame translation of mapped corrected ESTs.
      • Open reading frames of at least 150 nucleotides.
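
The EST translation steps listed above (6 frame translation, a length cut-off on open reading frames, and keeping amino-acid 30-mers seen at least twice) can be sketched roughly as follows. This is an illustrative sketch, not the authors' pipeline: the function names are hypothetical, an "open reading frame" is taken to mean a stop-to-stop stretch, the 150 nucleotide figure is read as a minimum length, and "observed at least twice" is approximated by counting every occurrence across the translated sequences.

```python
from collections import Counter

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate one reading frame; codons with unknown bases become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_orfs(dna, min_nt=150):
    """Yield stop-to-stop peptides of at least min_nt nucleotides
    from all six reading frames (assumed ORF definition)."""
    dna = dna.upper()
    min_aa = min_nt // 3
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            for orf in translate(strand[offset:]).split("*"):
                if len(orf) >= min_aa:
                    yield orf


def repeated_kmers(peptides, k=30, min_count=2):
    """Return the amino-acid k-mers occurring at least min_count times."""
    counts = Counter(p[i:i + k]
                     for p in peptides
                     for i in range(len(p) - k + 1))
    return {kmer for kmer, n in counts.items() if n >= min_count}
```

For example, `repeated_kmers(six_frame_orfs(seq) for each EST)` would need the per-EST peptides flattened into one list first. ESTs carry no translation direction or frame, which is why all six frames are tried.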

Novel Protein. e-value: 10^-9.6. Chr: X. Dataset:
Peptide Atlas A8_IP (Resing et al.). Evidence:
100s of ESTs, mRNA. Feature: Straddles intron.

Expressed Sequence Tags
ESTs are short single-pass sequencing reads of
cDNA clones obtained by reverse transcription of
messenger RNA sequence.
Pros: Transcribed, no introns.
Cons: Short (300-500 bases), 1% error,
large (4Gb), very redundant, no
translation direction or frame.

  • Genscan exons & exon pairs
      • All Genscan [1] exons, including suboptimal
        exons, with probability > 0.001.
      • All pairs of Genscan exons within 100Kb.
      • 3 frame translation.
      • Open reading frames of at least 90 nucleotides.
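
The Genscan exon-pair step listed above (exons with probability above 0.001, paired when within 100Kb) might look something like the sketch below. The function name, the tuple layout `(start, end, probability)`, and the reading of "within 100Kb" as the gap between the upstream exon's end and the downstream exon's start are all assumptions made for illustration; the poster does not specify them.

```python
def exon_pairs(exons, max_gap=100_000, min_prob=0.001):
    """Pair predicted exons that lie on the same strand of one sequence.

    exons: iterable of (start, end, probability) tuples with start < end.
    Returns (upstream, downstream) pairs whose intervening gap is
    positive and at most max_gap nucleotides (assumed interpretation).
    """
    # Keep exons above the probability threshold, ordered by start.
    kept = sorted(e for e in exons if e[2] > min_prob)
    pairs = []
    for i, up in enumerate(kept):
        for down in kept[i + 1:]:
            gap = down[0] - up[1]
            if gap <= 0:
                continue  # overlapping or adjacent exons: no intron between them
            if gap > max_gap:
                break     # starts are sorted, so later exons are farther still
            pairs.append((up, down))
    return pairs
```

Each pair would then be joined and translated in three frames, as the list describes; that step is omitted here.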
  • Genomic ORFs
      • 6 frame translation.
      • Open reading frames of at least 150 nucleotides.

Novel Mutation. e-value: 10^-7.6. Ala2 Deletion.
Gene: TTR. Chr: 18. Dataset: HUPO PPP
29_b1-EDTA_1 (Qian/He Omenn et al.). Evidence: 2
ESTs from same clone library. Feature:
Ala2-to-Pro associated with familial amyloidotic
polyneuropathy.

Genomic Sequence
Genomic sequence is the assembly of multiple
overlapping DNA sequencing reads into contiguous
chromosomes.
Pros: Complete, correct sequence.
Cons: Introns, >95% non-coding,
large (3Gb), no translation direction
or frame.

C3 Database Compression [3]
  • Complete: All amino-acid 30-mers are represented.
  • Correct: No new amino-acid 30-mers are represented.
  • Compact: Amino-acid 30-mers occur exactly once.
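
The three properties named above (complete, correct, compact) can be stated as simple set checks on amino-acid 30-mers. The sketch below only verifies the properties for a given pair of sequence collections; it does not implement the compression method of reference 3, and the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(seqs, k):
    """Count every length-k substring across a collection of sequences."""
    return Counter(s[i:i + k] for s in seqs for i in range(len(s) - k + 1))


def check_c3(original, compressed, k=30):
    """Report whether `compressed` is complete, correct, and compact
    with respect to the k-mers of `original`."""
    orig = set(kmer_counts(original, k))
    comp = kmer_counts(compressed, k)
    return {
        "complete": orig <= set(comp),                    # every original k-mer is kept
        "correct": set(comp) <= orig,                     # no k-mer is invented
        "compact": all(n == 1 for n in comp.values()),    # each k-mer appears once
    }
```

With `k=3` for readability, `check_c3(["ABCDE", "BCDEF"], ["ABCDEF"], k=3)` reports all three properties as true, since the single merged sequence contains each original 3-mer exactly once and nothing else.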
IPI Common Variant Elimination. e-value: 10^-5.9.
Chr: 19. Gene: C3. Dataset: HUPO PPP
29_b1-CIT_win1 (Qian/He Omenn et al.). Evidence:
100s of ESTs, mRNA. Feature: IPI has (rare)
variant, Insertion of AS_at_10 due to 5' splice
site.

References
[1] Burge & Karlin. Prediction of Complete Gene
    Structures in Human Genomic DNA. J. Mol. Biol.
    1997.
[2] Craig & Beavis. TANDEM: matching
    proteins with tandem mass spectra.
    Bioinformatics. 2000.
[3] Edwards & Lippert. Sequence database
    compression for peptide identification from
    tandem mass spectra. WABI. 2004.

Human Exons & Introns
  • Average exons per gene: 8 1
  • Average exon length: 150 nucleotides
  • Average intron length: 4500 nucleotides