Computational Analysis of Transcript Identification Using GenBank

1
Computational Analysis of Transcript
Identification Using GenBank
  • Slides by Terry Clark

2
Differentiation of hematopoietic cells
3
(No Transcript)
4
Genome-wide gene expression
5-15
(No Transcript)
16
SAGE (Serial Analysis of Gene Expression)
17
Figure 1: Schematic illustration of the SAGE process
Jes Stollberg et al. Genome Res. 2000; 10: 1241-1248
18
SAGE GLGI Overview
19
What is the chance of duplicate tags?
  • We can assume we are drawing randomly from the
    set of all sequences of the given tag length over
    the 4-letter DNA alphabet (A, C, G, T)
  • This is the same problem as having unique
    overlaps in the contig matching problem for
    shotgun sequencing
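
As a rough illustration of the random-draw assumption above, here is a minimal Python sketch. The function names are hypothetical and the uniform, independent draw is exactly the simplification the slide describes, not a claim about real sequence data. It computes the number of possible tags, the birthday-problem chance that at least two of n transcripts share a tag, and the expected number of distinct tags observed.

```python
import math


def possible_tags(tag_length: int = 10) -> int:
    """Number of distinct sequences of this length over A, C, G, T."""
    return 4 ** tag_length


def prob_any_duplicate(n_transcripts: int, tag_length: int = 10) -> float:
    """Birthday-problem probability that at least two transcripts share a tag,
    assuming each tag is drawn uniformly and independently (an assumption)."""
    n_tags = possible_tags(tag_length)
    if n_transcripts > n_tags:
        return 1.0  # pigeonhole: more transcripts than possible tags
    # Sum logs of (1 - i/n_tags) to avoid underflow for large n_transcripts.
    log_p_all_distinct = sum(
        math.log1p(-i / n_tags) for i in range(n_transcripts)
    )
    return 1.0 - math.exp(log_p_all_distinct)


def expected_distinct_tags(n_transcripts: int, tag_length: int = 10) -> float:
    """Expected number of different tags seen among n random draws."""
    n_tags = possible_tags(tag_length)
    return n_tags * (1.0 - (1.0 - 1.0 / n_tags) ** n_transcripts)


if __name__ == "__main__":
    n = 30_000  # hypothetical transcript count, chosen only for illustration
    print(possible_tags(10))
    print(prob_any_duplicate(n))
    print(expected_distinct_tags(n))
```

Under this model some duplication is unavoidable once n is large relative to the square root of the number of possible tags, which is why the interesting quantity is how many tags remain unique rather than whether any duplicate exists.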

20
Random Model
21
Random model does not reflect biological process
  • Genes evolve by duplication as well as point
    mutation
  • Many motifs are repeated
  • Function widgets at work?
  • The result is a strong bias in observed biological
    sequences, not the uniform distribution that the
    simple model assumes.
  • Here are some numbers.

22
SAGE tags match to many genes (Tags from
Hashimoto S, et al. Blood 94:837, 1999)
23
Tag Frequency Groups for 10-base Tag Set
Containing 878,938 Tags for UniGene Human
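
One way such frequency groups could be tabulated is sketched below in Python. The input layout (a dict from transcript ID to its tag) and the function name are assumptions made for illustration; the slides do not say how the original table was computed.

```python
from collections import Counter


def tag_frequency_groups(tag_by_transcript: dict) -> dict:
    """Group tags by how many transcripts carry them.

    tag_by_transcript maps a transcript ID to its tag (hypothetical layout).
    Returns {k: number of distinct tags shared by exactly k transcripts}.
    """
    # How many transcripts carry each tag.
    transcripts_per_tag = Counter(tag_by_transcript.values())
    # How many tags fall into each sharing level.
    groups = Counter(transcripts_per_tag.values())
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    # Toy data, not taken from UniGene.
    toy = {
        "t1": "AAAAAAAAAA",
        "t2": "AAAAAAAAAA",
        "t3": "CCCCCCCCCC",
        "t4": "GGGGGGGGGG",
    }
    print(tag_frequency_groups(toy))  # {1: 2, 2: 1}
```

Group 1 counts the tags that identify a single transcript; higher groups count tags that are ambiguous among several transcripts.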
24
Unique Tags among 878,938 EST Derived Tags
25
Unique Tags among 32,851 Gene Derived Tags
26
Converting tag into longer 3' sequence
27
Generation of Longer 3' cDNA for Gene
Identification (GLGI)
28
UniGene Human 3' Part Length Distribution
29
Myeloid Tag Matches with UniGene Human SAGE Tag
Reference Database
30
SAGE Tag Processing with GIST
31
k-mer tree
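
The slide gives only the title, so the following Python sketch is a generic illustration of a k-mer tree (a trie over the bases A, C, G, T) and not the GIST data structure itself. The class name, method names, and the idea of storing gene IDs at the leaves are all assumptions.

```python
class KmerTree:
    """Minimal trie keyed on DNA bases; each stored k-mer keeps a list of IDs."""

    _IDS = "ids"  # reserved key; cannot collide with single-letter bases

    def __init__(self):
        self.root = {}

    def insert(self, kmer: str, gene_id: str) -> None:
        """Walk or create one child node per base, then record the ID."""
        node = self.root
        for base in kmer:
            node = node.setdefault(base, {})
        node.setdefault(self._IDS, []).append(gene_id)

    def lookup(self, kmer: str) -> list:
        """Return all IDs stored under this exact k-mer, or an empty list."""
        node = self.root
        for base in kmer:
            node = node.get(base)
            if node is None:
                return []
        return list(node.get(self._IDS, []))


if __name__ == "__main__":
    tree = KmerTree()
    tree.insert("ACGTACGTAC", "geneA")  # toy tags and IDs
    tree.insert("ACGTACGTAC", "geneB")
    tree.insert("ACGTTTTTTT", "geneC")
    print(tree.lookup("ACGTACGTAC"))  # ['geneA', 'geneB']
    print(tree.lookup("GGGGGGGGGG"))  # []
```

A lookup costs one step per base regardless of how many tags are stored, and tags that share a prefix share nodes, which is the usual reason to prefer a tree over a flat list for tag matching.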
32
(No Transcript)
33
GIST Performance with Improved IO
34
Conspirators
Terry Clark, Andrew Huntwork, Josef Jurek, L. Ridgway Scott,
Sanggyu Lee, Janet D. Rowley, San Ming Wang