Bioinformatics and Evolutionary Genomics High throughput

Title: Amsterdam 2004
Author: Berend Snel
Last modified by: Snel
Created Date: 1/10/2003 9:04:34 AM
Document presentation format: On-screen Show (4:3)

Slides: 35

Transcript and Presenter's Notes


1
Bioinformatics and Evolutionary Genomics
High-throughput functional data / functional
genomics / Omics
2
High-throughput data on gene function
  • What do I mean: omics, microarray, ChIP-on-chip
  • Why are people generating these data?
  • Post-genomic era / systems biology: the challenge
    is to understand the roles of the e.g. 6,000 gene
    products in yeast and how they interact to create
    a eukaryotic organism.
  • Because they can apply automation also to other
    areas of molecular biology beyond sequencing
  • To have screens for the research question at
    hand rather than having to test one guess at a
    time
  • What about evolutionary genomics?
  • Yeast
  • Accuracy / noise

3
HTP data
  • What do they mean? Experimental knowledge, but
    what do they mean in terms of e.g. function?
  • A deluge
  • Bioinformatics is needed for basic data handling
    and has IMHO only scratched the surface in terms
    of coming up with biological questions with which
    we can probe this data

4
Microarray data
5
Microarray data
Two conditions, often used for screens
6
(Correlated) mRNA expression
  • mRNA levels are systematically measured under a
    variety of different cellular conditions, and
    genes are grouped if they show a similar
    transcriptional response to these conditions.
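
The grouping step described above can be sketched as follows. This is a minimal illustration, assuming Pearson correlation as the similarity measure and a fixed cutoff; the gene names, expression values, and the 0.9 threshold are all hypothetical and not taken from the slides.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical log-ratio profiles: one value per cellular condition.
profiles = {
    "geneA": [2.0, 1.5, -1.0, -2.0],
    "geneB": [1.8, 1.2, -0.8, -2.2],
    "geneC": [-1.0, -2.0, 1.5, 2.0],
    "geneD": [0.1, -0.2, 0.0, 0.2],
}


def coexpressed_pairs(profiles, threshold=0.9):
    """Return gene pairs whose profiles correlate at or above the threshold."""
    genes = sorted(profiles)
    pairs = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            if r >= threshold:
                pairs.append((g, h, round(r, 3)))
    return pairs


pairs = coexpressed_pairs(profiles)
```

With these made-up values only geneA and geneB are linked. Real analyses typically follow this with a clustering step and must deal with noise and missing values.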

7
Hughes et al. 2000 Cell
  • Profile Similarity Identifies Sterol-Pathway
    Disturbance Resulting from Deletion of
    Uncharacterized ORF YER044c (ERG28) and from
    Dyclonine Treatment
  • Prominent gene clusters responding to
    interference with ergosterol biosynthesis.
  • Comparison of the transcript profile of an erg28Δ
    strain to that of an erg3Δ strain.
  • (C) Sterol content of wild-type (left) and erg28Δ
    (right) strains.

8
Ihmels et al. 2002 Nature Genetics
Conventional hierarchical clustering of
co-expression data can fail, because genes can
play a role in multiple cellular processes and
their co-regulation can only be detected in a
subset of experiments.
Goal: detect genes that are co-expressed under a
subset of conditions.
Result: a comprehensive set of overlapping
transcriptional modules.
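
The idea of genes being co-expressed under only a subset of conditions can be illustrated with a toy example. This is a simplified sketch loosely inspired by that idea, not the published algorithm: starting from a seed gene set, it keeps the conditions in which the seed responds strongly and then keeps the genes that respond in the same direction under those conditions. All names, values, and thresholds are hypothetical.

```python
def module_from_seed(expr, seed_genes, cond_threshold=1.0, gene_threshold=1.0):
    """Return (selected condition indices, genes in the module)."""
    n_cond = len(next(iter(expr.values())))
    # Step 1: conditions where the seed genes respond strongly on average.
    signs = {}
    for c in range(n_cond):
        mean = sum(expr[g][c] for g in seed_genes) / len(seed_genes)
        if abs(mean) >= cond_threshold:
            signs[c] = 1 if mean > 0 else -1
    if not signs:
        return [], []
    # Step 2: genes that follow the seed response over those conditions only.
    members = []
    for gene, values in sorted(expr.items()):
        score = sum(values[c] * s for c, s in signs.items()) / len(signs)
        if score >= gene_threshold:
            members.append(gene)
    return sorted(signs), members


# Hypothetical log-ratios over five conditions.
expr = {
    "g1": [2.0, 1.8, 0.1, -0.2, 0.0],
    "g2": [1.9, 2.2, -0.1, 0.3, 0.1],
    "g3": [2.1, 1.7, 1.5, -1.8, 2.0],  # follows g1/g2 in conditions 0-1 and g4 in 2-4
    "g4": [0.0, 0.1, 1.6, -1.7, 1.9],
}

module_a = module_from_seed(expr, ["g1", "g2"])
module_b = module_from_seed(expr, ["g4"])
```

In this toy data g3 ends up in both modules, which is the kind of overlap that a single hierarchical tree cannot represent.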
9
Citric acid cycle? Different activity under
different experimental conditions
10
Rapid divergence in expression between duplicate
genes inferred from microarray promoter data
0.1 3.2 My
11
Clustering conditions, where the conditions are
genes: yet another way to get to functional
links
12
Yeast-2-hybrid
Pairs of proteins to be tested for interaction
are expressed as fusion proteins ('hybrids') in
yeast: one protein is fused to a DNA-binding
domain, the other to a transcriptional activator
domain. Any interaction between them is detected
by the formation of a functional transcription
factor.
13
  • Examples from the original Ito publication:
  • (A) autophagy
  • (B) spindle pole body function
  • (C) vesicular transport
  • Arrows: orientation of the two-hybrid interaction,
    from the bait to the prey.

14
Accuracy of Y2H and how to improve it
15
Improving reliability using protein complexes
reasoning / internal consistency
Internal filtering!
16
Accuracy of Y2H and how to improve it
17
Mass spectrometry of purified complexes.
  • Individual proteins are tagged and used as
    'hooks' to biochemically purify whole protein
    complexes. These are then separated and their
    components identified by mass spectrometry.

18
(No Transcript)
19
(No Transcript)
20
(No Transcript)
21
Exosome
Ski
Socio-affinity indices: dotted lines, 5-10;
dashed lines, 10-15; plain lines, >15. Bait
proteins are shown in bold, and shaded circles
around groups of proteins indicate cores and
modules.
Stages in mRNA degradation
22
Cellular Function
Phylogenetic profile
pdb
Y2H
23
Protein interactions: literature databases
  • Literature derived, normally manually curated (as
    opposed to text mining)
  • Biased?
  • No new knowledge
  • Useful for benchmarking and for the study of the
    evolution of e.g. protein complexes
  • For example: Munich Information Center for
    Protein Sequences (MIPS)
  • Databases that contain literature and omics data:
    Database of Interacting Proteins (DIP),
    Biomolecular Interaction Network Database (BIND)

24
Systematic screening for lethality of knockouts
on a rich medium
  • The functions of many open reading frames (ORFs)
    identified in genome-sequencing projects are
    unknown. New, whole-genome approaches are
    required to systematically determine their
    function. A total of 6925 Saccharomyces
    cerevisiae strains were constructed, by a
    high-throughput strategy, each with a precise
    deletion of one of 2026 ORFs. Of the deleted
    ORFs, 17 percent were essential for viability in
    rich medium.

Winzeler et al. 1999 Science
25
Genetic interactions (synthetic lethal/sick)
  • Two nonessential genes that cause lethality when
    mutated at the same time form a synthetic lethal
    interaction. Such genes are often functionally
    associated and their encoded proteins may also
    interact physically.

Tong et al. 2001 Science
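
The definition of a synthetic lethal interaction above can be expressed as a simple filter over viability data. This is a minimal sketch under the assumption that viability is recorded as a yes/no value; the gene names and data are hypothetical, and real screens score graded fitness (synthetic sick) rather than a strict boolean.

```python
def synthetic_lethal_pairs(single_viable, double_viable):
    """Return gene pairs that are viable as single mutants but not as a double mutant.

    single_viable: gene -> True if the single deletion strain is viable
    double_viable: (gene, gene) -> True if the double deletion strain is viable
    """
    pairs = []
    for (a, b), viable in sorted(double_viable.items()):
        both_nonessential = single_viable[a] and single_viable[b]
        if both_nonessential and not viable:
            pairs.append((a, b))
    return pairs


# Hypothetical viability calls.
single_viable = {"geneA": True, "geneB": True, "geneC": True, "geneD": False}
double_viable = {
    ("geneA", "geneB"): False,  # both nonessential, double mutant dies
    ("geneA", "geneC"): True,   # no interaction
    ("geneC", "geneD"): False,  # geneD is essential on its own, so not synthetic
}

hits = synthetic_lethal_pairs(single_viable, double_viable)
```

Only the geneA/geneB pair qualifies here; the pair involving the essential geneD is excluded because its lethality is already explained by the single deletion.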
26
(No Transcript)
27
One thing we can do with synthetic lethals
  • Ideker protein interactions

28
What to do with synthetic lethals?
Kelley and Ideker 2005 Nature Biotech
29
(No Transcript)
30
ChIP-on-chip
  • Tagged strains (one strain for each regulator).
  • Micro-array for a strain to see which pieces of
    DNA are found in excess if you isolate the
    regulator plus bound DNA.
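
Finding the "pieces of DNA found in excess" can be sketched as an enrichment ratio per array probe. This is an illustrative sketch only: it assumes one immunoprecipitated (IP) signal and one control signal per region and a plain log2 fold cutoff, whereas published analyses generally use replicate measurements and a statistical error model. Region names, intensities, and the cutoff are hypothetical.

```python
from math import log2


def enriched_regions(ip_signal, control_signal, min_log2_ratio=1.0):
    """Return regions whose IP/control log2 ratio meets the cutoff."""
    hits = {}
    for region, ip in ip_signal.items():
        ratio = log2(ip / control_signal[region])
        if ratio >= min_log2_ratio:
            hits[region] = ratio
    return hits


# Hypothetical array intensities for three intergenic regions.
ip_signal = {"promoter_1": 800.0, "promoter_2": 210.0, "promoter_3": 1500.0}
control_signal = {"promoter_1": 200.0, "promoter_2": 200.0, "promoter_3": 300.0}

bound = enriched_regions(ip_signal, control_signal)
```

Here promoter_1 and promoter_3 would be called bound by the tagged regulator, while promoter_2 shows no meaningful enrichment.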

31
GFP localization
  • Mating of fluorescent protein markers specific
    for organelles plus fluorescent protein tags for
    each gene

32
Other functional genomics data: the -omes
  • Quantitative proteomics
  • Kinome
  • PTMome
  • (Almost) all of these data are freely and publicly
    available
  • Take-home message: wow, this exists!

33
Bioinformatics for Benchmarking and Integration
(Figure: accuracy and coverage of interaction data sets)
  • Accuracy: fraction of data confirmed by reference
    set
  • Coverage: fraction of reference set covered by
    data
  • Data sets: purified complexes (TAP), purified
    complexes (HMS-PCI), genomic context, mRNA
    co-expression, synthetic lethality, yeast
    two-hybrid
  • Combined evidence: two methods, three methods
  • Legend: raw data, filtered data, parameter choices
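
Coverage and accuracy as defined on this slide can be computed directly from sets of protein pairs. The sketch below treats interactions as unordered pairs and also shows the effect of requiring support from at least two methods. The protein names and data sets are hypothetical, and the helper names are illustrative rather than taken from any published tool.

```python
from collections import Counter


def normalize(pairs):
    """Treat interactions as unordered pairs."""
    return {frozenset(p) for p in pairs}


def coverage_and_accuracy(predicted, reference):
    """Coverage: fraction of the reference set covered by the data.
    Accuracy: fraction of the data confirmed by the reference set."""
    pred, ref = normalize(predicted), normalize(reference)
    confirmed = pred & ref
    coverage = len(confirmed) / len(ref)
    accuracy = len(confirmed) / len(pred) if pred else 0.0
    return coverage, accuracy


def supported_by(datasets, min_methods):
    """Pairs reported by at least min_methods independent data sets."""
    counts = Counter()
    for pairs in datasets:
        counts.update(normalize(pairs))
    return {p for p, n in counts.items() if n >= min_methods}


# Hypothetical reference set and two hypothetical high-throughput data sets.
reference = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
y2h = [("A", "B"), ("C", "B"), ("A", "E"), ("B", "F"), ("C", "F")]
tap = [("A", "B"), ("C", "D"), ("B", "C"), ("D", "F")]

y2h_scores = coverage_and_accuracy(y2h, reference)
tap_scores = coverage_and_accuracy(tap, reference)
combined_scores = coverage_and_accuracy(supported_by([y2h, tap], 2), reference)
```

In this toy example, requiring two methods raises accuracy relative to either data set alone, while coverage is no higher than that of the less complete data set. That is the trade-off the benchmarking figure is about.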
34
Advanced integration