From motif search to gene expression analysis - PowerPoint PPT Presentation

About This Presentation
Title:

From motif search to gene expression analysis

Description:

Title: Bioinformatics Tools
Last modified by: Yael Mandel-Gutfreund
Created Date: 3/28/2003 11:41:44 AM
Document presentation format: On-screen Show (4:3)

Slides: 45

Transcript and Presenter's Notes


1
  • From motif search to gene expression analysis

2
Finding TF targets using a bioinformatics
approach
Scenario 1: Binding motif is known (easier case)
Scenario 2: Binding motif is unknown (hard case)
3
Are common motifs the right thing to search for ?
4
Solutions
- Search for motifs which are enriched in one
set but not in a random set
- Use experimental information to rank the
sequences according to their binding affinity and
search for enriched motifs at the top of the list
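The first solution (enrichment in one set relative to a random set) can be sketched roughly as below. This is only an illustration: the sequences, the word length k=6, and the smoothed ratio used as the score are all assumptions and do not come from the slides.

```python
from collections import Counter

def kmer_presence(sequences, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts

def enrichment(target, background, k, pseudocount=1.0):
    """Rank k-mers by (smoothed) fraction of target sequences containing them
    divided by the (smoothed) fraction of background sequences containing them."""
    in_target = kmer_presence(target, k)
    in_background = kmer_presence(background, k)
    scores = {}
    for word, n_target in in_target.items():
        frac_t = (n_target + pseudocount) / (len(target) + 2 * pseudocount)
        frac_b = (in_background[word] + pseudocount) / (len(background) + 2 * pseudocount)
        scores[word] = frac_t / frac_b
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy sequences (not real binding sites).
target = ["AACTGTGATT", "GCTGTGACCA", "TTTCTGTGAG"]
background = ["AACCGGTTAA", "GGATCCATGC", "TTGACCTAGA"]
ranked = enrichment(target, background, k=6)
print(ranked[0])
```

A word that occurs in every target sequence but in no background sequence ends up at the top of the ranking; in practice a statistical test would replace the raw ratio.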
5
ChIP-Seq
  • Sequencing the regions in the genome to which a
    protein (e.g. a transcription factor) binds.

6
Finding the p53 binding motif in a set of p53
target sequences which are ranked according to
binding affinity
Best Binders
ChIP SEQ
Weak Binders
7
  • A word search approach to find enriched motifs
    in a ranked list

[Figure: the word CTGTGA found eight times, and the variant CTGTGC once, in the ranked list]
8
Uses the minimal hypergeometric (mHG) statistic
to find enriched motifs
[Figure: eight occurrences of the word CTGTGA]
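A minimal sketch of the mHG idea, assuming the ranked list has been reduced to a 0/1 vector (1 if the sequence at that rank contains the word). For every cutoff n it computes the hypergeometric tail probability of seeing at least the observed number of hits among the top n sequences, and keeps the minimum. The function names and the example vector are illustrative assumptions, not the code of any particular tool.

```python
from math import comb

def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n items are drawn without replacement
    from N items of which B are hits."""
    total = comb(N, n)
    return sum(comb(B, k) * comb(N - B, n - k)
               for k in range(b, min(n, B) + 1)) / total

def mhg_score(occurrence):
    """Minimum hypergeometric tail over all cutoffs of a ranked 0/1 vector.

    occurrence[i] is 1 if the sequence at rank i (best binder first)
    contains the word, else 0. Returns (score, cutoff)."""
    N = len(occurrence)
    B = sum(occurrence)
    best_p, best_n = 1.0, 0
    hits = 0
    for n in range(1, N + 1):
        hits += occurrence[n - 1]
        p = hypergeom_tail(N, B, n, hits)
        if p < best_p:
            best_p, best_n = p, n
    return best_p, best_n

# Hypothetical ranked list: the word is concentrated among the best binders.
ranked = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
score, cutoff = mhg_score(ranked)
print(score, cutoff)
```

Because the minimum is taken over many cutoffs, the returned score is not itself a p-value; it would still need a correction before being interpreted as one.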
9
The enriched motifs are combined to get a PSSM
which represents the binding motif
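One common way to turn a set of aligned words into a PSSM is sketched below. The slides do not say how the enriched words are combined, so this assumes the words are already aligned and of equal length, adds a pseudocount of 1, and scores against a uniform background of 0.25 per base. The input reuses the words from the word-search figure (CTGTGA eight times, CTGTGC once).

```python
from math import log2

def build_pssm(words, alphabet="ACGT", pseudocount=1.0, background=0.25):
    """Log-odds PSSM: one dict of base -> score per motif position."""
    length = len(words[0])
    if any(len(w) != length for w in words):
        raise ValueError("all words must have the same length")
    total = len(words) + pseudocount * len(alphabet)
    pssm = []
    for pos in range(length):
        column = [w[pos] for w in words]
        scores = {}
        for base in alphabet:
            freq = (column.count(base) + pseudocount) / total
            scores[base] = log2(freq / background)
        pssm.append(scores)
    return pssm

def score_site(pssm, site):
    """Sum of per-position log-odds scores for a candidate site."""
    return sum(column[base] for column, base in zip(pssm, site))

words = ["CTGTGA"] * 8 + ["CTGTGC"]
pssm = build_pssm(words)
for site in ("CTGTGA", "CTGTGC", "AAAAAA"):
    print(site, round(score_site(pssm, site), 2))
```

The consensus word scores highest, the single-mismatch variant somewhat lower, and an unrelated word gets a negative score.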
11
Protein Motifs
Protein motifs are usually 6-20 amino acids long
and can be represented as a consensus/profile, e.g.
PEDXKRWRKXED
or as a PWM
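A consensus such as PEDXKRWRKXED can be searched for with a regular expression. The sketch below assumes that X stands for any amino acid (the usual convention, though the slide does not state it); the protein sequence is made up for illustration.

```python
import re

def consensus_to_regex(consensus, wildcard="X"):
    """Translate a consensus string into a compiled regular expression,
    treating the wildcard character as 'any residue'."""
    parts = ["[A-Z]" if ch == wildcard else re.escape(ch) for ch in consensus]
    return re.compile("".join(parts))

pattern = consensus_to_regex("PEDXKRWRKXED")
protein = "MSTPEDAKRWRKGEDLLK"  # hypothetical sequence
hits = [(m.start(), m.group()) for m in pattern.finditer(protein)]
print(hits)
```

Note that `finditer` reports non-overlapping matches only, which is usually sufficient for motifs of this length.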
12
Gene Expression Analysis
13
Gene Expression
protein
RNA
DNA
14
Gene Expression
[Figure: poly(A)-tailed (AAAAAAA) mRNA molecules of gene1, gene2 and gene3, each present in a different number of copies]
15
Studying Gene Expression 1987-2013
Microarray (first high throughput gene expression
experiments)
DNA chips
RNA-seq (Next Generation Sequencing)
16
Classical versus modern technologies to study
gene expression
  • Classical Methods (Spotted microarray, DNA chips)
  • - Require prior knowledge of the RNA transcript
  • - Good for studying the expression of known genes
  • New generation RNA sequencing
  • - Does not require prior knowledge
  • - Good for discovering new transcripts

17
Experimental Protocol: Two-channel cDNA arrays
http://www.bio.davidson.edu/courses/genomics/chip/chip.html
18
One channel DNA chips
  • Each sequence is represented by a probe set
    colored with one fluorescent dye
  • Target hybridizes to complementary probes only
  • The fluorescence intensity is indicative of the
    expression of the target sequence

19
Affymetrix Chip
20
RNA-seq
21
Clustering the data according to expression
profiles

Genes
Expression in different conditions
22
WHY? What can we learn from the clusters?
  • Identify gene function
  • - Similar expression can imply similar function
  • Diagnostics and Therapy
  • - Differences in gene expression can indicate a
    disease state
  • - Genes which change expression in a disease can be
    good candidates for drug targets

23
A molecular signature of metastasis in primary
solid tumors
Samples were taken from patients with
adenocarcinoma. Hundreds of genes that
differentiate between cancer tissues at
different stages of the tumor were found. The
arrow shows an example of tumor cells that
were not detected correctly by histological or
other clinical parameters.
Ramaswamy et al., 2003, Nat Genet 33:49-54
24
HOW? Different clustering approaches
  • Unsupervised
  • - Hierarchical Clustering
  • - K-means
  • Supervised Methods
  • - Support Vector Machine (SVM)

25
Clustering
  • Clustering organizes things that are close into
    groups.
  • - What does it mean for two genes to be close?
  • - Once we know this, how do we define groups?

26
What does it mean for two genes to be close?
We need a mathematical definition of distance
between the expression patterns of two genes.
For example, the distance between gene 1 and gene 2:
[Figure: expression levels of Gene 1 and Gene 2 across conditions 01-22]
$$\mathrm{Gene}_1 = (E_{11}, E_{12}, \ldots, E_{1N}), \qquad \mathrm{Gene}_2 = (E_{21}, E_{22}, \ldots, E_{2N})$$
Euclidean distance:
$$d(\mathrm{Gene}_1, \mathrm{Gene}_2) = \sqrt{\sum_{i=1}^{N} (E_{1i} - E_{2i})^2}$$
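The Euclidean distance between two expression profiles can be computed directly from the definition. The two three-condition profiles below are hypothetical values chosen so that the arithmetic is easy to follow.

```python
from math import sqrt

def euclidean_distance(profile_1, profile_2):
    """Square root of the summed squared differences across conditions."""
    if len(profile_1) != len(profile_2):
        raise ValueError("profiles must cover the same conditions")
    return sqrt(sum((e1 - e2) ** 2 for e1, e2 in zip(profile_1, profile_2)))

# Hypothetical expression values in three conditions.
gene_1 = [2.0, 4.0, 6.0]
gene_2 = [2.0, 1.0, 2.0]
d = euclidean_distance(gene_1, gene_2)  # sqrt(0 + 9 + 16)
print(d)
```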
27
Clustering the genes according to expression
Hierarchical Clustering
Generate a tree based on the distances between
genes (similar to a phylogenetic tree).
Each gene is a leaf on the tree.
Distances reflect the similarity of their
expression patterns.
Gene Cluster
Genes
Expression in different conditions
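A bare-bones version of the tree building can be sketched as agglomerative clustering: start with every gene as its own cluster and repeatedly merge the two closest clusters. The slides do not specify a linkage rule, so average linkage with Euclidean distance is assumed here, and the four gene profiles are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

def hierarchical_clustering(profiles):
    """profiles: dict of gene name -> list of expression values.
    Returns the tree as nested tuples; each gene name is a leaf."""
    # Each cluster is (subtree, list of member gene names).
    clusters = [(name, [name]) for name in profiles]

    def average_linkage(members_a, members_b):
        pairs = [(a, b) for a in members_a for b in members_b]
        return sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]][1],
                                             clusters[pair[1]][1]),
        )
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical expression profiles over three conditions.
profiles = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [1.0, 2.0, 4.0],
    "g3": [9.0, 8.0, 7.0],
    "g4": [9.0, 9.0, 7.5],
}
tree = hierarchical_clustering(profiles)
print(tree)
```

Genes with similar profiles end up as neighbouring leaves, and the order of merging gives the branching structure of the tree.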
28
Clustering the genes according to gene expression
Genes
GENE a:  1, -1,  1,  1, 1, -1, -1, -1
GENE b:  1,  1, -1,  1, 1,  1, -1,  1
GENE c:  1, -1,  1, -1, 1, -1, -1, -1
GENE d: -1,  1, -1,  1, 1,  1, -1, -1
Distance Table (Euclidean distance)
    a     b     c     d
a   0     4     2     4
b   4     0     4.47  2.83
c   2     4.47  0     4.47
d   4     2.83  4.47  0
Distances
Dab = 4, Dac = 2, Dad = 4, Dbc = 4.47, Dbd = 2.83, Dcd = 4.47
  • Can be calculated using different distance metrics
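The pairwise distances for genes a to d can be recomputed from the profiles on this slide, and the same helper can be reused with a different metric. The Pearson-based distance (1 minus the correlation) is included only as one example of an alternative metric; it is an assumption, not something the slide names.

```python
from itertools import combinations
from math import sqrt

genes = {
    "a": [1, -1, 1, 1, 1, -1, -1, -1],
    "b": [1, 1, -1, 1, 1, 1, -1, 1],
    "c": [1, -1, 1, -1, 1, -1, -1, -1],
    "d": [-1, 1, -1, 1, 1, 1, -1, -1],
}

def euclidean(x, y):
    return sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def pearson_distance(x, y):
    """1 - Pearson correlation: 0 for perfectly correlated profiles,
    2 for perfectly anti-correlated ones."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sd_x = sqrt(sum((xi - mean_x) ** 2 for xi in x))
    sd_y = sqrt(sum((yi - mean_y) ** 2 for yi in y))
    return 1 - cov / (sd_x * sd_y)

def distance_table(profiles, metric):
    """Distance for every unordered pair of genes."""
    return {(g, h): metric(profiles[g], profiles[h])
            for g, h in combinations(profiles, 2)}

euclid = distance_table(genes, euclidean)
pearson = distance_table(genes, pearson_distance)
for pair in euclid:
    print(pair, round(euclid[pair], 2), round(pearson[pair], 2))
```

With either metric the closest pair is the one a hierarchical clustering would merge first.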
29
Analyzing the clusters of genes
Cluster 2
Cluster 3
Cluster 4
30
What can we learn from clusters with similar gene
expression?
31
EXAMPLE: hnRNP A1 and SRp40
hnRNP A1 and SRp40 are not clear homologs based
on BLAST E-value, but they have very similar gene
expression patterns in different tissues
32
Are hnRNP A1 and SRp40 functional homologs?
[Figure: hnRNP A1 and SRp40 shown together with elements labeled SF]
YES!
33
What else can we learn from clusters with
similar gene expression?
  • Similar expression between genes can mean that
  • - The genes have similar function
  • - One gene controls the other
  • - All genes are controlled by a common regulatory
    gene

34
How can gene expression help in diagnostics?
35
How can gene-expression help in diagnostics ?
Genes
RESEARCH QUESTION: Can we distinguish BRCA1 from
BRCA2 cancers based solely on their gene
expression profiles?
HERE we want to cluster the patients, not the genes!
36
How can gene expression be applied for diagnostics?
5 Breast Cancer Patients
Patient 1 Patient 2 Patient 3 Patient 4 Patient 5
Gen1 - -
Gen2 - -
Gen3 - -
Gen4 - -
Gen5 - - -
37
How can gene expression be applied for diagnostics?
BRCA1
BRCA2
Patient 1 Patient 2 Patient 4 Patient 3 Patient 5
Gen1 - -
Gen3 - -
Gen4 - -
Gen2 - -
Gen5 - - -
Informative Genes
Two-way clustering: clustering both the patients
and the genes
38
Supervised approaches for diagnostics based on
expression data
Support Vector Machine (SVM)
39
  • An SVM would begin with a set of samples from
    patients who have been diagnosed as either
    BRCA1 (red dots) or BRCA2 (blue dots).

Each dot represents a vector of the expression
pattern taken from the microarray experiment of a
patient.
40
How do SVMs work with expression data?
The SVM is trained on data which was classified
based on histology.
After training the SVM to separate the BRCA1
from the BRCA2 tumors given the expression data, we
can then apply it to diagnose an unknown tumor
for which we have the equivalent expression data.
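The train-then-diagnose workflow can be illustrated with a very small linear SVM written from scratch. Everything here is an assumption made for illustration: the two-gene expression values are invented, BRCA1 is encoded as +1 and BRCA2 as -1, and the training loop is plain subgradient descent on the hinge loss. A real analysis would use thousands of genes and an established library such as scikit-learn.

```python
def train_linear_svm(samples, labels, epochs=200, learning_rate=0.01, reg=0.01):
    """Linear SVM trained by subgradient descent on hinge loss plus an
    L2 penalty. Labels must be +1 or -1. Returns (weights, bias)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample is misclassified or inside the margin:
                # move the separating line away from it.
                w = [wi + learning_rate * (y * xi - reg * wi)
                     for wi, xi in zip(w, x)]
                b += learning_rate * y
            else:
                # Sample is safely classified: only apply the penalty.
                w = [wi - learning_rate * reg * wi for wi in w]
    return w, b

def predict(w, b, x):
    """+1 (BRCA1) or -1 (BRCA2), by the side of the separating line."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical expression of two genes per patient.
samples = [[2.0, 0.5], [2.5, 1.0], [3.0, 0.2],   # diagnosed BRCA1
           [0.5, 2.0], [1.0, 2.5], [0.2, 3.0]]   # diagnosed BRCA2
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
unknown_tumor = [2.6, 0.6]
print(predict(w, b, unknown_tumor))
```

Each training sample plays the role of one dot in the figure, and the learned weights define the line that separates the red dots from the blue ones.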
41
Projects 2013-14
42
Instructions for the final project
Introduction to Bioinformatics 2013-14
Key dates
12.12 Lists of suggested projects published. You are
highly encouraged to choose a project yourself or
find a relevant project which can help in your research
9.1 Submission of project overview (one page)
- Title
- Main question
- Major tools you are planning to use to answer the questions
Final week: meetings on projects
12.3 Poster submission
19.3 Poster presentation
43
2. Planning your research
After you have described the main question or questions of your
project, you should carefully plan your next steps.
A. Make sure you understand the problem and read the necessary
background to proceed.
B. Formulate your working plan, step by step.
C. After you have a plan, start by extracting the necessary data
and decide on the relevant tools to use at the first step. When
running a tool, make sure to summarize the results and extract
the relevant information you need to answer your question. It is
recommended to save the raw data for your records, but don't
present raw data in your final project. Your initial results
should guide you towards your next steps.
D. When you feel you have explored all the tools you can apply to
answer your question, you should summarize and draw conclusions.
Remember, NO is also an answer as long as you are sure it is NO.
Also remember this is a course project, not only a HW exercise.

44
  • Summarizing the final project in a poster (in pairs)
  • Prepare in PPT, poster size 90-120 cm
  • Title of the project
  • Names and affiliation of the students presenting
  • The poster should include 5 sections
  • - Background: should include a description of your
    question (can add a figure)
  • - Goal and Research Plan: describe the main objective
    and the research plan
  • - Results (main section): present your results in
    3-4 figures, describe each figure (figure
    legends) and give a title to each result
  • - Conclusions: summarize the conclusions of your
    project in points
  • - References: list the references of the
    papers/databases/tools used for your project

Examples of posters will be presented in class