Title: Introduction to Regulatory Genomics
1. Introduction to Regulatory Genomics
2. Overview
- DNA -> RNA -> Protein
- Exons/Introns
- Alternative Splicing
- Transcription
- Promoter, Transcription Factors -> enhance binding
to RNA Polymerase II -> allows for initiation of
transcription.
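As a minimal sketch of the DNA -> RNA -> Protein flow, the Python below transcribes a coding-strand DNA string and translates it with the standard genetic code. The function names and the example sequence are illustrative assumptions; splicing, untranslated regions and start-codon scanning are deliberately ignored.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)

# Made-up example: ATG (Met), GCC (Ala), TAA (stop).
print(translate(transcribe("ATGGCCTAA")))
```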
3. Transcription in Eukaryotic Organisms
- Cis-regulatory sequences
- Short (usually 5-10 nt) DNA sequences associated
with each gene
- Degenerate: similar sequences confer equivalent
binding sites
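One way to picture degeneracy is to write a binding site with IUPAC ambiguity codes and scan a sequence for every concrete variant it allows. The sketch below assumes a forward-strand scan only; the consensus string and the test sequence are illustrative, not taken from the slides.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def find_sites(motif, sequence):
    """Return 0-based start positions where the degenerate motif matches."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead is used so that overlapping matches are also reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]

# Illustrative 7-nt consensus in which S stands for C or G.
sites = find_sites("TGASTCA", "AATGACTCAGGTGAGTCATT")
print(sites)  # two different sequences satisfy the same motif
```

Here both TGACTCA and TGAGTCA count as hits, which is the sense in which similar sequences confer an equivalent binding site.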
4. Transcription in Eukaryotic Organisms
- Variable positions relative to gene: proximal or
distal
- Sequences often cluster together, forming
"cis-regulatory modules"
- Modules act independently to direct transcription
of the gene
5. Transcription Factors
- DNA-binding proteins that bind cis-regulatory
sequences and can enhance or repress
transcription
- Bind cooperatively to adjacent sites
- TF-TF interactions are relatively weak and
nonspecific: small changes (point mutations) can
have large effects.
- Combinatorial control of transcription
7. The framework
- Animal body plans and their structure/function
are the result of developmental processes in time
and space.
- Development is mediated by the regulation of
thousands of genes in time and space.
8. The framework
- This regulation is largely done by the
interaction of DNA-binding proteins called
transcription factors (TFs) with other TFs and gene
promoters.
- Often, binding sites upstream of the gene
promoter are also involved in these regulatory
processes.
9. The framework
- The DNA sequence of these binding sites can
change over evolutionary time due to mutations.
- Alterations in the animal body plan are caused by
changes in the organization of these regulatory
sequences.
10. Regulatory Information and the Genome
- The genome contains the information for all cell
types in the body.
- In order for an organism to develop, different
sets of genes need to be expressed or repressed
at the right place and time.
- If this does not occur properly, the effects are
usually lethal.
11. Regulatory Information and the Genome
- Often, what is regulated in development is the
expression of transcription factors.
- These factors then influence the expression of
other TFs as well as intercellular signals.
12. General Principles of Organization of Regulation
in Development
- 1) Signalling affects regulatory gene expression.
- Cells often send chemical signals to other cells,
which help to pattern gene expression in space.
- This means that gene regulatory elements often
contain sequence that is responsive to
intracellular signal transduction pathways.
13. General Principles of Organization of Regulation
in Development
- Intracellular signal transduction pathways are
the biochemical pathways by which the cell takes
external signals and changes gene expression.
- 2) Development is controlled by networks of genes
(often coding for TFs) that regulate other genes.
14. General Principles of Organization of Regulation
in Development
- Transcription factors often target hundreds of
genes. Which genes are actually activated (or
repressed) depends on the combination of TFs
present at the promoter region of the gene.
15. Gene Regulatory Networks
- Davidson conceives of these gene regulatory
networks in the following way: each node is a
gene, which takes multiple inputs.
- Depending on the specific combination of inputs,
the gene provides multiple outputs to other genes
in the network.
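The node-with-inputs picture can be sketched as a small Boolean network, in which each gene's next state is a function of the current states of its inputs. This is a simplified illustration, not Davidson's own formalism: the gene names, the wiring and the synchronous update scheme are all assumptions made for the example.

```python
# Hypothetical three-gene network; names and rules are invented for illustration.
RULES = {
    "geneA": lambda s: s["signal"],                    # driven by an external signal
    "geneB": lambda s: s["geneA"] and not s["geneC"],  # activated by A, repressed by C
    "geneC": lambda s: s["geneA"] and s["geneB"],      # needs both A and B
}

def step(state):
    """Update every gene at once from the current state (synchronous update)."""
    new_state = dict(state)
    for gene, rule in RULES.items():
        new_state[gene] = bool(rule(state))
    return new_state

def simulate(state, n_steps):
    """Apply `step` n_steps times and return the final state."""
    for _ in range(n_steps):
        state = step(state)
    return state

initial = {"signal": True, "geneA": False, "geneB": False, "geneC": False}
for t in range(1, 5):
    print(t, simulate(initial, t))
```

Each gene reads several inputs and feeds its own state forward to other genes, so the same node can be on or off depending on the combination present at that time step.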
16. Regulatory Genes perform multiple roles in
development
- The number of regulatory genes is limited, and
all animals use more or less the same number of
DNA-binding motifs across their different
transcription factors.
- Transcription factors are often required for
different processes at different times in
development, and they are often used for many
unrelated purposes in the life cycle.
18. C-value paradox
- Genome size does not scale with complexity of the
organism.
- Number of protein-coding genes does not scale
with complexity of the organism.
- Biologists assume this is due to more complex
ways of regulating genes.
- Alternative splicing also allows different domains
to be swapped in and out of proteins.
19. Overview of Regulatory Architecture
- The short size and degeneracy of regulatory DNA
motifs mean that they will occur at random in
enormous numbers.
- Functional distribution of these motifs is highly
non-random.
- Functional regulatory elements that have been
isolated consist of relatively dense clusters of
distinct sites recognized by diverse DNA-binding
proteins.
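A back-of-envelope estimate shows why short motifs turn up by chance so often. The sketch below assumes a random sequence with equal, independent base frequencies and an exact (non-degenerate) motif; degenerate motifs would match even more often. The function name and the genome size used in the example are assumptions for illustration.

```python
def expected_random_hits(genome_length, motif_length, both_strands=True):
    """Expected number of chance matches to one exact motif.

    Assumes each base is A, C, G or T with probability 1/4, independently.
    Doubling for the reverse strand is an approximation that double counts
    palindromic motifs.
    """
    positions = genome_length - motif_length + 1
    per_strand = positions * 0.25 ** motif_length
    return 2 * per_strand if both_strands else per_strand

# Illustrative numbers: a 6-nt motif in a 3-billion-base genome.
print(round(expected_random_hits(3_000_000_000, 6)))
```

Under these assumptions a single 6-nt motif is expected on the order of a million times by chance alone, so an isolated match carries little information.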
20. Overview of Regulatory Architecture
- Specific clusters of sites specify regulatory
activity.
- Cis-regulatory modules produce unique regulatory
outputs in time and space in the organism.
- In other words, these functional groupings are
associated with certain cells at certain times in
development.
21. Overview of Regulatory Architecture
- Davidson defines enhancers as cis-regulatory
modules located many kb distant from the basal
transcriptional apparatus.
- AKA the BTA: the promoter, the transcriptional
start site, etc.
- Enhancers communicate with the BTA by DNA
looping.
22. Overview of Regulatory Architecture
- Other modules might function as silencers by
preventing proteins from binding to each other or
to the BTA.
- Only by experimentally verifying the function of
target sites within a cis-regulatory element do
we understand what the genomic regulatory
sequence means.
23. Gene Regulatory Networks
- At the periphery of developmental gene networks
are the sets of protein-coding differentiation
genes that define particular cell types.
- These genes do not have outputs affecting other
developmental genes.
- Developmental gene networks progressively specify
exclusionary fates for cell types.
24. Cell types and the Genome
- All the specificity for each cell in the organism
is ultimately contained in the genome.
- Spatial expression: Eve2 as example
- One of the most important regulatory objectives
in development is the control of spatial gene
expression.
25. Cell types and the Genome
- Cis-regulatory modules are sufficient to drive
spatial expression, given the presence of the
appropriate TF input.
- Ectopically incorporated cis-regulatory modules
are sufficient to generate correct patterns of
spatial gene expression.
26. Regulation
- Methylation of DNA: inhibits transcription
- Methylation (transcriptional repression) and
acetylation (transcriptional activation) of
nucleosomal histones
- Polycomb proteins remodel chromatin so TFs can't
bind to promoters
- microRNAs repress expression via
post-transcriptional binding.
27. Cis-regulatory modules
- CRMs: non-random clusters of certain target sites
that usually span a few hundred bases.
- Modules dictate where, when, and how genes are
expressed in development.
- CRMs can be repressive or activating.
- Output can be considered the combinatorial
product of multiple operations.
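If predicted site positions are already in hand, candidate clusters can be flagged with a simple sliding window. The sketch below is one hypothetical way to do this, not a published CRM-finding method; the 300-base window, the three-site minimum and the example coordinates are assumptions chosen to echo "a few hundred bases".

```python
def find_clusters(site_positions, window=300, min_sites=3):
    """Return (start, end) spans where at least `min_sites` sites fall
    within `window` bases of the first site; overlapping spans are merged,
    so a merged span can be longer than `window`."""
    positions = sorted(site_positions)
    clusters = []
    j = 0
    for i, start in enumerate(positions):
        # Advance j to the first site lying beyond the window opened at `start`.
        while j < len(positions) and positions[j] - start <= window:
            j += 1
        if j - i >= min_sites:
            end = positions[j - 1]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous cluster: extend it instead.
                clusters[-1] = (clusters[-1][0], max(end, clusters[-1][1]))
            else:
                clusters.append((start, end))
    return clusters

# Made-up site coordinates: two dense groups and two isolated sites.
sites = [100, 150, 220, 5000, 9000, 9100, 9250, 9400]
print(find_clusters(sites))
```

Isolated hits such as the one at position 5000 are dropped, which mirrors the idea that functional sites are distinguished by clustering rather than by their individual sequence.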
28. Cis-regulatory modules
- CRMs present their information as inputs.
- The DNA-binding proteins must bind, and the
resulting outputs must be effectively communicated
elsewhere to effect changes, for example to the
basal transcription apparatus or molecules
associated with it.
29. Computational Identification
- CRMs tend to be highly conserved: functional
significance is inferred from less-than-expected
DNA divergence.
- Putative binding sites can be detected upon
examination of highly conserved sites.
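A simple way to look for "less than expected divergence" is to score identity in sliding windows along a pairwise alignment. The sketch below assumes two pre-aligned sequences of equal length with "-" marking gaps; the window size, step and toy sequences are illustrative, and real analyses would compare against a neutral divergence rate rather than eyeball the numbers.

```python
def window_identity(seq_a, seq_b, window=10, step=5):
    """Return (start, fraction_identical) for each window of the alignment.

    Gap columns never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        results.append((start, matches / window))
    return results

# Toy alignment: the first ten columns are identical, the middle has diverged.
species_1 = "ACGTACGTACGTACGTACGT"
species_2 = "ACGTACGTACTTTTTTACGT"
for start, identity in window_identity(species_1, species_2):
    print(start, identity)
```

Windows with unusually high identity are the candidates in which to look for putative binding sites.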
30. More on CRMs
- Davidson calls "drivers" the active TFs that
drive different regulatory states by appearing
and activating genes at different times and
places.
- These drivers do not tell the whole story.
- There are many more tightly and specifically
bound sites than just the subset occupied by
drivers.
31. CRM example: cyIIIa
- Network of conditional logic interactions
programmed into the DNA
33. The combinatorial cis-regulatory logic code
- Genomic regulatory code: how do different
combinations of binding sites regulate genes?
- Sorin is working to solve this problem in
conjunction with Eric Davidson and Ryan Tarpine.
34. Steps to doing this computationally
- Identify target sites (and the factors that bind
them)
- Interpret the functions mediated by these sites
and factors
- Find rules for combining these functions to infer
overall cis-regulatory output.
35. Steps to doing this computationally
- Need a sufficiently useful, discriminatory and
general target site database.
- A given TF can activate or repress, depending on
combinations of protein partners.
- Want to find functional combinations of target
sites/proteins.
36. Logic Functions
- Certain sites within modules can be activated in
a cooperative fashion and associate with other
modules.
- Can develop logic functions for the proper
expression of a gene.
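A logic function for a single module can be written as a Boolean expression over site occupancy and checked by listing its truth table. The rule below (two cooperating activators and one repressor) is a hypothetical example chosen for illustration, not the logic of any real gene.

```python
from itertools import product

def module_output(activator1, activator2, repressor):
    """Hypothetical rule: expressed only when both activators are bound
    (cooperative AND) and the repressor site is empty."""
    return (activator1 and activator2) and not repressor

def truth_table(func, n_inputs):
    """Evaluate `func` on every combination of bound/unbound inputs."""
    return {
        inputs: func(*inputs)
        for inputs in product([False, True], repeat=n_inputs)
    }

table = truth_table(module_output, 3)
for inputs, expressed in table.items():
    print(inputs, "->", expressed)
```

Of the eight possible input combinations, only one turns the gene on under this rule. That is the kind of specificity combinatorial control is meant to provide.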