Title: Modeling DNA Sequence Based cis-Regulatory Gene Networks
1. Modeling DNA Sequence Based cis-Regulatory Gene
Networks
- Hamid Bolouri and Eric H. Davidson
- Presented by Geoffrey
2. Introduction
- Cis-regulatory elements are stretches of DNA that
contain target site sequences recognized by
DNA-binding proteins
- They are genetically hardwired information
processors, linked together to form a large
network
- Each element receives informational input that
determines its activity and produces an
informational output in the form of regulatory
instructions (i.e. it activates or inhibits other
elements)
3. Introduction
- The genetic regulatory apparatus is the same in
every cell. What it does depends on the inputs
that it receives at each point in time
- Some of these inputs depend on the prior activity
of genes that synthesize the necessary factors,
and others on external events such as
extracellular signals
4. Cis-Regulatory Elements
- Each element carries out some processing of its
input information
- Inputs are often multiple, while the output is a
single function that tells the basal
transcription apparatus how frequently to
initiate transcription
- Example of an element in diagram form
- (This diagram shows a gene whose expression is
activated by a ubiquitous activator and inhibited
by protein A)
5. Cis-Regulatory Elements in Development
- Cis-regulatory information processing is
important in development because development
depends fundamentally on spatial (which cell
types, and where) and temporal (when) control of
gene expression
- These decisions result from logic functions
carried out by the regulatory elements
- For example, a given cis-regulatory element might
express a gene only where two inputs overlap (an
AND operation), resulting in the appearance of a
new factor, or it might control expression
through the interplay between positive and
negative inputs
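The two cases in the last bullet can be sketched in a few lines of Python. This is only an illustration: the cell labels, the factors X and Y, and the repressor are hypothetical names, not taken from the paper.

```python
# Hypothetical embryo: five cells, with the sets of cells in which each
# regulatory factor is present (all names here are illustrative).
cells = ["c1", "c2", "c3", "c4", "c5"]
input_x = {"c1", "c2", "c3"}   # cells containing activating factor X
input_y = {"c3", "c4"}         # cells containing activating factor Y
repressor = {"c2"}             # cells containing a negative input

def and_element(cell):
    """Express only where both positive inputs overlap (AND operation)."""
    return cell in input_x and cell in input_y

def activator_repressor_element(cell):
    """Express where X is present unless the negative input is also present."""
    return cell in input_x and cell not in repressor

and_domain = [c for c in cells if and_element(c)]
act_rep_domain = [c for c in cells if activator_repressor_element(c)]
print(and_domain, act_rep_domain)
```

In this toy example the AND element defines a new, narrower expression domain than either input alone, which is how overlapping inputs can give rise to a new spatial territory.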
6. Cis-Regulatory Elements in Development
- Hence thinking about cis-regulatory elements from
an informational point of view leads directly to
the mutable, measurable regulatory properties of
genomic DNA
- The DNA sequence of each element dictates which
inputs the element listens to and the functions
it is capable of carrying out
- Each input therefore implies a target site
sequence that can be identified and tested by
mutation or gene transfer
7. Illustration: Endo16 Model
- The cis-regulatory system of the endo16 gene of
the sea urchin has been studied in great detail
- The gene has a modestly complex pattern of
expression during embryogenesis
- It is activated in the vegetal plate of the
embryo, specifically in the Veg2 lineage, at
about the 8th cleavage
- The Veg2 lineage consists of the progeny of eight
6th cleavage founder cells, and from it derives
most of the endoderm
8. Illustration: Endo16 Model
- The endo16 gene is transcribed in this
endomesodermal field until gastrulation (the
process by which cells of the blastoderm are
translocated to new positions in the embryo),
during which it is expressed throughout the
invaginating archenteron but no longer in the
mesodermal domain
- As the gut becomes regionalized, expression is
extinguished in the foregut and hindgut but
accelerates in the midgut, where it continues in
the feeding larva
9. Illustration: Endo16 Model
- Summary of the expression pattern of endo16 gene
(shown in blue)
10. Illustration: Endo16 Model
- The cis-regulatory system that controls endo16
expression is about 2300 base pairs long and
consists of several clusters of target sites that
execute distinct functions, so each cluster can
be thought of as a separable, modular regulatory
element
- The basal promoter region (Bp), where the basal
transcription apparatus assembles, has no
regulatory activity of its own and services
regulatory elements expressed in any domain of
the embryo
11. Illustration: Endo16 Model
- Modules A and B carry out many interesting
regulatory functions
- Together they have 17 target sites for factors
that recognize and bind specifically to given
sequences
- One protein, SpGCF1, interacts at 5 sites of
module A. The other 12 target sites are serviced
by 9 different transcription factors, and each
interaction has a distinct and measurable
functional meaning
- The details of the interactions are shown in the
box below. The target sites are indicated by
boxes (blue for Module B and red for Module A).
Arrows lead from each target site to the logic
operations, which are indicated in circles. Each
logic operation states how the inputs are
combined
12. Illustration: Endo16 Model
- The complete model of endo16 expression is shown
in the figure below. The elements are now the
individual genes involved in the expression
13. Logic Operation
- The model specifies the logic operations by which
the inputs are processed and the altered values
are carried forward
- Common operations include:
- AND: when all the conditions are met, the
indicated operation on the value of the output at
that node takes place
- OR: when one or more of the conditions are met,
the indicated operation takes place
14. Logic Operation
- The logic operations have direct physical
implications. The AND operator shows that the
proteins binding at the respective sites are
together necessary for the function to occur
(e.g. through formation of a large functional
complex by the transcription factors)
- However, this does not necessarily mean that the
output is all-or-none. Alternative output values
can be assigned for the case where inputs are
absent by adding an else clause
- The point is that the model describes the
functions that are mediated by each site,
conditional on the inputs present. It does not
attempt to describe the biochemistry of the
proteins that contribute to this function
- Simply put, these are information processing
constructs similar to those found in ordinary
programming languages
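To make the comparison with programming constructs concrete, here is a minimal Python sketch of AND and OR nodes with an else value. The function names, the site names and the numeric values are assumptions chosen for illustration; they are not the actual operations of the endo16 model.

```python
def and_gate(conditions, then_value, else_value):
    """Return then_value only if every condition holds, otherwise else_value."""
    return then_value if all(conditions) else else_value

def or_gate(conditions, then_value, else_value):
    """Return then_value if at least one condition holds, otherwise else_value."""
    return then_value if any(conditions) else else_value

# Hypothetical occupancy of two target sites in one nucleus.
site_p_bound = True
site_q_bound = False

# AND node that is not all-or-none: the else value leaves a basal gain of 1.0
# instead of switching the output off.
gain = and_gate([site_p_bound, site_q_bound], then_value=2.0, else_value=1.0)

# OR node used as an on/off switch.
switch = or_gate([site_p_bound, site_q_bound], then_value=1, else_value=0)

print(gain, switch)
```

The else value is what lets an AND node express a graded rather than all-or-none outcome: here the node falls back to a gain of 1.0 when the second site is unoccupied.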
15. Continuous and Boolean Functions
- Taking the computational model again as an
example:
- Filled boxes with solid lines extending from them
indicate inputs whose amplitude varies over time,
e.g. UI, R, OTX
- Open boxes with dashed lines indicate inputs that
are usually present in excess and can therefore
be regarded as Boolean inputs, i.e. they are
either present or absent
- Open boxes with thin lines indicate scalar
operations on the inputs of the node
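The three kinds of box can be combined in one small Python sketch: a time-varying amplitude, a Boolean presence flag, and a scalar operation applied at the node. The time course, the scale factor and the function name are hypothetical and do not reproduce measured endo16 data.

```python
def node_output(amplitude, factor_present, scale=2.0):
    """Apply a scalar operation to a time-varying input when a Boolean input
    is present; otherwise pass the amplitude through unchanged."""
    return [a * scale if factor_present else a for a in amplitude]

# Hypothetical time course of a continuous input (arbitrary units).
time_course = [0.0, 0.5, 1.0, 0.75]

boosted = node_output(time_course, factor_present=True)
unchanged = node_output(time_course, factor_present=False)
print(boosted, unchanged)
```

The Boolean input never contributes a magnitude of its own here; it only decides whether the scalar operation is applied to the continuous input.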
16. Continuous and Boolean Functions
- Hence the endo16 model is not a kinetic model per
se
- It does not consist of a set of time-based
differential equations describing reaction
kinetics
- Instead, it describes the logic functions
mediated by the DNA target sites
- Although this approach is not new in other fields
such as engineering, it offers a refreshing way
of modeling gene regulatory networks, for which
models are predominantly based on differential
equations
17. Models for Networks of cis-Regulatory Elements:
Symbolism and Significance
- All major processes in animal development are
driven forward by regulatory genes, i.e. genes
that encode transcription factors
- Developmental events are not discrete, and the
regulatory networks that control development are
often connected to other networks that control
prior and surrounding processes in both the
spatial and temporal domains
- The model used for cis-regulatory elements can
represent both the beginning of the process whose
genetic program the network displays and its end,
the activation of gene batteries (sets of genes),
e.g. endo16, which encodes an adhesion protein
involved in gastrulation of the sea urchin embryo
18. General Purposes of DNA Sequence Based Network
Models
- The objective of such a model is to:
- State the key inputs and outputs of the
cis-regulatory system
- Explain why each gene runs where and when it does
- Show how the spatial territories are built up
- Even incomplete models are informative, as the
interactions found always have some functional
meaning
- Each cis-regulatory system can also be considered
as a black box that can be connected to other
systems
19. Genomic and Nuclear Views
- A useful concept for DNA sequence-level networks
is the distinction between the View from the
Genome (VFG) and the View from the Nucleus (VFN)
- The VFG shows all the interactions that the
system is capable of, while the VFN focuses on
those sites that are occupied by the indicated
inputs in a given nucleus at a given time, i.e. a
snapshot
20. Genomic and Nuclear Views
- A simple illustration is shown in the figure.
Here there are two spatial domains of an embryo:
domain A, and the rest (Ā, i.e. not A)
- The VFG shows that a ubiquitous positive
activator is needed for all three genes. Gene 1
also requires a second input to be activated;
this input acts positively in domain A and
negatively elsewhere (Ā)
- This in turn affects the expression of gene 2
and gene 3
- Hence at any developmental stage a given nucleus
shows either VFN(A) or VFN(Ā)
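One way to picture the relationship between the two views is to store the VFG as a list of all possible links and derive a VFN by keeping only the links whose input is present in a given nucleus. The Python sketch below does this under loud assumptions: the link list is hypothetical, the links from gene 1 to genes 2 and 3 are left out, and the second input is treated as simply absent outside domain A, which simplifies the slide's description of it acting negatively there.

```python
# VFG: every regulatory link encoded in the genome, as (input, target, sign).
# This wiring is a hypothetical simplification of the three-gene figure.
VFG = [
    ("ubiquitous_activator", "gene1", +1),
    ("ubiquitous_activator", "gene2", +1),
    ("ubiquitous_activator", "gene3", +1),
    ("spatial_input", "gene1", +1),
]

def view_from_nucleus(vfg, factors_present):
    """Keep only the links whose input factor is present in this nucleus."""
    return [link for link in vfg if link[0] in factors_present]

# Nucleus in domain A: both inputs are present.
vfn_a = view_from_nucleus(VFG, {"ubiquitous_activator", "spatial_input"})

# Nucleus outside domain A: only the ubiquitous activator is present.
vfn_not_a = view_from_nucleus(VFG, {"ubiquitous_activator"})

print(len(vfn_a), len(vfn_not_a))
```

Because each VFN is a filtered subset of the same VFG, no separate model is needed per domain; only the set of factors present changes.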
21. Conclusion
- Cis-regulatory network models serve as an
essential organizer for developmental biologists
seeking causal relationships between genes
- They are essential because of the sheer volume of
information and the number of possible
interactions
- The models are not genetic models in the usual
sense, although their key elements are genomic
target site sequences
- The relationships between the elements can be
viewed from several angles, i.e. the VFG, the
VFN, and the black box (bird's-eye) view. No
transformations are needed to move from one view
to another
- The model also serves as a predictive tool,
enabling developmental biologists to see what
might happen to the regulatory system if a target
site is mutated or experimentally altered
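The predictive use in the last bullet can be illustrated with a toy element in Python. The rule, the site names and the inputs are hypothetical assumptions chosen for illustration, not the endo16 logic: the element is expressed when its activator site is intact and occupied and its repressor site is not both intact and occupied.

```python
def element_output(sites, inputs):
    """Hypothetical rule: expression requires an occupied activator site and
    no occupied repressor site. A deleted site can never be occupied."""
    activated = "act_site" in sites and inputs.get("activator", False)
    repressed = "rep_site" in sites and inputs.get("repressor", False)
    return activated and not repressed

def mutate(sites, site):
    """Return a copy of the element with one target site destroyed."""
    return sites - {site}

wild_type = frozenset({"act_site", "rep_site"})
inputs = {"activator": True, "repressor": True}   # a nucleus with both factors

wt_expr = element_output(wild_type, inputs)
no_rep_expr = element_output(mutate(wild_type, "rep_site"), inputs)
no_act_expr = element_output(mutate(wild_type, "act_site"), inputs)
print(wt_expr, no_rep_expr, no_act_expr)
```

In this toy case, destroying the repressor site predicts expression in a nucleus where the wild-type element is silent, which is the kind of prediction that can then be checked experimentally.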