Title: Human Genome: Mapping, Sequencing, Techniques, Diseases
Slide 1
Lecture 4 BINF 7580
Slide 2
After discussing the paper "Recent acceleration of human adaptive evolution" in the last lecture, today we will talk about the question: What is evolution in biology?
Feb. 12 is "Darwin Day"
[Portraits of Charles Darwin at ages 7, 30, 49, 55, 65, and 71]
CHARLES ROBERT DARWIN
February 12, 1809 to April 19, 1882
Darwin Day is an international celebration of science and humanity held on or around February 12, the day on which Charles Darwin was born in 1809. Specifically, it celebrates Darwin, the man who first described biological evolution via natural selection. For information about the Darwin Day Celebration, visit www.DarwinDay.org
Slide 3
Evolution: Different Definitions
- Evolution is a change in the gene pool of a population over time.
- Evolution is a change in the inherited traits of a population from one generation to the next. This process causes populations of organisms to change over time.
- The genetic change in populations of organisms over successive generations.
- A gradual process in which something changes into a different and usually more complex or better form.
- Evolution is a process that results in heritable changes in a population spread over many generations. This is a good working scientific definition of evolution, one that can be used to distinguish between evolution and similar changes that are not evolution.
- "Evolution can be precisely defined as any change in the frequency of alleles within a gene pool from one generation to the next."
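The allele-frequency definition can be made concrete with a small calculation. The Python sketch below is only an illustration: the allele labels "G" and "B" and the two toy gene pools are assumptions made up for the example, not data from the lecture.

```python
from collections import Counter

def allele_frequencies(gene_pool):
    """Return the frequency of each allele in a list of allele labels."""
    counts = Counter(gene_pool)
    total = len(gene_pool)
    return {allele: n / total for allele, n in counts.items()}

def frequency_change(parents, offspring):
    """Per-allele change in frequency from one generation to the next."""
    before = allele_frequencies(parents)
    after = allele_frequencies(offspring)
    alleles = set(before) | set(after)
    return {a: after.get(a, 0.0) - before.get(a, 0.0) for a in alleles}

# Hypothetical gene pools with 100 allele copies in each generation.
parents = ["G"] * 80 + ["B"] * 20
offspring = ["G"] * 70 + ["B"] * 30

change = frequency_change(parents, offspring)
# Under this definition, any nonzero change counts as evolution.
evolved = any(delta != 0 for delta in change.values())
```

In this toy case the frequency of "B" rises by about 0.1 and that of "G" falls by the same amount, so the population has evolved in the sense of the last definition.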
Slide 4
In fact, it was difficult for me to give a precise definition. Maybe you will select the one you like best.
The process of evolution can be summarized in three sentences:
1. Genes mutate.
2. Individuals are selected.
3. Populations evolve.
Evolution can be divided into MIcroevolution and MAcroevolution. Microevolution is a change in gene frequency within a single population, that is, a group of organisms that interbreed with each other and so all share a gene pool.
The big question is, how did it happen?
Slide 5
Mechanisms of Microevolution
Mutation: Some green genes randomly mutated to brown genes (this process alone cannot account for a big change in allele frequency over one generation).
[Figure caption: Mutated gene results in brown]
Migration (or gene flow): Some beetles with brown genes immigrated from another population, or some beetles carrying green genes emigrated.
Slide 6
Mechanisms of Microevolution
Genetic drift: When the beetles reproduced, just by random luck more brown genes than green genes ended up in the offspring. Brown genes occur slightly more frequently in the offspring (29%) than in the parent generation (25%).
[Figure: green/brown gene frequencies of 75%/25% in the parents and 71%/29% in the offspring]
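Genetic drift of this kind can be sketched as random sampling of gene copies. The Python code below is a minimal illustration under stated assumptions: each offspring gene copy is drawn independently from the parental gene pool, the pool has 100 copies, and the random seed is arbitrary. None of these choices come from the lecture.

```python
import random

def next_generation(p_brown, n_copies, rng):
    """Draw n_copies gene copies at random; return the new brown frequency."""
    brown = sum(1 for _ in range(n_copies) if rng.random() < p_brown)
    return brown / n_copies

rng = random.Random(1)   # fixed seed so the run is repeatable
p = 0.25                 # brown frequency in the parent generation
trajectory = [p]
for _ in range(10):
    # No mutation, migration, or selection: only sampling luck changes p.
    p = next_generation(p, 100, rng)
    trajectory.append(p)
```

The frequency wanders from one generation to the next even though no gene is favoured, which is the sense in which a shift such as 25% to 29% can arise "just by random luck".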
Natural selection: Beetles with brown genes escaped predation and survived to reproduce more frequently than beetles with green genes, so that more brown genes got into the next generation.
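The effect of selection on gene frequency can be written as a one-line update. The sketch below assumes a simple haploid model with two types, and the survival values 1.0 (brown) and 0.8 (green) are invented for illustration; they are not measurements from the beetle example.

```python
def select(p_brown, w_brown, w_green):
    """One generation of selection in a haploid two-type model.

    p_brown: brown frequency before selection
    w_brown, w_green: relative survival (fitness) of each type
    """
    mean_w = p_brown * w_brown + (1 - p_brown) * w_green
    return p_brown * w_brown / mean_w

p = 0.25
history = [p]
for _ in range(5):
    # Hypothetical fitness values: green beetles survive 80% as often.
    p = select(p, w_brown=1.0, w_green=0.8)
    history.append(p)
```

Unlike drift, the change here is directional: as long as brown beetles survive better, the brown frequency rises every generation.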
Slide 7
Macroevolution: change that occurs at or above the level of species (a species being a group of organisms capable of interbreeding and producing fertile offspring). Compare with microevolution, which refers to smaller evolutionary changes below the level of species, within a species or population.
Just as in microevolution, basic evolutionary
mechanisms like mutation, migration, genetic
drift, and natural selection can produce major
evolutionary change if given enough time.
To the question "Do you believe in macroevolution?" came the reply: "Well, it depends on how you define it."
Slide 8
There is no difference between micro- and
macroevolution except that genes between species
usually diverge, while genes within species
usually combine. The same processes that cause
within-species evolution are responsible for
above-species evolution.
Species A changes over time to become species B, while species B changes over time by splitting into species C and D, neither of which is very different from B or from each other. The changing axis represents change of form, either genetic or phenotypic.
Larger changes, such as when a new species is
formed, are called macroevolution. Some
biologists feel the mechanisms of macroevolution
are different from those of microevolutionary
change. Others think the distinction between the
two is arbitrary -- macroevolution is cumulative
microevolution.
Slide 9
Genome Evolution
A major mechanism that has shaped the genomes of many organisms is gene duplication. As the number of completely sequenced genomes increases, the impact of gene duplications on their structure becomes obvious.
Gene duplication (or chromosomal duplication) is any duplication of a region of DNA that contains a gene.
Some General Points about Duplicated Genes
- Most genes occur in multiple copies (gene families).
- Most genes within a gene family differ with regard to expression in time and space (i.e., exact functional duplication is infrequent).
Slide 10
Multigene families
In the human genome, about 15% of protein genes are duplicated (Li, 2001); 16% in yeast, 25% in Arabidopsis (Wolfe, 2001).
Multigene families are due to:
- Single gene duplication
- Segment duplication (tandem duplication or duplication transposition):
  a b c d e f g
  ↓
  a b c d e f b c d g
- Horizontal gene transfer
- Genome-wide doubling event
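The two kinds of segment duplication can be sketched with list slicing. In the Python example below, the function name `duplicate_segment` and its index arguments are hypothetical helpers introduced for illustration; the gene letters follow the a b c d e f g example on the slide.

```python
def duplicate_segment(genes, start, end, insert_at):
    """Copy genes[start:end] and insert the copy at index insert_at."""
    copy = genes[start:end]
    return genes[:insert_at] + copy + genes[insert_at:]

chromosome = list("abcdefg")

# Duplication transposition: the copy of b c d lands between f and g.
transposed = duplicate_segment(chromosome, 1, 4, 6)

# Tandem duplication: the copy sits directly next to the original segment.
tandem = duplicate_segment(chromosome, 1, 4, 4)
```

Here `transposed` spells a b c d e f b c d g, the arrangement shown on the slide, while `tandem` spells a b c d b c d e f g.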
Slide 11
Susumu Ohno, in his classic book Evolution by Gene Duplication (1970), argued that gene duplication is the most important evolutionary force.
Had evolution been entirely dependent upon
natural selection, from a bacterium only numerous
forms of bacteria would have emerged. The
creation of metazoans, vertebrates, and finally
mammals from unicellular organisms would have
been quite impossible, for such big leaps in
evolution required the creation of new gene loci
with previously nonexistent function. Only the
cistron that became redundant was able to escape
from the relentless pressure of natural
selection. By escaping, it accumulated formerly
forbidden mutations to emerge as a new gene
locus.
H.A. How do you understand this point?
Slide 12
So, Ohno's main idea is that duplicated genes can supply new, free genetic raw material. Gene duplications are the principal source of new genes.
This new genetic raw material is available for the emergence of new functions through the forces of mutation and natural selection. The working genome cannot afford to experiment with mutations, because that may lead to a dead end for the organism. Ohno's very smart idea provides the material for such experiments with mutations. He hypothesized that after duplication, one copy would preserve the original function and the other copy would be free to diverge.
Slide 13
As described by him, duplication creates a
redundant gene copy that is free from the
pressure of natural selection and can
accumulate previously forbidden mutations,
eventually leading to a new function. The main
idea is that the newly duplicated gene is
supposed to be neutral and therefore subject to
loss by drift and by common inactivating
mutations (deletions, frameshifts, nonsense
mutations). Thus, the extra copy must drift to
high frequency in the population and remain
functionally intact long enough to acquire a new
selectable function by rare beneficial mutations.
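How easily a neutral duplicate is lost by drift can be illustrated with a small simulation. The sketch below assumes a population of 20 genomes, random sampling of genomes each generation, no mutation, and a duplicate that starts in a single genome; all of these are assumptions chosen for the example. For a neutral variant, the expected fraction that reaches fixation equals its starting frequency, here 1/20.

```python
import random

def duplicate_fixes(n_genomes, rng):
    """Follow one new neutral duplicate until it is lost or fixed."""
    count = 1   # the duplicate starts in a single genome
    while 0 < count < n_genomes:
        p = count / n_genomes
        # Each genome of the next generation inherits the duplicate
        # with probability equal to its current frequency.
        count = sum(1 for _ in range(n_genomes) if rng.random() < p)
    return count == n_genomes

rng = random.Random(7)   # arbitrary fixed seed
n_genomes, trials = 20, 4000
fixed = sum(duplicate_fixes(n_genomes, rng) for _ in range(trials))
fixation_fraction = fixed / trials
```

Most runs end with the duplicate lost within a few generations, which is why the extra copy has to be lucky to drift to high frequency before any beneficial mutation can occur.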
H.A. How do you understand this idea?
Slide 14
H.A. Why and how would a gene be deleted?
Is it clear that if this extra gene does not work, it will soon be deleted from the genome?
To acquire rare mutations the gene must be in the
population for a sufficient time and at a
sufficient allele frequency. The standard
solution would be to maintain the extra copy by
selection. However, such selection would restrict
the ability of the copy to lose its old activity
and gain a new function.
H.A. Why?
Despite the general assumption that duplications are neutral, it seems likely that they are often counterselected (a condition that prevents growth). Several ways of resolving the dilemma have been suggested.
H.A. Read and explain the results of the paper "Ohno's dilemma: Evolution of new genes under continuous selection", PNAS, 2007, vol. 104, no. 43, 17004-17009. http://www.pnas.org.libproxy2.umdnj.edu/cgi/reprint/104/43/17004.pdf
Slide 15
Duplication can involve individual genes, genomic segments or whole genomes. Whole-genome duplication (WGD) is a particularly intriguing but poorly understood situation. It has been postulated as a powerful mechanism of evolutionary innovation. Recently, it has become possible to test this notion by searching complete genome sequences for signs of ancient duplication.
Slide 16
How do genomes evolve? In a typical evolutionary
tree, the yeast species stand on a branch apart
from those of plants and animals. But the cells
that make up these three types of organism have
some crucial similarities, such as the presence
of a nucleus and other subcellular structures.
They are termed eukaryotic cells, in contrast to
the simpler bacteria and archaea (prokaryotes).
Because of these similarities, researchers have
been able to use the rapidly reproducing and
experimentally manageable yeast Saccharomyces
cerevisiae to decipher some of the basic
properties of all eukaryotic cells. Now the use
of other yeast species is opening up a new
research front that seeks to identify the
mechanisms behind genome evolution.
Slide 17
The clearest way to prove the existence of an ancient WGD would be to find a yeast species, denoted as Y, that descends directly from a common ancestor along a lineage that diverged before the duplication, as shown in the figure (the next slide) from the paper "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae" (Nature, 2004). By sequencing and analysing Kluyveromyces waltii, a related yeast species that diverged before the duplication, this paper showed that the yeast Saccharomyces cerevisiae arose from an ancient whole-genome duplication.
Slide 18
See explanation on the next slide.
Slide 19
a, After divergence from K. waltii, the
Saccharomyces lineage underwent a genome
duplication event, creating two copies of every
gene and chromosome.
b, The vast majority of duplicated genes
underwent mutation and gene loss.
c, Sister segments retained different subsets
of the original gene set, keeping two copies for
only a small minority of duplicated genes, which
were retained for functional purposes.
d, Within S. cerevisiae, the only evidence comes from the conserved order of duplicated genes (numbered 3 and 13) across different chromosomal segments; the intervening genes are unrelated.
e, Comparison with K. waltii reveals the duplicated nature of the S. cerevisiae genome, interleaving genes from sister segments on the basis of the ancestral gene order.
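The interleaving in panel e can be sketched as a lookup against the ancestral gene order. The Python code below is an illustration under assumptions: the 16 gene numbers and the contents of the two sister segments are invented, except that genes 3 and 13 are kept in both segments as in the figure caption. The helper names `interleave` and `keeps_order` are hypothetical.

```python
def interleave(ancestral_order, segment_1, segment_2):
    """List each ancestral gene with the sister segments that kept it."""
    kept = (("seg1", set(segment_1)), ("seg2", set(segment_2)))
    rows = []
    for gene in ancestral_order:
        copies = [name for name, genes in kept if gene in genes]
        rows.append((gene, copies))
    return rows

def keeps_order(ancestral_order, segment):
    """True if the segment's genes appear in the ancestral order."""
    rank = {gene: i for i, gene in enumerate(ancestral_order)}
    ranks = [rank[g] for g in segment]
    return ranks == sorted(ranks)

# Hypothetical data: K. waltii order stands in for the ancestral order.
k_waltii = list(range(1, 17))
seg1 = [1, 3, 4, 6, 9, 10, 13, 16]
seg2 = [2, 3, 5, 7, 8, 11, 12, 13, 14, 15]

table = interleave(k_waltii, seg1, seg2)
two_copy_genes = [gene for gene, copies in table if len(copies) == 2]
```

Looking at either segment alone, only the shared genes hint at a duplication. Lined up against the ancestral order, the two segments together account for every gene, which is the 1:2 pattern expected after whole-genome duplication followed by gene loss.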
Slide 20
This research showed that S. cerevisiae arose from complete duplication of eight ancestral chromosomes. The two genomes are related by a 1:2 mapping, with each region of K.
waltii corresponding to two regions of S.
cerevisiae, as expected for whole-genome
duplication.
H.A. Briefly describe what conclusions were made in this paper about a) the dynamics of genome evolution and b) the emergence of new functions.
Slide 21
One of the main questions is how these duplicated genes acquire new functions and how these changes occur. One answer comes from a study of leaf-eating monkeys, the colobines (Nature Genetics, 2002). They differ from other monkeys in that they primarily eat leaves rather than fruit or insects, and leaves are very difficult to digest. The analysis used a pancreatic enzyme, RNASE1. Typical primates have one gene encoding this enzyme, but a colobine monkey has two: one encodes RNASE1, and its duplicate encodes a new enzyme, RNASE1B. The duplication occurred about 4 million years ago, after colobines split off from the other monkeys. The researchers found that RNASE1 works best at pH 7.4, but the new enzyme works six times better under the more acidic conditions typical for colobine monkeys.
But if the new enzyme is so much more efficient, why has natural selection not done away with the old one?
In humans, RNASE1 has two functions: to digest dietary RNA and to degrade double-stranded RNA, perhaps as a defense against double-stranded RNA viruses. So one enzyme is doing two jobs. After duplication, colobines have two enzymes, each doing just one job, but doing it better than the other.
Slide 22
In most models of the development of evolutionary novelty by gene duplication, it is implicitly assumed that a single mutation to the duplicated gene can confer a new selectable property, usually a biological functional property. Another aspect of this problem is considered in the paper "Simulating evolution by gene duplication of protein features that require multiple amino acid residues", Protein Science (2004).
(In the previous examples of research I showed so-called experimental investigations: genome sequencing and analysis of the genome, i.e. in vitro analysis. Here I show a computer-model analysis, i.e. in silico analysis.)
The researchers are particularly interested in the question of how novel protein structural features may develop throughout evolution. Perhaps the simplest example of this is the disulfide bond. In order to produce a novel disulfide bond, a duplicated gene coding for a protein lacking unmatched cysteines would require at least two mutations in separate codons.
Slide 23
Another example is a protein binding site. It is
known that a ligand bound to a protein interacts
with multiple amino acid residues. In
general, therefore, in order to produce a binding
site for a new ligand in a protein originally
lacking the ability to bind it, multiple
mutational events would be necessary. Li drew attention to this fact in his textbook Molecular Evolution (1997): "acquiring a new function may require many mutational steps, and a point that needs emphasis is that the early steps might have been selectively neutral because the new function might not be manifested until a certain number of steps had already occurred." This paper considers models of such a process, in contrast to the usual assumption that a single mutation to the duplicated gene can confer a new selectable property.
Slide 24
Here I give a very short description of the model developed in this work, and a short conclusion, to give an idea of what kind of research is possible in this field.
The model
The model assumes that any given organism in the population may be thought to have anywhere from zero to multiple extra copies of the gene and that all duplicate genes are selectively neutral. The basic task that the model asks a duplicate gene to perform is to accumulate mutations at the correct nucleotide positions to code for a new selectable feature before suffering a null mutation.
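The task given to a single duplicate gene can be pictured as a race between useful mutations and a null mutation. The Python sketch below is not the model of the paper. It is a simplified illustration that assumes one gene copy, independent mutations at constant rates, no back mutation, and a made-up parameter `null_ratio` (the rate of null mutations relative to the rate of a point mutation at one specific site).

```python
import random

def success_probability(needed_sites, null_ratio):
    """Chance that all needed sites mutate before any null mutation."""
    prob = 1.0
    for remaining in range(needed_sites, 0, -1):
        # With `remaining` sites still unmutated, the next event is a
        # useful hit with probability remaining / (remaining + null_ratio).
        prob *= remaining / (remaining + null_ratio)
    return prob

def simulate(needed_sites, null_ratio, trials, rng):
    """Monte Carlo version of the same race, one duplicate gene per trial."""
    successes = 0
    for _ in range(trials):
        remaining = needed_sites
        while remaining > 0:
            if rng.random() < remaining / (remaining + null_ratio):
                remaining -= 1   # a needed position was mutated
            else:
                break            # null mutation: the copy is inactivated
        else:
            successes += 1       # every needed position was mutated first
    return successes / trials

# Hypothetical values: 2 needed positions, nulls twice as likely per site.
exact = success_probability(needed_sites=2, null_ratio=2.0)
estimate = simulate(2, 2.0, 20000, random.Random(3))
```

With these toy values the exact chance of success is 1/6, and it falls quickly as more positions are required or as null mutations become relatively more common, which is the difficulty the model is designed to quantify.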
Slide 25
A duplicate gene coding for a protein is represented as an array of squares that stand for nucleotide positions.
If several point mutations (each indicated by a symbol in the figure)
accumulate at specific nucleotide positions
(indicated by the three squares outlined in blue)
in the gene then several amino acid residues will
have been altered and the new feature will have
been successfully built in the protein (indicated
by the green-shaded area).
Slide 26
Conclusion. The model considers point mutations in duplicated genes. It was shown that for very large population sizes N, the time to fixation in the population hovers near the inverse of the point mutation rate, and varies sluggishly with the λth root of 1/N, where λ is the number of nucleotide positions that must be mutated to produce the feature.
At smaller population sizes, the time to
fixation varies linearly with 1/N and exceeds the
inverse of the point mutation rate.
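The difference between the two regimes is easier to see with numbers. The short Python sketch below only evaluates the stated scaling laws; it writes `lam` for the number of nucleotide positions that must be mutated, and the 1000-fold population increase and `lam = 3` are arbitrary illustrative values, not results from the paper.

```python
def relative_time(population_factor, lam):
    """Factor by which the time to fixation changes when N is multiplied
    by population_factor, if time scales as the lam-th root of 1/N."""
    return (1.0 / population_factor) ** (1.0 / lam)

# A 1000-fold larger population with lam = 3 needed positions:
large_n = relative_time(1000, 3)   # about 0.1, so only about 10 times faster

# The same increase under linear 1/N scaling (equivalent to lam = 1):
linear = relative_time(1000, 1)    # 0.001, so 1000 times faster
```

This is what "varies sluggishly" means here: in the large-population regime, even a very large increase in N shortens the waiting time only modestly, and less so the more positions are required.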