Title: Genomes
Chapter 12 Genomes
- Key Concepts
  - 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
  - 12.2 Prokaryotic Genomes Are Relatively Small and Compact
  - 12.3 Eukaryotic Genomes Are Large and Complex
  - 12.4 The Human Genome Sequence Has Many Applications
Chapter 12 Opening Question
- What does genome sequencing reveal about dogs and
other animals?
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- The Human Genome Project was proposed in 1986 to determine the normal sequence of all human DNA.
- The publicly funded effort was aided and complemented by privately funded groups.
- The methods used were first developed to sequence prokaryotes and simple eukaryotes.
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- A key to determining long DNA sequences is to break the DNA of a given chromosome into fragments and sequence many of them simultaneously.
- The fragment sequences are put together using larger, overlapping fragments.
- Next-generation DNA sequencing uses DNA replication and the polymerase chain reaction (PCR).
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- One approach to next-generation DNA sequencing:
  - DNA is cut into 100 bp fragments.
  - DNA is denatured by heat, and each single strand then acts as a template for synthesis.
  - Each fragment is attached to adapter sequences and then to supports.
  - Fragments are then amplified by PCR.
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- Amplified DNA attached to a solid substrate is ready for sequencing:
  - Fragments are denatured, and primers, DNA polymerase, and fluorescently labeled nucleotides are added.
  - DNA is replicated by adding one nucleotide at a time.
  - The fluorescent color of each nucleotide is detected as it is added, indicating the sequence of the DNA.
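The color-by-color readout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dye colors in `DYE_FOR_BASE` are hypothetical (real instruments use their own chemistry), strand direction is ignored, and every cycle is assumed to add exactly one nucleotide without error.

```python
# Hypothetical dye assignment, chosen only for illustration.
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize(template):
    """Return the dye colors seen as the complementary strand is built,
    one labeled nucleotide per cycle (position by position)."""
    return [DYE_FOR_BASE[COMPLEMENT[base]] for base in template]


def call_bases(colors):
    """Translate the observed colors into the sequence of the new strand."""
    return "".join(BASE_FOR_DYE[color] for color in colors)


def template_from_colors(colors):
    """Infer the template sequence as the complement of the new strand."""
    return "".join(COMPLEMENT[base] for base in call_bases(colors))


colors = synthesize("ATGC")
print(colors)                        # one color per added nucleotide
print(call_bases(colors))            # sequence of the newly made strand
print(template_from_colors(colors))  # recovers the template
```

In massively parallel sequencing this readout happens for millions of fragments at once; the sketch follows a single fragment.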
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- The power of this method derives from the fact that:
  - It is fully automated and miniaturized.
  - Millions of different fragments are sequenced at the same time. This is called massively parallel sequencing.
  - It is an inexpensive way to sequence large genomes.
Figure 12.1 DNA Sequencing (Part 1)
Figure 12.1 DNA Sequencing (Part 2)
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- Determining sequences is possible because the original DNA fragments overlap.
- Example: A 10 bp fragment cut three different ways yields:
  - TG, ATG, and CCTAC
  - AT, GCC, and TACTG
  - CTG, CTA, and ATGC
- The correct sequence is ATGCCTACTG.
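The 10 bp example can be checked by brute force. The sketch below tries every ordering of the fragments from each cut and keeps only the sequences that all three cuts can produce. The function names are illustrative, and this exhaustive approach is an assumption that works only for a toy case; it is not how real assembly software handles millions of reads.

```python
from itertools import permutations


def candidate_sequences(fragments):
    """All sequences obtainable by joining the fragments in some order."""
    return {"".join(order) for order in permutations(fragments)}


def reconstruct(digests):
    """Sequences consistent with every set of fragments."""
    candidates = None
    for fragments in digests:
        sequences = candidate_sequences(fragments)
        candidates = sequences if candidates is None else candidates & sequences
    return candidates


digests = [
    ["TG", "ATG", "CCTAC"],
    ["AT", "GCC", "TACTG"],
    ["CTG", "CTA", "ATGC"],
]
print(reconstruct(digests))  # {'ATGCCTACTG'}
```

Each cut alone allows six possible orderings; only one ordering is shared by all three, which is why overlapping fragments pin down the sequence.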
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- For genome sequencing, the fragments are called reads.
- The field of bioinformatics was developed to analyze DNA sequences using complex mathematics and computer programs.
Figure 12.2 Arranging DNA Sequences
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- In functional genomics, sequences are used to identify the functions of various parts of the genome:
  - Open reading frames: the coding regions of the genes, recognized by start and stop codons for translation and by sequences indicating the location of introns
  - Amino acid sequences of proteins
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
  - Regulatory sequences: promoters and terminators for transcription
  - RNA genes, including rRNA, tRNA, small nuclear RNA, and microRNA genes
  - Other noncoding sequences in various categories
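Recognizing open reading frames by their start and stop codons lends itself to a short sketch. The version below is a simplification under stated assumptions: it scans one strand only, treats ATG as the only start codon, ignores introns (so it fits compact prokaryotic genes better than eukaryotic ones), and the `min_codons` threshold is an arbitrary illustrative parameter.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each open reading frame found
    in the three reading frames of the given strand."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop codon.
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, dna[i:j + 3]))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs


print(find_orfs("CCATGAAATGGTAACC"))
```

Here the only complete reading frame runs from the ATG at position 2 to the TAA stop codon; the second ATG has no in-frame stop codon and is not reported.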
Figure 12.3 The Genomic Book of Life
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- Comparative genomics compares a newly sequenced genome with sequences from other organisms.
- It provides information about the function of sequences and can trace evolutionary relationships.
- Genetic determinism: the concept that a person's phenotype is determined solely by his or her genotype
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- Many genes encode more than one protein, through alternative splicing and posttranslational modifications.
- The proteome is the total of the proteins produced by an organism; it is more complex than the genome.
Figure 12.4 Proteomics (Part 1)
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- Two techniques are used to analyze proteins and the proteome:
  - Two-dimensional gel electrophoresis separates proteins based on size and electric charge.
  - Mass spectrometry identifies proteins by their atomic masses.
- Proteomics seeks to identify and characterize all of the expressed proteins.
Figure 12.4 Proteomics (Part 2)
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- The metabolome is the description of all of the metabolites of a cell or organism:
  - Primary metabolites are involved in normal processes, such as pathways like glycolysis. They also include hormones and other signaling molecules.
  - Secondary metabolites are often unique to particular organisms or groups.
  - Examples: antibiotics made by microbes, and chemicals made by plants for defense.
Concept 12.1 There Are Powerful Methods for Sequencing Genomes and Analyzing Gene Products
- Metabolomics aims to describe the metabolome of a tissue or organism under particular environmental conditions.
- Analytical instruments can separate molecules with different chemical properties, and other techniques can identify them.
- Measurements can be related to physiological states.
Figure 12.5 Genomics, Proteomics, and Metabolomics
Concept 12.2 Prokaryotic Genomes Are Relatively Small and Compact
- Features of bacterial and archaeal genomes:
  - Relatively small, with a single, circular chromosome
  - Compact: mostly protein-coding regions
  - Most do not contain introns
  - Often carry plasmids, smaller circular DNA molecules
Concept 12.2 Prokaryotic Genomes Are Relatively Small and Compact
- Functional genomics assigns functions to the products of genes.
- The H. influenzae chromosome has 1,727 open reading frames. When it was first sequenced, only 58 percent coded for proteins with known functions.
- Since then, the roles of almost all other proteins have been identified.
- In E. coli, which has a larger genome, more genes are involved in each function.
Table 12.1 Gene Functions in Three Bacteria
Concept 12.2 Prokaryotic Genomes Are Relatively Small and Compact
- Next, the study of the smallest known genome (M. genitalium) was completed.
- Comparative genomics showed that M. genitalium lacks many enzymes and must obtain them from its environment.
- It also has very few genes for regulatory proteins; its flexibility is limited by its lack of control over gene expression.
Concept 12.2 Prokaryotic Genomes Are Relatively Small and Compact
- Transposons (or transposable elements) are DNA segments that can move from place to place in the genome.
- They can move from one piece of DNA (such as a chromosome) to another (such as a plasmid).
- If a transposon is inserted into the middle of a gene, it will be transcribed along with the gene and result in an abnormal protein.
Figure 12.6 DNA Sequences That Move (Part 1)
Figure 12.6 DNA Sequences That Move (Part 2)
Concept 12.2 Prokaryotic Genomes Are Relatively Small and Compact
- Prokaryotes can be identified by their growth in culture, but DNA can also be isolated directly from environmental samples.
- Metagenomics: genetic diversity is explored without isolating intact organisms.
- DNA can be cloned for libraries or amplified and sequenced to detect known and unknown organisms.
Figure 12.7 Metagenomics
Concept 12.2 Prokaryotic Genomes Are Relatively Small and Compact
- Comparing genomes of prokaryotes and eukaryotes:
  - Certain genes are present in all organisms (universal genes), and some universal gene segments are present in many organisms.
  - This suggests that a minimal set of DNA sequences is common to all cells.
Concept 12.2 Prokaryotic Genomes Are Relatively Small and Compact
- Efforts to define a minimal genome involve computer analysis of genomes, the study of the smallest known genome (M. genitalium), and using transposons as mutagens.
- Transposons can insert into genes at random; the mutated bacteria are tested for growth and survival, and their DNA is sequenced.
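The logic of the transposon experiment can be sketched as a small classification step. This is a hedged illustration, not the published protocol: the gene names and results are made up, and the rule used here (a gene is a candidate for the minimal genome only if no insertion in it ever allowed growth) is an assumed simplification.

```python
def classify_genes(all_genes, insertion_results):
    """Sort genes by the outcome of transposon insertions.

    insertion_results holds (gene, grew) pairs, one per mutant strain:
    the gene interrupted by the transposon and whether the strain grew.
    """
    results = list(insertion_results)
    tested = {gene for gene, _ in results}
    # A gene that can be interrupted without stopping growth is dispensable.
    dispensable = {gene for gene, grew in results if grew}
    candidate_essential = tested - dispensable
    untested = set(all_genes) - tested
    return candidate_essential, dispensable, untested


# Hypothetical gene names and outcomes, for illustration only.
genes = ["geneA", "geneB", "geneC", "geneD"]
results = [
    ("geneA", True),
    ("geneB", False),
    ("geneB", False),
    ("geneC", True),
    ("geneC", False),
]
essential, dispensable, untested = classify_genes(genes, results)
print(sorted(essential), sorted(dispensable), sorted(untested))
```

Genes with no insertion at all stay unclassified, which is one reason many random mutants have to be screened.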
Figure 12.8 Using Transposon Mutagenesis to Determine the Minimal Genome (Part 1)
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- There are major differences between eukaryotic and prokaryotic genomes:
  - Eukaryotic genomes are larger and have more protein-coding genes.
  - Eukaryotic genomes have more regulatory sequences. Greater complexity requires more regulation.
  - Much of eukaryotic DNA is noncoding, including introns, gene control sequences, and repeated sequences.
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- Several model organisms have been studied extensively.
- Model organisms are easy to grow and study in a laboratory, their genetics are well studied, and their characteristics represent a larger group of organisms.
Table 12.2 Representative Sequenced Genomes
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- The yeast, Saccharomyces cerevisiae:
  - Yeasts are single-celled eukaryotes.
  - Yeasts and E. coli appear to use about the same number of genes to perform basic functions.
  - However, the compartmentalization of the eukaryotic yeast cell requires it to have many more genes to target proteins to organelles.
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- The nematode, Caenorhabditis elegans:
  - It is a millimeter-long soil roundworm made up of about 1,000 cells, yet it has complex organ systems.
  - Its genome is 8 times larger than that of yeast, and it has about 3.5 times as many protein-coding genes as yeast.
  - Other genes are for cell differentiation, intercellular communication, and forming tissues from cells.
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- The fruit fly, Drosophila melanogaster:
  - The fruit fly has ten times more cells and is more complex than C. elegans, undergoing more developmental stages.
  - It has a larger genome with many genes encoding transcription factors needed for development.
Figure 12.9 Functions of the Eukaryotic Genome
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- The thale cress, Arabidopsis thaliana:
  - The genomes of some plants are huge, but A. thaliana has a much smaller genome.
  - Many of the genes found in fruit flies and nematodes have orthologs (genes with very similar sequences) in plants, suggesting a common ancestor.
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- Arabidopsis has some genes related to functions unique to plants:
  - Photosynthesis, water transport, assembly of the cell wall, and making molecules for defense against microbes and herbivores
- The basic plant genome may be determined by comparing different plant genomes for common sequences.
Figure 12.10 Plant Genomes
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- Eukaryotes have closely related genes called gene families.
- These arose over evolutionary time when different copies of genes underwent separate mutations.
- For example, the genes encoding the globin proteins in hemoglobin and myoglobin all arose from a single common ancestral gene.
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- During development, different members of the globin gene family are expressed at different times and in different tissues.
- Hemoglobin of the human fetus contains γ-globin, which binds O₂ more tightly than adult hemoglobin.
- Hemoglobins with different affinities are provided at different stages of development.
Figure 12.11 The Globin Gene Family
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- Many gene families include nonfunctional pseudogenes (ψ), resulting from mutations that cause a loss of function rather than a new one.
- A pseudogene may simply lack a promoter, and thus fail to be transcribed, or lack a recognition site needed for the removal of an intron.
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- Eukaryotic genomes have repetitive DNA sequences:
  - Highly repetitive sequences: short sequences (< 100 bp) repeated thousands of times in tandem; not transcribed
  - Short tandem repeats (STRs) of 1–5 bp are scattered around the genome and can be used in DNA fingerprinting.
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- Moderately repetitive sequences are repeated 10–1,000 times.
  - They include the genes for tRNAs and rRNAs.
  - Single copies of the tRNA and rRNA genes are inadequate to supply the large amounts of these molecules needed by cells, so the genome has multiple copies in clusters.
  - Most moderately repeated sequences are transposons.
Concept 12.3 Eukaryotic Genomes Are Large and Complex
- Transposons are of two main types in eukaryotes:
  - Retrotransposons (Class I) make RNA copies of themselves, which are copied into DNA and inserted in the genome.
    - LTR retrotransposons have long terminal repeats of DNA sequences.
    - Non-LTR retrotransposons do not have LTR sequences; SINEs and LINEs are types of non-LTR retrotransposons.
Concept 12.3 Eukaryotic Genomes Are Large and Complex
  - DNA transposons (Class II) do not use RNA intermediates.
    - They are excised from the original location and inserted at a new location without being replicated.
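The difference between the two classes comes down to whether the element is copied or moved. A minimal string-based sketch, assuming a toy genome in which lowercase letters are flanking DNA and the uppercase element sequence is made up:

```python
def copy_and_paste(genome, element, new_pos):
    """Retrotransposon (Class I): the original copy stays in place and a
    DNA copy made from an RNA intermediate is inserted at new_pos."""
    rna = element.replace("T", "U")   # RNA copy of the element
    dna_copy = rna.replace("U", "T")  # copied back into DNA
    return genome[:new_pos] + dna_copy + genome[new_pos:]


def cut_and_paste(genome, element, new_pos):
    """DNA transposon (Class II): the element is excised and reinserted,
    so its copy number does not change. new_pos is counted in the genome
    after excision."""
    start = genome.index(element)
    excised = genome[:start] + genome[start + len(element):]
    return excised[:new_pos] + element + excised[new_pos:]


genome = "aaaaTTGACAcccc"  # hypothetical element TTGACA in flanking DNA
print(copy_and_paste(genome, "TTGACA", len(genome)))  # two copies
print(cut_and_paste(genome, "TTGACA", 8))             # one copy, new place
```

Repeated rounds of copy-and-paste increase the number of copies, which fits the observation that most moderately repeated sequences are transposons.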
Table 12.3 Types of Sequences in Eukaryotic Genomes
Concept 12.4 The Human Genome Sequence Has Many Applications
- By 2010 the complete haploid genome sequence was completed for more than ten individuals.
- Soon, a human genome will be sequenced for less than $1,000.
Concept 12.4 The Human Genome Sequence Has Many Applications
- Some interesting facts about the human genome:
  - There are about 24,000 protein-coding genes, making up less than 2 percent of the 3.2 billion base pair human genome.
  - Each gene must code for several proteins, and posttranscriptional mechanisms (e.g., alternative splicing) must account for the observed number of proteins in humans.
Concept 12.4 The Human Genome Sequence Has Many Applications
  - An average gene has 27,000 base pairs, but gene size varies greatly, as does the size of the proteins.
  - Virtually all human genes have many introns.
  - 3.5 percent of the genome is functional but noncoding; these sequences have roles in gene regulation (e.g., microRNAs) or chromosome structure.
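The figures quoted above can be put side by side with some quick arithmetic. The numbers are the ones given in the slides; reading the "less than 2 percent" as coding sequence only, with the rest of each gene's length taken up largely by introns, is an interpretation and not something the slides state outright.

```python
GENOME_BP = 3_200_000_000  # 3.2 billion base pairs
GENE_COUNT = 24_000        # protein-coding genes
AVG_GENE_BP = 27_000       # average gene length
CODING_PERCENT_MAX = 2     # "less than 2 percent" of the genome

# Upper bound on protein-coding DNA (integer arithmetic).
coding_bp_max = GENOME_BP * CODING_PERCENT_MAX // 100

# Total length spanned by genes if each had the average length.
gene_span_bp = GENE_COUNT * AVG_GENE_BP
gene_span_fraction = gene_span_bp / GENOME_BP

# Upper bound on coding DNA per gene, on average.
coding_per_gene_max = coding_bp_max / GENE_COUNT

print(f"coding DNA: under {coding_bp_max:,} bp")
print(f"genes span about {gene_span_bp:,} bp ({gene_span_fraction:.1%})")
print(f"coding DNA per gene: under {coding_per_gene_max:,.0f} bp")
```

On these numbers, genes span roughly ten times more DNA than their coding sequence does, which is consistent with genes containing many introns.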
Concept 12.4 The Human Genome Sequence Has Many Applications
  - Over 50 percent of the genome is transposons and other repetitive sequences.
  - Most of the genome (97 percent) is the same in all people.
  - Chimpanzees share 95 percent of the human genome.
Figure 12.12 Evolution of the Genome
Concept 12.4 The Human Genome Sequence Has Many Applications
- Rapid genotyping technologies are being used to understand the complex genetic basis of diseases such as diabetes, heart disease, and Alzheimer's disease.
- Haplotype maps are based on single nucleotide polymorphisms (SNPs): DNA sequence variations that involve single nucleotides.
- SNPs are point mutations in a DNA sequence.
Concept 12.4 The Human Genome Sequence Has Many Applications
- SNPs that differ are not all inherited as independent alleles.
- A set of SNPs that are close together on a chromosome is inherited as a linked unit.
- A piece of chromosome with a set of linked SNPs is called a haplotype.
- Analyses of human haplotypes have shown that there are, at most, 500,000 common variations.
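Why linked SNPs reduce the number of variations to track can be illustrated with a small count. The sequences and SNP positions below are invented for the example; the sketch only shows that two nearby SNPs inherited together give fewer haplotypes than the four combinations that independent inheritance would allow.

```python
from collections import Counter


def haplotype_counts(chromosomes, snp_positions):
    """Count the combinations of alleles seen at the given SNP positions,
    one combination (haplotype) per chromosome copy."""
    return Counter(
        "".join(chromosome[pos] for pos in snp_positions)
        for chromosome in chromosomes
    )


# Hypothetical chromosome segments with SNPs at positions 0 and 3.
chromosomes = ["ACGTA", "ACGTA", "TCGAA", "TCGAA", "ACGTA"]
counts = haplotype_counts(chromosomes, [0, 3])
print(counts)  # Counter({'AT': 3, 'TA': 2})
```

Only the haplotypes AT and TA occur in this sample, so typing one of the two SNPs is enough to predict the other.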
Concept 12.4 The Human Genome Sequence Has Many Applications
- Technologies to analyze SNPs in an individual genome include next-generation sequencing methods and DNA microarrays.
- A DNA microarray detects DNA or RNA sequences that are complementary to and hybridize with an oligonucleotide probe.
- The aim is to find out which SNPs are associated with specific diseases and identify alleles that contribute to disease.
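Probe hybridization can be modeled, very roughly, as a search for a perfectly complementary stretch. The sketch below assumes exact base pairing and made-up sequences around a single A/G SNP; real hybridization tolerates some mismatches and depends on experimental conditions, so this only illustrates the principle.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the strand that pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(probe, sample):
    """True if the sample strand contains a stretch that is perfectly
    complementary to the probe."""
    return reverse_complement(probe) in sample


def genotype(sample, probes):
    """Alleles whose probes hybridize with the sample strand."""
    return {allele for allele, probe in probes.items()
            if hybridizes(probe, sample)}


# Hypothetical locus with an A/G SNP; one probe per allele.
probes = {
    "A": reverse_complement("TTGACAGTCC"),
    "G": reverse_complement("TTGACGGTCC"),
}
sample = "CCTTGACAGTCCAA"
print(genotype(sample, probes))  # {'A'}
```

A microarray carries many thousands of such probes, so many SNPs can be scored in one experiment.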
Figure 12.13 SNP Genotyping and Disease
Concept 12.4 The Human Genome Sequence Has Many Applications
- Genetic variation can affect an individual's response to a particular drug.
- A variation could make a drug more or less active in an individual.
- Pharmacogenomics studies how the genome affects the response to drugs.
- This makes it possible to predict whether a drug will be effective, with the objective of personalizing drug treatments.
Figure 12.14 Pharmacogenomics
Concept 12.4 The Human Genome Sequence Has Many Applications
- Comparisons of the proteomes of humans and other eukaryotes have revealed categories of proteins.
- The human proteome includes a set of 1,300 proteins (also present in yeasts, nematodes, and fruit flies) that carry out the basic metabolic functions of the cell.
Concept 12.4 The Human Genome Sequence Has Many Applications
- Proteomics can be useful in the diagnosis of diseases by studying the pattern of proteins made in a particular tissue at a particular time.
- Metabolomics may also aid in diagnostics when patterns of metabolites can be associated with physiological states.
Concept 12.4 The Human Genome Sequence Has Many Applications
- DNA fingerprinting refers to a group of techniques used to identify individuals by their DNA.
- Short tandem repeat (STR) analysis is the most common.
- When several different STR loci are analyzed, a unique pattern becomes apparent.
- It can be used for questions of paternity and in crime investigation.
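STR analysis amounts to counting how many times a short motif repeats at each locus and comparing the resulting pattern between samples. A minimal sketch, with invented locus names, motifs, and sequences, and with the simplifying assumption of one allele per locus (a real profile records two):

```python
import re


def max_tandem_repeats(sequence, motif):
    """Largest number of back-to-back copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


def str_profile(loci, motif_by_locus):
    """Repeat count at each locus for one DNA sample."""
    return {name: max_tandem_repeats(seq, motif_by_locus[name])
            for name, seq in loci.items()}


# Hypothetical loci: lowercase is flanking DNA, uppercase is the repeat.
motifs = {"locus1": "GATA", "locus2": "CA"}
evidence = {"locus1": "ccGATAGATAGATAtt", "locus2": "ggCACACACACAaa"}
suspect = {"locus1": "ccGATAGATAGATAtt", "locus2": "ggCACACAaa"}

print(str_profile(evidence, motifs))
print(str_profile(suspect, motifs))
print(str_profile(evidence, motifs) == str_profile(suspect, motifs))
```

The two samples agree at the first locus but differ at the second, which is why several loci are analyzed before concluding that two samples match.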
Figure 12.15 DNA Fingerprinting (Part 1)
Figure 12.15 DNA Fingerprinting (Part 2)
Answer to Opening Question
- Genome sequencing in dogs led to the identification of an SNP in the IGF-1 gene that is important in determining size.
- Large and small breeds have different alleles of the gene.
- A mutation in another gene produces differences in the musculature of dogs and cattle.
Figure 12.16 Muscular Gene (Part 1)
Figure 12.16 Muscular Gene (Part 2)