Title: Genome and proteome
1Genome and proteome
2- 3 broad areas
- (A) Genomes, transcriptomes, proteomes
- (B) Applications of the human genome project
- (C) Genome evolution
3(A) Genomes, transcriptomes, proteomes
- Genome projects
- - Human Genome Project (HGP): a history
- - Other genome projects: why do it?
- - Genome organisation
- insights from HGP
- Repeat elements
- Transposable elements
- Mitochondrial genomes
- Y chromosome
- Post-genomics
- - transcriptomes
- - proteomes
4(A) Genomes, transcriptomes and proteomes
- Genome: the entire DNA complement of an organism, including organelle DNA
- Transcriptome: all RNA transcribed from the genome of a cell or tissue
- Proteome: all proteins expressed by a genome, cell or tissue
5Why study the genome?
- 3 main reasons
- - A description of the sequence of every gene is valuable. This includes regulatory regions, which help in understanding not only the molecular activities of the cell but also the ways in which they are controlled.
- - Identify and characterise important inheritable disease genes or bacterial genes (for industrial use)
- - Role of intergenic sequences, e.g. satellites, intronic regions etc.
6HGP
- Goal: obtain the entire DNA sequence of the human genome
- Players
- (A) International Human Genome Sequencing Consortium (IHGSC)
- - public funding, free access to all, started earlier
- - used the mapping of overlapping clones method
- (B) Celera Genomics
- - private funding, pay to view
- - started in 1998
- - used whole genome shotgun strategy
7Whose genome is it anyway?
- (A) International Human Genome Sequencing Consortium (IHGSC)
- - composite from several different people, generated from 10-20 primary samples taken from numerous anonymous donors across racial and ethnic groups
- (B) Celera Genomics
- - 5 different donors (one of whom was J. Craig Venter himself!)
8Strategies for sequencing the human genome
9Figure 12.2 Arranging DNA Sequences
10Strategies for sequencing the human genome
11Whole-genome shotgun sequencing
Used by the private company Celera to sequence the whole human genome
- Whole genome randomly sheared three times
- - Plasmid library constructed with 2 kb inserts
- - Plasmid library with 10 kb inserts
- - BAC library with 200 kb inserts
- Computer program assembles sequences into chromosomes
- No physical map construction
- Only one BAC library
- Overcomes problems of repeat sequences
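The assembly step listed above can be illustrated with a toy example. This is a minimal sketch, not Celera's assembler: it assumes error-free reads from a single strand, uses a simple greedy suffix-prefix merge, and ignores the paired-end information from the 2 kb, 10 kb and 200 kb libraries that a real assembler relies on to get across repeats. The function names (shear, overlap, greedy_assemble) and the example sequence are made up for illustration.

```python
import random


def shear(genome, n_reads, read_len, rng):
    """Sample fixed-length reads at random start positions (toy 'shearing')."""
    starts = [rng.randrange(len(genome) - read_len + 1) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def drop_contained(reads):
    """Remove duplicate reads and reads fully contained in another read."""
    reads = list(dict.fromkeys(reads))
    return [r for r in reads if not any(r != s and r in s for s in reads)]


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = drop_contained(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for n, r in enumerate(reads) if n not in (best_i, best_j)]
        reads = drop_contained(reads + [merged])
    return reads


# Hand-picked overlapping reads from the toy sequence ATGGCGTACGTTAGC
contigs = greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTAGC"])
print(contigs)  # ['ATGGCGTACGTTAGC']
```

With repeated sequence longer than the reads, a greedy merge like this can join the wrong fragments, which is why insert libraries of several sizes matter in the real strategy.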
Fig. 10.13 Genetics by Hartwell
12Sequencing larger genomes
Mapping phase
Genomes - Dr. MV Hejmad
Sequencing phase
http://www.DNAi.org
13Other genomes sequenced
1997 4,200 genes
2002 36,000 genes
1998 19,099 genes
Sept 2003 18,473 human orthologs
2002 38,000 genes
Science (26 Sep 2003) Vol 301 (5641) pp 1854-1855
14Genomics: World's smallest genome
- The smallest genome known is the DNA of a 'nucleomorph' of Bigelowiella natans, a single-celled alga of the group known as chlorarachniophytes.
- 373,000 base pairs and a mere 331 genes
- The nucleomorph is an evolutionary vestige that
was originally the nucleus of a eukaryotic cell.
The eukaryotic cell swallowed a cyanobacterium to
acquire a photosynthetic 'plastid' organelle, and
that cell was in turn engulfed by another cell to
produce B. natans as we know it. Now, most of the
nucleomorph's genome is concerned with its own
maintenance, and just 17 of its genes still exert
any control over the plastid. Its small size
suggests it is heading for evolutionary oblivion.
Proc. Natl Acad. Sci. USA 103, 9566-9571 (2006)
by G McFadden, University of Melbourne, Australia
15Organisation of human genome
- Nuclear genome (3.2 Gbp)
- 24 types of chromosomes
- Y: 51 Mbp and chr 1: 279 Mbp
http://www.ncbi.nlm.nih.gov/Genomes/
16Nuclear genome organisation (human)
Genomes 2 by TA Brown pg 23
17Nuclear genome organisation (human)
- 1) Gene and gene-related sequences
- Coding regions: exons (5%)
- Non-coding regions
- RNA genes
- Introns
- Pseudogenes
- Gene fragments
18- Basic structure of a gene
Fig. 21.11
19Polypeptide-coding regions
20Non-polypeptide-coding: RNA encoding
21Nuclear genome organisation (human)
RNA genes -
Major classes of RNA involved in gene expression
- rRNA: 16S, 23S, 28S, 18S etc.
- tRNA: 22 mitochondrial and 49 cytoplasmic types
- snRNA: U1, U2, U4, U5, U6 etc.
- snoRNA: > 100 types
- Other RNA classes
- microRNA
- XIST RNA
- Imprinting associated RNA
- Nervous system specific
- Antisense RNA
- Others
22introns
Non-coding regions..
23Pseudogenes (ψ)
Non-coding regions..
- A non-functional copy of most or all of a gene
- Inactivated by mutations that may
- - inhibit the signal for initiation of transcription
- - prevent splicing at the exon-intron boundary
- - cause premature termination of translation
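One of the inactivating mechanisms above, premature termination of translation, is easy to check for in a sequence. The sketch below is a minimal illustration that assumes a coding sequence already in frame and given as a DNA string. The function names and the two example sequences are invented for the demonstration and are not taken from a real gene or pseudogene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def has_premature_stop(cds):
    """True if an in-frame stop codon occurs before the final codon."""
    idx = first_stop(cds)
    n_codons = len(cds) // 3
    return idx is not None and idx < n_codons - 1


functional = "ATGGCTTGGAAATAA"  # ATG GCT TGG AAA TAA
mutated = "ATGGCTTGAAAATAA"     # TGG -> TGA: a nonsense mutation at codon 3

print(has_premature_stop(functional))  # False
print(has_premature_stop(mutated))     # True
```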
Human Mol Gen 3 by Strachan & Read, pgs 262-264
24Pseudogenes (ψ)
Non-coding regions..
- Different classes include
- Non-processed
- - contain non-functional copies of genomic DNA sequence, incl. exons and introns
- - arise from gene duplication events
- - E.g. rabbit pseudogene ψβ2
25Rabbit pseudogene ψβ2
Non-coding regions..
- Related to β1
- Usual exon and intron organisation
β1
ψβ2
26Pseudogenes - processed
Non-coding regions
27Nuclear genome organisation (human)
28Nuclear genome organisation (human)
- 2) Extragenic (intergenic) DNA
- (62% of genome)
- A) Unique or low copy number sequences
- B) Repetitive sequences (53%)
29A) Unique or low copy number sequences
- Non-coding, non-repetitive, single-copy sequences of no known function or significance
30B) Repetitive sequences
- Significance
- Evolutionary signposts
- Passive markers for mutation assays
- Actively reorganise gene organisation by creating, shuffling or modifying existing genes
- Chromosome structure and dynamics
- Provide tools for medical, forensic, genetic analysis
31- Eukaryotic genomes have repetitive DNA sequences
- Highly repetitive sequences: short sequences (< 100 bp) repeated thousands of times in tandem; not transcribed
- Short tandem repeats (STRs) of 1-5 bp are scattered around the genome and can be used in DNA fingerprinting.
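Short tandem repeats can be located in a sequence with a regular expression. The sketch below is a minimal example that assumes an uppercase DNA string with no ambiguity codes. The function name find_strs, its default thresholds and the test sequence are illustrative choices, not a forensic standard.

```python
import re


def find_strs(seq, min_unit=1, max_unit=5, min_repeats=4):
    """Find tandem repeats of a 1-5 bp unit.

    Returns (start, unit, repeat_count) tuples. The lazy quantifier makes the
    regex try the shortest repeat unit first at each position.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# A made-up locus carrying six copies of the dinucleotide CA
print(find_strs("ACGT" + "CA" * 6 + "TTG"))  # [(4, 'CA', 6)]
```

In DNA fingerprinting it is the repeat count at each locus, the third value in each tuple, that differs between individuals.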
32- Moderately repetitive sequences are repeated 10-1,000 times.
- Includes the genes for tRNAs and rRNAs
- Single copies of the tRNA and rRNA genes are inadequate to supply the large amounts of these molecules needed by cells, so the genome has multiple copies in clusters
- Most moderately repeated sequences are transposons.
33- Transposons are of two main types in eukaryotes
- Retrotransposons (Class I) make RNA copies of themselves, which are copied into DNA and inserted in the genome.
- - LTR retrotransposons have long terminal repeats of DNA sequences
- - Non-LTR retrotransposons do not have LTR sequences; SINEs and LINEs are types of non-LTR retrotransposons
34- Transposons (or transposable elements) are DNA segments that can move from place to place in the genome.
- They can move from one piece of DNA (such as a chromosome) to another (such as a plasmid).
- If a transposon is inserted into the middle of a gene, it will be transcribed and result in abnormal proteins.
36DNA Sequences That Move (Part 2)
37- DNA transposons (Class II) do not use RNA intermediates.
- They are excised from the original location and inserted at a new location without being replicated.
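The two transposition modes can be contrasted with simple string operations. This is a toy model in which the genome is a Python string and the element is marked in upper case. It does not model the RNA intermediate, reverse transcription or target-site duplication, and the function names and coordinates are invented for the example.

```python
def cut_and_paste(genome, start, length, target):
    """DNA transposon (Class II): excise the element, reinsert it elsewhere.

    `target` is an index into the genome AFTER the element has been excised.
    Genome length is unchanged.
    """
    element = genome[start:start + length]
    remainder = genome[:start] + genome[start + length:]
    return remainder[:target] + element + remainder[target:]


def copy_and_paste(genome, start, length, target):
    """Retrotransposon (Class I): the original stays and a copy is inserted.

    `target` is an index into the original genome. The genome grows by the
    length of the element.
    """
    element = genome[start:start + length]
    return genome[:target] + element + genome[target:]


genome = "aaaTTTTcccggg"  # upper case marks the hypothetical element

print(cut_and_paste(genome, 3, 4, 6))    # aaacccTTTTggg
print(copy_and_paste(genome, 3, 4, 10))  # aaaTTTTcccTTTTggg
```

The copy-and-paste mode increases genome size with every event, which is consistent with retrotransposon-derived repeats making up so much of the genome.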
38Table 12.3 Types of Sequences in Eukaryotic
Genomes
43Concept 12.4 The Human Genome Sequence Has Many
Applications
- By 2010 the complete haploid genome sequence had been determined for more than ten individuals.
- Soon, a human genome will be sequenced for less than $1,000.
44Concept 12.4 The Human Genome Sequence Has Many
Applications
- Some interesting facts about the human genome
- Protein-coding genes make up about 24,000 genes, less than 2 percent of the 3.2 billion base pair human genome.
- Each gene must code for several proteins, and posttranscriptional mechanisms (e.g., alternative splicing) must account for the observed number of proteins in humans.
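The figures above can be turned into a quick back-of-the-envelope calculation, and the effect of alternative splicing can be shown by enumeration. The sketch uses the slide's numbers as given. The isoforms function and its exon names are hypothetical, and it assumes the simplest possible model, in which each optional exon is independently kept or skipped.

```python
from itertools import product

GENOME_BP = 3_200_000_000  # genome size from the slide
CODING_PERCENT_MAX = 2     # "less than 2 percent"
N_GENES = 24_000           # protein-coding genes from the slide

# Upper bound on protein-coding sequence, and the average per gene
coding_bp_max = GENOME_BP * CODING_PERCENT_MAX // 100
avg_coding_bp_per_gene = coding_bp_max / N_GENES

print(coding_bp_max)                  # 64000000
print(round(avg_coding_bp_per_gene))  # 2667


def isoforms(first_exon, optional_exons, last_exon):
    """List every mRNA in which each optional exon is kept or skipped."""
    results = []
    for keep in product([True, False], repeat=len(optional_exons)):
        middle = [e for e, k in zip(optional_exons, keep) if k]
        results.append([first_exon, *middle, last_exon])
    return results


# Three optional exons give 2**3 = 8 possible transcripts from one gene
print(len(isoforms("E1", ["E2", "E3", "E4"], "E5")))  # 8
```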
45Concept 12.4 The Human Genome Sequence Has Many
Applications
- Over 50 percent of the genome is transposons and other repetitive sequences.
- Most of the genome (97 percent) is the same in all people.
- Chimpanzees share 95 percent of the human genome.
46Concept 12.4 The Human Genome Sequence Has Many
Applications
- Rapid genotyping technologies are being used to understand the complex genetic basis of diseases such as diabetes, heart disease, and Alzheimer's disease.
- Haplotype maps are based on single nucleotide polymorphisms (SNPs): DNA sequence variations that involve single nucleotides.
- SNPs are point mutations in a DNA sequence.
47Concept 12.4 The Human Genome Sequence Has Many
Applications
- SNPs that differ are not all inherited as independent alleles.
- A set of SNPs that are close together on a chromosome are inherited as a linked unit.
- A piece of chromosome with a set of linked SNPs is called a haplotype.
- Analyses of human haplotypes have shown that there are, at most, 500,000 common variations.
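The idea that linked SNPs travel together can be illustrated by reading the alleles at a few SNP positions from several chromosomes and counting the combinations that actually occur. This is a minimal sketch that assumes phased, aligned sequences of equal length. The function name, the five sequences and the SNP positions are invented for the example.

```python
from collections import Counter


def haplotype_counts(chromosomes, snp_positions):
    """Count the allele combinations observed at the given SNP positions."""
    return Counter(
        "".join(chrom[p] for p in snp_positions) for chrom in chromosomes
    )


# Five hypothetical chromosome segments with SNPs at positions 1 and 4
chromosomes = ["ACGTAC", "ACGTAC", "ATGTGC", "ATGTGC", "ACGTAC"]
counts = haplotype_counts(chromosomes, [1, 4])

print(counts["CA"], counts["TG"])  # 3 2
print(len(counts))                 # 2
```

Two SNPs with two alleles each could form four combinations if they were inherited independently; seeing only two of them is what linkage looks like in data of this kind.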
48Concept 12.4 The Human Genome Sequence Has Many
Applications
- Technologies to analyze SNPs in an individual genome include next-generation sequencing methods and DNA microarrays.
- A DNA microarray detects DNA or RNA sequences that are complementary to and hybridize with an oligonucleotide probe.
- The aim is to find out which SNPs are associated with specific diseases and identify alleles that contribute to disease.
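Probe hybridization can be modelled, very roughly, as a reverse-complement match. The sketch below assumes a perfect-match rule (a probe binds only if its exact reverse complement occurs in the sample strand), which ignores melting temperature and partial mismatches. The sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(probe, sample):
    """Perfect-match model: the probe binds if its reverse complement
    occurs somewhere in the sample strand."""
    return reverse_complement(probe) in sample


# Two made-up alleles that differ at a single nucleotide (A vs G)
allele_a = "GGATCAGTCC"
allele_g = "GGATCGGTCC"

# Probe designed against the A allele
probe = reverse_complement("ATCAGTC")

print(probe)                        # GACTGAT
print(hybridizes(probe, allele_a))  # True
print(hybridizes(probe, allele_g))  # False
```

A microarray runs many such probe tests in parallel, one spot per SNP allele.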
51Figure 12.13 SNP Genotyping and Disease
52Concept 12.4 The Human Genome Sequence Has Many
Applications
- Genetic variation can affect an individual's response to a particular drug.
- A variation could make a drug more or less active in an individual.
- Pharmacogenomics studies how the genome affects the response to drugs.
- This makes it possible to predict whether a drug will be effective, with the objective of personalizing drug treatments.
53Figure 12.14 Pharmacogenomics
57https://genographic.nationalgeographic.com/