BIO341 Gene Discovery Section 3 Genome Organisation

1
BIO341 Gene Discovery Section 3 - Genome Organisation
  • Jasper Rees
  • Department of Biochemistry, UWC
  • www.biotechnology.uwc.ac.za/teaching/BIO341

2
Eukaryote Genome organisation
  • Genome sizes
  • Cot analysis and genome complexity
  • Repetitive and unique sequences
  • mRNA expression
  • How much coding sequence in a genome
  • Rot analysis of mRNA expression
  • Tissue specific genes

3
Genome sizes in different phyla
4
Cot analysis of genomic DNA
  • Anneal DNA and measure double stranded component
  • Annealing rate depends on initial concentration
    (C0) and time (t)
  • Speed of annealing is dependent on sequence
    complexity
  • Sequence complexity is the length of unique
    sequence
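
The dependence on C0 and t can be sketched with the ideal second-order reassociation equation, C/C0 = 1 / (1 + k * C0 * t). This is the standard textbook form and is not stated on the slide; the rate constants and Cot values below are illustrative assumptions only.

```python
def fraction_single_stranded(cot, k):
    """Fraction of DNA still single stranded at a given Cot value.

    Ideal second-order kinetics: C/C0 = 1 / (1 + k * C0 * t).
    cot is the product C0 * t; k is the reassociation rate constant.
    """
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half of the DNA has reannealed (Cot1/2 = 1 / k)."""
    return 1.0 / k


# Hypothetical rate constants: a simple sequence anneals fast, a complex one slowly.
k_simple, k_complex = 100.0, 0.01
for cot in (0.01, 1.0, 100.0):
    print(cot,
          fraction_single_stranded(cot, k_simple),
          fraction_single_stranded(cot, k_complex))
```

Because Cot1/2 equals 1 / k in this model, a more complex sequence (smaller k) reaches half reassociation at a higher Cot value.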

5
Complexity of Genomes
  • Chemical complexity: amount of DNA determined by
    chemical analysis
  • Kinetic complexity: amount of DNA determined by
    Cot analysis
  • Generally values agree well, except for polyploid
    genomes

6
Several classes of sequence complexity in
eukaryote genomes
  • Cot analysis shows three classes of sequence
    complexity
  • Highly repetitive, mostly from centromeres
  • Middle repetitive, mostly from longer repeat
    sequences
  • Unique sequences

7
Repetitive and unique sequence in eukaryote
genomes
  • As genomes get larger, proportion of repetitive
    sequence increases
  • Amount of unique sequence is still high even in
    large genomes
  • Distribution of the three classes of sequence is
    not revealed by Cot analysis

8
mRNA as a product of genomic DNA
  • Using mRNA as a probe for reassociation kinetics
  • Shows that majority of mRNA is from unique DNA
    sequences
  • Not all unique DNA sequences are transcribed

9
Estimating the amount of coding sequence in a
genome
  • Saturation of reassociation experiment shows the
    total proportion of DNA transcribed is small
  • The result will vary depending on the tissue
    source of mRNA
  • This is with cytoplasmic (spliced) mRNA

10
Measurement of the sequence complexity of mRNA
expressed
  • Rot analysis shows the abundance of mRNA
  • Chick oviduct expresses large amount of ovalbumin
    (50% of mRNA)
  • About 10 other abundant genes (15% of mRNA,
    including lysozyme)
  • And 35% of the mRNA represents all other
    transcribed genes in the tissue
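
The abundance classes above can be turned into a rough per-gene share of the mRNA pool. This is a minimal sketch that takes "about 10" abundant genes as exactly 10, which is an assumption made only for the arithmetic.

```python
# Abundance classes from the chick oviduct example: (name, fraction of mRNA, gene count).
# The gene count of 10 for the second class is an assumed round number.
classes = [
    ("ovalbumin", 0.50, 1),
    ("other abundant genes", 0.15, 10),
]


def share_per_gene(fraction_of_mrna, n_genes):
    """Average share of total mRNA contributed by each gene in a class."""
    return fraction_of_mrna / n_genes


for name, fraction, n_genes in classes:
    print(f"{name}: {share_per_gene(fraction, n_genes):.1%} of mRNA per gene")

# Whatever is left belongs to all other transcribed genes in the tissue.
remaining = 1.0 - sum(fraction for _, fraction, _ in classes)
print(f"all other genes together: {remaining:.0%}")
```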

11
Overall expression of mRNA in different tissues
12
Organisation of genes in genomes
  • Overview of exon and intron size and numbers
  • Differences between small genomes and large
    genomes
  • Variations in gene strategy
  • Implications for genome organisation

13
Overview of splicing of introns
14
Identification of introns: gene and mRNA are not
contiguous
15
Globin gene structure: common structure to whole
family of genes
  • Position of introns in coding sequences is
    constant.
  • Length of introns varies
  • All globin genes organised the same way
  • Implies ancient evolutionary relationship

16
Mammalian DHFR genes have common organisation
  • Dihydrofolate reductase required for purine
    nucleotide biosynthesis
  • Exon organisation is common to all mammals (To
    all vertebrates?)
  • Introns vary greatly in length and sequence
    between species
  • Typical of other genes

17
Intron and exon length distribution
18
Distribution of exon numbers
  • Saccharomyces: small genome, little repetitive
    sequence, very few introns, highly compact
  • Flies and mammals: larger genomes, more
    repetitive sequence, more introns. Genes dispersed

19
Overall gene sizes when there are more introns
  • Gene sizes follow a normal distribution in yeast
  • But skewed distribution in flies, mammals (plants
    etc)
  • Increase in gene size related to increase in
    presence of introns and number of introns

20
mRNA and gene sizes
  • Yeasts: few introns, mRNA mostly same size as
    genes
  • Flies, mammals, plants etc: mRNA much smaller
    than genes.
  • hnRNA has same size as genes (from which it is
    transcribed)
  • hnRNA spliced into mRNA
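
The size difference between the primary transcript and the mRNA can be illustrated with a toy splicing function. The sequences and exon coordinates are invented for the example; real splicing depends on splice-site signals that this sketch does not model.

```python
def splice(pre_mrna, exons):
    """Join exon segments of a primary transcript, dropping the introns.

    exons is a list of (start, end) pairs, 0-based with end exclusive,
    given in transcript order.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical toy transcript: upper case = exon, lower case = intron.
exon_1 = "AUGGCC"
intron = "gu" + "a" * 15 + "ag"
exon_2 = "GAAUAA"
pre_mrna = exon_1 + intron + exon_2

# Exon coordinates are derived from the segment lengths above.
exons = [(0, len(exon_1)), (len(exon_1) + len(intron), len(pre_mrna))]
mrna = splice(pre_mrna, exons)
print(len(pre_mrna), len(mrna), mrna)
```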

21
Alternative functions in genes
  • Promoters
  • Termination
  • Splicing

22
Human Chromosome 16 - DWNN gene shows complex
organisation
DWNN gene is 35 000 bases long
23
Exons are conserved, introns vary
  • Dotplots show similarity graphically
  • Can show for DNA or protein sequences
  • For two mouse alpha globin genes, can show exons
    are conserved, but introns do not show strong
    sequence relationship.
  • Alpha globin genes recently duplicated
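
A dotplot can be sketched in a few lines: mark every position where a short window of one sequence matches the other exactly. The two toy "genes" below are invented (shared exons, unrelated introns) and are not real globin sequences; the window size of 3 is also an arbitrary choice.

```python
def dotplot(seq_a, seq_b, window=3):
    """Return a text dotplot: '*' where a window of seq_a matches seq_b exactly."""
    rows = []
    for i in range(len(seq_a) - window + 1):
        row = ""
        for j in range(len(seq_b) - window + 1):
            row += "*" if seq_a[i:i + window] == seq_b[j:j + window] else "."
        rows.append(row)
    return rows


# Hypothetical sequences: identical exons flanking unrelated introns.
gene_1 = "ATGGCT" + "CCCCC" + "GAATAA"
gene_2 = "ATGGCT" + "TTAGT" + "GAATAA"
for row in dotplot(gene_1, gene_2):
    print(row)
```

Conserved exons appear as runs of stars along the main diagonal, and the diagonal breaks where the introns differ.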

24
Exons show similarity between species
  • Zoo Blots used to detect related DNA sequences in
    different species
  • Detect exons by hybridisation
  • Based on conservation of coding sequence
  • More highly conserved genes imply more
    limitations on protein sequence variation
  • Basis for computational identification of genes
    between species

25
Generation of new genes
  • Diversification of species occurs by addition of
    new genes
  • New genes occur from mutation and selection of
    existing genes
  • Or gene duplication followed by mutation and
    selection
  • Or recombination between two genes (gene
    shuffling)
  • Or horizontal transfer of genes from other
    species
  • New genes are very rarely (almost never) de novo
    events

26
Gene duplications
  • Occur as the result of duplicating region of
    genome by recombination mechanisms
  • From small duplication, to very large
  • One gene duplicated or many
  • Gene duplication generally not bad for organism
  • Duplication of genes allow for divergence

27
cDNA to gene duplication
  • Occurs by reverse transcription of mRNA into DNA
    by retroviral enzymes, in germ line
  • Integration of cDNA into germline gives spliced
    copy in DNA
  • Generally not functional copy of gene (no
    promoter)
  • Provides sequences that could be fused to other
    genes

28
Pseudogenes
  • Non functioning copies of genes
  • Promoter, splicing, termination, translational
    mutations
  • Various types, can be one or many mutations.
  • Some transcribed, some not.
  • Generated by recombination or cDNA insertion
  • Can be close or distant in genome

29
Gene families
  • Group of genes with related sequences and
    functions
  • May be small or large family
  • May be closely related (sequence /function)
  • Or very distant
  • Generated by duplication by recombination
  • Level of relationship of products depends on
    extent of divergence

30
Protein Families
  • The result of gene family divergence
  • Must have related sequences, and thus related
    functions.
  • May be related functions
  • Thus may all be kinases, but all have different
    substrates
  • Examples: kinases, proteases, globins, DNA
    binding proteins, ATPases, NADP reductases

31
Protein Superfamilies
  • Large family relationship
  • May only be domains of the protein that are
    related
  • Functions may be very different
  • Evolutionarily very divergent
  • Often the result of exon-shuffling
  • Examples: immunoglobulin superfamily, protein
    kinase superfamilies, blood clotting enzymes

32
Homologs, orthologs, paralogs
33
Exons and domains
  • Can often relate exons to domains in proteins
  • Domains are independently folding structures in
    proteins
  • When exon boundaries coincide with the boundaries
    of the domains, exons can be plugged together to
    create modular proteins
  • Need the same reading frame at each exon boundary
  • Many good examples of this, but not for every
    gene or domain.

34
Exon shuffling
  • If exons have the same reading frame
  • because splicing is independent of coding
    sequence
  • Can splice together whatever set of exons is
    transcribed.
  • If splicing assembles fused open reading frame
  • Then can translate protein
  • If DNA recombination assembles two exons together
    and they can translate into a protein that can
    fold correctly into its domains, then this protein
    is highly likely to be functional
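
The reading-frame requirement can be checked mechanically. This is a minimal sketch with invented exon sequences; it tests only whether the joined exons form one open reading frame and says nothing about whether the protein folds.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(exons):
    """True if the joined exons form one uninterrupted reading frame.

    Checks that the total length is a multiple of 3, that the sequence
    starts with ATG, ends with a stop codon, and has no earlier in-frame stop.
    """
    cds = "".join(exons)
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[:-1])


# Hypothetical exons. The first insert is 6 bases long (a multiple of 3),
# so it does not shift the frame of the exon that follows it.
first, last = "ATGGCT", "GAATAA"
in_frame_insert = "CCAGGA"
frame_shifting_insert = "CCAGG"
print(is_open_reading_frame([first, in_frame_insert, last]))
print(is_open_reading_frame([first, frame_shifting_insert, last]))
```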

35
Uniqueness of Exon shuffling
  • Exons only in eukaryotes
  • Exon shuffling only in eukaryotes
  • One mechanism for acceleration of gene
    diversification in eukaryotes
  • Provides engine to power evolutionary change
  • Is a selective advantage for presence of
    introns/exons

36
Globin gene family
  • In humans have α and β globin gene clusters
  • In adults form haemoglobin, α2β2 hetero-tetramers
    which bind oxygen in the blood
  • Foetal and Embryonic forms bind oxygen more
    tightly to transfer oxygen from maternal blood
  • Gene expression developmentally regulated

37
Developmental expression of globins
  • Embryonic: ζ2ε2, ζ2γ2 and α2ε2
  • Foetal: α2γ2
  • Adult: α2δ2 and α2β2
  • Sequential replacement of gene expression
  • Genes activated along α and β clusters (see
    figure 23.2)

38
Beta-globin genes in other vertebrates
  • Find varying numbers of genes
  • However, have embryonic, foetal and adult forms
  • Find pseudogenes in many clusters
  • Can trace evolutionary history of genes by
    analysing sequence relationships

39
Evolution of globin gene families
  • Evolution of both α and β globin gene families is
    by duplication
  • Result of unequal crossing over events
  • Fixation of additional genes in populations
    allows divergence of function
  • Loss of function results in pseudogenes
  • Divergence of function resulted in development of
    three different types (embryonic, foetal and
    adult)
  • See figure 23.7 for examples of divergence

40
Why increased globin gene diversity?
  • To generate globin proteins with different oxygen
    binding affinities
  • Selective advantage to get more oxygen to
    developing embryo/foetus
  • As development becomes more complex and embryo
    develops in maternal body, then need more complex
    oxygen transport, and thermodynamics

41
Globin genetic diseases: mutations
  • Find many different mutations in α and β globins
  • Class of diseases termed thalassaemias
  • Point mutations that affect oxygen binding, and
    regulation of oxygen binding (Hill effect etc)
  • Also promoter, splicing and poly A mutations
  • Must be able to generate enough hemoglobin to
    transport oxygen at all times though

42
Globin genetic diseases: deletions
  • Also have major deletion events in α and β globin
    clusters
  • Result from deletions occurring in unequal
    crossing over events, between regions of homology
    in globin clusters
  • Deletion of different combinations of genes gives
    different phenotypes
  • See figures 23.5 and 23.6

43
Why so many globin mutations?
  • In certain human populations globin mutations are
    extremely common
  • Reduced hemoglobin function causes red blood
    cells to be fragile
  • Fragile RBCs break open more easily when infected
    with malaria parasite (Plasmodium)
  • So people with thalassaemia have increased
    resistance to malaria, when heterozygous for
    globin mutations

44
Heterozygous Advantage
  • Termed Heterozygous Advantage, so selects for
    people with globin mutations in presence of
    malaria
  • Results in very high frequency of thalassaemia in
    West Africa, Mediterranean area, South East Asia.
  • Is a directly observable effect of selection
    pressure on human populations resulting in
    evolutionary divergence on historical timescales
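
Heterozygous advantage can be illustrated with the standard one-locus selection model. The fitness values are invented for the example and are not measurements; the model itself (heterozygote fittest, both homozygotes penalised) is a textbook assumption that the slide does not spell out.

```python
def next_frequency(q, s, t):
    """One generation of selection on the frequency q of the mutant allele.

    Relative fitnesses (illustrative model, not from the slides):
    normal homozygote 1 - s, heterozygote 1, mutant homozygote 1 - t.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * q + q * q * (1 - t)) / mean_fitness


def equilibrium_frequency(s, t):
    """Stable mutant-allele frequency under heterozygote advantage: s / (s + t)."""
    return s / (s + t)


# Hypothetical costs: malaria costs normal homozygotes 10%,
# the blood disorder costs mutant homozygotes 80%.
s, t = 0.10, 0.80
q = 0.01
for _ in range(500):
    q = next_frequency(q, s, t)
print(round(q, 4), round(equilibrium_frequency(s, t), 4))
```

In this model the mutant allele is held at an intermediate frequency for as long as the selective pressure on normal homozygotes (here, malaria) persists.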

45
Recombination events
  • Crossing over - homologous recombination
    occurring normally during meiosis
  • Unequal crossing over resulting from mismatching
    regions of homology, that are incorrectly aligned
    during meiosis
  • Results in generation of additional sequence in
    one chromosome, and loss of sequence in the other.

46
Homologous Recombination
  • Two genes in parental chromosomes
  • Unequal crossing over
  • Results in generation of 1 and 3 genes
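
The 1-gene and 3-gene products can be reproduced with a toy model in which chromosomes are lists of genes and a crossover swaps everything to the right of the break points. Gene names and break positions are hypothetical, and breaks are placed between genes for simplicity.

```python
def crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the segments to the right of the break points.

    Chromosomes are lists of gene names; break_a and break_b are the
    list positions at which each chromosome is cut.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


# Two parental chromosomes, each carrying two related genes (hypothetical names).
parent_a = ["gene1", "gene2"]
parent_b = ["gene1", "gene2"]

# Equal crossing over: both chromosomes cut between gene1 and gene2.
print(crossover(parent_a, parent_b, 1, 1))
# Unequal crossing over: the chromosomes are misaligned by one gene.
print(crossover(parent_a, parent_b, 2, 1))
```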

47
Gene duplications and divergence
  • Generation of 2 genes from 1 gene occurs when
    sequences outside gene allow unequal crossing
    over
  • Once two copies of a gene exist on a chromosome
    selection can occur
  • Multiple copies must be compatible with life,
    true for many genes
  • Then copies can diverge by accumulating mutations
    and developing new functions

48
Divergence and selective advantage
  • With divergence of function, have the possibility
    that new function will provide advantage to
    organism
  • Most mutations will destroy function of gene
  • Small proportion will improve it
  • These can be selected for in populations
  • Many only give real advantage in unusual
    environments
  • May gain frequency from population bottleneck

49
Evolutionary or molecular clock
  • Rate at which molecular changes are fixed in a
    population is measurable
  • Can measure length of time since divergence of
    species or genes using molecular clock
  • Can compare data with fossil record and isotopic
    dating systems to calibrate clock
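
The calibration step can be sketched as simple arithmetic: a dated split gives a rate, and the rate converts another divergence into a time. All numbers are hypothetical, and the sketch assumes a constant rate with no correction for multiple substitutions at the same site.

```python
def calibrate_rate(divergence, years_since_split):
    """Substitution rate per site per year per lineage from a dated split.

    The factor 2 accounts for changes accumulating along both lineages.
    """
    return divergence / (2.0 * years_since_split)


def divergence_time(divergence, rate):
    """Years since two sequences shared an ancestor, assuming a constant rate."""
    return divergence / (2.0 * rate)


# Hypothetical calibration: a pair dated to 80 million years differs at 8% of sites.
rate = calibrate_rate(0.08, 80e6)
# A second, undated pair differs at 2% of sites.
print(divergence_time(0.02, rate))
```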

50
Replacement rates
  • Replacement rates for non-coding positions faster
    than for coding positions
  • Effect of selection on protein function
  • Use non-coding for recent divergence, coding
    sequence for older divergence

51
Evolutionary divergence of globins
  • Can measure evolution of globin gene clusters, in
    terms of molecular clocks, and which species have
    which globin genes
  • Increased complexity of globin gene clusters in
    higher vertebrates
  • Present in lower vertebrates, invertebrates, and
    distantly related protein found in plants
  • See Figures 23.7 and 23.9

52
Gene correction
  • Poorly understood mechanism by which duplicated
    genes are corrected against each other
  • Can result in the maintenance of many identical
    copies of a gene over many generations
  • Correction event may be rare. May replace many
    divergent copies with one specific type.
  • Loss of gene correction will allow evolution of
    duplicated genes

53
Divergence or gene death
  • When genes diverge can get loss of function:
    creation of a pseudogene.
  • Or generation of mutated functional version that
    can be selected for or against
  • Selection over long period can result in the
    emergence of a novel protein with changed
    function
  • Duplication is essential for this divergence of
    gene function

54
Gene duplication central to evolution
  • Require increased genetic complement to allow for
    selection of novel biological functions
  • Gene duplication is the simplest and most common
    way to create this
  • When selective pressure on one copy of the gene
    is lost, then can accumulate mutations, and
    select for new functions
  • Overall result is diversification of function and
    creation of gene families

55
Repetitive DNA Sequences
  • Classes of repetitive sequences found in all
    higher eukaryotes
  • Simple sequences found in satellite regions and
    centromeres
  • Middle repetitive sequences in several classes
  • Many repetitive sequences are mobile genetic
    elements (eg transposons)

56
Repetitive sequences expand in higher eukaryote
genomes
  • As genomes get larger, proportion of repetitive
    sequence increases
  • Repetitive sequences fall into various classes by
    sequence relationship and mode of amplification
  • Amount of unique sequence is still high even in
    large genomes

57
Simple sequence Satellites
  • Defined by unique density caused by unusual
    sequence composition
  • Made up of very short sequence repeats

58
Satellites localise to centromeres
  • Sequences found in satellite regions
  • And centromeres
  • Localised by in situ hybridisation
  • Define centromeric structure and function

59
Mammalian satellites
  • Mouse satellites made up of 238 bp repeat
  • This is made up of internally repeated sequences
  • Result from repeated duplication of 9 bp unit

60
Mini and micro satellites
  • Generated from repeated sequences
  • Repeat units from 2 to >50 bp
  • May result in many alleles at a locus
  • May be coding or non-coding
  • Valuable for genetic markers and genotyping
    experiments
  • Used in DNA fingerprinting (eg in forensics)
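
Alleles at a microsatellite locus differ in how many copies of the repeat unit they carry, which can be counted directly from sequence. The flanking sequences, the CA repeat unit and the copy numbers are invented for this sketch.

```python
import re


def longest_repeat_run(sequence, unit):
    """Largest number of consecutive copies of unit found in sequence."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical alleles at one locus, differing only in the number of CA repeats.
flank_left, flank_right = "GGATCC", "TTGACC"
alleles = {
    "allele_1": flank_left + "CA" * 12 + flank_right,
    "allele_2": flank_left + "CA" * 17 + flank_right,
}
for name, sequence in alleles.items():
    print(name, longest_repeat_run(sequence, "CA"), len(sequence))
```

In genotyping the same difference is read from the length of a PCR product rather than from the sequence itself.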

61
Microsatellite allele analysis
62
Microsatellites and disease
  • Microsatellites with 3 bp repeats common in human
    genetic disease, where amplification generates
    expanded protein sequences
  • Causes expansion of regions of poly-glutamine,
    which cause protein precipitation and cell death
  • Expansion observed in families
  • Disease gets worse with additional generations as
    the microsatellite expands
  • Examples: Huntington's disease, myotonic dystrophy

63
Middle Repeat Sequences
  • Can be caused by expansion of classes of larger
    sequence elements
  • 300 - 20 000 bp
  • Scattered through the genome
  • Not caused by unequal crossing over
  • Generally transcribed into RNA and converted to
    DNA and integrated into the genome
  • Some elements can insert and excise, and are
    therefore considered to be mobile elements

64
Alu sequences
  • Largest class of middle repetitive sequences in
    humans
  • About 300 bp long
  • Have polyA tract at one end
  • Are related to 7SL RNA sequence
  • Contain internal RNA Pol III promoter
  • Thus when reverse transcribed and reintegrate
    create a new Pol III promoter
  • Can cause mutations by integration into genes

65
LINEs
  • Genetic structure similar to retrovirus
  • Long terminal repeats (LTR) at each end
  • Internal sequences code for transposase proteins
  • Transcribed by RNA Pol II
  • Reverse transcribed and integrated into genome
  • Cannot be excised
  • Do not appear to be packaged as a virus

66
Transposons
  • Similar to LINE and retroviruses, except that can
    insert and excise from genome
  • Some excisions leave small number of bases
    inserted into genome
  • Others excise cleanly
  • These are true mobile elements
  • Can be the cause of frequently mutating and
    reverting genotypes
  • Responsible for classic maize colour mutations

67
Spontaneous mutations in pears
68
Distribution of repetitive sequences
  • Distribution of middle repetitive sequences in
    animal and plant genomes is completely random
  • Occur in introns, inter-genic regions, highly
    transcribed and transcriptionally inactive
    regions
  • May be transcriptionally inactive or active
  • May be complete copies or partial copies

69
Function of repetitive sequences
  • "Junk DNA" - implies it has no function?
  • Mobile elements generate:
  • Genome rearrangements
  • Duplications
  • Mutations and gene inactivation
  • Gene activation
  • Generation of additional DNA in genome

70
BLAST and repetitive human sequences
  • Repetitive sequences in your sequence will result
    in large numbers of matches to genome data
  • Solution: check the "filter human repeats" check
    box in the search set-up.
  • This will remove repeats based on a database of
    repeat sequences.
  • With the switch on, you can detect the repeats
    and identify them
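
What a repeat filter does to a query can be pictured as masking. This is only a conceptual sketch with an invented one-entry repeat library and exact matching; it is not how the BLAST filter is implemented, and real filters use curated repeat databases and tolerate mismatches.

```python
def mask_repeats(query, repeat_library):
    """Replace every exact occurrence of a library repeat in query with N's."""
    masked = query
    for repeat in repeat_library:
        masked = masked.replace(repeat, "N" * len(repeat))
    return masked


# Hypothetical repeat library and query sequence.
repeat_library = ["GGCCGGGCGCGG"]
query = "ATGGCT" + "GGCCGGGCGCGG" + "GAATAA"
print(mask_repeats(query, repeat_library))
```

Masked positions no longer seed matches, so only the unique part of the query is compared against the genome data.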