DNA Sequencing - PowerPoint PPT Presentation

About This Presentation
Title:

DNA Sequencing

Description:

How we obtain the sequence of nucleotides of a species ... Answer two: it ... old technology by F. Sanger. Whole genome strategies. Physical mapping ... – PowerPoint PPT presentation

Slides: 22
Provided by: root
Category:
Tags: dna | sanger | sequencing


Transcript and Presenter's Notes

1
DNA Sequencing
2
DNA sequencing
  • How we obtain the sequence of nucleotides of a
    species

ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGAC
TACGTTTTA TATATATATACGTCGTCGT ACTGATGACTAGATTACAG
ACTGATTTAGATACCTGAC TGATTTTAAAAAAATATT
3
Which representative of the species?
  • Which human?
  • Answer one
  • Answer two: it doesn't matter
  • Polymorphism rate: number of letter changes
    between two different members of a species
  • Humans: 1/1,000
  • Other organisms have much higher polymorphism
    rates
  • Population size!
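
The polymorphism rate defined above can be sketched as a simple count of letter changes. This is a minimal illustration that assumes the two sequences are already aligned and of equal length; the function name and the two example sequences are hypothetical, not taken from the slides.

```python
def polymorphism_rate(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    # Count positions where the two individuals carry different letters.
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Hypothetical toy sequences: one letter change in 20 positions.
person_1 = "ACGTGACTGAGGACCGTGCG"
person_2 = "ACGTGACTGAGGACCGTGCA"
rate = polymorphism_rate(person_1, person_2)
print(f"polymorphism rate: {rate}")
```

At the human rate of 1/1,000 quoted on the slide, two toy sequences this short would usually be identical; the example exaggerates the rate so that a difference is visible.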

4
Why humans are so similar
Out of Africa
N
  • A small population that interbred reduced the
    genetic variation
  • Out of Africa: 40,000 years ago

Heterozygosity $H$: $H = \frac{4Nu}{1 + 4Nu}$; with $u \sim 10^{-8}$ and $N \sim 10^{4}$, $H \sim 4 \times 10^{-4}$
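
As a quick numeric check of the heterozygosity formula, the sketch below plugs in the slide's values for the mutation rate u and the population size N. The function and variable names (including `theta` for the product 4Nu) are illustrative choices, not part of the original slides.

```python
def heterozygosity(population_size, mutation_rate):
    """H = 4Nu / (1 + 4Nu), with N the population size and u the mutation rate."""
    theta = 4 * population_size * mutation_rate  # shorthand for 4Nu
    return theta / (1 + theta)


# Values from the slide: u ~ 1e-8, N ~ 1e4.
h = heterozygosity(population_size=1e4, mutation_rate=1e-8)
print(f"H = {h:.3e}")
```

Because 4Nu is far smaller than 1 here, the denominator is almost exactly 1 and H is very close to 4Nu itself.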
5
Human population migrations
  • Out of Africa, Replacement
  • Grandma of all humans (Eve): 150,000 yr
  • Ancestor of all mtDNA
  • Grandpa of all humans (Adam): 100,000 yr
  • Ancestor of all Y-chromosomes
  • Multiregional Evolution
  • Fossil records show a continuous change of
    morphological features
  • Proponents of the theory doubt mtDNA and other
    genetic evidence
  • New fossil records bury multiregionalists
  • Nice article in Economist on that
  • http://www.economist.com/science/displaystory.cfm?story_id=9507453

6
DNA Sequencing: Overview
1975
  • Gel electrophoresis
  • Predominant, old technology by F. Sanger
  • Whole genome strategies
  • Physical mapping
  • Walking
  • Shotgun sequencing
  • Computational fragment assembly
  • The future: new sequencing technologies
  • Pyrosequencing, single molecule methods,
  • Assembly techniques
  • Future variants of sequencing
  • Resequencing of humans
  • Microbial and environmental sequencing
  • Cancer genome sequencing

2015
7
DNA Sequencing
  • Goal
  • Find the complete sequence of A, C, G, T's in
    DNA
  • Challenge
  • There is no machine that takes long DNA as an
    input, and gives the complete sequence as output
  • Can only sequence 900 letters at a time

8
DNA Sequencing: vectors
DNA
Shake
DNA fragments
Known location (restriction site)
Vector: circular genome (bacterium, plasmid)


9
Different types of vectors
VECTOR                                  Size of insert (bp)   Notes
Plasmid                                 2,000-10,000          Can control the size
Cosmid                                  40,000
BAC (Bacterial Artificial Chromosome)   70,000-300,000
YAC (Yeast Artificial Chromosome)       > 300,000             Not used much recently
10
DNA Sequencing: gel electrophoresis
  1. Start at primer (restriction site)
  2. Grow DNA chain
  3. Include dideoxynucleotides (modified a, c, g, t)
  4. Stops reaction at all possible points
  5. Separate products by length, using gel
    electrophoresis
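
The five steps above can be mimicked with a toy simulation. This is only a sketch under strong simplifying assumptions: the string is treated as the strand being grown (a real reaction synthesizes the complement of a template), each copy stops with a fixed probability `p_stop` at every position, and the terminating base is assumed to be readable for every fragment. All names and parameter values are hypothetical.

```python
import random


def sanger_fragments(strand, copies=5000, p_stop=0.05, seed=0):
    """Grow many copies of the chain; each stops at a random position.

    Returns the set of (fragment_length, terminating_base) pairs observed.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(copies):
        for position, base in enumerate(strand, start=1):
            # A modified (dideoxy) base was incorporated: the chain stops here.
            if rng.random() < p_stop:
                fragments.add((position, base))
                break
    return fragments


def read_gel(fragments):
    """Separate fragments by length and read the terminating bases in order."""
    return "".join(base for _, base in sorted(fragments))


strand = "ACGTGACTGAGGACCGTG"
called = read_gel(sanger_fragments(strand))
print(called)
```

With enough copies, every possible stopping point is observed at least once, so sorting the fragments by length spells out the strand one letter at a time.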

11
Method to sequence longer regions
genomic segment
cut many times at random (Shotgun)
Get one or two reads from each segment (900 bp per read)
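
Shotgun sampling can be sketched as drawing fixed-length reads from uniformly random start positions. The sketch below is an idealization: it ignores sequencing errors, strand orientation, and the cloning step, and the function name, toy genome, and parameter values are hypothetical.

```python
import random


def shotgun_reads(genome, num_reads, read_len, seed=0):
    """Sample reads of length read_len from uniformly random start positions."""
    if read_len > len(genome):
        raise ValueError("read length exceeds genome length")
    rng = random.Random(seed)
    reads = []
    for _ in range(num_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append(genome[start:start + read_len])
    return reads


# Toy example: short reads from a short segment (real reads are ~900 bp).
genome = "ACGTGACTGAGGACCGTGCGACTGAGACTGACTGGGTCTAGCTAGAC"
reads = shotgun_reads(genome, num_reads=20, read_len=10)
print(reads[:3])
```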
12
Reconstructing the Sequence (Fragment Assembly)
reads
Cover region with high redundancy
Overlap and extend reads to reconstruct the
original genomic region
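
One simple way to illustrate overlap-and-extend is a greedy merge: repeatedly join the pair of reads with the longest suffix-prefix overlap. This is a minimal sketch, not the assembler the slides have in mind; it assumes error-free reads from a single strand and a repeat-free region, and the function names and the `min_len` threshold are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        # Discard reads fully contained in another read.
        reads = [r for r in reads if not any(r != s and r in s for s in reads)]
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left: the remaining pieces are separate contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


region = "ACGTTAGCCATGGATCCTGA"
reads = ["ACGTTAGC", "TAGCCATG", "CATGGATC", "GATCCTGA"]
contigs = greedy_assemble(reads)
print(contigs)
```

Greedy merging works on this toy input because every overlap is unambiguous; repeated sequence is exactly what breaks that assumption.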
13
Definition of Coverage
C
  • Length of genomic segment: $G$
  • Number of reads: $N$
  • Length of each read: $L$
  • Definition: Coverage $C = N L / G$
  • How much coverage is enough?
  • Lander-Waterman model: Prob[not covered bp]
    $= e^{-C}$
  • Assuming uniform distribution of reads, $C = 10$
    results in 1 gapped region / 1,000,000 nucleotides
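
The coverage definition and the Lander-Waterman estimate can be checked numerically. The sketch below uses the slide's symbols G, N, L, and C; the function names are hypothetical, and the expected-gap estimate N e^{-C} is the standard Lander-Waterman approximation with the minimum detectable overlap ignored, which is an assumption beyond what the slide states.

```python
import math


def coverage(genome_len, num_reads, read_len):
    """C = N * L / G."""
    return num_reads * read_len / genome_len


def prob_uncovered(c):
    """Probability that a given base is covered by no read: e^(-C)."""
    return math.exp(-c)


def reads_needed(genome_len, read_len, target_coverage):
    """Smallest N such that N * L / G >= target coverage."""
    return math.ceil(target_coverage * genome_len / read_len)


def expected_gaps(num_reads, c):
    """Approximate number of gaps, N * e^(-C) (minimum overlap ignored)."""
    return num_reads * math.exp(-c)


G, L, C = 1_000_000, 900, 10
N = reads_needed(G, L, C)
print(N, coverage(G, N, L), prob_uncovered(C), expected_gaps(N, C))
```

For a 1,000,000-nucleotide segment and 900-letter reads at C = 10, the estimated number of gaps comes out below one, which agrees in order of magnitude with the slide's figure of about one gapped region per 1,000,000 nucleotides.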

14
Repeats
  • Bacterial genomes: 5%
  • Mammals: 50%
  • Repeat types
  • Low-complexity DNA (e.g. ATATATATACATA)
  • Microsatellite repeats: $(a_1 \ldots a_k)^N$ where $k \sim 3$-$6$
    (e.g. CAGCAGTAGCAGCACCAG)
  • Transposons
  • SINE (Short Interspersed Nuclear Elements),
    e.g., ALU: 300-long, $10^6$ copies
  • LINE (Long Interspersed Nuclear Elements):
    4000-long, 200,000 copies
  • LTR retroposons (Long Terminal Repeats (700 bp)
    at each end): cousins of HIV
  • Gene families: genes duplicate, then diverge
    (paralogs)
  • Recent duplications: 100,000-long, very similar
    copies
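
Of the repeat types listed above, microsatellites are the easiest to detect mechanically: a short unit of k letters repeated N times in a row. The sketch below finds only exact tandem copies using a regular expression; imperfect repeats such as the slide's CAGCAGTAGCAGCACCAG example would need approximate matching. The function name, the `min_copies` threshold, and the test sequence are hypothetical.

```python
import re


def find_microsatellites(seq, k_min=3, k_max=6, min_copies=3):
    """Find exact tandem repeats of a unit of k_min..k_max letters.

    Returns a list of (start, unit, copies) tuples.
    """
    # A lazily matched unit followed by at least (min_copies - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (k_min, k_max, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


hits = find_microsatellites("TTGACAGCAGCAGCAGTTAC")
print(hits)
```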

15
Sequencing and Fragment Assembly
$3 \times 10^9$ nucleotides
50% of human DNA is composed of repeats
Error! Glued together two distant regions
16
What can we do about repeats?
  • Two main approaches
  • Cluster the reads
  • Link the reads

19
Sequencing and Fragment Assembly
$3 \times 10^9$ nucleotides
ARB, CRD or ARD, CRB ?
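
The ambiguity "ARB, CRD or ARD, CRB?" can be made concrete under one reading of the slide: A, B, C, D are unique flanking regions and R is a repeat present in two copies. The sketch below idealizes reads as every substring of a fixed length and compares the two candidate layouts. All sequences and names are hypothetical toy values.

```python
from collections import Counter


def read_spectrum(sequences, read_len):
    """Multiset of every substring of length read_len across the sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(
            seq[i:i + read_len] for i in range(len(seq) - read_len + 1)
        )
    return counts


# Hypothetical unique flanks A, B, C, D and a shared repeat R.
A, B, C, D = "TTACA", "CCTGA", "GGCAT", "AAGTC"
R = "ACGTTGCAAC"
layout_1 = [A + R + B, C + R + D]
layout_2 = [A + R + D, C + R + B]

# Reads too short to span the repeat with a flanking letter on both sides.
short_len = len(R) + 1
# Reads just long enough to span the repeat plus one letter on each side.
spanning_len = len(R) + 2

indistinguishable = (
    read_spectrum(layout_1, short_len) == read_spectrum(layout_2, short_len)
)
distinguishable = (
    read_spectrum(layout_1, spanning_len) != read_spectrum(layout_2, spanning_len)
)
print(indistinguishable, distinguishable)
```

When no read reaches across the whole repeat, both layouts produce exactly the same reads, so the reads alone cannot decide between them. Only information that spans the repeat resolves the choice.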
20
Sequencing and Fragment Assembly
$3 \times 10^9$ nucleotides
21
Strategies for whole-genome sequencing
  1. Hierarchical: clone-by-clone
    • Break genome into many long pieces
    • Map each long piece onto the genome
    • Sequence each piece with shotgun
    • Example: Yeast, Worm, Human, Rat
  2. Online version of (1): walking
    • Break genome into many long pieces
    • Start sequencing each piece with shotgun
    • Construct map as you go
    • Example: Rice genome
  3. Whole genome shotgun
    • One large shotgun pass on the whole genome
    • Example: Drosophila, Human (Celera),