Title: An Introduction to DNA Forensics: The Basics
1An Introduction to DNA Forensics: The Basics
Michael D. Kane, Ph.D.
Professor of Biomedical Informatics, Department of Computer Information Technology
Lead Genomic Scientist, Bindley Bioscience Center
Purdue University
Contact: (office) 765-494-2564, mdkane_at_purdue.edu
2DNA is Information Storage
3Computer-Lingo
Zipped Files
Decompression
Executable Files
4DNA is Double Stranded
One strand is the coding strand and the other strand is there to
stabilize the DNA sequence when not in use.
Double-stranded DNA is very durable in our environment.
5CAGGACCATGGAACTCAGCGTCCTCCTCTTCCTTGCACTCCTCACAGGAC
TCTTGCTACTCCTGGTTCAGCGCCACCCTAACACCCATGACCGCCTCCCA
CCAGGGCCCCGCCCTCTGCCCCTTTTGGGAAACCTTCTGCAGATGGATAG
AAGAGGCCTACTCAAATCCTTTCTGAGGTTCCGAGAGAAATATGGGGACG
TCTTCACGGTACACCTGGGACCGAGGCCCGTGGTCATGCTGTGTGGAGTA
GAGGCCATACGGGAGGCCCTTGTGGACAAGGCTGAGGCCTTCTCTGGCCG
GGGAAAAATCGCCATGGTCGACCCATTCTTCCGGGGATATGGTGTGATCT
TTGCCAATGGAAACCGCTGGAAGGTGCTTCGGCGATTCTCTGTGACCACT
ATGAGGGACTTCGGGATGGGAAAGCGGAGTGTGGAGGAGCGGATTCAGGA
GGAGGCTCAGTGTCTGATAGAGGAGCTTCGGAAATCCAAGGGGGCCCTCA
TGGACCCCACCTTCCTCTTCCAGTCCATTACCGCCAACATCATCTGCTCC
ATCGTCTTTGGAAAACGATTCCACTACCAAGATCAAGAGTTCCTGAAGAT
GCTGAACTTGTTCTACCAGACTTTTTCACTCATCAGCTCTGTATTCGGCC
AGCTGTTTGAGCTCTTCTCTGGCTTCTTGAAATACTTTCCTGGGGCACAC
AGGCAAGTTTACAAAAACCTGCAGGAAATCAATGCTTACATTGGCCACAG
TGTGGAGAAGCACCGTGAAACCCTGGACCCCAGCGCCCCCAAGGACCTCA
TCGACACCTACCTGCTCCACATGGAAAAAGAGAAATCCAACGCACACAGT
GAATTCAGCCACCAGAACCTCAACCTCAACACGCTCTCGCTCTTCTTTGC
TGGCACTGAGACCACCAGCACCACTCTCCGCTACGGCTTCCTGCTCATGC
TCAAATACCCTCATGTTGCAGAGAGAGTCTACAGGGAGATTGAACAGGTG
ATTGGCCCACATCGCCCTCCAGAGCTTCATGACCGAGCCAAAATGCCATA
CACAGAGGCAGTCATCTATGAGATTCAGAGATTTTCCGACCTTCTCCCCA
TGGGTGTGCCCCACATTGTCACCCAACACACCAGCTTCCGAGGGTACATC
ATCCCCAAGGACACAGAAGTATTTCTCATCCTGAGCACTGCTCTCCATGA
CCCACACTA
7Simple Summary of Human Genomics
- There are 3 billion base-pairs (or bytes) of information in the human genome.
- Only 2% of the human genome is made up of genes; the remaining 98% is somewhat unique to each individual, and important in deriving DNA-based evidence.
- A gene encodes a protein. Proteins are the functional units of living systems (hair, cotton, skin, venoms, pollens, foods, etc.).
- Only about 0.1% of our genome is unique to us individually (as opposed to race, gender or familial inheritance), or about 3 million base pairs of DNA.
Genes
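The proportions in this summary can be checked with quick arithmetic. The sketch below takes the slide's round figures as given (3 billion base pairs, 2% genes, 0.1% individually unique); the variable names are chosen only for this illustration.

```python
# Round figures taken from the summary above; names are illustrative only.
GENOME_BP = 3_000_000_000   # base pairs in the human genome
GENE_FRACTION = 0.02        # share of the genome made up of genes
UNIQUE_FRACTION = 0.001     # share that is unique to one individual

gene_bp = round(GENOME_BP * GENE_FRACTION)
non_gene_bp = GENOME_BP - gene_bp
unique_bp = round(GENOME_BP * UNIQUE_FRACTION)

print(f"genes:               {gene_bp:,} bp")
print(f"non-gene DNA:        {non_gene_bp:,} bp")
print(f"individually unique: {unique_bp:,} bp")
```

The last figure is the "about 3 million base pairs" quoted in the final bullet.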
8Simple Summary of Molecular Biology
- DNA can be isolated from different sample types.
- Sections of DNA can be amplified 1-billion fold in a few hours, which enriches certain sections for subsequent analysis (PCR amplification).
- DNA has many areas of repeated sequence (e.g. catg-catg-catg-catg-catg).
- DNA can be cut at specific sequence points (e.g. ACTG).
This is the basis for DNA-based forensic evidence: STR (new method), RFLP (old method), and mtDNA.
9STR Fingerprinting Method
STRs, or short tandem repeat sequences, exist
in the non-coding regions of our DNA (i.e. not in
genes), and vary between individuals. These
regions can be amplified, and the length of
each of the amplified STR sequences can be
determined. In criminal investigations, there
are 13 regions (aka loci) in the human genome
that are amplified and analyzed.
SIMPLE EXAMPLE (using only 4 STR amplified
regions)
- Suspect 1.
- jump-jump-jump-jump-jump (20)
- run-run-run-run-run (15)
- skip-skip-skip-skip-skip-skip (24)
- hop-hop-hop-hop (12)
- ... (13 regions in total in a real STR profile)
- Suspect 2.
- jump-jump-jump-jump (16)
- run-run-run-run-run-run (18)
- skip-skip-skip-skip-skip-skip-skip (28)
- hop-hop-hop-hop (12)
- ... (13 regions in total in a real STR profile)
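The numbers in parentheses above are simply the repeat count multiplied by the length of the repeated unit. As a rough sketch of how a repeat count could be read from a sequence, the hypothetical helper below finds the longest back-to-back run of a unit; the function name and the flanking bases in the example are assumptions made for illustration, not part of any forensic software.

```python
def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Count consecutive copies of the unit beginning at this position.
        while sequence.startswith(unit, pos):
            run += 1
            pos += len(unit)
        best = max(best, run)
    return best


# The word example from the slide: five copies of "jump" is 20 letters.
repeats = longest_tandem_run("jump" * 5, "jump")
print(repeats, repeats * len("jump"))

# A DNA-style example with made-up flanking bases around five CATG repeats.
dna = "GGA" + "CATG" * 5 + "TTC"
print(longest_tandem_run(dna, "CATG"))
```

Two people with different repeat counts at the same region produce amplified pieces of different lengths, which is what the method measures.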
10Determining the size of amplified DNA
11Determining the size of amplified DNA
DNA in a salt solution
Power source
-
DNA has a negative (-) charge, and is attracted to the
positive (+) electrode.
12Determining the size of amplified DNA
(Continued)
-
Amplified DNA sample is placed in gel
gel
shorter DNA
longer DNA
The gel limits the rate at which the DNA moves, and
therefore the shorter pieces of amplified DNA
move more quickly through the gel.
-
13Determining the size of amplified DNA
(Continued)
EXAMPLES of some DNA gels
14STR Fingerprinting Method
Back to our SIMPLE EXAMPLE (using only 4 STR
amplified regions)
- Suspect 1
- jump-jump-jump-jump-jump (20)
- run-run-run-run-run (15)
- skip-skip-skip-skip-skip-skip (24)
- hop-hop-hop-hop (12)
- Suspect 2
- jump-jump-jump-jump (16)
- run-run-run-run-run-run (18)
- skip-skip-skip-skip-skip-skip-skip (28)
- hop-hop-hop-hop (12)
Unknown Sample
Suspect 2
Suspect 1
Note: The word strings above (e.g. jump-jump) are
intended to present the example clearly. In
reality, these would all be DNA sequences (e.g.
CATG-CATG-CATG).
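Comparing an unknown sample with each suspect amounts to checking whether the lengths agree at every amplified region. The sketch below uses the two suspect profiles from the example; the unknown sample's values are invented for illustration, since the slide only shows them as a gel image.

```python
# Amplified lengths from the SIMPLE EXAMPLE above.
suspects = {
    "Suspect 1": {"jump": 20, "run": 15, "skip": 24, "hop": 12},
    "Suspect 2": {"jump": 16, "run": 18, "skip": 28, "hop": 12},
}

# Hypothetical unknown sample, invented for this sketch.
unknown = {"jump": 16, "run": 18, "skip": 28, "hop": 12}


def matching_suspects(sample, profiles):
    """Return the names whose profile agrees with the sample at every region."""
    return [name for name, profile in profiles.items() if profile == sample]


def agreeing_regions(sample, profile):
    """Return the regions where the sample and one profile have the same length."""
    return sorted(region for region in sample if sample[region] == profile.get(region))


print(matching_suspects(unknown, suspects))
print(agreeing_regions(unknown, suspects["Suspect 1"]))
```

Both suspects share the same "hop" length, so that region alone could not tell them apart; the comparison only becomes informative across several regions.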
15STR Fingerprinting Method
Statistical Basis and Information Management
By testing nine of these STR sites on different
chromosomes in humans, you get a one-in-a-billion
unique signature. Nine sites are used as the
standard by the military for paternity
matters. Thirteen sites are commonly used for
forensic tests and for the CODIS database,
although this method is not sufficient for
distinguishing between identical twins.
The Combined DNA Index System (CODIS) is a DNA
database funded by the United States Federal
Bureau of Investigation (FBI). It is a computer
system that stores DNA profiles created by
federal, state, and local crime laboratories in
the United States, with the ability to search the
database to assist in the identification of
suspects in crimes. Although the DNA
Identification Act was passed in 1994, CODIS did
not become fully operational until 1998.
16DNA Amplification and Specificity
PCR amplification involves enriching a specific
section of DNA for analysis. It is considered
specific since two different pieces of
synthetic DNA (i.e. primers) are used to
facilitate the synthesis of DNA in a test tube.
These primers are specific to a known section
of DNA, and if the reaction is done correctly the
only DNA amplified is the intended section of DNA
(e.g. amplifying an STR section apart from the rest of
the human DNA, as well as from any bacterial, viral,
or plant DNA).
In human samples, the presence of
more than one contributing source of DNA (i.e. a
sample that has been contaminated by other people
at a crime scene or working in a molecular
forensics lab) will be detected through the
presence of more than 2 results at a locus (3, 4, or more),
rather than 2 (remember we each have 2 copies of our genome,
one from mom and one from dad). So in STR
analysis, each locus actually has
two results (alleles), or one if mom and dad each provided
your genome with the same size of a given STR.
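The rule of thumb above (one person contributes at most two results per region) can be sketched as a simple check. The region names reuse the toy words from the earlier example and the lengths are invented for illustration; the function name is hypothetical.

```python
def regions_suggesting_mixture(sample):
    """Return the regions showing more than two distinct results.

    One person has two copies of the genome, so a single-source sample
    shows at most two distinct lengths per region.
    """
    return sorted(region for region, lengths in sample.items()
                  if len(set(lengths)) > 2)


# Hypothetical results, invented for this sketch.
sample = {
    "jump": [16, 20],          # two different lengths (one from each parent)
    "run": [15, 15],           # same length from both parents: one result
    "skip": [24, 28, 32, 36],  # four results: more than one contributor
}

flagged = regions_suggesting_mixture(sample)
print(flagged if flagged else "consistent with a single source")
```

An empty result does not prove a single source, since two contributors can happen to share lengths at a given region.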
17DNA Amplification and Specificity
The 13 STR loci used by the FBI (and other law
enforcement) are: CSF1PO, FGA, TH01, TPOX, vWA, D3S1358,
D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11.
Example Results
TAKEN FROM Bruce Budowle, Genotype Profiles for
Five Population Groups at the Short Tandem Repeat
Loci D2S1338 and D19S433, Forensic Science
Communications July 2001.
18STR Fingerprinting Method
Confounding Issues
- DNA from crime scene evidence can be of very small quantity, poorly preserved, or highly degraded, so only a partial DNA profile can be obtained. When fewer than 13 STR loci are examined, the overall genotype frequency is higher, making the probability of a random match higher as well.
- As an example, if a suspect-sample match was made using only 4 of the STR loci (as in the example earlier), the probability of a match (true or false positive) is about 1 in 330.
- If an individual happens to have STR alleles that are very common in his or her ethnic group, the genotype frequency can also be quite high, even when all of the core 13 STR loci are examined.
- Crime scene samples sometimes contain DNA from several different sources, which can make identifying the source(s) of the DNA extremely difficult.
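The first point, that fewer loci means a higher chance of a random match, can be illustrated by multiplying per-locus genotype frequencies together. This is only a sketch: it assumes the loci are independent, and the frequencies below are invented for illustration rather than taken from any population data.

```python
import math


def random_match_probability(genotype_frequencies):
    """Multiply per-locus genotype frequencies (assumes independent loci)."""
    return math.prod(genotype_frequencies)


# Hypothetical genotype frequencies for 13 loci, invented for this sketch.
HYPOTHETICAL_FREQUENCIES = [0.20, 0.25, 0.15, 0.30, 0.10, 0.22, 0.18,
                            0.27, 0.12, 0.24, 0.16, 0.21, 0.19]

partial = random_match_probability(HYPOTHETICAL_FREQUENCIES[:4])  # 4 loci only
full = random_match_probability(HYPOTHETICAL_FREQUENCIES)         # all 13 loci

print(f"4 loci:  about 1 in {1 / partial:,.0f}")
print(f"13 loci: about 1 in {1 / full:,.0f}")
```

Each additional locus multiplies in another value below 1, so a partial profile always leaves a larger random-match probability than the full profile built from the same loci.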
19PCR
Concept: Amplification of a piece of DNA for analysis.
Driving phenomena of PCR: Heating and Cooling
Heating: Double-stranded DNA comes apart when heated to near boiling. This is also called denaturing or melting.
Cooling: Complementary DNA comes together when cooled. This is also called renaturing, annealing or hybridizing.
Double-Stranded DNA
COOLING
HEATING
Single-Stranded DNA
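"Complementary" here means that A pairs with T and C pairs with G, with the two strands running in opposite directions. A minimal sketch of that pairing rule follows; the function names are chosen for this illustration, and the check requires a perfect match, which is a simplification of real annealing.

```python
# Standard base pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the partner strand, read in its own 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def can_anneal(strand_a, strand_b):
    """True if the two single strands are perfectly complementary."""
    return reverse_complement(strand_a) == strand_b.upper()


print(reverse_complement("AACG"))
print(can_anneal("AACG", "CGTT"))
print(can_anneal("AACG", "AACG"))
```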
20Double-Stranded DNA
HEATING
3' 5'
5' 3'
PCR Primers
Single-Stranded DNA
21Most PCR applications use 30 cycles (2^30 ≈ 1.07
billion), representing an amplification of about
1 billion fold.
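The figure follows from doubling the number of copies once per cycle. The sketch below assumes ideal doubling in every cycle, which is a simplification; the function name is chosen for this illustration.

```python
def copies_after(cycles, starting_copies=1):
    """Copies present after `cycles` rounds of ideal doubling."""
    return starting_copies * 2 ** cycles


for cycles in (10, 20, 30):
    print(f"{cycles} cycles: {copies_after(cycles):,} copies")
```

Thirty doublings of a single starting copy give 1,073,741,824 copies, the "about 1 billion fold" quoted above.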
22OTHER DNA Methods
In Restriction Fragment Length Polymorphism
(RFLP), a large area of highly variable DNA is
amplified (PCR), then cut with a specific
restriction enzyme. A restriction enzyme cuts DNA
at a specific site (e.g. NlaIII cuts at CATG). Once
the amplified DNA is cut (or digested), the
resulting DNA fragments are separated on a gel
(similar to what we discussed earlier). Since
each person would have a unique subset of DNA
fragments in this method, their gel pattern would
be unique. STR has largely replaced RFLP since
the results of STR can be much more easily
described categorically, and therefore stored and searched
in a database.
mtDNA is mitochondrial DNA, which
is maternally inherited (i.e. you only get this
from your biological mother), and has two highly
variable sections of DNA. DNA amplification and
sequencing of these regions can be used to gain a
positive match, but will not exclude people of the
same maternal line (i.e. you, your siblings,
your mother and your maternal grandmother all have the
same sequence). mtDNA is advantageous
when STR (or other methods based on nuclear DNA)
are limited, such as with samples of hair, bone, or
teeth. Similarly, if the sample is highly
degraded, mtDNA may be preferred since there
are hundreds of mitochondria per cell, yet only one
nucleus.
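The RFLP idea described above, cutting at a fixed site and comparing fragment lengths, can be sketched as follows. The two sequences are invented for illustration, and the exact cut position (immediately after each CATG) is an assumption made to keep the example simple.

```python
def digest(sequence, site="CATG"):
    """Cut `sequence` immediately after every occurrence of `site`.

    The cut position relative to the site is an assumption for this sketch.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + len(site)
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, cut)
    if start < len(sequence):
        fragments.append(sequence[start:])  # whatever is left after the last cut
    return fragments


# Hypothetical sequences: person B has lost the second CATG site (A changed to T).
person_a = "AACATGTTTTCATGGG"
person_b = "AACATGTTTTCTTGGG"

print([len(f) for f in digest(person_a)])
print([len(f) for f in digest(person_b)])
```

A single base difference removes a cut site, so the two samples produce different sets of fragment lengths and therefore different gel patterns.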