Perlegen At-A-Glance - PowerPoint PPT Presentation

About This Presentation
Title:

Perlegen At-A-Glance

Description:

Reading Human Genomic Sequence By Using Affymetrix DNA Chips. On a glass chip are synthesized 62,000 consecutive bases of known human genomic ... – PowerPoint PPT presentation

Slides: 54
Provided by: soeU
Category:
Tags: chip | glance | perlegen


Transcript and Presenter's Notes

Title: Perlegen At-A-Glance


1
Perlegen At-A-Glance
  • San Francisco Bay Area
  • Spun off from Affymetrix, Inc. in March
    2001
  • 95 employees
  • Approximately half in genetics/biology and half
    in bioinformatics
  • Privately-held

2
Using the Human Genome
3
95% of one human genome is now publicly available
One copy of the human genome consists of 3
billion bases AGTCCTAGCCTGTGATATAGGGCCCTAGATCA.
4
One copy of the human genome cost $100 million to
obtain
Why were we willing to spend so much money?
5
Variations in DNA sequence affect many aspects of
our lives
Inherited traits or phenotypes
6
Any two humans share 99.9% of the same DNA sequence
7
Traits are influenced to different degrees by
genetics and environment
Environmental contribution
Genetic contribution
8
Most common traits are believed to be about 50-50
Diabetes
Skin color
Rheumatoid arthritis
Obesity
Genetics
Environment
Height
Schizophrenia
Osteoporosis
Heart failure
9
With knowledge of the genetic component of a
trait
  • Diagnostic
  • Determine how a patient will respond to a
    particular drug treatment
  • Targets for drug development
  • More effective consumer products
  • Evaluate the role of lifestyle and environment on
    the trait

10
DNA is double-stranded, with the two strands
connected by very specific pairing of the four
bases A, G, T, C
DNA can be unwound to single-stranded form and
then wound again to double-stranded form; this
rewinding, which relies on the specificity of
base-pairing, is called hybridization
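
The pairing rule behind hybridization can be sketched in a few lines of Python. This is an illustration only, not part of the original slides: the names `reverse_complement` and `can_hybridize` are hypothetical, and the sketch assumes perfect Watson-Crick pairing (A with T, G with C) with no mismatches tolerated.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the strand that pairs with `strand`, read in the opposite direction."""
    return "".join(PAIR[base] for base in reversed(strand))


def can_hybridize(probe: str, target: str) -> bool:
    """True only if `target` is the exact reverse complement of `probe`."""
    return target == reverse_complement(probe)


probe = "AGCCTGTCACT"
print(reverse_complement(probe))            # strand that would rewind with the probe
print(can_hybridize(probe, "AGTGATAGGCT"))  # one base differs, so no perfect match
```

Under this strict rule, a single-base difference between probe and target is enough to prevent a perfect match.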
11
Human DNA variation results from errors in DNA
replication
12
DNA variations come in different forms
  • Single nucleotide polymorphism (SNP): AGCCTGTCACT → AGCCTATCACT
  • Deletion: AGCCTGTCACT → AGCCTTCACT
  • Insertion: AGCCTGTCACT → AGCCTGGTCACT
  • Variable number tandem repeat (VNTR): CAGCAGCAG → CAGCAGCAGCAGCAG

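
The forms of variation on this slide can be told apart by comparing a reference sequence with a variant sequence. The sketch below is a simplified illustration under stated assumptions: `classify_variant` is a hypothetical helper, it handles only a single contiguous change, and it reports a VNTR expansion as an insertion because it does not look for repeat units.

```python
def classify_variant(reference: str, variant: str) -> str:
    """Classify a single contiguous difference between two short sequences."""
    if reference == variant:
        return "identical"
    if len(reference) == len(variant):
        mismatches = sum(a != b for a, b in zip(reference, variant))
        return "SNP" if mismatches == 1 else "other"

    shorter, longer = sorted((reference, variant), key=len)
    # Count the bases shared at the start of both sequences.
    prefix = 0
    while prefix < len(shorter) and shorter[prefix] == longer[prefix]:
        prefix += 1
    # Count the bases shared at the end, without reusing prefix bases.
    suffix = 0
    while (suffix < len(shorter) - prefix
           and shorter[-1 - suffix] == longer[-1 - suffix]):
        suffix += 1
    # The shorter sequence must be fully explained by prefix plus suffix.
    if prefix + suffix < len(shorter):
        return "other"
    return "insertion" if len(variant) > len(reference) else "deletion"


for ref, var in [("AGCCTGTCACT", "AGCCTATCACT"),
                 ("AGCCTGTCACT", "AGCCTTCACT"),
                 ("AGCCTGTCACT", "AGCCTGGTCACT")]:
    print(ref, var, classify_variant(ref, var))
```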
13
The genetic contribution to a trait may be due to
variation in one gene: a Mendelian trait
14
Before the Human Genome Project, genes
responsible for Mendelian traits were the only
genes we could find
but it still took a decade or more to find one
of these genes
15
Once the results of the Human Genome Project
began to emerge, the number of Mendelian trait
genes discovered increased exponentially and the
time to discovery decreased
16
There are currently 8309 genes whose variants are
associated with a disorder in OMIM
  • Cystic Fibrosis
  • Huntington's Disease
  • Familial Breast Cancer
  • Severe Combined Immunodeficiency Disorder

knowledge which is used for diagnostics and
preventative therapies, drug development, and
gene therapy
17
But Mendelian traits are the minority and the
genetic variants responsible for Mendelian
disorders are rare in the general population
How many genetic variants have we found that are
responsible for traits and disorders that affect
millions of people?
Very few
18
The vast majority of traits are not caused by
variation in a single gene and are called complex
traits
  • Probably the result of 10-30 genetic changes
    spread across the genome
  • Any single genetic variant may be responsible for
    only a small contribution to the trait

19
You may not need to have all of the possible
genetic changes to get the disease
An example where 10 genes are involved in a
disease
Variants (green) in any 4 of the 10 genes cause
disease.
20
What does all that mean?
  • The genetic variants responsible for common
    disease are themselves common in the general
    population
  • These genetic variants are found in both sick and
    healthy people

This makes associating these variants with the
disease extremely difficult and expensive
21
Genetic Association Study
If a DNA variant is associated with a trait of
interest, affecteds will have a different
frequency of that variant than unaffecteds
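
One common way to test whether two variant frequencies really differ is a chi-square test on a 2×2 table of allele counts. The sketch below is illustrative only: the slides do not say which statistic Perlegen used, the function names are hypothetical, and the counts are made-up numbers chosen for the example.

```python
def allele_frequency(variant_count: int, total_alleles: int) -> float:
    """Fraction of sampled alleles that carry the variant."""
    return variant_count / total_alleles


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the table [[a, b], [c, d]].

    Rows are affecteds / unaffecteds; columns are variant / other allele.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts: 1,000 alleles sampled in each group.
affected_variant, affected_other = 300, 700
unaffected_variant, unaffected_other = 250, 750

print(allele_frequency(affected_variant, 1000))      # frequency in affecteds
print(allele_frequency(unaffected_variant, 1000))    # frequency in unaffecteds
statistic = chi_square_2x2(affected_variant, affected_other,
                           unaffected_variant, unaffected_other)
# With 1 degree of freedom, a statistic above 3.841 corresponds to p < 0.05.
print(statistic > 3.841)
```

A real study that tests many SNPs at once would need a much stricter threshold than 0.05 to account for multiple testing.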
22
Knowing, with statistical certainty, that a
genetic variant with a small effect is associated
with a trait requires looking at the DNA of large
numbers of people

1,000 people
23
At $100 million per genome, we certainly cannot
sequence the genomes of a thousand people for
each trait whose genes we want to find
we just need to look at the variants in the
genomes of the 1,000 people
24
Single Nucleotide Polymorphisms (SNPs)
  • SNPs are a frequent form of DNA variation and are
    scattered randomly across the genome
  • Each SNP is characterized by only two bases

25
Genotyping calls the two variants that each
person carries at one base position in the genome
But to genotype, you need to know the two base
variants and genome position of the SNPs
26
How many SNPs do we need to find across the
genome and genotype in the 1000 people to find
the genes involved in complex traits?
The average cost of a single SNP genotype for one
person is $0.50, or $500 for 1,000 people
27
There are 3 million SNPs between two people
At $500 per SNP for 1,000 people, that is $1.5 billion!
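
The arithmetic behind that figure can be checked with a short sketch. The function name `genotyping_cost` is hypothetical; the inputs (a cost of 0.50 dollars per genotype, 1,000 people, 3 million SNPs) are the numbers quoted on the slides.

```python
def genotyping_cost(num_snps: int, num_people: int,
                    cost_per_genotype: float = 0.50) -> float:
    """Total cost in dollars of genotyping every SNP in every person."""
    return num_snps * num_people * cost_per_genotype


per_snp = genotyping_cost(1, 1_000)               # one SNP in 1,000 people
whole_genome = genotyping_cost(3_000_000, 1_000)  # 3 million SNPs in 1,000 people
print(f"${per_snp:,.0f} per SNP, ${whole_genome:,.0f} in total")
```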
28
Look only at SNPs in the known functional
sequences of the human genome because only
functional regions are likely to be associated
with a trait
Minimize the cost of finding the SNPs and
genotyping the SNPs in a Genetic Association Study
29
Look at a dense set of common SNPs across the whole
genome
  • Not all functional sequences have been
    discovered
  • The important changes in DNA may not lie in
    known functional sequences (which comprise less
    than 3% of the genome)
  • Even if all important changes are in known
    functional sequences, which do you select for
    research? (You need to have the correct
    hypothesis up-front)

30
Discover all the common SNPs by looking at the
sequence of 25 copies of the genome from around
the world
but it takes 1 year to sequence one mammalian
genome, so that would take 25 years, not to
mention the cost!
31
Perlegen came up with a faster and cheaper way
than sequencing to find the common SNPs, possible
only because one copy of the human genome was
already known and because of technology
improvements
32
Reading Human Genomic Sequence By Using
Affymetrix DNA Chips
33
Take another copy of the human genome, label it
with a fluorophore, and hybridize it to the chip
34
Detection of DNA Variation By Using DNA Chips
35
How many chips do we have to process to discover
SNPs from the 25 genomes?
600,000 chips. At 200 chips processed per day, it
would take 8.4 years!
36
What Perlegen was able to do successfully, that
had never been done before
Cover 15 million bases of genomic DNA on one
wafer!
5000 wafers to find the SNPs in 25 genomes
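
The wafer count follows from figures given elsewhere in the deck. This back-of-the-envelope sketch assumes each wafer reads 15 million bases of one genome copy exactly once and uses the 3-billion-base genome size from slide 3; the function name `wafers_needed` is hypothetical.

```python
GENOME_BASES = 3_000_000_000   # bases in one copy of the human genome (slide 3)
GENOME_COPIES = 25             # copies screened for SNP discovery
BASES_PER_WAFER = 15_000_000   # genomic bases covered by one wafer


def wafers_needed(genome_bases: int, copies: int, bases_per_wafer: int) -> int:
    """Number of wafers needed to cover every base of every genome copy once."""
    total_bases = genome_bases * copies
    return -(-total_bases // bases_per_wafer)  # ceiling division


print(wafers_needed(GENOME_BASES, GENOME_COPIES, BASES_PER_WAFER))
```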
37
Perlegen's Technological Advantage

38
Human Whole-Genome High-Density Oligonucleotide
Arrays
39
Perlegen finished SNP discovery across the entire
human genome for 25 copies of the genome in under
2 years, in August 2002
  • 1,717,015 common SNPs discovered and confirmed
  • Had all the assays developed and working to
    genotype all the SNPs

Still, that would require $850 million for
genotyping 1,000 people.
But we discovered something else
40
SNPs
ATTGCAATCCGTGG...ATCGAGCCATACGATTGCACGCCG
ATTGCAAGCCGTGG...ATCTAGCCATACGATTGCAAGCCG
ATTGCAAGCCGTGG...ATCTAGCCATACGATTGCAAGCCG
ATTGCAATCCGTGG...ATCGAGCCATACGATTGCACGCCG
ATTGCAAGCCGTGG...ATCTAGCCATACGATTGCAAGCCG
41
SNP Space
ATTGCAATCCGTGG...ATCGAGCCATACGATTGCACGCCG
ATTGCAAGCCGTGG...ATCTAGCCATACGATTGCAAGCCG
ATTGCAAGCCGTGG...ATCTAGCCATACGATTGCAAGCCG
ATTGCAATCCGTGG...ATCGAGCCATACGATTGCACGCCG
ATTGCAAGCCGTGG...ATCTAGCCATACGATTGCAAGCCG
42
There's something amazing about SNPs...
SNPs occur in blocks!
43
Haplotype Pattern
ATTGCAATCCGTGG...ATCGAGCCATACGATTGCACGCCG
ATTGCAAGCCGTGG...ATCTAGCCATACGATTGCAAGCCG
ATTGCAAGCCGTGG...ATCTAGCCATACGATTGCAAGCCG
ATTGCAATCCGTGG...ATCGAGCCATACGATTGCACGCCG
ATTGCAAGCCGTGG...ATCTAGCCATACGATTGCAAGCCG
44
The number of haplotype patterns is limited
Possible patterns: 26 SNPs X 2 bases = 2^26
Observed patterns: 7
The majority of the patterns fall into only 4
classes, which can be distinguished from each
other by only 2 SNPs
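
The idea that a few SNPs are enough to tell haplotype classes apart can be sketched as a small brute-force search. This is an illustration under stated assumptions, not Perlegen's method: the function names are hypothetical, each haplotype is written as a string of 0s and 1s (one character per SNP, one symbol per base variant), and the four example classes are invented rather than taken from the slide's figure.

```python
from itertools import combinations


def possible_patterns(num_snps: int) -> int:
    """Number of patterns if each SNP independently takes one of two bases."""
    return 2 ** num_snps


def minimal_tag_snps(haplotypes):
    """Smallest set of SNP positions whose values differ between all haplotypes.

    Returns a tuple of positions, or None if the haplotypes are not all distinct.
    """
    num_snps = len(haplotypes[0])
    for size in range(1, num_snps + 1):
        for positions in combinations(range(num_snps), size):
            signatures = {tuple(h[p] for p in positions) for h in haplotypes}
            if len(signatures) == len(haplotypes):
                return positions
    return None


print(possible_patterns(26))

# Four hypothetical haplotype classes over six SNPs.
classes = ["000000", "001101", "110010", "111111"]
print(minimal_tag_snps(classes))
```

The exhaustive search is fine for a block of a few dozen SNPs but grows quickly with block size; it is meant to show the principle, not to scale to the whole genome.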
45
A SNP-Haplotype Map of the Human Genome

2.3 billion bases of genomic DNA sequence
is covered in 175,309 haplotype blocks
13,000 bases is the average haplotype block size
6.5 SNPs is the average number of SNPs per
haplotype block
210,937 SNPs uniquely define haplotypes
representing the pattern of DNA variation spanning
the human genome
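
The summary figures on this slide can be cross-checked against each other. The sketch below uses only the slide's own numbers; the variable names are hypothetical, and "2.3 billion" is treated as exact even though it is clearly rounded.

```python
COVERED_BASES = 2_300_000_000    # bases covered by haplotype blocks
HAPLOTYPE_BLOCKS = 175_309       # number of haplotype blocks
DEFINING_SNPS = 210_937          # SNPs that uniquely define the haplotypes

# Average block size implied by the first two figures.
average_block_size = COVERED_BASES / HAPLOTYPE_BLOCKS

# Average number of haplotype-defining SNPs needed per block.
defining_snps_per_block = DEFINING_SNPS / HAPLOTYPE_BLOCKS

print(f"{average_block_size:,.0f} bases per block")
print(f"{defining_snps_per_block:.2f} defining SNPs per block")
```

The implied block size is consistent with the rounded 13,000-base average quoted on the slide, and on average little more than one defining SNP is needed per block.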
46
The haplotype structure of Chr.21 is available to
the public
http://genome-hg8.cse.ucsc.edu/cgi-bin/hgGateway?db=hg8
47
Genotyping only haplotype-defining SNPs reduces
the number of bases to be looked at in each
individual
1.7 million genotypes/individual
210,000 genotypes/individual
48
Whole Genome Scanning Approach
Looking across the entire genome in hundreds of
people
  • Does not require a hypothesis up front
  • Does not require placing bets on a few locations
  • Will reveal many places in the genome that play a
    role in the disease or trait

49
Whole Genome Association Methodology
$105 million
50
Genetic Association Study
If a DNA variant is associated with a trait of
interest, affecteds will have a different
frequency of that variant than unaffecteds
51
Genetic Association Analysis Using Pooled DNA
Samples
SNP 1
52
Whole Genome Association Methodology
All SNP assays per association study are run using
one DNA pool of affecteds and one DNA pool of
unaffecteds
210,000 SNP assays per sample
420,000 SNP assays per association study
$210,000
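
The saving from pooling can be worked through with the deck's own numbers. In this sketch the function names are hypothetical, and applying the 0.50-dollar per-genotype cost from slide 26 to a pooled assay is an assumption made for illustration.

```python
HAPLOTYPE_SNPS = 210_000   # haplotype-defining SNPs assayed per sample
PEOPLE = 1_000             # individuals in the study
COST_PER_ASSAY = 0.50      # dollars; assumed equal to the per-genotype cost


def assays_individual(num_snps: int, num_people: int) -> int:
    """Assays needed when every person is genotyped separately."""
    return num_snps * num_people


def assays_pooled(num_snps: int, num_pools: int = 2) -> int:
    """Assays needed when DNA is pooled (one affected pool, one unaffected pool)."""
    return num_snps * num_pools


individual = assays_individual(HAPLOTYPE_SNPS, PEOPLE)
pooled = assays_pooled(HAPLOTYPE_SNPS)
print(f"individual: {individual:,} assays, ${individual * COST_PER_ASSAY:,.0f}")
print(f"pooled:     {pooled:,} assays, ${pooled * COST_PER_ASSAY:,.0f}")
print(f"reduction:  {individual // pooled}-fold")
```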
53
Association Studies currently underway at Perlegen
  • Genetics of drug response to a highly effective
    drug with GlaxoSmithKline
  • Small percent of patients have adverse reaction
  • Genetics of Diabetes Type 2 with a large
    international consortium of researchers
  • Affects 15 million in the U.S. alone
  • Genetics of common traits with Unilever
  • Improve effectiveness of beauty products