Title: The Identification of Human Quantitative Trait Loci
1 The Identification of Human Quantitative Trait Loci
Dr John Blangero
Southwest Foundation for Biomedical Research
ChemGenex Pharmaceuticals
2 The Goals: Genetic Analysis of Complex Phenotypes
- QTL Localization: Where in the genome is the QTL located?
- QTL Identification: What is (are) the gene(s) involved?
- QTL Allelic Architecture: What are the specific QTNs? How many QTNs? What are their frequencies and effect sizes?
3 Quantitative Traits
- Usually closer to gene action than disease itself.
- Have superior statistical power.
4 Quantitative Endophenotypes
- Heritable
- Genetically correlated with disease or other focal phenotype
- Closer to the action of the genes
5 Liability: The Threshold Model
[Figure: distribution of a disease-related trait, with regions labeled "Normals" and "Affecteds".]
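The threshold model on this slide can be sketched numerically. The snippet below is a minimal illustration, assuming liability follows a standard normal distribution and that individuals whose liability exceeds a threshold are affected. The function names and the prevalence values are hypothetical and do not come from the slides.

```python
from statistics import NormalDist


def liability_threshold(prevalence):
    """Threshold t on a standard normal liability scale such that
    P(liability > t) equals the given prevalence (assumed model)."""
    return NormalDist().inv_cdf(1.0 - prevalence)


def is_affected(liability, threshold):
    """Under the threshold model, a person is affected when liability
    exceeds the threshold; otherwise the person is classed as normal."""
    return liability > threshold


# Hypothetical prevalence of 2.5%: the threshold sits about 1.96 SD above the mean.
t = liability_threshold(0.025)
print(round(t, 2))
print(is_affected(2.3, t), is_affected(0.4, t))
```

The dichotomous affected/normal status discards the position of each individual on the underlying scale, which is one way to see why a measured quantitative trait can carry more information than disease status alone.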
6 The process of finding and identifying disease-related genes involves Objective Prioritization.
7 Different Diseases, Different Designs, Different Methods
8 Family Studies vs Studies of Unrelateds
9 Major Study Designs in Human Genetics: Possible Inferences

Design                  Heritability  Linkage  Association
Unrelated individuals   No            No       Yes
Triads                  No            Yes      Yes
Sibling pairs           Yes           Yes      Yes
Nuclear families        Yes           Yes      Yes
Extended pedigrees      Yes           Yes      Yes
10 You can exploit Linkage and Association Information Jointly in Family Studies
11 Relative Per-Subject Power to Localize QTLs

Population/Study   Relative Efficiency  Ped. Size  Pedigree Type
Jirel (Nepal)      1.00                 2300       Extended (isolate)
Vermont            0.91                 331        Extended
SAFHS              0.59                 31         Extended
GAIT               0.35                 19         Extended
Framingham         0.24                 5          Extended, nuclear
Nuclear (4 sibs)   0.17                 6          Nuclear
Nuclear (3 sibs)   0.11                 5          Nuclear
Sib-pair           0.04                 2          Relative pair
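One way to read the relative efficiencies above is as a rough sample-size multiplier. The sketch below assumes that linkage information accumulates roughly linearly with the number of subjects, so the number of subjects needed for equal power scales as 1 / efficiency. That scaling and the helper name `subjects_needed` are assumptions made for illustration; the slide does not state how the efficiencies were derived.

```python
# Relative per-subject efficiencies as listed on the slide.
relative_efficiency = {
    "Jirel (Nepal)": 1.00,
    "Vermont": 0.91,
    "SAFHS": 0.59,
    "GAIT": 0.35,
    "Framingham": 0.24,
    "Nuclear (4 sibs)": 0.17,
    "Nuclear (3 sibs)": 0.11,
    "Sib-pair": 0.04,
}


def subjects_needed(n_reference, efficiency):
    """Subjects needed to match a reference design of n_reference subjects,
    assuming required sample size scales as 1 / efficiency (an assumption)."""
    return round(n_reference / efficiency)


# Hypothetical reference: 1,000 subjects in the most efficient design.
for design, eff in relative_efficiency.items():
    print(f"{design:18s} {subjects_needed(1000, eff):6d}")
```

Under this assumption, a sib-pair design would need about 25 times as many subjects as the most efficient extended-pedigree design listed.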
12 Linkage Designs vs Association Designs
13 Power: Linkage vs Association
14 Example 1: Positional Candidate Genes
- QTL for serum leptin levels in the San Antonio Family Heart Study
- Highly replicated QTL
15 Chromosome 2 Obesity QTL
16 Bioinformatic Prioritization: GeneSniffer Results
[Figure: GeneSniffer results; labels: 2p22, POMC, GCKR, UCN.]
17 What Do You Do With A Good Positional Candidate Gene?
The ALL or NOTHING principle: Find all of the variation in the gene.
- Preference: Resequence everyone (no bias against rare variants).
- Alternative: Resequence a subset of individuals.
18 POMC: Pattern of LD
19 POMC QTN Analysis: Marginal Associations
20 How To Find the Most Likely Functional SNPs
- Bayesian Quantitative Trait Nucleotide (BQTN) analysis has the potential to aid the discovery of the DNA variants that influence risk of common disease.
- Objectively prioritizes SNPs for further functional work.
21 BQTN Analysis: Bayesian Model Selection/Model Averaging
- Evaluate possible models of gene action. The number of models may be very large: 2^n models of additive gene action for n polymorphisms.
- Use Bayesian model selection to choose the best models and average parameters over models.
- Eliminates the problem of multiple testing.
- Yields unbiased estimates of effect size.
- Allows prioritization of polymorphisms for further lab evaluation.
- Calculation of the Posterior Probability of Effect.
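The model-averaging idea can be illustrated with a short sketch. It enumerates all 2^n subsets of n SNPs, converts a BIC value for each model into an approximate posterior model probability, and sums those probabilities over the models containing each SNP to get a posterior probability of effect. The BIC approximation, the equal model priors, the function names, and the toy BIC values are all assumptions made for illustration; this is not SOLAR's implementation, and the slides do not give its details.

```python
import itertools
import math


def enumerate_models(n_snps):
    """Yield every subset of SNP indices: 2**n_snps additive models."""
    for k in range(n_snps + 1):
        yield from itertools.combinations(range(n_snps), k)


def posterior_model_probs(bic_by_model):
    """Approximate posterior model probabilities from BIC values,
    assuming equal prior probability for every model (an assumption)."""
    best = min(bic_by_model.values())
    weights = {m: math.exp(-0.5 * (b - best)) for m, b in bic_by_model.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def posterior_prob_of_effect(model_probs, n_snps):
    """For each SNP, sum the posterior probability of all models that include it."""
    return [sum(p for m, p in model_probs.items() if j in m) for j in range(n_snps)]


# Toy example with made-up BIC values for 3 SNPs (8 models, lower BIC is better).
n = 3
toy_bic = {m: 100.0 for m in enumerate_models(n)}
toy_bic[(0,)] = 90.0     # model with SNP 0 only fits best
toy_bic[(0, 2)] = 92.0   # model with SNPs 0 and 2 is second best
probs = posterior_model_probs(toy_bic)
ppe = posterior_prob_of_effect(probs, n)
print([round(p, 3) for p in ppe])
```

Because each SNP is summarized by a single posterior probability rather than by many separate significance tests, the output gives a direct ranking of variants for follow-up work.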
22 The Parallel Ranch: 1,500 Processors
23 Sequential Oligogenic Linkage Analysis Routines (SOLAR)
- All analyses were performed using a parallel version of SOLAR on up to 1,500 processors.
- For more information on SOLAR, follow the software links at http://www.sfbr.org
24 BQTN analysis of POMC polymorphisms
- Three variants account for 11% of variation in leptin levels.
- The frequencies of these variants are 0.005, 0.004 and 0.06.
- LD with any other SNPs is very low: 0.075, 0.248 and 0.189.
- It would be VERY HARD to find these by LD.
25 Linkage Conditional on POMC SNPs
Marginal LOD = 5.86; Conditional LOD = 3.05
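To put the two LOD scores on a more familiar scale, the sketch below converts a LOD score to a likelihood-ratio chi-square (2 ln(10) times the LOD) and to an approximate p-value. It assumes the statistic follows a 50:50 mixture of a chi-square with 1 degree of freedom and a point mass at zero, which is a common assumption for variance-component linkage tests but is not stated on the slide. The function names are hypothetical.

```python
import math


def lod_to_chi2(lod):
    """Convert a LOD score to a likelihood-ratio chi-square statistic."""
    return 2.0 * math.log(10.0) * lod


def lod_to_p(lod):
    """Approximate p-value, assuming a 50:50 mixture of chi-square(1)
    and a point mass at zero (an assumption, not from the slides)."""
    chi2 = lod_to_chi2(lod)
    # For 1 degree of freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


marginal_lod = 5.86      # from the slide
conditional_lod = 3.05   # from the slide, conditional on the POMC SNPs

lod_drop = marginal_lod - conditional_lod
fraction_drop = lod_drop / marginal_lod

print(f"LOD drop: {lod_drop:.2f} ({fraction_drop:.0%} of the marginal LOD)")
print(f"marginal p ~ {lod_to_p(marginal_lod):.1e}")
print(f"conditional p ~ {lod_to_p(conditional_lod):.1e}")
```

Conditioning on the POMC SNPs removes 2.81 LOD units, a little under half of the marginal signal, while the remaining LOD of 3.05 is still above 3.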
26 What Do You Do With A Good Positional Candidate Region?
The ALL or NOTHING principle: Find all of the variation in the region, say 5-10 Mb.
- Preference: Resequence everyone (no bias against rare variants).
- This can be done NOW! It is the wave of the future.
- Don't waste time with LD. It is your ENEMY.
27 Example 2: Identifying Human QTLs Quickly
- For expression phenotypes that are cis-regulated, it should be much easier to quickly identify functional variants and correlate them with disease risk.
28 Gene Expression Levels as Endophenotypes
- Quantitative variation in gene expression levels explains some proportion of the variation in many phenotypes.
- The amount of mRNA of a specific transcript in a tissue sample is about as close to gene action as possible; hence, such phenotypes ought to be dissectible by statistical genetic approaches.
- Array-based technologies make it feasible to quantify the expression levels of many transcripts simultaneously.
29 Project Description
- San Antonio Family Heart Study (SAFHS) designed in 1991 to investigate the genetics of CVD in Mexican Americans
- Includes 1,431 individuals from 42 families
- 2 recalls since 1991
- Extensive phenotypic data: anthropometry, blood pressure, lipids, obesity, diabetes, inflammation, oxidative stress, hormones, osteoporosis, brain structure/function
- Genome scanned
30 Methodology
- Blood samples collected from first SAFHS examination approximately 15 years ago
- Lymphocytes isolated from blood and stored in RPMI-C media in liquid nitrogen
- RNA extracted and expression profiles generated on stored lymphocytes
- 47,289 transcripts interrogated using the Illumina platform
31 Detection Statistics
- 1,280 samples analyzed, good data from 1,240 (97%)
- Of the 47,289 transcripts per array, we significantly detected 20,413 transcripts.
32 Heritabilities of Autosomal RefSeq Transcripts
33 Cis-Regulated Expression QTLs
34 Identifying Novel Candidate Genes for Disease Risk
- After determining cis-regulated QTLs, look for correlations with phenotypes related to disease risk.
- Transcriptomic Epidemiology: using high-dimensional endophenotypic search.
- For example, 383 cis-regulated transcripts are significantly correlated with BMI (an index of obesity).
- Many of these are novel genes of unknown function.
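A high-dimensional search of this kind can be sketched as a loop over transcripts, each correlated with the phenotype, followed by a multiple-testing correction. The slides do not say how significance was assessed, so the Pearson correlation, the Fisher z approximation for p-values, and the Benjamini-Hochberg false discovery rate procedure used below are assumptions. The sketch also ignores relatedness among family members, which a real pedigree-based analysis would need to model. All function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value from the Fisher z approximation.
    Requires |r| < 1 and n > 3; treats subjects as unrelated (an assumption)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(pvals, q=0.05):
    """Return the indices of tests declared significant at false discovery rate q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Made-up p-values for five transcripts tested against a phenotype such as BMI.
toy_pvals = [0.001, 0.8, 0.01, 0.04, 0.5]
print(benjamini_hochberg(toy_pvals, q=0.05))
```

In practice each transcript's expression values and the phenotype values would be passed through `pearson_r` and `approx_p_value` first, and the resulting p-values handed to `benjamini_hochberg`.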
35 Expression QTLs: LOD > 3
- Approximately 34% of QTLs are cis.
- Effect size (QTL-specific heritability) is 64% larger for cis QTLs.
36 Cis Regulation: UTS2 (urotensin 2 preprotein)
37 Cis and Trans Regulation: HBG2 (G-gamma globin)
38 Trans Regulation: LOC389472
39 Mitochondrial QTLs Influencing Expression
40 Identification of Human QTLs: Example 3
- QTL influencing inflammatory response
- A novel positional candidate gene (SEPS1/SELS) found by expression studies in an animal model
41 SEPS1 Gene Discovery
- SEPS1 (formerly known as Tanis) was first identified by differential gene expression in liver of diabetic P. obesus
- Putative functions related to ER stress response through processing and removal of misfolded proteins
  - Ye et al (2004). Nature 429, 841-847.
42 SEPS1 Gene Discovery
- Human SEPS1 gene is located on 15q26.3
- Mammalian plasma membrane selenoprotein; also a member of the GRP family
- Consists of 6 exons, encodes a 204aa protein
- 15q26 region shown to contain QTLs influencing inflammatory disorders
  - Zamani et al (1996). Hum Genet 98, 491-6.
  - Field et al (1994). Nat Genet 8, 189-94.
  - Blacker et al (2003). Hum Mol Genet 12, 23-32.
  - Susi et al (2001). Scand J Gastroenterol 36, 372-4.
  - Mahaney et al (2005). Unpublished.
43 SEPS1 Variant Identification
- Sequenced 9.3kb including putative promoter, exons, introns and conserved regions in 50 individuals from three different ethnic populations
- 16 variants genotyped in cohort of 522 Caucasian individuals from 92 families
- Plasma levels of IL-1β, IL-6 and TNF-α measured
- Results analyzed for association using SOLAR
44 Association Analysis
[Figure panels: IL-1β, IL-6, TNF-α.]
45 BQTN Analysis
- BQTN analysis strongly supported a model in which the G-105A SNP was responsible for the observed associations, with estimated posterior probabilities of >0.999, 0.95, and 0.79 (for TNF-α, IL-1β, and IL-6 respectively)
- Analysis indicates the G-105A SNP is of direct functional consequence (or is highly correlated with a functional variant)
- Analysis performed to test the functionality of this G-105A variant
46 Effect of A or G variant on SEPS1 promoter activity under Tunicamycin stress conditions
[Figure: promoter activity (fold change in luc activity over basal) for the A variant and the G variant; P = 0.00006.]
47 Physiological Role of SEPS1
48 Exploring the Effects of the SEPS1 G-105A QTN
- Looked at the in vivo effects of SEPS1 G-105A QTN on expression levels of SEPS1 and genes in the following Gene Ontology categories:
  - Endoplasmic Reticulum
  - Unfolded Protein Response
  - Golgi Stack and Protein Transportation
  - Oxidative Stress
49 SEPS1 Expression is Correlated With Disease In Vivo
50 SEPS1 G-105A QTN Influences Expression In Vivo
- SEPS1 transcript is cis-regulated (as defined by quantitative trait linkage analysis).
- The rare A variant is associated with decreased expression in lymphocytes (p = 0.032).
51 SEPS1 G-105A Associated Genes
52 Acknowledgements
- ChemGenex Pharmaceuticals
  - Jeremy Jowett
  - Greg Collier
- Southwest Foundation for Biomedical Research
  - Joanne Curran
  - Eric Moses
  - Matt Johnson
  - Catherine Jett
  - Tom Dyer
  - Shelley Cole
  - Harald Göring
  - Jean MacCluer
  - Charles Peterson
  - Tony Comuzzie
  - Laura Almasy
Special thanks to the Azar family of San Antonio for their financial support of our research.