Title: Large-Scale Population Studies using Genetic Polymorphisms
1Large-Scale Population Studies using Genetic
Polymorphisms
Jonathan C. Cohen, UT Southwestern Medical Center
Financial Disclosure: None
Unlabeled/Unapproved Uses: None
For slide narration see notes for text.
2Genetic Architecture of Heart Disease
            Common Variant    Rare Variants
Age         Ancient           Recent
Ancestry    Common            Different
Frequency   High              Low
Effect      Small             Large
Alleles     Few               Many
3Association Studies
Dallas Heart Study (DHS): Extensive Phenotyping
Southwestern PGA: High Throughput Genotyping
Association Studies
4Dallas Heart Study
African American (50%)
Caucasian (35%)
Hispanic (15%)
Other (<1%)
5Dallas Heart Study
DXA
MRI
n = 3000; ages 30-60; 50% Black, 15% Hispanic
EBCT
6Southwestern PGA Conceptual Framework
Candidate Gene List: expression studies, computer prediction, expert opinion
Identify SNPs in Candidate Genes: DNA sequencing in affected individuals
Genotype SNPs in DHS: mass spectrometry (Sequenom)
7Heart Disease SNPs
1. Identify SNPs in 356 candidate genes (n = 3266)
- Hyperlipidemia
- ? LDL-cholesterol
- Triglycerides
- HDL-cholesterol
- Insulin resistance
- Hepatic fat (MR-spectroscopy)
- coronary calcium (EBCT)
451 nonsynonymous SNPs
8Heart Disease SNPs
2. Determine association with phenotypes
- Genotype DHS (Mass Spectrometry)
- Test SNPs against all phenotypes (ANOVA)
- Test 4 groups independently
- Black men, Black women, White men, White women
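The screening step above (testing a SNP against a phenotype by ANOVA within one group) can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function name `one_way_anova_f` and the toy phenotype values are assumptions made up for the example, and only the F statistic is computed (a P-value would additionally need the F distribution).

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic for phenotype values grouped by genotype.

    groups: list of lists, one list of phenotype values per genotype class.
    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of genotype-class means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individuals around their own genotype-class mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy values (NOT study data): one list per genotype class
toy_by_genotype = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_by_genotype)
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

In a screen like the one described, this test would be repeated for each SNP, each phenotype, and each of the four groups, which is what makes the multiple-testing problem discussed later so acute.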
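9Genetic Association Studies Initial Screen
[Figure: matrix of phenotypes by SNPs; each cell lists a value for the four groups (WM, BM, BW, WW). Two example cells: WM 0.04, BM 0.03, BW 0.02, WW 0.04; and WM 0.4, BM 0.03, BW 0.6, WW 0.5.]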
10Sources of Error in Genetic Associations
- Genotyping Errors
- False Positive Associations
11Genotyping Errors in DHS
- Validated Sequenom assay: 7 SNPs, 50 samples
- Determine if SNPs are in Hardy-Weinberg equilibrium (HWE): 16/151 SNPs analyzed in DHS not in HWE
Error rate using mass-spectrometry real-time
PCR
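A common way to check whether a SNP is in Hardy-Weinberg equilibrium is a chi-square goodness-of-fit test on the three genotype counts. The sketch below is one such check, offered as an illustration; the slides do not say which HWE test was used. The function name `hwe_chi_square` and the example counts are assumptions, not study data.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B

    # Genotype counts expected under HWE: p^2, 2pq, q^2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Upper tail of a chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (NOT study data)
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)     # matches HWE proportions exactly
chi2_bad, p_bad = hwe_chi_square(40, 20, 40)   # strong heterozygote deficit
print(f"in HWE:     chi2 = {chi2_ok:.2f}, P = {p_ok:.3g}")
print(f"not in HWE: chi2 = {chi2_bad:.2f}, P = {p_bad:.3g}")
```

A marked departure from HWE, such as a heterozygote deficit, is often a sign of genotyping error, which is why the test is useful as a quality check on a high-throughput assay.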
12False Positive Associations
Association study: 129 SNPs and LDL-cholesterol
- 16 SNPs P < 0.05 in 1 of 4 groups
- 2 SNPs P < 0.05 in 2 of 4 groups
- 1 SNP P < 0.05 in 4 of 4 groups (APOE)
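To put the counts above in context, one can estimate how many SNPs would reach P < 0.05 in exactly k of the 4 groups by chance alone. The sketch below assumes that no SNP has a true effect and that the four group tests are independent; both are simplifying assumptions for illustration, and the function name `expected_null_hits` is hypothetical.

```python
from math import comb


def expected_null_hits(n_snps, n_groups, alpha):
    """Expected number of SNPs significant in exactly k of n_groups groups
    when every SNP is null and group tests are independent (binomial model)."""
    return {
        k: n_snps * comb(n_groups, k) * alpha ** k * (1 - alpha) ** (n_groups - k)
        for k in range(n_groups + 1)
    }


expected = expected_null_hits(n_snps=129, n_groups=4, alpha=0.05)
observed = {1: 16, 2: 2, 4: 1}  # counts reported on the slide

for k in sorted(expected):
    print(f"significant in {k} of 4 groups: "
          f"expected by chance {expected[k]:.4f}, observed {observed.get(k, 'not reported')}")
```

Under these assumptions, hits in one or two groups are about as frequent as chance predicts, while a hit in all four groups is very unlikely to be a chance finding.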
13Strategies to Avoid False Associations
Adjusting for Multiple Testing
- Correct for multiple testing using Bonferroni correction: Pk = α/n
Example: 129 SNPs analyzed in 4 groups
To obtain P = 0.05 would require a Pk = 0.0001
For APOE on LDL-C, P = 0.02 in Black men (n = 772)
For APOE on LDL-C, P = 0.002 in White men (n = 499)
Neither would be considered significant at the adjusted P-value
Problem: loss of power to detect true associations
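The Bonferroni arithmetic in this example can be reproduced with a few lines. This is a sketch under the assumption that each SNP-by-group test counts as one comparison (129 × 4 = 516 tests); the variable and function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


n_snps = 129                   # from the slide
n_groups = 4                   # from the slide
n_tests = n_snps * n_groups    # assumption: one comparison per SNP per group
threshold = bonferroni_threshold(0.05, n_tests)

# Nominal APOE / LDL-C P-values quoted on the slide
apoe_p = {"Black men": 0.02, "White men": 0.002}
passes = {group: p < threshold for group, p in apoe_p.items()}

print(f"tests: {n_tests}, adjusted threshold: {threshold:.6f}")
for group, p in apoe_p.items():
    verdict = "significant" if passes[group] else "not significant"
    print(f"APOE in {group}: P = {p} -> {verdict} after correction")
```

Both APOE results fall short of the adjusted threshold even though APOE is a well-established determinant of LDL-cholesterol, which illustrates the loss of power noted above.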
14Strategies to Avoid False Associations
- 2. Use nominal P-value (0.05) and replicate, replicate
Example: APOE and LDL-cholesterol
Nominal P-values in consecutive groups:
Black men 0.02
White men 0.002
Black women 0.0001
White women 0.0001
Cumulative probability: 0.02 × 0.002 × 0.0001 × 0.0001 ≈ 0
Replication eliminates false positives without increased risk of false negatives
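The cumulative probability in this example is simply the product of the four nominal P-values, which is valid as a chance probability only if the four groups are independent samples. The sketch below reproduces that product; it is not a formal combined test (Fisher's method, for instance, would combine the P-values differently).

```python
import math

# Nominal APOE / LDL-cholesterol P-values from the slide
nominal_p = {
    "Black men": 0.02,
    "White men": 0.002,
    "Black women": 0.0001,
    "White women": 0.0001,
}

# Assumption: the four groups are independent, so the probabilities multiply
cumulative = math.prod(nominal_p.values())
print(f"cumulative probability: {cumulative:.1e}")
```

The product is on the order of 10^-13, which is why the slide rounds it to zero.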
15APOAV S19W Allele and Plasma Triglyceride Concentrations in DHS
[Figure: bar chart of plasma triglyceride (mg/dl; axis 0 to 300) for wt versus S19W in Black women, White women, White men, and Black men; P < 0.05]
16APOAV Alleles and Plasma Triglyceride Concentrations
[Figure: bar chart of subjects with S19W (axis 0 to 25) among those with TG < 10th percentile versus TG > 90th percentile, in men (n = 164) and women (n = 100); P < 0.005]
17Conclusions
- Large-scale studies generate many false-positive associations
- Very low P-values will eliminate true associations
- Replication is critical to validation of genetic associations
- Optimal strategy: use nominal P-value for initial screen and validate by replication
- Reliability of high-throughput genotyping assays should be carefully examined
18Genetic Architecture of Heart Disease
            Common Variant    Rare Variants
Age         Ancient           Recent
Ancestry    Common            Different
Frequency   High              Low
Effect      Small             Large
Alleles     Few               Many
19Genetics of Congenital Heart Disease (CHD)
- Most CHD sporadic, familial cases uncommon
- Multifactorial etiology
- Genes plus environment
- 5-fold increase in recurrence risk
- Few known genetic causes of CHD
- Multiple AD syndromes with CHD in which gene is known
- Only one non-syndromic cause of CHD: NKX2-5
20Identification of Mutations Causing CHD
- Sequence all genes known to be required for cardiac development in patients with CHD
- Identify nonconservative mutations
- Test for i) Segregation in family
- ii) Effect on function in vitro
- iii) Effect on function in vivo (animal models)
21Prevalence of Sequence Variations
- Selection of candidate genes (n = 100)
- Disruption of gene alters cardiogenesis in model organism (flies, fish, mice)
- Sequenced exons, flanking intronic sequences
- Subjects: CHD (42 sporadic, 17 familial)
- Septal defects, Tetralogy of Fallot, pulmonic stenosis, patent ductus arteriosus, etc.
- Mutations detected: 149 nonsynonymous variants
- 44 found in only a single subject
22Nature of Rare Sequence Variations
23Candidate Gene GATA4
- Transcription factor
- 2 zinc finger domains
- Expressed in heart during embryogenesis
- Essential for cardiogenesis in flies, fish, and mice
24Mutation of a Highly Conserved Residue (G296S) in
GATA4
Transcription activating domains
25Atrial and Ventricular Septal Defect
26GATA4 G296S Segregates With Cardiac Septal Defects
No mutation found in 3500 controls
27Genome-wide Linkage Analysis Limits Linkage to
8p22-23
- Genome-wide linkage analysis: 358 microsatellite markers
- Single linked region (12.7 Mb) on chromosome 8p
28GATA4 G296S has reduced DNA binding
29Frameshift Mutation in GATA4 in Unrelated
Patient with ASD
c.1075delG E359X
30Frameshift Mutation in GATA4 Segregates with CHD
31Screening other Patients with ASD and VSD for
GATA4 mutations
- Generating mutant constructs and testing function
- Other non-synonymous changes S377G, V380M, and
P394T
32Conclusions
- Mutations in GATA4 cause CHD
- Large-scale sequencing in targeted populations
may reveal mutations with large phenotypic
effects that are common in aggregate
33Acknowledgements
Bioinformatics: Alex Pertsemlidis, Jeff Schageman, Bob Barnes
Dallas Heart Study: Helen H. Hobbs, Rudy Guerra
Congenital Heart Disease: Deepak Srivastava, Vidu Garg