Title: mStruct: Structure under mutations
1mStruct: Structure under mutations
mStruct: Inference of population structure in the presence of genetic admixing and allele mutations
- Suyash Shringarpure and Eric Xing
- Carnegie Mellon University
2Significance
3Genetic Population Structure
- Structure (Pritchard et al., 2000)
[Figure: ancestral proportions for Africa, Europe, Mid-East, Cent./S. Asia, East Asia, and Oceania. Genetic structure of human populations (Rosenberg et al., 2002).]
4Generative model- Structure
[Figure: Structure generative model at one locus, defined over all the alleles observed at this locus (for the dataset). Values shown in the figure: 0.8, 0.2; 0.8, 0.2; 0.3, 0.7.]
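The two-step draw behind a Structure-style admixture model (pick an ancestral population, then pick an allele from that population's frequency profile) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names, the two-population setup, the allele labels 9 and 10, and the way the figure's numbers are assigned to admixture proportions and allele frequencies are all assumptions made for the example.

```python
import random


def sample_allele(admixture, allele_freqs, rng):
    """Draw one allele copy for one individual at one locus.

    admixture:    ancestry proportions, one per population (sums to 1)
    allele_freqs: allele_freqs[k][allele] = frequency of allele in population k
    Returns (population index, allele).
    """
    # Step 1: choose the ancestral population this allele copy comes from.
    k = rng.choices(range(len(admixture)), weights=admixture)[0]
    # Step 2: choose an allele from that population's frequency profile.
    alleles = list(allele_freqs[k])
    weights = [allele_freqs[k][a] for a in alleles]
    return k, rng.choices(alleles, weights=weights)[0]


def marginal_allele_prob(allele, admixture, allele_freqs):
    """P(allele) for this individual, summing over the hidden population."""
    return sum(
        theta * freqs.get(allele, 0.0)
        for theta, freqs in zip(admixture, allele_freqs)
    )


# Hypothetical values, loosely reusing the numbers on the slide.
admixture = [0.8, 0.2]                                  # 80% population 0
allele_freqs = [{9: 0.8, 10: 0.2}, {9: 0.3, 10: 0.7}]   # per-population profiles

rng = random.Random(0)
draws = [sample_allele(admixture, allele_freqs, rng) for _ in range(1000)]

# Marginal probability of allele 9: 0.8 * 0.8 + 0.2 * 0.3 = 0.70
p9 = marginal_allele_prob(9, admixture, allele_freqs)
```

Note that in this sketch alleles are only labels: nothing makes allele 9 closer to allele 10 than to any other allele.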
5Modeling allele similarity
- Microsatellite
- Repeats of a small DNA unit, say
[Figure: example alleles 2, 9, and 10.]
- Allele 9 is much more similar to allele 10 than to allele 2.
- Allele 10 might be a mutation of allele 9.
- Mathematically encode this idea in the model
- mStruct: Structure under mutations
6Hypothesis
- Individual genomes in modern populations are a result of:
- Admixture of ancestral populations.
- Mutations from ancestral alleles.
- Ancestral populations have fewer alleles
- (Mostly) True for microsatellites
7Generative model- mStruct
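[Figure: mStruct generative model at one locus, defined over all the alleles observed at this locus (for the dataset). Labels and values shown in the figure: d1; 0.8, 0.2; 0.8, 0.2; d2; 0.3, 0.7.]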
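One way to read the mStruct generative story is as the admixture draw with an extra mutation step: pick a population, pick an ancestral allele from that population, then let the observed allele differ from the ancestor with a probability that decays with distance. The sketch below assumes one mutation parameter per population (reading the figure's d1 and d2 that way is a guess), a decay of the form d^|b-a| restricted to the alleles observed at the locus, and hypothetical names and values throughout.

```python
import random


def mutate(ancestor, observed_alleles, d, rng):
    """Draw a descendant allele b with weight d ** |b - ancestor|."""
    if not 0 < d < 1:
        raise ValueError("d must lie strictly between 0 and 1")
    weights = [d ** abs(b - ancestor) for b in observed_alleles]
    return rng.choices(observed_alleles, weights=weights)[0]


def sample_mstruct_allele(admixture, ancestral_freqs, d_by_pop,
                          observed_alleles, rng):
    """Return (population, ancestral allele, observed allele) for one draw."""
    # Step 1: ancestral population, as in the admixture model.
    k = rng.choices(range(len(admixture)), weights=admixture)[0]
    # Step 2: ancestral allele from that population's (small) allele set.
    ancestors = list(ancestral_freqs[k])
    a = rng.choices(
        ancestors, weights=[ancestral_freqs[k][x] for x in ancestors]
    )[0]
    # Step 3: observed allele, obtained by mutating the ancestral allele.
    b = mutate(a, observed_alleles, d_by_pop[k], rng)
    return k, a, b


# Hypothetical setup: two populations, alleles 2, 9, 10 observed at the locus.
observed_alleles = [2, 9, 10]
admixture = [0.8, 0.2]
ancestral_freqs = [{9: 0.8, 2: 0.2}, {9: 0.3, 2: 0.7}]
d_by_pop = [0.3, 0.5]  # stand-ins for d1 and d2

rng = random.Random(0)
samples = [
    sample_mstruct_allele(admixture, ancestral_freqs, d_by_pop,
                          observed_alleles, rng)
    for _ in range(1000)
]
```

As d approaches 0 the observed allele almost always equals its ancestor, so the sketch behaves like a plain admixture model over the ancestral alleles.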
8Mutation models
- How to derive descendant alleles from ancestral alleles?
- Distribution based on the single-step model
- $P(b \mid a) \propto d^{|b-a|}$, $d < 1$
- Computationally easy
- d is NOT a conventional mutation rate.
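A small numeric sketch of a distribution of the form P(b | a) ∝ d^|b-a| may help show why it is cheap to compute. The normalisation over the set of alleles observed at the locus is an assumption made here, and the function name and example values are hypothetical.

```python
def mutation_distribution(ancestor, observed_alleles, d):
    """Return {b: P(b | ancestor)} with P proportional to d ** |b - ancestor|.

    Assumption: probabilities are normalised over observed_alleles only.
    """
    if not 0 < d < 1:
        raise ValueError("d must lie strictly between 0 and 1")
    weights = {b: d ** abs(b - ancestor) for b in observed_alleles}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


# Ancestral allele 9, with alleles 2, 9 and 10 observed at the locus.
dist = mutation_distribution(ancestor=9, observed_alleles=[2, 9, 10], d=0.5)

# Allele 10 (one step away) is far more likely than allele 2 (seven steps).
ratio_one_step = dist[10] / dist[9]   # equals d, since the normaliser cancels
```

Because only a power of d and one normalising sum are needed, evaluating the distribution costs a single pass over the observed alleles.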
9Finding ancestral alleles
- Fit mixtures of mutation distributions
- Try using 1, 2, 3, ... ancestral alleles
- Use information theory to decide how many ancestral alleles are appropriate
[Figure: histogram of observed alleles.]
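The procedure of fitting mixtures with 1, 2, 3, ... ancestral alleles and then picking one can be sketched as follows. The slides do not name the selection criterion, so BIC is used here purely as a stand-in. Other simplifications are also assumptions: a single fixed d shared by all components, candidate ancestral alleles restricted to observed allele values, brute-force search over candidates, EM for the mixture weights only, and a parameter count of k alleles plus k-1 weights.

```python
import math
from collections import Counter
from itertools import combinations


def component_pmf(mu, support, d):
    """P(b | ancestral allele mu), proportional to d ** |b - mu| on support."""
    w = {b: d ** abs(b - mu) for b in support}
    z = sum(w.values())
    return {b: v / z for b, v in w.items()}


def fit_weights(counts, pmfs, iters=200):
    """EM for mixture weights, with the component distributions held fixed."""
    k = len(pmfs)
    n = sum(counts.values())
    pi = [1.0 / k] * k
    for _ in range(iters):
        resp_totals = [0.0] * k
        for b, c in counts.items():
            joint = [pi[j] * pmfs[j][b] for j in range(k)]
            tot = sum(joint)
            for j in range(k):
                resp_totals[j] += c * joint[j] / tot   # E-step
        pi = [v / n for v in resp_totals]              # M-step
    loglik = sum(
        c * math.log(sum(pi[j] * pmfs[j][b] for j in range(k)))
        for b, c in counts.items()
    )
    return pi, loglik


def choose_ancestral_alleles(alleles, d=0.3, max_k=3):
    """Return (bic, ancestral alleles, weights) for the lowest-BIC mixture."""
    counts = Counter(alleles)
    support = sorted(counts)
    n = len(alleles)
    best = None
    for k in range(1, min(max_k, len(support)) + 1):
        for mus in combinations(support, k):
            pmfs = [component_pmf(mu, support, d) for mu in mus]
            pi, loglik = fit_weights(counts, pmfs)
            n_params = k + (k - 1)   # assumed parameter count
            bic = -2.0 * loglik + n_params * math.log(n)
            if best is None or bic < best[0]:
                best = (bic, mus, pi)
    return best


# Hypothetical histogram: two clusters of alleles, around 5 and around 12.
observed = [5] * 40 + [4] * 8 + [6] * 8 + [12] * 40 + [11] * 8 + [13] * 8
bic, ancestral, weights = choose_ancestral_alleles(observed)
```

On this toy histogram the one-component fits are penalised heavily by the distant cluster, while a third component cannot improve the likelihood enough to pay for its extra parameters.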
10Comparing population structure maps
11Phylogenetic Trees from the Structural Maps
12Phylogenetic Trees from the Structural Maps
[Figure: phylogenetic trees derived from the mStruct and Structure maps.]
13HGDP SNP results
14Implications of Inconsistency
- Simplistic mutation model
- SNP mutations harder to discover from data
- The model reduces to Structure
- Fundamental difference
- Different markers treated differently
- Structure's treatment of alleles is almost categorical
15Contour of Empirical Mutation
16Conclusion
- Generative model for population structure
- Modeling mutations from ancestral alleles
- Gives mutational information apart from population structure.
- Genetics (in press)
- Online version up now.
17Graphical model representations
[Figure: graphical model representations of Structure and mStruct.]