Title: Thore Egeland
1 Descent graphs in pedigree analysis: applications
to haplotyping, location score, and
marker-sharing statistics.
Based partly on notes by Maayan Fishelson
and Terry Speed
2 The general parts of the title
- haplotyping
- location score
- marker-sharing statistics
3 References
- The algorithm presented herein was introduced by Sobel and Lange [2] and Kruglyak et al. [1].
- [1] L. Kruglyak, M.J. Daly, M.P. Reeve-Daly, and E.S. Lander. Parametric and nonparametric linkage analysis: a unified multipoint approach. Am. J. Hum. Genet., 58:1347--1363, 1996.
- [2] E. Sobel and K. Lange. Descent graphs in pedigree analysis: applications to haplotyping, location score, and marker-sharing statistics. Am. J. Hum. Genet., 58:1323--1337, 1996.
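4 Inheritance vector I
[Figure: pedigree with f = 2 founders (individuals 1 and 2, genotypes (x1,x2) and (x3,x4)) and n = 2 non-founders (individuals 12 and 13, genotypes (x1,x3) and (x2,x3)); the figure carries the label "paternal".]
- Inheritance vector: $v(x) = (p_1, m_1, \dots, p_n, m_n) = (1, 1, 0, 1)$
- (Notation differs; I stick to Fishelson and Speed.)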
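As a minimal sketch of how an inheritance vector determines the children's alleles, the snippet below "drops" founder alleles through the small pedigree above. It assumes that individual 1 is the father and individual 2 the mother, that genotypes are ordered (paternal, maternal), and that a bit of 1 (0) means the parent passed on its own paternal (maternal) allele. The function name `drop_alleles` and the data layout are illustrative choices, not part of the source.

```python
def drop_alleles(founders, children, v):
    """Propagate founder alleles to non-founders according to v.

    founders: {id: (paternal_allele, maternal_allele)}
    children: list of (child_id, father_id, mother_id), parents listed first
    v: flat sequence (p1, m1, ..., pn, mn); 1 = the parent's paternal allele
       was transmitted, 0 = the parent's maternal allele was transmitted
    """
    genotypes = dict(founders)
    for k, (child, father, mother) in enumerate(children):
        p_bit, m_bit = v[2 * k], v[2 * k + 1]
        from_father = genotypes[father][0 if p_bit == 1 else 1]
        from_mother = genotypes[mother][0 if m_bit == 1 else 1]
        genotypes[child] = (from_father, from_mother)
    return genotypes


# Assumption: individual 1 is the father, individual 2 the mother.
founders = {1: ("x1", "x2"), 2: ("x3", "x4")}
children = [(12, 1, 2), (13, 1, 2)]
result = drop_alleles(founders, children, [1, 1, 0, 1])
print(result[12], result[13])  # ('x1', 'x3') ('x2', 'x3')
```

Under these assumptions the vector (1, 1, 0, 1) reproduces the children's genotypes shown on the slide.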
5 Inheritance vector II
- All $2^{2n}$ inheritance vectors are equally likely a priori.
- The number of possible inheritance vectors is reduced as people are genotyped.
- If all are genotyped and phase is known, there is only one possible inheritance vector.
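The three points above can be illustrated by brute force on a tiny pedigree. The sketch below enumerates all $2^{2n}$ inheritance vectors and keeps those consistent with the observed genotypes. It assumes a hypothetical pedigree with one father (individual 1), one mother (individual 2), two children, and four distinct founder alleles; the helper names are illustrative only, and enumeration like this is feasible only for very small n.

```python
from itertools import product


def drop_alleles(founders, children, v):
    """Return ordered (paternal, maternal) genotypes implied by v."""
    genotypes = dict(founders)
    for k, (child, father, mother) in enumerate(children):
        from_father = genotypes[father][0 if v[2 * k] == 1 else 1]
        from_mother = genotypes[mother][0 if v[2 * k + 1] == 1 else 1]
        genotypes[child] = (from_father, from_mother)
    return genotypes


def compatible_vectors(founders, children, observed):
    """List every inheritance vector that reproduces the observed genotypes.

    observed: {child_id: (allele, allele)}, compared as unordered pairs.
    """
    n = len(children)
    keep = []
    for v in product((0, 1), repeat=2 * n):
        g = drop_alleles(founders, children, v)
        if all(sorted(g[i]) == sorted(obs) for i, obs in observed.items()):
            keep.append(v)
    return keep


# Hypothetical data: all four founder alleles are distinct.
founders = {1: ("x1", "x2"), 2: ("x3", "x4")}
children = [(12, 1, 2), (13, 1, 2)]

none_typed = compatible_vectors(founders, children, {})
one_typed = compatible_vectors(founders, children, {12: ("x1", "x3")})
all_typed = compatible_vectors(
    founders, children, {12: ("x1", "x3"), 13: ("x2", "x3")}
)
print(len(none_typed), len(one_typed), len(all_typed))  # 16 4 1
```

With no children genotyped all 16 vectors remain; each genotyped child removes candidates until a single vector is left.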
6 Descent Graph
- Corresponds to a specific inheritance vector.
- Sobel and Lange construct a Markov chain on descent graphs.
- Simulated annealing is used to search for the single most likely descent graph.
7 Main Idea
- Let $a = (a_1, \dots, a_{2f})$ be a vector of alleles assigned to the founders of the pedigree ($f$ is the number of founders).
- We want to represent by a graph the restrictions imposed by the observed marker genotypes on the vectors $a$ that can be assigned to the founder alleles.
- The algorithm extracts from the graph only the vectors $a$ compatible with the marker data.
8 Example marker data on a pedigree
9 Descent Graph
- Corresponds to a specific inheritance vector.
- Vertices: the individuals' alleles (2 alleles for each individual in the pedigree).
- Edges: represent the allele flow specified by the inheritance vector. A child's allele is connected by an edge to the parent's allele from which it flowed.
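The vertex and edge definitions above can be sketched in a few lines. The snippet represents each vertex as a pair (individual, "p" or "m") and adds one edge per transmitted allele. The pedigree (father 1, mother 2, children 12 and 13) and the function name `descent_graph_edges` are assumptions made for illustration; a bit of 1 (0) is read as "the parent's paternal (maternal) allele was transmitted".

```python
def descent_graph_edges(children, v):
    """Return descent graph edges as (child_allele, parent_allele) pairs.

    children: list of (child_id, father_id, mother_id)
    v: flat sequence (p1, m1, ..., pn, mn) of 0/1 bits
    Each allele vertex is (individual_id, "p") or (individual_id, "m").
    """
    edges = []
    for k, (child, father, mother) in enumerate(children):
        p_bit, m_bit = v[2 * k], v[2 * k + 1]
        # The child's paternal allele comes from one of the father's alleles.
        edges.append(((child, "p"), (father, "p" if p_bit == 1 else "m")))
        # The child's maternal allele comes from one of the mother's alleles.
        edges.append(((child, "m"), (mother, "p" if m_bit == 1 else "m")))
    return edges


# Hypothetical pedigree: father 1, mother 2, children 12 and 13.
children = [(12, 1, 2), (13, 1, 2)]
edges = descent_graph_edges(children, [1, 1, 0, 1])
for child_allele, parent_allele in edges:
    print(child_allele, "<-", parent_allele)
```

Every non-founder allele has exactly one outgoing edge, so the descent graph has 2n edges for n non-founders.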
10 Example Descent Graph (vertices)
Assume that the descent graph vertices below represent the pedigree on the left.
[Figure: pedigree (left) and descent graph vertices, with vertex labels 1 to 8; the genotypes shown are (a,b), (a,b), (a,c), (b,d), (a,b), (a,b).]
11 Example Descent Graph (cont.)
[Figure: the same descent graph, with vertex labels 1 to 8 and genotypes (a,b), (a,b), (a,c), (b,d), (a,b), (a,b), showing the gene flow.]
- Assume that paternally inherited alleles are on the left.
- Assume that non-founders are placed in increasing order.
- A 1 (0) is used to denote a paternally (maternally) originated gene.
- The gene flow above therefore corresponds to the inheritance vector $v = (1,1;\ 0,0;\ 1,1;\ 1,1;\ 1,1;\ 0,0)$.
12 Founder Graph
- Vertices: the founder alleles.
- Edges: connect the alleles appearing together in a genotyped individual for the gene flow specified by the inheritance vector $v$.
- Note: the edges are labeled with the genotype of the corresponding individuals.
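A minimal sketch of this construction follows: trace each individual's two alleles back to founder alleles, then add one labeled edge per genotyped individual. The pedigree, the founder allele labels 1 to 4, and the marker genotypes are all hypothetical, and `founder_graph` is an illustrative name rather than anything defined in the source.

```python
def founder_graph(founder_labels, children, v, genotypes):
    """Build founder graph edges (u, w, genotype) for inheritance vector v.

    founder_labels: {founder_id: (label_of_paternal, label_of_maternal)}
    children: list of (child_id, father_id, mother_id), parents listed first
    v: flat sequence (p1, m1, ..., pn, mn) of 0/1 bits
    genotypes: {individual_id: (allele, allele)} for genotyped individuals
    """
    # source[i] = the two founder alleles that individual i carries under v.
    source = dict(founder_labels)
    for k, (child, father, mother) in enumerate(children):
        from_father = source[father][0 if v[2 * k] == 1 else 1]
        from_mother = source[mother][0 if v[2 * k + 1] == 1 else 1]
        source[child] = (from_father, from_mother)

    edges = []
    for person, genotype in genotypes.items():
        u, w = source[person]
        edges.append((u, w, frozenset(genotype)))
    return edges


# Hypothetical example: founder 1 carries alleles labeled 1 and 2,
# founder 2 carries alleles labeled 3 and 4; only the children are typed.
founder_labels = {1: (1, 2), 2: (3, 4)}
children = [(12, 1, 2), (13, 1, 2)]
genotypes = {12: ("a", "b"), 13: ("a", "c")}
edges = founder_graph(founder_labels, children, [1, 1, 0, 1], genotypes)
print(edges)
```

In this toy case founder allele 4 is never transmitted to a genotyped individual, so it has no incident edge.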
13 Example Founder Graph
[Figure: the descent graph from the previous slides (vertex labels 1 to 8; genotypes (a,b), (a,b), (a,c), (b,d), (a,b), (a,b)) next to the resulting founder graph on vertices 1 to 8.]
14 Founder Graph
- Includes $m$ connected components, $C_1, \dots, C_m$.
- Here $C_1 = \{2\}$, $C_2 = \{1,3,5\}$, $C_3 = \{4,6,7,8\}$.
- The founder alleles assigned to different components appear in different genotyped individuals, by construction.
- Under random mating and Hardy-Weinberg equilibrium, the vectors of alleles assigned to different components are independent.
- Each component can be processed individually.
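Splitting a founder graph into components is a standard graph traversal. The sketch below uses a depth-first search; the edge list is hypothetical, chosen only so that the result matches the three components listed above, since the actual edges of the figure are not reproduced here.

```python
def connected_components(vertices, edges):
    """Return the connected components as sorted lists of vertices."""
    adjacency = {v: set() for v in vertices}
    for u, w in edges:
        adjacency[u].add(w)
        adjacency[w].add(u)

    seen, components = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical edges between founder alleles 1..8 (genotype labels omitted).
vertices = range(1, 9)
edges = [(1, 3), (3, 5), (4, 6), (6, 7), (7, 8)]
components = connected_components(vertices, edges)
print(components)  # [[1, 3, 5], [2], [4, 6, 7, 8]]
```

Vertex 2 has no incident edge and therefore forms a component on its own.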
15 Singleton Components
- The vertices corresponding to alleles that never passed through genotyped individuals form singleton components.
- Any allele type can be assigned to singleton components.
[Figure callout: singleton component]
16 Singleton Components (cont.)
[Figure: the example graph with vertex labels 1 to 8 and genotypes (a,b), (a,b), (a,c), (b,d), (a,b), (a,b).]
17 Find compatible allelic assignments for non-singleton components
- Identify the set of compatible alleles for each vertex. This is the intersection of the genotypes attached to the edges incident to the vertex.
- Examples: $\{a,b\} \cap \{a,b\} = \{a,b\}$ and $\{a,b\} \cap \{b,d\} = \{b\}$.
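The intersection step can be written directly with Python sets. In the sketch below the edge list is hypothetical: it is arranged so that vertex 3 sees the genotypes {a,b} and {a,b}, and vertex 6 sees {a,b} and {b,d}, mirroring the two intersections quoted above. Which vertices these really are in the figure is not known from the text.

```python
def compatible_alleles(edges):
    """Intersect the genotypes on all edges incident to each vertex.

    edges: list of (u, w, genotype), genotype being a pair of alleles.
    Returns {vertex: set of alleles still possible at that vertex}.
    """
    compatible = {}
    for u, w, genotype in edges:
        for vertex in (u, w):
            if vertex in compatible:
                compatible[vertex] &= set(genotype)
            else:
                compatible[vertex] = set(genotype)
    return compatible


# Hypothetical labeled edges of a founder graph.
edges = [
    (1, 3, ("a", "b")),
    (3, 5, ("a", "b")),
    (4, 6, ("a", "b")),
    (6, 7, ("b", "d")),
]
sets = compatible_alleles(edges)
print(sorted(sets[3]), sorted(sets[6]))  # ['a', 'b'] ['b']
```

A vertex whose set becomes empty cannot be assigned any allele, so the whole descent graph is incompatible with the marker data.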
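18 Possible Allelic Assignments (example)
[Figure: the example graph annotated with sets of possible alleles; the sets shown are {b}, {a}, {a,b}, {a,b}, {a,b}, {a,c}, {b,d}, and {a,b,c,d}.]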
19 Likelihood of descent graph (Sect. 4)
20 Compatible Allelic Assignments
- Denote by $A_1, \dots, A_m$ the sets of compatible allelic assignments obtained for each connected component at the end of the algorithm.
- Except for singleton components, each $A_i$ contains 0, 1, or 2 assignments.
- If $A_i$ is empty for some $i$, then $\Pr(m \mid v) = 0$.
- The compatible assignments are those in the Cartesian product $A_1 \times \cdots \times A_m$.
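One way to see why a non-singleton component has at most two assignments: once a single vertex is given an allele, every edge forces the allele at its other end, and the starting vertex has at most two candidates (the two alleles of one incident genotype). The sketch below implements that propagation and then forms the Cartesian product. The propagation routine, its name, and the labeled edges are illustrative assumptions, not the source's exact procedure.

```python
from itertools import product


def other_allele(genotype, allele):
    """Allele left after removing one copy of `allele`, or None if absent."""
    remaining = list(genotype)
    if allele not in remaining:
        return None
    remaining.remove(allele)
    return remaining[0]


def component_assignments(vertices, edges):
    """All allele assignments for one non-singleton connected component."""
    adjacency = {v: [] for v in vertices}
    for u, w, genotype in edges:
        adjacency[u].append((w, genotype))
        adjacency[w].append((u, genotype))

    start = min(vertices)
    first_genotype = adjacency[start][0][1]
    results = []
    for candidate in sorted(set(first_genotype)):
        assignment, stack, ok = {start: candidate}, [start], True
        while stack and ok:
            u = stack.pop()
            for w, genotype in adjacency[u]:
                needed = other_allele(genotype, assignment[u])
                if needed is None or assignment.get(w, needed) != needed:
                    ok = False  # this starting allele contradicts an edge
                    break
                if w not in assignment:
                    assignment[w] = needed
                    stack.append(w)
        if ok:
            results.append(assignment)
    return results


# Hypothetical labeled components.
A2 = component_assignments([1, 3, 5], [(1, 3, ("a", "b")), (3, 5, ("a", "b"))])
A3 = component_assignments(
    [4, 6, 7, 8],
    [(4, 6, ("a", "b")), (6, 7, ("b", "d")), (6, 8, ("a", "b"))],
)
joint = list(product(A2, A3))
print(len(A2), len(A3), len(joint))  # 2 1 2
```

If any component returns an empty list, the product is empty as well, which corresponds to the zero-probability case above.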
21 Computing $\Pr(m \mid v)$
- The probability of singleton components is 1, so we can ignore them.
- Let $a_{hi}$ be an element of $A_i$ (a vector of alleles assigned to the vertices of component $C_i$).
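The formula itself is not reproduced in the text here. A sketch consistent with the independence of components stated earlier is to sum, within each component, the probabilities of its compatible assignments $a_{hi}$, and to multiply these sums across components, where the probability of one assignment is the product of the population frequencies of the alleles it places on the vertices of $C_i$. The function name, the allele frequencies, and the assignment sets below are hypothetical.

```python
from math import prod


def prob_markers_given_v(assignment_sets, freq):
    """Sketch of Pr(m | v) as a product over non-singleton components.

    assignment_sets: list of A_i; each A_i is a list of assignments,
        and each assignment is a dict {founder_allele_vertex: allele}.
    freq: {allele: population frequency} (assumed known).
    Singleton components contribute a factor of 1 and are left out.
    """
    total = 1.0
    for assignments in assignment_sets:
        # Sum over the 0, 1, or 2 compatible assignments of this component.
        total *= sum(
            prod(freq[allele] for allele in a.values()) for a in assignments
        )
    return total


# Hypothetical allele frequencies and assignment sets.
freq = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
A2 = [{1: "a", 3: "b", 5: "a"}, {1: "b", 3: "a", 5: "b"}]
A3 = [{4: "a", 6: "b", 7: "d", 8: "a"}]

p = prob_markers_given_v([A2, A3], freq)
print(p)  # approximately 4.032e-4: (0.048 + 0.036) * 0.0048
```

An empty $A_i$ makes its sum zero, so the product is zero, in agreement with the rule that an empty set of assignments gives probability 0.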
22 Computing $\Pr(m \mid v)$: Complexity