Reconstruction on trees and Phylogeny 2

Description:

We study the reconstruction problem for the Ising-CFN model on regular trees. 9/24/09 ... Finite set A of information values. Tree T=(V,E) rooted at r. ...

Slides: 18
Provided by: chris802

1
Reconstruction on trees and Phylogeny 2
Elchanan Mossel, U.C. Berkeley
mossel_at_stat.berkeley.edu, http://www.cs.berkeley.edu/mossel/
Supported by Microsoft Research and the Miller Institute
2
Reconstruction on Ising-CFN model
  • We study the reconstruction problem for the
    Ising-CFN model on regular trees.





3
Markov models on trees
  • Finite set $A$ of information values.
  • Tree $T = (V, E)$ rooted at $r$.
  • Each vertex $v \in V$ has information $\sigma_v \in A$.
  • Each edge $e = (v, u)$, where $v$ is the parent of $u$, has a
    mutation matrix $M^e$ of size $|A| \times |A|$:
  • $M^{(v,u)}_{i,j} = P[\sigma_u = j \mid \sigma_v = i]$.
  • For each character $\sigma$, we are given $\sigma_{\partial T} = (\sigma_v)_{v \in \partial T}$,
  • where $\partial T$ is the boundary of the tree.
  • We will focus on the Ising-CFN model.
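
The definition above can be turned into a sampler that walks from the root down and draws each child's value from the row of the mutation matrix indexed by its parent's value. This is only a minimal sketch: the function name, the `children` dictionary, and the `mutation` callback are hypothetical conventions chosen for illustration, and the values in $A$ are encoded as 0, ..., m-1.

```python
import random


def sample_tree_markov(children, root, root_dist, mutation, rng):
    """Sample sigma_v for every vertex of a rooted tree.

    children:  dict mapping a vertex to the list of its children (assumed layout)
    root_dist: probabilities of the root value over A = {0, ..., m-1}
    mutation:  function (v, u) -> row-stochastic m x m matrix M^{(v,u)}
    rng:       a random.Random instance
    """
    m = len(root_dist)
    sigma = {root: rng.choices(range(m), weights=root_dist)[0]}
    stack = [root]
    while stack:
        v = stack.pop()
        for u in children.get(v, []):
            # Row sigma[v] of M^{(v,u)} is the law of sigma_u given sigma_v.
            row = mutation(v, u)[sigma[v]]
            sigma[u] = rng.choices(range(m), weights=row)[0]
            stack.append(u)
    return sigma
```

The boundary data $\sigma_{\partial T}$ is then the restriction of the returned dictionary to the leaves.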

4
Statistical physics
  • Statistical physics is a sub-field of
    mathematical physics where we study complex
    systems with simple microscopic interactions.
  • The Ising model on a graph is a probability
    measure (Gibbs distribution) on the space of
    configurations $\sigma$ from the vertices to $\{-1, 1\}$ such
    that
  • $P[\sigma] \propto \exp\big(\sum_{(v,w) \in E} \sigma(v)\sigma(w) / T\big)$.
  • Traditionally studied on cubes in $\mathbb{Z}^d$.
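
For a very small graph the Gibbs distribution can be written out by brute force, which makes the normalization explicit. The sketch below assumes vertices are labeled 0, ..., num_vertices-1 and edges are given as pairs; the function name is hypothetical, and the enumeration is exponential in the number of vertices, so it is meant for illustration only.

```python
import itertools
import math


def ising_gibbs(num_vertices, edges, temperature):
    """Return {configuration: probability} for the Ising model on a small graph."""
    weights = {}
    for sigma in itertools.product((-1, 1), repeat=num_vertices):
        # Sum of sigma(v) * sigma(w) over the edges (v, w).
        interaction = sum(sigma[v] * sigma[w] for v, w in edges)
        weights[sigma] = math.exp(interaction / temperature)
    z = sum(weights.values())  # normalizing constant
    return {sigma: w / z for sigma, w in weights.items()}
```

On a single edge, the two aligned configurations each receive weight $e^{1/T}$ and the two misaligned ones $e^{-1/T}$, so lower temperature favors agreement.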

[Figure: the Ising model on a 200 x 200 grid]
5
Statistical physics on trees
  • The Ising model on the binary tree can be
    defined as follows:
  • Set $\sigma_r$, the root spin, to be $+/-$ with probability
    ½ each.
  • For all pairs of (parent, child) $(v, w)$, set $\sigma_w = \sigma_v$
    with probability $\theta$; otherwise $\sigma_w = +/-$
    with probability ½ each.
  • This is exactly the CFN model.
  • Studied in statistical physics: Spitzer 75,
    Higuchi 77, Bleher-Ruiz-Zagrebnov 95,
    Evans-Kenyon-Peres-Schulman 2000, Ioffe 99, M 98,
    Haggstrom-M 2000, Kenyon-M-Peres 2001,
    Martinelli-Sinclair-Weitz 2003, Martine 2003
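
The copy-or-resample rule above can be sketched as a level-by-level sampler on the $b$-ary tree. The function name and the list-of-levels layout are assumptions made for illustration; `theta` is the copy probability, so a child differs from its parent with probability (1 - theta)/2.

```python
import random


def sample_cfn_levels(b, n, theta, rng):
    """Sample spins on the first n levels of the b-ary tree.

    Returns a list of levels; levels[k] holds the b**k spins at depth k,
    with the children of a vertex stored contiguously.
    """
    levels = [[rng.choice((1, -1))]]  # uniform root spin
    for _ in range(n):
        next_level = []
        for spin in levels[-1]:
            for _ in range(b):
                if rng.random() < theta:
                    next_level.append(spin)                 # copy the parent
                else:
                    next_level.append(rng.choice((1, -1)))  # fresh uniform spin
        levels.append(next_level)
    return levels
```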





6
Reconstruction solvability
  • Let $T$ be an infinite rooted tree and let $T_n$ denote
    the first $n$ levels of $T$.
  • We say that the reconstruction problem is
    solvable if one of the following equivalent
    conditions holds:
  • $\exists \pi$ s.t. ($\forall$ non-degenerate $\pi$) $\lim_{n \to \infty} I(X_0, X_n) > 0$,
    where $I(X_0, X_n) = H(X_0) + H(X_n) - H(X_0, X_n)$ and $H$
    is the entropy operator, $H(X) = -\sum_x P[X = x] \log_2 P[X = x]$.
  • $\exists i, j$ s.t. $\lim_{n \to \infty} |P_n^i - P_n^j| > 0$, where $P_n^j$
    denotes the distribution of $X_n$ conditional on $X_0 = j$.
  • If $X_0$ has the uniform distribution, then
    $\liminf_{n \to \infty} \Delta_n > 1/m$, where $\Delta_n$ is the
    probability of correct reconstruction of $X_0$ given $X_n$.
  • $\exists \pi$ ($\forall$ non-degenerate $\pi$) $\liminf_{n \to \infty} \mathrm{Var}[E[X_0 \mid X_n]] > 0$.
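
The mutual-information condition can be evaluated directly from a joint distribution table. The sketch below uses hypothetical helper names; as a sanity check it also builds the joint law of $(X_0, X_n)$ for the CFN model on a single path (the case $b = 1$), assuming a copy probability `theta` per edge so that the correlation after $n$ edges is `theta**n`.

```python
import math


def entropy(probabilities):
    """H(X) = -sum_x P[X = x] log2 P[X = x], skipping zero terms."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def mutual_information(joint):
    """I(X0, Xn) = H(X0) + H(Xn) - H(X0, Xn) for joint[x0][xn]."""
    p_x0 = [sum(row) for row in joint]
    p_xn = [sum(column) for column in zip(*joint)]
    p_pair = [p for row in joint for p in row]
    return entropy(p_x0) + entropy(p_xn) - entropy(p_pair)


def cfn_path_joint(theta, n):
    """Joint law of the two endpoints of a path of n CFN edges, uniform X0."""
    t = theta ** n
    return [[(1 + t) / 4, (1 - t) / 4],
            [(1 - t) / 4, (1 + t) / 4]]
```

On a path the mutual information decays to zero as $n$ grows for any `theta` below 1, which is why branching is needed for reconstruction to be possible.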

7
The Ising model on the 3-regular tree
mutual information: $H(\sigma_\partial) + H(\sigma_r) - H(\sigma_r, \sigma_\partial)$
8
Reconstruction for the CFN model
  • Thm: The reconstruction problem for the Ising
    model on the $(b+1)$-regular tree is solvable if
    and only if $b\theta^2 > 1$.
  • Easy direction: Higuchi 77 proved that a
    certain reconstruction algorithm works when $b\theta^2 > 1$.
  • Higuchi's argument extends to general chains and
    general trees.
  • Will also show an argument from M98 useful for
    phylogeny.
  • Hard direction (95): non-reconstruction.
  • 6 different proofs!
  • All involve some "magic".
  • None extends to other Markov models.
  • Will follow a coupling proof (Martinelli-Sinclair-Weitz).
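
As a quick numerical reading of the theorem, the threshold sits at $\theta = 1/\sqrt{b}$. The helper names below are hypothetical and the code only restates the inequality.

```python
import math


def critical_theta(b):
    """Value of theta at which b * theta**2 equals 1."""
    return 1.0 / math.sqrt(b)


def is_solvable(b, theta):
    """True when b * theta**2 > 1, the solvability condition of the theorem."""
    return b * theta ** 2 > 1
```

For the 3-regular tree ($b = 2$) the critical value is $1/\sqrt{2}$, roughly 0.707.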

9
Non-reconstruction - Coupling down
  • Copying rule. For $i = +, -$:
  • $P[i \to i] = \theta$.
  • $P[i \to \text{Uniform}] = 1 - \theta$.
  • Continuing down the tree, non-coupled elements
    form a branching process with parameter $\theta$.
  • If $b\theta \le 1$, the branching process dies $\Rightarrow$ coupling.
  • More generally, at level $n$, the expected number
    of uncoupled sites is $b^n \theta^n$.
  • (Doesn't work all the way to $b\theta^2 = 1$.)
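
The branching process of uncoupled sites can be simulated directly: each uncoupled vertex has $b$ children, and each child stays uncoupled exactly when it copies its parent, which happens with probability `theta`. This is a sketch with a hypothetical function name; averaging the returned counts over many runs should approach $(b\theta)^n$ at level $n$.

```python
import random


def uncoupled_counts(b, n, theta, rng):
    """Number of uncoupled sites at levels 0..n under the down coupling."""
    counts = [1]  # the two roots disagree by construction
    for _ in range(n):
        candidates = counts[-1] * b
        # A child remains uncoupled only if it copies its (uncoupled) parent.
        survivors = sum(1 for _ in range(candidates) if rng.random() < theta)
        counts.append(survivors)
    return counts
```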

10
Non-reconstruction - Coupling up
  • We try to couple two configurations which differ
    at level $n$ so that they agree at the root.
  • First consider the case where they differ at
    exactly one site.
  • Lemma (Mossel-Kenyon-Peres): Among all boundary
    conditions $\eta$, $P_\eta[\sigma_u = 1 \mid \sigma_v = 1] - P_\eta[\sigma_u = 1 \mid \sigma_v = -1]$
    is maximized for the free boundary.
  • $\Rightarrow P[\text{not coupling at } u] \le \theta$.
  • $\Rightarrow P[\text{not coupling at the root}] \le \theta^n$.

11
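Coupling up: path coupling
  • We got that if $\eta$ and $\tau$ are two boundary
    conditions which differ in one position at level
    $n$, then
  • $|E_\eta[\sigma(\rho)] - E_\tau[\sigma(\rho)]| \le 2\theta^n$, where $\rho$ is the root.
  • $\Rightarrow$ if $\eta$ and $\tau$ are two boundary conditions which
    differ at $k$ sites, then
  • $|E_\eta[\sigma(\rho)] - E_\tau[\sigma(\rho)]| \le 2k\theta^n$.
  • Pf: If $\eta$ and $\tau$ differ at $k$ sites, then we can
    find a sequence $\eta = \eta(0), \eta(1), \ldots, \eta(k) = \tau$ such
    that $\eta(i)$ and $\eta(i+1)$ differ in exactly one site.
  • $|E_\eta[\sigma(\rho)] - E_\tau[\sigma(\rho)]|$
  • $\le \sum_{i=1}^{k} |E_{\eta(i)}[\sigma(\rho)] - E_{\eta(i-1)}[\sigma(\rho)]| \le 2k\theta^n$.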
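
The interpolating sequence used in the proof can be built by flipping the disagreeing sites one at a time. The sketch below represents a boundary condition as a list of spins indexed by the sites at level $n$; the function name is hypothetical.

```python
def interpolation_path(eta, tau):
    """Boundary conditions eta = eta(0), ..., eta(k) = tau, where consecutive
    entries differ at exactly one site and k is the number of disagreements."""
    path = [list(eta)]
    for site, (a, b) in enumerate(zip(eta, tau)):
        if a != b:
            step = list(path[-1])
            step[site] = b  # fix one disagreeing site per step
            path.append(step)
    return path
```

Applying the single-site bound to each consecutive pair and summing gives the factor $k$ in the bound.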

12
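Non-reconstruction for $b\theta^2 < 1$
  • Fix $\theta$ such that $b\theta^2 < 1$.
  • We will show that $|E_+[E[\sigma(\rho) \mid \sigma^+]] - E_-[E[\sigma(\rho) \mid \sigma^-]]| \to 0$,
  • where $\sigma^\pm$ denotes the boundary conditions conditioned on
    $\sigma(\rho) = \pm$.
  • Let $(\sigma^+, \sigma^-)$ be given by the down coupling.
  • Let $K(\sigma^+, \sigma^-)$ = number of disagreements between
    $\sigma^+, \sigma^-$.
  • $|E_+[E[\sigma(\rho) \mid \sigma^+]] - E_-[E[\sigma(\rho) \mid \sigma^-]]|$
  • $= |E_{+,-}[E[\sigma(\rho) \mid \sigma^+] - E[\sigma(\rho) \mid \sigma^-]]|$
  • $\le E_{+,-}[2K(\sigma^+, \sigma^-)\theta^n] = 2\theta^n E_{+,-}[K(\sigma^+, \sigma^-)]$ (up
    coupling)
  • $= 2\theta^n b^n \theta^n$ (down coupling)
  • $= 2(b\theta^2)^n \to 0$ exp. fast in $n$.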
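
To see how fast the final bound decays, it can be tabulated for concrete parameters. The helper below is a hypothetical name and simply evaluates $2(b\theta^2)^n$.

```python
def nonreconstruction_bound(b, theta, n):
    """Upper bound 2 * (b * theta**2)**n on the root bias at depth n."""
    return 2 * (b * theta ** 2) ** n


if __name__ == "__main__":
    # b = 2 and theta = 0.5 give b * theta**2 = 0.5, so the bound halves per level.
    for depth in (1, 5, 10, 20):
        print(depth, nonreconstruction_bound(2, 0.5, depth))
```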

13
Where we stopped
  • Thm: The reconstruction problem for the Ising
    model on the $(b+1)$-regular tree is solvable if
    and only if $b\theta^2 > 1$.
  • We showed that if $b\theta^2 < 1$, it is impossible to
    reconstruct (hard direction).
  • We now show that if $b\theta^2 > 1$, we can reconstruct.

14
Reconstruction via majority
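  • Fix $\theta$ such that $b\theta^2 > 1$.
  • Let $X = X_n = \#(+) - \#(-)$ at level $n$.
  • We claim that $X_n$ is a good estimator of $\sigma(\rho)$.
  • $E_+[X_n] = b^n\theta^n$, $E_-[X_n] = -b^n\theta^n$.
  • We show that $E_\pm[X_n^2] \le c\,(E_\pm[X_n])^2 = c\,b^{2n}\theta^{2n}$.
  • Let $f = f_n$ ($g = g_n$) be the density of the $+$ ($-$)
    measure with respect to some reference measure $\mu$.
  • $2b^n\theta^n = E_+[X] - E_-[X] = \int X (f - g)\, d\mu$
  • $= \int X (f^{1/2} - g^{1/2})(f^{1/2} + g^{1/2})\, d\mu$
  • $\le \big(\int X^2 (f^{1/2} + g^{1/2})^2\, d\mu\big)^{1/2} \big(\int (f^{1/2} - g^{1/2})^2\, d\mu\big)^{1/2}$
  • $\le \big(4\int X^2 f\, d\mu + 4\int X^2 g\, d\mu\big)^{1/2} \big(\int |f - g|\, d\mu\big)^{1/2}$
  • $\le (8c\,b^{2n}\theta^{2n})^{1/2} (D_{TV}(+,-))^{1/2}$.
  • Squaring both sides gives $D_{TV}(+,-) \ge 1/(2c) > 0$ for all $n$.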
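
The majority estimator can be tried out by simulation. The sketch below samples level $n$ of the $b$-ary tree given the root spin, using the copy probability `theta`, and estimates the root by the sign of $X_n$; the function names and the random tie-break are assumptions made for illustration.

```python
import random


def level_sum(b, n, theta, root, rng):
    """Sample level n given the root spin and return X_n = #(+) - #(-)."""
    level = [root]
    for _ in range(n):
        next_level = []
        for spin in level:
            for _ in range(b):
                copied = rng.random() < theta
                next_level.append(spin if copied else rng.choice((1, -1)))
        level = next_level
    return sum(level)


def majority_estimate(x_n, rng):
    """Guess the root spin from the sign of X_n (ties broken at random)."""
    if x_n == 0:
        return rng.choice((1, -1))
    return 1 if x_n > 0 else -1
```

Averaging `level_sum` over many runs with root $+$ should be close to $b^n\theta^n$, in line with the first-moment identity.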

15
Bounds on the second moment
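  • Write $X_n = \sum_v \sigma(v)$, where the sum is over all $v$
    in level $n$.
  • $E[X_n^2] = \sum_{v,w} E[\sigma(v)\sigma(w)]$.
  • For each edge, with prob. $\theta$ the two endpoints are
    the same and with prob. $1-\theta$ the two endpoints are
    independent.
  • If there is a red edge on the path between $v$ and
    $w$, then $E[\sigma(v)\sigma(w)] = 0$.
  • Otherwise, $\sigma(v) = \sigma(w)$.
  • $E[\sigma(v)\sigma(w)] = \theta^{d(v,w)}$.
  • $E[X_n^2] = b^n\big(1 + \sum_{i=1}^{n} (b^i - b^{i-1})\theta^{2i}\big)$
  • $= b^n\big(1 + (b-1)\theta^2 \sum_{i=0}^{n-1} b^i\theta^{2i}\big)$
  • $= O(b^{2n}\theta^{2n})$ iff $b\theta^2 > 1$.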
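
The closed form can be checked against a direct sum of $\theta^{d(v,w)}$ over all pairs at level $n$. In this sketch the level-$n$ vertices of the $b$-ary tree are encoded as tuples of $n$ child indices (an assumed encoding), so $d(v,w)$ is twice the number of levels below the longest common prefix. The function names are hypothetical.

```python
import itertools


def second_moment_formula(b, n, theta):
    """b^n * (1 + sum_{i=1}^{n} (b^i - b^(i-1)) * theta^(2i))."""
    tail = sum((b ** i - b ** (i - 1)) * theta ** (2 * i) for i in range(1, n + 1))
    return b ** n * (1 + tail)


def second_moment_pairs(b, n, theta):
    """Sum of theta^d(v, w) over all ordered pairs of level-n vertices."""
    leaves = list(itertools.product(range(b), repeat=n))
    total = 0.0
    for v in leaves:
        for w in leaves:
            common = 0
            while common < n and v[common] == w[common]:
                common += 1
            total += theta ** (2 * (n - common))  # d(v, w) = 2 * (n - common)
    return total
```

For a fixed $v$ there are $b^i - b^{i-1}$ vertices $w$ at distance $2i$, which is where the coefficients in the sum come from.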

16
Remarks on the second moment
  • The Kamea/Higuchi argument is very robust.
  • Works for general trees when $br(T)\,\theta^2 > 1$.
  • Works for general Markov chains, where $\theta$ = 2nd
    eigenvalue of $M$ (M-Peres 2002).
  • Kesten-Stigum (1966!) proved that for all Markov
    chains:
  • If $b\theta^2 > 1$, then the limiting law of the count
    depends on the root.
  • If $b\theta^2 < 1$, then the limiting law is normal for
    all root values.
  • M-Peres (2002): count reconstruction is impossible
    if $b\theta^2 < 1$.

17
Recursive reconstruction for Ising models
  • An alternative proof for reconstruction for $b\theta^2 > 1$
    (M98).
  • Advantage: works also when we only have a lower bound on
    $\theta$. Majority doesn't.
  • Blue edges have $\theta_1$, black edges $\theta_2$, with $\theta_1 < \theta_2 = 1$.
  • $\mathrm{Maj}(\sigma_\partial)$ = Maj of black tree.
  • Maj of black tree = $\sigma_v$.
  • $\sigma_v$ and $\sigma_\rho$ have exp. small correlation.
  • Phylogeny reconstruction given bounds.
  • Instead we will use recursive-majority.
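
One way to picture recursive majority is to take the majority within each family of siblings and repeat level by level up to the root. The sketch below is an illustration only, not the construction from M98: it assumes an odd branching number $b$ (so there are no ties) and that the leaves are listed with siblings stored contiguously.

```python
def recursive_majority(leaves, b):
    """Level-by-level sibling majority on a complete b-ary tree (b odd).

    leaves: list of +1/-1 spins of length b**n, siblings contiguous.
    """
    level = list(leaves)
    while len(level) > 1:
        level = [1 if sum(level[i:i + b]) > 0 else -1
                 for i in range(0, len(level), b)]
    return level[0]


def plain_majority(leaves):
    """Sign of the sum of the leaves (assumes the sum is nonzero)."""
    return 1 if sum(leaves) > 0 else -1
```

The two estimators can disagree: on the leaves `[1, 1, -1, 1, 1, -1, -1, -1, -1]` with `b = 3`, the sibling majorities are `1, 1, -1`, so recursive majority returns `1`, while the plain sum is negative.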