Title: Model search in structural equation models with latent variables
1. Model search in structural equation models with latent variables
- Ricardo Silva
- Gatsby Computational Neuroscience Unit
- rbas_at_gatsby.ucl.ac.uk
2. Outline
- The inference problem
- Discovering structure in linear models
- Experiments
- Extensions
- Conclusion
3. Outline
- The inference problem
- Discovering structure in linear models
- Experiments
- Extensions
- Conclusion
4. An example
Country XYZ:
1. GNP per capita _____
2. Energy consumption per capita _____
3. Labor force in industry _____
4. Ratings on freedom of press _____
5. Freedom of political opposition _____
6. Fairness of elections _____
7. Effectiveness of legislature _____
- Task: learn a causal model
5. A latent variable model
[Figure: two latent variables, Economic stability and Political stability, measuring the seven observed indicators above]
- In many domains, the observed variables are measurements of a set of common factors
- Usually hidden and unknown
6. Tasks
- Learn which latent variables exist
- Learn which observed variables measure them
- Learn how latents are causally connected
- For continuous and ordinal data
- Theoretically consistent
7. Novelty
- "When we come to models for relationships between latent variables we have reached a point where so much has to be assumed that one might justly conclude that the limits of scientific usefulness have been reached if not exceeded." (Bartholomew and Knott, 1999)
- We show this pessimistic claim is unwarranted
8. Outline
- The inference problem
- Discovering structure in linear models
- Experiments
- Extensions
- Conclusion
9. Using parametric constraints
- Fact: given a graph with this structure, in which a single latent L is the common parent of the observed variables W, X, Y, Z:
  W = λ1·L + ε1,  X = λ2·L + ε2,  Y = λ3·L + ε3,  Z = λ4·L + ε4
- it follows that the tetrad constraints hold:
  σWX·σYZ = σWY·σXZ = σWZ·σXY
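As a quick numerical illustration (my own sketch, not part of the original slides): simulate a one-latent linear model and check that the three tetrad products agree in the sample covariance. The loadings and sample size below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# One latent L with four observed children W, X, Y, Z (illustrative loadings).
L = rng.normal(size=n)
lam = np.array([0.9, 0.7, 1.2, 0.5])          # lambda_1 .. lambda_4
data = L[:, None] * lam + rng.normal(size=(n, 4))   # columns: W, X, Y, Z

S = np.cov(data.T)
W, X, Y, Z = 0, 1, 2, 3

# The three tetrad products should agree up to sampling error.
print(S[W, X] * S[Y, Z])   # sigma_WX * sigma_YZ
print(S[W, Y] * S[X, Z])   # sigma_WY * sigma_XZ
print(S[W, Z] * S[X, Y])   # sigma_WZ * sigma_XY
```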
10. The inference problem
- Unobservable: the true model
- Observable: tetrad constraints
- Needed inference: observable → unobservable
  Sets of tetrad constraints + assumptions → sets of causal structures
11. Example of sound inference
- Given the observable condition
  ρ12·ρ34 = ρ13·ρ24 = ρ14·ρ23
- then X1, X2, X3, X4 are independent conditioned on some (possibly hidden) node
12. Clustering
- Learning when two observed variables do not share any hidden common parent
- Using tetrad constraints, it is possible to learn that X1 and Y1 do not have a common latent parent in the cases above
13. Putting things together
- Learning pure measurement models
- Discovery method: one-latent identification + clustering (a naive sketch of such a search follows the figure below)
[Figure: an impure measurement model over X1-X8, and the purified model over the subset X1, X2, X3, X5, X7, X8]
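To make the "one-latent identification + clustering" idea concrete, here is a deliberately naive Python sketch of my own (not the algorithm from the talk): it greedily grows disjoint clusters of variables whose every quartet passes the three-tetrad one-factor check, using a fixed tolerance instead of a proper statistical test and without any purification of impure indicators.

```python
import numpy as np
from itertools import combinations

def one_factor_quartet(S, quad, tol=0.05):
    # A quartet of (roughly standardized) variables is consistent with a single
    # common latent when all three tetrad differences are close to zero.
    i, j, k, l = quad
    t1 = S[i, j] * S[k, l] - S[i, k] * S[j, l]
    t2 = S[i, j] * S[k, l] - S[i, l] * S[j, k]
    t3 = S[i, k] * S[j, l] - S[i, l] * S[j, k]
    return max(abs(t1), abs(t2), abs(t3)) < tol

def naive_cluster_search(S, tol=0.05):
    # Greedy sketch: grow disjoint clusters in which every quartet passes the
    # one-factor tetrad check; remaining variables are simply left unclustered.
    unused = list(range(S.shape[0]))
    clusters = []
    while len(unused) >= 4:
        seed = next((q for q in combinations(unused, 4)
                     if one_factor_quartet(S, q, tol)), None)
        if seed is None:
            break
        cluster = list(seed)
        for v in [u for u in unused if u not in cluster]:
            if all(one_factor_quartet(S, q, tol)
                   for q in combinations(cluster + [v], 4) if v in q):
                cluster.append(v)
        clusters.append(cluster)
        unused = [u for u in unused if u not in cluster]
    return clusters
```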
14. The BuildPureClusters algorithm
- Goal: learn a pure measurement model of a subset of the latents
- Assumptions:
  - Linearity
  - The true model is acyclic
  - Causal independence ⇔ probabilistic independence
- Theoretically sound
  - In the limit, it returns an equivalence class that includes the correct model
15. The BuildPureClusters output
- Input: a covariance matrix
- Generates a partition P1, P2, ..., Pk of a subset of the observed variables (a clustering of variables)
- For each Pi:
  - Elements measure the same latent in the true model
  - Elements are conditionally independent given this latent
- Note: one node in P1 ∪ P2 ∪ ... ∪ Pk might not be a descendant of the respective latent
- Output: a pure measurement model based on this partition
16. Example of input/output
17. Full example
18. Discrete version
- Continuous linear models discretized as binary/ordinal variables (Bartholomew and Knott, 1999); a small simulation sketch follows the figure below
[Figure:
Latent variables: Job satisfaction, Professional status
Latent measures: Stress compared to previous job, Income, Education level, Time in job
Discretized observations: Education level (high school, college, grad), Income (low, medium, high), Stress (less, same, more), Years in job (less than five, more than five)]
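A minimal simulation sketch of this discretization idea (my own illustration; the loadings and cutpoints are made up): generate a continuous linear latent model, then cut each observed variable into ordered categories.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Continuous linear latent model (illustrative loadings).
latent = rng.normal(size=n)
income    = 1.0 * latent + rng.normal(size=n)
education = 0.8 * latent + rng.normal(size=n)
stress    = -0.6 * latent + rng.normal(size=n)
years     = 0.4 * latent + rng.normal(size=n)

def discretize(x, cutpoints):
    # Map a continuous variable to ordinal categories 0, 1, ..., len(cutpoints).
    return np.digitize(x, cutpoints)

income_obs    = discretize(income, [-0.5, 0.5])      # low / medium / high
education_obs = discretize(education, [-0.5, 0.5])   # high school / college / grad
stress_obs    = discretize(stress, [-0.5, 0.5])      # less / same / more
years_obs     = discretize(years, [0.0])             # < 5 years / >= 5 years
```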
19. Learning the structural model
- Pure measurement models provide a way to learn structural models (Spirtes et al., 2000)
- Once a proper measurement model is found, one can apply standard methods to learn the latent structure (a rough illustration follows)
  - E.g., PC search, GES, etc.
20. How a pure measurement model is useful
21. Outline
- The inference problem
- Discovering structure in linear models
- Experiments
- Extensions
- Conclusion
22. Simulation studies: comparisons
- Factor analysis (FA)
  - Standard tool
  - Basis for several other models
- Purified factor analysis (P-FA)
  - Take the FA output and eliminate nodes as in BuildPureClusters
  - Motivation: to make FA directly comparable to BuildPureClusters
23. [Figure: simulation setups combining measurement models MM1-MM3 with structural models SM1-SM3]
24. Criteria
[Figure: evaluation criteria illustrated by comparing true and estimated models over latents L1, L2: latent errors (mistakes of omission) and edge errors (mistakes of omission and commission)]
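One way such counts could be computed (a hypothetical helper of my own, not code from the paper): compare the estimated edge set against the true edge set and count omissions and commissions.

```python
def edge_errors(true_edges, estimated_edges):
    """Both arguments are sets of (parent, child) pairs.
    Omissions: true edges missing from the estimate.
    Commissions: estimated edges absent from the true model."""
    omissions = true_edges - estimated_edges
    commissions = estimated_edges - true_edges
    return len(omissions), len(commissions)

# Example with a hypothetical latent structure:
true_edges = {("L1", "L2")}
estimated_edges = {("L1", "L2"), ("L2", "L3")}
print(edge_errors(true_edges, estimated_edges))   # (0, 1)
```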
25. Real data: test anxiety
- Sample: 315 students from British Columbia
- Data available at http://multilevel.ioe.ac.uk/team/aimdss.html
- Goal: identify the psychological factors of test anxiety in students
- Examples of indicators: "feel lack of confidence", "jittery", "heart beating", "forget facts", etc. (20 in total)
- Details: Bartholomew and Knott (1999)
26. Simulation studies: some results
27. Simulation studies: some results
28. Simulation studies: some results
29. Simulation studies: some results
30. Theoretical model
- Two factors, originally not pure
- When simplified as a pure model, the fit p-value is zero
31. Our output
32. Outline
- The inference problem
- Discovering structure in linear models
- Experiments
- Extensions
- Conclusion
33. Unveiling more information
- Limitations of pure models:
  - There is no pure model with all three latents and three indicators per latent
[Figure: a model with latents L1, L2, L3 and indicators X1-X10 for which no pure submodel keeps all three latents with three indicators each]
34. From pure to impure models: principles
- The pairwise discovery principle:
  - No global three-indicator pure model might exist, but pairwise pure models might
[Figure: three pairwise pure models, one for each pair of latents (L1, L3), (L1, L2), (L2, L3), each over its own subset of X1-X10]
35. From pairwise separability to impure models
- If the true model is pairwise-separable:
  - Use tetrad constraints to find sextets (X1, X2, X3, Y1, Y2, Y3) that are (X1, X2, X3) x (Y1, Y2, Y3) separable (a naive screening sketch follows this list)
  - Each triplet will correspond to a latent
  - Triplets might be overlapping
  - How to decide which impurities?
- When the true model is not pairwise-separable, target the largest separable submodel
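As an informal screen of my own construction (not the paper's criterion), assuming a pure linear model in which each triplet has its own latent and the two latents may be correlated: quartets with three variables from one side then satisfy all three tetrad constraints, and 2+2 quartets satisfy the "cross" constraint. Checking these necessary conditions gives a naive filter for candidate sextets.

```python
from itertools import combinations

def sextet_screen(S, left, right, tol=0.05):
    """Necessary (not sufficient) tetrad checks for one latent separating two
    triplets of indicators in a pure linear model.
    S: covariance matrix; left, right: lists of 3 column indices each."""
    def tetrad(a, b, c, d):
        return abs(S[a, b] * S[c, d] - S[a, c] * S[b, d]) < tol

    # 3 + 1 quartets: all three tetrad constraints should hold.
    for trio, other in ((left, right), (right, left)):
        i, j, k = trio
        for y in other:
            if not (tetrad(i, j, k, y) and tetrad(i, j, y, k) and tetrad(i, k, y, j)):
                return False

    # 2 + 2 quartets: the "cross" constraint should hold.
    for (i, k) in combinations(left, 2):
        for (j, l) in combinations(right, 2):
            if abs(S[i, j] * S[k, l] - S[i, l] * S[k, j]) > tol:
                return False
    return True
```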
36. Introducing impurities
- We know that (X4, X5, X6) x (X8, X9, X10) are separable
- A single latent separates (X4, X5, X6)
- A single latent separates (X8, X9, X10)
- These two latents are not the same
[Figure: latent T2 with children X4, X5, X6; latent T3 with children X8, X9, X10]
37. Introducing impurities
- But we also know that (X7, X8, X9, X10) are separated by a single latent
- This latent still has to be T3
[Figure: T2 with children X4, X5, X6 and T3 with children X7, X8, X9, X10; question marks indicate edges that are not yet determined]
38. Introducing impurities
- We also know that (X4, X7, X8, X9, X10) are separated by a single latent
- This latent has to be T3
[Figure: T2 with children X4, X5, X6 and T3 with children X7, X8, X9, X10; a question mark marks one edge that is still undetermined]
39. Introducing impurities
- We know that X6 cannot be a parent of X7
- Because marginalizing X6 would imply:
[Figure: the candidate model with an X6 → X7 edge, and the graph over T2, T3, X4, X5, X7, X8, X9, X10 implied by marginalizing X6]
40. Introducing impurities
- For the same reason, X7 cannot be a parent of X6
- Only possibility: an impurity (bi-directed edge) between X6 and X7
- That is, we preserve latent T2: there is no need for a three-indicator pure model
[Figure: T2 with children X4, X5, X6; T3 with children X7, X8, X9, X10; a bi-directed edge connects X6 and X7]
41. Testing
- As before, we could use individual tetrad constraints to identify such a relation (a simple testing sketch follows this list)
- However, there is an obvious loss of power
  - Adjusting for multiple hypothesis tests
- Alternative: fit the whole model
  - How to fit models with bi-directed edges?
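For completeness, here is one simple way individual tetrad constraints could be tested (a percentile-bootstrap sketch of my own; the paper and the Tetrad software rely on dedicated tetrad tests such as Wishart's test rather than this):

```python
import numpy as np

def tetrad_diffs(S, i, j, k, l):
    # The three tetrad differences for a quartet of variables.
    return np.array([S[i, j] * S[k, l] - S[i, k] * S[j, l],
                     S[i, j] * S[k, l] - S[i, l] * S[j, k],
                     S[i, k] * S[j, l] - S[i, l] * S[j, k]])

def bootstrap_tetrad_test(X, quad, n_boot=2000, seed=0):
    """Percentile-bootstrap 95% intervals for the three tetrad differences of the
    columns in `quad`; a constraint is rejected when its interval excludes zero."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    boots = np.array([tetrad_diffs(np.cov(X[rng.integers(0, n, n)].T), *quad)
                      for _ in range(n_boot)])
    lo, hi = np.percentile(boots, [2.5, 97.5], axis=0)
    return [(l, h, not (l <= 0.0 <= h)) for l, h in zip(lo, hi)]
```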
42. Maximum likelihood and Bayesian model selection
- Drton and Richardson (2004) describe maximum likelihood estimators for Gaussian models with directed and bi-directed edges
- Combined with BIC, this gives a model selection procedure (the score itself is sketched below)
- Silva and Ghahramani (2006) describe Monte Carlo methods for computing marginal likelihoods
- Such model selection procedures can also be used to remove or add bi-directed edges in partially specified models
43. Weaker pairwise separability
- What if this is the true model?
- There is no three-indicator pairwise separability, but we can still tell that:
  - Some single latent separates X1, X2, X3 and some single latent separates X4, X5, X6
  - X1, X2 and X5, X6 do not share a common parent
- Consequence: the model is again identifiable
[Figure: the true model, with latents L1, L2 and indicators X1-X6]
44. Other types of impurities
- No pairwise separability at all
- Can it be distinguished from this alternative?
[Figure: a two-latent model (L2, L3) over X4-X9 with no pairwise separability, and a single-latent model L over the same variables]
45. Other types of impurities: directed edges
- Suppose this is the true model
- By conditioning on X6:
[Figure: the true model with latents L2, L3 and indicators X3-X10, and the model obtained after conditioning on X6]
46. Other types of impurities: directed edges
- However, the general case is not solved yet
- Structural equation models are not closed under conditioning
- Needed: a general graphical characterization of conditional tetrad constraints
- This remains an open problem
47. Outline
- The inference problem
- Discovering structure in linear models
- Experiments
- Extensions
- Conclusion
48. Conclusion
- It is possible to learn latent variable models from data when the models are identifiable
- Algorithms and implementation in Tetrad: http://www.phil.cmu.edu/projects/tetrad/
- Future work:
  - Allowing for prior knowledge of structures
  - Better treatment of ordinal data
  - Implementation of more generic equivalence classes that allow for impurities
49. Acknowledgements
- Thanks to Richard Scheines, Clark Glymour, Peter Spirtes and Joseph Ramsey
50. References
- Bartholomew, D. and Knott (1999). Latent Variable Models and Factor Analysis.
- Silva, R., Scheines, R., Glymour, C. and Spirtes, P. (2005). Learning the structure of linear latent variable models. Journal of Machine Learning Research.
- Silva, R. and Ghahramani, Z. (2006). Bayesian inference for Gaussian mixed graph models. Uncertainty in Artificial Intelligence.