Title: Probabilistic Entity-Relationship Models, PRMs, and Plate Models
1. Probabilistic Entity-Relationship Models, PRMs, and Plate Models
- David Heckerman, Chris Meek, and Daphne Koller
- Slides from SRL 2004 talk
2. History/Motivation
- Began with plates (statistics) and PRMs (machine learning)
- Found it important to distinguish between entities and relationships
- Discovered the ER model (e.g., Ullman and Widom, Ch. 2)
- Created a probabilistic version of the ER model: the PER model
- The PER model is more expressive than the plate model or PRM and helps to show their connections
- The PER model provides a strong link to the database community by virtue of being built on top of the ER model
3. Outline
- Entity-Relationship (ER) Model
- Probabilistic Entity-Relationship (PER) Model
- Connections to plate model, PRM
- Modeling issues
4. ER Model
- An abstract representation of data
- The creation of an ER model is often the first step in the process of constructing a relational database.
- Often constructed before any data has arrived (much like we construct models before collecting data).
5. ER Model -- Example
- A university database maintains records on students and their IQs, courses and their difficulty, and the courses taken by students and the grades they receive.
[Figure labels: entity classes, attribute classes, relationship class]
- Course entities: CS107, Stats10, ...
- Student entities: John, Mary, ...
- Takes relations: (John, CS107), ...
- Attributes: John.IQ, CS107.Diff, ...
6. ER Model generates attributes
[Figure: ER model and skeleton -> attributes]
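A minimal Python sketch of how an ER model and a skeleton together generate attribute instances, using the university example. The dictionary layout and the function name `ground_attributes` are assumptions made for illustration, as are the placement of Grade on the Takes relationship class and the extra Takes pairs beyond (John, CS107).

```python
# Hypothetical encoding of the university ER model (layout is illustrative).
er_model = {
    "Student": ["IQ"],    # entity class -> attribute classes
    "Course": ["Diff"],
    "Takes": ["Grade"],   # relationship class -> attribute classes (assumed)
}

# A skeleton lists the entities and relationships that actually exist.
# Only (John, CS107) comes from the slides; the other pairs are made up.
skeleton = {
    "Student": ["John", "Mary"],
    "Course": ["CS107", "Stats10"],
    "Takes": [("John", "CS107"), ("Mary", "CS107"), ("Mary", "Stats10")],
}

def ground_attributes(er_model, skeleton):
    """Return one attribute instance per (instance, attribute class) pair."""
    attributes = []
    for cls, attr_classes in er_model.items():
        for instance in skeleton[cls]:
            for attr in attr_classes:
                attributes.append((cls, instance, attr))
    return attributes

attributes = ground_attributes(er_model, skeleton)
```

Each tuple stands for one attribute such as John.IQ or CS107.Diff; changing the skeleton changes the set of attributes without touching the ER model.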
7. PER Model -- Example
- Continuing the university database example, a student's grade in a course depends both on the student's IQ and on the difficulty of the course.
[Figure label: arc classes]
Not shown: local distribution class for Grade
8. PER Model generates Bayes net
[Figure: PER model and skeleton -> attributes]
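As a rough illustration of this expansion, the sketch below grounds the two arc classes of the university example (IQ to Grade, Diff to Grade) over a small skeleton. The node-naming scheme, the function name `ground_bayes_net`, and the Takes pairs other than (John, CS107) are assumptions, not part of the talk.

```python
# Hypothetical skeleton: which student takes which course.
takes = [("John", "CS107"), ("Mary", "CS107"), ("Mary", "Stats10")]

def ground_bayes_net(takes):
    """Expand the arc classes IQ -> Grade and Diff -> Grade over the skeleton."""
    arcs = []
    for student, course in takes:
        grade = f"Grade({student},{course})"
        arcs.append((f"{student}.IQ", grade))    # arc class Student.IQ -> Takes.Grade
        arcs.append((f"{course}.Diff", grade))   # arc class Course.Diff -> Takes.Grade
    return arcs

arcs = ground_bayes_net(takes)
```

Every Takes relationship contributes one Grade node with exactly two parents, so the resulting directed graph has two arcs per relationship.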
9. Constraints on arc classes
[Figure: ER model and skeleton -> attributes]
10. More on constraints
A database contains diseases and symptoms for a
given patient. Both diseases and symptoms have
labels from a common set of categories (e.g.,
cardiovascular, neuro, urinary). The possible
causes of a symptom are diseases that have at
least one category in common with that symptom.
11. More on constraints
A constraint on the arc class from X.A to Y.B in
a PER model is any first-order expression
involving entities and relationship classes in
the PER model such that the expression is bound
when the tail and head entities are taken to be
constants. To determine whether to draw an arc
from x.A to y.B, we evaluate the first-order
expression using the tail and head entities of
the putative arc. (It must evaluate to true or
false.) We draw the arc from x.A to y.B only if
the expression is true.
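The disease/symptom example can be sketched as a constraint check in Python. The constraint is "there exists a category shared by the disease and the symptom"; the specific diseases, symptoms, and category assignments below are invented purely for illustration, and the function names are hypothetical.

```python
# Invented skeleton data: category labels for each disease and symptom.
disease_categories = {
    "hypertension": {"cardiovascular"},
    "stroke": {"cardiovascular", "neuro"},
    "cystitis": {"urinary"},
}
symptom_categories = {
    "chest_pain": {"cardiovascular"},
    "headache": {"neuro"},
}

def shares_category(disease, symptom):
    """Constraint: true iff the two entities have at least one category in common."""
    return bool(disease_categories[disease] & symptom_categories[symptom])

def constrained_arcs():
    """Draw an arc disease -> symptom only where the constraint evaluates to true."""
    return [
        (d, s)
        for d in disease_categories
        for s in symptom_categories
        if shares_category(d, s)
    ]

arcs = constrained_arcs()
```

The constraint is evaluated once per candidate (tail, head) pair with both entities fixed, and always yields true or false.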
12. Local distribution classes
- E.g., noisy-OR
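For reference, a minimal sketch of the standard noisy-OR distribution, which works for any number of parents and so suits ground nodes whose parent count varies with the skeleton. The function name and the optional `leak` parameter are choices made here, not details from the talk.

```python
def noisy_or(active_cause_probs, leak=0.0):
    """P(effect = 1) under noisy-OR.

    active_cause_probs: for each parent that is currently 'on', the
    probability that this parent alone would trigger the effect.
    leak: probability that the effect occurs with no active parent.
    """
    prob_no_effect = 1.0 - leak
    for p in active_cause_probs:
        prob_no_effect *= 1.0 - p   # each active cause fails independently
    return 1.0 - prob_no_effect

p_two_causes = noisy_or([0.8, 0.5])
p_no_causes = noisy_or([])
```

With two active causes of strength 0.8 and 0.5 and no leak, the effect is absent only if both fail, which happens with probability 0.2 × 0.5.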
13. Caveat
- Typically, a PER model is not based on the ER model of a database
14. PER model, plate model, PRM
[Figure: the same model drawn as a PER model, a plate model, and a PRM]
15. Modeling issues
- Restricted relationships
- Self relationships
- Probabilistic relationships
16. Restricted relationship: Example
Hierarchical model: A binary outcome O is measured on patients in multiple hospitals. Each patient is treated in exactly one hospital. It is believed that outcomes in any given hospital h are i.i.d. given binomial parameter h.θ and that these binomial parameters are themselves i.i.d. across hospitals given hyperparameters α.
[Figure: PER model with entity classes Hospital (attribute θ) and Patient (attribute O), relationship class In, and hyperparameters α; ground Bayes net with nodes α, h1.θ, ..., hm.θ, p11.O, ..., pm1.O]
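A small generative sketch of this hierarchical model, using only the Python standard library. The slides do not say which distribution the hyperparameters index; a Beta prior over each hospital's binomial parameter is assumed here, and the function name, default values, and hospital names are hypothetical.

```python
import random

def sample_hierarchical(patients_per_hospital, alpha=(2.0, 2.0), seed=0):
    """Sample binary outcomes for patients grouped by hospital.

    Assumption: each hospital's binomial parameter is drawn i.i.d. from
    Beta(alpha[0], alpha[1]); outcomes are i.i.d. given that parameter.
    """
    rng = random.Random(seed)
    outcomes = {}
    for hospital, n_patients in patients_per_hospital.items():
        theta = rng.betavariate(*alpha)   # per-hospital binomial parameter
        outcomes[hospital] = [int(rng.random() < theta) for _ in range(n_patients)]
    return outcomes

outcomes = sample_hierarchical({"h1": 3, "h2": 5})
```

Because each patient belongs to exactly one hospital, every outcome node has a single binomial-parameter parent, which is what makes the In relationship "restricted".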
17. Restricted, Self, and Uncertain Relationship: Example
- A student's grade in a course depends on whether an advisor of the student is a friend of a teacher of the course.
[Figure labels: Professor; Course (Diff); Student (IQ); Friend; Teaches; Takes; Grade; Advises; Full; F(p,pf)]
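One possible reading of this example, sketched in Python: F(p,pf) is treated as an uncertain attribute on pairs of professors, and the parents of a grade are the F nodes linking an advisor of the student to a teacher of the course. This interpretation, the professor names, and the function name `grade_parents` are all assumptions for illustration.

```python
# Invented skeleton: who advises whom and who teaches what.
advises = [("Smith", "John"), ("Lee", "John")]   # (professor, student)
teaches = [("Jones", "CS107")]                   # (professor, course)

def grade_parents(student, course):
    """Return the F(p,pf) nodes that are parents of Grade(student, course)."""
    advisors = [p for p, s in advises if s == student]
    teachers = [pf for pf, c in teaches if c == course]
    # Constraint: p advises the student and pf teaches the course.
    return [f"F({p},{pf})" for p in advisors for pf in teachers]

parents = grade_parents("John", "CS107")
```

The number of parents depends on how many advisors and teachers the skeleton contains, so the local distribution for Grade must handle a variable number of inputs.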
18. In the paper (Google -> Heckerman -> Papers)
- Formal definitions and theorems
- Precise differences between PER models, plate models, and PRMs
- Undirected PER models
- PER models for asymmetric independence
- Many more examples