1
Dealing With Uncertainty P(X|E)
  • Probability theory
  • The foundation of Statistics
  • Chapter 13

2
History
  • Games of chance: 300 BC
  • 1565: first formalizations
  • 1654: Fermat and Pascal, conditional probability
  • Reverend Bayes, 1750s
  • 1933: Kolmogorov's axiomatic approach
  • Objectivists vs subjectivists
  • (frequentists vs Bayesians)
  • Frequentists build one model
  • Bayesians use all possible models, with priors

3
Concerns
  • Future: what is the likelihood that a student
    will earn a PhD?
  • Current: what is the likelihood that a person has
    cancer?
  • What is the most likely diagnosis?
  • Past: what is the likelihood that Marilyn Monroe
    committed suicide?
  • Combining evidence and non-evidence.
  • Always: representation + inference.

4
Basic Idea
  • Attach degrees of belief to propositions.
  • Theorem (de Finetti): probability theory is the
    only way to do this.
  • If someone does it differently, you can play a
    betting game with them and win their money (the
    Dutch book argument).
  • Unlike logic, probability theory is
    non-monotonic.
  • Additional evidence can lower or raise belief in
    a proposition.

5
Random Variable
  • Informal: a variable whose values belong to a
    known set of values, the domain.
  • Math: a non-negative function on a domain (called
    the sample space) whose values sum to 1.
  • Boolean RV: John has a cavity.
  • cavity domain: {true, false}
  • Discrete RV: weather condition
  • wc domain: {snowy, rainy, cloudy, sunny}
  • Continuous RV: John's height
  • john's height domain: the positive real numbers
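A minimal sketch of this representation in Python: a discrete RV is a domain plus a map from each value to a non-negative number, summing to 1 (the weather probabilities here are invented for illustration):

```python
# Discrete RV: a domain plus a probability for each value.
wc_domain = ["snowy", "rainy", "cloudy", "sunny"]

# Hypothetical probabilities, for illustration only.
wc_dist = {"snowy": 0.1, "rainy": 0.2, "cloudy": 0.3, "sunny": 0.4}

assert set(wc_dist) == set(wc_domain)
assert all(p >= 0 for p in wc_dist.values())      # non-negative
assert abs(sum(wc_dist.values()) - 1.0) < 1e-9    # sums to 1
```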

6
Cross-Product RV
  • If X is an RV with values x1, ..., xn and
  • Y is an RV with values y1, ..., ym, then
  • Z = X × Y is an RV with nm values <x1,y1>, ..., <xn,ym>
  • This will be very useful!
  • This does not mean P(X,Y) = P(X)P(Y).
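A sketch of enumerating a cross-product domain with the standard library, reusing the (illustrative) domains from the previous slide:

```python
from itertools import product

cavity_domain = [True, False]
wc_domain = ["snowy", "rainy", "cloudy", "sunny"]

# Z = cavity x wc is an RV with n*m = 2*4 = 8 values <x, y>.
z_domain = list(product(cavity_domain, wc_domain))
print(len(z_domain))   # 8
print(z_domain[0])     # (True, 'snowy')
```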

7
Discrete Probability
  • If a discrete RV X has values v1, ..., vn, then a
    prob distribution for X is a non-negative
    real-valued function p such that sum p(vi) = 1.
  • Example: Prob(fair coin comes up heads 0, 1, ..., 10
    times in 10 tosses).
  • In math, we pretend p is known. Via statistics we
    try to estimate it.
  • Assigning RVs is a modelling/representation
    problem.
  • Standard probability models are uniform and
    binomial.
  • Allows data completion and analytic results.
  • Otherwise, resort to empirical estimates.
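The coin example worked out as a sketch: under the binomial model, P(k heads in 10 tosses) = C(10, k) (1/2)^10, using only the standard library:

```python
from math import comb

n, p = 10, 0.5  # 10 tosses of a fair coin

# Binomial model: P(k heads) = C(n, k) * p^k * (1-p)^(n-k)
dist = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

print(dist[5])                                # ~0.246, the most likely count
assert abs(sum(dist.values()) - 1.0) < 1e-9   # a valid distribution
```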

8
Continuous Probability
  • If RV X has values in R, then a prob distribution
    for X is a non-negative real-valued function p
    such that the integral of p over R is 1 (called a
    prob density function).
  • Standard distributions are uniform, normal (or
    Gaussian), Poisson, and beta.
  • May resort to empirical methods if we can't
    compute analytically.
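A sketch of the density condition: a crude Riemann sum of the standard Gaussian pdf over a wide interval approximates the integral over R and comes out near 1:

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    # Normal density: non-negative everywhere.
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

# Riemann sum over [-10, 10]; the tails beyond are negligible.
dx = 0.001
total = sum(gaussian_pdf(-10 + i * dx) * dx for i in range(20_000))
print(total)  # ~1.0
```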

9
Joint Probability: Full Knowledge
  • If X and Y are discrete RVs, then the prob
    distribution for X × Y is called the joint prob
    distribution.
  • Let x be in the domain of X, y in the domain of Y.
  • If P(X=x, Y=y) = P(X=x)P(Y=y) for every x and y,
    then X and Y are independent.
  • Standard shorthand: P(X,Y) = P(X)P(Y), which means
    exactly the statement above.
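A sketch of testing independence directly from a joint table; the joint distribution below is invented so that the check happens to succeed:

```python
from itertools import product

# Hypothetical joint distribution P(X, Y) over small domains.
joint = {("a", 0): 0.08, ("a", 1): 0.12,
         ("b", 0): 0.32, ("b", 1): 0.48}

px = {x: sum(p for (xv, _), p in joint.items() if xv == x) for x in "ab"}
py = {y: sum(p for (_, yv), p in joint.items() if yv == y) for y in (0, 1)}

# Independent iff P(X=x, Y=y) = P(X=x) P(Y=y) for all x, y.
independent = all(abs(joint[(x, y)] - px[x] * py[y]) < 1e-9
                  for x, y in product("ab", (0, 1)))
print(independent)  # True for this particular table
```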

10
Marginalization
  • Given the joint probability for X and Y, you can
    compute everything.
  • Joint probability to individual probabilities.
  • P(X=x) is the sum of P(X=x and Y=y) over all y,
  • written as sum_y P(X=x, Y=y).
  • Conditioning is similar:
  • P(X=x) = sum_y P(X=x | Y=y) P(Y=y)
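Both identities on a small invented joint table for Cavity and Weather (all numbers hypothetical):

```python
# Hypothetical joint P(Cavity, Weather), for illustration.
joint = {(True, "sunny"): 0.07, (True, "rainy"): 0.03,
         (False, "sunny"): 0.63, (False, "rainy"): 0.27}

# Marginalization: P(C=c) = sum over w of P(C=c, W=w).
p_cavity = {}
for (c, w), p in joint.items():
    p_cavity[c] = p_cavity.get(c, 0.0) + p
print(p_cavity)  # {True: 0.1, False: 0.9}, up to float rounding

# Conditioning: P(C=c) = sum over w of P(C=c | W=w) P(W=w).
p_weather = {"sunny": 0.7, "rainy": 0.3}  # marginal of the same joint
p_c_true = sum((joint[(True, w)] / p_weather[w]) * p_weather[w]
               for w in p_weather)
print(p_c_true)  # 0.1, the same answer
```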

11
Conditional Probability
  • P(X=x | Y=y) = P(X=x, Y=y) / P(Y=y).
  • Joint yields conditional.
  • Shorthand: P(X|Y) = P(X,Y) / P(Y).
  • Product rule: P(X,Y) = P(X|Y) P(Y)
  • Bayes' rule:
  • P(X|Y) = P(Y|X) P(X) / P(Y).
  • Remember the abbreviations.
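All three identities checked on one small invented joint table:

```python
# Hypothetical joint P(X, Y) over booleans, for illustration.
joint = {(True, True): 0.2, (True, False): 0.1,
         (False, True): 0.3, (False, False): 0.4}

p_y = joint[(True, True)] + joint[(False, True)]   # P(Y=true)
p_x = joint[(True, True)] + joint[(True, False)]   # P(X=true)

# Conditional from joint: P(X|Y) = P(X,Y) / P(Y).
p_x_given_y = joint[(True, True)] / p_y            # 0.4

# Product rule: P(X,Y) = P(X|Y) P(Y).
assert abs(p_x_given_y * p_y - joint[(True, True)]) < 1e-9

# Bayes' rule: P(X|Y) = P(Y|X) P(X) / P(Y).
p_y_given_x = joint[(True, True)] / p_x
assert abs(p_y_given_x * p_x / p_y - p_x_given_y) < 1e-9
```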

12
Consequences
  • P(X|Y,Z) = P(Y,Z|X) P(X) / P(Y,Z).
  • Proof: treat Y × Z as a new product RV U.
  • P(X|U) = P(U|X) P(X) / P(U) by Bayes' rule.
  • P(X1,X2,X3) = P(X3|X1,X2) P(X1,X2)
  • = P(X3|X1,X2) P(X2|X1) P(X1), or
  • P(X1,X2,X3) = P(X1) P(X2|X1) P(X3|X1,X2).
  • Note: these equations make no assumptions!
  • The last equation is called the Chain (or Product)
    Rule.
  • Can pick any ordering of the variables.
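A sketch checking the chain rule in two different orderings on an invented three-variable joint (the entries are arbitrary values summing to 1):

```python
# Hypothetical joint P(X1, X2, X3) over bits, for illustration.
joint = {(0,0,0): 0.05, (0,0,1): 0.10, (0,1,0): 0.15, (0,1,1): 0.05,
         (1,0,0): 0.20, (1,0,1): 0.10, (1,1,0): 0.05, (1,1,1): 0.30}

def P(**fixed):
    # Marginal probability that the named variables take the given values.
    idx = {"x1": 0, "x2": 1, "x3": 2}
    return sum(p for bits, p in joint.items()
               if all(bits[idx[k]] == v for k, v in fixed.items()))

x1, x2, x3 = 1, 0, 1
# Chain rule, forward ordering: P(x1) P(x2|x1) P(x3|x1,x2).
fwd = (P(x1=x1)
       * P(x1=x1, x2=x2) / P(x1=x1)
       * P(x1=x1, x2=x2, x3=x3) / P(x1=x1, x2=x2))
# Reverse ordering: P(x3) P(x2|x3) P(x1|x2,x3).
rev = (P(x3=x3)
       * P(x2=x2, x3=x3) / P(x3=x3)
       * P(x1=x1, x2=x2, x3=x3) / P(x2=x2, x3=x3))
assert abs(fwd - joint[(x1, x2, x3)]) < 1e-9
assert abs(rev - joint[(x1, x2, x3)]) < 1e-9
```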

13
Bayes Rule Example
  • Meningitis causes stiff neck (0.5).
  • P(s|m) = 0.5
  • Prior prob of meningitis: 1/50,000.
  • p(m) = 1/50,000.
  • Prior prob of stiff neck: 1/20.
  • p(s) = 1/20.
  • Does the patient have meningitis?
  • p(m|s) = p(s|m) p(m) / p(s) = 0.0002.
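The slide's arithmetic as a short sketch:

```python
p_s_given_m = 0.5      # P(stiff neck | meningitis)
p_m = 1 / 50_000       # prior P(meningitis)
p_s = 1 / 20           # prior P(stiff neck)

# Bayes' rule: P(m|s) = P(s|m) P(m) / P(s)
p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)     # 0.0002: a stiff neck alone is weak evidence
```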

14
Bayes Rule: Multiple Symptoms
  • Given symptoms s1, s2, ..., sn, estimate the
    probability of disease D.
  • P(D|s1,s2,...,sn) = P(D,s1,...,sn) / P(s1,s2,...,sn).
  • If each symptom is boolean, we need tables of size
    2^n. E.g., a breast cancer dataset has 73 features
    per patient; 2^73 is too big.
  • Approximate!

15
Idiot or Naïve Bayes
  • Goal: argmax over diseases D of P(D | s1, ..., sn)
  • = argmax P(s1,...,sn|D) P(D) / P(s1,...,sn)
  • = argmax P(s1,...,sn|D) P(D) (why? the denominator
    is the same for every D)
  • ≈ argmax P(s1|D) P(s2|D) ... P(sn|D) P(D).
  • Assumes the symptoms are conditionally independent
    given the disease.
  • Now there is enough data to estimate each factor.
  • Not necessary to get the probabilities right, only
    their order.
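A minimal naive Bayes sketch over boolean symptoms; the diseases, priors, and per-symptom probabilities below are all invented for illustration:

```python
# Hypothetical model parameters, for illustration only.
priors = {"flu": 0.7, "meningitis": 0.3}                  # P(D)
p_sym = {                                                 # P(s_i = true | D)
    "flu":        {"fever": 0.8, "stiff_neck": 0.1},
    "meningitis": {"fever": 0.6, "stiff_neck": 0.5},
}

def naive_bayes(observed):
    """argmax_D P(D) * prod_i P(s_i | D), assuming symptoms are
    conditionally independent given the disease."""
    scores = {}
    for d, prior in priors.items():
        score = prior
        for symptom, present in observed.items():
            p = p_sym[d][symptom]
            score *= p if present else (1 - p)
        scores[d] = score   # unnormalized: only the order matters
    return max(scores, key=scores.get)

print(naive_bayes({"fever": True, "stiff_neck": True}))   # 'meningitis'
```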

16
Bayes Rule and Markov Models
  • Recall P(X1, X2, ..., Xn) = P(X1) P(X2|X1) ...
    P(Xn | X1, X2, ..., Xn-1).
  • If X1, X2, etc. are values at time points 1, 2, ...
  • and if Xn only depends on the k previous times,
    then this is a Markov model of order k.
  • MM0: independent of time
  • P(X1, ..., Xn) = P(X1) P(X2) ... P(Xn)

17
Markov Models
  • MM1: depends only on the previous time
  • P(X1, ..., Xn) = P(X1) P(X2|X1) ... P(Xn|Xn-1).
  • May also be used for approximating probabilities.
    Much simpler to estimate.
  • MM2: depends on the previous 2 times
  • P(X1, X2, ..., Xn) = P(X1,X2) P(X3|X1,X2) etc.

18
Common DNA Application
  • Goal: P(gataag) = ?
  • MM0: P(g) P(a) P(t) P(a) P(a) P(g).
  • MM1: P(g) P(a|g) P(t|a) P(a|t) P(a|a) P(g|a).
  • MM2: P(ga) P(t|ga) P(a|at) P(a|ta) P(g|aa).
  • Note: each approximation requires less data and
    less computation time.
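A sketch of the MM0 and MM1 computations; the base frequencies and transition table below are invented placeholders (in practice both would be estimated from sequence data):

```python
# Hypothetical parameters, for illustration only.
p0 = {"a": 0.3, "c": 0.2, "g": 0.3, "t": 0.2}        # MM0 base frequencies
p1 = {prev: {"a": 0.25, "c": 0.25, "g": 0.25, "t": 0.25}
      for prev in "acgt"}                            # MM1 transitions (uniform)

def prob_mm0(seq):
    # MM0: P(x1) P(x2) ... P(xn)
    prob = 1.0
    for ch in seq:
        prob *= p0[ch]
    return prob

def prob_mm1(seq):
    # MM1: P(x1) P(x2|x1) ... P(xn|x_{n-1})
    prob = p0[seq[0]]
    for prev, ch in zip(seq, seq[1:]):
        prob *= p1[prev][ch]
    return prob

print(prob_mm0("gataag"), prob_mm1("gataag"))
```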