1
Probability: Overview, Definitions, Jargon & Blood
Feuds
  • BioE131/231

2
Probability vs Statistics
  • Statistics is a hard sell (on a good day)
  • Most palatable approach that I know
  • Concentrate on modeling rather than tests
  • Bayesian vs Frequentist schools
  • Emphasize connection to information theory
  • Physical limits on information storage &
    transmission
  • Where probability meets signals & systems
    engineering

3
Bayesians and Frequentists
  • Believe it or not, statisticians fight
  • Frequentists (old school)
  • Emphasis on tests: t, χ², ANOVA
  • Write down competing hypotheses, but only analyze
    the null hypothesis (!)
  • Report significance (actually improbability)
  • Bayesians (new school)
  • Emphasis on modeling
  • Build a model for all competing hypotheses
  • Use probabilities to represent levels of belief

4
A word on notation
5
Distributions & densities
6
Discrete vs continuous
  • Binomial (discrete)
  • Gaussian (continuous)
7
Cumulative distributions
  • Density function
  • Cumulative distribution function
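In standard notation (the slide's own equations were not transcribed), the two are related by

    F(a) = P(x ≤ a) = ∫_{-∞}^{a} p(x) dx,    p(x) = dF(x)/dx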

8
More definitions
9
Normalization
Similarly for probability density functions
etc. (replace sums by integrals)
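In standard form:

    Σ_x P(x) = 1     (discrete)
    ∫ p(x) dx = 1    (continuous, integrating over the whole domain)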
10
Independence
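The defining condition, in the deck's notation: x and y are independent iff

    P(x, y) = P(x) P(y)    (equivalently, P(x | y) = P(x))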
11
I.I.D. (independent, identically distributed)
12
Uniform
13
Let's get Bayesian
14
Example: Fall '05 admissions
P(C1)
P(A1 | C1)
P(A1, C1)
P(C1 | A1)
P(A1)
15
Example: Fall '05 admissions
P(C1) = 0.83
P(A1 | C1) = 0.23 / 0.83 ≈ 0.28
P(A1, C1) = 0.23
P(C1 | A1) = 0.23 / 0.26 ≈ 0.88
P(A1) = 0.26
16
Bayesian inference
  • Probabilities & frequencies are essentially
    alternative ways of looking at the same thing
  • However... frequencies are sometimes more
    intuitive
  • We will return to more examples of Bayes' Theorem
    and inference

17
Experimental error
18
Experimental error (cont.)
19
Approximate errors
20
Shannon information
Can also be interpreted as number of bits that an
ideal data compression algorithm needs to
encode message x
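In standard notation (lg = log base 2 throughout):

    h(x) = -lg P(x)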
21
Entropy of a bent coin
  • X=1 means heads
  • X=0 means tails
  • P(X=1) = q
  • P(X=0) = 1-q
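With these probabilities the entropy is the binary entropy function

    S(q) = -q lg q - (1-q) lg(1-q)

which is 0 bits at q = 0 or q = 1 and peaks at 1 bit at q = 1/2.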

22
L'Hôpital's rule
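The rule is what justifies the q → 0 endpoint of the entropy curve:

    lim_{q→0} q lg q = lim_{q→0} lg(q) / (1/q)
                     = lim_{q→0} (1/(q ln 2)) / (-1/q²)
                     = lim_{q→0} -q / ln 2 = 0

so the q lg q term contributes nothing at q = 0 (and, by symmetry, at q = 1).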
23
Entropy of a bent coin
  • X=1 means heads
  • X=0 means tails
  • P(X=1) = q
  • P(X=0) = 1-q

24
Information & binary codes
  • Illustration of Shannon information in the
    special case of a uniform distribution

25
Information & uniform distributions
  • Consider an alphabet of N symbols (each of which
    appears equally frequently)
  • A simple binary code needs lg(N) bits per symbol
    (rounded up)
  • "Each symbol occurs equally frequently" means
    that the probability of any symbol is p = 1/N and
    the Shannon information is h = -lg(p) = lg(N)
  • This is a special case of a more general theorem
    that h(x) represents the number of bits needed to
    encode x
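A minimal sketch in Python (the alphabet size is arbitrary, chosen here for illustration):

    import math

    N = 20                      # e.g. an alphabet of 20 equally frequent symbols
    p = 1.0 / N                 # probability of any one symbol
    h = -math.log2(p)           # Shannon information: lg(N) bits
    code_width = math.ceil(math.log2(N))   # simple fixed-width binary code

    print(h)                    # ~4.32 bits
    print(code_width)           # 5 bits per symbol (lg N rounded up)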

26
Shannon entropy
S = ⟨h(x)⟩
Can also be interpreted as a performance measure
for an ideal data compression algorithm
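Written out: S = ⟨h(x)⟩ = -Σ_x P(x) lg P(x), the average Shannon information under P.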
27
Relative entropy
A measure of difference between probabilistic
models (not a true distance, despite the name)
Can also be interpreted as a measure of relative
efficiency between two ideal data compression
algorithms
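The standard definition in the same notation:

    D(p||q) = Σ_x p(x) lg( p(x) / q(x) )

It is zero iff p = q, and asymmetric in p and q (one reason it is not a true distance).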
28
DNA example
  • Consider a sequence of DNA that is all A's and
    T's. Distribution is p(A) = p(T) = 1/2, p(C) = p(G) = 0
  • Consider another sequence that is uniformly
    distributed: q(A) = q(C) = q(G) = q(T) = 1/4
  • If I tell you a nucleotide is from the second
    sequence, you need two bits to encode it
  • If I tell you the nucleotide is from the first
    sequence, you only need one bit to encode it
  • If I say it's from the second sequence, but it's
    actually from the first, you've wasted a bit
  • This is what's meant by D(p||q) = 1
  • It's not the same as D(q||p): if I tell you the
    nucleotide is from the first sequence but it's
    really from the second, you might not be able to
    encode it at all (technically the relative
    entropy is infinite)
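A minimal sketch in Python checking both claims (the distributions come from the bullets above; the 0·lg 0 = 0 convention is handled explicitly):

    import math

    p = {'A': 0.5, 'T': 0.5, 'C': 0.0, 'G': 0.0}       # AT-only sequence
    q = {'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}   # uniform sequence

    def relative_entropy(p, q):
        """D(p||q) in bits; infinite where p has mass but q has none."""
        total = 0.0
        for x in p:
            if p[x] == 0.0:
                continue                 # 0 * lg(0/q) = 0 by convention
            if q[x] == 0.0:
                return math.inf          # no codeword available at all
            total += p[x] * math.log2(p[x] / q[x])
        return total

    print(relative_entropy(p, q))   # 1.0 bit wasted per nucleotide
    print(relative_entropy(q, p))   # inf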

29
Mutual information
Measure of increased data compression efficiency
obtainable by compressing two related texts
together
30
Example: DNA double helix
  • Consider strand-symmetric DNA
  • Assume uniform distribution over nucleotides
  • Pick a random nucleotide (x) and its
    opposite-strand partner (y)
  • P(x) = 1/4, P(y) = 1/4, so P(x)P(y) = 1/16
  • P(x,y) = 1/4 if x is the complement of y, 0
    otherwise
  • Mutual information is 2 bits
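A minimal sketch in Python checking the arithmetic (complement pairing hard-coded):

    import math

    comp = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}

    mi = 0.0
    for x in 'ACGT':
        for y in 'ACGT':
            p_xy = 0.25 if comp[x] == y else 0.0   # joint distribution
            if p_xy > 0.0:
                mi += p_xy * math.log2(p_xy / (0.25 * 0.25))  # P(x)P(y) = 1/16

    print(mi)   # 2.0 bits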

31
Message length
32
Binary codes
33
Prefix codes
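For example, {0, 10, 110, 111} is a prefix code: no codeword is a prefix of another, so a stream such as 110010 decodes instantly and uniquely as 110, 0, 10.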
34
Unique decodability
35
Kraft-McMillan Inequality
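The inequality: any uniquely decodable binary code with codeword lengths ℓ_1, ..., ℓ_N satisfies

    Σ_i 2^(-ℓ_i) ≤ 1

and conversely, any lengths satisfying it can be realized by a prefix code. (The prefix code {0, 10, 110, 111} above meets it with equality: 1/2 + 1/4 + 1/8 + 1/8 = 1.)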
36
Ideal codes
An ideal code, C, for a probability distribution,
P, is one in which the codeword lengths in C
match the Shannon information contents in P.
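That is, each codeword length is ℓ(x) = h(x) = -lg P(x), so probable symbols get short codewords, and the expected codeword length equals the entropy: ⟨ℓ(x)⟩ = S.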
37
Why ideal?
38
Relative entropy & codes
39
Mutual information & codes
I(x;y) = S(x) + S(y) - S(x,y)
40
Conditional entropy
Measures the information content of a variable,
x, when another variable, y, is known
S(x|y) = S(x,y) - S(y)
I(x;y) = S(x) + S(y) - S(x,y) = S(x) - S(x|y)
       = S(y) - S(y|x)
41
Mutual information example 2
42
Combinatorics
43
Multinomials
44
Rates of events
45
Gaussian distribution
Density function
Cumulative distribution function
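In the usual parameterization (mean μ, standard deviation σ):

    p(x) = exp( -(x - μ)² / (2σ²) ) / (σ √(2π))
    F(a) = ∫_{-∞}^{a} p(x) dx = (1/2) (1 + erf( (a - μ) / (σ√2) ))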
46
Extreme Value Distribution
47
Bayesian Inference Examples
Several of these examples are taken from
http://yudkowsky.net/bayes/bayes.html; others are
from David MacKay's book
48
Bayes' Theorem (reminder)
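In the X, Y notation of the following slides:

    P(X | Y) = P(Y | X) P(X) / P(Y),    where P(Y) = Σ_X P(Y | X) P(X)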
49
X = disease, Y = symptom (or test result)
  • 1% of women at age forty who participate in
    routine screening have breast cancer.
  • 80% of women with breast cancer will get positive
    mammographies.
  • 9.6% of women without breast cancer will also get
    positive mammographies.
  • A woman in this age group had a positive
    mammography in a routine screening.
  • What is the probability that she actually has
    breast cancer? (Worked below.)

Scary fact: 85% of doctors get this
wrong (Casscells, Schoenberger, and Graboys 1978;
Eddy 1982; Gigerenzer and Hoffrage 1995)
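A minimal sketch of the computation in Python, with the numbers taken straight from the bullets above:

    p_cancer = 0.01              # 1% prevalence
    p_pos_given_cancer = 0.80    # true-positive rate
    p_pos_given_healthy = 0.096  # false-positive rate

    # Bayes' theorem: P(cancer | positive mammography)
    p_pos = (p_cancer * p_pos_given_cancer
             + (1 - p_cancer) * p_pos_given_healthy)
    p_cancer_given_pos = p_cancer * p_pos_given_cancer / p_pos

    print(round(p_cancer_given_pos, 3))   # ~0.078, i.e. about 7.8%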
50
Alternative presentation
  • 100 out of 10,000 women at age forty who
    participate in routine screening have breast
    cancer.
  • 80 of every 100 women with breast cancer will
    get positive mammographies.
  • 950 out of 9,900 women without breast cancer
    will also get positive mammographies.
  • If 10,000 women in this age group undergo a
    routine screening
  • About what fraction of those with positive
    mammographies actually have breast cancer?

Equally scary fact: 54% of doctors still get it
wrong
51
All relevant probabilities
52
Similar example
  • A drug test is 99% accurate.
  • Suppose 0.5% of people actually use the drug.
  • What is the probability that a person who tests
    positive is actually a user? (See the sketch below.)
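A minimal sketch, reading "99% accurate" as 99% sensitivity and 99% specificity (an assumption; the slide does not separate the two):

    p_user = 0.005          # 0.5% of people use the drug
    sensitivity = 0.99      # P(positive | user) -- assumed
    specificity = 0.99      # P(negative | non-user) -- assumed

    p_pos = p_user * sensitivity + (1 - p_user) * (1 - specificity)
    p_user_given_pos = p_user * sensitivity / p_pos

    print(round(p_user_given_pos, 3))   # ~0.332: only about 1 in 3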

53
X = risk factor, Y = disease
  • Medical researchers know that the probability of
    getting lung cancer if a person smokes is .34.
  • The probability that a nonsmoker will get lung
    cancer is .03.
  • It is also known that 11% of the population
    smokes.
  • What is the probability that a person with lung
    cancer was a smoker? (Worked below.)

54
Sometimes the problem is stated less clearly
  • Suppose you have a large barrel containing a
    number of plastic eggs.
  • Some eggs contain pearls, the rest contain
    nothing.
  • Some eggs are painted blue, the rest are painted
    red.
  • Suppose that
  • 40% of the eggs are painted blue
  • 5/13 of the eggs containing pearls are painted
    blue
  • 20% of the eggs are both empty and painted red.
  • What is the probability that an egg painted blue
    contains a pearl? (Worked below.)
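One way to work it, as a sketch in Python (every number follows from the three bullets above):

    blue = 0.40                          # fraction of eggs painted blue
    red = 1.0 - blue                     # 0.60
    empty_and_red = 0.20                 # given

    pearl_and_red = red - empty_and_red  # 0.40 of all eggs
    # 5/13 of pearl eggs are blue, so 8/13 of pearl eggs are red:
    p_pearl = pearl_and_red / (8 / 13)   # 0.65 of all eggs contain pearls
    pearl_and_blue = p_pearl * (5 / 13)  # 0.25 of all eggs

    print(pearl_and_blue / blue)         # P(pearl | blue) = 0.625 = 5/8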

55
Pearls and eggs
  • X is egg color (0 for blue, 1 for red)
  • Y is the pearl (0 if absent, 1 if present)

56
The Monty Hall problem
  • We are presented with three doors - red, green,
    and blue - one of which has a prize.
  • We choose the red door, but this door is not
    opened (yet), according to the rules.
  • The rules are that the presenter, who knows which
    door the prize is behind, must open a door, but
    is not permitted to open the door we have picked
    or the door with the prize.
  • The presenter opens the green door, revealing
    that there is no prize behind it, and
    subsequently asks if we wish to change our mind
    about our initial selection of red.
  • What are the probabilities that the prize is
    behind (respectively) the blue and red doors?
  • X = prize door, Y = presenter's door
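The answer: the prize is behind the blue door with probability 2/3 and the red door with probability 1/3, so we should switch. A minimal simulation sketch in Python (conditioning on the presenter opening green, as in the story):

    import random

    doors = ['red', 'green', 'blue']
    green_opened = blue_wins = 0

    for _ in range(100000):
        prize = random.choice(doors)   # prize placed uniformly at random
        # We always pick red; the presenter opens a door that is neither
        # our pick nor the prize (choosing at random if both qualify)
        opened = random.choice([d for d in doors
                                if d != 'red' and d != prize])
        if opened == 'green':
            green_opened += 1
            if prize == 'blue':
                blue_wins += 1

    print(blue_wins / green_opened)    # ~0.667: switching to blue wins 2/3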