Title: The Binomial Distribution
1. Chapter 19
- The Binomial Distribution
2. What is the binomial distribution?
- The binomial distribution is the probability distribution of the number of times a dichotomous event E occurs in N attempts.
- The event is considered either to occur or not to occur.
- Binomial distributions are not normal distributions.
- "Binomial" literally means "two terms".
3. Examples of situations generating binomial distributions
- Coin toss: event = heads
- Birth: event = female gender
- Two-alternative forced choice for mood: event = happy
- Notice that there is an anti-event for each event.
- For example: tails, male, not happy.
4. Examples of situations generating binomial distributions
- Either the event or the anti-event happens.
- There are no other possibilities.
- The first 2 examples are inherently dichotomous, whereas the third example is made binary by experimental design.
- If the probability of the event is P, then the probability of the anti-event is _______?
5. Experiment
- Flip a coin 10 times.
- How many heads do you get?
- Now flip a coin 20 times.
- How many heads do you get?
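If no coin is at hand, the experiment can be simulated. The sketch below is only an illustration: it assumes a fair coin (P = .5), and the function name `count_heads` and the seed value are arbitrary choices, not part of the slides.

```python
import random

def count_heads(n_flips, rng):
    """Flip a simulated fair coin n_flips times and return the number of heads."""
    # Each flip is heads with probability .5 (assumed fair coin).
    return sum(rng.random() < 0.5 for _ in range(n_flips))

rng = random.Random(19)  # fixed seed (arbitrary) so reruns give the same flips
heads_10 = count_heads(10, rng)
heads_20 = count_heads(20, rng)
print(f"10 flips: {heads_10} heads")
print(f"20 flips: {heads_20} heads")
```

Changing the seed, or removing it, gives a different count each time, which is the sampling variability the binomial distribution describes.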
6. What happens as the number of trials increases?
7. Explanation
- More observations mean more opportunities for an event to occur.
- More observations mean there are more possible outcomes in terms of the number of events (any count from 0 to N).
8. What happens if P ≠ .5?
- For example, if the event is rolling a 1 on a die, then P = 1/6.
9. Explanation
- Small P means the event is less likely; therefore, the expected number of times the event will occur is smaller.
- Symmetry is lost when P ≠ (1 - P).
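The two claims above (a smaller expected count and a loss of symmetry) can be checked numerically. This is a minimal sketch, assuming the standard binomial probability formula; N = 12 and the helper names `binomial_pmf` and `summarize` are illustrative choices only.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x events in n trials with event probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def summarize(n, p):
    """Return (expected number of events, whether the distribution is symmetric)."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    expected = sum(x * pr for x, pr in enumerate(probs))
    # Symmetric means p(X = x) equals p(X = n - x) for every x.
    symmetric = all(abs(probs[x] - probs[n - x]) < 1e-12 for x in range(n + 1))
    return expected, symmetric

N = 12  # hypothetical number of trials, chosen only for illustration
for P in (0.5, 1 / 6):
    expected, symmetric = summarize(N, P)
    print(f"P = {P:.3f}: expected count = {expected:.2f}, symmetric = {symmetric}")
```

With P = .5 the expected count is N/2 and the distribution mirrors itself around that value; with P = 1/6 the expected count drops to N/6 and the mirror property fails.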
10. Example: Cola taste test
- The binomial distribution can be used to test hypotheses.
- In a pilot study, Pepsi tested 12 tasters to see if they preferred Pepsi to Coke.
- 9 of the 12 preferred Pepsi over Coke.
- Pepsi executives were convinced that their product was better.
- Word leaked out to Coca-Cola.
- The executives there were convinced that the results were due to sampling error.
- Who is correct?
11. Example: Cola taste test
- Let us look at things from the Coke executives' point of view.
- They are confident that Coke is at least as good as Pepsi.
- Thus, they adopt the null hypothesis that the probability P that a person would pick Pepsi is .5.
- Coke executives always keep a copy of Cohen's book close by.
- So, they look in Table A13 to find the probabilities for 9 or more tasters picking Pepsi when P = .5.
- The sum of these probabilities is the probability of a result at least this favorable to Pepsi arising by chance when the null hypothesis is true.
12. Example: Cola taste test
- Let X be the number of tasters who prefer Pepsi.
- When N = 12 and P = .5:
- p(X = 9) = .0537
- p(X = 10) = .0161
- p(X = 11) = .0029
- p(X = 12) = .0002
- So the total probability in that tail of the distribution is:
- p = .0537 + .0161 + .0029 + .0002 = .0729
- So, with α = .05, the result is not significant (.0729 > .05).
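The tail probability can also be computed directly instead of being read from a table. The sketch below assumes the standard binomial formula; the variable names are illustrative. The exact sum is 299/4096, about .0730, and the .0729 figure comes from adding the four table entries after each has been rounded to four places.

```python
from math import comb

N, P = 12, 0.5   # 12 tasters; null hypothesis P = .5
ALPHA = 0.05     # significance level used in the example

# Sum p(X = x) over the upper tail, x = 9, 10, 11, 12.
tail = sum(comb(N, x) * P**x * (1 - P)**(N - x) for x in range(9, N + 1))
significant = tail < ALPHA

print(f"p(X >= 9) = {tail:.4f}")          # about 0.0730
print(f"significant at .05: {significant}")
```

Either way the conclusion is the same: the one-tailed probability exceeds .05, so the Coke executives cannot reject their null hypothesis.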
13. When the null distribution is not based on P = .5
- The Coke executives were lucky that their null hypothesis was based on P = .5.
- What if it is not?
- Fortunately, we can produce our own distribution using the general equation for the probability of X events in N trials, where Q = 1 - P:
$$p(X) = \frac{N!}{X!\,(N - X)!}\, P^{X} Q^{N - X}$$
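A custom table for any P can be generated in a few lines. This sketch assumes the general equation referred to here is the standard binomial formula, p(X) = N!/(X!(N - X)!) P^X Q^(N - X) with Q = 1 - P. The function name `binomial_table` and the example values (N = 4, P = .25) are hypothetical.

```python
from math import comb

def binomial_table(n, p, digits=4):
    """Return {x: p(X = x)} for x = 0..n, rounded like a printed table."""
    q = 1 - p
    return {x: round(comb(n, x) * p**x * q**(n - x), digits) for x in range(n + 1)}

# Hypothetical case: 4 trials with P = .25.
table = binomial_table(4, 0.25)
for x, prob in table.items():
    print(f"p(X = {x}) = {prob:.4f}")
```

Calling `binomial_table(12, 0.5)` gives the same upper-tail entries used in the cola example, such as .0537 for X = 9.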
14. When the null distribution is not based on P = .5
- When would P ≠ .5?
- Suppose a gambler is at a casino and notices that 3 is rolled less than 1/6th of the time.
- In particular, 3 appears only once in 24 rolls.
- Is this a fair die?
- Left as an exercise.
15. Approximating the binomial distribution using the normal distribution
- Throughout history, statisticians have had an obsession with the normal distribution.
- So, naturally, they would try to approximate the binomial distribution with the normal.
- If N, the number of trials, is large and P is close to .5, then the approximation is good.
- μ is approximately NP (why?).
- The standard deviation is, with Q = 1 - P,
$$\sigma = \sqrt{NPQ}$$
- And z is then given by
$$z = \frac{X - \mu}{\sigma}$$
- Or,
$$z = \frac{X - NP}{\sqrt{NPQ}}$$
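The z computation can be sketched in code. This assumes the usual normal approximation to the binomial, with mean NP, standard deviation sqrt(NPQ), and Q = 1 - P. The function name `binomial_z` and the example numbers (60 events in 100 trials) are hypothetical.

```python
from math import sqrt

def binomial_z(x, n, p):
    """z score for x events in n trials under the normal approximation."""
    q = 1 - p
    mu = n * p               # mean of the binomial distribution
    sigma = sqrt(n * p * q)  # standard deviation of the binomial distribution
    return (x - mu) / sigma

# Hypothetical example: 60 events in 100 trials when P = .5.
z = binomial_z(60, 100, 0.5)
print(f"z = {z:.2f}")  # mu = 50, sigma = 5, so z = 2.00
```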
16. Approximating the binomial distribution using the normal distribution
- How good is the approximation?
- When P is near .5 and N > 25, the error is small.
- If P is not near .5, NPQ should be at least 9.
- NPQ can be manipulated experimentally. How?
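One way to judge the approximation is to compute the exact binomial tail and the normal tail side by side. The sketch below is an illustration under stated assumptions: it uses the uncorrected z = (X - NP)/sqrt(NPQ), builds the normal tail area from `math.erf`, and the helper names and the two test cases are arbitrary choices.

```python
from math import comb, erf, sqrt

def exact_upper_tail(x, n, p):
    """Exact binomial probability of x or more events in n trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def normal_upper_tail(x, n, p):
    """Normal-approximation tail area above z = (x - np) / sqrt(npq)."""
    z = (x - n * p) / sqrt(n * p * (1 - p))
    return 0.5 * (1 - erf(z / sqrt(2)))  # area above z under the standard normal

# (N, P, X) cases: the small cola sample and a hypothetical larger sample.
for n, p, x in [(12, 0.5, 9), (100, 0.5, 60)]:
    npq = n * p * (1 - p)
    print(f"N={n}, P={p}, X>={x}: NPQ={npq:.1f}, "
          f"exact={exact_upper_tail(x, n, p):.4f}, "
          f"normal={normal_upper_tail(x, n, p):.4f}")
```

Trying other values of N and P shows how the gap between the two columns changes as NPQ grows.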
17. Z test for proportions
- Suppose we don't have X but have a proportion instead.
- For example, say 58% of voters sampled prefer candidate A and 42% favor candidate B.
- Our formula for z then becomes, where p is the sample proportion (X/N) and Q = 1 - P:
$$z = \frac{p - P}{\sqrt{PQ/N}}$$
18. Z test for proportions: Example
- For example, say 58% of voters sampled prefer candidate A and 42% favor candidate B.
- 200 voters were surveyed.
- Our formula for z then becomes (p = sample proportion, Q = 1 - P):
$$z = \frac{p - P}{\sqrt{PQ/N}}$$
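19. Z test for proportions: Example
- Plugging in the numbers from the survey, with P = .5 under the null hypothesis:
$$z = \frac{.58 - .50}{\sqrt{(.5)(.5)/200}} \approx 2.26$$
- This is larger than either the one- or two-tailed critical z (1.645 or 1.96 at α = .05).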
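The survey calculation can be reproduced in code. This is a sketch under two assumptions not spelled out on the slides: the null hypothesis is that the candidates are equally preferred (P = .5), and the formula is the standard z test for a single proportion, z = (p - P)/sqrt(PQ/N). The function name `proportion_z` is illustrative.

```python
from math import sqrt

def proportion_z(p_sample, p_null, n):
    """z test for one proportion: (p - P) / sqrt(PQ / N)."""
    q_null = 1 - p_null
    return (p_sample - p_null) / sqrt(p_null * q_null / n)

# 58% of 200 surveyed voters prefer candidate A; assumed null value P = .5.
z = proportion_z(0.58, 0.5, 200)
print(f"z = {z:.2f}")                      # about 2.26
print("exceeds one-tailed 1.645:", z > 1.645)
print("exceeds two-tailed 1.96:", z > 1.96)
```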
20. Adjusting the z approximation for bias
- The z distribution (the approximating distribution) is continuous and the binomial distribution is discrete.
- This causes a bias, which is corrected by including a correction factor of .5 in the numerator:
$$z = \frac{|X - NP| - .5}{\sqrt{NPQ}}$$
- The bias is greatest when N is small, so apply the corrected formula in this case.
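The effect of the correction can be seen on the cola data (9 of 12 tasters, P = .5). The sketch assumes the usual continuity-corrected form, z = (|X - NP| - .5)/sqrt(NPQ); the function names are illustrative. Because of the absolute value, the result is the size of z, not its sign.

```python
from math import sqrt

def uncorrected_z(x, n, p):
    """Plain normal-approximation z for x events in n trials."""
    return (x - n * p) / sqrt(n * p * (1 - p))

def corrected_z(x, n, p):
    """Continuity-corrected z: shrink |x - np| by .5 before dividing."""
    # Assumes |x - np| is at least .5; otherwise the numerator would go negative.
    return (abs(x - n * p) - 0.5) / sqrt(n * p * (1 - p))

z_plain = uncorrected_z(9, 12, 0.5)
z_corr = corrected_z(9, 12, 0.5)
print(f"uncorrected z = {z_plain:.2f}")  # about 1.73
print(f"corrected z   = {z_corr:.2f}")   # about 1.44
```

The correction pulls z toward zero, which makes the approximate test more conservative when N is small.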
21. General rules for testing hypotheses with the binomial distribution
- There are 3 approaches.
- You can always use the formula for probabilities in the binomial distribution.
- This formula can be used to custom-make any distribution, distribution table (like A13), or part thereof.
- If P = .5, you may be able to use Table A13.
- If N is large enough and P is close enough to .5, you may use a normal approximation.
22. One-tailed vs. two-tailed tests
- http://onlinestatbook.com/chapter9/tails.html
- Note the binomial calculator.
23. Exercises
- Page 619
- 1, 2,
- Answer the question about the casino,
- 6, 8
24. SPSS
- Suppose a teacher knows from past experience that children are equally likely to leave the classroom from any of 4 doors.
- Suppose she hypothesizes that oppositional children will more likely leave by the door in the left rear of the room.
- P for children leaving by the left rear door is .25 under the null hypothesis.
- Suppose she collects the following data with a class of 80 oppositional students.
25. SPSS
- Data -> Weight Cases
- Select Weight cases by.
- Put Number of oppositional children in the box.
- OK.
- Analyze -> Nonparametric Tests -> Binomial Test.
- Set Test Proportion to .75, which is the null hypothesized proportion for the first door listed.
- If the order of the rows were reversed, we would enter .25.
- Put door in the Test Variable List.
- OK.
26. SPSS
- The significance of your test is given in the Asymp. Sig. column.
- Notice that it is a 1-tailed test.
- You can convert this to a 2-tailed test by multiplying by 2.
- However, due to skewing, this only works if the distribution is symmetric.
- The distribution is symmetric if P = .5.
27. SPSS
- What if the data is not condensed?
- Open the George grades.sav data.
- Suppose we want to know if the number of female students is significantly greater than the number of male students.
- Each student's gender is tabulated individually.
- Go directly to the analysis.
- Move sex to the Test Variable List.
- Set the test proportion to .5.
- OK.
- You should find that p = .031.
28. SPSS Exercises