Title: Psychology 9
1Psychology 9
- Quantitative Methods in Psychology
- Jack Wright
- Brown University
- Section 15
Note. These lecture materials are intended
solely for the private use of Brown University
students enrolled in Psychology 9, Spring
Semester, 2002-03. All other uses, including
duplication and redistribution, are unauthorized.
2Agenda
- From binomials to the Normal Distribution
- Announcements
- Assignment: Chapter 9
3Illustrative problem 2
- The probability of a certain trait occurring in a
population is .40. An investigator intends to
draw a random sample of 20 people and count the
number with the trait.
- What is the binomial distribution?
4Binomial Distribution
- For n = 20, p = .4, and r = # people with trait
[Figure: binomial probability histogram; x-axis: r (# with trait), y-axis: probability. Callouts: Middle? Spread? Shape?]
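The distribution asked for in Illustrative problem 2 can be tabulated from the binomial formula. The following is a minimal sketch, not part of the original slides; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(r = 0), ..., P(r = n)] for n trials with success probability p."""
    q = 1 - p
    return [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

# Illustrative problem 2: n = 20 people, p = .4 for the trait
pmf = binomial_pmf(20, 0.4)
most_likely_r = max(range(len(pmf)), key=lambda r: pmf[r])

for r, prob in enumerate(pmf):
    print(f"r = {r:2d}  P(r) = {prob:.4f}")
print("most likely r:", most_likely_r)
```

Printing the table gives a first look at the three questions on the slide: where the middle is, how wide the spread is, and what the shape looks like.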
5Defining μ_binomial and σ_binomial
- You flip a fair coin 2 times and count heads.
What is the mean # heads (r)?
- Our possible outcomes are: TT TH HT HH
- Rs (# heads) for each are: 0 1 1 2
- What is the mean of the rs?
- Σ(r)/N = (0 + 1 + 1 + 2)/4 = 4/4 = 1.0 (here N = 4 outcomes)
- More simply: μ_binomial = N × p = 2 × .5 = 1 (here N = 2 flips)
6Defining μ_binomial and σ_binomial
- What is the standard deviation of the # heads (r)?
- Our possible outcomes are: TT TH HT HH
- Rs (# heads) for each are: 0 1 1 2
- deviations from mean: -1 0 0 1
- sum of squares (SS) = 2
- σ = sqrt(SS/N) = sqrt(2/4) = sqrt(.5) = .71 (N = 4 outcomes)
- Or, more simply:
- σ_binomial = sqrt(Npq) = sqrt(2 × .5 × .5) = .71 (N = 2 flips)
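The two routes to the mean and standard deviation (listing every outcome versus using the shortcut formulas) can be checked against each other. This is a small sketch for the two-flip example; the variable names are illustrative only. Giving every outcome equal weight is valid here only because the coin is fair (p = .5).

```python
from itertools import product
from math import sqrt

n, p = 2, 0.5

# Route 1: list every outcome (TT, TH, HT, HH) and count heads in each.
outcomes = list(product("TH", repeat=n))
heads = [outcome.count("H") for outcome in outcomes]       # 0, 1, 1, 2

mean_enum = sum(heads) / len(heads)
ss = sum((r - mean_enum) ** 2 for r in heads)              # sum of squares
sd_enum = sqrt(ss / len(heads))

# Route 2: the shortcut formulas.
mean_formula = n * p
sd_formula = sqrt(n * p * (1 - p))

print(f"mean: {mean_enum} (enumeration) vs {mean_formula} (N x p)")
print(f"SD:   {sd_enum:.2f} (enumeration) vs {sd_formula:.2f} (sqrt(Npq))")
```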
7Implications of μ_binomial and σ_binomial
- We now have measures of center and spread of the binomial
- What about shape?
- Why do we care?
- suppose many binomials take on just one shape
- then, solve properties of this one shape
- then, use this knowledge to solve all binomial
situations with this shape, no matter what
μ_binomial and σ_binomial are
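8p = .5, n = 5, 10
[Figure: binomial histograms for n = 5 and n = 10. Callouts: shape appears; bars get thinner; height gets lower; area per bar gets smaller]
9p = .5, n = 20, 50
[Figure: binomial histograms for n = 20 and n = 50. Callout: shape better defined]
10p = .5, n = 100, 500
[Figure: binomial histograms for n = 100 and n = 500. Callouts: most of the area lies in the central region ("Let's focus on this part..."); tails approach, but do not reach, 0]
11p = .5, n = 100, 500 (truncated)
[Figure: truncated histograms for n = 100 and n = 500. Callout: shape becomes very well defined]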
12Effects of increasing n on binomial distributions
- 1. The measure r (# successes) evolves from a
discrete to a continuous variable
- 2. Histogram evolves into continuous function
- each bar narrower and height lower
- probability in each bar gets smaller
- probability polygon becomes smooth curve
- 3. Tails extend infinitely in both directions;
they approach but do not reach 0
- 4. Distribution becomes symmetric and bell
shaped
13The normal approximation of the binomial
- Our binomial distributions appear to become normal
- But are they in fact normal?
- Method
- repeat, but superimpose normal (Gaussian) function
- two parameters (as always)
- μ_binomial and σ_binomial
- Why is this important?
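14p = .5, n = 5, 10 w/ Gaussian
[Figure: binomial histograms for n = 5 and n = 10 with Gaussian curve superimposed. Callouts: poor fit; better fit]
15p = .5, n = 20, 50 w/ Gaussian
[Figure: binomial histograms for n = 20 and n = 50 with Gaussian curve superimposed. Callouts: good fit; still better fit]
16p = .5, n = 100, 500 w/ Gaussian
[Figure: binomial histograms for n = 100 and n = 500 with Gaussian curve superimposed. Callout: nearly perfect fit]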
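The visual fits on these slides can also be checked numerically. The sketch below is an illustration, not the method used to draw the original figures. It superimposes a Gaussian with mean Np and SD sqrt(Npq) on each binomial and reports the largest gap between a bar and the curve, as a fraction of the tallest bar. The function names and this particular fit measure are assumptions chosen for the example.

```python
from math import comb, exp, pi, sqrt

def binomial_prob(n, p, r):
    """Probability of exactly r successes in n trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def gaussian_height(x, mu, sigma):
    """Height of the Gaussian curve with mean mu and SD sigma at x."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def worst_relative_gap(n, p):
    """Largest |bar - curve| difference, as a fraction of the tallest bar."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    bars = [binomial_prob(n, p, r) for r in range(n + 1)]
    gap = max(abs(bars[r] - gaussian_height(r, mu, sigma)) for r in range(n + 1))
    return gap / max(bars)

sizes = (5, 10, 20, 50, 100, 500)
gaps = [worst_relative_gap(n, 0.5) for n in sizes]
for n, gap in zip(sizes, gaps):
    print(f"n = {n:3d}  worst gap = {gap:.4f} of tallest bar")
```

A shrinking gap as n grows corresponds to the progression from "poor fit" to "nearly perfect fit" in the figures.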
17Boundary conditions
- So far, considered only p = .5
- We know binomial distribution is skewed when p ≠ .5,
at least for small n
- What happens when n increases?
- Why important?
- So, consider extreme case
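18p = .1, n = 5, 10
[Figure: binomial histograms for n = 5 and n = 10 with Gaussian curve superimposed. Callout: poor fit]
19p = .1, n = 20, 50
[Figure: binomial histograms for n = 20 and n = 50 with Gaussian curve superimposed. Callout: better fit]
20p = .1, n = 100, 500
[Figure: binomial histograms for n = 100 and n = 500 with Gaussian curve superimposed. Callout: good fit, even when p = .1]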
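One way to see why the fit improves even when p = .1 is to track how lopsided the binomial is as n grows. The sketch below uses the standard skewness formula for a binomial, (q - p) / sqrt(Npq). That formula is a textbook result brought in for illustration and does not appear on the slides. A value of 0 means perfectly symmetric.

```python
from math import sqrt

def binomial_skewness(n, p):
    """Skewness of a binomial distribution: (q - p) / sqrt(n * p * q)."""
    q = 1 - p
    return (q - p) / sqrt(n * p * q)

sizes = (5, 10, 20, 50, 100, 500)
for p in (0.5, 0.1):
    for n in sizes:
        print(f"p = {p}  n = {n:3d}  skewness = {binomial_skewness(n, p):.3f}")
```

With p = .5 the skewness is 0 at every n; with p = .1 it starts large and shrinks toward 0 as n increases.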
21Summary
- With n large, the binomial distribution approaches a
continuous normal (Gaussian) function
- occurs even when p ≠ .5
- Therefore, can solve large binomial problems with
only 3 pieces of information
- the middle or μ_binomial
- the width of the bell around the middle or σ_binomial
- the Gaussian shape
22Summary
- How large is large enough?
- rule of thumb: when Np and Nq are both > 10
- e.g., p = .5, q = .5: N must be about 20
- e.g., p = .1, q = .9: N must be about 100
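The rule of thumb can be turned into a quick calculation of how many cases are needed. This sketch reads the threshold as "at least about 10", which is an assumption made here to match the "about 20" and "about 100" examples. `smallest_n` is a hypothetical helper, and p is passed as a string so the arithmetic is exact.

```python
from fractions import Fraction
from math import ceil

def smallest_n(p, threshold=10):
    """Smallest N for which both N*p and N*q are at least `threshold`."""
    p = Fraction(p)
    rarer = min(p, 1 - p)          # the smaller of p and q sets the requirement
    return ceil(threshold / rarer)

for p in ("0.5", "0.4", "0.1"):
    print(f"p = {p}: N must be about {smallest_n(p)}")
```

The further p is from .5, the larger N must be, because the rarer outcome still needs about 10 expected cases.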
23From binomials to sample sums
- So far, dealt only with binomials (# successes)
- Now examine how our normal approximation applies
to other things
- Suppose you roll a die. What is the probability
of getting each number?
24Probability Distribution
- We have 6 outcomes, each with p = 1/6 = .17
[Figure: probability histogram; x-axis: outcome, y-axis: probability]
25From binomials to sample sums
- Now suppose you roll the die twice and take the
SUM or the mean of the outcomes. What will the
distribution be?
 sum  mean  ways to get this result       p
   2  1.0   1-1                           1/36 = .028
   3  1.5   1-2 2-1                       2/36 = .056
   4  2.0   1-3 3-1 2-2                   3/36 = .083
   5  2.5   2-3 3-2 1-4 4-1               4/36 = .111
   6  3.0   1-5 5-1 2-4 4-2 3-3           5/36 = .139
   7  3.5   1-6 6-1 2-5 5-2 3-4 4-3       6/36 = .167
   8  4.0   2-6 6-2 3-5 5-3 4-4           5/36 = .139
   9  4.5   3-6 6-3 4-5 5-4               4/36 = .111
  10  5.0   4-6 6-4 5-5                   3/36 = .083
  11  5.5   5-6 6-5                       2/36 = .056
  12  6.0   6-6                           1/36 = .028
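The table of sums can be generated by listing all 36 equally likely ordered pairs of rolls and counting how many give each sum. This is a minimal sketch; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two rolls of a fair die
rolls = list(product(range(1, 7), repeat=2))

# Number of ways to obtain each sum
ways = Counter(first + second for first, second in rolls)

# Probability of each sum, kept as an exact fraction
dist = {total: Fraction(ways[total], len(rolls)) for total in sorted(ways)}

for total, prob in dist.items():
    print(f"sum {total:2d}  mean {total / 2:3.1f}  ways {ways[total]}  p = {float(prob):.3f}")
```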
26Probability distribution of sums
- We have 11 possible outcomes...
[Figure: probability histogram of the two-roll outcomes; x-axis labeled by sum and by mean (1 to 6), y-axis: probability]
27Extension to random samples
- For previous problem, could work out all possible
combinations
- For bigger problems, this would be laborious
- Instead of all possible samples, imagine we take
random samples.
- Example
- we roll the die 10 times, sum up the dots
- take the average number of dots
- repeat this many times
- draw a probability histogram
- try to fit a Gaussian function
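The random-sampling procedure in the example can be simulated. The sketch below is an illustration under stated assumptions: 5000 repetitions and a fixed seed are arbitrary choices, and the function names are hypothetical. In place of drawing a histogram, it compares the mean and SD of the simulated averages with those of the Gaussian one would try to fit (mean 3.5, SD equal to the single-roll SD divided by sqrt(n)).

```python
import random
from math import sqrt
from statistics import mean, pstdev

def mean_of_rolls(n_rolls, rng):
    """Roll a fair die n_rolls times and return the average number of dots."""
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls

def simulate(n_rolls, n_samples=5000, seed=0):
    """Repeat the experiment n_samples times and collect the averages."""
    rng = random.Random(seed)
    return [mean_of_rolls(n_rolls, rng) for _ in range(n_samples)]

DIE_MEAN = 3.5
DIE_SD = sqrt(35 / 12)   # SD of a single roll of a fair die

for n in (2, 3, 5, 10, 50):
    means = simulate(n)
    print(f"n = {n:2d} rolls  mean = {mean(means):.3f}  SD = {pstdev(means):.3f}  "
          f"(Gaussian to fit: mean {DIE_MEAN}, SD {DIE_SD / sqrt(n):.3f})")
```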
28n = 2 rolls
29n = 3 rolls
30n = 5 rolls
31n = 10 rolls
32n = 50 rolls
33Final extension
- In previous case, distribution was flat
- Now consider cases in which the population has
an asymmetric or irregular distribution
[Figure: frequency histogram of an asymmetric population; x-axis: value, y-axis: frequency]
34Final extension
- Using this population...
- Suppose we draw random samples of size n
- for each sample, compute the mean and record it
- repeat this 1000 times
- draw a probability histogram of the distribution
of sample means
- try to fit a Gaussian function
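The same procedure can be sketched in code. The population shown on the slide is not recoverable here, so this example substitutes a strongly right-skewed exponential population (mean 1) as a stand-in. That substitution, the seed, and the function names are all assumptions. Skewness of the 1000 sample means is used as a simple check on how close to symmetric the distribution has become.

```python
import random
from statistics import mean, pstdev

def sample_means(n, reps=1000, seed=0):
    """Draw `reps` random samples of size n and record each sample mean."""
    rng = random.Random(seed)
    # Stand-in population: exponential with mean 1 (an assumption, not the slide's population)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

def skewness(values):
    """Average cubed z-score; 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

for n in (1, 2, 3, 5, 10, 50):
    means = sample_means(n)
    print(f"n = {n:2d}  mean of means = {mean(means):.3f}  "
          f"SD = {pstdev(means):.3f}  skewness = {skewness(means):.2f}")
```

With n = 1 the sample means simply reproduce the skewed population; as n grows the skewness falls toward 0 and the spread narrows.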
35n = 1
36n = 2
37n = 3
38n = 5
39n = 10
40n = 50
41Significance
- 1. Binomial situation provides account of how
chance events combine
- Small n: exact solutions through binomial formula
and tables
- Large n: Gaussian distribution provides good
approximation, even when p ≠ .5
- For infinite n, the binomial distribution
converges to the Gaussian function
42Significance
- 2. Many phenomena can be understood as the result
of adding together several independent events
- e.g., multiple genes and other determinants of
physical characteristics
- e.g., multiple genes and other determinants of
cognitive capacities and behavioral traits
43Significance
- 3. Phenomena that are the result of adding
independent random elements will often have an
approximately normal distribution.
- This is known as the Central Limit Theorem.
- "Central" refers to fundamental
- "Limit" indicates that the normal distribution
emerges as we reach the limit of the binomial
(infinite n).
44Significance
- 4. For random samples from a population, the
sample sum and mean, by definition, result from
adding independent elements.
- Therefore, sample means necessarily satisfy the
Central Limit Theorem and will often be normally
distributed
- Even when the parent population is not normally
distributed.
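Points 3 and 4 can be stated compactly. The notation below is standard textbook notation rather than anything used on the slides: X_1, ..., X_n are independent draws from a population with mean μ and finite standard deviation σ, and X̄_n is their sample mean.

```latex
\[
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
E(\bar{X}_n) = \mu, \qquad
\mathrm{SD}(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}
\]
\[
\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \;\xrightarrow{\;d\;}\; N(0, 1)
\quad \text{as } n \to \infty
\]
```

The first line gives the center and spread of the distribution of sample means; the second says its shape approaches the Gaussian whatever the shape of the parent population.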
45Significance
- 5. For these reasons, the normal or Gaussian
distribution becomes an indispensable tool in
studying a wide range of cases in which the
Central Limit Theorem is satisfied.
46Significance
- "The Gaussian or normal distribution is the most
widely used in Statistics. There are two major
reasons for its dominance. The first is that
the mathematics tend to be relatively simple, and
the second is that often the random mechanism can
be specified as approximately Gaussian due to the
Central Limit Theorem."
- Chambers & Hastie (1992)
47Significance
- "Thus, you see, it is no accident that the normal
distribution is the workhorse of inferential
statistics. The assumption of normal population
distributions or the use of the normal
distribution as an approximation device is not as
arbitrary as it sometimes appears; this
distribution is part of the very fabric of
inferential statistics."
- William Hays (1963)