Sampling - PowerPoint PPT Presentation

About This Presentation
Title:

Sampling

Provided by: SaraM174
Learn more at: http://saramitchell.org
Tags: moments | sampling


Transcript and Presenter's Notes

Title: Sampling


1
Sampling
  • W&W, Chapter 6

2
Rules for Expectation
  • Examples
  • Mean: $E(X) = \sum_x x\,p(x)$
  • Variance: $E(X-\mu)^2 = \sum_x (x-\mu)^2\,p(x)$
  • Covariance: $E[(X-\mu_X)(Y-\mu_Y)] = \sum_x \sum_y (x-\mu_X)(y-\mu_Y)\,p(x,y)$

3
Rules for Expectation
  • $E(X + Y) = E(X) + E(Y)$
  • $E(aX + bY) = aE(X) + bE(Y)$
  • $E(R)$ where $R = 10 + X + Y$
  • $E(10 + X + Y) = 10 + E(X) + E(Y)$
  • $E[g(X,Y)] = \sum_x \sum_y g(x,y)\,p(x,y)$
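
The expectation formulas and the linearity rule can be checked numerically. The sketch below assumes a small joint distribution whose probabilities are invented purely for illustration; `expect` is a hypothetical helper that applies the double-sum formula for E[g(X,Y)].

```python
# Hypothetical joint distribution p(x, y); the probabilities are made up
# for illustration and sum to 1.
joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

def expect(g, p):
    """E[g(X, Y)]: sum of g(x, y) * p(x, y) over all (x, y) pairs."""
    return sum(g(x, y) * prob for (x, y), prob in p.items())

mu_x = expect(lambda x, y: x, joint)                       # mean of X
mu_y = expect(lambda x, y: y, joint)                       # mean of Y
var_x = expect(lambda x, y: (x - mu_x) ** 2, joint)        # variance of X
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y), joint)

# Linearity: E(aX + bY) should equal aE(X) + bE(Y) for any constants a, b.
a, b = 2, 3
lhs = expect(lambda x, y: a * x + b * y, joint)
rhs = a * mu_x + b * mu_y

print(mu_x, mu_y, var_x, cov_xy)
print(lhs, rhs)
```

The two values on the last line agree up to floating-point rounding, whatever constants a and b are chosen.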

4
Sampling
  • What can we expect of a random sample drawn from
    a known population?
  • Can we generalize findings from our random sample
    to the population?
  • This is the heart of inferential statistics.

5
Definitions
  • Population: The total collection of objects to be
    studied.
  • Each individual observation in a random sample
    has the population probability distribution p(x).
    See Table 6-1, p. 190.
  • Random Sample: A sample in which each individual
    has an equal chance of being selected.

6
Definitions (continued)
  • The sample mean is not as extreme (doesn't vary
    so widely) as the individual values in the sample
    because it represents an average.
  • In other words, extreme observations are diluted
    by more typical observations. See Figure 6-2.
  • A sample is representative if it has the same
    characteristics as the population; random samples
    are much more likely to be representative.

7
Sampling with or without replacement
  • When the population is large relative to the
    sample, these are practically equivalent.
  • A very simple random sample (VSRS) is a sample
    whose n observations $X_1, X_2, \ldots, X_n$ are
    independent. The distribution of each X is the
    population distribution p(x), that is
  • $p(x_1) = p(x_2) = \cdots = p(x_n)$

8
Small Samples
  • The exception to this rule occurs when the
    population is small relative to the sample, where
    sampling without replacement significantly
    changes the probability of other X values (see
    page 216).
  • Example: calculating the probability of various
    poker hands
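
The effect of sampling without replacement can be seen with a standard 52-card deck. This is a minimal sketch, not taken from the text: it compares the chance of drawing two aces in a row with and without replacement, then repeats the comparison for a hypothetical population 1,000 times larger with the same share of aces.

```python
from fractions import Fraction

def two_aces(aces, deck):
    """Return P(two aces in two draws) with and without replacement."""
    with_repl = Fraction(aces, deck) ** 2
    without_repl = Fraction(aces, deck) * Fraction(aces - 1, deck - 1)
    return with_repl, without_repl

# Real deck: removing one ace noticeably changes the odds of the next draw.
with_repl, without_repl = two_aces(4, 52)
print(with_repl, without_repl, float(without_repl / with_repl))

# Hypothetical population 1,000 times larger: the two schemes nearly agree.
big_with, big_without = two_aces(4000, 52000)
print(float(big_without / big_with))
```

For the real deck the two probabilities differ by roughly a quarter; for the enlarged population the ratio is very close to 1.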

9
How Reliable is the Sample?
  • Suppose we calculate the sample mean (M), and we
    want to know how close it comes to $\mu$, the
    population mean.
  • Imagine collecting many different samples,
    getting a sample mean for each sample. We could
    build the sampling distribution of M, denoted
    p(M).
  • Example: everyone flip a coin 10 times and tell
    me how many heads you flipped.

10
How Reliable is the Sample?
  • Rather than actually sampling, we can simulate
    this sampling on a computer, which is called
    Monte Carlo sampling (or simulation).
  • We can also derive mathematical formulas for the
    sampling distribution of M.
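
The coin-flip exercise can be simulated rather than carried out by hand. The sketch below is one possible Monte Carlo version; the function name `simulate_heads`, the 10,000 repetitions, and the fixed seed are arbitrary choices for illustration.

```python
import random
from collections import Counter

def simulate_heads(n_flips=10, n_samples=10_000, seed=0):
    """Simulate n_samples samples of n_flips fair coin flips.

    Returns the number of heads observed in each sample.
    """
    rng = random.Random(seed)
    return [sum(rng.random() < 0.5 for _ in range(n_flips))
            for _ in range(n_samples)]

heads = simulate_heads()

# Sample mean M of each sample, coding a head as 1 and a tail as 0.
means = [h / 10 for h in heads]
avg_m = sum(means) / len(means)

# Relative frequencies approximate the sampling distribution p(M).
dist = Counter(heads)
for k in range(11):
    print(f"M = {k / 10:.1f}: {dist[k] / len(heads):.3f}")
print("average of the sample means:", round(avg_m, 3))
```

The printed frequencies pile up around M = 0.5 and thin out toward 0 and 1.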

11
Moments of the Sample Mean
  • Recall our objective is to estimate a population
    mean, $\mu$. If we take a random sample of
    observations from the population and calculate
    the sample mean, how good will M be as an
    estimator of its target, $\mu$?
  • We start with the definition of the sample mean:
    $M = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$

12
Moments of the Sample Mean
  • We start by calculating the expectation of the
    sample mean:
  • $E(M) = \frac{1}{n}[E(X_1) + E(X_2) + \cdots + E(X_n)]$
  • Remember that each observation X has the
    population distribution p(x) with mean $\mu$. Thus
    $E(X_1) = E(X_2) = \cdots = \mu$
  • $E(M) = \frac{1}{n}[\mu + \mu + \cdots + \mu] = \frac{1}{n}[n\mu] = \mu$

13
Moments of the Sample Mean
  • We can see that $E(M) = \mu$
  • On average, the sample mean will be on target,
    that is, equal to $\mu$.
  • Of course, an individual sample mean is likely to
    be a little above or below its target (think of
    the coin flips we did).
  • The key question is how much above or below? We
    must find the variance of M.

14
Moments of the Sample Mean
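  • Because the observations are independent, the
    variance of their sum is the sum of their
    variances:
  • $\mathrm{Var}(M) = \frac{1}{n^2}[\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)]$
  • Each observation X has the population
    distribution p(x) with variance $\sigma^2$, so
  • $\mathrm{Var}(M) = \frac{1}{n^2}[\sigma^2 + \sigma^2 + \cdots + \sigma^2] = \frac{1}{n^2}[n\sigma^2] = \frac{\sigma^2}{n}$
  • Standard deviation of M $= \sigma/\sqrt{n}$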

15
Standard error
  • This typical deviation of M from its target, $\mu$,
    represents the estimation error, and is commonly
    called the standard error.
  • What happens as n increases?
  • The standard error decreases, thus the larger the
    sample, the more accurately M estimates $\mu$!
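
The shrinking standard error can be illustrated by simulation. The sketch below assumes a population of fair six-sided die rolls (an example chosen here, not one from the text), for which the variance of one roll is 35/12, and compares the simulated spread of M with the standard deviation of one roll divided by the square root of n.

```python
import math
import random
import statistics

# Standard deviation of a single roll of a fair die (variance 35/12).
SIGMA = math.sqrt(35 / 12)

def simulated_se(n, n_samples=5000, seed=1):
    """Standard deviation of the sample mean across n_samples samples of size n."""
    rng = random.Random(seed)
    means = [statistics.fmean([rng.randint(1, 6) for _ in range(n)])
             for _ in range(n_samples)]
    return statistics.pstdev(means)

# Map each sample size to (simulated SE, theoretical SE).
results = {n: (simulated_se(n), SIGMA / math.sqrt(n)) for n in (4, 16, 64)}
for n, (sim, theory) in results.items():
    print(f"n = {n:2d}: simulated SE = {sim:.3f}, theoretical SE = {theory:.3f}")
```

Each fourfold increase in n roughly halves the standard error, as the square root in the formula implies.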

16
The shape of the sampling distribution
  • Figure 6-3 shows 3 different parent population
    distributions. We see that as n increases, the
    sampling distribution has an approximately normal
    shape.
  • Central Limit Theorem: In random samples of size
    n, M fluctuates around $\mu$ with a standard error of
    $\sigma/\sqrt{n}$. Thus as n increases, the sampling
    distribution of M concentrates more and more
    around $\mu$ and becomes normal (bell-shaped).
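
One way to watch the sampling distribution become bell-shaped is to track its skewness, which is 0 for a symmetric distribution. This sketch assumes a strongly skewed parent population (exponential with mean 1, chosen here only for illustration); `skewness` and `sample_means` are hypothetical helper names.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_means(n, n_samples=5000, seed=0):
    """Means of n_samples random samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(n_samples)]

skew_by_n = {n: skewness(sample_means(n)) for n in (1, 4, 64)}
for n, sk in skew_by_n.items():
    print(f"n = {n:2d}: skewness of M = {sk:.2f}")
```

The skewness falls toward 0 as n grows, consistent with the sampling distribution of M approaching a normal shape.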

17
Normal approximation rule
  • If we know the normal approximation rule, or
    Central Limit Theorem, we can look at the
    probability of particular values (or ranges) of M
    using the standard normal table.
  • Example: Suppose a population of men on a large
    southern campus has a mean height of $\mu = 69$ inches
    with a standard deviation $\sigma = 3.22$ inches.

18
Normal approximation rule
  • If a random sample of n = 10 men is drawn, what
    is the chance that the sample mean M will be
    within 2 inches of the population mean?
  • $E(M) = \mu = 69$
  • $\mathrm{SE} = \sigma/\sqrt{n} = 3.22/\sqrt{10} = 1.02$
  • We want to find the probability that M is within
    2 inches, or between 67 and 71.

19
Normal approximation rule
  • $Z = \frac{M - \mu}{\mathrm{SE}} = \frac{M - \mu}{\sigma/\sqrt{n}}$
  • $Z = \frac{71 - 69}{1.02} = 1.96$
  • Thus a sample mean of 71 is nearly 2 standard
    errors above its expected value of 69.
  • $P(Z > 1.96) = .025$; likewise $P(Z < -1.96) = .025$

20
Normal approximation rule
  • $P(67 < M < 71) = 1 - .025 - .025 = .95$
  • We can conclude that there is a 95% chance that
    the sample mean will be within 2 inches of the
    population mean.
  • Note that there are 2 formulas for Z-scores, one
    for individual values of X, and one for sample
    means, M.
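
The height example can be reproduced without a printed table. This sketch uses `statistics.NormalDist` from the Python standard library (3.8+) in place of the standard normal table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 69, 3.22, 10

# Standard error of the sample mean.
se = sigma / sqrt(n)

# Normal approximation to the sampling distribution of M.
sampling_dist = NormalDist(mu, se)

# P(67 < M < 71): area between the two cut-offs.
prob = sampling_dist.cdf(71) - sampling_dist.cdf(67)

print("SE:", round(se, 2))
print("P(67 < M < 71):", round(prob, 3))
```

The result agrees with the table-based answer of about .95.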

21
Another Example
  • Suppose a large statistics class has marks
    normally distributed with $\mu = 72$ and $\sigma = 9$.
  • What is the probability that an individual
    student drawn at random will have a mark over 80?
  • Here we are comparing a single student's score to
    the distribution of scores.

22
Another Example
  • $Z = \frac{X - \mu}{\sigma} = \frac{80 - 72}{9} = .89$
  • $\Pr(Z > .89) = .187$
  • What is the probability that a random sample of
    10 students will have a sample mean over 80?
  • In this case, we are comparing the sample mean to
    all possible sample means, the sampling
    distribution.

23
Another Example
  • $Z = \frac{M - \mu}{\sigma/\sqrt{n}} = \frac{80 - 72}{9/\sqrt{10}} = 2.81$
  • $\Pr(Z > 2.81) = .002$
  • This sample mean is very unlikely. This shows
    that taking averages tends to reduce the
    extremes.
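
The contrast between the two Z-score formulas can be made concrete in a few lines. This is a sketch using `statistics.NormalDist` in place of the printed table, so the results differ slightly from table values, which use a rounded Z.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 72, 9, 10
std_normal = NormalDist()  # mean 0, standard deviation 1

# One student: divide by the population standard deviation.
z_individual = (80 - mu) / sigma
p_individual = 1 - std_normal.cdf(z_individual)

# Mean of 10 students: divide by the standard error sigma / sqrt(n).
z_mean = (80 - mu) / (sigma / sqrt(n))
p_mean = 1 - std_normal.cdf(z_mean)

print(f"individual:  Z = {z_individual:.2f}, P = {p_individual:.3f}")
print(f"sample mean: Z = {z_mean:.2f}, P = {p_mean:.4f}")
```

The same cut-off of 80 is under one standard deviation above the mean for an individual but almost three standard errors above it for a sample mean of 10.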

24
Proportions
  • We often express our data as proportions, such as
    the proportion of heads in a sample of 10 coin
    flips.
  • Normal Approximation Rule for Proportions: In
    random samples of size n, the sample proportion P
    fluctuates around the population proportion $\pi$
    with a standard error of $\sqrt{\pi(1-\pi)/n}$

25
Proportions
  • We can see again that as n increases, our sample
    proportion gets closer to the population
    proportion.
  • Example: A population of voters has 60%
    Republicans and 40% Democrats. What is the
    chance that a sample of 100 will produce a
    minority of Republicans (less than 50%)?

26
Proportion Example
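  • $Z = \frac{P - \pi}{\mathrm{SE}} = \frac{P - \pi}{\sqrt{\pi(1-\pi)/n}}$
  • $Z = \frac{.5 - .6}{\sqrt{.6(1-.6)/100}} \approx -2.00$
    (with the standard error of .049 rounded to .05)
  • $\Pr(Z < -2.00) = .023$, or about 2%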
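
The voter example can be recomputed without rounding the standard error. This sketch again relies on `statistics.NormalDist` rather than the printed table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.6, 100   # population proportion of Republicans, sample size

# Standard error of the sample proportion P.
se = sqrt(pi * (1 - pi) / n)

# Z-score for a sample proportion of exactly one half.
z = (0.5 - pi) / se

# Probability that the sample shows a Republican minority.
p_minority = NormalDist().cdf(z)

print(f"SE = {se:.4f}, Z = {z:.2f}, P(P < .5) = {p_minority:.3f}")
```

Without rounding, Z is about -2.04, so the probability comes out slightly lower than the table-based figure but still rounds to about 2%.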

27
Normal Approximation to the Binomial
  • Of your first 10 grandchildren, what is the
    chance there will be more than 7 boys?
  • This is the same as asking whether the proportion
    of boys is more than 7/10.
  • We could use the binomial to solve this problem.
  • Assume p(boy) = .5

28
Normal Approximation to the Binomial
  • $P(S > 7) = P(S = 8) + P(S = 9) + P(S = 10)$
  • You could calculate this or just use the
    cumulative binomial table on pages 670-671.
  • $P(S > 7) = .044 + .010 + .001 = .055$
  • We can also use what we know about proportions
    and that they will approximate the normal
    distribution to solve this problem.

29
Normal Approximation to the Binomial
  • We want to know the probability of getting more
    than 7 boys. We must calculate this as P > 7/10
    because we are dealing with a continuous
    distribution (normal), so everything between 7
    and 8 must be included.
  • $Z = \frac{P - \pi}{\sqrt{\pi(1-\pi)/n}} = \frac{.7 - .5}{\sqrt{.5(1-.5)/10}} = 1.26$
  • $\Pr(Z > 1.26) = .104$

30
Normal Approximation to the Binomial
  • Obviously, this involves some error. We can
    correct with the continuity correction, where we
    take the halfway point between 7 and 8.
  • $Z = \frac{P - \pi}{\sqrt{\pi(1-\pi)/n}} = \frac{.75 - .5}{\sqrt{.5(1-.5)/10}} = 1.58$
  • $\Pr(Z > 1.58) = .057$

31
Normal Approximation to the Binomial
  • Note that this is very close to our estimate
    calculated from the binomial distribution, .055!
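
The three answers to the grandchildren problem can be compared side by side. This is a minimal sketch using `math.comb` for the exact binomial sum and `statistics.NormalDist` for the two normal approximations; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 10, 0.5

# Exact binomial: P(S > 7) = P(S = 8) + P(S = 9) + P(S = 10).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(8, n + 1))

se = sqrt(p * (1 - p) / n)   # standard error of the sample proportion
std_normal = NormalDist()

# Normal approximation with the cut-off at 7/10 (no correction).
approx_plain = 1 - std_normal.cdf((0.70 - p) / se)

# Continuity correction: cut-off halfway between 7 and 8 boys.
approx_corrected = 1 - std_normal.cdf((0.75 - p) / se)

print(f"exact binomial:       {exact:.3f}")
print(f"normal, uncorrected:  {approx_plain:.3f}")
print(f"normal, corrected:    {approx_corrected:.3f}")
```

The uncorrected approximation is nearly double the exact value, while the corrected one lands within a few thousandths of it.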

32
Monte Carlo Simulations
  • A computer program that repeats sampling and
    constructs a sampling distribution.
  • This approach is particularly useful for
    providing sampling distributions that cannot be
    derived easily theoretically.
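
As one illustration of a statistic whose sampling distribution is awkward to derive by hand, the sketch below simulates the sample median. The exponential population, the sample size of 15, and the helper name `sampling_distribution` are all assumptions made for this example.

```python
import random
import statistics

def sampling_distribution(statistic, draw, n, n_samples=10_000, seed=0):
    """Repeat sampling n_samples times and collect the statistic from each sample.

    statistic: function applied to each sample (a list of n values)
    draw:      function taking a random.Random and returning one observation
    """
    rng = random.Random(seed)
    return [statistic([draw(rng) for _ in range(n)])
            for _ in range(n_samples)]

# Sample medians from a skewed (exponential, mean 1) population.
medians = sampling_distribution(
    statistics.median,
    lambda rng: rng.expovariate(1.0),
    n=15,
)

center = statistics.fmean(medians)
spread = statistics.pstdev(medians)   # simulated standard error of the median
print(f"mean of sample medians: {center:.3f}")
print(f"standard error:         {spread:.3f}")
```

Passing a different `statistic` or `draw` function reuses the same loop for other estimators and populations.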