SM219 Ch8

Provided by: johnt1

1
Ch 8, Confidence
  • In Ch7, we saw that the average of a sample has a
    normal distn
  • The mean of the avg is the mean of the population
  • We can find how far the avg might be from the
    mean with a certain probability

2
Ch 8, Confidence
  • This leads us to say that we can be, say, 95%
    confident that the mean is within a certain
    distance of the avg
  • Note that the mean might be unknown, but it is
    not random
  • It does not change if we take a different sample
  • So we cannot assign probabilities to the mean

3
Ch 8, Confidence
  • If we give a 95% confidence interval for the
    mean, we are 95% confident that the true mean
    lies within this interval
  • The interval will be based on an interval that
    has a 95% probability of containing the avg
  • (Since the avg IS random, we can assign
    probabilities to it.)

4
Ch 8, Confidence
  • Suppose we have a normal distn with SD = 11.5
  • Consider the avg of N = 6 values from this
    population
  • In what interval (around the mean) will the avg
    fall with probability 95%?
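
The question above comes down to one multiplication: the normal multiplier for 95% times the SD of the avg, SD/sqrt(N). A minimal sketch using only Python's standard library follows; the variable names are illustrative and not from the slides.

```python
from math import sqrt
from statistics import NormalDist

sd = 11.5          # population SD from the slide
n = 6              # sample size from the slide
confidence = 0.95

# Two-sided interval: leave (1 - confidence)/2 in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96
sd_of_avg = sd / sqrt(n)                             # SD of the avg
half_width = z * sd_of_avg

print(f"z = {z:.3f}, SD of avg = {sd_of_avg:.2f}, half-width = {half_width:.1f}")
# prints: z = 1.960, SD of avg = 4.69, half-width = 9.2
```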

6
Ch 8, Confidence
  • So 95% of the time, the avg will be within +/-9.2
    of the mean
  • We now reverse this and say that we are 95%
    confident that the mean is within +/-9.2 of the
    avg
  • If the avg turns out to be 63.7, then the 95% CI
    for the mean is 63.7 +/-9.2
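
One way to see what "95% confident" means is to simulate many samples and count how often the interval avg +/- half-width contains the fixed mean. The sketch below is only an illustration: the true mean of 60 is a made-up value (in practice it is unknown), and the seed and trial count are arbitrary.

```python
import random
from math import sqrt

random.seed(0)                      # fixed seed so the run is repeatable
true_mean = 60.0                    # hypothetical; unknown in practice
sd, n = 11.5, 6                     # values from the slides
half_width = 1.96 * sd / sqrt(n)    # about 9.2

trials = 20_000
hits = 0
for _ in range(trials):
    sample = [random.gauss(true_mean, sd) for _ in range(n)]
    avg = sum(sample) / n
    # The interval moves from sample to sample; the mean stays fixed.
    if avg - half_width <= true_mean <= avg + half_width:
        hits += 1

coverage = hits / trials
print(f"fraction of intervals containing the mean: {coverage:.3f}")
```

The printed fraction should be close to 0.95. The randomness is in the avg (and so in the interval), not in the mean.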

7
Ch 8, Confidence
  • Consider SP500.xls
  • Divide into 5 groups

11
Ch 8, Confidence
  • Procedure
  • For a given confidence, find how many SDs (of the
    avg) the avg and mean might be apart
  • We are that confident that the mean is within
    this distance of the avg

12
Ch 8, Confidence
  • Example
  • We have a normal population with SD = 8.4
  • The avg of a sample of size 7 is 33.5
  • Find the 90% CI for the mean of the population

14
Ch 8, Confidence
  • So the avg might be 1.645*8.4/sqrt(7) (about 5.22)
    away from the mean
  • We are 90% confident that the mean is no higher
    than 33.5 + 1.645*8.4/sqrt(7) and no lower than
    33.5 - 1.645*8.4/sqrt(7)
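
The arithmetic for this example can be checked with a small helper. This is a sketch for the known-SD case only; the function name `z_interval` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(avg, sd, n, confidence):
    """CI for the mean when the population SD is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sd / sqrt(n)
    return avg - half_width, avg + half_width

low, high = z_interval(avg=33.5, sd=8.4, n=7, confidence=0.90)
print(f"90% CI: {low:.2f} to {high:.2f}")
# prints: 90% CI: 28.28 to 38.72
```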

15
Ch 8, Confidence
  • In the previous problems, we had to assume we
    knew the value for the SD
  • This is generally not true
  • Generally we have to estimate SD from the data
  • Replace the normal distn with Student's t distn

16
Ch 8, Confidence
  • Student's t is similar to normal, but has an
    extra parameter
  • Degrees of freedom (df)
  • Measures how good our est of SD is
  • FOR THESE PROBLEMS, df = N-1
  • For large df, t is essentially normal
  • For smaller df, the quantiles are a little larger
    for t than for normal
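
When the SD is estimated from the sample, the interval has the same form but uses a t multiplier with df = N-1. The sketch below uses invented data (N = 6, so df = 5) and takes the 90% multiplier 2.015 as a standard table value rather than computing it.

```python
from math import sqrt
from statistics import mean, stdev

sample = [31.2, 36.8, 29.5, 40.1, 33.7, 35.0]   # hypothetical data, N = 6
n = len(sample)
df = n - 1                                       # 5

avg = mean(sample)
sd_est = stdev(sample)      # sample SD (divides by N - 1)
t_mult = 2.015              # 90% two-sided t multiplier for df = 5 (table value)

half_width = t_mult * sd_est / sqrt(n)
print(f"avg = {avg:.2f}, est SD = {sd_est:.2f}, df = {df}")
print(f"90% CI: {avg - half_width:.2f} to {avg + half_width:.2f}")
```

With the normal multiplier 1.645 the interval would be narrower. The wider t interval reflects the extra uncertainty in the estimated SD.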

17
Ch 8, Confidence
  • (See new PROBCALC.XLS)
  • For normal, 90% CI is +/-1.645 SD
  • For t with 5 df, 90% is 2.015 SD
  • For df = 15, 90% is 1.753 SD
  • For df = 75, 90% is 1.665 SD
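
These multipliers can be reproduced without a spreadsheet. Python's standard library has no t distribution, so the sketch below integrates the t density numerically (Simpson's rule) and solves for the quantile by bisection. The helper names, step count, and search range are illustrative choices, not anything from the slides or from PROBCALC.XLS.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_multiplier(confidence, df):
    """Two-sided multiplier: the (1 + confidence)/2 quantile, by bisection."""
    target = (1 + confidence) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (5, 15, 75):
    print(f"df = {df:2d}: 90% multiplier = {t_multiplier(0.90, df):.3f}")
```

The results should land close to the 2.015, 1.753, and 1.665 quoted above, shrinking toward the normal value 1.645 as df grows.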

18
Ch 8, Assumptions
  • We use the t distn when we estimate the SD from
    the data
  • We are still assuming that the data comes from a
    normal population
  • There are two issues

19
Ch 8, Assumptions
  • There may be outliers in the sample
  • This means that some of the observations should
    not be considered to be representative of the
    underlying population
  • For our purposes, we will only consider an
    observation to be an outlier if it is WAY outside
    the others

20
Ch 8, Assumptions
  • Boxplots may be helpful in suggesting outliers
  • These are marked with asterisks
  • But simply because a value is far from the other
    values does not make it an outlier
  • There should be some reason to think that this
    point should not have been in the sample

21
Ch 8, Assumptions
  • Suppose we take a sample of cities
  • New York might be an outlier because it is so
    different from other cities
  • If we consider the stock market, we might want to
    eliminate 9/11
  • If we consider Olympic records, we might ignore
    Bob Beamon's 1968 record

22
Ch 8, Assumptions
  • The other assumption is that the underlying distn
    is normal
  • A good way to assess whether the distn is normal
    is to use a normal probability plot
  • This plots the sorted data vs percentiles of the
    (standard) normal
  • If the plot is approx a straight line, then the
    data appears to come from a normal distn
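
The points of a normal probability plot can be computed directly: sort the data and pair each value with a standard normal percentile. The sketch below uses invented data and the plotting positions (i - 0.5)/N, which is one common convention and an assumption here. The correlation at the end is just a rough numerical stand-in for eyeballing straightness.

```python
from math import sqrt
from statistics import NormalDist, mean

data = [52.1, 47.3, 55.8, 49.9, 61.2, 44.6, 50.7, 57.4]   # hypothetical sample

xs = sorted(data)
n = len(xs)
# Standard normal percentiles at positions (i - 0.5)/n (assumed convention).
zs = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

for z, x in zip(zs, xs):
    print(f"{z:6.2f}  {x:6.1f}")     # one plotted point per line

# Correlation between the two columns: near 1 means close to a straight line.
mz, mx = mean(zs), mean(xs)
r = sum((z - mz) * (x - mx) for z, x in zip(zs, xs)) / sqrt(
    sum((z - mz) ** 2 for z in zs) * sum((x - mx) ** 2 for x in xs)
)
print(f"correlation = {r:.3f}")
```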

23
Ch 8, Assumptions
  • How close to a straight line?
  • We will only be concerned if the plot is
    OBVIOUSLY not a straight line

24
Ch 8, Assumptions
  • What to do if the distn is NOT normal?
  • This changes the question
  • The mean is the obvious way to describe the
    normal distn
  • For a different distn, it is less clear what to
    use to describe

25
Ch 8, Assumptions
  • We might consider using the median to describe
    the distn
  • If we consider a possible value for the median
    and count the number in our sample that are above
    (or below) this value, we get the binomial distn
  • Not so simple to form confidence intervals
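
One standard way around this uses the binomial count directly: the number of sample values below the true median is Binomial(N, 1/2), so a pair of order statistics gives an interval with known confidence. The sketch below is one such construction, not necessarily the approach the course takes; the function name and data are invented.

```python
from math import comb

def median_interval(data, confidence=0.95):
    """Order-statistic interval for the median, based on Binomial(n, 1/2)."""
    xs = sorted(data)
    n = len(xs)
    alpha = 1 - confidence

    def cdf(k):
        # P(at most k of the n values fall below the true median)
        return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

    # Find the largest k whose tail probability P(count <= k-1) is <= alpha/2.
    k = 0
    while k + 1 <= n // 2 and cdf(k) <= alpha / 2:
        k += 1
    if k == 0:
        raise ValueError("sample too small for this confidence level")
    coverage = 1 - 2 * cdf(k - 1)          # actual confidence achieved
    return xs[k - 1], xs[n - k], coverage  # k-th smallest, k-th largest

data = [12, 14, 15, 15, 16, 17, 18, 19, 21, 48]   # hypothetical sample
low, high, coverage = median_interval(data)
print(f"median CI: {low} to {high} (actual confidence {coverage:.3f})")
# prints: median CI: 14 to 21 (actual confidence 0.979)
```

Because the binomial is discrete, the achieved confidence is usually somewhat above the requested level rather than exactly equal to it.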