II. The World: Probability Theory

Slides: 33
Provided by: SocialSc2
1
Chapter 9
2
I. Data: Descriptive Statistics (Ch. 2-5)
II. The World: Probability Theory (Ch. 6-8)
III. Drawing Conclusions: Statistical Inference (Ch. 9-11)
3
  • In real life, we often do not know various
    details about the world, and we want to use our
    data sets to help us determine these details.
  • An important part of doing good empirical science
    is that we must do this carefully:
  • We must understand what kind of evidence
    (our particular uses of) our data sets are
    providing.
  • We must understand how strong this evidence is.
  • We must understand on what grounds we are drawing
    any conclusions that we do.
  • In short, we want to quantify our degree of
    uncertainty about any conclusions we may choose
    to draw.

4
  • EXAMPLE: What kind of weekly salaries do American
    adults make?
  • What is the average salary?
  • How spread out is this distribution?
  • Are there issues of skewness, kurtosis, etc.?
  • We have a sample of current American salaries.
  • But how likely is it that our estimates from the
    sample (e.g., x̄) will be close to the actual
    population parameters (e.g., μ)?
  • And what do we mean by "close to," anyway?

5
  • The first thing to note about sampling (i.e.,
    generating a data set) is that we always try to
    get data sets with more than one observation.
  • Why??
  • Suppose I want to determine the average weekly
    salary of US adults.
  • Option 1: Ask just one adult, and assume this
    salary is representative of the whole.
  • But if this kind of variation is possible when
    checking just one person, why can't/won't
    there be the same kind of variation (or even
    more?) if I use a bunch of people?

6
  • At this point we must start distinguishing
    sample statistics from random variables (and
    their distributions).
  • We use statistics derived from our actual data
    sets
  • e.g., x̄ = 3.167, from the data 1, 2, 7, 8, -3, 4
  • to try to determine features of the
    distribution of the (theoretical) population
  • e.g., μ
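
As a quick check of the arithmetic in the example data set, here is a minimal Python sketch. The variable names are illustrative and do not come from the slides.

```python
from statistics import mean

# The six observations from the slide's example data set.
data = [1, 2, 7, 8, -3, 4]

# The sample mean is one particular number computed from the data
# that were actually collected.
x_bar = mean(data)
print(round(x_bar, 3))
```

The result is a statistic of this one data set; the population mean it is meant to estimate is a separate quantity that the data alone do not fix.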

7
  • Similarly, we need to distinguish x̄:
  • x̄ is a particular number, and thus has no mean
    or variance
  • (although it is a sample mean)
  • from X̄:
  • X̄ is a random variable (built out of other
    random variables), and it does have a mean and a
    variance.

8
  • Here is a useful way to think about the
    differences between notions like x̄ and X̄:
  • x̄ is a particular number that exists only after
    you have collected your data.

9
  • In contrast, X̄ is not any particular number.
    Instead, X̄ is built out of all the ways your
    data could turn out. Intuitively, X̄ is what you
    have before you collect your data.

Typically, not all data sets are equally likely;
instead, they have a distribution. This
distribution is what we are trying to discover.
10
X̄ is determined by all the possible ways that
the Score column could be filled out, and by the
probabilities that it will actually be filled out
in those ways.
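
To make "all the ways your data could turn out" concrete, here is a small Python sketch. It assumes a made-up population of three values, each equally likely, and samples of size n = 2 drawn independently with replacement. None of these numbers come from the slides.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical population values (illustrative only); each value is
# drawn with probability 1/3.
population = [0, 10, 20]
n = 2  # sample size

# Enumerate every way the n slots could be filled, and accumulate the
# probability of each resulting sample mean.
dist = defaultdict(Fraction)
for sample in product(population, repeat=n):
    prob = Fraction(1, len(population)) ** n
    dist[Fraction(sum(sample), n)] += prob

for value in sorted(dist):
    print(value, dist[value])
```

The random variable X̄ corresponds to the whole table of values and probabilities. A collected data set yields just one of those values, which is x̄.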
11
  • Let's use our CPS data regarding weekly incomes
    to think about the underlying logic of random
    sampling.
  • To simplify things, let's pretend that the
    population of interest is only the persons in the
    sample.
  • We'll pretend we don't know what our population's
    mean weekly income is.
  • I.e., we don't know what μ is.
  • Instead, we're going to try to estimate this
    population mean by collecting a random sample
    from the population.

12
  • Random sampling does not make assumptions about
    people and their incomes
  • The sampling procedure does not assume that,
    once we have selected Jon, Lisa, Ming, Pete,
    Casey, and so on for our sample, each one of them
  • could have had a weekly salary of $0, with a
    probability around 28/14380, AND
  • could have had a weekly salary of $1, with a
    probability around 11/14380, AND
  • ..., AND
  • could have had a weekly salary of $2884.61, with
    a probability around 215/14380.

14
  • We decide (in advance) to collect a sample of
    size n
  • E.g., maybe n = 50, or 100, or 1000
  • So we have slots X1, X2, X3, ..., Xn to fill in
  • We fill in these slots by selecting persons at
    random to be in these slots.
  • So there is about a w_0/P chance of filling in any
    given slot with someone whose weekly income is
    $0, AND
  • there is about a w_1/P chance of filling in any
    given slot with someone whose weekly income is
    $1, AND
  • ..., AND
  • there is about a w_2884.61/P chance of filling in
    any given slot with someone whose weekly income
    is $2884.61
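
The slot-filling procedure can be sketched in Python. The tiny population below is invented for illustration (the CPS data are not reproduced here). Reading w_v as the number of people whose weekly income is v, and P as the population size, is an assumption based on the fractions quoted on the earlier slide. Drawing with replacement is also an assumption, made so that the slots are exactly independent.

```python
import random
from collections import Counter

# Hypothetical population of weekly incomes (illustrative values only).
population = [0, 0, 1, 500, 500, 500, 750, 1200, 2884.61, 2884.61]
P = len(population)

# w[v] = number of people whose weekly income is v, so w[v] / P is the
# chance that any given slot is filled by someone with income v.
w = Counter(population)
slot_probs = {v: count / P for v, count in w.items()}

# Fill n slots X1, ..., Xn by independent random draws with replacement.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5
sample = [rng.choice(population) for _ in range(n)]

print(slot_probs)
print(sample)
```

Note that the probabilities attach to the slots, not to the people: each person's income is fixed, and the randomness comes from who ends up in each slot.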

15
  • This underlying logic has three parts.
  • We get values for X1, X2, X3, ..., Xn by
  • independently drawing from
  • one population, which is
  • the correct population of interest.
  • It's important to observe all three of these
    parts.
  • Violating any one of these requirements can
    easily lead to numerical data that serve as the
    basis of impressive-looking empirical results
  • but that have little or no relevance to the
    empirical issue you are studying!

the i.i.d. requirement
16
  • Random sampling involves a set X1, ..., Xn of
    i.i.d. random variables.
  • Remember: i.i.d. = Independent and Identically
    Distributed
  • One observation doesn't affect the distribution
    of any other observation.
  • E.g., we don't wait to collect a low score just
    because the last 10 scores we collected were
    high.
  • E.g., we don't insist that each subsequent score
    be higher than all the previous ones.
  • E.g., we don't start out with a preconceived idea
    about what value x̄ should be, and then fill in
    our slots X1, ..., Xn with values that make our
    prediction come out right.

18
  • Each Xi comes from just one population out in
    the world.
  • E.g., to learn about American incomes, we don't
    sample 30 Americans and 20 Iraqis.
  • E.g., we also don't ask 30 Americans about both
    their weekly incomes and their monthly
    rent/mortgage, and treat these as 60 data points.

20
  • That one population is the right one for our
    purposes.
  • E.g., to learn about American incomes, we don't
    just sample South Dakotans.
  • This would be biased sampling.
  • We would be focusing on a subpopulation, which
    would be problematic for our purposes.
  • Our sample would probably be biased low.
  • E.g., we also don't ask Iraqis about their
    incomes.
  • That is the wrong population for our study.
  • Similarly, we don't ask only women, or only
    Hispanics, or only Senators.
  • These too are interesting populations, but they
    are not right for the question we are trying to
    answer.

22
  • Let's begin by taking a random sample of weekly
    wages.
  • Afterwards, we will consider the underlying
    statistical properties of the variable X̄ and
    explore them in relation to the underlying
    variable of interest, X.
  • X̄ is the variable that will yield our sample
    average of American weekly wages.
  • X is the variable that yields these wages.
  • We are interested in the distribution of X,
    because that is what determines a large part of
    our economy!
  • We typically care about X̄ only insofar as it
    helps us understand X.

23
  • To get X̄ we need a random sample X1, ..., X100,
    where each Xi has the same distribution as X.
  • Ultimately, we are interested in the distribution
    of X.
  • Let's begin by looking at the relationship
    between the mean of X and the mean of X̄.
24
In short, the mean of X̄ is the same as the
mean of X.
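
One way to see this, writing μ for the mean of X and n for the sample size, uses only linearity of expectation together with the fact that each Xi has the same distribution as X. The notation below is an assumption chosen to match the rest of the deck:

```latex
\begin{align*}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\cdot n\mu
            = \mu .
\end{align*}
```

Independence of the Xi is not needed for this step; only the identical distribution is used.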
25
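  • Now let's explore the variance of X̄.
  • Since the variables X1, ..., Xn are i.i.d., we can
    make use of what we saw in Chapter 7, namely
    that the variance of a sum of independent random
    variables is the sum of their variances.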

26
(No Transcript)
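
A standard derivation of the variance of X̄, given here as a sketch consistent with the surrounding slides rather than as the original slide content. It writes σ² for the variance of X and assumes X1, ..., Xn are i.i.d.:

```latex
\begin{align*}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^{2}}\operatorname{Var}\!\left(\sum_{i=1}^{n} X_i\right) \\
  &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
     && \text{(independence)} \\
  &= \frac{1}{n^{2}}\cdot n\sigma^{2}
     && \text{(identical distribution)} \\
  &= \frac{\sigma^{2}}{n}.
\end{align*}
```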
27
  • In short, the variance of X̄ is 1/nth the
    variance of X.
  • So, the standard deviation of X̄ is 1/√n the size
    of the standard deviation of X.
  • To sum up: E(X̄) = μ, Var(X̄) = σ²/n, and
    SD(X̄) = σ/√n.

28
  • In other words, the standard deviation of the
    average of our sample (our data set) is only
    1/√n the size of the standard deviation of a
    sample of just one.
  • So if you want your estimate to have one tenth
    the variability (measured by σ) that is
    present in American adults in general, you should
    take the average of 100 adults, instead of just
    measuring one adult.
  • Thus, the reliability of our estimate increases
    dramatically as n increases.
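
A short simulation can illustrate the factor of 10. This is a sketch under stated assumptions: incomes are drawn from a hypothetical exponential distribution with mean 800 (not the CPS data), and the names `draw_income`, `singles`, and `means` are illustrative.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed for repeatability

def draw_income():
    # Hypothetical income distribution (an assumption for illustration,
    # not the CPS data): exponential with mean 800.
    return rng.expovariate(1 / 800)

n = 100        # sample size for each average
trials = 5000  # number of repeated samples

# Spread of a "sample of just one" versus spread of the average of n.
singles = [draw_income() for _ in range(trials)]
means = [statistics.fmean(draw_income() for _ in range(n)) for _ in range(trials)]

sd_single = statistics.stdev(singles)
sd_mean = statistics.stdev(means)
print(sd_single / sd_mean)  # should be close to sqrt(100) = 10
```

Because the ratio is estimated from simulated draws, it will not equal 10 exactly; it should move closer as `trials` grows.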

29
  • Notice that in our calculation of Var(X̄)
  • it was crucial that the Xi's were all i.i.d.
    (independent and identically distributed).
  • If the Xi's were not independent, then we couldn't
    have made the step
    Var(X1 + ... + Xn) = Var(X1) + ... + Var(Xn).
  • If the Xi's were not identically distributed, then
    we couldn't have made the step
    Var(X1) + ... + Var(Xn) = n Var(X).
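
To illustrate why independence matters, the sketch below compares two ways of filling n slots: with n fresh independent draws, and with one draw copied into every slot (an extreme case of dependence). The standard normal distribution and all names are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed for repeatability
n = 25
trials = 4000

# Independent case: each of the n slots gets its own fresh draw.
indep_means = [
    statistics.fmean(rng.gauss(0, 1) for _ in range(n)) for _ in range(trials)
]

# Dependent case: one draw is copied into all n slots, so the slots are
# identically distributed but not independent.
dep_means = [statistics.fmean([rng.gauss(0, 1)] * n) for _ in range(trials)]

var_indep = statistics.variance(indep_means)
var_dep = statistics.variance(dep_means)
print(var_indep)  # near 1/n = 0.04
print(var_dep)    # near 1, so averaging bought no reduction
```

In the dependent case every Xi still has the right distribution, yet the variance of the average does not shrink, because the sum-of-variances step no longer holds.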

30
  • Notice also that by the Central Limit Theorem,
    as n gets large, the distribution of X̄ will
    approximate a Gaussian distribution.
  • Knowing that X̄ has an (approximately) normal
    distribution gives us much useful information
    about how close to the mean of X̄ (and
    hence to the mean of X) we are likely to get with
    a given random sample.
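
As a rough illustration of what the normal approximation buys us, the sketch below draws many samples from a skewed distribution and checks how often the sample mean lands within 1.96 standard deviations of the population mean. The exponential distribution with rate 1 (mean 1, standard deviation 1) and the values of `n` and `trials` are assumptions for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed for repeatability
mu = 1.0     # mean of an exponential distribution with rate 1
sigma = 1.0  # its standard deviation
n = 200
trials = 4000

# Draw many samples of size n and record each sample mean.
means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)
]

# If the sample mean is approximately normal with mean mu and standard
# deviation sigma / sqrt(n), about 95% of sample means should land
# within 1.96 such standard deviations of mu.
half_width = 1.96 * sigma / math.sqrt(n)
coverage = sum(abs(m - mu) <= half_width for m in means) / trials
print(coverage)
```

Even though individual draws are strongly skewed, the fraction printed should sit near 0.95, which is the kind of "how close are we likely to get" statement the normal approximation supports.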

31
  • It is important to distinguish:
  • The sample mean x̄
  • The random variable X̄
  • The population mean μ

32
  • Rules of thumb regarding random sampling.
  • Moral: Protect the i.i.d.-ness of your sample!!
  • pp. 322-323