Title: Today
1 Today's Agenda
- Review Homework 1 not posted
- Probability and Its Application to the Normal Curve
- Inferential Statistics
- Sampling
2 Probability Basics
- What is the probability of picking a red marble out of a bowl with 2 red and 8 green?
  - p(red) = 2 / 10 = .20
3 Frequencies and Probability
- The probability of picking a color relates to the frequency of each color in the bowl
  - 8 green marbles, 2 red marbles, 10 total
  - p(Green) = .8, p(Red) = .2
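The step from frequencies to probabilities can be sketched in a few lines of Python. This is only an illustration of the marble example; the variable names are made up for the sketch.

```python
from fractions import Fraction

# Bowl contents from the slide: 2 red and 8 green marbles.
bowl = {"red": 2, "green": 8}
total = sum(bowl.values())  # 10 marbles in all

# Probability of a color = frequency of that color / total count.
probabilities = {color: Fraction(count, total) for color, count in bowl.items()}

for color, p in probabilities.items():
    print(f"p({color}) = {p} = {float(p):.2f}")
```

Because every marble is either red or green, the two probabilities add up to 1.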
4 Frequencies & Probability
- What is the probability of randomly selecting an individual who is extremely liberal from this sample?
  - p(extremely liberal) = 32 / 1,319 = .024 (or 2.4%)
5 PROBABILITY & THE NORMAL DISTRIBUTION
- We can use the normal curve to estimate the probability of randomly selecting a case between 2 scores
- Probability distribution
  - Theoretical distribution of all events in a population of events, with the relative frequency of each event
6 PROBABILITY & THE NORMAL DISTRIBUTION
- The probability of a particular outcome is the proportion of times that outcome would occur in a long run of repeated observations.
- 68% of cases fall within ±1 standard deviation of the mean in the normal curve
- Over the long run, the probability of obtaining an outcome within one standard deviation of the mean is 68%
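As a rough check on the 68% figure, the sketch below computes the area within one standard deviation from the normal curve, then imitates a "long run of repeated observations" with simulated draws. The seed and the number of draws are arbitrary choices made for this illustration.

```python
import random
from statistics import NormalDist

# Area under the standard normal curve between z = -1 and z = +1.
standard_normal = NormalDist(mu=0, sigma=1)
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
print(f"theoretical share within 1 SD: {within_one_sd:.4f}")

# Long-run view: draw many observations and count how often they land
# within one standard deviation of the mean.
rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [rng.gauss(0, 1) for _ in range(100_000)]
simulated_share = sum(-1 <= z <= 1 for z in draws) / len(draws)
print(f"simulated share within 1 SD:   {simulated_share:.4f}")
```

The two numbers should agree closely, which is the sense in which area under the curve and long-run relative frequency are the same thing.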
7 Probability & the Normal Distribution
- Suppose the mean score on a test is 80, with a standard deviation of 7. If we randomly sample one score from the population, what is the probability that it will be as high or higher than 89?
  - Z for 89 = (89 - 80) / 7 = 9/7, or 1.29
  - Area in tail for z of 1.29 = 0.0985
  - P(X ≥ 89) = .0985, or 9.85%
- ALL WE ARE DOING IS THINKING ABOUT AREA UNDER CURVE A BIT DIFFERENTLY (SAME MATH)
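The test-score example can be reproduced with Python's standard library. This is a sketch of the same arithmetic, not a required method. It shows both the table-style answer (z rounded to 1.29 before the lookup) and the unrounded answer, which differs slightly.

```python
from statistics import NormalDist

mean_score, sd, score = 80, 7, 89

# Step 1: convert the raw score to a z-score.
z = (score - mean_score) / sd   # 9 / 7
z_rounded = round(z, 2)         # 1.29, as read from a z-table

# Step 2: area in the upper tail beyond z.
tail_from_table_z = 1 - NormalDist().cdf(z_rounded)      # about .0985
tail_unrounded = 1 - NormalDist(mean_score, sd).cdf(score)

print(f"z = {z:.4f} (rounded: {z_rounded})")
print(f"P(X >= 89) using z = 1.29: {tail_from_table_z:.4f}")
print(f"P(X >= 89) without rounding z: {tail_unrounded:.4f}")
```

The small gap between the two tail areas comes only from rounding z to two decimals, as a printed z-table requires.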
8 Probability & the Normal Distribution
- Bottom line
  - Normal distribution can also be thought of as probability distribution
  - Probabilities always range from 0 to 1
    - 0 = never happens
    - 1 = always happens
    - In between = happens some percent of the time
    - This is where our interest lies
9 Inferential Statistics (intro)
- Inferential statistics are used to generalize from a sample to a population
  - We seek knowledge about a whole class of similar individuals, objects or events (called a POPULATION)
  - We observe some of these (called a SAMPLE)
  - We extend (generalize) our findings to the entire class
10 WHY SAMPLE?
- Why sample?
  - It's often not possible to collect info. on all individuals you wish to study
  - Even if possible, it might not be feasible (e.g., because of time, cost, size of group)
11 WHY USE PROBABILITY SAMPLING?
- Representative sample
- One that, in the aggregate, closely approximates
the population from which it is drawn
12 PROBABILITY SAMPLING
- Samples selected in accord with probability theory, typically involving some random selection mechanism
  - If everyone in the population has an equal chance of being selected, it is likely that those who are selected will be representative of the whole group
  - EPSEM = Equal Probability of SElection Method
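One simple random selection mechanism that satisfies EPSEM is a simple random sample drawn without replacement. The sketch below assumes a hypothetical sampling frame of 1,000 ID numbers and a sample of 50; both figures are invented for illustration.

```python
import random

# HYPOTHETICAL sampling frame: ID numbers 1..1000 (not from the slides).
population = list(range(1, 1001))
sample_size = 50

rng = random.Random(42)  # fixed seed so the draw is repeatable

# random.sample draws without replacement; every ID has the same chance
# of ending up in the sample.
sample = rng.sample(population, k=sample_size)

selection_probability = sample_size / len(population)
print(f"first ten selected IDs: {sorted(sample)[:10]}")
print(f"chance that any given ID is selected: {selection_probability:.2f}")
```

Every member of the frame has the same 50-in-1,000 chance of selection, which is what the equal-probability requirement asks for.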
13 PARAMETER & STATISTIC
- Population
  - the total membership of a defined class of people, objects, or events
- Parameter
  - the summary description of a given variable in a population
- Statistic
  - the summary description of a variable in a sample (used to estimate a population parameter)
14 INFERENTIAL STATISTICS
- Samples are only estimates of the population
- Sample statistics will be slightly off from the true values of their population's parameters
- Sampling error
  - The difference between a sample statistic and a population parameter
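15 EXAMPLE OF HOW SAMPLE STATISTICS VARY FROM A POPULATION PARAMETER
CHILDREN'S AGE IN YEARS
- Population: µ = 4.5 (N = 50)
- Sample: x = 7, 0, 3, 1, 5, 8; X̄ = 4.0
- Sample: x = 5, 3, 8, 7, 4, 6; X̄ = 5.5
- Sample: x = 1, 7, 3, 4, 5, 6; X̄ = 4.3
- Sample: x = 2, 8, 4, 5, 9, 4; X̄ = 5.3
- Sample: x = 5, 9, 3, 0, 6, 5; X̄ = 4.7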
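The sampling error of each sample on this slide can be worked out directly. The sketch below uses the five samples of six ages and the population mean of 4.5 given on the slide; the variable names are chosen for illustration.

```python
# Population parameter from the slide.
mu = 4.5

# The five samples of children's ages (in years) shown on the slide.
samples = [
    [7, 0, 3, 1, 5, 8],
    [5, 3, 8, 7, 4, 6],
    [1, 7, 3, 4, 5, 6],
    [2, 8, 4, 5, 9, 4],
    [5, 9, 3, 0, 6, 5],
]

sample_means = [sum(ages) / len(ages) for ages in samples]

# Sampling error = sample statistic minus population parameter.
sampling_errors = [x_bar - mu for x_bar in sample_means]

for ages, x_bar, error in zip(samples, sample_means, sampling_errors):
    print(f"ages {ages}: mean = {x_bar:.1f}, sampling error = {error:+.2f}")
```

None of the five sample means equals 4.5 exactly. Some fall below the parameter and some above, and the second sample overshoots by a full year.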
16 By Contrast: Nonprobability Sampling
- Nonprobability sampling may be more appropriate and practical than probability sampling
  - When it is not feasible to include many cases in the sample (e.g., because of cost)
  - In the early stages of investigating a problem (i.e., when conducting an exploratory study)
- It is the only viable means of case selection
  - If the population itself contains few cases
  - If an adequate sampling frame doesn't exist
17 Nonprobability Sampling: 2 Types
- CONVENIENCE SAMPLING
  - When the researcher simply selects a requisite number of cases that are conveniently available
- SNOWBALL SAMPLING
  - Researcher asks interviewed subjects to suggest additional people for interviewing
18 Probability vs. Nonprobability Sampling: Research Situations
- For the following research situations, decide whether a probability or nonprobability sample would be more appropriate
  - You plan to conduct research delving into the motivations of serial killers.
  - You want to estimate the level of support among adult Duluthians for an increase in city taxes to fund more snow plows.
  - You want to learn the prevalence of alcoholism among the homeless in Duluth.
19 (Back to Probability Sampling) The Catch-22 of Inferential Stats
- When we collect a sample, we know nothing about the population's distribution of scores
  - We can calculate the mean (X̄) & standard deviation (s) of our sample, but µ and σ are unknown
  - The shape of the population distribution (normal?) is also unknown
    - Exceptions: IQ, height
20 PROBABILITY SAMPLING
- 2 Advantages of probability sampling
  - Probability samples are typically more representative than other types of samples
  - Allow us to apply probability theory
    - This permits us to estimate the accuracy or representativeness of the sample
21 SAMPLING DISTRIBUTION
- Sampling Distribution
  - From repeated random sampling, a mathematical description of all possible sampling event outcomes (and the probability of each one)
  - Permits us to make the link between sample and population
    - & answer the question "What is the probability that a sample statistic is due to chance?"
  - Based on probability theory
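22 EXAMPLE OF HOW SAMPLE STATISTICS VARY FROM A POPULATION PARAMETER
CHILDREN'S AGE IN YEARS
- Population: µ = 4.5 (N = 50)
- Sample: x = 7, 0, 3, 1, 5, 8; X̄ = 4.0
- Sample: x = 5, 3, 8, 7, 4, 6; X̄ = 5.5
- Sample: x = 1, 7, 3, 4, 5, 6; X̄ = 4.3
- Sample: x = 2, 8, 4, 5, 9, 4; X̄ = 5.3
- Sample: x = 5, 9, 3, 0, 6, 5; X̄ = 4.7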
23 What would happen... (Probability Theory)
- If we kept repeating the samples from the previous slide millions of times?
  - What would be our most common sample mean?
    - The population mean
  - What would the distribution shape be?
    - Normal (approximately)
- This is the idea of a sampling distribution
  - Sampling distribution of means
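The repeated-sampling thought experiment can be imitated with a short simulation. The slides give only µ = 4.5 and N = 50, not the 50 ages themselves, so the population below (ages 0 through 9, five children each) is an assumption chosen to match those two figures. The seed and the number of repetitions are also arbitrary.

```python
import random
from statistics import pstdev

# ASSUMPTION: a hypothetical population of 50 ages with mean 4.5
# (ages 0-9, five children at each age). The slides do not list the real one.
population = [age for age in range(10) for _ in range(5)]
mu = sum(population) / len(population)

rng = random.Random(2024)  # fixed seed so the run is repeatable
sample_size, repetitions = 6, 20_000

# Draw many random samples of 6 children and record each sample mean.
sample_means = []
for _ in range(repetitions):
    ages = rng.sample(population, sample_size)
    sample_means.append(sum(ages) / sample_size)

center = sum(sample_means) / repetitions
spread = pstdev(sample_means)

# For a roughly normal shape, about two thirds of the sample means should
# fall within one standard deviation of their center.
share_within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / repetitions

print(f"population mean          = {mu}")
print(f"mean of the sample means = {center:.2f}")
print(f"share within 1 SD        = {share_within_one_sd:.2f}")
```

Under these assumptions the sample means pile up around the population mean, and the share within one standard deviation comes out near the 68% expected of a normal curve, even though the assumed population itself is flat rather than bell-shaped.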
24 Relationship between Sample, Sampling Distribution & Population
- POPULATION
  - Empirical (exists in reality) but unknown
- SAMPLING DISTRIBUTION (distribution of sample outcomes)
  - Nonempirical (theoretical or hypothetical)
  - Laws of probability allow us to describe its characteristics (shape, central tendency, dispersion)
- SAMPLE
  - Empirical & known (e.g., distribution shape, mean, standard deviation)
25 THE TERMINOLOGY OF INFERENTIAL STATS
- Population
  - the universe of students at the local college
- Sample
  - 200 students (a subset of the student body)
- Parameter
  - 25% of students (p = .25) reported being Catholic; unknown, but inferred from sample statistic
- Statistic
  - Empirical & known: proportion of sample that is Catholic is 50/200, p = .25
- Random Sampling (a.k.a. "Probability")
  - Ensures EPSEM & allows for use of sampling distribution to estimate pop. parameter (infer from sample to pop.)
- Representative
  - EPSEM gives best chance that the sample statistic will accurately estimate the pop. parameter