Title: Topics: Inferential Statistics
1Topics: Inferential Statistics
- Inference
- Terminology
- Central Limit Theorem
- Estimation
- Point Estimation
- Confidence Intervals
- Hypothesis Testing
2Inferential Statistics
- Research is about trying to make valid inferences
- Inferential statistics: the part of statistics
that allows researchers to generalize their
findings beyond the data collected
- Statistical inference: a procedure for making
inferences or generalizations about a larger
population from a sample of that population
3How Statistical Inference Works
4Basic Terminology
- Population: any collection of entities that have
at least one characteristic in common
- Parameter: a number that describes a
characteristic of scores in the population
(mean, variance, s.d., correlation coefficient,
etc.)
5Basic Terminology (cont'd)
- Sample: a part of the population
- Statistic: a number that describes a
characteristic of scores in the sample (mean,
variance, s.d., correlation coefficient,
reliability coefficient, etc.)
6Basic Statistical Symbols
7Basic Terminology (cont'd)
- Estimate: a number computed from the data
collected from a sample and used as a best guess
about a population parameter
- Estimator: the formula used to compute an estimate
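The difference between an estimator and an estimate can be shown in a few lines of Python. This is only an illustrative sketch: the scores and the function name `sample_mean` are made up and do not come from the slides.

```python
# Hypothetical sample of test scores (illustrative data, not from the slides).
sample_scores = [480, 510, 495, 470, 505]


def sample_mean(scores):
    """Estimator: the formula (rule) used to compute the mean of a sample."""
    return sum(scores) / len(scores)


# Estimate: the single number the estimator produces for this particular sample.
# It serves as the best guess about the population mean.
estimate = sample_mean(sample_scores)
print(estimate)  # prints 492.0
```

A different sample run through the same estimator would generally give a different estimate.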
8The Process of Estimation
9 Types of Samples
- Probability
- Simple Random Samples
- Simple Stratified Samples
- Systematic Samples
- Cluster Samples
- Non-Probability
- Purposive Samples
- Convenience Samples
- Quota Samples
- Snowball Samples
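The four probability designs listed above can be sketched with Python's standard `random` module. This is a simplified illustration under stated assumptions: the population of 100 unit IDs, the two strata, and the ten "school" clusters are all hypothetical, and real survey designs involve more detail (for example, proportional allocation across strata).

```python
import random


def simple_random_sample(population, n, rng):
    """Every unit has the same chance of being selected."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th unit."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(0)              # fixed seed so the sketch is repeatable
population = list(range(1, 101))    # hypothetical unit IDs 1..100
strata = {"low": population[:50], "high": population[50:]}
clusters = {f"school_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, rng)
```

The non-probability designs (purposive, convenience, quota, snowball) have no comparable sketch, because selection there does not rely on a random mechanism.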
10Evaluating Samples
- If randomness of the sample is impaired by refusal to
participate:
- Is the participation rate reasonably high?
- Is there reason to believe participants and
non-participants are similar on the relevant
variables?
- If the sample was not random:
- Is it drawn from the target group for the
generalization?
- Is it at least reasonably diverse?
- Does the researcher explicitly discuss this
limitation?
- Has the author described relevant demographics of
the sample?
- Is the sample size sufficiently large?
11Limits on Inferences and Warnings
- Response Rates
- Source of data
- Sample size and sample quality
- Random
12Estimation
- Point Estimation
- Interval estimation
- Sampling Error
- Sampling Distribution
- Confidence Intervals
13Interval Estimation
- Interval estimation: an inferential statistical
procedure used to estimate population parameters
from sample data by building
confidence intervals
- Confidence interval: a range of values computed
from sample data that has a known probability of
capturing some population parameter of interest
14Key Concepts
- Sampling Distribution
- Sampling Error
- Standard Error
- Confidence Interval
15Central Limit Theorem
- The sampling distribution of means, for samples
of 30 or more:
- Is approximately normally distributed (regardless of the shape
of the population from which the samples were
drawn)
- Has a mean equal to the population mean, mu,
regardless of the shape of the population or of the size
of the sample
- Has a standard deviation, the standard error of
the mean, equal to the population standard
deviation divided by the square root of the
sample size
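A small simulation can make these three claims tangible. The sketch below is not from the slides: it assumes a strongly skewed exponential population (mean 1, s.d. 1), a sample size of 30, and 5000 repeated samples, all chosen only for illustration.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 30                    # size of each sample
num_samples = 5000        # number of samples drawn

# Assumed population: exponential with rate 1 (skewed; mean = 1, s.d. = 1).
pop_mean, pop_sd = 1.0, 1.0

# One mean per sample: together these approximate the sampling distribution.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)   # should be close to pop_mean
sd_of_means = statistics.stdev(sample_means)     # should be close to the SEM
predicted_sem = pop_sd / n ** 0.5                # population s.d. / sqrt(n)

print(mean_of_means, sd_of_means, predicted_sem)
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the population itself is far from normal.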
16Sampling Distribution of Mean
17Central Limit Theorem Again
- The sampling distribution of means, for samples
of 30 or more:
- Is approximately normally distributed (regardless of the shape
of the population from which the samples were
drawn)
- Has a mean equal to the population mean, mu,
regardless of the shape of the population or of the size
of the sample
- Has a standard deviation, the standard error of
the mean, equal to the population standard
deviation divided by the square root of the
sample size
18Sampling Distribution and Sampling Error Under CLT
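[Figure: normal sampling distribution of the mean, centered on the population mean (mu), with the horizontal axis marked at -3, -2, -1, +1, +2, and +3 SEM]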
19Sampling Distribution and Standard Error
- Sampling distribution: a theoretical distribution
that shows the frequency of occurrence of values
of some statistic computed for all possible
samples of size n drawn from some population
- Sampling distribution of the mean: a theoretical
distribution of the frequency of occurrence of
values of the mean computed for all possible
samples of size n from a population
- Standard error: the standard deviation of the
sampling distribution of a statistic
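For a very small population, "all possible samples of size n" can be listed explicitly, so the sampling distribution of the mean can be computed exactly rather than simulated. The sketch below assumes a made-up population of six values and samples of size 2 drawn with replacement; neither choice comes from the slides.

```python
import itertools
import statistics

population = [1, 2, 3, 4, 5, 6]   # hypothetical tiny population
n = 2                             # sample size

# Every possible ordered sample of size n, drawn with replacement (6**2 = 36).
all_samples = list(itertools.product(population, repeat=n))

# Sampling distribution of the mean: one mean per possible sample.
sample_means = [statistics.fmean(s) for s in all_samples]

mu = statistics.fmean(population)       # population mean
sigma = statistics.pstdev(population)   # population standard deviation

mean_of_means = statistics.fmean(sample_means)
standard_error = statistics.pstdev(sample_means)   # s.d. of the sampling distribution

print(len(all_samples), mean_of_means, standard_error, sigma / n ** 0.5)
```

The last two printed values agree: the standard deviation of the sampling distribution equals the population standard deviation divided by the square root of n.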
20Review: Different Versions of Standard Deviations
- Standard deviation
- Standard error of measurement
- Standard error of a statistic (mean, variance,
correlation coefficient, t-statistic, etc.)
21Building a Confidence Interval (95%)
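[Figure: sampling distribution of the mean centered at 100, with the central 95% region indicated and the axis marked from -3 SEM to +3 SEM; the tick values 97, 98.5, 100, 101.5, 103, and 104.5 correspond to -2 SEM through +3 SEM]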
22Confidence Intervals (CI)
- A range of values, computed from the sample data,
that has a known probability of including
the population parameter of interest
- The interval is built around the computed
statistic (in this case the sample mean) by adding
and subtracting a specific amount (in this case a
number of standard errors)
23Process for Constructing Confidence Intervals
- Compute the sample statistic (e.g., a mean)
- Compute the standard error of the statistic
(mean)
- Decide on the level of confidence
desired (usually 95% or 99%)
- Identify the table value for the 95% or 99% confidence
interval
- Multiply the standard error of the mean by the tabled
value
- Form the interval by adding and subtracting the
calculated value to and from the mean
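The steps above can be sketched as a short Python function for the case of a mean with a known population standard deviation. The function name `confidence_interval`, the `Z_TABLE` lookup, and the example inputs (mean 100, s.d. 15, n = 100) are assumptions made for illustration only.

```python
import math

# Tabled two-sided z values for the two usual confidence levels.
Z_TABLE = {95: 1.96, 99: 2.58}


def confidence_interval(sample_mean, population_sd, n, level=95):
    """Build a CI for a mean, assuming the population s.d. is known."""
    standard_error = population_sd / math.sqrt(n)        # standard error of the mean
    table_value = Z_TABLE[level]                         # chosen level -> table value
    margin = table_value * standard_error                # table value x standard error
    return sample_mean - margin, sample_mean + margin    # mean -/+ margin


# Hypothetical inputs: sample mean 100, population s.d. 15, n = 100.
low, high = confidence_interval(100, 15, 100, level=95)
print(round(low, 2), round(high, 2))  # prints 97.06 102.94
```

Passing `level=99` reuses the same steps with the larger table value, which gives a wider interval.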
24Various Levels of Confidence
- When the population standard deviation is known, use Z
table values
- For 95% CI: mean ± 1.96 × s.e. of mean
- For 99% CI: mean ± 2.58 × s.e. of mean
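The two table values can be reproduced from the standard normal distribution. A minimal sketch using Python's standard library follows; the helper name `z_critical` is made up for illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value: leaves (1 - confidence) / 2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


z95 = z_critical(0.95)
z99 = z_critical(0.99)
print(round(z95, 2), round(z99, 2))  # prints 1.96 2.58
```

The unrounded 99% value is about 2.576, which Z tables commonly round to 2.58.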
25Factors Affecting Width of Confidence Intervals
- Level of confidence
- Data variability
- Sample size
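How each factor changes the width can be checked numerically. The sketch below assumes the known-s.d. case, where the width is 2 × z × s.d. / sqrt(n); the function name `ci_width` and the baseline values (s.d. 100, n = 1000) are illustrative choices.

```python
import math
from statistics import NormalDist


def ci_width(confidence, sd, n):
    """Full width of a z-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sd / math.sqrt(n)


baseline = ci_width(0.95, sd=100, n=1000)
higher_confidence = ci_width(0.99, sd=100, n=1000)   # wider: larger z value
more_variability = ci_width(0.95, sd=200, n=1000)    # wider: doubles with the s.d.
larger_sample = ci_width(0.95, sd=100, n=4000)       # narrower: halves when n is x4

print(baseline, higher_confidence, more_variability, larger_sample)
```

Because the sample size enters through a square root, halving the width takes four times as many observations.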
26Effects of Confidence Level
95% confidence: 95 times out of 100, the interval
constructed around the sample mean will
capture the population mean; 5 times out of 100
the interval will not capture the population mean.
99% confidence: 99 times out of 100, the interval
constructed around the sample mean will
capture the population mean; 1 time out of 100
the interval will not capture the population mean.
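[Figure: sampling distribution centered on the population mean (mu), with the 95% region spanning -1.96 SEM to +1.96 SEM and the 99% region spanning -2.58 SEM to +2.58 SEM]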
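The "times out of 100" reading can be checked by simulation: draw many samples, build an interval around each sample mean, and count how often the interval captures the population mean. The sketch assumes a normal population with mean 100 and s.d. 15, samples of size 25, and 10000 trials, none of which come from the slides.

```python
import math
import random

rng = random.Random(1)             # fixed seed so the run is repeatable
mu, sigma, n = 100.0, 15.0, 25     # hypothetical population and sample size
sem = sigma / math.sqrt(n)         # standard error of the mean
trials = 10000


def coverage(z):
    """Fraction of intervals (sample mean -/+ z * SEM) that contain mu."""
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - z * sem <= mu <= sample_mean + z * sem:
            hits += 1
    return hits / trials


cov95 = coverage(1.96)   # expected to be close to 0.95
cov99 = coverage(2.58)   # expected to be close to 0.99
print(cov95, cov99)
```

The confidence level describes the long-run success rate of the procedure; any single interval either contains the population mean or it does not.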
27Effects of Variability
28Effects of Sample Size
29Practice Example: What does the 95% CI really
mean?
- Sample of 1000 California seniors
- A sample mean of 489 on a test of US History
- Population (all California seniors) mean not
known
- Known population standard deviation of 100
- Pick the critical value for the confidence
interval
- For 95% CI (with population s.d. known): ±1.96
- Calculate the standard error of the mean
- SEM = SD/sqrt(N) = 100/sqrt(1000) = 100/31.62 ≈ 3.162
- Create a 95% confidence interval
- Adding and subtracting 1.96 × 3.162 ≈ 6.20 to and from 489
creates a 95% confidence interval (CI): 482.80 to
495.20
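The arithmetic in this example can be recomputed in a few lines of Python, carrying full precision until the final rounding. The variable names are chosen here for illustration; the inputs (n = 1000, mean 489, s.d. 100, critical value 1.96) are those of the example.

```python
import math

n = 1000              # sample size
sample_mean = 489     # sample mean on the US History test
population_sd = 100   # known population standard deviation
z = 1.96              # critical value for a 95% CI

sem = population_sd / math.sqrt(n)   # standard error of the mean
margin = z * sem                     # amount added and subtracted
lower, upper = sample_mean - margin, sample_mean + margin

print(f"SEM = {sem:.2f}, margin = {margin:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
# prints: SEM = 3.16, margin = 6.20, 95% CI = 482.80 to 495.20
```

Read as a statement about the procedure: if samples of 1000 seniors were drawn repeatedly, about 95% of the intervals built this way would capture the true California mean.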