Title: SAMPLING STRATEGIES
1THE CONCEPT OF STATISTICAL SIGNIFICANCE CHI-SQUAR
E AND THE NULL HYPOTHESIS
2READINGS
- Pollock, Essentials, ch. 5 and ch. 6, pp. 121-135
- Pollock, SPSS Companion, ch. 7
3OUTLINE
- Strategies for Sampling
- Establishing Confidence Intervals
- Chi-Square and the Null Hypothesis
- Critical Values of Chi-Square
4- Why Sample?
- Goal description of a population
- Advantages savings of time and money
- Basic paradox credibility of results from a
sample - depends on size and quality of the sample
itself, - and not on the size of the population
5Types of Samples Probability sampling Every
individual in the population has a known
probability of being included in the
sample Random sample (SRS) each individual has
an equal chance of being selected, and all
combinations are equally possible Systematic
sample every kth individualmore or
less equivalent to SRS if first selection is made
through random process Stratified sample
individuals separated into categories, and
independent (SRS) samples selected within the
categories Cluster sample population divided
into clusters, and random sample (SRS) then drawn
of the clusters
6- Parameters and Statistics
- A parameter is a number that describes the
population. It is - a fixed number, though we do not know its value.
- A statistic is a number that describes a sample.
We use - statistics to estimate unknown parameters.
- A goal of statistics To estimate the
probability that the - null hypothesis holds true for the population.
Forms - Parameter may not fall within a confidence band
that can be placed around a sample statistic, or - A relationship observed within a sample may not
have a satisfactory probability of existing
within the population.
7- Problems with Sampling (I)
- Bias
- A consistent, repeated deviation of the sample
statistic - from the population parameter
- Convenience sampling
- Voluntary response sampling
- Solution Use SRS
- Variation
- Signal large standard deviation within sample
- Range of sample statistics
- Solution Use larger N
-
8Problems in Sampling (II) Ho for
Sample Accepted Rejected Ho for Population
True Type I False Type II
Where Ho null hypothesis
9(No Transcript)
10What is Chi-square? A measure of
significance for cross-tabular
relationships Where fo observed frequency
(or cell count) And fe expected frequency
(or cell count) X2 S (fo fe)2/fe
11(No Transcript)
12Calculating Expected Frequencies fe col S
(row S/total N) for upper left-hand cell
802 (200/1,679) 95.5 fo 44 fo fe
44 95.5 -51.5 (fo fe)2 2,652.25 (fo
fe)2/fe 27.77
13 Conceptualizing Chi-Square
- Expected frequencies represent the null
hypothesis (no relationship) - Observed frequencies present visible results
- Question 1 Are observed frequencies different
from expected frequencies? - Question 2 Are they sufficiently different to
allow us to reject the possibility that the true
relationship (within the universe of case) is
null?
14Figuring Degrees of Freedom df (r 1)(c
1) Illustration Given marginal values,
________X________ __Y__ L H S
L 30 50 H 50 S
60 40 100 and df 1
15(No Transcript)
16Characteristics of Chi-Square
- Distribution for null hypothesis has a known
distributionskewed to the right - Specific distributions have corresponding degrees
of freedom, defined as (r-1)(c-1) - For a 2x2 table, chi-square of 3.841 or greater
would occur no more than 5 of the time in event
of null hypothesis (thus, .05 level or better)
17(No Transcript)
18(No Transcript)
19POSTSCRIPT
- X2 f (strength of relationship, sample size)
- The stronger the observed relationship within the
sample, the higher the X2 - The larger the sample (SRS), the higher the X2
- The higher the X2 (given degrees of freedom), the
greater the probability that null hypothesis does
not hold in the population (p lt .05)
20Limitations of Chi-Square
- No more than 20 of expected frequencies less
than 5 and all individual expected frequencies
are 1 or greater - Directly proportional to N observations
- Rejection of null hypothesis does not directly
confirm strength or direction of relationship
21Review Summary Measures for Cross-Tabulations
- Lambda-b PRE, ranges from zero to unity
measures strength only - Gamma Form and strength (-1 to 1), based on
pairs of observations - Chi-square Significance, based on deviation
from null hypothesis