Title: Non-Parametric Statistics Part I: Chi-Square
1 Non-Parametric Statistics Part I: Chi-Square (χ²)
2 χ² Operates on FREQUENCY Data
Suppose we have a plot of land on which we hope
to harvest wood. Maple is more valuable than oak,
and oak is more valuable than pine. We take a sample
of the trees (the whole plot is too big) and
ask whether there are significantly unequal
amounts of each type (α = .05).
             Pine  Maple  Oak
# of trees   145   301    289
We cannot compute a mean from these data, but there
are clear differences between the counts in each
category. These are categorical (nominal) data
expressed as frequencies, so we use the χ² test.
3 χ² Homogeneity
What are the null and alternative hypotheses?
H0: The groups have equal frequencies.
H1: The groups do not have equal frequencies.
Find the critical value
χ² table (df = k - 1 = 3 - 1 = 2): 5.99
Calculate the obtained statistic
                      Pine  Maple  Oak
# of trees observed   145   301    289
# of trees expected   245   245    245
Expected = (145 + 301 + 289)/3 = 245
χ² = Σ (O - E)²/E
   = (145 - 245)²/245 + (301 - 245)²/245 + (289 - 245)²/245
   = 61.52
Make a decision
Our obtained value (61.52) is larger than our critical
value (5.99). Reject the null: the groups do not have
equal frequencies.
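The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides: the function name `chi_square_equal` is an assumption introduced here, and the critical value is simply copied from the χ² table as on the slide.

```python
def chi_square_equal(observed):
    """Return (statistic, df) for observed counts tested against equal expected counts."""
    k = len(observed)
    expected = sum(observed) / k  # the same expected count for every category
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, k - 1

trees = {"Pine": 145, "Maple": 301, "Oak": 289}  # observed counts from the slide
stat, df = chi_square_equal(list(trees.values()))
critical = 5.99  # chi-square table value for df = 2, alpha = .05

print(round(stat, 2), df)  # 61.52 2
print("reject H0" if stat > critical else "retain H0")
```

The statistic is computed from the raw counts, not from percentages, which is why the expected value is the sample total divided by the number of categories.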
4 χ² Homogeneity Example
Is political affiliation distributed equally in
our class? (use α = .01)
What are the null and alternative hypotheses?
Find the critical value
χ² table (df = k - 1 = 3 - 1 = 2): 9.21
Calculate the obtained statistic
                       Democrat  Republican  Other
# of people observed   10        15          5
Expected = (10 + 15 + 5)/3 = 10
χ² = Σ (O - E)²/E
   = (10 - 10)²/10 + (15 - 10)²/10 + (5 - 10)²/10
   = 5
Make a decision
Our obtained value (5) is smaller than our critical
value (9.21). Retain the null: we cannot conclude that
the groups have unequal frequencies.
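Both table values used so far (5.99 and 9.21) can be checked without a table, because for df = 2 specifically the χ² upper-tail probability is exp(-x/2), so the critical value is -2 ln(α). The sketch below relies on that special case only; it does not generalize to other df, and the helper name `critical_value_df2` is an assumption made here for illustration.

```python
import math

def critical_value_df2(alpha):
    """Chi-square critical value for df = 2 only (closed form: -2 * ln(alpha))."""
    return -2.0 * math.log(alpha)

observed = [10, 15, 5]                     # Democrat, Republican, Other (from the slide)
expected = sum(observed) / len(observed)   # equal expected counts: 10 each
stat = sum((o - expected) ** 2 / expected for o in observed)

crit = critical_value_df2(0.01)
print(round(stat, 2), round(crit, 2))                # 5.0 9.21
print("reject H0" if stat > crit else "retain H0")   # retain H0
```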
5 χ² Goodness of Fit
Five years ago the tree lot was also sampled. Has
the composition of the lot changed since then
(use α = .05)?
We need a different expected value based on the
previous sample.
               Pine  Maple  Oak
# trees 2014   145   301    289
# trees 2009   255   115    103
Notice we're trying to compare the frequencies
from two time points, but the total number of trees
categorized in 2014 (735) is different from the 2009
total (473)!
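So we scale the 2009 proportions (rounded to .54, .24, .22)
to the 2014 total: expected = 2009 proportion × 735.
                   Pine   Maple  Oak
# trees expected   396.9  176.4  161.7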
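The expected counts can be reproduced by turning the 2009 counts into proportions and scaling them to the 2014 total. The sketch below is an illustration under one assumption inferred from the numbers: the slide's values match proportions rounded to two decimals (.54, .24, .22) before scaling. The unrounded version is shown alongside for comparison; the variable names are introduced here.

```python
counts_2009 = {"Pine": 255, "Maple": 115, "Oak": 103}
counts_2014 = {"Pine": 145, "Maple": 301, "Oak": 289}

total_2009 = sum(counts_2009.values())  # 473
total_2014 = sum(counts_2014.values())  # 735

# Slide-style expected counts: round each 2009 proportion to 2 decimals, then scale.
expected_rounded = {
    tree: round(round(n / total_2009, 2) * total_2014, 1)
    for tree, n in counts_2009.items()
}

# Unrounded alternative: scale the exact proportions.
expected_exact = {tree: n / total_2009 * total_2014 for tree, n in counts_2009.items()}

print(expected_rounded)  # {'Pine': 396.9, 'Maple': 176.4, 'Oak': 161.7}
print({tree: round(e, 2) for tree, e in expected_exact.items()})
```

Either way the expected counts sum to the 2014 total, which is what makes them comparable with the 2014 observations.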
6 χ² Goodness of Fit Example
What are the null and alternative hypotheses?
Find the critical value
χ² table (df = k - 1 = 3 - 1 = 2): 5.99
Calculate the obtained statistic
                   Pine   Maple  Oak
# trees 2014       145    301    289
# trees expected   396.9  176.4  161.7
χ² = Σ (O - E)²/E
   = (145 - 396.9)²/396.9 + (301 - 176.4)²/176.4 + (289 - 161.7)²/161.7
   = 348.10
Make a decision
Our obtained value (348.10) is larger than our critical
value (5.99). Reject the null: the composition of the
lot has changed.
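The goodness-of-fit statistic differs from the equal-frequencies case only in that each category has its own expected count. A minimal sketch, with the function name `chi_square_gof` chosen here for illustration and the expected counts taken directly from the slide:

```python
def chi_square_gof(observed, expected):
    """Sum of (O - E)^2 / E over categories, with a separate E for each category."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed_2014 = [145, 301, 289]        # Pine, Maple, Oak
expected_2014 = [396.9, 176.4, 161.7]  # scaled from the 2009 sample

stat = chi_square_gof(observed_2014, expected_2014)
df = len(observed_2014) - 1
critical = 5.99  # chi-square table value for df = 2, alpha = .05

print(round(stat, 2), df)
print("reject H0" if stat > critical else "retain H0")
```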
7 χ² Independence
8 χ² Independence Example (assume α = .05)
What are the null and alternative hypotheses?
Find the critical value
df for this test is (r - 1)(c - 1)
We have 2 rows and 3 columns, so df = (2 - 1)(3 - 1) = 2
χ² table (df = 2): 5.99
Calculate the obtained statistic
9 χ² Independence: How to calculate expected values
R = row total, C = column total
Grand total = 1500
Expected value = (R × C) / grand total
Expected Mirkwood-Pine = (702 × 356)/1500 = 166.61
Expected Old Forest-Pine = (798 × 356)/1500 = 189.39
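The expected-value rule is easy to check in code. This sketch uses only the totals that appear on the slide (row totals 702 and 798, Pine column total 356, grand total 1500); the function name `expected_count` is an assumption introduced here.

```python
def expected_count(row_total, col_total, grand_total):
    """Expected cell count under independence: (R x C) / grand total."""
    return row_total * col_total / grand_total

grand_total = 1500
row_totals = {"Mirkwood": 702, "Old Forest": 798}  # forest (row) totals
pine_total = 356                                   # Pine (column) total

expected_pine = {
    forest: expected_count(r, pine_total, grand_total)
    for forest, r in row_totals.items()
}
for forest, e in expected_pine.items():
    print(forest, round(e, 2))
# Mirkwood 166.61
# Old Forest 189.39
```

The two expected Pine counts add up to the Pine column total (356), a quick sanity check that applies to every column.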
10 χ² Independence
Observed Values
Expected Values
χ² = 28.18
11 χ² Independence Example (assume α = .05)
What are the null and alternative hypotheses?
H0: Tree type and forest are independent.
H1: Tree type and forest are not independent.
Find the critical value
df for this test is (r - 1)(c - 1)
We have 2 rows and 3 columns, so df = (2 - 1)(3 - 1) = 2
χ² table (df = 2): 5.99
Calculate the obtained statistic
χ² = 28.18
Make a decision
Our obtained value (28.18) is larger than our critical
value (5.99). Reject the null: tree type and forest are
not independent.
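The whole independence test can be sketched as one function that derives the expected counts from the row and column totals and then sums (O - E)²/E over every cell. The observed forest-by-tree table is not reproduced in the text above, so the counts below are a made-up toy table used purely to show the mechanics; they are not the slide's data and will not give 28.18. The function name `chi_square_independence` is likewise an assumption.

```python
def chi_square_independence(table):
    """Return (statistic, df) for a table of observed counts (a list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total  # (R x C) / N
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
    return statistic, df

# Hypothetical 2 x 3 table (2 forests x 3 tree types), NOT the slide's counts.
toy_table = [
    [10, 20, 30],
    [30, 20, 10],
]
stat, df = chi_square_independence(toy_table)
print(stat, df)  # 20.0 2
print("reject H0" if stat > 5.99 else "retain H0")
```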