Title: Hypothesis Testing
1. Lecture 10
2. From a previous lecture: Functions of a Random Variable
Any function of a random variable is itself a random variable.
3. If $x$ has distribution $p(x)$, then $y(x)$ has distribution
$p(y) = p[x(y)] \, dx/dy$
4. Example
- Let $x$ have a uniform (white) distribution on $[0, 1]$
[Figure: $p(x)$ versus $x$, with $p(x) = 1$ for $0 \le x \le 1$. Uniform probability that $x$ is anywhere between 0 and 1.]
5.
- Let $y = x^2$
- then $x = y^{1/2}$
- $p[x(y)] = 1$ and $dx/dy = \tfrac{1}{2} y^{-1/2}$
- So $p(y) = \tfrac{1}{2} y^{-1/2}$ on the interval $[0, 1]$
[Figure: $p(y)$ versus $y$.]
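A quick numerical check of this result, as a minimal sketch: integrating $p(y) = \tfrac{1}{2} y^{-1/2}$ from 0 to $t$ gives $P(y < t) = \sqrt{t}$, which can be compared with the fraction of simulated values of $y = x^2$ that fall below $t$. The function name, sample size, and seed are illustrative choices, not part of the lecture.

```python
import random

def fraction_below(threshold, n_samples=200_000, seed=0):
    """Empirical P(y < threshold) for y = x**2 with x uniform on [0, 1)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_samples):
        y = rng.random() ** 2
        if y < threshold:
            count += 1
    return count / n_samples

# Integrating p(y) = 0.5 * y**-0.5 from 0 to t gives P(y < t) = sqrt(t).
for t in (0.04, 0.25, 0.64):
    print(t, fraction_below(t), t ** 0.5)
```

The two printed columns should agree to within the sampling error of the simulation.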
6. Another example
- Let $x$ have a normal distribution with zero expectation and unit variance. To avoid complication, assume $x > 0$, so that the distribution has twice the usual amplitude:
$p(x) = 2 (2\pi)^{-1/2} \exp(-\tfrac{1}{2} x^2)$
The distribution of $y = x^2$ is
$p(y) = p[x(y)] \, dx/dy = (2\pi y)^{-1/2} \exp(-\tfrac{1}{2} y)$
Note that we used, as before, $dx/dy = \tfrac{1}{2} y^{-1/2}$. You can check that $\int_0^\infty p(y)\,dy = 1$ by looking up $\int_0^\infty y^{-1/2} \exp(-ay)\,dy = \sqrt{\pi/a}$ in a math book.
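Spelling out that normalization check, with $a = \tfrac{1}{2}$ in the tabulated integral:

```latex
\int_0^\infty p(y)\,dy
  = (2\pi)^{-1/2} \int_0^\infty y^{-1/2} \exp(-\tfrac{1}{2} y)\,dy
  = (2\pi)^{-1/2} \sqrt{\pi / \tfrac{1}{2}}
  = (2\pi)^{-1/2} (2\pi)^{1/2}
  = 1
```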
7. [Figure: $p(x)$ versus $x$ and $p(y)$ versus $y$; $p(y)$ has a singularity at the origin.]
Results not so different from the uniform distribution.
8. From a previous lecture: Functions of Two Random Variables
Any function of several random variables is itself a random variable.
9. If $(x, y)$ has joint distribution $p(x, y)$, then given $u(x, y)$ and $v(x, y)$,
$p(u, v) = p[x(u,v), y(u,v)] \, |\partial(x,y)/\partial(u,v)|$
where $|\partial(x,y)/\partial(u,v)|$ is the Jacobian determinant. Note then that $p(u) = \int p(u,v)\,dv$ and $p(v) = \int p(u,v)\,du$.
10. Example
- $p(x,y) = 1/(2\pi\sigma^2) \exp(-x^2/2\sigma^2) \exp(-y^2/2\sigma^2)$
- $= 1/(2\pi\sigma^2) \exp(-(x^2+y^2)/2\sigma^2)$
- uncorrelated normal distribution of two variables with zero expectation and equal variance, $\sigma^2$
[Figure: $p(x,y)$ for $\sigma = 1$.]
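11. What's the distribution of $u = x^2 + y^2$?
- We need to choose a function $v(x,y)$. A reasonable choice is motivated by polar coordinates, $v = \tan^{-1}(x/y)$. (We usually call these $r^2$ and $\theta$ in polar coordinates.)
- Then $x = u^{1/2} \sin(v)$ and $y = u^{1/2} \cos(v)$
- And the Jacobian determinant is
$|\partial(x,y)/\partial(u,v)| = \left| \det \begin{pmatrix} \partial x/\partial u & \partial y/\partial u \\ \partial x/\partial v & \partial y/\partial v \end{pmatrix} \right| = \left| \det \begin{pmatrix} \tfrac{1}{2} u^{-1/2} \sin(v) & \tfrac{1}{2} u^{-1/2} \cos(v) \\ u^{1/2} \cos(v) & -u^{1/2} \sin(v) \end{pmatrix} \right| = \tfrac{1}{2} \sin^2(v) + \tfrac{1}{2} \cos^2(v) = \tfrac{1}{2}$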
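12.
- So $p(x,y) = 1/(2\pi\sigma^2) \exp(-(x^2+y^2)/2\sigma^2)$
- transforms to
- $p(u,v) = 1/(4\pi\sigma^2) \exp(-u/2\sigma^2)$
- $p(u) = \int_0^{2\pi} 1/(4\pi\sigma^2) \exp(-u/2\sigma^2)\,dv = 1/(2\sigma^2) \exp(-u/2\sigma^2)$
Note $\int_0^\infty p(u)\,du = [-\exp(-u/2\sigma^2)]_0^\infty = 1$, as expected.
[Figure: $p(u)$ versus $u$.]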
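The exponential form of $p(u)$ can be checked by simulation. The sketch below assumes nothing beyond the setup above: it draws uncorrelated normal pairs, forms $u = x^2 + y^2$, and compares the fraction exceeding a threshold $t$ with $\exp(-t/2\sigma^2)$, the integral of $p(u)$ from $t$ to infinity. The function name, sample size, and seed are illustrative.

```python
import math
import random

def tail_fraction(threshold, sigma=1.0, n_samples=200_000, seed=0):
    """Empirical P(u > threshold) for u = x**2 + y**2, x and y ~ N(0, sigma**2)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma)
        y = rng.gauss(0.0, sigma)
        if x * x + y * y > threshold:
            count += 1
    return count / n_samples

# Integrating p(u) = exp(-u / (2 sigma^2)) / (2 sigma^2) from t to infinity
# gives P(u > t) = exp(-t / (2 sigma^2)).
sigma = 1.0
for t in (1.0, 2.0, 5.0):
    print(t, tail_fraction(t, sigma), math.exp(-t / (2.0 * sigma ** 2)))
```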
13. The point of my showing you this is to give you the sense that computing the probability distributions associated with functions of random variables is not particularly mysterious, but instead is rather routine (though possibly algebraically tedious).
14. Four (and only four) Important Distributions
- Start with a bunch of random variables, $x_i$, that are uncorrelated, normally distributed, with zero expectation and unit variance
- The four important distributions are
- the distribution of $x_i$ itself, and the
- distributions of three possible choices of $u(x_0, x_1, \ldots)$:
- $u = \sum_{i=1}^{N} x_i^2$
- $u = x_0 / \sqrt{N^{-1} \sum_{i=1}^{N} x_i^2}$
- $u = N^{-1} \sum_{i=1}^{N} x_i^2 \,/\, M^{-1} \sum_{i=1}^{M} x_{N+i}^2$
15. Important Distribution 1
- Distribution of $x_i$ itself
- (normal distribution with zero mean and unit variance)
- $p(x_i) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2} x_i^2)$
- Suppose that a normally distributed random variable $y$ has expectation $\bar{y}$ and variance $\sigma_y^2$.
- Then note the variable
- $Z = (y - \bar{y}) / \sigma_y$
- is normally distributed with zero mean and unit variance.
- We show this by noting $p(Z) = p[y(Z)] \, dy/dZ$ with $dy/dZ = \sigma_y$, so that $p(y) = (2\pi)^{-1/2} \sigma_y^{-1} \exp(-\tfrac{1}{2} (y - \bar{y})^2 / \sigma_y^2)$ transforms to $p(Z) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2} Z^2)$
16. [Figure: $p(x)$ versus $x$.]
17. Properties of the normal distribution (with zero expectation and unit variance)
Mean = 0; Mode = 0; Variance = 1
18. Important Distribution 2
- Distribution of $u = \sum_{i=1}^{N} x_i^2$
- the sum of squares of $N$ normally-distributed random variables with zero expectation and unit variance
- This is called the chi-squared distribution with $N$ degrees of freedom, and $u$ is given the special symbol $u = \chi_N^2$
- We have already computed the $N = 1$ and $N = 2$ cases!
19. [Figure: $p(\chi_N^2)$ versus $\chi_N^2$ for several values of $N$, with the cases we worked out marked.]
20. Properties of the chi-squared distribution
- $p(\chi_N^2) = \frac{1}{2^{N/2} \, (\tfrac{1}{2} N - 1)!} \, (\chi_N^2)^{N/2 - 1} \exp(-\tfrac{1}{2} \chi_N^2)$
Mean = $N$; Mode = 0 if $N < 2$, $N - 2$ otherwise; Variance = $2N$
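As a consistency check, the chi-squared density can be coded directly and compared with the two cases derived earlier in the lecture: the squared half-normal variable ($N = 1$) and the sum of two squares with unit variance ($N = 2$). This is a minimal sketch; the function name and test point are illustrative, and `math.gamma` stands in for the factorial of a half-integer argument.

```python
import math

def chi2_pdf(u, n):
    """Chi-squared density with n degrees of freedom, for u > 0."""
    half_n = 0.5 * n
    # math.gamma(half_n) plays the role of (N/2 - 1)! in the formula.
    return u ** (half_n - 1.0) * math.exp(-0.5 * u) / (2.0 ** half_n * math.gamma(half_n))

u = 1.7  # arbitrary test point
# N = 1: the squared half-normal example gave (2*pi*y)**-0.5 * exp(-y/2).
print(chi2_pdf(u, 1), (2.0 * math.pi * u) ** -0.5 * math.exp(-0.5 * u))
# N = 2: the polar-coordinate example with sigma = 1 gave 0.5 * exp(-u/2).
print(chi2_pdf(u, 2), 0.5 * math.exp(-0.5 * u))
```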
21. Important Distribution 3
- Distribution of $u = x_0 / \sqrt{N^{-1} \sum_{i=1}^{N} x_i^2}$
- the ratio of
- a normally-distributed random variable with zero expectation and unit variance
- and
- the square root of the mean square of $N$ normally-distributed random variables with zero expectation and unit variance (the sum of their squares divided by $N$)
- This is called Student's t distribution with $N$ degrees of freedom, and $u$ is given the special symbol $u = t_N$
22. [Figure: $p(t_N)$ versus $t_N$ for values of $N$ up to $\infty$.]
Note the $N = 1$ case is very long-tailed. Otherwise it looks pretty much like a Gaussian; in fact, it is a Gaussian in the limiting case $N \to \infty$.
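23. Properties of Student's $t_N$ distribution
- $p(t_N) = \frac{(\tfrac{1}{2} N - \tfrac{1}{2})!}{\sqrt{N\pi} \, (\tfrac{1}{2} N - 1)!} \, (1 + t_N^2 / N)^{-(N+1)/2}$
Mean = 0; Mode = 0; Variance = $\infty$ if $N < 3$, $N/(N-2)$ otherwise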
24. Important Distribution 4
- Distribution of $u = N^{-1} \sum_{i=1}^{N} x_i^2 \,/\, M^{-1} \sum_{i=1}^{M} x_{N+i}^2$
- the ratio of
- the sum of squares of $N$ normally-distributed random variables with zero expectation and unit variance, divided by $N$
- and
- the sum of squares of $M$ normally-distributed random variables with zero expectation and unit variance, divided by $M$
- This is called the F distribution with $N$ and $M$ degrees of freedom, and $u$ is given the special symbol $u = F_{N,M}$
25. [Figure: $p(F_{N,M})$ versus $F_{N,M}$ for several values of $N$ and $M$.]
26. Properties of the $F_{N,M}$ distribution
- $p(F_{N,M})$: too complicated for me to type in
Mean = $M/(M-2)$ if $M > 2$
Mode = $[(N-2)/N] \, [M/(M+2)]$ if $N > 2$
Variance = $2M^2(N+M-2) / [N (M-2)^2 (M-4)]$ if $M > 4$
27.
28.
- The Null Hypothesis
- always a variant of this theme:
- the result of an experiment differs from the expected value only because of random variation
29.
- Test of Significance of Results
- say to 95% significance
- The Null Hypothesis would generate the observed result less than 5% of the time
30.
- Example: You buy an automated pipe-cutting machine that cuts long pipes into many segments of equal length
- Specifications:
- calibration (mean, $\mu_m$): exact
- repeatability (variance, $\sigma_m^2$): 100 mm$^2$
- Now you test the machine by having it cut 25 10000-mm length pipe segments. You then measure and tabulate the length of each pipe segment, $L_i$.
31.
- Question 1: Is the machine's calibration correct?
- Null Hypothesis: any difference between the mean length of the test pipe segments and the specified 10000 mm can be ascribed to random variation
- You estimate the mean of the 25 samples: $\mu_{obs} = 9990$ mm
- The mean length deviates $(\mu_m - \mu_{obs}) = 10$ mm from the setting of 10000. Is this significant?
- Note from a prior lecture, the variance of the mean is
- $\sigma_{mean}^2 = \sigma_{data}^2 / N$.
32.
- So the quantity
- $Z = (\mu_m - \mu_{obs}) / (\sigma_m / \sqrt{N})$, where $\mu_{obs} = N^{-1} \sum_i L_i$
- is normally distributed with zero expectation and unit variance.
- In our case $Z = 10 / (10/5) = 5$
- $Z = 5$ means that $(\mu_m - \mu_{obs})$ is 5 standard deviations from its expected value of zero.
Scaling a quantity so it has zero mean and unit variance is an important trick.
33.
- The amount of area under the normal distribution that is 5 standard deviations away from the mean is very small. We can calculate it using the Excel function
- NORMDIST(x,mean,standard_dev,cumulative)
- x is the value for which you want the distribution.
- mean is the arithmetic mean of the distribution.
- standard_dev is the standard deviation of the distribution.
- cumulative is a logical value that determines the form of the function. If cumulative is TRUE, NORMDIST returns the cumulative distribution function; if FALSE, it returns the probability density function.
- 2*NORMDIST(-5,0,1,TRUE) = 5.7421E-07 = 0.00006%
(Factor of two to account for both tails.)
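The same calculation can be sketched outside Excel. The example below uses Python's standard-library `statistics.NormalDist` in place of NORMDIST; the variable names are illustrative. The result should come out near 5.7E-07, though it may differ from the Excel figure quoted on the slide in the later digits.

```python
import math
from statistics import NormalDist

mu_spec = 10000.0   # specified mean length, mm
mu_obs = 9990.0     # observed mean of the test segments, mm
sigma_spec = 10.0   # square root of the specified variance (100 mm^2), mm
n = 25              # number of test segments

z = (mu_spec - mu_obs) / (sigma_spec / math.sqrt(n))
# Two-sided tail probability: area beyond |z| in both tails of N(0, 1).
p_two_sided = 2.0 * NormalDist(0.0, 1.0).cdf(-abs(z))
print(z, p_two_sided)
```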
34. Thus the Null Hypothesis, that the machine is well-calibrated, can be excluded to very high probability.
35.
- Question 2: Is the machine's repeatability within specs?
- Null Hypothesis: any difference between the repeatability (variance) of the test pipe segments and the specified $\sigma_m^2 = 100$ mm$^2$ can be ascribed to random variation
- The quantity $x_i = (L_i - \mu_m) / \sigma_m$ is normally distributed with mean = 0 and variance = 1, so
- the quantity $\chi_N^2 = \sum_i (L_i - \mu_m)^2 / \sigma_m^2$ is chi-squared distributed with 25 degrees of freedom.
36.
- Suppose that the root mean squared variation of pipe lengths was $[N^{-1} \sum_i (L_i - \mu_m)^2]^{1/2} = 12$ mm.
- Then $\chi_{25}^2 = \sum_i (L_i - \mu_m)^2 / \sigma_m^2 = 25 \times 144 / 100 = 36$
- CHIDIST(x,degrees_freedom)
- x is the value at which you want to evaluate the distribution.
- degrees_freedom is the number of degrees of freedom.
- CHIDIST = P(X > x), where X is a $\chi^2$ random variable.
- The probability that $\chi_{25}^2 \ge 36$ is CHIDIST(36,25) = 0.07, or 7%
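For readers without Excel, the upper-tail probability can be approximated by integrating the chi-squared density numerically. The sketch below uses only the standard library; the Simpson's-rule helper, the function names, and the step count are assumptions made for illustration, not the method Excel uses internally.

```python
import math

def chi2_pdf(u, n):
    """Chi-squared density with n degrees of freedom (n > 2 assumed here)."""
    half_n = 0.5 * n
    return u ** (half_n - 1.0) * math.exp(-0.5 * u) / (2.0 ** half_n * math.gamma(half_n))

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def chi2_upper_tail(x, n):
    """P(X > x) for chi-squared X, the quantity CHIDIST(x, n) reports."""
    return 1.0 - simpson(lambda u: chi2_pdf(u, n), 0.0, x)

n = 25
rms = 12.0          # observed rms deviation, mm
sigma_spec = 10.0   # specified standard deviation, mm
chi2_obs = n * rms ** 2 / sigma_spec ** 2
print(chi2_obs, chi2_upper_tail(chi2_obs, n))
```

The printed tail probability should be close to the 7% quoted above.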
37. Thus the Null Hypothesis, that the difference from the expected result of $\sigma_m = 10$ mm is random variation, cannot be excluded (not to greater than 95% probability).
38. Question 3: But suppose the manufacturer had not stated a repeatability spec, just a calibration spec. You can't test the calibration using the quantity $Z = (\mu_m - \mu_{obs}) / (\sigma_m / \sqrt{N})$, because $\sigma_m$ is not known.
39. Since the manufacturer has not supplied a variance, we must estimate it from the data, $\sigma_{obs}^2 = N^{-1} \sum_i (L_i - \mu_m)^2 = 144$ mm$^2$, and use it in the formula $(\mu_m - \mu_{obs}) / (\sigma_{obs} / \sqrt{N})$
40. But the quantity $(\mu_m - \mu_{obs}) / (\sigma_{obs} / \sqrt{N})$ is not normally distributed, because $\sigma_{obs}$ is itself a random variable. It's t-distributed; remember $t_N = x_0 / \sqrt{N^{-1} \sum_{i=1}^{N} x_i^2}$
41.
- In our case $t_N = 10 / (12/5) \approx 4.16$
- TDIST(x,degrees_freedom,tails)
- x is the numeric value at which to evaluate the distribution.
- degrees_freedom is an integer indicating the number of degrees of freedom.
- tails specifies the number of distribution tails to return. If tails = 1, TDIST returns the one-tailed distribution. If tails = 2, TDIST returns the two-tailed distribution.
- TDIST is calculated as TDIST = P(x < X), where X is a random variable that follows the t-distribution.
- TDIST(4.16,25,1) = 0.00016 = 0.016%
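A rough equivalent of the one-tailed TDIST call, as a sketch using only the standard library: the Student's t density is integrated numerically from 0 to the observed value and subtracted from one half. The helper names and the Simpson's-rule step count are illustrative assumptions.

```python
import math

def t_pdf(t, nu):
    """Student's t density with nu degrees of freedom."""
    norm = math.gamma(0.5 * (nu + 1.0)) / (math.sqrt(nu * math.pi) * math.gamma(0.5 * nu))
    return norm * (1.0 + t * t / nu) ** (-0.5 * (nu + 1.0))

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def t_upper_tail(t, nu):
    """One-tailed P(X > t) for t >= 0, using the symmetry of the density."""
    return 0.5 - simpson(lambda s: t_pdf(s, nu), 0.0, t)

n = 25
t_obs = 10.0 / (12.0 / math.sqrt(n))   # deviation of the mean over its estimated error
print(t_obs, t_upper_tail(t_obs, n))
```

The tail probability should land near the 0.00016 quoted above; it will not match exactly because the slide evaluates the function at the truncated value 4.16.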
42. Thus the Null Hypothesis, that the difference from the expected result of 10000 is due to random variation, can be excluded to high probability, but not nearly as high as when the manufacturer told us the repeatability.
43. Question 4: Suppose you performed the test twice, a year apart, and wanted to know: has the repeatability changed?
This Year: $[N^{-1} \sum_i (L_i^{yr1} - \mu_m)^2]^{1/2} = 12$ mm
Last Year: $[M^{-1} \sum_i (L_i^{yr2} - \mu_m)^2]^{1/2} = 14$ mm
(let's say $N = M = 25$ in both cases)
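44.
- Null Hypothesis: any difference between the repeatability (variance) of the test pipe segments between years can be ascribed to random variation
- The ratio of mean-squared errors is F-distributed
- $F_{N,M} = N^{-1} \sum_{i=1}^{N} x_i^2 \,/\, M^{-1} \sum_{i=1}^{M} x_{N+i}^2$
- $= 12/14 = 0.857$ (this is the ratio of the rms values; the ratio of the mean-squared values themselves is $12^2/14^2 \approx 0.735$)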
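45. Note that since $F$ is of the form $F = a/b$, with both $a$ and $b$ fluctuating around a mean value, we really want the cumulative probability that $F < 12/14$ or $F > 14/12$.
[Figure: $p(F_{N,M})$ versus $F_{N,M}$, with the tails below 12/14 and above 14/12 marked on either side of 1.]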
46.
- FDIST(x,degrees_freedom1,degrees_freedom2)
- x is the value at which to evaluate the function.
- degrees_freedom1 is the numerator degrees of freedom.
- degrees_freedom2 is the denominator degrees of freedom.
- FDIST is calculated as FDIST = P(F > x), where F is a random variable that has an F distribution.
- Left hand tail:
- 1-FDIST(0.857,25,25) = 1-0.648 = 0.352 = 35.2%, since P(F < x) = 1 - P(F > x)
- Right hand tail:
- FDIST(1/0.857,25,25) = 0.352 = 35.2%
- Both tails: 70.4%
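The two tail areas can be approximated without Excel by integrating the F density numerically. This is a sketch under stated assumptions: the density formula is the standard one for the F distribution, the Simpson's-rule helper and names are illustrative, and the loop evaluates both the ratio 12/14 used on the slides and the ratio of the squared values, $(12/14)^2$, for comparison.

```python
import math

def f_pdf(x, d1, d2):
    """F density with d1 (numerator) and d2 (denominator) degrees of freedom."""
    log_beta = math.lgamma(0.5 * d1) + math.lgamma(0.5 * d2) - math.lgamma(0.5 * (d1 + d2))
    ratio = d1 / d2
    return (math.exp(-log_beta) * ratio ** (0.5 * d1) * x ** (0.5 * d1 - 1.0)
            * (1.0 + ratio * x) ** (-0.5 * (d1 + d2)))

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def f_cdf(x, d1, d2):
    """P(F < x), by integrating the density from 0 (d1 > 2 assumed)."""
    return simpson(lambda s: f_pdf(s, d1, d2), 0.0, x)

n = m = 25
for ratio in (12.0 / 14.0, (12.0 / 14.0) ** 2):
    left = f_cdf(ratio, n, m)                # P(F < ratio)
    right = 1.0 - f_cdf(1.0 / ratio, n, m)   # P(F > 1/ratio)
    print(ratio, left, right, left + right)
```

With equal degrees of freedom the two tails are equal by symmetry, and in both cases their sum is far above 5%, so the conclusion does not depend on which ratio is used.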
47. Thus the Null Hypothesis, that the year-to-year difference in variance is due to random variation, cannot be excluded. There is no strong reason to believe that the repeatability of the machine has changed between the years.
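48. Question 5: Suppose you performed the test twice, a year apart, and wanted to know if the calibration changed.
This Year: $\mu_{obs}^{yr1} = N^{-1} \sum_i L_i^{yr1} = 9990$ mm, $\sigma_{obs}^{yr1} = [N^{-1} \sum_i (L_i^{yr1} - \mu_m)^2]^{1/2} = 12$ mm
Last Year: $\mu_{obs}^{yr2} = M^{-1} \sum_i L_i^{yr2} = 9993$ mm, $\sigma_{obs}^{yr2} = [M^{-1} \sum_i (L_i^{yr2} - \mu_m)^2]^{1/2} = 14$ mm
(let's say $N = M = 25$ in both cases)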
49.
- The problem that we face is that while
- $t_N^{yr1} = (\mu_{obs}^{yr1} - \mu_m) / (\sigma_{obs}^{yr1} / \sqrt{N})$
- and
- $t_N^{yr2} = (\mu_{obs}^{yr2} - \mu_m) / (\sigma_{obs}^{yr2} / \sqrt{N})$
- are individually t-distributed, their difference,
- $t_N^{yr1} - t_N^{yr2}$
- is not t-distributed. Statisticians have circumvented this problem by cooking up a function of $(\mu_{obs}^{yr1}, \mu_{obs}^{yr2}, \sigma_{obs}^{yr1}, \sigma_{obs}^{yr2})$ that is approximately t-distributed. But it's messy.
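50. In our case: [the worked formula was an image on the original slide and is not recoverable; only the stray value 19 survives]
Note: Excel's function TTEST() allows you to perform the test on columns of data, without typing in the formulas. Very handy!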
51. Thus the Null Hypothesis, that the difference in means is due to random variation, cannot be excluded.
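The slides do not preserve which approximately t-distributed function was used for the two-mean comparison. One common choice is Welch's unequal-variance t test, sketched below as an assumption rather than as the lecture's own formula. It treats $12^2$ and $14^2$ as the two variance estimates and uses the Welch-Satterthwaite effective degrees of freedom; all function and variable names are illustrative.

```python
import math

def t_pdf(t, nu):
    """Student's t density with nu (possibly non-integer) degrees of freedom."""
    norm = math.gamma(0.5 * (nu + 1.0)) / (math.sqrt(nu * math.pi) * math.gamma(0.5 * nu))
    return norm * (1.0 + t * t / nu) ** (-0.5 * (nu + 1.0))

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's statistic and Welch-Satterthwaite effective degrees of freedom."""
    se1, se2 = var1 / n1, var2 / n2   # squared standard errors of the two means
    t = (mean1 - mean2) / math.sqrt(se1 + se2)
    nu = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, nu

# Assumption: this year's and last year's summary values from Question 5.
t_obs, nu = welch_t(9990.0, 12.0 ** 2, 25, 9993.0, 14.0 ** 2, 25)
# Two-sided tail probability from the symmetric t density.
p_two_sided = 2.0 * (0.5 - simpson(lambda s: t_pdf(s, nu), 0.0, abs(t_obs)))
print(t_obs, nu, p_two_sided)
```

Under these assumptions the statistic is less than one standard error from zero, which is consistent with the conclusion that the null hypothesis cannot be excluded.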
52. 5 tests
- $\mu_{obs} = \mu_{prior}$ when $\mu_{prior}$ and $\sigma_{prior}$ are known
- normal distribution
- $\sigma_{obs} = \sigma_{prior}$ when $\mu_{prior}$ and $\sigma_{prior}$ are known
- chi-squared distribution
- $\mu_{obs} = \mu_{prior}$ when $\mu_{prior}$ is known but $\sigma_{prior}$ is unknown
- t distribution
- $\sigma_1^{obs} = \sigma_2^{obs}$ when $\mu_1^{prior}$ and $\mu_2^{prior}$ are known
- F distribution
- $\mu_1^{obs} = \mu_2^{obs}$ when $\sigma_1^{prior}$ and $\sigma_2^{prior}$ are unknown
- modified t distribution
53. Example 1: LaGuardia Airport Mean Daily Temperature. Was the 5-year period 1950-1954 significantly warmer or cooler than the 5-year period 2000-2004?
[Figure: mean daily temperature for 1950-1954 and for 2000-2004.]
Null Hypothesis: any differences between the mean temperatures of these two time periods can be ascribed to random variation
Type of Test: t-test modified to test two means
54. Results
1950-1954 Mean Temperature: 55.8658 ± 0.77
2000-2004 Mean Temperature: 55.8792 ± 0.80
T-test Significance Probability: 49%
The Null Hypothesis, that the difference in means is due to random variation, cannot be rejected.
55. Issue about noise
- Note that we are estimating $\sigma$ by treating the short-term (days-to-months) temperature fluctuations as noise
- Is this correct?
- Certainly such fluctuations are not measurement noise in the normal sense.
- They might be considered model noise, in the sense that they are caused by weather systems that are unmodeled (by us)
- However, such noise probably does not meet all the requirements for use in the statistical test. In particular, it probably has some day-to-day correlation (hot today, hot tomorrow, too) that violates our implicit assumption of uncorrelated noise.
56. Example 2: Does a parabola fit better than a straight line?
First 7 days of data on the Neuse River Hydrograph shown in an early lecture ($N = 7$)
[Figure: discharge (cfs) versus day.]
57. A parabola will always fit better than a straight line, because it has an extra parameter. But does it fit significantly better?
Null Hypothesis: any difference in fit is due to random variation
58. [Figure: linear fit and quadratic fit to the data.]
Approximation: the ratio of prediction errors follows an F-distribution, with the number of degrees of freedom given by the number of data minus the number of parameters in the fit
Linear fit: $(N-2)^{-1} \sum_{i=1}^{N} (d_i^{obs} - d_i^{pre})^2 = 153431$
Quadratic fit: $(N-3)^{-1} \sum_{i=1}^{N} (d_i^{obs} - d_i^{pre})^2 = 6985$
$F = 153431 / 6985 \approx 21.96$
$P(F < 21.96)$ = 1-FDIST(21.96,5,4) = 0.995 = 99.5%
59.
- The Null Hypothesis can be rejected with 99.5% confidence
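The cumulative probability for the fit comparison can be reproduced approximately by integrating the F density with 5 and 4 degrees of freedom. The sketch below uses only the standard library; the density formula is the standard one, while the Simpson's-rule helper, the names, and the step count are illustrative assumptions.

```python
import math

def f_pdf(x, d1, d2):
    """F density with d1 (numerator) and d2 (denominator) degrees of freedom."""
    log_beta = math.lgamma(0.5 * d1) + math.lgamma(0.5 * d2) - math.lgamma(0.5 * (d1 + d2))
    ratio = d1 / d2
    return (math.exp(-log_beta) * ratio ** (0.5 * d1) * x ** (0.5 * d1 - 1.0)
            * (1.0 + ratio * x) ** (-0.5 * (d1 + d2)))

def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def f_cdf(x, d1, d2):
    """P(F < x), by integrating the density from 0 (d1 > 2 assumed)."""
    return simpson(lambda s: f_pdf(s, d1, d2), 0.0, x)

mse_linear = 153431.0     # straight-line fit, N - 2 = 5 degrees of freedom
mse_quadratic = 6985.0    # parabola fit, N - 3 = 4 degrees of freedom
f_obs = mse_linear / mse_quadratic
print(f_obs, f_cdf(f_obs, 5, 4))
```

The printed probability should be close to the 99.5% quoted on the slide.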
60. Another Issue about noise
- Note that we are again basing estimates upon model noise, in the sense that the prediction error is being controlled, at least partly, by the misfit of the curve, as well as by measurement error
- As before, such noise probably does not meet all the requirements for use in the statistical test. So the test needs to be used with some caution.