Transcript and Presenter's Notes

Title: Hypothesis Testing


1
Lecture 10
  • Hypothesis Testing

2
from a previous lecture: Functions of a Random Variable
any function of a random variable is itself a random variable
3
If x has distribution p(x), then y(x) has distribution p(y) = p_x(x(y)) |dx/dy|
4
example
  • Let x have a uniform (white) distribution on the interval [0,1]

[Figure: p(x) = 1 for 0 ≤ x ≤ 1; uniform probability that x is anywhere between 0 and 1]
5
  • Let y = x²
  • then x = y^(½)
  • p_x(y) = 1 and dx/dy = ½ y^(-½)
  • So p(y) = ½ y^(-½) on the interval [0,1]

[Figure: p(y) = ½ y^(-½) on the interval [0,1]]
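As a quick numerical check (an added sketch, not part of the original slides, assuming numpy is available), drawing uniform samples and histogramming y = x² should reproduce p(y) = ½ y^(-½):

    import numpy as np

    # Draw uniform samples on [0, 1] and square them.
    rng = np.random.default_rng(0)
    y = rng.uniform(0.0, 1.0, size=1_000_000)**2

    # Compare a histogram of y with the predicted density p(y) = 0.5 * y**(-0.5).
    counts, edges = np.histogram(y, bins=50, range=(0.0, 1.0), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    predicted = 0.5 * centers**(-0.5)

    # Agreement is good away from the singular bin at y = 0.
    print(np.max(np.abs(counts[1:] - predicted[1:])))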
6
Another example
  • Let x have a normal distribution with zero
    expectation and unit variance. To avoid
    complication, assume x > 0, so that the
    distribution is twice the usual amplitude

p(x) = 2 (2π)^(-½) exp(-½x²)
The distribution of y = x² is
p(y) = p(x(y)) |dx/dy| = (2πy)^(-½) exp(-½y)
note that we used, as before, dx/dy = ½ y^(-½). You can check that ∫_0^∞ p(y) dy = 1 by looking up ∫_0^∞ y^(-½) exp(-ay) dy = √(π/a) in a math book.
7
[Figure: p(x) and p(y); note the singularity at the origin in p(y). Results are not so different from the uniform-distribution example.]
8
from a previous lecture: Functions of Two Random Variables
any function of several random variables is itself a random variable
9
If (x,y) has joint distribution p(x,y), then given u(x,y) and v(x,y), p(u,v) = p[x(u,v), y(u,v)] |∂(x,y)/∂(u,v)|. Note then that p(u) = ∫ p(u,v) dv and p(v) = ∫ p(u,v) du.
|∂(x,y)/∂(u,v)| is the Jacobian determinant.
10
example
  • p(x,y) = 1/(2πσ²) exp(-x²/2σ²) exp(-y²/2σ²)
  •        = 1/(2πσ²) exp(-(x²+y²)/2σ²)
  • an uncorrelated normal distribution of two variables with zero expectation and equal variance σ²
[Figure: p(x,y) for σ = 1]
11
What's the distribution of u = x² + y²?
  • We need to choose a function v(x,y). A reasonable choice is motivated by polar coordinates, v = tan^(-1)(x/y)
  • Then x = u^(½) sin(v) and y = u^(½) cos(v)
  • And the Jacobian determinant is

| ∂x/∂u  ∂y/∂u |   | ½ u^(-½) sin(v)   ½ u^(-½) cos(v) |
| ∂x/∂v  ∂y/∂v | = | u^(½) cos(v)      -u^(½) sin(v)   |

whose absolute value is ½ sin²(v) + ½ cos²(v) = ½

(We usually call u and v r² and θ in polar coordinates.)
12
  • So p(x,y) = 1/(2πσ²) exp(-(x²+y²)/2σ²)
  • transforms to
  • p(u,v) = 1/(4πσ²) exp(-u/2σ²)
  • p(u) = ∫_0^2π 1/(4πσ²) exp(-u/2σ²) dv = 1/(2σ²) exp(-u/2σ²)

Note ∫_0^∞ p(u) du = [-exp(-u/2σ²)]_0^∞ = 1, as expected
[Figure: p(u) vs u]
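A sketch of a numerical check (an added example, assuming numpy): with σ = 1, u = x² + y² should follow p(u) = ½ exp(-u/2), an exponential distribution with mean 2 and variance 4:

    import numpy as np

    # x and y: uncorrelated, zero-mean, unit-variance normal variables (sigma = 1).
    rng = np.random.default_rng(0)
    x = rng.standard_normal(1_000_000)
    y = rng.standard_normal(1_000_000)
    u = x**2 + y**2

    # p(u) = (1/2) exp(-u/2) is exponential with mean 2 and variance 4.
    print(u.mean(), u.var())  # expect approximately 2.0 and 4.0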
13
The point of my showing you this is to give you the sense that computing the probability distributions associated with functions of random variables is not particularly mysterious, but instead is rather routine (though possibly algebraically tedious)
14
Four (and only four) Important Distributions
  • Start with a bunch of random variables, x_i,
  • that are uncorrelated, normally distributed, with zero expectation and unit variance
  • The four important distributions are
  • the distribution of x_i itself, and the
  • distributions of three possible choices of u(x_0, x_1, ...):
  • u = Σ_(i=1)^N x_i²
  • u = x_0 / √(N^(-1) Σ_(i=1)^N x_i²)
  • u = (N^(-1) Σ_(i=1)^N x_i²) / (M^(-1) Σ_(i=1)^M x_(N+i)²)

15
Important Distribution 1
  • Distribution of x_i itself
  • (normal distribution with zero mean and unit variance)
  • p(x_i) = (2π)^(-½) exp(-½x_i²)
  • Suppose that a random variable y has expectation ȳ and variance σ_y².
  • Then note that the variable
  • Z = (y - ȳ)/σ_y
  • is normally distributed with zero mean and unit variance.
  • We show this by noting p(Z) = p(y(Z)) |dy/dZ| with dy/dZ = σ_y, so that p(y) = (2π)^(-½) σ_y^(-1) exp(-½(y - ȳ)²/σ_y²) transforms to p(Z) = (2π)^(-½) exp(-½Z²)
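A small numerical illustration of this standardization trick (an added sketch, with illustrative values ȳ = 7 and σ_y = 3, assuming numpy):

    import numpy as np

    # Samples of y with expectation 7 and standard deviation 3 (illustrative values).
    rng = np.random.default_rng(0)
    y = rng.normal(loc=7.0, scale=3.0, size=100_000)

    # Z = (y - ybar) / sigma_y has zero mean and unit variance.
    Z = (y - 7.0) / 3.0
    print(Z.mean(), Z.std())  # expect approximately 0.0 and 1.0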

16
[Figure: the normal distribution p(x)]
17
properties of the normal distribution (with zero expectation and unit variance)
  • p(x_i) = (2π)^(-½) exp(-½x_i²)

Mean: 0; Mode: 0; Variance: 1
18
Important Distribution 2
  • Distribution of u = Σ_(i=1)^N x_i²
  • the sum of squares of N normally-distributed random variables with zero expectation and unit variance
  • This is called the chi-squared distribution with N degrees of freedom, and u is given the special symbol u = χ²_N
  • We have already computed the N=1 and N=2 cases!

19
[Figure: p(χ²_N) vs χ²_N for several values of N, including the cases we worked out]
20
properties of the chi-squared distribution
  • p(χ²_N) = [χ²_N]^(½N-1) exp(-½χ²_N) / [2^(½N) (½N-1)!]

Mean: N; Mode: 0 if N < 2, N-2 otherwise; Variance: 2N
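As a sketch (an added check, assuming scipy is available), the pdf above can be compared with scipy.stats.chi2, which implements the same distribution:

    import numpy as np
    from scipy.special import gamma
    from scipy.stats import chi2

    N = 5  # degrees of freedom (an illustrative choice)
    u = np.linspace(0.1, 20.0, 200)

    # p(chi2_N) = u**(N/2 - 1) * exp(-u/2) / (2**(N/2) * Gamma(N/2)),
    # writing the factorial (N/2 - 1)! as the gamma function Gamma(N/2).
    p = u**(N/2 - 1) * np.exp(-u/2) / (2**(N/2) * gamma(N/2))

    print(np.allclose(p, chi2.pdf(u, df=N)))  # expect True
    print(chi2.mean(df=N), chi2.var(df=N))    # expect N and 2N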
21
Important Distribution 3
  • Distribution of u = x_0 / √(N^(-1) Σ_(i=1)^N x_i²)
  • the ratio of
  • a normally-distributed random variable with zero expectation and unit variance
  • and
  • the square root of the sum of squares of N normally-distributed random variables with zero expectation and unit variance, divided by N
  • This is called Student's t distribution with N degrees of freedom, and u is given the special symbol u = t_N

22
N
?
p(tN)
Note N1 case is very long-tailed
Looks pretty much like a Gaussian in fact, is a
Gaussian in the limiting case N??
tN
23
properties of Student's t_N distribution
  • p(t_N) = [(½N - ½)! / (√(Nπ) (½N - 1)!)] [1 + N^(-1) t_N²]^(-½(N+1))

Mean: 0; Mode: 0; Variance: ∞ if N < 3, N/(N-2) otherwise
24
Important Distribution 4
  • Distribution of u = (N^(-1) Σ_(i=1)^N x_i²) / (M^(-1) Σ_(i=1)^M x_(N+i)²)
  • the ratio of
  • the sum of squares of N normally-distributed random variables with zero expectation and unit variance, divided by N
  • and
  • the sum of squares of M normally-distributed random variables with zero expectation and unit variance, divided by M
  • This is called the F distribution with N and M degrees of freedom, and u is given the special symbol u = F_N,M

25
N
M
p(FN,M)
FN,M
26
properties of the F_N,M distribution
  • p(F_N,M): too complicated for me to type in

Mean: M/(M-2) if M > 2; Mode: [(N-2)/N][M/(M+2)] if N > 2; Variance: 2M²(N+M-2) / [N(M-2)²(M-4)] if M > 4
27
  • Hypothesis Testing

28
  • The Null Hypothesis
  • always a variant of this theme
    the result of an experiment differs from the expected value only because of random variation

29
  • Test of Significance of Results
  • say, to 95% significance
  • The Null Hypothesis would generate the observed result less than 5% of the time

30
  • Example: You buy an automated pipe-cutting machine that cuts long pipes into many segments of equal length
  • Specifications:
  • calibration (mean, mm): exact
  • repeatability (variance, σ_m²): 100 mm²
  • Now you test the machine by having it cut 25 10000-mm-long pipe segments. You then measure and tabulate the length of each pipe segment, L_i.

31
  • Question 1: Is the machine's calibration correct?
  • Null Hypothesis: any difference between the mean length of the test pipe segments and the specified 10000 mm can be ascribed to random variation
  • you estimate the mean of the 25 samples: m_obs = 9990 mm
  • The mean length deviates (μ_m - m_obs) = 10 mm from the setting of 10000 mm. Is this significant?
  • Note from a prior lecture, the variance of the mean is
  • σ_mean² = σ_data²/N.

32
  • So the quantity
  • Z = (μ_m - m_obs) / (σ_m/√N), where m_obs = N^(-1) Σ_i L_i,
  • is normally distributed with zero expectation and unit variance.
  • In our case Z = 10 / (10/5) = 5
  • Z = 5 means that the observed mean is 5 standard deviations away from its expected value.

Scaling a quantity so it has zero mean and unit
variance is an important trick
33
  • The amount of area under the normal distribution that is 5 or more standard deviations away from the mean is very small. We can calculate it using the Excel function
  • NORMDIST(x,mean,standard_dev,cumulative)
  • x is the value for which you want the distribution.
  • mean is the arithmetic mean of the distribution.
  • standard_dev is the standard deviation of the distribution.
  • cumulative is a logical value that determines the form of the function. If cumulative is TRUE, NORMDIST returns the cumulative distribution function; if FALSE, it returns the probability density function.
  • 2*NORMDIST(-5,0,1,TRUE) = 5.7421E-07 = 0.00006%

Factor of two to account for both tails
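The same number can be computed outside Excel; here is a minimal sketch (an added example, not from the slides) using scipy.stats.norm:

    from scipy.stats import norm

    # Z = (mu_m - m_obs) / (sigma_m / sqrt(N)) = 10 / (10 / 5) = 5
    Z = 10.0 / (10.0 / 5.0)

    # Two-tailed probability of a deviation at least this large,
    # the equivalent of 2*NORMDIST(-5, 0, 1, TRUE) in Excel.
    print(2.0 * norm.cdf(-Z))  # about 5.7e-07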
34
Thus the Null Hypothesis, that the machine is well-calibrated, can be excluded to very high probability
35
  • Question 2: Is the machine's repeatability within specs?
  • Null Hypothesis: any difference between the repeatability (variance) of the test pipe segments and the specified σ_m² = 100 mm² can be ascribed to random variation
  • The quantity x_i = (L_i - μ_m) / σ_m is normally distributed with mean 0 and variance 1, so
  • the quantity χ²_N = Σ_i (L_i - μ_m)² / σ_m² is chi-squared distributed with 25 degrees of freedom.

36
  • Suppose that the root-mean-squared variation of pipe lengths was [N^(-1) Σ_i (L_i - μ_m)²]^(½) = 12 mm.
  • Then χ²_25 = Σ_i (L_i - μ_m)² / σ_m² = 25 × 144/100 = 36
  • CHIDIST(x,degrees_freedom)
  • x is the value at which you want to evaluate the distribution.
  • degrees_freedom is the number of degrees of freedom.
  • CHIDIST = P(X > x), where X is a χ² random variable.
  • The probability that χ²_25 ≥ 36 is CHIDIST(36,25) = 0.07, or 7% (see the sketch below).
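A sketch of the same computation with scipy.stats.chi2 (an added example); CHIDIST's right-tail probability corresponds to the survival function:

    from scipy.stats import chi2

    # chi-squared statistic: sum_i (L_i - mu_m)**2 / sigma_m**2 = 25 * 144 / 100 = 36
    chi2_stat = 25 * 144.0 / 100.0

    # Right-tail probability P(chi2_25 >= 36), the equivalent of CHIDIST(36, 25) in Excel.
    print(chi2.sf(chi2_stat, df=25))  # about 0.07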

37
Thus the Null Hypothesis, that the difference from the expected result of 10 mm is random variation, cannot be excluded (not to greater than 95% probability)
38
Question 3: But suppose the manufacturer had not stated a repeatability spec, just a calibration spec. Then you can't test the calibration using the quantity Z = (μ_m - m_obs) / (σ_m/√N), because σ_m is not known.
39
Since the manufacturer has not supplied a variance, we must estimate it from the data, σ_obs² = N^(-1) Σ_i (L_i - μ_m)² = 144 mm², and use it in the formula (μ_m - m_obs) / (σ_obs/√N)
40
But the quantity (μ_m - m_obs) / (σ_obs/√N) is not normally distributed, because σ_obs is itself a random variable; it's t-distributed. Remember t_N = x_0 / √(N^(-1) Σ_(i=1)^N x_i²)
41
  • In our case t_N = 10 / (12/5) = 4.16
  • TDIST(x,degrees_freedom,tails)
  • x is the numeric value at which to evaluate the distribution.
  • degrees_freedom is an integer indicating the number of degrees of freedom.
  • tails specifies the number of distribution tails to return. If tails = 1, TDIST returns the one-tailed distribution. If tails = 2, TDIST returns the two-tailed distribution.
  • TDIST is calculated as TDIST = P(X > x), where X is a random variable that follows the t-distribution.
  • TDIST(4.16,25,1) = 0.00016 = 0.016% (see the sketch below)
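A sketch of the same one-tailed probability with scipy.stats.t (an added example, using the 25 degrees of freedom quoted on the slide):

    from scipy.stats import t

    # t statistic: 10 / (12 / 5) = 4.17 (rounded to 4.16 on the slide)
    t_stat = 10.0 / (12.0 / 5.0)

    # One-tailed probability P(t_25 > t_stat), the equivalent of TDIST(4.16, 25, 1) in Excel.
    print(t.sf(t_stat, df=25))  # about 0.00016, i.e. about 0.016%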

42
Thus the Null Hypothesis, that the difference from the expected result of 10000 mm is due to random variation, can be excluded to high probability, but not nearly as high as when the manufacturer told us the repeatability
43
Question 4: Suppose you performed the test twice, a year apart, and wanted to know: has the repeatability changed?
This year: [N^(-1) Σ_i (L_i^yr1 - μ_m)²]^(½) = 12 mm
Last year: [M^(-1) Σ_i (L_i^yr2 - μ_m)²]^(½) = 14 mm
(let's say N = M = 25 in both cases)
44
  • Null Hypothesis: any difference in the repeatability (variance) of the test pipe segments between years can be ascribed to random variation
  • The ratio of mean-squared errors is F-distributed:
  • F_N,M = (N^(-1) Σ_(i=1)^N x_i²) / (M^(-1) Σ_(i=1)^M x_(N+i)²)
  •       = 12/14 = 0.857

45
Note that since F is of the form F = a/b, with both a and b fluctuating around a mean value, we really want the cumulative probability that F < 12/14 or F > 14/12
[Figure: p(F_N,M) vs F_N,M, with the tails F < 12/14 and F > 14/12 marked on either side of the central value 1]
46
  • FDIST(x,degrees_freedom1,degrees_freedom2)
  • x is the value at which to evaluate the function.
  • degrees_freedom1 is the numerator degrees of freedom.
  • degrees_freedom2 is the denominator degrees of freedom.
  • FDIST is calculated as FDIST = P(F > x), where F is a random variable that has an F distribution.
  • Left-hand tail: 1 - FDIST(0.857,25,25) = 1 - 0.648 = 0.352 = 35.2%
  • Right-hand tail: FDIST(1/0.857,25,25) = 0.352 = 35.2%
  • Both tails: 70.4%

Since P(F > x) = 1 - P(F < x)
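A sketch of the same two-tailed computation with scipy.stats.f (an added example). It uses the slide's statistic of 12/14; a comment notes the ratio of the estimated variances, (12/14)², which is the more conventional form of the F statistic:

    from scipy.stats import f

    # The slide's statistic: ratio of the two rms repeatabilities, 12/14.
    # (The ratio of the estimated variances, (12/14)**2, is the more usual F statistic.)
    F = 12.0 / 14.0

    # Left-hand tail P(F < 12/14) and right-hand tail P(F > 14/12);
    # with equal degrees of freedom (25, 25) the two tails are equal.
    left = f.cdf(F, dfn=25, dfd=25)
    right = f.sf(1.0 / F, dfn=25, dfd=25)
    print(left, right, left + right)  # each tail about 0.35, both tails about 0.70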
47
Thus the Null Hypothesis, that the year-to-year difference in variance is due to random variation, cannot be excluded; there is no strong reason to believe that the repeatability of the machine has changed between the years
48
Question 5: Suppose you performed the test twice, a year apart, and wanted to know if the calibration changed.
This year: m_obs^yr1 = N^(-1) Σ_i L_i^yr1 = 9990 mm, σ_obs^yr1 = [N^(-1) Σ_i (L_i^yr1 - m_obs^yr1)²]^(½) = 12 mm
Last year: m_obs^yr2 = M^(-1) Σ_i L_i^yr2 = 9993 mm, σ_obs^yr2 = [M^(-1) Σ_i (L_i^yr2 - m_obs^yr2)²]^(½) = 14 mm
(let's say N = M = 25 in both cases)
49
  • The problem that we face is that while
  • t_N^yr1 = (m_obs^yr1 - μ_m) / (σ_obs^yr1/√N)
  • and
  • t_N^yr2 = (m_obs^yr2 - μ_m) / (σ_obs^yr2/√N)
  • are individually t-distributed, their difference,
  • t_N^yr1 - t_N^yr2,
  • is not t-distributed. Statisticians have circumvented this problem by cooking up a function of (m_obs^yr1, m_obs^yr2, σ_obs^yr1, σ_obs^yr2) that is approximately t-distributed. But it's messy.

50
In our case the significance probability works out to about 19%.
Note: Excel's function TTEST() allows you to perform the test on columns of data, without typing in the formulas; very handy!
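Excel's TTEST() works on the raw columns of data; when only summary statistics are available, scipy offers ttest_ind_from_stats. A sketch with the values quoted above (an added example; the choice of Welch's unequal-variance form is an assumption on my part):

    from scipy.stats import ttest_ind_from_stats

    # Summary statistics for the two test runs (N = M = 25 segments each).
    result = ttest_ind_from_stats(mean1=9990.0, std1=12.0, nobs1=25,
                                  mean2=9993.0, std2=14.0, nobs2=25,
                                  equal_var=False)  # Welch's approximate two-sample t-test
    print(result.statistic, result.pvalue)  # two-tailed significance probability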
51
Thus the Null Hypothesis, that the difference in means is due to random variation, cannot be excluded
52
5 tests
  • m_obs = m_prior, when m_prior and σ_prior are known
  • normal distribution
  • σ_obs = σ_prior, when m_prior and σ_prior are known
  • chi-squared distribution
  • m_obs = m_prior, when m_prior is known but σ_prior is unknown
  • t distribution
  • σ1_obs = σ2_obs, when m1_prior and m2_prior are known
  • F distribution
  • m1_obs = m2_obs, when σ1_prior and σ2_prior are unknown
  • modified t distribution

53
Example 1: LaGuardia Airport Mean Daily Temperature. Was the 5-year period 1950-1954 significantly warmer or cooler than the 5-year period 2000-2004?
[Figure: mean daily temperature for 1950-1954 and for 2000-2004]
Null Hypothesis: any differences between the mean temperatures of these two time periods can be ascribed to random variation
Type of Test: t-test modified to test two means
54
Results:
1950-1954 mean temperature: 55.8658 ± 0.77
2000-2004 mean temperature: 55.8792 ± 0.80
t-test significance probability: 49%
The Null Hypothesis, that the difference in means is due to random variation, cannot be rejected.
55
Issue about noise
  • Note that we are estimating σ by treating the short-term (days-to-months) temperature fluctuations as noise
  • Is this correct?
  • Certainly such fluctuations are not measurement noise in the normal sense.
  • They might be considered model noise, in the sense that they are caused by weather systems that are unmodeled (by us)
  • However, such noise probably does not meet all the requirements for use in the statistical test. In particular, it probably has some day-to-day correlation (hot today, hot tomorrow, too) that violates our implicit assumption of uncorrelated noise.

56
Example 2: Does a parabola fit better than a straight line?
First 7 days of data on the Neuse River hydrograph shown in an earlier lecture
[Figure: discharge (cfs) vs. day, N = 7]
57
A parabola will always fit better than a straight line, because it has an extra parameter. But does it fit significantly better? Null Hypothesis: any difference in fit is due to random variation
58
Approximation: the ratio of prediction errors follows an F-distribution, with the numbers of degrees of freedom given by the number of data minus the number of parameters in each fit
Linear fit: (N-2)^(-1) Σ_(i=1)^N (d_i^obs - d_i^pre)² = 153431
Quadratic fit: (N-3)^(-1) Σ_(i=1)^N (d_i^obs - d_i^pre)² = 6985
F = 153431 / 6985 = 21.96
P(F < 21.96) = 1 - FDIST(21.96,5,4) = 0.995 = 99.5%
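A sketch of the same probability with scipy.stats.f (an added example), using the mean-squared prediction errors quoted above:

    from scipy.stats import f

    # Mean-squared prediction errors quoted on the slide.
    mse_linear = 153431.0   # straight line: N - 2 = 5 degrees of freedom
    mse_quadratic = 6985.0  # parabola:      N - 3 = 4 degrees of freedom

    # F statistic and P(F < F_obs), i.e. 1 - FDIST(F_obs, 5, 4) in Excel.
    F_obs = mse_linear / mse_quadratic
    print(F_obs, f.cdf(F_obs, dfn=5, dfd=4))  # about 21.96 and about 0.995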
59
  • The Null Hypothesis can be rejected with 99.5% confidence

60
Another Issue about noise
  • Note that we are again basing estimates upon
  • model noise
  • in the sense that the prediction error is being controlled, at least partly, by the misfit of the curve, as well as by measurement error
  • As before, such noise probably does not meet all the requirements for use in the statistical test. So the test needs to be used with some caution.