Title: MARE 250
1. Hypothesis Testing III
MARE 250 Dr. Jason Turner
2. To ASSUME is to make an
Four assumptions for hypothesis tests of means:
1. Random samples
2. Independent samples
3. Normal populations (or large samples)
4. Equal variances (standard deviations)
3. Significance Level
The probability of making a TYPE I error (rejecting a true null hypothesis) is called the significance level (α) of a hypothesis test.
TYPE II error probability (β): the probability of not rejecting a false null hypothesis.
For a fixed sample size, the smaller we specify the significance level (α), the larger the probability (β) of not rejecting a false null hypothesis.
4. Significance Level
                        If H0 is true    If H0 is false
If H0 is rejected       TYPE I ERROR     No Error
If H0 is not rejected   No Error         TYPE II ERROR
5. I have the POWER!!!
The power of a hypothesis test is the probability of not making a TYPE II error, that is, the probability of rejecting a false null hypothesis and so finding evidence to support the alternative hypothesis.
POWER = 1 - β
Plotting power across a range of differences produces a power curve.
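The two error rates and power can be checked by simulation. The sketch below is only an illustration under stated assumptions: it uses a two-sided z-test with a known, shared standard deviation (not the t-test the course software would run), equal sample sizes, and made-up population values. The function names `rejects_h0` and `rejection_rate` are hypothetical.

```python
import random
from statistics import NormalDist, mean


def rejects_h0(sample_a, sample_b, sigma, alpha=0.05):
    """Two-sided z-test for equal means, assuming a known shared sigma
    and equal sample sizes (an assumption made for this sketch)."""
    n = len(sample_a)
    standard_error = sigma * (2 / n) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def rejection_rate(true_difference, n=20, sigma=1.0, trials=2000, seed=1):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, sigma) for _ in range(n)]
        b = [rng.gauss(true_difference, sigma) for _ in range(n)]
        rejections += rejects_h0(a, b, sigma)
    return rejections / trials


# H0 true (no real difference): the rejection rate estimates alpha.
type_i_rate = rejection_rate(true_difference=0.0)
# H0 false (real difference of 1 sigma): the rejection rate estimates power.
power = rejection_rate(true_difference=1.0)

print(f"Estimated TYPE I error rate: {type_i_rate:.3f}")
print(f"Estimated power: {power:.3f}  (estimated beta: {1 - power:.3f})")
```

When H0 is true, every rejection is a TYPE I error; when H0 is false, every non-rejection is a TYPE II error, so power is simply one minus that fraction.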
6. We need more POWER!!!
For a fixed significance level, increasing the sample size increases the power.
Therefore, you can run a test to determine if your sample size HAS THE POWER!!!
By using a sufficiently large sample size, we can obtain a hypothesis test with as much power as we want.
7. Increasing the power of the test
There are four factors that can increase the power of a means test:
1. Larger effect size (difference) - The greater the real difference between the two populations, the more likely it is that the sample means will also be different.
2. Higher α-level (the level of significance) - If you choose a higher value for α, you increase the probability of rejecting the null hypothesis, and thus the power of the test. (However, you also increase your chance of a type I error.)
8. Increasing the power of the test
There are four factors that can increase the power of a means test (continued):
3. Less variability - When the standard deviation is smaller, smaller differences can be detected.
4. Larger sample sizes - The more observations there are in your samples, the more confident you can be that the sample means represent µ for the two populations. Thus, the test will be more sensitive to smaller differences.
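The four factors can be seen by changing one input at a time in a power calculation. This is a minimal sketch, assuming a normal approximation to a two-sided, two-sample test with equal group sizes; it will slightly overstate the power a real t-test would have. The function name `power_two_sample` and all input values are hypothetical.

```python
from statistics import NormalDist


def power_two_sample(difference, sigma, n, alpha=0.05):
    """Approximate power of a two-sided, two-sample test of means
    (normal approximation, n observations per sample)."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # How many standard errors separate the two population means.
    shift = abs(difference) / (sigma * (2 / n) ** 0.5)
    return std_normal.cdf(shift - critical) + std_normal.cdf(-shift - critical)


baseline = power_two_sample(difference=1.0, sigma=2.0, n=20)

# Change one factor at a time, keeping the others at baseline.
larger_effect = power_two_sample(difference=2.0, sigma=2.0, n=20)
higher_alpha = power_two_sample(difference=1.0, sigma=2.0, n=20, alpha=0.10)
less_variability = power_two_sample(difference=1.0, sigma=1.0, n=20)
larger_sample = power_two_sample(difference=1.0, sigma=2.0, n=80)

for label, value in [
    ("baseline", baseline),
    ("1. larger effect size", larger_effect),
    ("2. higher alpha", higher_alpha),
    ("3. less variability", less_variability),
    ("4. larger sample size", larger_sample),
]:
    print(f"{label:24s} power = {value:.3f}")
```

Each of the four changes gives a power above the baseline value, which is the pattern the slides describe.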
9. Calculating Power
Power - the probability of being able to detect an effect of a given size
Sample size - the number of observations in each sample
Difference (effect) - the difference between µ for one population and µ for the other
10. Calculating Power
For a t-test, provide the difference (between means) and the standard deviation (the larger of the two).
If you enter sample size, you get power.
If you enter power, you get sample size.
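Both directions of this calculation can be sketched in a few lines. The code below is an approximation only: it assumes a two-sided, two-sample test with equal group sizes and uses the normal distribution in place of the t distribution, so statistical software may report a sample size that is slightly larger. The function names and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_from_n(difference, sigma, n, alpha=0.05):
    """Enter sample size (per group), get approximate power."""
    critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(difference) / (sigma * (2 / n) ** 0.5)
    return STD_NORMAL.cdf(shift - critical) + STD_NORMAL.cdf(-shift - critical)


def n_from_power(difference, sigma, power=0.80, alpha=0.05):
    """Enter target power, get approximate sample size per group."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sigma / abs(difference)) ** 2)


# Hypothetical inputs: difference between means of 1, larger SD of 2.
n_needed = n_from_power(difference=1.0, sigma=2.0, power=0.80)
achieved = power_from_n(difference=1.0, sigma=2.0, n=n_needed)

print(f"Sample size per group for 80% power: {n_needed}")
print(f"Approximate power at that sample size: {achieved:.3f}")
```

Using the larger of the two standard deviations, as the slide suggests, makes the resulting sample size a conservative choice.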
11. Calculating Power
Similar for ANOVA; enter the number of levels as well.
If you enter sample size, you get power.
If you enter power, you get sample size.
Calculates sample size per level.
12. Calculating Power
For an ANOVA, provide the number of levels, the difference (between means) and the standard deviation (the largest).
If you enter sample size, you get power.
If you enter power, you get sample size.
Calculates sample size per level.
13. Increasing the power of the test
The most practical way to increase power is often to increase the sample size.
However, you can also try to decrease the standard deviation by making improvements in your process or measurement.
14. Sample size
- Increasing the size of your samples increases the power of your test
- However, in the real world this is also a function of:
  - Time
  - Money
  - Logistics
  - Reality
15. Data Transformations
One advantage of using parametric statistics is that it makes it much easier to describe your data.
If you have established that the data follow a normal distribution, you can be sure that a particular set of measurements can be properly described by its mean and standard deviation.
If your data are not normally distributed, you cannot use any of the tests that assume they are (e.g. ANOVA, t-test, regression analysis).
16. Data Transformations
If your data are not normally distributed, it is often possible to normalize them by transforming them.
Transforming data to allow you to use parametric statistics is completely legitimate.
17. Data Transformations
People often feel uncomfortable when they transform data because it seems to artificially improve their results, but this is only because they feel comfortable with linear or arithmetic scales.
However, there is no reason not to use other scales (e.g. logarithms, square roots, reciprocals or angles) where appropriate (see Chapter 13).
18. Data Transformations
Different transformations work for different data types.
Logarithms - Growth rates are often exponential, and log transforms will often normalize them. Log transforms are particularly appropriate if the variance increases with the mean.
Reciprocal - If a log transform does not normalize your data, you could try a reciprocal (1/x) transformation. This is often used for enzyme reaction rate data.
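A small example shows why the log transform suits data whose spread grows with the mean. The two groups below are made-up values, not course data; the second group is simply ten times the first, so its standard deviation is ten times larger on the raw scale.

```python
from math import log10
from statistics import mean, stdev

# Hypothetical measurements: the second group is 10x the first.
small_group = [1.0, 2.0, 4.0]
large_group = [10.0, 20.0, 40.0]

raw_sd_small = stdev(small_group)
raw_sd_large = stdev(large_group)

# Log-transform every value, then recompute the spread.
log_sd_small = stdev([log10(x) for x in small_group])
log_sd_large = stdev([log10(x) for x in large_group])

print(f"Raw means: {mean(small_group):.2f} vs {mean(large_group):.2f}")
print(f"Raw SDs:   {raw_sd_small:.3f} vs {raw_sd_large:.3f}")
print(f"Log SDs:   {log_sd_small:.3f} vs {log_sd_large:.3f}")
```

On the log scale a tenfold change becomes a constant shift, so the two groups end up with the same standard deviation, which also helps with the equal-variance assumption.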
19. Data Transformations
Square root - This transform is often of value when the data are counts, e.g. urchins, Honu. A square root transform will bring data with a Poisson distribution closer to a normal distribution.
Arcsine - This transformation is also known as the angular transformation and is especially useful for percentages and proportions.
20. Which Transformation?
The Johnson Transformation is useful when the collected data are non-normal but you want to apply a methodology that requires a normal distribution.
It is a MINITAB program, not a TEST!
21. Which Transformation?
- The Johnson Transformation should be used as a first step before you transform data by hand
- Why?
  - It's quick and easy (point and click)
  - It runs a variety of very complex data transformation functions
- However, it only runs LOG- and ARCSINE-based equations
22. How To?
STAT > Quality Tools > Johnson Transformation
Enter which variable is to be transformed and what the new transformed variable will be called.
The program places the transformed data into a new column in your MINITAB datasheet; you can copy this into Excel and save it FOREVER.
23. Johnson Transformation
- How do I know if it worked?
- If the Johnson transformation program is successful it will:
  - Transform the data and provide info on what transformation it ran (formula)
  - Run a normality test to verify
  - Provide you with the transformed data (if you ask for it)
- Output has 3 graphs
24. Johnson Transformation
[Figure: MINITAB output with 3 graphs and the transformation equation; in this example it did a LOG transformation]
25. Johnson Transformation
- How do I know if it worked?
- If the Johnson transformation program is NOT successful it will:
  - Tell you it failed to find a data transformation that passed the normality test
- Output has 2 graphs
26. Johnson Transformation
[Figure: MINITAB output with 2 graphs; the program did not transform the data]
27. Then What?
- If the Johnson Transformation program was unsuccessful at transforming your data to meet parametric assumptions, then run data transformations by hand
- There are several; I am teaching you 4:
  - Log
  - Reciprocal
  - Square Root
  - Arcsine
28. Then What?
Calculate these in your working Excel file:
1) Make new column headers
29. Then What?
Calculate these in your working Excel file:
2) Insert Function in cell below header
30. Then What?
Calculate these in your working Excel file:
3) Enter cell number for first datapoint
31. Then What?
Calculate these in your working Excel file:
4) Copy cell and paste/fill down
32. Then What?
Calculate these in your working Excel file:
5) Wash, Rinse, Repeat for other 3 variables
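For anyone who prefers a script to a spreadsheet, the same five steps can be sketched in Python. This is an illustration under stated assumptions, not the course procedure: the data values are made up, the log is taken to base 10, and the arcsine transform is written as the arcsine of the square root of a proportion between 0 and 1, which is the usual form of the angular transformation. The comments give the matching Excel formulas for a value in cell A2.

```python
from math import asin, log10, sqrt

# Hypothetical raw data: one measurement column and one proportion column.
measurements = [1.0, 4.0, 25.0, 100.0]   # must be > 0 for log and reciprocal
proportions = [0.0, 0.25, 0.5, 1.0]      # must lie between 0 and 1

# Steps 1-4: one new "column" per transformation, filled down every row.
log_column = [log10(x) for x in measurements]           # Excel: =LOG10(A2)
reciprocal_column = [1 / x for x in measurements]       # Excel: =1/A2
square_root_column = [sqrt(x) for x in measurements]    # Excel: =SQRT(A2)
arcsine_column = [asin(sqrt(p)) for p in proportions]   # Excel: =ASIN(SQRT(A2))

# Step 5: repeat for the other variables, then print the new columns.
header = f"{'raw':>8} {'log10':>8} {'1/x':>8} {'sqrt':>8}"
print(header)
for raw, lg, rc, sq in zip(measurements, log_column,
                           reciprocal_column, square_root_column):
    print(f"{raw:8.2f} {lg:8.3f} {rc:8.3f} {sq:8.3f}")

print(f"{'prop':>8} {'arcsine':>8}")
for p, a in zip(proportions, arcsine_column):
    print(f"{p:8.2f} {a:8.3f}")
```

The arcsine column is in radians; percentages need to be divided by 100 before this transform is applied.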