Title: Hypothesis Testing
1Hypothesis Testing
2What is Hypothesis Testing?
- Sample information can be used to obtain point estimates or confidence intervals about population parameters
- Alternatively, sample information can be used to test the validity of conjectures about these parameters
  - Are private banks more profitable than state-owned banks in the EU countries?
  - Are returns on a stock different before and after a stock split?
  - Is there a larger variability in real estate prices in Champaign than in Urbana?
3What is Hypothesis Testing?
- A hypothesis is a statement about a population parameter from one or more populations
- Statistically testable hypotheses are formulated based on theories that are used to make predictions
- A hypothesis test is a procedure that
  - States the hypothesis to be tested
  - Uses sample information and formulates a decision rule
  - Rejects or does not reject the hypothesis, based on the outcome of the decision rule
4Steps in Hypothesis Testing
- A hypothesis test proceeds through the following steps:
  - State the hypothesis
  - Identify the appropriate test statistic and its probability distribution
  - Specify the significance level
  - State the decision rule
  - Collect the data and calculate the test statistic
  - Make the statistical decision
  - Evaluate whether the statistical decision implies a corresponding financial decision
5Stating the Testable Hypotheses
- A hypothesis test always includes two hypotheses
- Null Hypothesis (H0): The null hypothesis is the hypothesis to be tested
  - E.g., the average debt-equity ratio for US industrial firms is 20%
- Alternative Hypothesis (H1): The alternative hypothesis is the one accepted if the null hypothesis is rejected
  - E.g., the average debt-equity ratio for US industrial firms is different from 20%
6Stating the Testable Hypotheses
- Note: The null hypothesis is a statement that is considered true unless the sample used in the hypothesis test provides evidence that it is false
- Hypothesis tests for a population parameter θ in relation to a possible value θ0 can be formulated as follows:
  - H0: θ = θ0 vs. H1: θ ≠ θ0
  - H0: θ ≤ θ0 vs. H1: θ > θ0
  - H0: θ ≥ θ0 vs. H1: θ < θ0
7Stating the Testable Hypotheses
- The first formulation is a two-sided test while the other two are one-sided tests
- In each formulation the null and the alternative account for all possible values of the population parameter
- Regardless of the formulation, the test is always conducted at the point of equality, θ = θ0
8Stating the Testable Hypotheses
- How do we state the null and alternative hypotheses?
- Example: Suppose that theory tells us that growth funds outperform value funds
  - H0: Growth funds perform worse than or equal to value funds
  - H1: Growth funds perform better than value funds
- We formulate the alternative hypothesis as the statement that the condition is true and test the validity of the null that the statement is false
9Identifying the Test Statistic and its
Probability Distribution
- The decision rule for the hypothesis test is based on a test statistic
- The test statistic is a quantity calculated from sample information that typically has the following form:
  - (Sample Statistic − Value of Parameter under H0) / Standard Error of Sample Statistic
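The general form of the test statistic can be written as a one-line function. This is only a minimal sketch: the function name `compute_test_statistic` and the example inputs are hypothetical and do not come from the slides.

```python
def compute_test_statistic(sample_statistic, value_under_h0, standard_error):
    """Return (sample statistic - value of parameter under H0) / standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return (sample_statistic - value_under_h0) / standard_error


# Hypothetical inputs, not from the slides: sample mean 12, H0 value 10,
# standard error of the sample mean 1.25.
example_statistic = compute_test_statistic(12.0, 10.0, 1.25)
print(example_statistic)  # 1.6
```

The sign of the result shows on which side of the hypothesized value the sample statistic falls, and its size measures the distance in standard errors.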
10Identifying the Test Statistic and its
Probability Distribution
- Example: Suppose that we want to test the null hypothesis that the mean return on the S&P 500 index during the past five years is less than or equal to 10% vs. the alternative that it is greater
- Drawing a sample and calculating the sample mean, we know:
  - If the population distribution is normal with known variance, the sample mean follows a normal distribution and we use the standardized variable Z as our test statistic
11Identifying the Test Statistic and its
Probability Distribution
- If, in the above case, the population variance is unknown but the sample is large, we again use Z as our test statistic
- If the population variance is unknown and the sample size is small, we use the variable t as our test statistic
- If, for example, the variance of S&P 500 returns is unknown, we will use the variable t, known as the t-statistic
12Specifying the Significance Level
- To decide whether or not to reject the null hypothesis, the test statistic is compared to a pre-specified value
- The selected value is based on a pre-determined level of significance
- Note that the null hypothesis can be either true or false
- However, there are four possible outcomes when a hypothesis is tested
13Specifying the Significance Level
- A false null hypothesis is rejected, which is a correct decision
- A true null hypothesis is rejected (this is called a Type I error)
- A false null hypothesis is not rejected (this is called a Type II error)
- A true null hypothesis is not rejected, which is again a correct decision
14Specifying the Significance Level
- The probability of a Type I error in a hypothesis test is called the level of significance of the test
- When conducting a hypothesis test, we want the chance of a Type I error to be as low as possible
- E.g., a level of significance of 5% implies a 5% chance of a Type I error
- Note: For a given sample size, as we decrease the chance of a Type I error, we increase the chance of a Type II error
15Specifying the Significance Level
- Lowering the chance of a Type I error implies that the null will be rejected less often, including when it is false (Type II error)
- To lower the probabilities of both errors we need to increase the sample size
- The power of a test is the probability of correctly rejecting a false null hypothesis (the power of a test is 1 − P(Type II error))
- Conventional significance levels when testing hypotheses are 10%, 5%, and 1%
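The trade-off between the two error types, and the effect of sample size, can be illustrated numerically. The sketch below assumes a one-sided Z-test of H0: mean ≤ null value with a normal population and known standard deviation. The function name `power_one_sided_z` and all numbers in the scenario are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def power_one_sided_z(true_mean, null_mean, sigma, n, alpha):
    """Power of the Z-test of H0: mean <= null_mean vs. H1: mean > null_mean.

    Assumes a normal population with known standard deviation sigma.
    """
    std_normal = NormalDist()
    z_cutoff = std_normal.inv_cdf(1 - alpha)  # reject H0 when Z > z_cutoff
    shift = (true_mean - null_mean) / (sigma / n ** 0.5)
    # Under the true mean, Z is normal with mean `shift` and variance 1.
    return 1 - std_normal.cdf(z_cutoff - shift)


# Hypothetical scenario: H0 value 10, true mean 11, sigma 4.
for n in (50, 200):
    for alpha in (0.10, 0.05, 0.01):
        power = power_one_sided_z(11.0, 10.0, 4.0, n, alpha)
        print(f"n={n:3d} alpha={alpha:.2f} power={power:.3f} "
              f"P(Type II)={1 - power:.3f}")
```

For a fixed sample size, the printed Type II probability rises as the significance level falls. Moving from 50 to 200 observations lowers the Type II probability at every significance level.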
16Specifying the Significance Level
- Example:
  - If we reject the null hypothesis at the 10% significance level, we have some evidence that the alternative is true
  - If we reject the null hypothesis at the 5% significance level, we have strong evidence that the alternative is true
  - If we reject the null hypothesis at the 1% significance level, we have very strong evidence that the alternative is true
17Stating the Decision Rule
- The decision rule compares the calculated test statistic with specific cutoffs from the tables of the statistic's distribution
- Example: Suppose that the test statistic that we use is the Z-statistic (Z variable) and that we use a 5% significance level
  - If the hypothesis test is H0: θ = θ0 vs. H1: θ ≠ θ0, then the two rejection values are z0.025 = 1.96 and −z0.025 = −1.96
  - We would reject the null if Z < −1.96 or Z > 1.96
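The two-sided decision rule can be sketched in a few lines, taking the cutoff from the standard normal distribution rather than from a table. The function name `two_sided_z_decision` and the Z values passed to it are hypothetical.

```python
from statistics import NormalDist


def two_sided_z_decision(z_statistic, alpha=0.05):
    """Return (reject_null, cutoff) for H0: theta = theta0 vs. H1: theta != theta0."""
    # Upper alpha/2 point of the standard normal, about 1.96 when alpha = 0.05.
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z_statistic) > cutoff, cutoff


# Hypothetical Z values, chosen only to illustrate both outcomes.
for z in (2.10, -1.50):
    reject, cutoff = two_sided_z_decision(z)
    print(f"Z={z:+.2f} cutoff=+/-{cutoff:.2f} reject H0: {reject}")
```

Comparing the absolute value of Z with a single positive cutoff is equivalent to checking both tails separately.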
18Collecting Data, Calculating Test Statistic and
Making a Decision
- In collecting a sample, it is important to avoid problems of sample selection bias, such as survivorship bias
- Example: If we want to test a hypothesis regarding bank performance and our sample includes only the banks that exist in the last quarter, we exclude the banks that have failed
- Banks still in existence are likely to have performed better than those that failed and, thus, there will be some bias in our sample
19Hypothesis Tests and Financial Decisions
- Deciding whether or not to reject the null hypothesis means making a statistical decision
- Does this always translate into a corresponding financial decision?
- Example: Suppose we find support through a test for the hypothesis that, on average, stocks provide higher returns than bonds
20Hypothesis Tests and Financial Decisions
- Does this statistical decision have a financial meaning as well?
- From a financial or investment perspective, we may also want to understand the risks of investing in these two types of assets
- Finally, we define the p-value as the smallest level of significance at which we can reject the null hypothesis
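The p-value definition can be made concrete for a Z-statistic. The sketch below assumes the test statistic is standard normal under the null. The function name `z_p_value` and its `alternative` labels are hypothetical conventions, not taken from the slides.

```python
from statistics import NormalDist


def z_p_value(z_statistic, alternative="two-sided"):
    """p-value of a Z-statistic for the given form of the alternative hypothesis."""
    cdf = NormalDist().cdf
    if alternative == "greater":      # H1: theta > theta0
        return 1 - cdf(z_statistic)
    if alternative == "less":         # H1: theta < theta0
        return cdf(z_statistic)
    if alternative == "two-sided":    # H1: theta != theta0
        return 2 * (1 - cdf(abs(z_statistic)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Hypothetical Z value: the null is rejected at every significance
# level that is at least as large as the printed p-value.
p = z_p_value(2.10, "two-sided")
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f} p-value={p:.4f} reject H0: {p <= alpha}")
```

Reporting the p-value lets the reader apply any significance level without recomputing the test.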
21Hypothesis Test for a Single Mean (Normal
Distribution, Variance Unknown)
22Example of Hypothesis Test for a Single Mean
- Suppose that the controller of a firm monitors the firm's payments from its customers through days receivables
- The firm has tried to maintain an average of 45 days in receivables
- A recent random sample of 50 accounts has shown a mean of 49 days and a standard deviation of 8 days
- Can we reject the hypothesis that the average days in receivables for this firm has not increased?
23Example of Hypothesis Test for a Single Mean
- The testable hypotheses are stated as follows:
  - H0: μ ≤ 45
  - H1: μ > 45
- The test can be conducted at the 5% and 1% levels of significance
- Since the population variance is unknown, we use the t-statistic, which is $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}} = \dfrac{49 - 45}{8/\sqrt{50}} \approx 3.536$
24Example of Hypothesis Test for a Single Mean
- The cutoffs for the t-distribution with 49 degrees of freedom at the 5% and 1% levels of significance are 1.677 and 2.405, respectively
- Given that our t-statistic is greater than both cutoffs, the null hypothesis is rejected at both the 5% and 1% levels
- This implies that there has been a statistically significant increase in the days receivables for this firm
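The days-receivables example can be checked with a short calculation. The sample figures and the two cutoffs are those quoted in the example. The variable names are illustrative, and the one-sided rule (reject when t exceeds the cutoff) follows from the alternative that the mean is greater than 45.

```python
import math

# Sample information from the example.
n = 50               # number of accounts sampled
sample_mean = 49.0   # mean days in receivables
sample_sd = 8.0      # sample standard deviation in days
mu_0 = 45.0          # value of the mean under the null hypothesis

standard_error = sample_sd / math.sqrt(n)
t_statistic = (sample_mean - mu_0) / standard_error

# One-sided cutoffs for 49 degrees of freedom, as quoted in the example.
cutoffs = {"5%": 1.677, "1%": 2.405}
decisions = {level: t_statistic > cutoff for level, cutoff in cutoffs.items()}

print(round(t_statistic, 3))  # 3.536
for level, reject in decisions.items():
    print(f"reject H0 at the {level} level: {reject}")
```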
25Hypothesis Test for Difference Between Population
Means
- We often want to test the hypothesis that the population means differ between two groups
- Examples:
  - Is the average debt-equity ratio higher for mature firms than for young firms?
  - Do average stock returns differ by decade?
  - Do community banks on average lend more to small businesses than larger banking institutions?
  - Do average corporate defaults differ by industry?
26Hypothesis Test for Difference Between Population
Means
- Taking samples from the two populations, we can formulate the following hypotheses:
  - H0: μ1 = μ2 vs. H1: μ1 ≠ μ2
  - H0: μ1 ≤ μ2 vs. H1: μ1 > μ2
  - H0: μ1 ≥ μ2 vs. H1: μ1 < μ2
- Two cases (assuming samples are independent):
  - Populations are assumed normally distributed, variances are unknown but equal
  - Populations are assumed normally distributed, variances are unknown but unequal
27Hypothesis Test for Difference Between Population
Means
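- When population variances are assumed to be equal, the t-statistic (for a hypothesized difference of zero) is as follows: $t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2/n_1 + s_p^2/n_2}}$
- where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ is the pooled estimator of the common population variance
- and the degrees of freedom are $n_1 + n_2 - 2$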
28Hypothesis Test for Difference Between Population
Means
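- When population variances cannot be assumed to be equal, the t-statistic (for a hypothesized difference of zero) is as follows: $t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
- and the degrees of freedom are commonly approximated (Welch-Satterthwaite) by $df = \dfrac{(s_1^2/n_1 + s_2^2/n_2)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$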
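The unequal-variance case can be sketched as follows. The degrees of freedom use the Welch-Satterthwaite approximation, which is an assumption here, since some textbooks use a slightly different denominator. The function name `unequal_variance_t` and the sample figures are hypothetical.

```python
import math


def unequal_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """t-statistic and approximate degrees of freedom when variances may differ.

    Tests a hypothesized difference of zero between the two population means.
    """
    v1 = sd1 ** 2 / n1   # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2   # estimated variance of the second sample mean
    t_statistic = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (assumed; variants exist).
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_statistic, df


# Hypothetical samples, not from the slides.
t_statistic, df = unequal_variance_t(1.2, 3.0, 40, 0.5, 5.0, 60)
print(f"t = {t_statistic:.3f}, approximate df = {df:.1f}")
```

The approximate degrees of freedom are usually not an integer. A common convention is to round down before looking up the cutoff in a t-table.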
29Example of Hypothesis Test for Differences Between
Population Means
- Suppose that we observe monthly returns on the S&P 500 from the 1970s and the 1980s (equal samples of 120 observations each)
- For the 1970s, the mean monthly return is 0.58% and the standard deviation is 4.598%
- For the 1980s, the mean monthly return is 1.47% and the standard deviation is 4.738%
- We want to test whether the two population means are equal, assuming that both populations are normally distributed and that the variances are unknown but equal
30Example of Hypothesis Test for Differences Between
Population Means
- The hypothesis test is formulated as follows:
  - H0: μ70 = μ80 vs. H1: μ70 ≠ μ80
- Suppose we are interested in testing the above hypothesis at the 5% and 1% levels of significance
- Assuming the two samples are independent, the degrees of freedom are 120 + 120 − 2 = 238
31Example of Hypothesis Test for Differences Between
Population Means
- Plugging the relevant information into the formulas for the estimator of the common population variance, $s_p^2$, and the t-statistic, we find that t = −1.477
- The cutoffs of the t-distribution for this two-sided test are:
  - At the 5% level, we reject the null if t < −1.972 or t > 1.972
  - At the 1% level, we reject the null if t < −2.601 or t > 2.601
- Given our t-statistic of −1.477, we cannot reject the null hypothesis at either the 5% or the 1% significance level
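The figures in this example can be reproduced with a short calculation. It assumes the equal-variance (pooled) form of the t-statistic, and it takes the sample figures and the two-sided cutoffs from the example. The variable names are illustrative.

```python
import math

# Sample information from the example (monthly returns, in percent).
n_70s = n_80s = 120
mean_70s, sd_70s = 0.58, 4.598
mean_80s, sd_80s = 1.47, 4.738

# Pooled estimator of the common population variance.
df = n_70s + n_80s - 2
pooled_variance = ((n_70s - 1) * sd_70s ** 2 + (n_80s - 1) * sd_80s ** 2) / df

standard_error = math.sqrt(pooled_variance / n_70s + pooled_variance / n_80s)
t_statistic = (mean_70s - mean_80s) / standard_error

# Two-sided cutoffs as quoted in the example.
cutoffs = {"5%": 1.972, "1%": 2.601}
decisions = {level: abs(t_statistic) > cutoff for level, cutoff in cutoffs.items()}

print(df, round(t_statistic, 3))  # 238 -1.477
for level, reject in decisions.items():
    print(f"reject H0 at the {level} level: {reject}")
```

The t-statistic is negative because the 1970s mean is subtracted from first. The two-sided rule depends only on its absolute value.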