Title: MBA Statistics 5165100 COURSE
1MBA Statistics 51-651-00 COURSE 2
- Do we have winning conditions?
- Decision making from statistical inference
2Very often, a decision is taken following a
quantitative analysis of certain parameters.
- You are offered two advertising concepts to
launch a new product. You will choose the one
that obtains the best effectiveness score
in your target market.
- If the resistance or the average durability of a
new product is significantly larger than that
of the best competing product, you will put this
product on the market.
- If the winning conditions are present, i.e.
more than 50% of people in Quebec would vote yes in a
referendum on sovereignty, then Bernard Landry
will decide to hold one.
3In general, the parameters which interest us are
estimated using a sample and our decision will be
made following a hypothesis test.
- Example
- We ask 1000 residents of the Province of Quebec,
chosen at random among those who have the right to
vote, if today they would vote yes in a Quebec
referendum on sovereignty.
4What would Bernard Landry do if
- 432 voters voted yes?
- (432/1000 = 43.2%)
- He would most probably not hold a referendum.
- 517 voters voted yes?
- (517/1000 = 51.7%)
- Is 51.7% significantly larger than 50%?
- 612 voters voted yes?
- (612/1000 = 61.2%)
- 61.2% is probably significantly larger than 50%.
Therefore he would decide to hold a referendum on
the sovereignty of Quebec.
5Basic notions of hypothesis tests
- To help us decide (especially in case 2 of the
previous slide), we will try to quantify the term
"significantly different", statistically
speaking, by associating a probability of error
with it.
- In other words, we want to know, starting from
the results obtained in the sample, what is the
probability that the Premier is making a mistake
by deciding to hold a referendum on sovereignty.
6Basic notions of hypothesis tests (contd.)
- If the probability of making a mistake is small
(for example, lower than 5%), he will decide
to hold a referendum on sovereignty soon.
- If this probability is large (for example,
higher than 5%), he will wait for a while
until the winning conditions are met before holding a
referendum.
7Basic notions of hypothesis tests(contd)
- There are essentially two possibilities
- 50% or less of the voters would vote yes if a
referendum took place today
- more than 50% of the voters would vote yes.
- The first possibility is called the null
hypothesis (denoted H0).
- The second possibility is called the alternate
hypothesis (denoted H1).
8Notation
- Let p be the true proportion of voters who
would vote yes in a referendum. We then have the
following two possibilities:
- H0: p ≤ 50% vs H1: p > 50%
- Often, the alternate hypothesis is what we want
to show beyond any reasonable doubt, i.e. we
want the probability of making a mistake by
making the decision H1, starting from the results
of the sample, to be small.
9Choosing H1
- The choice of H1 is determined by the question
you need to answer.
- H1 must be chosen in such a way that you can
answer yes (resp. no) to the question if one
accepts H1 and you can answer no (resp. yes) if
one accepts H0.
- Typically there are three choices for H1:
- θ > 0, θ < 0 or θ ≠ 0
10Choosing H1 (continued)
- The question Bernard Landry is asking himself is
"Do I have a chance of winning?"
- H1: p < ½ is not good. If one accepts H0 then one
can conclude that p ≥ ½, so the answer to his
question is not a clear yes or no! The same is true for
the choice H1: p ≠ ½.
- But H1: p > ½ is the right choice. If H1 is
accepted, the answer is yes, while if H0 is
accepted, then p ≤ ½, so the answer is no.
11Possible errors in decision making starting from
a sample
- Type I error
- To reject H0 in favour of H1 (i.e. to take the
decision H1) when actually H0 is true.
- The probability of Type I error is the
probability of observing the value
obtained in our sample, or a value even
further away from H0, if H0 is true. In
statistical jargon, this probability is often
called the p-value.
- Type II error
- Not rejecting H0 in favour of H1 when actually H1
is true.
12Is the defendant guilty or not guilty?
13Control of Type I and Type II errors
- Given the results obtained in the sample, we
calculate the probability of Type I error
(p-value).
- If this probability is relatively small (for
example p-value < 5%), then we will reject H0 and
make the decision H1. If not, we will not reject
H0.
14P-value
- Measures the confidence you should have in H0
- A small p-value indicates that you should be
less confident in H0
- How small should the p-value be to reject H0 in
favour of H1?
- It depends on you
- Illustration: p-value.xls
15Real life analog
- One of your friends just lied to you. Is he still
your friend?
- Then he lies again, and again, and again.
- When will you stop considering him/her as a
friend?
16Control of Type I and Type II errors (continued)
- For a Type I error fixed in advance (e.g. 5%), we
control, using the sample size, the Type II error
before undertaking the study.
- We define the power of the hypothesis test as the
quantity
- (1 - probability of a Type II error)
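To make the role of the sample size concrete, here is a minimal sketch of the power of a right-tailed test of a proportion. It assumes the large-sample normal approximation for the sample proportion. The function name `power_right_tailed` and the true proportion 0.53 are illustrative assumptions, not values from the course.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0,1)


def power_right_tailed(n, p0, p_true, alpha=0.05):
    """Approximate power of the test H0: p <= p0 vs H1: p > p0.

    Assumes the sample proportion is approximately normal (large n).
    p_true is a hypothetical true proportion chosen for illustration.
    """
    # Smallest sample proportion that leads to rejecting H0 at level alpha.
    critical = p0 + STD_NORMAL.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Probability of exceeding that threshold when the true proportion is p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    return 1 - STD_NORMAL.cdf((critical - p_true) / se_true)


for n in (500, 1000, 4000):
    power = power_right_tailed(n, p0=0.50, p_true=0.53)
    print(f"n = {n}: power = {power:.2f}, Type II error = {1 - power:.2f}")
```

With the Type I error held at 5%, a larger n gives a higher power, i.e. a smaller Type II error, which is why the sample size is chosen before the study.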
17In the next few hours, we will see basic
statistical tests
- Test of a proportion.
- Test of a mean.
- Test of a difference between two means from the
same sample (similar to case 2).
18 Test of a proportion
- Example
- Two years ago, a company put a new product on
the market.
- The top management of the firm plans to increase
advertising expenses if less than 70% of the population know
the product.
19What are the possible hypotheses we want to
examine?
- Let p be the true proportion of individuals
in the population who know the product and
p0 the value which corresponds to our
hypothesis or decision making (p0 = 70% in the
previous example). We have to choose between
- H0: p ≤ p0 vs H1: p > p0 (right-tailed test)
- H0: p ≥ p0 vs H1: p < p0 (left-tailed test)
- H0: p = p0 vs H1: p ≠ p0 (two-tailed test)
20- One must choose the hypothesis H1 so that the
answer to the question is yes or no.
- In this case, the question is "Should we increase
advertising expenses?"
21- H0: p ≤ 70% vs H1: p > 70%
- If H1 is accepted, the answer is No. If H0 is
accepted, the answer could be No or Yes!
- H1: p > 70% is not appropriate.
22- H0: p = 70% vs H1: p ≠ 70%
- If H0 is accepted, the answer is No. If H1 is
accepted, the answer could be No or Yes!
- H1: p ≠ 70% is not appropriate.
23- H0: p ≥ 70% vs H1: p < 70%
- If H0 is accepted, the answer is No. If H1 is
accepted, the answer is Yes!
- H1: p < 70% is the appropriate choice.
24Procedure
- We take a sample of n individuals in the target
population, and we calculate the proportion of
individuals who know the product.
- We will reject the null hypothesis H0, at the α
level, if we have sufficient proof against it,
i.e. enough evidence in favour of the alternate
hypothesis H1, i.e. p-value < α.
25The test statistic is given by
- z = (p̂ - p0) / √( p0 (1 - p0) / n ), where p̂ is the sample proportion
- If the null hypothesis H0 is true and the sample
size is large, the statistic z will approximately
follow a normal distribution with mean 0 and
variance 1, denoted N(0,1).
26In order to make a decision, we calculate the
p-value
- Right-tailed test
- p-value = Prob[N(0,1) > z]
- Left-tailed test
- p-value = Prob[N(0,1) < z]
- Two-tailed test
- p-value = 2 x Prob[N(0,1) > |z|]
- The p-value is calculated with proportion-1t.xls
27The company contacted 500 people from the target
population by telephone
- 330 individuals answer that they know the product
(330/500 = 66%).
- H0: p ≥ 70% vs H1: p < 70%
- z = (0.66 - 0.70) / √(0.70 x 0.30 / 500) ≈ -1.95
- p-value = 0.0255
- We reject H0 (or accept H1) at the 5% level.
- Therefore we will make the decision to raise the
advertising budget for this product.
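As a cross-check on the slide above, the z statistic and p-value can be recomputed in a few lines. This is only a sketch using Python's standard library. The function name `proportion_test` and its `tail` argument are illustrative choices, not part of the course spreadsheet proportion-1t.xls.

```python
from math import sqrt
from statistics import NormalDist


def proportion_test(successes, n, p0, tail):
    """Large-sample z test of a proportion against the reference value p0.

    tail is "right" (H1: p > p0), "left" (H1: p < p0) or "two" (H1: p != p0).
    Returns the z statistic and the p-value.
    """
    p_hat = successes / n                        # sample proportion
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error computed under H0
    std_normal = NormalDist()
    if tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "left":
        p_value = std_normal.cdf(z)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'right', 'left' or 'two'")
    return z, p_value


# Product-awareness example: 330 of 500 know the product, H1: p < 70%.
z, p_value = proportion_test(330, 500, 0.70, "left")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Calling `proportion_test(517, 1000, 0.50, "right")` applies the same function to the voting-intentions survey with 517 yes answers.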
28Intentions to vote example
- We choose at random 1000 residents of Quebec who
have the right to vote and ask them if today,
they would vote yes in a referendum on
sovereignty. In the sample, 517 voters answered
that they would vote yes.
- H0: p ≤ 50% vs H1: p > 50%
- z = (0.517 - 0.50) / √(0.50 x 0.50 / 1000) ≈ 1.08
- p-value = 0.1411
- We will not reject H0 at the 5% level
- Bernard Landry will not hold a referendum in the
near future.
29Intentions to vote example
- We choose at random 1000 residents of Quebec who
have the right to vote and ask them if today,
they would vote yes in a referendum on
sovereignty. In the sample, 612 voters answered
that they would vote yes.
- H0: p ≤ 50% vs H1: p > 50%
- p-value = 1.1146E-12
- We will reject H0 at the 5% level
- Bernard Landry will hold a referendum in the near
future.
30Exercise
- Recall the last example in the estimation
section.
- Can you now answer the question satisfactorily?
31Remark: Test vs Confidence interval
- Testing H0: p = p0 vs H1: p ≠ p0 is
equivalent to constructing a confidence interval
for p.
- H0 is rejected iff p0 is not in the interval.
32 Test of one mean
- Example: You are in charge of the department
which manufactures and produces 170 g bags of
chips (brand CCC). To verify if, on average, the
process of filling is maintained at 170 g, each
day one of your employees is asked to take a
random sample of 100 bags and the average weight
of the sample is calculated. The process of
filling will be stopped if the average weight is
significantly different from 170 g.
33What are the possible hypotheses we want to
examine?
- Let μ be the true mean of a characteristic in
the population. This mean is unknown, as is the
variance σ². Let μ0 be the value of the mean
which corresponds to our hypothesis or decision
making (μ0 = 170 g in the previous example). We
have to choose between
- H0: μ ≤ μ0 vs H1: μ > μ0 (right-tailed test)
- H0: μ ≥ μ0 vs H1: μ < μ0 (left-tailed test)
- H0: μ = μ0 vs H1: μ ≠ μ0 (two-tailed test)
34Procedure
- We take a sample of size n in the target
population and we calculate the mean x̄ and the
standard deviation s.
- We will reject the null hypothesis H0, at the α
level, if we have sufficient proof against it,
i.e. enough evidence in favour of the alternate
hypothesis H1, i.e. p-value < α.
35The test statistic is given by
- t = (x̄ - μ0) / (s / √n), where x̄ is the sample mean
- If the null hypothesis H0 is true, the t
statistic will follow a Student distribution with
n-1 degrees of freedom, denoted t(n-1).
36In order to make a decision, we calculate p-value.
- Right-tailed test
- p-value = Prob[t(n-1) > t]
- Left-tailed test
- p-value = Prob[t(n-1) < t]
- Two-tailed test
- p-value = 2 x Prob[t(n-1) > |t|]
- (1-α) confidence interval for μ
- x̄ ± t(α/2; n-1) x s / √n, where t(α/2; n-1) is the upper α/2 critical value of t(n-1)
- The p-value is calculated using mean-1t.xls
37Example
- The sample mean of the 100 bags of chips is
169.9 grams and the standard deviation is s = 0.27.
- H0: μ = 170 g vs H1: μ ≠ 170 g
- t = (169.9 - 170) / (0.27 / √100) ≈ -3.70
- p-value = 0.0003
- We reject H0 without being afraid of being wrong!
- 95% confidence interval for μ
- [169.846, 169.953]
- The interval does not contain the value 170
- ⇒ We reject H0 at the 5% level
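The two-tailed test above can be sketched with the standard library alone. The course obtains the p-value from mean-1t.xls. Here, as an assumption made only for illustration, the Student tail probability is approximated by integrating the t density with Simpson's rule. The helper names `t_pdf`, `t_upper_tail` and `one_sample_t_test` are hypothetical. In practice a statistics library routine would normally be used instead.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_c) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for T ~ t(df), using Simpson's rule and symmetry.

    steps must be even; 2000 is an arbitrary choice for this sketch.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 - area if t >= 0 else 0.5 + area


def one_sample_t_test(mean, s, n, mu0, tail):
    """Return (t, p-value) for a test of one mean; tail is 'right', 'left' or 'two'."""
    t = (mean - mu0) / (s / sqrt(n))
    df = n - 1
    if tail == "right":
        p_value = t_upper_tail(t, df)
    elif tail == "left":
        p_value = t_upper_tail(-t, df)
    elif tail == "two":
        p_value = 2 * t_upper_tail(abs(t), df)
    else:
        raise ValueError("tail must be 'right', 'left' or 'two'")
    return t, p_value


# Bags of chips: sample mean 169.9 g, s = 0.27, n = 100, H1: mu != 170 g.
t_stat, p_value = one_sample_t_test(169.9, 0.27, 100, 170.0, "two")
print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
```

The resulting p-value is well below 5%, so the sketch leads to the same decision as the slide: reject H0.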
38- Suppose instead that the mean of the sample of 100 bags of chips
is 170.011 grams and the standard deviation is s =
0.27.
- H0: μ = 170 g vs H1: μ ≠ 170 g
- t = (170.011 - 170) / (0.27 / √100) ≈ 0.41
- p-value = 0.69
- We will not reject H0
- 95% confidence interval for μ
- [169.957, 170.064]
- The interval contains the value 170 ⇒ we will not
reject H0 at the 5% level
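The confidence-interval check used in the two chips examples can be sketched as follows. The critical value 1.984 (97.5th percentile of the Student distribution with 99 degrees of freedom) is taken from a standard t table and hard-coded as an assumption. The function names are illustrative.

```python
from math import sqrt

# 97.5th percentile of t(99), read from a standard Student table (assumption).
T_CRIT_975_DF99 = 1.984


def mean_confidence_interval(mean, s, n, t_crit):
    """Two-sided confidence interval for the mean: mean +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / sqrt(n)
    return mean - half_width, mean + half_width


def rejects_two_tailed(mean, s, n, mu0, t_crit):
    """Reject H0: mu = mu0 exactly when mu0 falls outside the interval."""
    low, high = mean_confidence_interval(mean, s, n, t_crit)
    return not (low <= mu0 <= high)


for sample_mean in (169.9, 170.011):
    low, high = mean_confidence_interval(sample_mean, 0.27, 100, T_CRIT_975_DF99)
    decision = "reject H0" if rejects_two_tailed(
        sample_mean, 0.27, 100, 170.0, T_CRIT_975_DF99) else "do not reject H0"
    print(f"mean = {sample_mean}: [{low:.3f}, {high:.3f}] -> {decision}")
```

The endpoints may differ from the slide figures in the last decimal, depending on rounding.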
39Case study
- The average annual salary of a group of employees
in a city is 45 000. One of the main issues of
the negotiations is that the union
representative states that this particular group is
paid much less than in other comparable cities.
- One decides to verify that hypothesis. If the
union is right, the employer will increase the
salaries in such a way that the average salary
will not be significantly lower than in the other
cities. Both parties agree to take a risk of 5%.
40Case study (continued)
- To perform the comparison, 50 comparable cities
were chosen at random; the mean of the 50
(average) annual salaries was 50 000, and the
standard deviation was 16 000.
- a) What is the conclusion?
- b) The city proposes to increase the average
annual salary to 46 500. Is it honest?
41Remark: Test vs Confidence interval
- Testing H0: μ = μ0 vs H1: μ ≠ μ0 is
equivalent to constructing a confidence interval
for μ.
- H0 is rejected if μ0 is not in the interval.
423. Test of a difference of two means from the
same sample
- Example: The human resources director of a company
wants to suggest that the management implement a
special training program for the employees
assigned to the assembling department. To
evaluate the effectiveness of this 3-week
program, we chose, at random, 15 employees and we
observed the number of parts assembled during
this period of time. Thereafter, these 15
employees participated in the training program
and once again, we observed the number of parts
assembled during the same period of time.
43The results obtained (hr.xls) were as follows
individual  before  after  difference
1           15      17      2
2           13      16      3
3            8      10      2
4            9       9      0
5            7       9      2
6           12      13      1
7           11      14      3
8           12      15      3
9           11      14      3
10           9      11      2
11          10      14      4
12          12      11     -1
13          11      13      2
14           7      10      3
15          12      13      1
44The results of the statistical analysis using
Excel were as follows
45This test is equivalent to a test of the mean
difference between after and before.
Thus, the average productivity is
significantly higher after the program. If the
costs of the training program are less than the
gains in productivity, then the program will be
adopted.
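The paired comparison can be reproduced from the before/after table as a test of one mean on the differences. The sketch below assumes a right-tailed test (H1: mean difference > 0) at the 5% level, and hard-codes the critical value 1.761 of the Student distribution with 14 degrees of freedom from a standard t table. Neither choice is stated explicitly on the slides.

```python
from math import sqrt
from statistics import mean, stdev

# Number of parts assembled by the 15 employees (hr.xls table).
before = [15, 13, 8, 9, 7, 12, 11, 12, 11, 9, 10, 12, 11, 7, 12]
after = [17, 16, 10, 9, 9, 13, 14, 15, 14, 11, 14, 11, 13, 10, 13]

# Work on the individual differences (after - before).
differences = [a - b for a, b in zip(after, before)]
n = len(differences)
d_bar = mean(differences)   # mean difference
s_d = stdev(differences)    # sample standard deviation of the differences

# Test statistic for H0: mean difference <= 0 vs H1: mean difference > 0.
t_stat = d_bar / (s_d / sqrt(n))

# 95th percentile of t(14), from a standard Student table (assumption).
T_CRIT_95_DF14 = 1.761
significant = t_stat > T_CRIT_95_DF14

print(f"mean difference = {d_bar}, s = {s_d:.3f}, t = {t_stat:.3f}")
print("significantly higher after the program" if significant
      else "no significant increase")
```

The statistic is far above the critical value, in line with the conclusion that productivity is significantly higher after the program.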