Title: Hypothesis Testing And Univariate Analysis
1Hypothesis Testing And Univariate Analysis
2Preface
- For market researchers, scientific inquiry translates into
- A desire to ask questions about the nature of relationships that affect behavior within markets
- A willingness to formulate hypotheses capable of being tested to determine
- What relationships exist
- When and where these relationships hold
- This chapter will extend the research process to include the
- Testing of relationships
- Formulation of hypotheses
- Making of inferences
3Formulating Hypotheses
- The objective of the study, with associated hypotheses, should be stated as clearly as possible and agreed upon at the outset.
- Objectives and hypotheses shape and mold the study. They determine
- The kinds of questions to be asked
- The measurement scales for the data to be collected
- The kinds of analysis that will be necessary
- Actual research projects almost always formulate and test new hypotheses during the project. This is both acceptable and desirable.
- A new hypothesis can be supported by the data, not supported by the data, or neither supported nor rejected by the data.
4Formulating Hypotheses cont.
- In a typical survey project, the analyst may alternate between searching the data and formulating hypotheses.
- Three practices of survey analysts (Selvin and Stuart 1996)
- Snooping: the process of searching through a body of data and looking at many relations in order to find those worth testing
- Fishing: the process of using the data to choose which of a number of predesignated variables to include in an explanatory model
- Hunting: the process of testing each of a predesignated set of hypotheses with the data
- This investigative approach is reasonable for basic research but may not be practical for decisional research.
5Formulating Hypotheses cont.
- What is a hypothesis?
- A hypothesis is an assertion that variables (measured concepts) are related in a specific way such that this relationship explains certain facts or phenomena.
- Outcomes are predicted if a specific course of action is followed.
- Hypotheses are often stated as research questions when reporting either the purpose of the investigation or the findings.
- Hypotheses must be empirically testable.
- Hypotheses may be stated informally as research questions, more formally as a set of alternative hypotheses, or in a testable form known as a null hypothesis, which states that there is no relationship between the variables to be examined.
- Research questions are not empirically testable, but they aid in the important task of directing and focusing the research effort.
6EXHIBIT 15.1 Development of a Research Question
for Mingles
- Mingles is an exclusive restaurant specializing in seafood prepared with a light Italian flair. Barbara C., the owner and manager, has attempted to create an airy contemporary atmosphere that is conducive to conversation and dining enjoyment. In the first three months, business has grown to about 70 percent of capacity during dinner hours.
- Barbara developed a questionnaire that asked, among other things, "How would you rate the value of Mingles' food for the price paid?" The response form provided five answers with boxes for the respondent to check:
- Much Better Than Expected
- A Little Better Than Expected
- About Average
- A Little Worse Than Expected
- Much Worse Than Expected
- Customers' responses were coded using a scale of +2, +1, 0, -1, and -2.
- When tabulated, the average response was found to be 0.89 with a sample standard deviation of 1.43. The research question asks whether Mingles is perceived as being better than average when considering the price and value of the food.
- The research question that Barbara C. has developed is "How satisfied are Mingles' customers with the concept, service, food, and value?"
7Formulating Hypotheses cont.
- Null hypotheses (H0) are statements identifying relationships that are statistically testable and can be shown not to hold (nullified).
- The logic of the null hypothesis is that we hypothesize no difference, and we reject the hypothesis if a difference is found.
- A null hypothesis may also be used to specify other types of relationships that are being tested, such as the difference between two groups, or the ability of a specific variable to predict a phenomenon such as sales or repeat business.
- Alternative hypotheses may be considered to be the opposite of the null hypotheses.
- The alternative hypothesis makes a formal statement of expected difference. It may state simply that a difference exists or that a directional difference exists, depending upon how the null hypothesis is stated.
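As an illustration only, the Mingles question from Exhibit 15.1 could be written as a formal pair of hypotheses. The symbol \mu (the population mean rating) and the use of 0 (the "About Average" code) as the test value are assumptions made here; the exhibit itself does not give a formal statement.

```latex
% Directional (one-tailed) form: is Mingles rated better than average?
H_0: \mu \le 0 \qquad H_1: \mu > 0

% Nondirectional (two-tailed) form: does the rating differ from average?
H_0: \mu = 0 \qquad H_1: \mu \ne 0
```

Which form applies depends on whether the researcher expects a difference in a particular direction before looking at the data.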
8Making Inferences
- Once the data have been tabulated and summary measures calculated, we often will make inferences about the nature of the population and ask a multitude of questions.
- For example, we would want to know about the underlying variables that influence preference, purchase, or use (such as color, ease of opening, accuracy in dispensing the desired quantity, comfort in handling, etc.).
- The broad objective of testing hypotheses underlies all decisional research. Sometimes the population as a whole can be measured and profiled in its entirety.
- More often, we cannot measure everyone in the population and instead must estimate population values using a sample of respondents drawn from the population.
9The Relationship Between a Population, a Sampling
Distribution, and a Sample
10The Relationship Between the Sample and the
Sampling Distribution
11Acceptable Error in Hypothesis Testing
- A question that continually plagues analysts is, "What significance level should be used in hypothesis testing?"
- The significance level refers to the amount of error we are willing to accept in decisions that are based on the hypothesis test.
- In hypothesis testing, the sample results sometimes lead us to reject H0 when it is true. This is a Type I error.
- On other occasions the sample findings may lead us to accept H0 when it is false. This is a Type II error.
12Types of Error in Making a Wrong Decision
- 1. A Type I error occurs when we incorrectly conclude that a difference exists. The probability of this is expressed as α, the probability that we will incorrectly reject H0, the null hypothesis, or hypothesis of no difference.
- 2. A Type II error occurs when we accept a null hypothesis that is in reality false (we find no difference when a difference really does exist).
- 3. We correctly retain the null hypothesis (we could also say we tentatively accept it or that it could not be rejected). The probability of this is 1 − α, the area under the normal curve less the area occupied by α, the significance level.
- 4. The power of the test is the ability to reject the null hypothesis when it should be rejected (when it is false). Because power increases as α becomes larger, researchers may choose an α of .10 or even .20 to increase power. Alternatively, sample size may be increased to increase power. Increasing sample size is the preferred option for most market researchers.
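A small simulation can make the meaning of α concrete. The sketch below is an illustration, not part of the original material: it repeatedly draws samples from a population in which H0 is true by construction and counts how often a two-tailed z test rejects it anyway. The function name, sample size, number of trials, and seed are all hypothetical choices.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Estimate how often a two-tailed z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: population mean 0, known sigma 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Share of true null hypotheses rejected: {rate:.3f}")
```

The share of rejections should land close to the chosen α, and raising α raises it, which is the price paid for the extra power described in item 4.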
13Power of a Test
- The power of a hypothesis test is defined as 1 − β, or one minus the probability of a Type II error. This means that the power of a test is its ability to reject the null hypothesis when it is false.
- The power of a statistical test is determined by
- 1. The acceptable amount of discrepancy between the tested hypothesis and the true situation
- 2. The sample size: increasing the sample size increases power (and narrows the confidence interval)
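The relationship between power, α, sample size, and the size of the discrepancy can be sketched numerically. The example below assumes a one-tailed z test with a known population standard deviation; the function name and the figures passed to it are hypothetical and are not taken from the text.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power (1 - beta) of an upper one-tailed z test.

    `effect` is the true mean minus the hypothesized mean.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, z is centered on effect / (sigma / sqrt(n))
    shift = effect / (sigma / n ** 0.5)
    beta = std_normal.cdf(z_crit - shift)  # probability of a Type II error
    return 1 - beta


for alpha in (0.05, 0.10, 0.20):
    for n in (25, 100):
        power = z_test_power(effect=0.3, sigma=1.0, n=n, alpha=alpha)
        print(f"alpha={alpha:.2f}  n={n:3d}  power={power:.3f}")
```

Reading down the output, power rises with α and with n, and a larger discrepancy between the tested hypothesis and the true situation raises it as well.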
14SELECTING TESTS OF STATISTICAL SIGNIFICANCE
- Tests performed on interval or ratio data use what are known as parametric tests, which include such techniques as the F, t, and z tests.
- Nonparametric methods are often called distribution-free methods because the inferences are based on a test statistic whose sampling distribution does not depend upon the specific distribution of the population from which the sample is drawn.
- Determining an appropriate test for a particular set of data depends on
- The level of measurement of the data
- The number of variables that are involved
- For multiple variables, how they are assumed to be related
15PARAMETRIC AND NONPARAMETRIC ANALYSIS
- The process of making inferences from the sample to the population's parameters is called parametric analysis.
- Parametric methods rely almost exclusively on either interval or ratio scaled data.
- Nonparametric methods may be used to compare entire distributions that are based on nominal data.
- Other nonparametric methods use an ordinal measurement scale and test for the ordering of observations in the data set.
- Problems that may be solved with parametric methods may often also be solved by a nonparametric method designed to address a similar question.
16Univariate Analyses of Parametric Data
- Marketing researchers are often concerned with estimating parameters of a population. In addition, many studies go beyond estimation and compare population parameters by testing hypotheses about differences between them.
- Very often, the means, proportions, and variances are the summary measures of concern.
- The Confidence Interval: the confidence interval is a range of values with a given probability (.95, .99, etc.) of including the true population parameter.
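A confidence interval for a mean can be computed in a few lines. This is a minimal sketch assuming the normal approximation (a large sample or known population standard deviation); the function name and the input figures are hypothetical.

```python
from statistics import NormalDist


def mean_confidence_interval(sample_mean, std_dev, n, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    # Two-tailed critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_dev / n ** 0.5
    return sample_mean - margin, sample_mean + margin


# Hypothetical figures, not taken from the text
low, high = mean_confidence_interval(sample_mean=50.0, std_dev=10.0, n=100)
print(f"95% CI: {low:.2f} to {high:.2f}")  # 95% CI: 48.04 to 51.96
```

Asking for a higher confidence level (.99 instead of .95) widens the interval, while a larger sample narrows it.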
17Univariate Hypothesis Testing of Means
- Population Variance Is Known
- The z statistic describes probabilities of the normal distribution and is the appropriate tool to test the difference between µ, the mean of the sampling distribution, and x̄, the sample mean, when the population variance is known.
- The z statistic may be used only when the following conditions are met:
- Individual items in the sample must be drawn in a random manner.
- The population must be normally distributed. If this is not the case, the sample must be large (n > 30), so that the sampling distribution is approximately normally distributed.
- The data must be at least interval scaled.
- The variance of the population must be known.
18Population Variance Is Known
- The traditional hypothesis testing approach is as follows:
- The null hypothesis (H0) is specified: there is no difference between µ and x̄, and any observed difference is due solely to sample variation.
- The alpha risk (Type I error) is established (usually .05).
- The z value is calculated by the appropriate z formula, $z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$, where σ is the known population standard deviation and n is the sample size.
- The probability of the observed difference having occurred by chance is determined from a table of the normal distribution (Appendix A, Table A.1).
- If the probability of the observed difference having occurred by chance is greater than the alpha used, then H0 cannot be rejected, and it is concluded that the sample mean is drawn from a sampling distribution of the population having mean µ.
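The steps above can be sketched in code. The example assumes a two-tailed test and uses Python's standard normal distribution in place of the printed table; the function name and the input figures are hypothetical.

```python
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu, sigma, n, alpha=0.05):
    """Two-tailed z test of a sample mean against mu, population sigma known."""
    # z = (x-bar - mu) / (sigma / sqrt(n))
    z = (sample_mean - mu) / (sigma / n ** 0.5)
    # Probability of a difference at least this large arising by chance
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # H0 cannot be rejected when the probability exceeds alpha
    reject_h0 = p_value < alpha
    return z, p_value, reject_h0


# Hypothetical figures, not taken from the text
z, p, reject = one_sample_z_test(sample_mean=103.0, mu=100.0, sigma=15.0, n=100)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

With these figures the standard error is 15 / 10 = 1.5, so z = 2.00 and the chance probability is just under .05, which leads to rejecting H0 at the usual alpha.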
19Univariate Hypothesis Testing of Means
- Population Variance Is Unknown
- Researchers rarely know the true variance of the population, and must therefore rely on an estimate of σ², namely the sample variance s². With this variance estimate, we may compute the t statistic.
- The appropriate t distribution to use in an analysis is determined by the available degrees of freedom. In univariate analyses, the available degrees of freedom are n − 1.
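As a rough sketch, the t statistic takes the same form as z with the sample standard deviation s substituted for σ. The example reuses the mean (0.89) and standard deviation (1.43) from Exhibit 15.1, but the sample size is not given there, so n = 100 is a hypothetical value chosen only to make the arithmetic concrete. The function name is also hypothetical.

```python
from math import sqrt


def one_sample_t(sample_mean, mu, s, n):
    """Return the t statistic and its degrees of freedom (n - 1)."""
    standard_error = s / sqrt(n)
    t = (sample_mean - mu) / standard_error
    return t, n - 1


# Mean and s come from Exhibit 15.1; n = 100 is an assumed value.
t, df = one_sample_t(sample_mean=0.89, mu=0.0, s=1.43, n=100)
print(f"t = {t:.2f} with {df} degrees of freedom")  # t = 6.22 with 99 degrees of freedom
```

The result would then be compared with the tabled t value for the chosen alpha and n − 1 degrees of freedom.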
20Univariate Analysis of Categorical Data: The
Chi-Square Goodness of Fit Test
- Chi-square analysis (χ²) can be used when the data identify the number of times, or frequency, that each category of a tabulation or cross-tabulation appears. Chi-square is a useful technique for
- 1. Determining the significance of sample deviations from an assumed theoretical distribution, that is, determining whether a certain model fits the data. This is typically called a goodness-of-fit test.
- 2. Determining the significance of the observed associations found in the cross-tabulation of two or more variables. This is typically called a test of independence.
- We compute a measure (chi-square) of the variation between actual and theoretical frequencies, under the null hypothesis that there is no difference between the model and the observed frequencies.
- If the measure of variation is high, we reject the null hypothesis at some specified alpha risk.
- If the measure is low, we accept the null hypothesis that the model's output is in agreement with the actual frequencies. We may say that the model fits the facts.
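A goodness-of-fit calculation can be sketched as follows. The counts are hypothetical (90 customers choosing among three package colors, tested against a model of equal preference), and the function name is an assumption. The critical value 5.991 is the standard tabled chi-square value for 2 degrees of freedom at alpha = .05.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, not taken from the text
observed = [40, 30, 20]
expected = [30, 30, 30]  # equal-preference model: 90 / 3 per category

chi_sq = chi_square_statistic(observed, expected)
df = len(observed) - 1

# Tabled critical value for df = 2, alpha = .05
CRITICAL_DF2_ALPHA_05 = 5.991
reject_h0 = chi_sq > CRITICAL_DF2_ALPHA_05
print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject_h0}")
# chi-square = 6.67, df = 2, reject H0: True
```

Here the measure of variation exceeds the critical value, so the equal-preference model would be rejected for these illustrative counts.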
21Univariate Analysis Test of a Proportion
- The univariate test of proportions, like the univariate test of means, compares the population proportion to the proportion observed in the sample. For a sample proportion, p,
$$z = \frac{p - \pi}{s_p}$$
- where $s_p$, the estimated standard error of the proportion, is given by
$$s_p = \sqrt{\frac{pq}{n}}$$
- z = standard normal value
- π = the population proportion
- p = the sample proportion of successes
- q = (1 − p), the sample proportion of failures
- n = sample size
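The test of a proportion can be sketched in code as well. The example assumes the estimated standard error takes the form sqrt(p * q / n), built from the quantities defined above, and a two-tailed test. The function name, the parameter name `pi0` for the hypothesized population proportion, and the figures are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p, pi0, n):
    """z test of sample proportion p against hypothesized proportion pi0."""
    q = 1 - p                      # sample proportion of failures
    s_p = sqrt(p * q / n)          # estimated standard error (assumed form)
    z = (p - pi0) / s_p
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, s_p, p_value


# Hypothetical: 60% of 100 sampled customers, tested against 50%
z, s_p, p_value = proportion_z_test(p=0.60, pi0=0.50, n=100)
print(f"s_p = {s_p:.4f}, z = {z:.2f}, p = {p_value:.4f}")
```

Some texts compute the standard error from the hypothesized proportion rather than the sample proportion; the sketch follows the sample-based definition given in the slide.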
22Summary
- This chapter has introduced the basic concepts of formulating hypotheses, hypothesis testing, and making statistical inferences in the context of univariate analysis.
- A hypothesis is a statement that variables (measured constructs) are related in a specific way. The null hypothesis, H0, is a statement that no relationship exists between the variables tested or that there is no difference.
- Statistics are based on making inferences from the sample of respondents to the population of all respondents by means of a sampling distribution.
- A Type I error occurs when a true H0 is rejected (there is no difference, but we find there is). A Type II error occurs when we accept a false H0 (there is a difference, but we find that none exists).
23Summary (cont.)
- The power of a test was explained as the ability to reject H0 when it should be rejected.
- Selecting the appropriate statistical technique for investigating a given relationship depends on the level of measurement (nominal, ordinal, or interval) and the number of variables to be analyzed.
- The choice of parametric versus nonparametric analyses depends on the analyst's willingness to accept the distributional assumptions of normality and homogeneity of variances.
- Univariate hypothesis testing was demonstrated using the standard normal distribution statistic (z) to compare a mean and a proportion to the population values.
- The t-test was demonstrated as a parametric test for populations of unknown variance and samples of small size.
- The chi-square goodness-of-fit test was demonstrated as a nonparametric test of nominal data that makes no distributional assumptions.