Hypothesis Testing And Univariate Analysis - PowerPoint PPT Presentation

About This Presentation

Title:

Hypothesis Testing And Univariate Analysis


Slides: 24

Provided by: brianj80

Transcript and Presenter's Notes

Title: Hypothesis Testing And Univariate Analysis

1
Hypothesis Testing And Univariate Analysis

Chapter 15

2
Preface

For market researchers, the scientific inquiry
translates into a desire to ask questions about
the nature of relationships that affect behavior
within markets
The willingness to formulate hypotheses capable
of being tested to determine
What relationships exist
When and where these relationships hold
This chapter will extend the research process to
include the
Testing of relationships
Formulation of hypotheses
Making of inferences

3
Formulating Hypotheses

The objective of the study, with associated
hypotheses, should be stated as clearly as
possible and agreed upon at the outset.
Objectives and hypotheses shape and mold the
study; they determine
The kinds of questions to be asked
The measurement scales for the data to be
collected
The kinds of analysis that will be necessary
Actual research projects almost always formulate
and test new hypotheses during the project. This
is both acceptable and desirable.
A new hypothesis can be supported by the data,
not supported by the data, or neither supported
nor rejected by the data.

4
Formulating Hypotheses cont.

In a typical survey project, the analyst may
alternate between searching the data and
formulating hypotheses.
Three practices of survey analysts (Selvin and
Stuart 1966)
Snooping
-the process of searching through a body of data
and looking at many relations in order to find
those worth testing
Fishing
-the process of using the data to choose which of
a number of predesignated variables to include in
an explanatory model
Hunting
-the process of testing each of a predesignated
set of hypotheses with the data
This investigative approach is reasonable for
basic research but may not be practical for
decisional research.

5
Formulating Hypotheses cont.

What is a hypothesis?
A hypothesis is an assertion that variables
(measured concepts) are related in a specific way
such that this relationship explains certain
facts or phenomena.
Outcomes are predicted if a specific course of
action is followed.
Hypotheses are often stated as research questions
when reporting either the purpose of the
investigation or the findings.
Hypotheses must be empirically testable.
Hypotheses may be stated informally as research
questions, or more formally as a set of
alternative hypotheses, or in a testable form
known as a null hypothesis, which states that
there is no relationship between the variables to
be examined
Research questions are not empirically testable,
but aid in the important task of directing and
focusing the research effort.

6
EXHIBIT 15.1 Development of a Research Question
for Mingles

Mingles is an exclusive restaurant specializing
in seafood prepared with a light Italian flair.
Barbara C., the owner and manager, has attempted
to create an airy contemporary atmosphere that is
conducive to conversation and dining enjoyment.
In the first three months, business has grown to
about 70 percent of capacity during dinner hours.
Barbara developed a questionnaire that asked,
among other things, "How would you rate the value
of Mingles' food for the price paid?" The response
form provided five answers with boxes for the
respondent to check:
Much Better Than Expected
A Little Better Than Expected
About Average
A Little Worse Than Expected
Much Worse Than Expected
Customers' responses were coded using a scale of
+2, +1, 0, -1, and -2. When tabulated, the
average response was found to be 0.89 with a
sample standard deviation of 1.43. The research
question asks if Mingles is perceived as being
better than average when considering the price
and value of the food.
The research question that Barbara C. has
developed is "How satisfied are Mingles' customers
with the concept, service, food, and value?"

7
Formulating Hypotheses cont.

Null hypotheses (H0) are statements identifying
relationships that are statistically testable and
can be shown not to hold (nullified).
The logic of the null hypothesis is that we
hypothesize no difference, and we reject the
hypothesis if a difference is found.
A null hypothesis may also be used to specify
other types of relationships that are being
tested, such as the difference between two
groups, or the ability of a specific variable to
predict a phenomenon such as sales or repeat
business.
Alternative hypotheses may be considered to be
the opposite of the null hypotheses.
The alternative hypothesis makes a formal
statement of expected difference, and may state
simply that a difference exists or that a
directional difference exists, depending upon how
the null hypothesis is stated.

8
Making Inferences

Once the data have been tabulated and summary
measures calculated, we often will make
inferences about the nature of the population and
ask a multitude of questions.
For example, we would want to know about the
underlying associated variables that influence
preference, purchase, or use (such as color, ease
of opening, accuracy in dispensing the desired
quantity, comfort in handling, etc.)
The broad objective of testing hypotheses
underlies all decisional research. Sometimes the
population as a whole can be measured and
profiled in its entirety.
More often, we cannot measure everyone in the
population but instead must estimate the
population using a sample of respondents drawn
from the population.

9
The Relationship Between a Population, a Sampling
Distribution, and a Sample
10
The Relationship Between the Sample and the
Sampling Distribution
11
Acceptable Error in Hypothesis Testing

A question that continually plagues analysts is,
"What significance level should be used in
hypothesis testing?"
The significance level refers to the amount of
error we are willing to accept in our decisions
that are based on the hypothesis test.
In hypothesis testing the sample results
sometimes lead us to reject H0 when it is true.
This is a Type I error.
On other occasions the sample findings may lead
us to accept H0 when it is false. This is a Type
II error.
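
The meaning of the significance level as a Type I error rate can be illustrated with a small simulation. This is only a sketch under assumed values: the population mean and standard deviation, the sample size, the number of trials, and the random seed are all hypothetical and do not come from the presentation. The code draws repeated samples from a population in which H0 is true and counts how often a two-tailed z test rejects H0 anyway.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=50, trials=4000, seed=1):
    """Share of trials in which a true H0 is rejected (hypothetical setup)."""
    rng = random.Random(seed)                     # fixed seed for repeatability
    mu0, sigma = 0.0, 1.0                         # assumed population; H0 is true
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1                       # H0 rejected although it is true
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

With enough trials the observed rejection rate should settle close to the chosen alpha, which is what "the amount of error we are willing to accept" means in practice.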

12
Types of Error in Making a Wrong Decision

1. A Type I error occurs when we incorrectly
conclude that a difference exists. The
probability of this is expressed as α, the
probability that we will incorrectly reject H0,
the null hypothesis, or hypothesis of no
difference.
2. A Type II error occurs when we accept a null
hypothesis when it is in reality false (we find
no difference when a difference really does
exist).
3. We correctly retain the null hypothesis (we
could also say we tentatively accept it or that
it could not be rejected). The probability of
this is equal to the area under the normal curve
less the area occupied by α, the significance
level.
4. The power of the test is the ability to reject
the null hypothesis when it should be rejected
(when false). Because power increases as α
becomes larger, researchers may choose an α of
.10 or even .20 to increase power. Alternatively,
sample size may be increased to increase power.
Increasing sample size is the preferred option
for most market researchers.
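
The trade-off described in point 4 can be sketched numerically. The function below is an illustration, not a method taken from the presentation: it assumes an upper-tailed z test with a known population standard deviation, and the effect size (0.2) and standard deviation (1.0) are hypothetical values chosen only to show the direction of the changes.

```python
from statistics import NormalDist

def power_one_sided_z(effect, sigma, n, alpha):
    """Power of an upper-tailed z test when the true mean is mu0 + effect."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)     # rejection threshold in z units
    shift = effect / (sigma / n ** 0.5)        # true difference in standard errors
    return 1 - std_normal.cdf(z_crit - shift)  # P(reject H0 | H0 is false)

# Hypothetical inputs: true difference 0.2, population sigma 1.0
for alpha in (0.05, 0.10, 0.20):
    for n in (50, 100):
        power = power_one_sided_z(0.2, 1.0, n, alpha)
        print(f"alpha={alpha:.2f}  n={n:3d}  power={power:.3f}")
```

Reading down the output, power rises both when α is loosened and when n is doubled, but only the larger sample achieves this without raising the risk of a Type I error.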

13
Power of a Test

The power of a hypothesis test is defined as
1 - β, or one minus the probability of a Type II
error. This means that the power of a test is its
ability to reject the null hypothesis when it is
false.
The power of a statistical test is determined in
part by
1. The acceptable amount of discrepancy between
the tested hypothesis and the true situation.
2. The sample size. Power is increased by
increasing the sample size (which narrows the
confidence interval).

14
SELECTING TESTS OF STATISTICAL SIGNIFICANCE

Tests performed on interval or ratio data use
what are known as parametric tests, which include
such techniques as the F, t, and z tests.
Nonparametric methods are often called
distribution-free methods because the inferences
are based on a test statistic whose sampling
distribution does not depend upon the specific
distribution of the population from which the
sample is drawn.
Determining an appropriate test for a particular
set of data depends on
the level of measurement of the data
the number of variables that are involved
for multiple variables, how they are assumed to
be related.

15
PARAMETRIC AND NONPARAMETRIC ANALYSIS

The process of making inferences from the sample
to the population's parameters is called
parametric analysis.
Parametric methods rely almost exclusively on
either interval or ratio scaled data.
Nonparametric methods may be used to compare
entire distributions that are based on nominal
data.
Other nonparametric methods use an ordinal
measurement scale and test for the ordering of
observations in the data set.
Problems that may be solved with parametric
methods may often be solved by a nonparametric
method designed to address a similar question.

16
Univariate Analyses of Parametric Data

Marketing researchers are often concerned with
estimating parameters of a population. In
addition, many studies go beyond estimation and
compare population parameters by testing
hypotheses about differences between them.
Very often, the means, proportions, and variances
are the summary measures of concern.

The Confidence Interval
The confidence interval is a range of values with
a given probability (.95, .99, etc.) of including
the true population parameter.
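
As a rough illustration of the confidence interval, the sketch below builds a large-sample interval around a sample mean. It reuses the mean (0.89) and sample standard deviation (1.43) from Exhibit 15.1, but the sample size of 100 is a hypothetical value because the transcript does not report one. It also assumes the sample is large enough for the normal approximation, with s standing in for the population standard deviation.

```python
from statistics import NormalDist

def mean_confidence_interval(x_bar, s, n, confidence=0.95):
    """Large-sample confidence interval for a mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for .95
    margin = z * s / n ** 0.5                       # z times the standard error
    return x_bar - margin, x_bar + margin

# x_bar and s are from Exhibit 15.1; n = 100 is an assumed sample size
low, high = mean_confidence_interval(x_bar=0.89, s=1.43, n=100)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the whole interval lies above 0, the "About Average" code, which is the kind of evidence the Mingles research question is looking for.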
17
Univariate Hypothesis Testing of Means

Population Variance Is Known
The z statistic describes probabilities of the
normal distribution and is the appropriate tool
to test the difference between µ, the mean of the
sampling distribution, and x̄, the sample mean,
when the population variance is known.
The z statistic may be used only when the
following conditions are met

Individual items in the sample must be drawn in
a random manner.
The population must be normally distributed. If
this is not the case, the sample must be large
(n > 30), so that the sampling distribution is
approximately normally distributed.
The data must be at least interval scaled.
The variance of the population must be known.

18
Population Variance Is Known

The traditional hypothesis testing approach is as
follows

The null hypothesis (H0) is specified that there
is no difference between µ and x̄. Any observed
difference is due solely to sample variation.
The alpha risk (Type I error) is established
(usually .05).
The z value is calculated by the appropriate z
formula.
The probability of the observed difference having
occurred by chance is determined from a table of
the normal distribution (Appendix A, Table A.1).
If the probability of the observed differences
having occurred by chance is greater than the
alpha used, then H0 cannot be rejected and it is
concluded that the sample mean is drawn from a
sampling distribution of the population having
mean µ.
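
The traditional approach above can be sketched in a few lines. This is an illustrative example rather than the presentation's own worked problem: the sample mean, population mean, population standard deviation, and sample size are hypothetical, and the normal table lookup is replaced by the standard library's `NormalDist`. A two-tailed test is assumed.

```python
from statistics import NormalDist

def z_test_mean(x_bar, mu, sigma, n, alpha=0.05):
    """Two-tailed z test of H0: the sample comes from a population with mean mu."""
    z = (x_bar - mu) / (sigma / n ** 0.5)         # z value from the z formula
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # chance probability of the difference
    reject_h0 = p_value < alpha                   # compare with the alpha risk
    return z, p_value, reject_h0

# Hypothetical numbers: sample mean 52, hypothesized mean 50, sigma 10, n = 100
z, p_value, reject_h0 = z_test_mean(x_bar=52.0, mu=50.0, sigma=10.0, n=100)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Here the probability of the observed difference arising by chance falls below the .05 alpha, so H0 would be rejected; with a smaller sample or a larger σ the same difference might not be significant.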

19
Univariate Hypothesis Testing of Means

Population Variance Is Unknown
Researchers rarely know the true variance of the
population, and must therefore rely on an
estimate of σ², namely, the sample variance s².
With this variance estimate, we may compute the t
statistic.
The appropriate t distribution to use in an
analysis is determined by the available degrees
of freedom. In univariate analyses, the available
degrees of freedom are n - 1.
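
A minimal sketch of the t statistic follows. The ten ratings are made-up values on the +2 to -2 scale, not data from the presentation. The critical value 2.262 is the standard tabled two-tailed .05 value for 9 degrees of freedom, entered by hand in the same way one would read it from a t table.

```python
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean = mu0."""
    n = len(sample)
    s = stdev(sample)                         # sample standard deviation (n - 1 divisor)
    t = (mean(sample) - mu0) / (s / n ** 0.5)
    return t, n - 1

ratings = [2, 1, 0, 1, 2, -1, 1, 0, 2, 1]     # hypothetical responses
t, df = one_sample_t(ratings, mu0=0)

T_CRITICAL = 2.262                            # t table: two-tailed .05, df = 9
reject_h0 = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

Because the critical value depends on the degrees of freedom, a different sample size requires looking up a different tabled value.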

20
Univariate Analysis of Categorical Data: The
Chi-Square Goodness of Fit Test

Chi-square analysis (χ²) can be used when the
data identify the number of times or frequency
that each category of a tabulation or
cross-tabulation appears. Chi-square is a useful
technique for the following

1. Determining the significance of sample
deviations from an assumed theoretical
distribution, that is, determining whether a
certain model fits the data. This is typically
called a goodness-of-fit test.
2. Determining the significance of the observed
associations found in the cross-tabulation of two
or more variables. This is typically called a
test of independence.

We compute a measure (chi-square) of the
variation between actual and theoretical
frequencies, under the null hypothesis that there
is no difference between the model and the
observed frequencies.
If the measure of variation is high, we reject
the null hypothesis at some specified alpha risk.
If the measure is low, we accept the null
hypothesis that the model's output is in
agreement with the actual frequencies. We may say
that the model fits the facts.
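
The measure of variation can be sketched as the usual sum of (observed - expected)² / expected over the categories. The counts below are hypothetical, and the model assumed for illustration is that 100 respondents are spread evenly across four categories. The critical value 7.815 is the standard tabled chi-square value for 3 degrees of freedom at an alpha of .05, entered by hand.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [30, 20, 25, 25]   # hypothetical actual frequencies
expected = [25, 25, 25, 25]   # theoretical frequencies under the assumed model

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1        # categories minus one

CHI2_CRITICAL = 7.815         # chi-square table: alpha = .05, df = 3
reject_h0 = chi2 > CHI2_CRITICAL
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
```

In this made-up case the measure is low relative to the critical value, so the null hypothesis is retained and the even-split model is said to fit the observed frequencies.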
21
Univariate Analysis: Test of a Proportion

The univariate test of proportions, like the
univariate test of means, compares the population
proportion to the proportion observed in the
sample. For a sample proportion, p,
z = (p - π) / s_p
where π here denotes the hypothesized population
proportion and s_p, the estimated standard error
of the proportion, is given by
s_p = sqrt(pq / n)

z = standard normal value
p = the sample proportion of successes
q = (1 - p), the sample proportion of failures
n = sample size
22
Summary

This chapter has introduced the basic concepts of
formulating hypotheses, hypothesis testing, and
making statistical inferences in the context of
univariate analysis.
A hypothesis is a statement that variables
(measured constructs) are related in a specific
way. The null hypothesis, H0, is a statement that
no relationship exists between the variables
tested or that there is no difference.
Statistics are based on making inferences from
the sample of respondents to the population of
all respondents by means of a sampling
distribution.
A Type I Error occurs when a true H0 is rejected
(there is no difference, but we find there is). A
Type II Error occurs when we accept a false H0
(there is a difference, but we find that none
exists).

23
Summary (cont.)

The power of a test was explained as the ability
to reject H0 when it should be rejected.
Selecting the appropriate statistical technique
for investigating a given relationship depends on
the level of measurement (nominal, ordinal, or
interval) and the number of variables to be
analyzed.
The choice of parametric versus nonparametric
analyses depends on the analyst's willingness to
accept the distributional assumptions of
normality and homogeneity of variances.
Univariate hypothesis testing was demonstrated
using the standard normal distribution
statistic (z) to compare a mean and proportion to
the population values.
The t-test was demonstrated as a parametric test
for populations of unknown variance and samples
of small size.
The chi-square goodness-of-fit test was
demonstrated as a nonparametric test of nominal
data that makes no distributional assumptions.