Hypothesis Testing And Univariate Analysis - PowerPoint PPT Presentation

Provided by: brianj80

1
Hypothesis Testing And Univariate Analysis
  • Chapter 15

2
Preface
  • For market researchers, scientific inquiry
    translates into a desire to ask questions about
    the nature of relationships that affect behavior
    within markets.
  • It also requires a willingness to formulate
    hypotheses capable of being tested to determine:
  • What relationships exist
  • When and where these relationships hold
  • This chapter will extend the research process to
    include the
  • Testing of relationships
  • Formulation of hypotheses
  • Making of inferences

3
Formulating Hypotheses
  • The objective of the study, with its associated
    hypotheses, should be stated as clearly as
    possible and agreed upon at the outset.
  • Objectives and hypotheses shape and mold the
    study; they determine:
  • The kinds of questions to be asked
  • The measurement scales for the data to be
    collected
  • The kinds of analysis that will be necessary
  • Actual research projects almost always formulate
    and test new hypotheses during the project. It is
    both acceptable and desirable.
  • A new hypothesis can be supported by the data,
    not supported by the data, or neither supported
    nor rejected by the data.

4
Formulating Hypotheses cont.
  • In a typical survey project, the analyst may
    alternate between searching the data and
    formulating hypotheses.
  • Three practices of survey analysts (Selvin and
    Stuart 1966):
  • Snooping: the process of searching through a
    body of data and looking at many relations in
    order to find those worth testing
  • Fishing: the process of using the data to choose
    which of a number of predesignated variables to
    include in an explanatory model
  • Hunting: the process of testing each of a
    predesignated set of hypotheses with the data
  • This investigative approach is reasonable for
    basic research but may not be practical for
    decisional research.

5
Formulating Hypotheses cont.
  • What is a hypothesis?
  • A hypothesis is an assertion that variables
    (measured concepts) are related in a specific way
    such that this relationship explains certain
    facts or phenomena.
  • Outcomes are predicted if a specific course of
    action is followed.
  • Hypotheses are often stated as research questions
    when reporting either the purpose of the
    investigation or the findings.
  • Hypotheses must be empirically testable.
  • Hypotheses may be stated informally as research
    questions, more formally as a set of alternative
    hypotheses, or in a testable form known as a null
    hypothesis, which states that there is no
    relationship between the variables to be
    examined.
  • Research questions are not empirically testable,
    but aid in the important task of directing and
    focusing the research effort.

6
EXHIBIT 15.1 Development of a Research Question
for Mingles
  • Mingles is an exclusive restaurant specializing
    in seafood prepared with a light Italian flair.
    Barbara C., the owner and manager, has attempted
    to create an airy contemporary atmosphere that is
    conducive to conversation and dining enjoyment.
    In the first three months, business has grown to
    about 70 percent of capacity during dinner hours.
  • Barbara developed a questionnaire that asked,
    among other things, "How would you rate the value
    of Mingles' food for the price paid?" The response
    form provided five answers with boxes for the
    respondent to check:
  • Much Better Than Expected (+2)
  • A Little Better Than Expected (+1)
  • About Average (0)
  • A Little Worse Than Expected (-1)
  • Much Worse Than Expected (-2)
  • Customers' responses were coded using the scale
    of +2, +1, 0, -1, and -2 shown above.
  • When tabulated, the average response was found
    to be 0.89 with a sample standard deviation of
    1.43. The research question asks if Mingles is
    perceived as being better than average when
    considering the price and value of the food.
  • The broader research question that Barbara C.
    has developed is "How satisfied are Mingles'
    customers with the concept, service, food, and
    value?"

7
Formulating Hypotheses cont.
  • Null hypotheses (H0) are statements identifying
    relationships that are statistically testable and
    can be shown not to hold (nullified).
  • The logic of the null hypothesis is that we
    hypothesize no difference, and we reject the
    hypothesis if a difference is found.
  • A null hypothesis may also be used to specify
    other types of relationships that are being
    tested, such as the difference between two
    groups, or the ability of a specific variable to
    predict a phenomenon such as sales or repeat
    business.
  • Alternative hypotheses may be considered to be
    the opposite of the null hypotheses.
  • The alternative hypothesis makes a formal
    statement of expected difference, and may state
    simply that a difference exists or that a
    directional difference exists, depending upon how
    the null hypothesis is stated.

8
Making Inferences
  • Once the data have been tabulated and summary
    measures calculated, we often will make
    inferences about the nature of the population and
    ask a multitude of questions.
  • For a product, we would want to know about the
    underlying associated variables that influence
    preference, purchase, or use (such as color, ease
    of opening, accuracy in dispensing the desired
    quantity, comfort in handling, etc.).
  • The broad objective of testing hypotheses
    underlies all decisional research. Sometimes the
    population as a whole can be measured and
    profiled in its entirety.
  • More often, we cannot measure everyone in the
    population and instead must estimate population
    values using a sample of respondents drawn from
    the population.

9
The Relationship Between a Population, a Sampling
Distribution, and a Sample
10
The Relationship Between the Sample and the
Sampling Distribution
11
Acceptable Error in Hypothesis Testing
  • A question that continually plagues analysts is,
    "What significance level should be used in
    hypothesis testing?"
  • The significance level refers to the amount of
    error we are willing to accept in our decisions
    that are based on the hypothesis test.
  • In hypothesis testing the sample results
    sometimes lead us to reject H0 when it is true.
    This is a Type I error.
  • On other occasions the sample findings may lead
    us to accept H0 when it is false. This is a Type
    II error.

12
Types of Error in Making a Wrong Decision
  • 1. A Type I error occurs when we incorrectly
    conclude that a difference exists. The
    probability of this is expressed as α, the
    probability that we will incorrectly reject H0,
    the null hypothesis, or hypothesis of no
    difference.
  • 2. A Type II error occurs when we accept a null
    hypothesis when it is in reality false (we find
    no difference when a difference really does
    exist).
  • 3. We correctly retain the null hypothesis (we
    could also say we tentatively accept it or that
    it could not be rejected). The probability of
    this is 1 − α, the area under the normal curve
    less the area occupied by α, the significance
    level.
  • 4. The power of the test is the ability to reject
    the null hypothesis when it should be rejected
    (when false). Because power increases as α
    becomes larger, researchers may choose an α of
    .10 or even .20 to increase power. Alternatively,
    sample size may be increased to increase power.
    Increasing sample size is the preferred option
    for most market researchers.

13
Power of a Test
  • The power of a hypothesis test is defined as
    1 − β, or one minus the probability of a Type II
    error. This means that the power of a test is its
    ability to reject the null hypothesis when it is
    false.
  • The power of a statistical test is determined by:
  • 1. The acceptable amount of discrepancy between
    the tested hypothesis and the true situation.
  • 2. The sample size. Increasing the sample size
    narrows the confidence interval and so increases
    power.
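
The effect of α and sample size on power can be sketched numerically. The example below is an illustration only, not part of the chapter: it assumes a one-sided (upper-tail) z test with known population standard deviation, and the function name and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(delta, sigma, n, alpha):
    """Power (1 - beta) of a one-sided, upper-tail z test.

    delta: true mean minus hypothesized mean (assumed positive)
    sigma: known population standard deviation
    n:     sample size
    alpha: significance level (probability of a Type I error)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection cutoff under H0
    shift = delta * sqrt(n) / sigma         # displacement of z when H0 is false
    beta = std_normal.cdf(z_crit - shift)   # P(fail to reject | H0 false)
    return 1 - beta


# Hypothetical inputs chosen only to show the direction of each effect.
base = z_test_power(delta=0.5, sigma=1.0, n=25, alpha=0.05)
larger_alpha = z_test_power(delta=0.5, sigma=1.0, n=25, alpha=0.10)
larger_n = z_test_power(delta=0.5, sigma=1.0, n=100, alpha=0.05)

print(f"alpha=.05, n=25:  power={base:.3f}")
print(f"alpha=.10, n=25:  power={larger_alpha:.3f}")
print(f"alpha=.05, n=100: power={larger_n:.3f}")
```

Under these assumptions, both a larger α and a larger sample raise power, but only the larger sample does so without increasing the risk of a Type I error.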

14
SELECTING TESTS OF STATISTICAL SIGNIFICANCE
  • Tests performed on interval or ratio data are
    known as parametric tests and include such
    techniques as the F, t, and z tests.
  • Nonparametric methods are often called
    distribution-free methods because the inferences
    are based on a test statistic whose sampling
    distribution does not depend upon the specific
    distribution of the population from which the
    sample is drawn.
  • Determining an appropriate test for a particular
    set of data depends on:
  • The level of measurement of the data
  • The number of variables that are involved
  • For multiple variables, how they are assumed to
    be related

15
PARAMETRIC AND NONPARAMETRIC ANALYSIS
  • The process of making inferences from the sample
    to the population's parameters is called
    parametric analysis.
  • Parametric methods rely almost exclusively on
    either interval or ratio scaled data.
  • Nonparametric methods may be used to compare
    entire distributions that are based on nominal
    data.
  • Other nonparametric methods use an ordinal
    measurement scale and test for the ordering of
    observations in the data set.
  • Problems that may be solved with parametric
    methods may often be solved by a nonparametric
    method designed to address a similar question.

16
Univariate Analyses of Parametric Data
  • Marketing researchers are often concerned with
    estimating parameters of a population. In
    addition, many studies go beyond estimation and
    compare population parameters by testing
    hypotheses about differences between them.
  • Very often, the means, proportions, and variances
    are the summary measures of concern.

  • The Confidence Interval: the confidence interval
    is a range of values with a given probability
    (.95, .99, etc.) of including the true population
    parameter.
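
A confidence interval for a mean can be sketched as follows. This is a minimal illustration that assumes the population standard deviation is known, so the standard normal distribution applies; the function name and the input values are hypothetical and do not come from the chapter.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for a mean, population sigma known."""
    # z value that leaves (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)  # z times the standard error of the mean
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical sample: mean 50, known sigma 10, n = 100.
low_95, high_95 = mean_confidence_interval(50.0, 10.0, 100, confidence=0.95)
low_99, high_99 = mean_confidence_interval(50.0, 10.0, 100, confidence=0.99)

print(f"95% interval: ({low_95:.2f}, {high_95:.2f})")
print(f"99% interval: ({low_99:.2f}, {high_99:.2f})")
```

Asking for a higher probability of covering the true parameter (.99 rather than .95) widens the interval for the same data.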
17
Univariate Hypothesis Testing of Means
  • Population Variance Is Known
  • The z statistic describes probabilities of the
    normal distribution and is the appropriate tool
    to test the difference between µ, the mean of the
    sampling distribution, and x̄, the sample mean,
    when the population variance is known.
  • The z statistic may be used only when the
    following conditions are met:
  • Individual items in the sample must be drawn in
    a random manner.
  • The population must be normally distributed. If
    this is not the case, the sample must be large
    (n > 30), so that the sampling distribution is
    normally distributed.
  • The data must be at least interval scaled.
  • The variance of the population must be known.

18
Population Variance Is Known
  • The traditional hypothesis testing approach is as
    follows
  • The null hypothesis (H0) is specified that there
    is no difference between µ and x̄. Any observed
    difference is due solely to sample variation.
  • The alpha risk (Type I error) is established
    (usually .05).
  • The z value is calculated by the appropriate z
    formula
  • The probability of the observed difference having
    occurred by chance is determined from a table of
    the normal distribution (Appendix A, Table A.1).
  • If the probability of the observed differences
    having occurred by chance is greater than the
    alpha used, then H0 cannot be rejected and it is
    concluded that the sample mean is drawn from a
    sampling distribution of the population having
    mean µ.
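
The steps above can be sketched in code. This is an illustration under stated assumptions: the test is taken to be two-sided, the z formula is taken to be z = (x̄ − µ) / (σ / √n), the normal table lookup is replaced by a library call, and the function name and input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z test of H0: the sample comes from a population with mean mu0."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Probability of a difference at least this large arising by chance
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject_h0 = p_value < alpha
    return z, p_value, reject_h0


# Hypothetical data: x-bar = 103, mu0 = 100, known sigma = 15, n = 100.
z, p_value, reject_h0 = one_sample_z_test(103.0, 100.0, 15.0, 100)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

With these made-up numbers the chance probability falls just below an alpha of .05, so H0 would be rejected; at an alpha of .01 it would not.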

19
Univariate Hypothesis Testing of Means
  • Population Variance Is Unknown
  • Researchers rarely know the true variance of the
    population, and must therefore rely on an
    estimate of σ², namely, the sample variance s².
    With this variance estimate, we may compute the t
    statistic.
  • The appropriate t distribution to use in an
    analysis is determined by the available degrees
    of freedom. In univariate analyses, the available
    degrees of freedom are n − 1.
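
As a sketch, the t statistic can be computed for the Mingles figures in Exhibit 15.1 (mean 0.89, sample standard deviation 1.43), testing against the scale midpoint of 0. The exhibit does not report the sample size, so n = 100 below is purely an assumed value for illustration, and the function name is hypothetical.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, sample_sd, n):
    """t statistic and degrees of freedom for a one-sample test of a mean."""
    standard_error = sample_sd / sqrt(n)  # uses s, the sample estimate of sigma
    t = (sample_mean - mu0) / standard_error
    df = n - 1                            # degrees of freedom in the univariate case
    return t, df


# Mean and standard deviation from Exhibit 15.1; n = 100 is an ASSUMPTION,
# since the exhibit does not state how many customers responded.
t, df = t_statistic(sample_mean=0.89, mu0=0.0, sample_sd=1.43, n=100)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The resulting t value would then be compared with the tabled t value for the chosen alpha and n − 1 degrees of freedom.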

20
Univariate Analysis of Categorical Data: The
Chi-Square Goodness of Fit Test
  • Chi-square analysis (χ²) can be used when the
    data identify the number of times, or frequency,
    that each category of a tabulation or
    cross-tabulation appears. Chi-square is a useful
    technique for the following:
  • 1. Determining the significance of sample
    deviations from an assumed theoretical
    distribution, that is, determining whether a
    certain model fits the data. This is typically
    called a goodness-of-fit test.
  • 2. Determining the significance of the observed
    associations found in the cross-tabulation of two
    or more variables. This is typically called a
    test of independence.
  • We compute a measure (chi-square) of the
    variation between actual and theoretical
    frequencies, under the null hypothesis that there
    is no difference between the model and the
    observed frequencies.
  • If the measure of variation is high, we reject
    the null hypothesis at some specified alpha risk.
    If the measure is low, we accept the null
    hypothesis that the model's output is in
    agreement with the actual frequencies; we may say
    that the model fits the facts.
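
A goodness-of-fit calculation can be sketched as below, using the usual statistic χ² = Σ (observed − expected)² / expected. The counts are invented for illustration, the model is assumed to predict equal frequencies in four categories, and 7.815 is the tabled critical value for 3 degrees of freedom at an alpha of .05.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical tabulation: 100 respondents across four categories,
# tested against a model that expects 25 in each.
observed = [30, 20, 25, 25]
expected = [25.0, 25.0, 25.0, 25.0]

chi_sq = chi_square_statistic(observed, expected)
df = len(observed) - 1   # categories minus one
critical_value = 7.815   # tabled chi-square value for df = 3, alpha = .05
reject_h0 = chi_sq > critical_value

print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject_h0}")
```

Here the measure of variation is low relative to the critical value, so the null hypothesis is retained and the equal-frequency model is said to fit.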
21
Univariate Analysis: Test of a Proportion
  • The univariate test of proportions, like the
    univariate test of means, compares the population
    proportion to the proportion observed in the
    sample. For a sample proportion, p,
    $z = \frac{p - \pi}{s_p}$
  • where $s_p$, the estimated standard error of the
    proportion, is given by
    $s_p = \sqrt{\frac{pq}{n}}$
  • z = standard normal value
  • π = the population proportion
  • p = the sample proportion of successes
  • q = (1 − p), the sample proportion of failures
  • n = sample size
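
The test of a proportion can be sketched as follows, taking z = (p − π0) / s_p with s_p = √(pq / n), where π0 stands for the hypothesized population proportion. The two-sided p-value, the function name, and the input values are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p, pi0, n):
    """z test comparing a sample proportion p with a hypothesized pi0."""
    q = 1 - p                # sample proportion of failures
    s_p = sqrt(p * q / n)    # estimated standard error of the proportion
    z = (p - pi0) / s_p
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided, by assumption
    return z, s_p, p_value


# Hypothetical survey: 60 of 100 respondents are "successes",
# tested against a hypothesized population proportion of .50.
z, s_p, p_value = proportion_z_test(p=0.6, pi0=0.5, n=100)
print(f"s_p = {s_p:.4f}, z = {z:.2f}, p = {p_value:.4f}")
```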
22
Summary
  • This chapter has introduced the basic concepts of
    formulating hypotheses, hypothesis testing, and
    making statistical inferences in the context of
    univariate analysis.
  • A hypothesis is a statement that variables
    (measured constructs) are related in a specific
    way. The null hypothesis, H0, is a statement that
    no relationship exists between the variables
    tested or that there is no difference.
  • Statistics are based on making inferences from
    the sample of respondents to the population of
    all respondents by means of a sampling
    distribution.
  • A Type I Error occurs when a true H0 is rejected
    (there is no difference, but we find there is). A
    Type II Error occurs when we accept a false H0
    (there is a difference, but we find that none
    exists).

23
Summary (cont.)
  • The power of a test was explained as the ability
    to reject H0 when it should be rejected.
  • Selecting the appropriate statistical technique
    for investigating a given relationship depends on
    the level of measurement (nominal, ordinal, or
    interval) and the number of variables to be
    analyzed.
  • The choice of parametric versus nonparametric
    analyses depends on the analyst's willingness to
    accept the distributional assumptions of
    normality and homogeneity of variances.
  • Univariate hypothesis testing was demonstrated
    using the standard normal distribution statistic
    (z) to compare a mean and a proportion to the
    population values.
  • The t-test was demonstrated as a parametric test
    for populations of unknown variance and samples
    of small size.
  • The chi-square goodness-of-fit test was
    demonstrated as a nonparametric test of nominal
    data that makes no distributional assumptions.