Title: Inferential Statistics: Hypothesis Development
1 Inferential Statistics: Hypothesis Development
and Testing
Prepared by Sebastian Thomas
2 Populations and Samples
- Before we discuss inferential statistics, we have
to understand the difference between a population
and a sample.
- Population: In statistical research, the
population represents the complete set of all the
elements being studied. A population can be of
any size, depending on what is being studied. For
example, a study on contraceptive practices in
Jamaica would have all the people in Jamaica as
its population, while a study on contraceptive
practices at UWI would only have UWI students as
its population.
- Sample: A sample is a subset of a population.
Samples are usually constructed in such a way
that they are representative of the population
they are taken from. Since most populations are
too large to study in their entirety, the idea is
that one could study a smaller yet representative
sample instead.
3 Inferential Statistics
A significant portion of quantitative research is
done with samples. Inferential statistics are
procedures that are used to make inferences or
judgments about a population based on data
collected from a representative sample of that
population.
4 Inferential Statistics
- There are a great number of inferential
statistical procedures, and selecting the
appropriate one depends on several factors. The
key factors are listed below.
- Research Goals
- Sample Size
- Level of Measurement of Data
- Data Distribution
5 Research Goals
For the purposes of this course, we will limit
our research goals to those associated with
explanatory research. The primary goal of this
type of research is to establish the nature of
the relationship between two or more variables.
For example, an explanatory research model might
seek to determine if there is a relationship
between a person's level of education and the
number of children that person has. Explanatory
research begins by formulating a hypothesis
(which is defined as a statement about how nature
operates) that leads us to predict a relationship
that the experiment or study should demonstrate.
For instance, we might hypothesize that the higher
a person's level of education, the fewer children
that person is likely to have.
6 Hypothesis
A hypothesis is a statement about the
relationship between two or more variables. A
hypothesis requires at least two variables: one
independent variable and one dependent variable.
Analysis based on a two-variable hypothesis is
called bivariate analysis while a hypothesis with
more than two variables would require
multivariate analysis. A typical explanatory
research project requires two types of
hypotheses, an experimental hypothesis and a set
of statistical hypotheses.
7 Experimental Hypothesis
When conducting an explanatory study, the
researcher should have a theory as to what the
nature of the relationship being studied will be.
For example, if a researcher is studying the
relationship between marital status and
happiness, the researcher may theorize that a
married person is more likely to be happy than an
unmarried person. The experimental hypothesis
describes the predicted outcome we may or may not
find in an experiment or study. So the
researcher's theory about the relationship
between marital status and happiness represents
the experimental hypothesis. The essence of
inferential statistics is to determine whether
the sample data support the researcher's theory
(experimental hypothesis) or not.
8 Statistical Hypotheses
There are two possible outcomes when testing an
experimental hypothesis. It could either be true
or false. These outcomes are represented in a
pair of statistical hypotheses. All inferential
procedures have a set of statistical hypotheses
associated with them and the purpose of each
procedure is to test which of the two is true.
The first of the two hypotheses is known as the
null hypothesis while the second is called the
alternative hypothesis. These are discussed in
the following sections.
9 Null Hypothesis (H0)
This is referred to as the hypothesis of no
difference or no relationship. It represents the
possibility that the predicted relationship does
not exist. In other words, it states that the
independent variable will either have no effect
or not the predicted effect on the dependent
variable. For the experimental hypothesis that
married people are more likely to be happy than
unmarried people, the H0 would read as follows:
H0: Married people are less happy than, or just as
happy as, unmarried people.
10 Alternative Hypothesis (Ha)
This hypothesis represents the possibility that
the predicted relationship is true. This
hypothesis is more or less identical to the
experimental hypothesis. For the experimental
hypothesis that married people are more likely to
be happy than unmarried people, the Ha would
read as follows:
Ha: Married people are happier than unmarried
people.
11 Directional vs. Nondirectional
The experimental hypothesis, and by extension the
statistical hypotheses, can be directional or
nondirectional. The hypothesis stating that
married people are more likely to be happy than
unmarried people is directional because it
asserts a direction for the relationship being
theorized about. An example of a nondirectional
hypothesis would be: There is a relationship
between marital status and happiness. In
statistics, nondirectional hypotheses are referred
to as two-tailed hypotheses, while directional
hypotheses are referred to as one-tailed
hypotheses.
12 Directional vs. Nondirectional
Experimental Hypothesis (Nondirectional): There is
a relationship between age and literacy level.
H0: There is no relationship between age and
literacy.
Ha: There is a relationship between age and
literacy.
Experimental Hypothesis (Directional): Younger
individuals have higher literacy levels than
older individuals.
H0: Younger individuals have lower or the same
literacy levels as older individuals.
Ha: Younger individuals have higher literacy
levels than older individuals.
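As a rough illustration of why the choice between a one-tailed and a two-tailed hypothesis matters, the Python sketch below compares the critical values for each. It is only a sketch under stated assumptions: the test statistic is assumed to follow the standard normal distribution (a z-test), the significance level is assumed to be .05 (alpha is introduced on a later slide), and the obtained value of 1.80 is made up for the example.

```python
from statistics import NormalDist

ALPHA = 0.05                 # assumed significance level
std_normal = NormalDist()    # standard normal: mean 0, standard deviation 1

# One-tailed (directional): all of alpha sits in one tail.
one_tailed_critical = std_normal.inv_cdf(1 - ALPHA)

# Two-tailed (nondirectional): alpha is split between both tails.
two_tailed_critical = std_normal.inv_cdf(1 - ALPHA / 2)

obtained = 1.80  # hypothetical obtained value, not from the slides

print(f"one-tailed critical value: {one_tailed_critical:.3f}")
print(f"two-tailed critical value: {two_tailed_critical:.3f}")
print("reject H0 (one-tailed):", obtained > one_tailed_critical)
print("reject H0 (two-tailed):", abs(obtained) > two_tailed_critical)
```

Under these assumptions the same obtained value leads to rejecting H0 with the directional hypothesis but not with the nondirectional one, because the two-tailed critical value is larger.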
13 Hypothesis Testing
Hypothesis testing can be defined as using
inferential procedures to make a decision between
accepting or rejecting the null hypothesis (H0).
The process can be summarised into eight (8)
steps that are applicable to all inferential
procedures.
1. Research Question: State the question of
interest.
2. Hypotheses: State the experimental and
statistical hypotheses.
3. Statistical Test: State which inferential
procedure to use and its assumptions.
4. Computation: Calculate the test statistic (the
obtained value).
14 Hypothesis Testing (Cont'd)
5. Critical Value: Find the critical value and
compare it with the obtained value.
6. Decision: Determine whether to accept or reject
the H0.
7. Results: Describe the outcome of the
statistical test.
8. Conclusion: Draw conclusions based on the
results with regard to the research question.
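The eight steps can be sketched in Python using the marital status and happiness example. This is a minimal illustration, not the procedure taught later in the course: it assumes a large-sample z-test for the difference between two means, an alpha of .05, and summary statistics that are entirely made up.

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-2: research question and hypotheses (directional, so one-tailed).
# H0: Married people are less happy than, or just as happy as, unmarried people.
# Ha: Married people are happier than unmarried people.

# Hypothetical summary statistics on a 0-10 happiness scale (made-up numbers).
married = {"mean": 7.1, "sd": 1.8, "n": 200}
unmarried = {"mean": 6.6, "sd": 2.0, "n": 200}
ALPHA = 0.05  # assumed significance level

# Steps 3-4: large-sample z-test for a difference between two means.
standard_error = sqrt(
    married["sd"] ** 2 / married["n"] + unmarried["sd"] ** 2 / unmarried["n"]
)
obtained = (married["mean"] - unmarried["mean"]) / standard_error

# Step 5: one-tailed critical value for the chosen alpha.
critical = NormalDist().inv_cdf(1 - ALPHA)

# Step 6: decision.
reject_h0 = obtained > critical

# Steps 7-8: results and conclusion.
print(f"obtained z = {obtained:.2f}, critical z = {critical:.2f}")
if reject_h0:
    print("Reject H0: the sample supports the claim that married people are happier.")
else:
    print("Accept H0: the sample does not support the claim.")
```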
15 Sampling Error
As has been stated before, inferential statistics
involve using a representative sample to make
judgments about a population. Let's say that we
wanted to determine the nature of the
relationship between area of residence and
poverty among the Jamaican population. We could
select a representative sample of, say, 10,000
Jamaicans to conduct our study. If we find that
there is a relationship between area of residence
and poverty in the sample, we could then
generalize this to the entire Jamaican
population. However, even the most
representative sample is not going to be exactly
the same as its population. 10,000 Jamaicans are
not going to be exactly the same as 2.5 million
Jamaicans. Given this, there is always a chance
that the things we find in a sample are anomalies
and do not occur in the population that the
sample represents. This error is referred to as
sampling error.
16 Sampling Error
A formal definition of sampling error is as
follows: Sampling error occurs when random chance
produces a sample statistic that is not equal to
the population parameter it represents. Due to
sampling error there is always a chance that we
are making a mistake when accepting or rejecting
our null hypothesis.
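The definition can be made concrete with a small simulation. The sketch below builds a hypothetical population of scores (made-up data with a mean near 50), draws five random samples of 100, and reports how far each sample mean (the statistic) falls from the population mean (the parameter). The sizes and the seed are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical population of 50,000 scores (made-up data).
population = [rng.gauss(50, 10) for _ in range(50_000)]
parameter = statistics.mean(population)  # population parameter

errors = []
for i in range(5):
    sample = rng.sample(population, k=100)   # simple random sample
    statistic = statistics.mean(sample)      # sample statistic
    errors.append(statistic - parameter)     # sampling error for this sample
    print(f"sample {i + 1}: mean = {statistic:.2f}, "
          f"sampling error = {statistic - parameter:+.2f}")

print(f"population mean = {parameter:.2f}")
```

Every sample is drawn at random from the same population, yet each one misses the parameter by a different amount. That gap is the sampling error.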
17 Type I and Type II errors
There are two possible types of errors that can
be made when making a statistical decision, i.e.
accepting or rejecting the null hypothesis
(H0).
- A Type I error occurs whenever the null
hypothesis is rejected even though it is true.
- A Type II error occurs whenever the null
hypothesis is accepted even though it is false.
So, using our earlier hypothesis that there is a
relationship between area of residence and
poverty, we would be committing a Type I error if
our sample shows that there is a relationship
when there isn't one in the population, and we
would be committing a Type II error if our sample
shows that there isn't a relationship when there
is one in the population.
18 Type I and Type II errors
There are four possible outcomes when making a
decision after completing a statistical
test. These can be illustrated by the sampling
error matrix.
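One way to lay out the four outcomes is the short Python sketch below, which follows the definitions of Type I and Type II errors given on the previous slide. The function name classify_decision and its arguments are hypothetical, chosen only for illustration.

```python
def classify_decision(h0_is_true, h0_rejected):
    """Label the outcome of a statistical decision about H0."""
    if h0_rejected:
        # Rejecting a true H0 is a Type I error.
        return "Type I error" if h0_is_true else "Correct decision"
    # Accepting a false H0 is a Type II error.
    return "Correct decision" if h0_is_true else "Type II error"


# Print the four cells of the matrix: reality (rows) by decision (columns).
for h0_is_true in (True, False):
    for h0_rejected in (True, False):
        reality = "H0 true " if h0_is_true else "H0 false"
        decision = "reject H0" if h0_rejected else "accept H0"
        print(f"{reality} | {decision} -> "
              f"{classify_decision(h0_is_true, h0_rejected)}")
```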
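19 Type I and Type II errors
The use of this matrix can be illustrated with
the following example:
H0: The man is innocent of murder.
Ha: The man is guilty of murder.
Convicting an innocent man (rejecting a true H0)
is a Type I error, while acquitting a guilty man
(accepting a false H0) is a Type II error.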
20 Statistical Significance
In doing a statistical test, the researcher must
decide which of the two errors he/she would most
like to avoid. Once this has been decided, the
researcher can then select the level of
significance, or alpha (α). You can think of
alpha as representing the amount of error
(specifically, the risk of a Type I error) a
researcher is willing to accept when making a
statistical decision. In social research there
are generally two standard alphas to choose
between: .05 or .01. At .05 the
researcher will conclude that a test is
statistically significant, or that a relationship
exists between the variables, if he/she is 95%
confident, while at .01 the researcher will only
do so if he/she is 99% confident.
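A simulation can show what alpha means in practice. The sketch below runs many hypothetical studies in which H0 is true by construction (both groups come from the same population) and counts how often a two-tailed z-test rejects it anyway. The function name, the number of studies, the group size, and the assumption of a known standard deviation of 1 are all illustrative choices, not part of the course material.

```python
import random
from statistics import NormalDist, mean


def type_i_error_rate(alpha, n_studies=2000, n=30, seed=1):
    """Share of simulated studies that reject H0 although H0 is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical z
    rejections = 0
    for _ in range(n_studies):
        # Both groups come from the same population, so H0 is true.
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]
        # z statistic for a difference in means with known sigma = 1.
        z = (mean(group_a) - mean(group_b)) / (2 / n) ** 0.5
        if abs(z) > critical:
            rejections += 1  # H0 rejected although true: a Type I error
    return rejections / n_studies


rate_05 = type_i_error_rate(0.05)
rate_01 = type_i_error_rate(0.01)
print(f"alpha = .05 -> Type I error rate of about {rate_05:.3f}")
print(f"alpha = .01 -> Type I error rate of about {rate_01:.3f}")
```

The observed rates should land close to the chosen alphas, so a smaller alpha means fewer false claims that a relationship exists.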
21 Inferential Statistics
Inferential statistics can be categorised into 2
types: parametric and nonparametric.
- Parametric Statistics: Statistical procedures
that make certain assumptions about the
population represented by the sample data.
- Nonparametric Statistics: Statistical procedures
that do not require certain assumptions about the
population represented by the sample data.
22 Parametric Statistics
- There are two assumptions common to all
parametric procedures:
- The population of raw scores forms a normal
distribution.
- The dependent scores are interval or ratio
scores.
- In other words, the data must be normally
distributed and the level of measurement of the
dependent variable must be interval or ratio.
23 Types of Parametric Statistics
- There are several types of parametric statistics;
for this course we will be interested in four (4):
- Independent Samples T-Test
- Analysis of Variance
- Pearson's Correlation
- Simple Linear Regression
- We will be looking at each of these procedures in
greater detail later on in the course.
24 Nonparametric Statistics
These procedures are usually used with nominal or
ordinal dependent scores or with a skewed
distribution of interval or ratio scores (when
the data are most appropriately described by the
median or the mode). Nonparametric procedures
are generally considered to be less powerful than
their parametric counterparts.
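To see why a skewed distribution is better described by the median than by the mean, consider the small sketch below. The income figures are made up for illustration, with one extreme value creating the skew.

```python
import statistics

# Hypothetical monthly incomes in thousands of dollars (made-up data).
# One very large value skews the distribution to the right.
incomes = [20, 22, 25, 27, 30, 31, 35, 40, 45, 400]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"mean   = {mean_income}")    # pulled upward by the extreme value
print(f"median = {median_income}")  # closer to a typical income
```

Here the mean (67.5) is larger than nine of the ten incomes, while the median (30.5) sits in the middle of the data. This is the kind of situation in which a nonparametric procedure may be the more suitable choice.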
25 Types of Nonparametric Statistics
- The nonparametric statistics we will be covering
in this course are as follows:
- Chi-Square
- Mann-Whitney U test
- Kruskal-Wallis H test
26 Practice Problem
- Using the following set of statistical
hypotheses, complete the sampling error matrix,
first using the actual terms (Type I error, Type
II error, correct decision, correct decision) and
then in words with respect to the given null
hypothesis.
- H0: Women are not smarter than men.
- Ha: Women are smarter than men.