Title: Validity and Reliability
1. Validity and Reliability
2. Instrument Concerns
- Validity: Does the instrument measure what it is supposed to measure?
- Reliability: Does the instrument consistently yield the same results?
3. Validity
- There are three major categories of validity:
- Content (also called Face Validity)
- Criterion-Related
  - Concurrent
  - Predictive
- Construct
4. Content Validity
- Researchers are most concerned with Content (or Face) Validity: does the instrument really measure what it is supposed to measure?
- This is similar to asking whether a test in a class really covers the content that has been taught.
- Content validity is determined by having a panel of experts in the field examine the instrument and compare it to the research objectives.
- THERE IS NO STATISTICAL TEST TO DETERMINE CONTENT VALIDITY.
5. Concurrent Validity
- Does the instrument yield similar results to other recognized instruments or tests?
- Example: If I developed an instrument to identify quality FFA chapters, and most of the chapters I identified were recognized as 3-star FFA chapters by the National FFA Organization, my instrument has concurrent validity.
6. Predictive Validity
- Does the instrument predict how well subjects will perform at a later date?
- Example: I developed an instrument that I believed would identify freshmen in high school who would go to college. If, four years later, the students I identified did go to college, then my instrument has predictive validity.
7. Construct Validity
- Does the instrument really measure some abstract or mental concept such as creativity, common sense, loyalty, or honesty?
- This is very difficult to establish, so we are seldom concerned with construct validity in educational research.
8. Reliability
- An instrument must first be valid; then it must be reliable.
- It must measure accurately and consistently each time it is used.
- If I had a bathroom scale and stepped on it three times in a row, getting drastically different weights each time, the scale would not be reliable. Some poorly designed survey instruments behave in the same manner.
9. Determining Instrument Reliability
- Test-Retest: administer the instrument, administer it again to the same group later, then correlate the two sets of scores. (I really don't like this technique; I think it is impractical and too time consuming.)
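Computing the test-retest correlation itself is straightforward. The sketch below uses Python rather than SPSS, with hypothetical scores for eight subjects; the Pearson correlation between the two administrations serves as the reliability estimate.

```python
# Test-retest reliability: administer the same instrument twice to the
# same group and correlate the two sets of scores.
# All scores below are hypothetical.

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

first = [72, 85, 90, 64, 78, 88, 70, 95]    # first administration
second = [70, 88, 92, 66, 75, 85, 73, 94]   # same group, weeks later
print(round(pearson_r(first, second), 2))   # high r = consistent scores
```

A coefficient near 1 indicates the instrument ranked the same people the same way both times.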
10. Determining Instrument Reliability
- Equivalent Forms: develop two forms of the instrument, administer Form A to a group and Form B later, then correlate the two sets of scores. (I really don't like this technique; I think it is impractical and too time consuming.)
11. Determining Instrument Reliability
- Split-Halves: divide the instrument in half and calculate a correlation between the halves; there should be a high correlation. This determines internal consistency, which is generally regarded as being synonymous with reliability.
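A split-half analysis can be sketched in a few lines of Python (this is an illustration, not the SPSS procedure mentioned later). The 0/1 item scores below are hypothetical; an odd/even split is used, and the Spearman-Brown formula corrects the half-test correlation up to full test length.

```python
# Split-half reliability: correlate odd-item and even-item half scores,
# then apply the Spearman-Brown correction for the full-length test.
# The 0/1 item scores below are hypothetical (1 = correct answer).

def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# rows = respondents, columns = six test items
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 1],
]
odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5
even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6
r_half = pearson_r(odd_half, even_half)
r_full = 2 * r_half / (1 + r_half)  # Spearman-Brown correction
print(round(r_half, 2), round(r_full, 2))
```

Note that the corrected coefficient is always higher than the raw half-test correlation, because each half is only half as long as the real instrument.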
12. Determining Instrument Reliability
- Statistical calculations: there are several statistical procedures for determining internal consistency.
- Cronbach's Alpha
- Kuder-Richardson 20 or 21
- Reliability coefficients range from 0 to 1. The higher the coefficient, the more reliable the instrument. Aim for .70 or higher.
Note: Technically there is a difference between reliability and internal consistency, but for all practical purposes they are interchangeable.
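Cronbach's alpha can also be computed directly from the item scores. The sketch below is a minimal Python version of the standard formula (software such as SPSS does the same calculation); the five respondents and four Likert-type items are hypothetical.

```python
# Cronbach's alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
# Hypothetical data: rows = respondents, columns = items (5-point Likert).

def variance(scores):
    """Population variance of a list of scores."""
    m = sum(scores) / len(scores)
    return sum((s - m) ** 2 for s in scores) / len(scores)

def cronbach_alpha(data):
    """data: list of respondent rows, one score per item."""
    items = list(zip(*data))              # one column per item
    k = len(items)
    totals = [sum(row) for row in data]   # each respondent's total score
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

data = [
    [4, 5, 4, 4],
    [3, 3, 3, 4],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(data)
print(round(alpha, 2))  # well above the .70 rule of thumb for this data
```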
13. SPSS
- We typically use a statistical software program, SPSS (Statistical Package for the Social Sciences), to determine the internal consistency of an instrument. NCSU students can download it for free from the university software site.
14. Increasing Instrument Reliability
- In addition to doing a good job of designing an instrument, two factors affect its reliability:
- Number of questions: the more questions, the more reliable the instrument (but more questions can reduce response rates).
- Number of people completing the instrument: the more who take it, the more stable the reliability estimate will be.
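The effect of adding questions can be quantified with the Spearman-Brown prophecy formula, which predicts the reliability of a lengthened test from the reliability of the shorter one. A minimal sketch, with a hypothetical starting reliability of .60:

```python
# Spearman-Brown prophecy formula: predicted reliability when a test
# is lengthened by `factor` using similar items.
def spearman_brown(r, factor):
    return factor * r / (1 + (factor - 1) * r)

r = 0.60  # hypothetical reliability of the original instrument
print(round(spearman_brown(r, 2), 2))  # doubled length -> 0.75
print(round(spearman_brown(r, 3), 2))  # tripled length -> 0.82
```

Doubling the number of similar questions lifts this hypothetical instrument past the .70 target, which is exactly the "more questions, more reliable" point above.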
15. Application
- Which two images depict reliability?
- Which image shows both reliability and validity?
- Which image shows neither reliability nor validity?
16. Internal Validity
- The previous slides about reliability and validity refer to instrument development. There is another type of validity that researchers are concerned with, and it involves experimental and quasi-experimental research. This other validity is known as internal validity.
17. What is Internal Validity?
- If we are sure the treatment made the difference in an experiment, then we believe the study is internally valid.
- In other words, the independent variable made the difference, not some other variable.
18. Common Threats to Internal Validity
- History
- Maturation
- Testing
- Instrumentation
- Statistical Regression
- Experimental Mortality
- Selection
19. History
- The specific events which occur between the first and second measurement.
- External events, not the treatment, might influence the outcome of the study.
- Assume I developed a teaching unit to make people more accepting of other cultures, started using the unit on Sept. 1, 2001, and ended it on Sept. 30, 2001.
- To my great surprise, my students were less accepting of other cultures after completing the unit. Why? Is my teaching unit no good?
- The 9/11 terrorist attacks occurred during this time period and probably had a major impact on my students' attitudes regarding other cultures.
20. Maturation
- The processes within subjects which act as a function of the passage of time; i.e., if the project lasts a few years, most participants may improve their performance regardless of treatment.
- I develop a liquid that children can drink once a month.
- This liquid is designed to help potty train children.
- If they start drinking it at the age of 1, 98% of the children are potty trained by the age of 3.
- Did my treatment make the difference, or do children naturally become potty trained during this time period?
21. Testing
- When you have a pretest and a posttest, students may remember some of the questions from the pretest and do better on the posttest because of this.
- If the time between the pretest and posttest is short, this can be a problem.
22. Instrumentation
- To prevent the testing threat to internal validity, some researchers will develop another version of the test:
- Version A is used as the pretest.
- Version B is used as the posttest.
- It is possible that the two versions are not of the same difficulty and differ in other ways as well.
- If we use interviewers or observers to collect data, they may become tired after a while and report things differently than they did when they were fresh.
23. Statistical Regression
- Also known as regression to the mean.
- This threat is caused by selecting subjects on the basis of extreme scores or characteristics.
- People who score unusually high or low on a measure typically have more typical scores on subsequent tests.
- Give me the 20 students with the lowest reading scores, and I guarantee they will show immediate improvement right after my reading treatment.
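This threat can be demonstrated with a small simulation: give the same noisy test twice with no treatment at all, select the lowest scorers on the first test, and their average "improves" on the retest. The numbers below are simulated, not real data.

```python
import random

random.seed(1)  # fixed seed so the simulated data is reproducible

# Each subject has a fixed true ability; each test adds independent noise.
true_ability = [random.gauss(70, 10) for _ in range(1000)]
test1 = [t + random.gauss(0, 15) for t in true_ability]
test2 = [t + random.gauss(0, 15) for t in true_ability]  # no treatment given

# Select the 20 lowest scorers on the first test (extreme scores).
lowest = sorted(range(1000), key=lambda i: test1[i])[:20]
before = sum(test1[i] for i in lowest) / 20
after = sum(test2[i] for i in lowest) / 20
print(round(before, 1), round(after, 1))  # group mean rises with no treatment
```

The bottom scorers were selected partly because they had bad luck (large negative noise) on the first test; that luck does not repeat, so their retest mean moves back toward 70 on its own.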
24. Experimental Mortality
- The loss of subjects during the study.
- Those who stay in a study all the way to the end may be more motivated to learn and thus achieve higher performance.
25. Selection
- If you do not randomly assign subjects to the experimental group and the control group, you may be introducing a bias.
- There are two groups of senior ag students: one meets 1st period and one meets 4th period.
- One group gets the treatment; the other group is the control group.
- Advanced math is taught in this school during 1st period, so all the stronger students, including the ag seniors, are in the 1st-period class.
- So by comparing the scores of the 1st- and 4th-period students, you are really comparing stronger students with average students.
26. John Henry or Hawthorne Effect
- People know they are in a study, so they try extra hard to do well.
27. Controlling These Threats
- Typically these threats to internal validity can be controlled by:
- Having a control group
- Randomly assigning people to groups and randomly assigning treatments to groups
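Random assignment itself is easy to carry out in software. A minimal sketch (the 20 subject IDs are hypothetical): shuffle the subject list, then split it into treatment and control halves.

```python
import random

random.seed(42)  # fixed seed so this example is reproducible

subjects = ["S{:02d}".format(i) for i in range(1, 21)]  # 20 hypothetical subjects
random.shuffle(subjects)     # put subjects in random order
treatment = subjects[:10]    # first half -> treatment group
control = subjects[10:]      # second half -> control group
print(treatment)
print(control)
```

Because assignment depends only on the shuffle, pre-existing differences (like which class period a student attends) are spread across both groups by chance rather than by schedule.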
28. Background Information
- Campbell and Stanley are the researchers who
identified and described the threats to the
internal validity of research.