Business Research Methods

Slides: 41

Provided by: dcom81

Transcript and Presenter's Notes

1
Business Research Methods

Measurement and Scaling
Noncomparative Scaling Techniques

2
Noncomparative Scaling Techniques

Respondents evaluate only one object at a time,
and for this reason noncomparative scales are
often referred to as monadic scales.
Noncomparative techniques consist of continuous
and itemized rating scales.

3
Continuous Rating Scale

Respondents rate the objects by placing a mark at
the appropriate position
on a line that runs from one extreme of the
criterion variable to the other.
The form of the continuous scale may vary
considerably.
How would you rate Sears as a department store?

Version 1
Probably the worst - - - - - - - I - - - - - - - - - - - - - - - - Probably the best

Version 2
Probably the worst - - - - - - - I - - - - - - - - - - - - - - - - Probably the best
                   0   10   20   30   40   50   60   70   80   90   100

Version 3
                   Very bad       Neither good nor bad       Very good
Probably the worst - - - - - - - I - - - - - - - - - - - - - - - - Probably the best
                   0   10   20   30   40   50   60   70   80   90   100

4
Itemized Rating Scales

The respondents are provided with a scale that
has a number or brief description associated with
each category.
The categories are ordered in terms of scale
position, and the respondents are required to
select the specified category that best describes
the object being rated.
The commonly used itemized rating scales are the
Likert, semantic differential, and Stapel scales.

5
Likert Scale

The Likert scale requires the respondents to
indicate a degree of agreement or
disagreement with each of a series of statements
about the stimulus objects.
                                          Strongly  Disagree  Neither    Agree  Strongly
                                          disagree            agree nor         agree
                                                              disagree
1. Sears sells high quality merchandise.     1         2X        3         4       5
2. Sears has poor in-store service.          1         2X        3         4       5
3. I like to shop at Sears.                  1         2         3X        4       5
The analysis can be conducted on an item-by-item
basis (profile analysis), or a total (summated)
score can be calculated.
When arriving at a total score, the categories
assigned to the negative statements by the
respondents should be scored by reversing the
scale.
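
The reverse-scoring rule can be sketched in a few lines of Python. This is only an illustration, assuming a 5-point scale and using the three Sears statements with the responses marked on the slide (2, 2, 3); the function name `summated_score` and the `negative` flags are hypothetical, not part of the presentation.

```python
# Minimal sketch: summated Likert score with reverse scoring of negative items.
# Assumes a 5-point scale (1 = strongly disagree, 5 = strongly agree).

SCALE_MAX = 5

def summated_score(responses, negative):
    """Sum the item scores, reversing the scale for negatively worded items."""
    total = 0
    for score, is_negative in zip(responses, negative):
        # Reversal maps 1->5, 2->4, 3->3, 4->2, 5->1.
        total += (SCALE_MAX + 1 - score) if is_negative else score
    return total

# Responses marked with X on the slide: items 1 to 3 scored 2, 2, 3.
responses = [2, 2, 3]
# Item 2 ("poor in-store service") is the negative statement.
negative = [False, True, False]

print(summated_score(responses, negative))  # 2 + 4 + 3 = 9
```

Without the reversal, disagreeing with a negative statement would lower the total even though it expresses a favorable attitude.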

6
Semantic Differential Scale

The semantic differential is a seven-point rating
scale with end
points associated with bipolar labels that have
semantic meaning.
SEARS IS
Powerful ---------X----- Weak
Unreliable -----------X--- Reliable
Modern -------------X- Old-fashioned
The negative adjective or phrase sometimes
appears at the left side of the scale and
sometimes at the right.
This controls the tendency of some respondents,
particularly those with very positive or very
negative attitudes, to mark the right- or
left-hand sides without reading the labels.
Individual items on a semantic differential scale
may be scored on either a -3 to +3 or a 1 to 7
scale.
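
The two scoring conventions, and the effect of alternating the side on which the negative adjective appears, can be sketched as follows. This is an illustrative sketch only: positions are assumed to be counted 1 to 7 from the left, and the helper names `to_signed` and `favorable_score` are hypothetical.

```python
# Minimal sketch: scoring a seven-point semantic differential item.
# Assumes the marked position is counted 1..7 from the left end of the scale.

def to_signed(position, points=7):
    """Convert a 1..7 score to the equivalent -3..+3 score."""
    midpoint = (points + 1) // 2  # 4 on a seven-point scale
    return position - midpoint

def favorable_score(position, positive_on_left, points=7):
    """Score an item so that a higher value always means a more favorable rating.

    When the positive adjective is printed on the left, the raw position
    is reversed; when it is on the right, the raw position is kept.
    """
    if positive_on_left:
        return points + 1 - position
    return position

# A mark in the second position from the left on "Powerful ... Weak"
# (positive label on the left) is a favorable rating:
print(favorable_score(2, positive_on_left=True))             # 6
print(to_signed(favorable_score(2, positive_on_left=True)))  # 2
```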

7
A Semantic Differential Scale for Measuring Self-Concepts,
Person Concepts, and Product Concepts

1) Rugged --------------------- Delicate
2) Excitable --------------------- Calm
3) Uncomfortable --------------------- Comfortable
4) Dominating --------------------- Submissive
5) Thrifty --------------------- Indulgent
6) Pleasant --------------------- Unpleasant
7) Contemporary --------------------- Obsolete
8) Organized --------------------- Unorganized
9) Rational --------------------- Emotional
10) Youthful --------------------- Mature
11) Formal --------------------- Informal
12) Orthodox --------------------- Liberal
13) Complex --------------------- Simple
14) Colorless --------------------- Colorful
15) Modest --------------------- Vain

8
Stapel Scale

The Stapel scale is a unipolar rating scale with
ten categories numbered from -5 to +5, without a
neutral point (zero). This scale is usually
presented vertically.

              SEARS
     +5                +5
     +4                +4
     +3                +3
     +2                +2X
     +1                +1
HIGH QUALITY      POOR SERVICE
     -1                -1
     -2                -2
     -3                -3
     -4X               -4
     -5                -5

The data obtained by using a Stapel scale can be
analyzed in the

9
Basic Noncomparative Scales

Scale: Continuous Rating Scale
Basic characteristics: Place a mark on a continuous line
Examples: Reaction to TV commercials
Advantages: Easy to construct
Disadvantages: Scoring can be cumbersome unless computerized

Itemized Rating Scales

Scale: Likert Scale
Basic characteristics: Degrees of agreement on a 1 (strongly disagree) to 5 (strongly agree) scale
Examples: Measurement of attitudes
Advantages: Easy to construct, administer, and understand
Disadvantages: More time-consuming

Scale: Semantic Differential
Basic characteristics: Seven-point scale with bipolar labels
Examples: Brand, product, and company images
Advantages: Versatile
Disadvantages: Controversy as to whether the data are interval

Scale: Stapel Scale
Basic characteristics: Unipolar ten-point scale, -5 to +5, without a neutral point (zero)
Examples: Measurement of attitudes and images
Advantages: Easy to construct, administer over telephone
Disadvantages: Confusing and difficult to apply

10
Summary of Itemized Scale Decisions

1) Number of categories: Although there
is no single, optimal number, traditional
guidelines suggest that there should be
between five and nine categories.
2) Balanced vs. unbalanced: In general, the
scale should be balanced to obtain objective
data.
3) Odd/even no. of categories: If a neutral or
indifferent scale response is possible from
at least some of the respondents, an odd
number of categories should be used.
4) Forced vs. non-forced: In situations where the
respondents are expected to have no opinion,
the accuracy of the data may be improved by a
non-forced scale.
5) Verbal description: An argument can be made
for labeling all or many scale categories.
The category descriptions
should be located as close to the response
categories as possible.
6) Physical form: A number of options should be
tried and the best selected.

11
Balanced and Unbalanced Scales
Figure 9.1

Balanced Scale
Jovan Musk for Men is:
Extremely good
Very good
Good
Bad
Very bad
Extremely bad

Unbalanced Scale
Jovan Musk for Men is:
Extremely good
Very good
Good
Somewhat good
Bad
Very bad

12
Rating Scale Configurations
A variety of scale configurations may be
employed to measure the gentleness of Cheer
detergent. Some examples include:

Cheer detergent is:

1) Very harsh --- --- --- --- --- --- --- Very gentle

2) Very harsh 1 2 3 4 5 6 7 Very gentle

3) . Very harsh
   .
   .
   . Neither harsh nor gentle
   .
   .
   . Very gentle

4) ____ ____ ____ ____ ____ ____ ____
   Very harsh / Harsh / Somewhat harsh / Neither harsh nor gentle /
   Somewhat gentle / Gentle / Very gentle

5) -3   -2   -1   0   +1   +2   +3
   Very harsh (-3), Neither harsh nor gentle (0), Very gentle (+3)

Figure 9.2

13
Some Unique Rating Scale Configurations
Thermometer Scale
Instructions: Please indicate how much you like
McDonald's hamburgers by coloring in the
thermometer. Start at the bottom and color up to
the temperature level that best indicates how
strong your preference is.
Form: [thermometer marked 100, 75, 50, 25, 0, from
"Like very much" (100) to "Dislike very much" (0)]

Smiling Face Scale
Instructions: Please point to the face that shows
how much you like the Barbie Doll. If you do not
like the Barbie Doll at all, you would point to
Face 1. If you liked it very much, you would
point to Face 5.
Form: [five faces numbered 1 2 3 4 5]

Figure 9.3

14
Thurstone Scale

It is a two-stage procedure.
In the first stage, the researcher selects 80 to 100
items indicating different degrees of favourable
attitude towards the concept under study.
These are given to a group of judges, who sort them
into favourable and unfavourable categories, keeping
equal intervals between categories.
All items on which the judges reach consensus are
selected and distributed uniformly on a scale of
favourability.
This scale is then administered to respondents to
measure their attitude towards a particular
concept.
It is time consuming and costly, and is rarely used
in applied business research.
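
The judging stage can be sketched roughly as follows. The slides do not say how consensus is measured, so this sketch assumes it means a small spread in the judges' category placements (population standard deviation below a made-up cut-off) and takes the median placement as the item's scale value. The ratings, the eleven-category layout, and the names `judge_ratings`, `MAX_SPREAD`, and `select_items` are all hypothetical.

```python
import statistics

# Hypothetical data: five judges place each item into one of eleven equally
# spaced categories (1 = least favourable, 11 = most favourable).
judge_ratings = {
    "A": [2, 2, 3, 2, 2],
    "B": [1, 6, 10, 3, 8],   # judges disagree widely
    "C": [9, 10, 9, 9, 10],
    "D": [5, 6, 6, 5, 6],
}

MAX_SPREAD = 1.0  # assumed cut-off for "consensus"; not from the slides

def select_items(ratings, max_spread=MAX_SPREAD):
    """Keep items the judges agree on; use the median placement as scale value."""
    selected = {}
    for item, scores in ratings.items():
        if statistics.pstdev(scores) <= max_spread:
            selected[item] = statistics.median(scores)
    return selected

scale_values = select_items(judge_ratings)
# Items ordered along the favourability scale:
print(sorted(scale_values.items(), key=lambda pair: pair[1]))
# [('A', 2), ('D', 6), ('C', 9)]
```

Item B is dropped because the judges placed it anywhere from 1 to 10, so it cannot be assigned a stable position on the scale.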

15
Measurement Accuracy

The true score model provides a framework for
understanding the accuracy of measurement.
X_O = X_T + X_S + X_R
where
X_O = the observed score or measurement
X_T = the true score of the characteristic
X_S = systematic error
X_R = random error
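
A small simulation may help separate the two error terms. It is a sketch under stated assumptions: the true score, the size of the systematic error, and the normal distribution of the random error are all invented for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_SCORE = 4.0        # X_T: hypothetical true score
SYSTEMATIC_ERROR = 0.5  # X_S: constant bias, assumed for illustration

def observe():
    """Return one observed score: X_O = X_T + X_S + X_R."""
    random_error = random.gauss(0, 0.3)  # X_R: differs on every measurement
    return TRUE_SCORE + SYSTEMATIC_ERROR + random_error

observations = [observe() for _ in range(10_000)]
mean_observed = statistics.mean(observations)

# Random error averages out over many measurements; systematic error does not,
# so the mean observed score stays about X_S away from the true score.
print(round(mean_observed - TRUE_SCORE, 2))
```

The printed difference should be close to 0.5, the assumed systematic error, while the random component has largely cancelled.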

16
Systematic Error

Lack of clarity of the scale, including the
instructions or the items themselves.
Mechanical factors, such as poor printing,
overcrowding items in the questionnaire, and poor
design.

17
Random Error

Short-term or transient personal factors, such as
health, emotions, and fatigue.
Situational factors, such as the presence of
other people, noise, and distractions.

18
Criteria for evaluating measurement

The criteria for evaluating measurements are:
Reliability
Validity
Sensitivity
Generalizability
Relevance

19
Reliability

The degree to which measures are free from random
error and therefore yield consistent results
across time or situations.
Perfect reliability requires that there is no
random error:
X_R = 0

20
Validity

The ability of a scale to measure what was
intended to be measured.
Perfect validity requires that there is no
measurement error, either systematic or random:
X_R = 0, X_S = 0

21
Relationship between validity and reliability

If a measure is perfectly valid, it is also
perfectly reliable.
However, if a measure is perfectly reliable, it may
or may not be perfectly valid.
If a measure is unreliable, it will not be valid.
Reliability is a necessary but not a sufficient
condition for validity.

22
THE GOAL OF MEASUREMENT: VALIDITY and RELIABILITY
23
Reliability and Validity on Target
Old Rifle: Low Reliability, Low Validity (Target A)
New Rifle: High Reliability, High Validity (Target B)
New Rifle, Sunglare: Reliable but Not Valid (Target C)
24
RELIABILITY
Of index measures
Repeatability
25
Types of Reliability

There are two dimensions of reliability:
repeatability and internal consistency.
If the results of the research are the same even
when it is conducted a second or third time, this
confirms the repeatability aspect.
Test-Retest Method: An approach for assessing
reliability in which respondents are administered
identical sets of scale items at two different
times under as nearly equivalent conditions as
possible.
This measures repeatability, since the same scale
or measure is administered to the same set of
respondents at two separate points in time. If the
measure is stable over time, it should obtain
similar results (e.g., 40 satisfied with jobs both
times).
However, it is difficult to locate all respondents
for the second round, their attitudes may change
over time, or the first measure may sensitize the
respondents.
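
One common way to quantify "similar results" is to correlate the two administrations. The slides do not name a statistic, so the use of the Pearson correlation here is an assumption, and the scores below are made-up values for six respondents.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for the same six respondents at two points in time.
time_1 = [4, 2, 5, 3, 1, 4]
time_2 = [4, 3, 5, 3, 1, 5]

r = pearson_r(time_1, time_2)
print(round(r, 2))  # 0.94
```

A coefficient near 1 suggests the measure is stable over time; a low value points to random error, real attitude change, or sensitization from the first administration.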

26
Equivalent Forms Method

An approach to assessing reliability that requires
two equivalent forms of the scale to be constructed
and administered to the same respondents at two
different times.
However, it is difficult, time consuming, and
expensive to construct two equivalent forms of a
scale.

27
Internal Consistency

This measure of reliability focuses on the internal
consistency of the set of items forming the
scale.
It is used to assess the reliability of a summated
scale where several items are summed to form a
total score. Each item measures some aspect of
the construct, and the items should be consistent
in what they indicate about the characteristic.

28
Split half Method

Split-half method: A method of measuring
internal consistency reliability in which the
items constituting the scale are divided into two
halves and the resulting scores of the two halves
are correlated. High correlation indicates high
consistency.
However, the results will depend on how the scale
items are split.
Coefficient alpha (Cronbach's alpha): A measure of
internal consistency reliability that is the
average of all possible split-half coefficients
resulting from different splittings of the scale
items.
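
The slide defines coefficient alpha through split-half coefficients; in practice it is usually computed from item and total-score variances. The sketch below assumes that standard variance form, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), which is not stated on the slide. The response data and the function name `cronbach_alpha` are hypothetical.

```python
import statistics

def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_variances = [
        statistics.variance([row[i] for row in rows]) for i in range(k)
    ]
    # Variance of the summated (total) score across respondents.
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: five respondents, four items on a 5-point scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # 0.96
```

Unlike a single split-half correlation, this value does not depend on how the items happen to be divided.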

Some multi-item scales include several sets of
items measuring different dimensions of a
multidimensional construct. Since these
dimensions are independent, a measure of internal
consistency computed across dimensions would be
inappropriate, so internal consistency
reliability can be computed for each dimension.
Store image is a multidimensional construct that
includes
--- quality of goods
--- variety of goods
--- returns policy
--- service
--- price
--- location
--- layout
--- billing and credit policy

30
Face: Professional agreement that logically it
appears valid (subjective).
Content: Depends on established theories for
support (objective).
Criterion: Does it fit or correlate with other
similar measures/constructs? Body fat: caliper,
water displacement, electrical impedance, BMI.
Concurrent: Two measures, same time.
Predictive: Two measures at different times.
Construct: Confirmed with a network of
hypotheses. Convergent validity (high relationship
with similar concepts) and divergent or
discriminant validity (low relationship with
dissimilar concepts).
31
Face Validity

Face validity: Subjective agreement among
professionals that a scale logically appears to
accurately measure what it is intended to
measure. It is the weakest form of validity, since
it involves no analysis.
Face validity is concerned with how a measure or
procedure appears. Does it seem like a reasonable
way to gain the information the researchers are
attempting to obtain? Does it seem well designed?
Does it seem as though it will work reliably?
Unlike content validity, face validity does not
depend on established theories for support.

32
Content Validity

Content validity is based on the extent to
which a measurement reflects the specific
intended domain of content.
Suppose researchers aim to study mathematical
learning and create a survey to test for
mathematical skill. If these researchers only
tested for multiplication and then drew
conclusions from that survey, their study would
not show content validity, because it excludes
other mathematical functions.
To measure the adequacy of facilities in schools:
Attractiveness of the school name, frequency of
old students' meets, and eatables in the canteen
are not relevant variables.
Number of classrooms, number of qualified
teachers, playground, and library are relevant
variables.

33
Criterion-related Validity

Criterion-related validity, also referred to as
instrumental validity, is used to demonstrate the
accuracy of a measure or procedure by comparing
it with another measure or procedure which has
been demonstrated to be valid.
For example, imagine a hands-on driving test has
been shown to be an accurate test of driving
skills. By comparing the scores on the written
driving test with the scores from the hands-on
driving test, the written test can be validated
using a criterion-related strategy.
The new measure correlates with the criterion
measure.

34
Predictive Validity

Predictive validity: A type of criterion validity
whereby a new measure correlates with a criterion
measure administered at a later time.
In order for a test to be a valid screening
device for some future behaviour, it must have
predictive validity. The SAT is used by college
screening committees as one way to predict
college grades. The GMAT is used to predict
success in business. Such uses depend on
predictive validity.
We determine predictive validity by computing a
correlation coefficient comparing, for example,
SAT scores (the new/independent measure) and
college grades (the criterion/dependent measure).
If they are directly related, then we can make a
prediction regarding college grades based on SAT
score. We can show that students who score high
on the SAT tend to receive high grades in
college.
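
The step from "directly related" to "make a prediction" can be sketched with a least-squares line. This goes slightly beyond the correlation coefficient the slide mentions, and the SAT scores and grade point averages are invented for illustration; `fit_line` is a hypothetical helper.

```python
def fit_line(x, y):
    """Return (slope, intercept, r) for a least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # correlation coefficient
    return slope, intercept, r

# Hypothetical data: SAT score at admission, college GPA measured later.
sat = [1000, 1100, 1200, 1300, 1400]
gpa = [2.6, 2.9, 3.1, 3.4, 3.8]

slope, intercept, r = fit_line(sat, gpa)
predicted = intercept + slope * 1250  # predicted GPA for an SAT score of 1250

print(round(r, 3))          # 0.994
print(round(predicted, 2))  # about 3.3
```

The high correlation in this toy data is what would justify using the earlier measure to predict the later criterion; with a weak correlation the fitted line would predict little.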

35
Concurrent Validity

A type of criterion validity whereby a new
measure correlates with a criterion measure at
the same time.
A new test of adult intelligence, for example,
would have concurrent validity if it had a high
positive correlation with the Wechsler Adult
Intelligence Scale since the Wechsler is an
accepted measure of the construct we call
intelligence. An obvious concern relates to the
validity of the test against which you are
comparing your test.

36
Construct Validity

Construct validity seeks agreement between a
theoretical concept and a specific measuring
device or procedure. For example, a researcher
inventing a new IQ test might spend a great deal
of time attempting to "define" intelligence in
order to reach an acceptable level of construct
validity.
Construct validity can be broken down into two
sub-categories: convergent validity and
discriminant validity. Convergent validity is the
actual general agreement among ratings, where
measures should be theoretically related.
Discriminant validity is the lack of a
relationship among measures which theoretically
should not be related.

To measure the tendency to stay in low-cost hotels:
Four related personality variables: high level of
self-confidence, low need for status, low need for
distinctiveness, high level of adaptability.
Not related: brand loyalty, high level of
aggressiveness.
The scale can be said to have construct validity if
it correlates highly with other measures of the
tendency to stay in low-cost hotels, such as
reported hotels patronised and social class
(convergent), and has low correlation with the
unrelated constructs of brand loyalty and high
level of aggressiveness (divergent).

38
SENSITIVITY

A measurement instrument's ability to accurately
measure variability in stimuli or responses.
Two-category responses such as "yes or no" and
"agree or disagree" are not very sensitive.
"Strongly agree", "mildly agree", "indifferent",
"mildly disagree", and "strongly disagree" are
categories whose inclusion increases a scale's
sensitivity.

39
Generalizability

It is the degree to which a study based on a
sample applies to a universe of generalization.
The universe of generalization includes the set of
all conditions of measurement: items,
interviewers, modes of data collection, etc.
For example:
To generalize a scale developed for personal
interviews to other modes of data collection, such
as mail, telephone, etc.
To generalize from a sample of items to the
universe of items.

40
Relevance

It represents the appropriateness of using a
particular scale for measuring a variable.
Relevance = Reliability x Validity
If either reliability or validity is low, then the
scale will have little relevance.
If correlation coefficients are used to analyse
both reliability and validity, then the scale can
have relevance from 0 to 1.
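
As a quick worked illustration of the product rule, assuming both coefficients lie between 0 and 1 as the slide states (the numeric values and the function name `relevance` are hypothetical):

```python
def relevance(reliability, validity):
    """Relevance as the product of reliability and validity coefficients."""
    if not (0 <= reliability <= 1 and 0 <= validity <= 1):
        raise ValueError("coefficients are assumed to lie between 0 and 1")
    return reliability * validity

# A reliable scale with weak validity still has little relevance.
print(round(relevance(0.9, 0.8), 2))  # 0.72
print(round(relevance(0.9, 0.2), 2))  # 0.18
```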
