Reliability, Validity, and Utility in Selection

Slides: 25

Provided by: Rainer2

Transcript and Presenter's Notes

1
Reliability, Validity, and Utility in Selection
2
Requirements for Selection Systems

Reliable
Valid
Fair
Effective

3
Reliability

Extent to which a score is stable and free from error
Stability, consistency, accuracy, dependability
Statistically represented by rxx
Should be .80 or higher for selection

4
Factors that Affect Reliability

Test length: longer is better
Homogeneity of test items: higher r if all items measure the same construct
Adherence to standardized procedures results in higher reliability

5
Factors that Negatively Affect Reliability

Poorly constructed devices
User error
Unstable attributes
Item difficulty: items that are too hard or too easy restrict score variance and lower reliability

6
Standardized Administration

All test takers receive:
Test items presented in same order
Same time limit
Same test content
Same administration method
Same scoring method of responses

7
Types of Reliability

Test-retest
Alternate Forms
Internal Consistency
Interrater

8
Test-Retest Reliability

Temporal stability
Obtained by correlating pairs of scores from the same person on two different administrations of the same test
Drawbacks: maturation, learning, practice, memory
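
The test-retest coefficient described on this slide is an ordinary Pearson correlation between the two administrations. A minimal sketch, assuming made-up scores for five people (the data and the function name pearson_r are illustrative, not from the presentation):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores: the same five people tested on two occasions
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]

r_xx = pearson_r(time1, time2)  # test-retest reliability estimate
print(round(r_xx, 2))
```

The same function applies to alternate forms (correlate form A with form B) and to interrater reliability (correlate one rater's ratings with another's).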

9
Alternate Forms

Form stability, aka parallel forms or equivalent forms
Two different versions of a test that have equal means, standard deviations, item content, and item difficulties
Obtained by correlating pairs of scores from the same person on two different versions of the same test
Drawbacks: need to create 2x items (cost); practice, learning, maturation

10
Internal Consistency - Split-half Reliability

Obtained by correlating pairs of scores from equivalent halves of a single test administered once
r must be adjusted statistically to correct for test length
Spearman-Brown prophecy formula
Advantages: efficient; eliminates some of the drawbacks seen in other methods
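
The split-half procedure and the Spearman-Brown correction can be sketched as follows. The odd/even split, the sample data, and the function names are assumptions made for illustration; the slide does not say how the halves should be formed.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(r, n=2.0):
    """Projected reliability when test length is multiplied by n."""
    return n * r / (1 + (n - 1) * r)

def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per person.

    Halves are formed from odd- and even-numbered items (an assumption),
    then the half-test correlation is stepped up to full length.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_half, even_half)
    return spearman_brown(r_half, 2)

# Hypothetical responses: five people, four items each
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
]
reliability = split_half_reliability(responses)
```

Because spearman_brown accepts any length multiplier n, it also illustrates the earlier point that longer tests tend to be more reliable: a half-test correlation of .60 projects to .75 at full length.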

11
Internal Consistency - Coefficient Alpha

Represents the degree of correlation among all the items on a scale, calculated from a single administration of a single form of a test
Equivalent to the average of all possible split-half reliability estimates
Drawbacks: test must be unidimensional; can be artificially inflated if the test is lengthened
Advantages: same as split-half
Most commonly used method of estimating reliability
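
In practice, coefficient alpha is usually computed from item and total-score variances rather than by literally averaging split halves. A minimal sketch using the standard variance form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the data and the function name cronbach_alpha are illustrative assumptions:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores holds one list of item scores per person."""
    k = len(item_scores[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across people
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of each person's total score
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: five people, four items each
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(responses)
```

Population variances are used for both terms; using sample variances for both gives the same alpha, because the correction factor cancels in the ratio.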

12
Interrater Reliability

Degree of agreement that exists between two or more raters or scorers
Used to determine if scores represent rater characteristics rather than what is being rated
Obtained by correlating ratings made by one rater with those of other raters for each person being rated

13
Validity

Extent to which inferences based on test scores are justified given the evidence
Is the test measuring what it is supposed to measure?
Builds upon reliability, i.e., reliability is necessary but not sufficient for validity
No single best strategy

14
Types of Validity

Content Validity
Criterion Validity
Construct Validity
Face Validity

15
Content Validity

Degree to which a test taps into the domain or content of what it is supposed to measure
Determined through job analysis:
Identification of essential tasks
Identification of KSAOs required to complete tasks
Relies on judgment of SMEs
Can also be done informally

16
Criterion Validity

Degree to which a test is related (statistically) to a measure of job performance
Statistically represented by rxy
Usually ranges from .30 to .55 for effective selection
Can be established two ways:
Concurrent Validity
Predictive Validity

17
Concurrent Validity

Test scores and criterion measure scores are obtained at the same time and correlated with each other
Drawbacks:
Must involve current employees, which results in range restriction and a non-representative sample
Current employees will not be as motivated to do well on the test as job seekers

18
Predictive Validity

Test scores are obtained prior to hiring, and criterion measure scores are obtained after being on the job; the scores are then correlated with each other
Drawbacks:
Will have range restriction unless all applicants are hired
Must wait several months for job performance (criterion) data
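
Both validation designs suffer from range restriction, which shrinks the observed rxy. The slides do not name a remedy; one widely used adjustment, shown here as an assumption rather than as the presenter's method, is Thorndike's Case II correction for direct restriction on the predictor. The function name and the numbers are hypothetical.

```python
from math import sqrt

def correct_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II estimate of validity in the unrestricted group.

    r_restricted:    rxy observed among those actually hired or employed
    sd_unrestricted: test-score SD in the full applicant pool
    sd_restricted:   test-score SD in the restricted group
    """
    u = sd_unrestricted / sd_restricted
    return r_restricted * u / sqrt(1 + r_restricted ** 2 * (u ** 2 - 1))

# Hypothetical case: observed validity of .30 among hires, whose test
# scores vary half as much as those of the whole applicant pool
estimated_validity = correct_range_restriction(0.30, sd_unrestricted=10, sd_restricted=5)
```

When the two standard deviations are equal the function returns the observed correlation unchanged, which is a quick sanity check on the formula.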

19
Construct Validity

Degree to which a test measures the theoretical construct it purports to measure
Construct: an unobservable, underlying, theoretical trait

20
Construct Validity (cont.)

Often determined through judgment, but can be supported with statistical evidence
Test homogeneity (high alpha; factor analysis)
Convergent validity evidence: test score correlates with other measures of the same or a similar construct
Discriminant or divergent validity evidence: test score does not correlate with measures of other, theoretically dissimilar constructs

21
Additional Representations of Validity

Face Validity: degree to which a test appears to measure what it purports to measure, i.e., do the test items appear to represent the domain being evaluated?
Physical Fidelity: do the physical characteristics of the test represent reality?
Psychological Fidelity: do the psychological demands of the test reflect the real-life situation?

22
Where to Obtain Reliability and Validity Information

Derive it yourself
Publications that contain information on tests, e.g., Buros Mental Measurements Yearbook
Test publishers should have data available, often in the form of a technical report

23
Selection System Utility

Taylor-Russell Tables: estimate the percentage of employees selected by a test who will be successful on the job
Expectancy Charts: similar to Taylor-Russell, but not as accurate
Lawshe Tables: estimate the probability of job success for a single applicant
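
The logic behind the Taylor-Russell tables can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of the published tables: test scores and job performance are assumed bivariate normal with correlation equal to the validity coefficient, and the function name, sample size, and seed are hypothetical.

```python
import random
from math import sqrt

def simulate_success_rate(validity, selection_ratio, base_rate, n=50_000, seed=0):
    """Estimate the share of selected applicants who succeed on the job.

    validity:        correlation between test score and job performance
    selection_ratio: fraction of applicants hired (top scorers on the test)
    base_rate:       fraction of all applicants who would succeed if hired
    """
    if not (0 < selection_ratio <= 1 and 0 < base_rate < 1 and -1 <= validity <= 1):
        raise ValueError("ratios must lie in (0, 1) and validity in [-1, 1]")
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        test = rng.gauss(0, 1)
        # Performance correlates with the test score at the given validity
        perf = validity * test + sqrt(1 - validity ** 2) * rng.gauss(0, 1)
        people.append((test, perf))
    # Performance level exceeded by a base_rate share of the whole pool
    perf_sorted = sorted(p for _, p in people)
    success_cut = perf_sorted[int(n * (1 - base_rate))]
    # Hire the top scorers on the test
    n_hired = max(1, round(n * selection_ratio))
    hired = sorted(people, key=lambda tp: tp[0], reverse=True)[:n_hired]
    return sum(1 for _, p in hired if p >= success_cut) / n_hired

with_test = simulate_success_rate(0.50, 0.30, 0.50)
without_validity = simulate_success_rate(0.00, 0.30, 0.50)
```

With zero validity the success rate among those hired stays near the base rate; with a validity of .50 it rises well above it, which is the gain the tables are meant to quantify.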

24
Methods for Selection Decisions

Top-down: those with the highest scores are selected first
Passing or cutoff score: everyone above a certain score is hired
Banding: all scores within a statistically determined interval or band are considered equal
Multiple hurdles: several devices are used; applicants are eliminated at each step
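
The four decision methods above can be sketched as short functions. Several details are assumptions added for illustration: the cutoff treats scores at or above the passing score as passing, the band width is based on the standard error of the difference between two scores (1.96 * SD * sqrt(1 - rxx) * sqrt(2)), which is one common convention, and all names and scores are hypothetical.

```python
from math import sqrt

def top_down(scores, n_openings):
    """Select the highest scorers first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n_openings]]

def passing_score(scores, cutoff):
    """Everyone at or above the cutoff passes."""
    return [name for name, s in scores.items() if s >= cutoff]

def top_band(scores, sd, r_xx, z=1.96):
    """Applicants whose scores are treated as equal to the top score."""
    sem = sd * sqrt(1 - r_xx)  # standard error of measurement
    sed = sem * sqrt(2)        # standard error of the difference
    floor = max(scores.values()) - z * sed
    return [name for name, s in scores.items() if s >= floor]

def multiple_hurdles(names, stages):
    """Each stage is (scores_by_name, cutoff); failures drop out at each step."""
    remaining = list(names)
    for stage_scores, cutoff in stages:
        remaining = [n for n in remaining if stage_scores[n] >= cutoff]
    return remaining

# Hypothetical applicants
test_scores = {"A": 92, "B": 90, "C": 85, "D": 78, "E": 70}
interview = {"A": 2, "B": 4, "C": 5, "D": 5, "E": 5}

hired_top_down = top_down(test_scores, 2)
hired_cutoff = passing_score(test_scores, 80)
band = top_band(test_scores, sd=10, r_xx=0.90)
survivors = multiple_hurdles(list(test_scores), [(test_scores, 80), (interview, 3)])
```

Note how the methods disagree on the same data: top-down picks A and B, while the band treats A, B, and C as equivalent, and the hurdles eliminate A at the interview stage.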
