Psychometrics: An introduction

Provided by: chriswe

1
Psychometrics: An introduction
2
Overview
  • A brief history of psychometrics
  • The main types of tests
  • The 10 most common tests
  • Why psychometrics? Clinical versus actuarial
    judgment

3
A brief history
  • Testing for proficiency dates back to 2200 B.C.,
    when the Chinese emperor used grueling tests to
    assess fitness for office

4
Francis Galton
  • Modern psychometrics dates to Sir Francis Galton
    (1822-1911), Charles Darwin's cousin
  • Interested in (in fact, obsessed with)
    individual differences and their distribution
  • 1884-1890: Tested 17,000 individuals (!) on
    height, weight, sizes of accessible body parts,
    and behavior: hand strength, visual acuity,
    reaction time (RT), etc.
  • Demonstrated that objective tests could provide
    meaningful scores
  • Invented correlation. The first regression line
    plotted the average diameter of seeds against the
    average diameter of their parents

5
Regression to the mean
  • Galton also popularized the idea of regression to
    the mean: extreme values, when repeated, tend to
    be less extreme

Francis Galton (1886). Regression Towards
Mediocrity in Hereditary Stature. Journal of the
Anthropological Institute, 15, 246-263.
6
Regression to the mean
  • So: Second albums by great bands tend to be worse
    than first albums; second novels by successful
    first novelists tend to be worse than first
    novels; sports teams who excelled in one game or
    season tend to do worse in the next game/season;
    geniuses have children who are less brilliant
    than they are; etc.
  • WHY?

7
Regression to the mean
  • I had the most satisfying Eureka experience of
    my career while attempting to teach flight
    instructors that praise is more effective than
    punishment for promoting skill-learning. When I
    had finished my enthusiastic speech, one of the
    most seasoned instructors in the audience raised
    his hand and made his own short speech, which
    began by conceding that positive reinforcement
    might be good for the birds, but went on to deny
    that it was optimal for flight cadets. He said,
    "On many occasions I have praised flight cadets
    for clean execution of some aerobatic maneuver,
    and in general when they try it again, they do
    worse. On the other hand, I have often screamed
    at cadets for bad execution, and in general they
    do better the next time. So please don't tell us
    that reinforcement works and punishment does not,
    because the opposite is the case." This was a
    joyous moment, in which I understood an important
    truth about the world: because we tend to reward
    others when they do well and punish them when
    they do badly, and because there is regression to
    the mean, it is part of the human condition that
    we are statistically punished for rewarding
    others and rewarded for punishing them. I
    immediately arranged a demonstration in which
    each participant tossed two coins at a target
    behind his back, without any feedback. We
    measured the distances from the target and could
    see that those who had done best the first time
    had mostly deteriorated on their second try, and
    vice versa. But I knew that this demonstration
    would not undo the effects of lifelong exposure
    to a perverse contingency.
  • Daniel Kahneman (in his Nobel acceptance speech)
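
The blind-toss demonstration described in the quote can be sketched as a small simulation. This is only an illustration under stated assumptions: the number of participants, the seed, and the choice of a normal distribution for the misses are all made up here, and every participant is assumed to have exactly the same skill, so both tries are pure luck.

```python
import random

def mean(values):
    return sum(values) / len(values)

def two_blind_throws(n_participants=1000, seed=0):
    """Return each participant's distance from the target on two tries.

    Assumption: everyone has the same skill, so both distances are
    pure luck, drawn independently from the same distribution.
    """
    rng = random.Random(seed)
    first = [abs(rng.gauss(0, 1)) for _ in range(n_participants)]
    second = [abs(rng.gauss(0, 1)) for _ in range(n_participants)]
    return first, second

first, second = two_blind_throws()

# Rank participants by their first try (smaller distance = better).
ranked = sorted(range(len(first)), key=lambda i: first[i])
tenth = len(ranked) // 10
best, worst = ranked[:tenth], ranked[-tenth:]

best_try1 = mean([first[i] for i in best])
best_try2 = mean([second[i] for i in best])
worst_try1 = mean([first[i] for i in worst])
worst_try2 = mean([second[i] for i in worst])

print(f"best 10%:  try 1 = {best_try1:.2f}, try 2 = {best_try2:.2f}")
print(f"worst 10%: try 1 = {worst_try1:.2f}, try 2 = {worst_try2:.2f}")
```

With no praise, punishment, or feedback anywhere in the code, the best group still gets worse on average and the worst group still improves, because each group was selected for an extreme first result.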

8
James Cattell
  • James Cattell (studied with Wundt & Galton)
    first used the term "mental test" in 1890
  • His tests were in the "brass instruments"
    tradition of Galton
  • Mostly motor and acuity tests
  • Founded Psychological Review (1894)

9
Clark Wissler
  • Clark Wissler (Cattell's student) did the first
    basic validational research, examining the
    relation between the old mental test scores and
    academic achievement
  • His results were largely discouraging
  • He had only bright college students in his
    sample
  • Why is this a problem?
  • Wissler became an anthropologist with a strong
    environmentalist bias.
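
One possible answer to the question above is restriction of range: a test that predicts achievement across the whole population can look nearly useless inside a group that was selected for being bright. The sketch below illustrates that effect on simulated data; the variable names, noise levels, sample size, and the 10% selection rule are all assumptions made for the example, not Wissler's data.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)
people = []
for _ in range(20000):
    ability = rng.gauss(0, 1)               # hypothetical latent ability
    test_score = ability + rng.gauss(0, 1)  # noisy mental test
    grades = ability + rng.gauss(0, 1)      # noisy academic achievement
    people.append((ability, test_score, grades))

# Keep only the top 10% on ability (the "bright college students").
people.sort(key=lambda p: p[0])
cutoff = len(people) * 9 // 10
bright = people[cutoff:]

r_all = pearson_r([p[1] for p in people], [p[2] for p in people])
r_bright = pearson_r([p[1] for p in bright], [p[2] for p in bright])

print(f"everyone:         r = {r_all:.2f}")
print(f"bright-only group: r = {r_bright:.2f}")
```

The test-achievement correlation is much weaker in the selected group, even though the same test and the same people generated both numbers.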

10
Alfred Binet
  • Goodenough (1949): The Galtonian approach was
    like "inferring the nature of genius from the
    nature of stupidity or the qualities of water
    from those of hydrogen and oxygen."
  • Alfred Binet (1905) introduced the first modern
    intelligence test, which directly tested higher
    psychological processes (real abilities &
    practical judgments)
  • e.g. picture naming, rhyme production, weight
    ordering, question answering, word definition.
  • Also motivated IQ (Stern, 1914): mental age
    divided by chronological age
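
As a worked example of the ratio just described: the slide gives only "mental age divided by chronological age"; multiplying by 100 is the conventional scaling and is added here as an assumption, as are the function name and the example ages.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age over chronological age, scaled by 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100 * mental_age / chronological_age

# A hypothetical 8-year-old performing like a typical 10-year-old:
print(ratio_iq(10, 8))   # 125.0
# A child performing exactly at age level:
print(ratio_iq(8, 8))    # 100.0
```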

11
The rise of psychometrics
  • Lewis Terman (1916) produced a major revision of
    Binet's scale
  • Robert Yerkes (1919) convinced the US government
    to test 1.75 million army recruits
  • Post-WWI: Factor analysis emerged, making other
    aptitude and personality tests possible

12
What is a psychometric test?
  • A test is a standardized procedure for sampling
    behavior and describing it using scores or
    categories
  • Most tests are predictive of some non-test
    behavior of interest (or what would be the
    point?)
  • Most tests are norm-referenced: they describe
    the behavior in terms of norms, i.e. test results
    gathered from a large group of subjects (the
    standardization sample)
  • Some tests are criterion-referenced: the
    objective is to see if the subject can attain
    some pre-specified criterion.
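
The difference between the two kinds of scoring can be sketched in a few lines. Everything concrete here is hypothetical: the standardization sample is a handful of made-up raw scores (a real one would be far larger), and the cutoff of 20 is an arbitrary criterion.

```python
from statistics import mean, stdev

# Hypothetical standardization sample of raw scores (made-up numbers).
norm_sample = [12, 15, 15, 17, 18, 18, 19, 20, 21, 22, 23, 25]

def norm_referenced(raw, sample):
    """Describe a raw score relative to the standardization sample."""
    z = (raw - mean(sample)) / stdev(sample)
    percentile = 100 * sum(s <= raw for s in sample) / len(sample)
    return z, percentile

def criterion_referenced(raw, cutoff):
    """Report only whether the pre-specified criterion was attained."""
    return raw >= cutoff

z, percentile = norm_referenced(22, norm_sample)
print(f"norm-referenced: z = {z:.2f}, percentile = {percentile:.0f}")
print(f"criterion-referenced (cutoff 20): {criterion_referenced(22, 20)}")
```

The same raw score of 22 yields a position relative to other people in the first case and a simple pass/fail in the second.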

13
The main types of tests
  • Intelligence tests: Assess intelligence
  • Aptitude tests: Assess capability
  • Achievement tests: Assess degree of
    accomplishment
  • Creativity tests: Assess capacity for novelty
  • Personality tests: Assess traits
  • Interest inventories: Assess preferences for
    activities
  • Behavioral tests: Measure behaviors and their
    antecedents/consequences
  • Neuropsychological tests: Measure cognitive,
    sensory, perceptual, or motor functions

14
The 10 most commonly used tests
  • 1.) Wechsler Intelligence Scale for Children
    (WISC)
  • 2.) Bender Visual-Motor Gestalt Test
  • 3.) Wechsler Adult Intelligence Scale (WAIS)
  • 4.) Minnesota Multiphasic Personality Inventory
    (MMPI)
  • 5.) Rorschach Ink Blot Test
  • 6.) Thematic Apperception Test (TAT)
  • 7.) Sentence Completion
  • 8.) Goodenough Draw-A-Person Test
  • 9.) House-Tree-Person Test
  • 10.) Stanford-Binet Intelligence Scale
  • From Brown & McGuire, 1976

15
Clinical versus actuarial judgment
  • Clinical judgment: reaching a decision by
    processing information in one's head
  • Actuarial judgment: reaching a decision without
    employing human judgment, using
    empirically-established relations between data
    and the event of interest
  • Actuarial: ad. L. actuarius, a keeper of
    accounts
  • Note that some of the data in an actuarial
    judgment may be qualitative clinical
    observations, allowing a mixture of methods

16
Clinical versus actuarial judgment
  • Paul Meehl (1954) first addressed the question:
    Which is better?
  • His ground rules for comparison:
  • Both methods should draw from the same data set
    (this was relaxed by others, with no changes in
    results)
  • Cross-validation should be required, to avoid
    using variation specific to the data set
  • There should be explicit prediction of success,
    recidivism, or recovery

17
Meehl (1954): Results
  • He looked at between 16 and 20 studies (depending
    on inclusion criteria)
  • "...it is clear that the dogmatic, complacent
    assertion sometimes heard from clinicians that
    naturally clinical prediction, being based on
    real understanding is superior, is simply not
    justified by the facts to date."
  • In all but one case, predictions made by
    actuarial means were equal to or better than
    clinical methods
  • In a later paper, he changed his mind about the
    one!

18
Thirty years later...
  • "Review and reflection indicate that no more than
    5% of what was written in the 1954 book entitled,
    Clinical Versus Statistical Prediction needs to
    be retracted 30 years later. If anything, these
    retractions would result in the book's being more
    actuarial than it was."
  • "There is no controversy in social science that
    shows such a large body of qualitatively diverse
    studies coming out so uniformly as this one."
  • Paul Meehl, 1986 (Causes and Effects of My
    Disturbing Little Book)

19
In 1989
  • After eliminating studies that might be biased
    against clinicians, by 1989 there were
    approximately 100 studies that pitted actuarial
    against clinical methods
  • In virtually every one of these studies, the
    actuarial method has equaled or surpassed the
    clinical method, sometimes substantially
  • Dawes, Faust, & Meehl, 1989 (in your course pack)

20
Example: Goldberg's Rule
  • Goldberg's Rule (1965) gives a simple formula for
    diagnosing psychosis versus neurosis from MMPI
    scale scores (we will see these scales later)
  • It was derived by looking at "gold standard"
    discharge diagnoses
  • It was compared to 29 judges on 861 profiles from
    7 settings
  • Judges got an average of 62% correct
  • The best judge got 67% correct
  • Goldberg's Rule got 70% correct, and exceeded
    judges in every one of the 7 settings
  • Additional training didn't help the judges do
    better (and note also that the judges knew and
    could have used Goldberg's Rule!)
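
The slides do not spell out the formula. As it is commonly reported in the judgment literature, the rule adds three MMPI scale scores (L, Pa, Sc), subtracts two others (Hy, Pt), and compares the result with a cutoff of 45. The sketch below follows that description; treat the exact scale set, the cutoff, and the handling of a score of exactly 45 as assumptions to be checked against Goldberg (1965), and note that the two profiles are invented.

```python
def goldberg_index(L, Pa, Sc, Hy, Pt):
    """Index as commonly reported: (L + Pa + Sc) - (Hy + Pt)."""
    return (L + Pa + Sc) - (Hy + Pt)

def goldberg_rule(L, Pa, Sc, Hy, Pt, cutoff=45):
    """Classify a profile; the tie-breaking at the cutoff is an assumption."""
    index = goldberg_index(L, Pa, Sc, Hy, Pt)
    return "psychosis" if index >= cutoff else "neurosis"

# Two invented profiles of MMPI T-scores, for illustration only.
profile_a = dict(L=50, Pa=70, Sc=75, Hy=60, Pt=65)
profile_b = dict(L=50, Pa=55, Sc=55, Hy=70, Pt=70)

print(goldberg_index(**profile_a), goldberg_rule(**profile_a))
print(goldberg_index(**profile_b), goldberg_rule(**profile_b))
```

The point of the example is how little machinery is involved: one addition, one subtraction, and one comparison, applied identically to every profile.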

21
Where are clinicians' strengths? I
  • i.) Theory-mediated judgments
  • If the predictor knows the relevant causal
    influences, can measure them, and has a model
    specific enough to take him/her from theory to
    fact
  • However, are there any reasons to doubt this
    potential advantage?

22
Where are clinicians' strengths? II
  • ii.) Ability to use rare events
  • If the predictor knows that the current case is
    an exception to the statistical trend, s/he can
    use that information to over-ride the trend
  • It is also possible to build these into actuarial
    methods
  • Why is it very difficult in practice?
  • Why might we worry about clinicians' ability to
    incorporate rare events into prediction?

23
Where are clinicians' strengths? III
  • iii.) Able to detect complex predictive cues
  • - Human beings are still (for now) masters at
    recognizing some complex configurations, such as
    facial expressions etc.

24
Where are clinicians' strengths? IV
  • iv.) Able to re-weight utilities in real-time
  • - For ethical, legal, humanitarian, or financial
    reasons, we might decide to do things differently
    than usual in particular cases.

25
Where are actuarial strengths? I
  • i.) Immunity from fatigue, forgetfulness,
    hang-overs, hostility, prejudice, ignorance,
    false association, over-confidence, bias,
    heart-ache, and random fluctuations in judgment.

26
Where are actuarial strengths? II
  • ii.) Consistency & proper weighting
  • - Variables are weighted the same way every
    time, according to their actual demonstrable
    contributions to the criterion of interest
  • - Perhaps more importantly, irrelevant variables
    are properly weighted to zero

27
Where are actuarial strengths? III
  • iii.) Feedback & base-rates built in to the
    system
  • - Clinicians rarely know how they are doing
    because they don't get immediate feedback and
    because they have imperfect memory
  • - Actuarial records constitute perfect memories
    of how things came out in similar cases and can
    include a larger and wider sample than a single
    human or a small group of humans can ever hope to
    see

28
Where are actuarial strengths? IV
  • iv.) Not overly sensitive to optimal weightings
  • - Even simplistic actuarial judgments often beat
    human judgments
  • - Simple linear weightings often do better than
    humans
  • v.) Optimal (non-linear) weightings are
    possible.
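
The claim that actuarial methods are not overly sensitive to the exact weights can be illustrated on simulated data. In this sketch the three cues, their "true" weights, the noise level, and the sample size are all made up; the "optimal" composite uses the weights that generated the data rather than fitted regression weights.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(2)
true_weights = (0.5, 0.3, 0.2)  # hypothetical "optimal" weights
criterion, weighted, unit = [], [], []
for _ in range(5000):
    cues = [rng.gauss(0, 1) for _ in true_weights]
    signal = sum(w * c for w, c in zip(true_weights, cues))
    criterion.append(signal + rng.gauss(0, 1))  # outcome = signal + noise
    weighted.append(signal)                     # optimally weighted cues
    unit.append(sum(cues))                      # every cue weighted 1

r_weighted = pearson_r(weighted, criterion)
r_unit = pearson_r(unit, criterion)
print(f"optimal weights: r = {r_weighted:.2f}")
print(f"unit weights:    r = {r_unit:.2f}")
```

Simply adding the cues up predicts the criterion nearly as well as the optimal weighting does, which is one reason even simplistic linear rules are hard for human judges to beat.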

29
The power of non-linearity
  • Linear relations are those that say that X goes
    up by the same amount for each equal-sized
    increment in Y
  • P = aX + bY + c
  • Such equations are represented graphically by a
    straight line relating X and Y (or its analogue
    in any higher number of dimensions)
  • Non-linear relations are those that say that X
    goes up by different amounts for each equal-sized
    increment in Y (there are many, many such
    equations)
  • Such equations are represented graphically by a
    non-straight line relating X and Y, either
    because the line breaks or because it curves
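
The distinction can be checked numerically by stepping one variable in equal increments and looking at how much the output changes each time. The weights a, b, c and the two non-linear functions below are arbitrary choices for illustration.

```python
def linear(x, y, a=2, b=3, c=1):
    """P = aX + bY + c with hypothetical weights a, b, c."""
    return a * x + b * y + c

def curved(x):
    """A non-linear relation whose line curves."""
    return x ** 2

def broken(x, knee=2):
    """A non-linear relation whose line breaks at a hypothetical knee."""
    return x if x < knee else 3 * x - 2 * knee

def steps(values):
    """Change in output between consecutive equal-sized input steps."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

xs = range(5)
linear_steps = steps([linear(x, y=0) for x in xs])
curved_steps = steps([curved(x) for x in xs])
broken_steps = steps([broken(x) for x in xs])

print("linear:", linear_steps)   # [2, 2, 2, 2]
print("curved:", curved_steps)   # [1, 3, 5, 7]
print("broken:", broken_steps)   # [1, 1, 3, 3]
```

Only the linear relation changes by the same amount at every step; the curved one changes by a different amount each time, and the broken one changes its step size once, at the knee.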

30
The power of non-linearity
  • Westbury, C., Buchanan, L., Sanderson, M.,
    Rhemtulla, M., & Phillips, L. (2003). Using
    genetic programming to discover non-linear
    variable interactions. Behavior Research Methods,
    Instruments, and Computers, 35(2), 202-216.
  • We used computational means to discover
    non-linear weightings for a test (constructed for
    PSYCO 431) which looked at the construct of
    'geekiness': the extent to which a person is a
    geek.
  • This test was validated against a self-rating on
    a Likert scale.
  • The test consisted of 76 questions.
  • The validation set contained 59 subjects.
  • The test set contained 30 subjects.

31
The power of non-linearity (and the need for
cross-validation)
  • The non-linear estimate was about as good at
    predicting scores on unseen tests as the (gold
    standard) summed validation score around which
    the test had been designed
  • It blew away the linear regression (0.56 versus
    0.20)
  • The non-linear combination used responses to
    only 12 of the 76 test questions in its
    prediction.