Title: Engineering Psychology PSY 378S
1. Engineering Psychology PSY 378S
- University of Toronto
- Spring 2005
- L16 Memory and Training
2. Outline: Lecture 1
- Working Memory
- Short-Term vs. Long-Term Memory
- Evidence
- Baddeley's Working Memory Model
- Evidence
- WM Codes and Modalities
3. STM versus LTM
- We all know that we can know something, and then later forget it
- Often this is labeled short-term vs. long-term memory (STM vs. LTM)
- What is the experimental evidence for the STM/LTM distinction?
- Results from serial position experiments
- Give people a list of 20 words, one at a time at a certain rate, say 1 every 2 s (Dog, Shovel, Run, Ripple, ...)
- At end of list, ask them to recall as many words as they can (free recall)
- We do this again and again, maybe 10 times
4. Serial Position Curve
- Plot P(Recall) vs. serial position (word's position in list)
[Figure: serial position curve. P(Recall) vs. Serial Position (axis marked at 1, 17, and 20), with the Primacy Effect and Recency Effect labeled.]
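The curve described above can be tabulated directly from free-recall data. The following is a minimal sketch, not the procedure used in the lecture: the function name `serial_position_curve` and the three-trial data set are hypothetical, and each trial is assumed to be recorded as the list of serial positions the participant recalled.

```python
def serial_position_curve(trials, list_length):
    """Return P(Recall) for each serial position (index 0 = position 1)."""
    recalled_counts = [0] * list_length
    for recalled_positions in trials:
        # set() guards against the same word being counted twice in a trial
        for pos in set(recalled_positions):
            recalled_counts[pos - 1] += 1
    return [count / len(trials) for count in recalled_counts]


# Hypothetical data: three trials with a 5-word list (positions are 1-based)
trials = [[1, 2, 5], [1, 4, 5], [1, 5]]
curve = serial_position_curve(trials, list_length=5)
```

With real data, high values at the start of `curve` would correspond to the primacy effect and high values at the end to the recency effect.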
5. Effect of Interference
- Procedure varied a number of different ways
- Participants perform interference task (arithmetic) before recalling list
- Recency effect reduced or eliminated (if time spent on arithmetic task increased)
- Early part of curve (including primacy effect) unaffected
[Figure: P(Recall) vs. Serial Position, with the Recency Effect labeled.]
6. Effect of Presentation Rate
- Using two different presentation rates (1 s and 2 s per item)--affects primacy only
- Different manipulations affect different parts of the curve
- Suggests different mechanisms produce different parts of the curve
- Justifies distinction between STM and LTM
[Figure: P(Recall) vs. Serial Position for the 2 s and 1 s presentation rates, with the Primacy Effect labeled; a second P(Recall) vs. Serial Position panel labels regions of the curve as STM and LTM.]
7. The Modal Model
- Two stores: short-term (STM, working memory)
- And long-term (LTM)
8. Working Memory (Baddeley, 1986, 1995)
- Working memory is a version of activity in STM
- Differs from STM in two ways
- Has a more functional emphasis--what is STM for?
- Different components in STM
9. Working Memory
- Three components to working memory
- Visuospatial sketchpad--Maintenance and activity in visual-spatial domain (e.g., imagery such as mental rotation)
- Central executive--Controls WM activity; assigns resources to other WM subsystems
- Phonological store--Language-based short-term storage and rehearsal (articulatory loop used for verbal rehearsal)
[Diagram: Visuospatial Sketchpad, Central Executive, Articulatory Loop, Phonological Store.]
10. Working Memory
- What is the evidence for 3 components?
- Interference occurs when two tasks (or task components) draw upon the same WM subsystem
- Performance degrades relative to the situation where different WM subsystems are involved
11. Working Memory
[Diagram: tasks (labeled Task 1, Task 2, Task 3) mapped onto the Visuospatial Sketchpad, Central Executive, and Phonological Store, with the labels "Good Time Sharing" and "INTERFERENCE!".]
12. Working Memory
- An example is the experiment by Brooks (1968)
[Table: Task (Verbal, Spatial) crossed with Response Method (Verbal, Spatial); cell contents not recoverable.]
13. Brooks Experiment
[Figure: Task.]
14. Brooks Experiment: Results
- Spatial task (Big F task) better performed with verbal responses
- Verbal task ("quick brown fox": nouns and verbs) better performed with spatial responses
- Why? Task and response method draw upon different WM components in these cases
15. Implications
- Implications
- The two subsystems of working memory are functionally independent: each is susceptible to interference from different types of activities
- Tasks should be designed such that disruption does not occur
16. Implications (cont'd)
- Tasks that impose high loads on the visuospatial sketchpad (e.g., air traffic control) should not be performed concurrently with other tasks (or task components) that will also use this system
- Use auditory-phonetic system (the phonological store/articulatory loop) instead
- On the other hand,
- Tasks involving heavy demands on the phonological store/articulatory loop (e.g., editing text, computing numbers) will be more disrupted by concurrent voice input/output than by visual-manual interaction (control with a mouse)
17. Kinesthetic WM
- Also evidence for kinesthetic working memory
- Separate from visuospatial (Woodin & Heil, 1996)
- Used experienced rowers as participants
- Tapping own body interfered with memory for rowing positions, but not positions in a 4 x 4 matrix
- Implications for sports training and performance
[Diagram: Visuospatial Sketchpad, Kinesthetic Output Component, Central Executive, Phonological Store.]
18. Codes and Modalities
- Is there correspondence between stimulus modality and working memory codes? Yes
- Tasks that demand verbal working memory are better served by speech displays than by print (if there is not much verbal information to communicate)
- Auditory modality more effective at processing language information
- Although it is possible to employ auditory-spatial displays for spatial tasks, they are usually less effective than visual displays
- Auditory modality less effective at processing spatial information
19. Codes and Modalities
Obligatory Access
20. Longer Communication
- With longer messages, both auditory and visual channels are likely to show failures of memory
- But with print, can physically prolong the message--makes it more effective for long messages
- Might want to code redundantly (use both auditory and visual displays) if it does not cause too much interference with other tasks
21. Quiz
22. Break
23. Outline: Lecture 2
- Properties of Working Memory
- Duration, Capacity
- Items and Chunks
- Expertise
- RI and PI
- Running Memory Task
- Knowledge in the World
24. Duration of Working Memory
- Brown-Peterson paradigm
- Participant presented with auditory sequence of 3 letters (e.g., XVR)
- Try to remember them while performing an interfering task (counting backwards by 3s)
25. Duration of Working Memory
26. Duration of Spatial WM
- Loftus et al. (1979)
- Subjects tried to remember air-traffic navigational info.
- Moray (1986)
- Subjects were radar controllers trying to recall info. that had been displayed on a radar scope.
- Both researchers found the same types of forgetting functions
- Essentially can clear out spatial WM in 18 s or so
- So, transience occurs both in the visuospatial sketchpad AND in the phonological store
27. Duration Affected by Number of Items
- Curves a and c represent 1- and 5-item (letter) sequences
- Faster decay observed with more items
28. WM Explains Word Length Effect
- Component of phonological store is articulatory loop
- With more items to be rehearsed, there is a longer delay between successive rehearsals of each item
- In fact, the length of items--how long the items take to say--decreases the capacity of working memory--so speed of rehearsal makes a difference
29. Limiting Case
- Limiting case--curve d: memory span--7 items (Miller's 7 ± 2)
- Some items can't be recalled even immediately after the presentation
30. But What is an Item?
- We talked about an item being a letter
- In the absolute judgment task, items were things like different line lengths
- Couldn't a word be an item?
- Let's try the Brown-Peterson task again--with words this time
- DOG CAT BOY
- Pack in more information--three three-letter words contain nine letters
- Now we have more information being held
31. Chunking
- Miller addressed this question by proposing the concept of a chunk
- Chunking is grouping together items based on their meaning
- and so a chunk is that group
- e.g., f-b-i could be reorganized (recoded) as FBI--now we have a chunk
- Working memory capacity is 7 ± 2 chunks of information
- New capacity unit for WM
- Chunk can be letter, word,
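The recoding idea can be illustrated with a small sketch. This is only an illustration of how grouping reduces the number of items to hold, not a model of human memory: the function `chunk`, the greedy longest-match strategy, and the `known` vocabulary (standing in for associations already in LTM) are all assumptions made for the example.

```python
def chunk(letters, known_chunks):
    """Greedily recode a letter string into the longest known chunks.

    Letters that match no known chunk remain single-letter items.
    """
    items = []
    i = 0
    max_len = max((len(c) for c in known_chunks), default=1)
    while i < len(letters):
        # Try the longest candidate first, falling back to a single letter
        for size in range(min(max_len, len(letters) - i), 0, -1):
            candidate = letters[i:i + size]
            if size == 1 or candidate in known_chunks:
                items.append(candidate)
                i += size
                break
    return items


# Hypothetical LTM vocabulary
known = {"FBI", "DOG", "CAT", "BOY"}
meaningful = chunk("FBIDOGCATBOY", known)   # 12 letters become 4 items
unfamiliar = chunk("XVR", known)            # no chunks: 3 separate items
```

Twelve letters that map onto familiar units occupy four items, while three unrelated letters still occupy three.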
32. Chunking (cont'd)
- Components of a chunk need to be semantically tied together, typically through association in LTM
- Chunking can occur at higher levels as well--e.g., sentences
- London is the largest city in England (7 words)
- Maybe could associate the words together into a meaningful whole--into a superchunk (high-level chunk: a chunk of chunks)
- New York is the largest city in the United States
- Toronto is the largest city in Canada, etc.
33. Chunking (cont'd)
- Should avoid having people perform tasks requiring working at the 7 ± 2 limit
- To avoid capacity and decay limitations of working memory, facilitate chunking whenever possible
- People with large working memory capacities typically have a system for chunking numbers or letters so that they are meaningful (e.g., dates or ages), or for combining them hierarchically to form superchunks
- Ss with normal memory spans can get up to 80 digits or so, using various chunking techniques
- Expertise plays a role here--long-term WM (Ericsson & Kintsch, 1995)--info in LTWM is stable, but accessed through temporarily active retrieval cues in WM
34. Chess
- Analogous to memory for chess positions by masters and novices (Chase and Simon, 1973)
- If board position taken from the progression of a reasonable game, experts recalled better than novices
- If board position random, no difference between the two groups
35. Pilots and Programmers
- Barnett (1989) found similar results with novice and expert pilots for communication exchange in air traffic control
- When exchanges flowed in normal sequence, experts performed better, but no difference if exchanges in random sequence.
- Barfield (1997): Expert programmers given random lines of code were no better than novices at remembering, but experts were better at random chunks or full program
36. Domain Language
- Chunking--resulting in improved memory capacity--is a byproduct of domain knowledge
- Everyday English words become specialized terms in particular domains
- Hockey terms
- Cycling the puck, game up high, game down low, in the slot, five hole, go to the net, offside, floater, dogging it
- Mathematics
- law, proof, derive (derivative), differential, integral
- Law
- Trial, witness, suspect, bail, sentence,
- Experts using these terms allows faster communication--chunked information--within the domain
37. English Experts
- We're all fluent in English--all highly trained experts
- Designers capitalize on language familiarity
- Coding--codes can be developed to facilitate chunking
- License plate codes--vanity plates more memorable, e.g., FUN2GO
- Commercial phone numbers (967-1111)
- TV stations (CITY-TV)
- Radio station codes (1050 CHUM, Flow 93.5, Edge 102)
38. RI and PI
- Information can be lost from working memory through active interference from other information
- Retroactive Interference--activity after material to be recalled (MTBR) affects recall of MTBR
- Proactive Interference--activity before MTBR affects recall of MTBR
39. RI and PI
- Not just a laboratory phenomenon
- Demonstrated in air-traffic control context (Loftus et al., 1979)
- PI: At least 10 s delay necessary before material remembered in previous exchange did not disrupt memory for a subsequent exchange
- RI and PI can be reduced if interfering activity uses a different code than MTBR (e.g., spatial activity ok after verbal list) (Haelbig et al., 1998)
- WM in action
40. Running Memory Task
- More realistic task (similar to ATC, dispatching)
- 7 ± 2 probably an optimistic figure
- In the running memory task, a sequence of items (e.g., letters, numbers) is presented to the operator, and the operator has to identify the item K items ago
41. Running Memory Task (cont'd)
- Operator does not know how long the string is
- Operator not expected to remember entire string
- As each item comes in, operator expected to do something with it (categorize it, check its value, etc.)
- Performance falls off rapidly when K > 2
- If asked to recall the last items, memory span much less than 7 ± 2
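The structure of the running memory task can be sketched as the answer key a scorer would need: for each incoming item, the item that was presented K items earlier. This is an illustrative sketch only; the function name `running_memory_key` and the letter stream are hypothetical, and the current item is treated as "0 items ago".

```python
from collections import deque


def running_memory_key(stream, k):
    """Return, for each incoming item, the item presented k items ago.

    Entries are None until at least k + 1 items have been presented.
    """
    recent = deque(maxlen=k + 1)  # holds the current item plus the k before it
    answers = []
    for item in stream:
        recent.append(item)       # the oldest item drops out automatically
        answers.append(recent[0] if len(recent) == k + 1 else None)
    return answers


# Hypothetical letter stream with K = 2
key = running_memory_key("ABCDE", k=2)
```

The bounded buffer mirrors the point of the slides: the operator never needs the whole string, only a short, continuously updated window.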
42. Yntema (1963)
- Used running memory task
- Participant kept track of large number of objects (aircraft), each varying on multiple attributes (altitude, airspeed, location)
- Two key results
- Performance better with few objects and many attributes than the reverse--an integration/chunking effect
- Performance better if each attribute has its own scale
- Hess et al. (1999): Use of consistent spatial locations in square grid allowed operator to keep track of attributes of multiple objects
43. Running Memory Recommendations
- From Yntema Result 1: Assign each operator to monitor all attributes of a few objects
- From Yntema Result 2: Don't code spatial variables with same units
- e.g., if code altitude in feet, then code distance from airport in miles, not feet
- From Hess et al. result: Consistent spatial location will improve running memory
- Beyond air-traffic control, results may be applicable to other domains where information isn't continuously shown (e.g., taxicab dispatcher)
44. Putting Memory in the World
- Knowledge is not all in the head--it is partially in the world, and in the constraints of the world (Norman, Design of Everyday Things)
- Result: precise behavior can result from imprecise knowledge for four reasons
- Information is in the world
- Precision not required
- Natural constraints are present
- Cultural constraints are present
45. 1) Information is in the World
46. Information is in the World
- Information coded in memory need only be precise enough to sustain quality of behavior desired
- Whenever information needed to do a task is readily available, the need for us to learn it diminishes
- Examples
- penny
- hunt-and-peck typists
- "I can take you there, but I can't tell you how to get there"
47. 2) Precision not Required
- Don't need all information in head
- Can distinguish quarter from nickel, although may not be able to tell you what is on each coin, or the words on the coins
- But if you make more precise memory necessary, you will have a problem
48. When Precision is Required
- Britain: one-pound coin--confusable with five pence piece
- US: Susan B. Anthony one-dollar coin--confusable with quarter
- France: 10-franc coin--confusable with half-franc coin
- Descriptions formed to distinguish among the old coins were not precise enough to distinguish between the new one and one of the old ones
49. My Red Notebook
- I buy a notebook
- What do I call it?
- Then get another notebook--a blue one
- What do I call my first notebook?
- Then get a small red notebook
- Now what do I call my first notebook?
- Mental representation need only discriminate among choices in front of me
- But add another choice and have to change my representation--make it more precise
50. 3) Natural Constraints
- Often an object's physical features limit how it can be used
- Natural constraints are present and limit the range of allowable actions--not a random world
- Can't use a shovel to brush teeth
- Can't use a rock to mow the lawn
51. 4) Cultural Constraints
- Society has evolved many conventions that govern acceptable social behavior
- This lets us know what to do in unfamiliar circumstances
- What is appropriate behavior at a party, or in a restaurant
- What is the sequence of events in a restaurant?
- If we have to wait for something to happen (like the waitress to come and take our order) some of us get fidgety
52. Tradeoff between Knowledge in the World and in the Head
- We need both knowledge in the world and in the head
- But in certain situations we choose to rely more on one than the other
- Gaining the advantages of knowledge in the world means losing the advantages of knowledge in the head.
53. Tradeoff Examples
- Can put information in the world
- Provide visual echo for message pilot receives from air-traffic control
- Provide the pilot with CDTI (cockpit display of traffic info.)
- Stick Post-It notes around my computer display
- Show continuous record of location in a hierarchical menu structure
- But causes visual clutter, might disrupt performance of pilot or user
- With CDTIs, may increase the visual workload--is the increase worth the benefits?
- Memory aids (information in the world) a mixed blessing
54. Knowledge in World vs. in Head
From Norman (1992), Design of Everyday Things
55. Break
56. Quiz 2
57. Part 3
- Score each recalled word as Case, Rhyme, or Semantic, for Y and N answers separately
- Count up the number in each category
[Tally table: Case, Rhyme, Semantic.]
58.
- Case
- BIRDS
- EAGLE
- FLESH
- FROGS
- GOOSE
- GRASS
- LEMON
- OTTER
- PANSY
- PLANT
- SHRUB
- STRAW
- TROUT
- WEEDS
- WHALE
- WHEAT
- Rhyme
- APPLE
- BIRCH
- CEDAR
- GRAIN
- HORSE
- PANDA
- PEACH
- POPPY
- ROBIN
- SEEDS
- SHARK
- SNAIL
- SNAKE
- SUGAR
- WORMS
- ZEBRA
- Semantic
- CORAL
- CROWS
- FOXES
- GRAPE
- LILAC
- MAPLE
- MOOSE
- MOUSE
- OLIVE
- QUAIL
- RAVEN
- REEDS
- SHEEP
- THORN
- TIGER
- TULIP
59. Outline: Lecture 3
- Long-Term Memory and Training
- Levels of Processing
- Skill Acquisition
- Training Methods
- Transfer of Training Methods
- Negative Transfer
60. Levels of Processing: Craik & Lockhart (1972)
- The more deeply you process something, the better the chance you will remember it
- That is, that you will transfer the info to LTM from STM (working memory)
- Deeper approx. equal to more meaningful
- Process view of memory
[Figure: P(Recall) vs. Level of Processing (Case, Rhyme, Semantic).]
61. Levels of Processing: Another Take
- Norman's taxonomy of memory
- Memory for arbitrary things
- Memory for meaningful relationships
- Memory thru explanation
[Figure: P(Recall) vs. Level of Processing (Arbitrary, Relationship, Explanation).]
62. Memory for Arbitrary Things
- Items to be remembered are arbitrary
- No particular relationship to each other or to anything else
- Storage of arbitrary codes (e.g., passwords)
- Requires rote learning, which is difficult, can take considerable time and effort
- When problems arise, memorized sequence gives no hint as to what has gone wrong
- No suggestion of what you might do to fix problem
63. Memory for Meaningful Relationships
- Can relate what we learn to knowledge that we already have
- New material can be understood, interpreted, integrated with previously acquired material
- e.g., Mr. Tanaka's L/R turn signals on handlebars
- Now much easier to interpret and remember
- Although doesn't really explain anything
- Can't be used for future prediction
[Figure: labels R and L.]
64. Memory Thru Explanation
- Material can be derived from some explanatory mechanism, e.g., mental model
- Mental model allows you to predict, test hypotheses
- Details can be derived when needed, as in unexpected situations
- Designers should provide users with appropriate models
- If not supplied, people will make them up (e.g., thermostat, impetus model)
65. Long-Term Memory and Training
- The HF practitioner is often faced with the problem of developing the most efficient training program--greatest level of proficiency per dollar invested
- Different forms of training are necessary for mastery of declarative vs. procedural knowledge
66. Declarative vs. Procedural Knowledge
- Declarative Knowledge--Facts about a domain, we can verbalize these, or write them down (e.g., knowledge in typical university course)
- Better off with study and rehearsal
- Levels of Processing (both kinds) important here
- Procedural Knowledge--How to do something, often not easily verbalized (e.g., riding bike, driving car, skating, using lathe)
- Tell someone everything you know about riding a bike, but it won't help much
- Better off with practice and performance
67. Skill Acquisition
- Practice makes perfect
- Most skills continue to improve for weeks, months, years
- Can obtain errorless performance in many tasks quite quickly
- But two other performance measures continue to improve: speed (RT), and attention or resource demand (as measured by performing a concurrent task)
68. Still Improving After Millionth Cigar
69. 3 Stages of Skill Acquisition
- Cognitive stage
- Learner works from written or spoken instructions
- Declarative representation
- Learner rehearses instructions, e.g., driving a standard (manual) transmission: "press clutch down first"
- Associative stage
- Go from declarative rep. to procedural rep.
- Performance becomes more fluid and error free
- Verbalization goes
- Autonomous stage
- Skill becomes more automated and rapid--less conscious
- Person loses ability to verbally describe the skill
- Performance overlearned
- (Anderson, 1981)
70. Production Rules
- Anderson (1981) talks about production rules (IF-THEN)
- e.g., IF high RPM and in first gear, THEN switch to second gear
- Key structure unifying course of skill acquisition
- Development of skill in associative stage can be decomposed into many component production rules (ACT-R model; see book)
- Motor program is THEN part of production rule; its learning in the autonomous stage is the fine-tuning of the production rule.
- To get automaticity, stimuli or rules must be consistently mapped to a response
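The IF-THEN structure of a production rule can be sketched in a few lines. This is only a toy illustration of the condition-action form, not an implementation of ACT-R: the `HIGH_RPM` threshold, the state dictionary, and the `fire` helper are assumptions made for the example.

```python
HIGH_RPM = 3000  # hypothetical threshold for "high RPM"


def shift_up_rule(state):
    """IF high RPM and in first gear, THEN switch to second gear."""
    if state["rpm"] > HIGH_RPM and state["gear"] == 1:
        return {**state, "gear": 2}   # THEN part: the action taken
    return None                       # condition not met: rule does not fire


def fire(rules, state):
    """Apply the first rule whose IF part matches; otherwise leave state unchanged."""
    for rule in rules:
        new_state = rule(state)
        if new_state is not None:
            return new_state
    return state


shifted = fire([shift_up_rule], {"rpm": 3500, "gear": 1})
unchanged = fire([shift_up_rule], {"rpm": 1500, "gear": 1})
```

Because the same condition always produces the same action here, the sketch also reflects the consistent mapping that the last bullet names as a requirement for automaticity.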
71. Guided Training
- Training that allows errors to be made trial after trial will become detrimental, b/c errors become learned
- Practice makes permanent--don't want to practice errors
- Guided training ensures that learner's performance never strays far from what task requires
- Two types: training wheels, augmented feedback
- Similar to constraints and affordances, respectively
72. Training Wheels
- Error prevention often accomplished by guided training such as the training wheels idea for software (Catrambone & Carroll, 1987)
- With training wheels, users prevented from straying off beaten path--making typical mistakes that result in wasted time
- Instead of allowing error to affect system, training wheels informs the user about the error, then allows user to continue on
- Good evidence to support this approach (in computer software context)
73. Augmented Feedback
- Error prevention can also be accomplished by using augmented feedback techniques
- Flight training in simulator: paint an ideal flight path through the sky to the runway
- Learner tracks path to achieve proper landing approach--ingrains correct sequence of responding
- Helps to produce rapid learning of skill
74. Problem for Guided Training
- What's the problem with training wheels/augmented feedback?
- Can lead to poor transfer in more realistic environment (taking off training wheels)
- Sometimes making errors leads to learning
- Need happy medium--Eliminate sources of error that change task or waste training time
- But keep those sources of error intrinsic to task
75. Adaptive Training
- Some component of the task made simpler to reduce initial level of difficulty
- e.g., controlling system without lag first
- Then, as training proceeds, this component gradually increases in difficulty until level of target task is reached
- e.g., introduce different types of lags
76. Evaluation of Adaptive Training
- Reviews mixed on this technique
- Simplification does make it easier to perform the consistent elements of the task
- However, the easy versions of the task may induce a response strategy incompatible with the one necessary to perform the final task
- Time stress is effective in adaptive training, however
- Increase time stress (speed at which events occur) as approach the final task
77. Part-Task Training
- Elements of complex task learned separately
- Two different forms
- Segmentation and fractionization (Wightman & Lintern, 1985)
78. Segmentation
- Segmentation defines situation where different sequential phases of the skill are practiced before being integrated
- e.g., playing piano: Train up on difficult passage, then play easy passage once, then play them together
- Research shows this is useful--not wasting time on easy stuff--efficient
79. Fractionization
- Practice components of task separately (e.g., LH, RH on piano) that you eventually perform concurrently
- Merits not clear cut--prevents development of time-sharing skills--may be necessary to link and co-ordinate the two activities
- If careful in selecting components of tasks that can be easily broken off (vs. practiced together), fractionization training effective
80. Varied Priority Training
- Shown to be effective (Gopher, Weil, & Siegel, 1989)
- Perform everything together, but attend to one component and de-emphasize others
- Integrality of task not destroyed
- Since only small amount of attention paid to lower priority component, it does not distract from the main component
81. Transfer of Training
82. Transfer of Training
- Can learning a new skill, or a skill in a new environment, capitalize on what has been learned before?
- e.g.,
- Learning MS Excel then MS Access
- Training in flight simulator before training in plane
- Training course before on-the-job training
83. Flight Simulators: Do they help?
[Photo: 737-400 FNPT II MCC. Source: www.frasca.com]
84. Transfer of Training
- How do we measure it?
- Control group took 10 hr. to reach criterion
- Transfer group took 8 hr. to reach criterion
- Savings = ctrl time - transfer time
- = 10 - 8 = 2 hours
- % Transfer = savings / control time
- = 2/10 = 20%
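The savings calculation on this slide can be checked with a short sketch. It assumes percent transfer is savings (control time minus transfer time) divided by control time, expressed as a percentage; the function name `percent_transfer` is illustrative.

```python
def percent_transfer(control_time, transfer_time):
    """Percent transfer = (control time - transfer time) / control time * 100."""
    savings = control_time - transfer_time
    return 100 * savings / control_time


# Slide values: control group 10 hr, transfer group 8 hr on the real task
pct = percent_transfer(control_time=10, transfer_time=8)
```

A negative result would indicate that prior training lengthened, rather than shortened, time to criterion on the real task.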
85. Transfer Effectiveness Ratio (TER)
- But wait a minute
- Control group spent 10 hours training,
- Transfer group (Group 2) spent 12 hours training: 4 hours in simulator, 8 hours in real task
- The Transfer Effectiveness Ratio (TER) expresses this relative efficiency
- TER = savings / training period
- = 2/4 = .50
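The TER arithmetic can be sketched the same way. The sketch assumes TER is the savings on the real task divided by the time spent in the training program (here, the simulator); the function and parameter names are illustrative.

```python
def transfer_effectiveness_ratio(control_time, transfer_time, training_time):
    """TER = (control time - transfer time) / time spent in the training program."""
    return (control_time - transfer_time) / training_time


# Slide values: 10 hr control, 8 hr on the real task after 4 hr in the simulator
ter = transfer_effectiveness_ratio(control_time=10, transfer_time=8, training_time=4)
```

Here each simulator hour saved half an hour on the real task, which is why the transfer group's total (4 + 8 = 12 hours) exceeds the control group's 10 hours.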
87. Transfer Effectiveness Ratio (TER)
- If TER > 1, training for transfer group more efficient than for ctrl group (training to criterion)
- Your training program is better than training on the real system--highly effective
- If TER ≤ 0, your training program is worthless (actually harmful)
- If 0 < TER < 1, the training program is less efficient than training on the real system, but not necessarily worthless, for two reasons
- Training program may be safer
- May be less expensive
88. Training Cost Ratio (TCR)
- Training Cost Ratio (TCR) reflects the cost component
- TCR = (Training cost in real task environment per unit time) / (Training cost in training program per unit time)
- Cheaper the training device, the lower your allowable TER can be (everything else held constant)
- If TER × TCR > 1, program is cost effective, otherwise not
- Even if program not cost effective, important to consider safety issues
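The cost-effectiveness test can be sketched as follows, assuming TCR is the real-task cost per unit time divided by the training-program cost per unit time, and that a program is cost effective when TER × TCR exceeds 1. The hourly costs below are made-up numbers for illustration only.

```python
def training_cost_ratio(real_cost_per_hour, program_cost_per_hour):
    """TCR = cost per unit time in the real task / cost per unit time in the program."""
    return real_cost_per_hour / program_cost_per_hour


def is_cost_effective(ter, tcr):
    """A program is cost effective when TER x TCR > 1."""
    return ter * tcr > 1


# Hypothetical costs: aircraft at 1000 per hour; TER of 0.5 as on the earlier slide
cheap_sim = training_cost_ratio(1000, 250)     # TCR = 4.0
costly_sim = training_cost_ratio(1000, 800)    # TCR = 1.25
cheap_ok = is_cost_effective(0.5, cheap_sim)   # 0.5 * 4.0 = 2.0
costly_ok = is_cost_effective(0.5, costly_sim) # 0.5 * 1.25 = 0.625
```

The comparison shows the slide's point that a cheaper training device lowers the TER needed to break even; it leaves out the safety considerations, which the slides treat separately.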
89. Diminishing Returns
- Diminishing effectiveness of most training devices (as measured by TER) with increased training time
- i.e., TERs decrease with time in training
- Amount of training at which TER × TCR = 1 is point beyond which the training program is no longer cost effective
90. Picking the App to Train
- Large TCR indicates potential for simulation training
- Importance of relative cost of training program vs. training in environment
- Helicopter Deck Landing Simulator
- Developed at DRDC Toronto
91. Training System Fidelity
- Should training simulators resemble the real world as much as possible? NO.
- Why?
- Realistic simulators are expensive--added realism may add little to TER, but affects TCR
- e.g., plants in office situation
- If similarity does not achieve complete identity, may lead to negative transfer
- e.g., unrealistic motion in flight simulators does not help
- If high realism leads to high task complexity, may divert attention from critical skill to be learned
- e.g., hard to learn to drive a manual transmission in big city traffic
92. Capture Important Task Components
- Instead of total fidelity, need to understand which components of target task should be preserved in training situation or simulator
- Mission, task analyses useful
- e.g., sequence of steps that user has to perform
93. Gibson's Invariants in Simulators
- Evidence for usefulness of including perceptual invariants
- e.g., global optical flow in flight simulator
- Optical flow in driving simulator--heading of vehicle relative to vanishing point
- Sense of immersion does not require extremely high fidelity--task-related invariants are what is necessary
94. Types of Transfer
- Positive transfer--Training program and target task are highly similar
- Zero transfer--Extreme differences between program and task
- Negative transfer--Similar in some respects, different in others, leading to improper expectations
95. Types of Transfer
[Table: Stimulus Elements (Same, Different) crossed with Response Elements (Same, Different); one cell shows 0, other cell contents not recoverable.]
96. Negative Transfer
- When two situations have similar stimulus elements but different response or strategic components, transfer will be negative
- This is especially true if new and old responses are opposites (incompatible)
- Stick position for reverse gear varies across car manufacturers
- Task situation (parking, turning vehicle around), use of clutch, steering etc.: these characteristics will remain the same
- But motor response will be opposite (left vs. right)
97. Negative Transfer
- Negative transfer can be a serious concern for an operator who has to switch back and forth between two systems
- e.g.,
- Truck driver with two different gear arrangements
- Switching between applications: Using keyboard shortcuts with s/w that doesn't follow Windows conventions
- Mode errors with cellphones, cameras, Unix vi
- Number of aircraft a pilot can fly without going through special training
98. End