Title: OPERANT CONDITIONING
1. OPERANT CONDITIONING
- Traditional Learning
- Learning based on operating on the environment.
2. Operant Conditioning
- Learning based on:
- Stimulus → Response → Consequence
3. Responses in Operant Conditioning
- Responses are external
- Responses are active
- Responses are goal-oriented
- Responses are purposeful
- Responses are voluntary
- Responses operate on the environment
- Responses are made to gain a reward or avoid punishment
- Responses are initially brand new to the learner and must be learned
4. Early Work on Operant Conditioning
- B.F. Skinner used rats and pigeons in a specially designed Skinner Box, where the animals could learn to press a bar or peck a disk for food.
- Thorndike developed the Law of Effect.
- Consequences predict future responses.
5. Thorndike's Law of Effect
- If you do something (make a response) and the consequence is good, you're likely to make that response again.
- If you do something (make a response) and the consequence is bad, you're less likely to do it again.
6. Skinner's Box
- Stimulus: a bar to press for food
- Response: pressing the bar
- Consequence: a pellet of food
7. Acquiring Operantly Conditioned Responses
- Learner must initially be taught to make the new, external, voluntary, goal-oriented response.
- Learner will have to be focused on the consequence.
- Learner will have to be active.
- Both simple and complex responses can be learned through operant conditioning.
8. Response Acquisition: Learning a New Response
- Unique to operant conditioning, because classically conditioned responses involve involuntary and inborn responses
- Steps for teaching a learner a new response:
- Wait for the response to occur coincidentally
- Increase the learner's motivation
- Limit other possible actions
- Use shaping: rewarding successive approximations of the behavior you are attempting to shape.
9. Superstitious Behavior
- Misunderstanding which response is leading to a consequence
- Skinner's pigeons' misunderstanding:
- Turning in a circle and pecking the disk gets a pellet of food.
- Turning in a circle is not necessary; only pecking the disk is.
- Sports figures' misunderstanding:
- They must follow a certain ritual to perform well in a game.
10. Extinction
- To eliminate an operantly conditioned response:
- Eliminate the consequence
- Unique phenomenon of extinction in operant conditioning:
- The behavior/response increases before it decreases
11. Generalization
- Applying what you have learned to stimuli other than the one you learned on.
- Examples
- You can tell time on a clock other than the one you learned on.
- You can drive a car you have never driven.
- Generalization allows us to apply what we learn and interact easily with the environment.
12. Discrimination
- Making a specific response to a particular stimulus.
- Examples
- Putting a key in a door and turning the key and knob to get in, but remembering that this particular door sticks, so you must force it open with a shove.
- Discriminating how to start a particular car when that car has an anti-theft device (e.g., you must turn on the lights before turning on the car).
13. Discriminative Stimulus
- A stimulus that tells us:
- If a certain response will have the consequence we expect
- When a certain response is likely to have the consequence we expect
- Which response we should make to get the consequence we want
14. Discriminative Stimulus
- Unique to operant conditioning because it helps the learner decide which response to make.
- In classical conditioning the learner doesn't think about his/her responses; they are involuntary.
- Examples
- A light in the Skinner box tells the rat when to press the bar for food; when the light is off, there is no reward for the response.
- An "out of order" sign tells you not to put your money in a vending machine, because you will not get the consequence you want.
- A traffic sign or signal tells us to stop or go in order to get through the intersection safely.
15. Reinforcement
- Any consequence that increases a response in the future.
16. Primary & Secondary Reinforcement
- Primary
- Reinforcing in and of itself
- Examples
- Attention
- Fulfillment of a drive
- Secondary
- Used to get a more primary reinforcer
- Examples
- Money
- Academic grades: primary or secondary? Answer: secondary - used to get something more primary.
17. Schedules of Reinforcement
- So far we have assumed that we are reinforcing our learner for every response s/he makes. This is called continuous reinforcement.
- We can also reinforce our learner for only some responses. This is called partial reinforcement.
18. Ratio Schedules of Reinforcement
- The partial schedule we choose to use to reinforce our learner may be based on the number of responses the learner makes.
- This is called a ratio schedule of reinforcement.
- Example
- We give a rat a pellet every time it presses the bar 5 times.
19. Interval Schedules of Reinforcement
- The partial schedule we choose to use to reinforce our learner may require our learner to make one response within a certain period of time.
- This is called an interval schedule of reinforcement.
- Example
- We give a rat a pellet for pressing the bar once within each 2-minute interval.
20. Fixed and Variable Schedules
- Whether we are using a ratio or interval schedule to reinforce our learner, we can apply the schedule in a fixed or variable pattern.
- In a fixed schedule the ratio or interval always stays the same.
- In a variable schedule the ratio or interval varies for each learning trial.
21. Examples of the Schedules of Reinforcement
- Fixed Ratio
- The reinforcement will be given for every set number of responses, and that number will stay the same for each trial.
- Examples
- Getting paid by the unit
- You are given a bonus for every 3 health club memberships you sell.
22. More about Fixed Ratio Schedules
- These schedules tend to make the learner have a high response rate and feel in control of their reinforcements.
- The learner knows that the harder s/he works, the more reinforcements s/he will get.
- Reinforcement is based on the learner's performance.
23. Examples of the Schedules of Reinforcement
- Variable Ratio
- The reinforcement will be given based on the number of responses, but the number of responses needed to get a reinforcer will change with each trial.
- Examples
- Gambling on a slot machine.
- You win more for making more responses (putting more coins into the machine increases your chances of winning), but you don't know how many quarters will be required before you win.
- Selling real estate by commission
- The more properties you show, the greater your chances of making a sale.
24. More about Variable Ratio Schedules
- These schedules tend to produce the highest response rate of any schedule.
- The learner knows that the harder s/he works, the more reinforcements s/he will get.
- Reinforcement is based on the learner's performance, but some responses are reinforced while others are not.
25. Examples of the Schedules of Reinforcement
- Fixed Interval
- The reinforcement will be given for making one response (or a minimal response) within a set period of time.
- Examples
- Getting paid by the hour, week, month, or year.
- You get paid for each hour of work, as long as you are making at least a minimal effort to work.
26. More about Fixed Interval Schedules
- These schedules produce the lowest response rate of all the schedules.
- The learner knows that working harder will not lead to more reinforcement - so why work hard?
- Reinforcement does not come faster when the learner works faster.
- Reinforcement does not increase when the learner works harder.
27. Examples of the Schedules of Reinforcement
- Variable Interval
- The reinforcement will be given for making a minimal response within a varying period of time.
- Example
- Studying for a pop quiz.
- You do not know when the quiz is coming, but you know to study for it so that you can be ready when it does come.
28. More about Variable Interval Schedules
- The learner knows that working harder will not lead to more reinforcement - so why work hard?
- Reinforcement does not come faster when the learner works faster, and does not increase when the learner works harder.
- Why press an elevator button more than once when pressing it more (response) will not make the elevator arrive faster (consequence)?
- The learner does learn to make the response right after receiving a consequence, to let the teacher know, "I'm ready for another reinforcer as soon as you can give it."
29. Determining the Effectiveness of the Schedules of Reinforcement
- Both ratio schedules are more effective (produce more responses from the learner) than either interval schedule.
- Variable schedules are more effective than fixed schedules.
- Most effective schedules: variable ratio, then fixed ratio.
- Least effective schedule: fixed interval.
30. Punishment
- Any consequence that eliminates or decreases a response in the future.
31. Factors Influencing the Effectiveness of Punishment
- Timing
- Punishment should be given immediately following the response to be eliminated.
- Intensity
- Punishment should be intense: something the learner really dislikes.
- Consistency
- Punishment should be given each time the response is made.
32. Undesirable Effects of Punishment
- Primarily motivates learner to avoid punishment
- Behavior is suppressed, but not eliminated
- Learner does not unlearn the response
- No alternative behavior is learned
- May cause anger and aggression in learner
- May cause learner to stop making attempts to perform well
33. Two Types of Learning Based on Punishment
- Escape Learning
- Learning to make a response that allows you to escape from a punishment that has already begun.
- Stimulus (punishment) → Response → Consequence (punishment stops)
- Example
- A dog learns to jump a partition in its cage to get away from an electric shock.
34. Two Types of Learning Based on Punishment
- Avoidance Learning
- Learning to make a response that allows you to avoid being punished.
- Stimulus (signal of punishment) → Response → Consequence (punishment avoided)
- Example
- A dog learns to jump a partition in its cage when it hears a bell. This allows the dog to avoid an electric shock that will soon follow the bell.
35. Positive & Negative Reinforcement & Punishment
36. Positive vs. Negative
- In learning, positive means adding or giving something.
- In learning, negative means taking something away or removing something.
37. Definitions Revisited
- Following a response:
- Reinforcement increases the response in the future
- Punishment decreases the response in the future
38. Positive Reinforcement
- A consequence that gives or adds something to a situation in order to make the response it followed likely to increase in the future.
- The learner makes a response, and something is given so they will tend to repeat that response.
- Examples
- Giving praise
- Giving a reward
39. Negative Reinforcement
- A consequence that takes away something from a situation in order to make the response it followed likely to increase in the future.
- The learner makes a response, and something is taken away so they will tend to repeat that response.
- Examples
- Lifting a restriction: you may play after you do your homework.
- A's are exempt from the final: if you study hard enough to keep an A average, the final will be removed.
40. Positive Punishment
- A consequence that gives or adds something to a situation in order to make the response it followed likely to decrease in the future.
- The learner makes a response, and something is given so they will not tend to repeat that response.
- Examples
- A fine is imposed
- A spanking is given
41. Negative Punishment
- A consequence that takes away something from a situation in order to make the response it followed likely to decrease in the future.
- The learner makes a response, and something is taken away so they will not tend to repeat that response.
- Examples
- Taking away a privilege
- Grounding a child from an activity they enjoy
42. Summary of Operant Conditioning
- Responses learned are external, voluntary, and goal-oriented.
- Learner learns to make a brand new response.
- Association is made between a response and its consequence.
- Consequences are either reinforcing or punishing.
- Reinforcement increases a response.
- Punishment decreases a response.
- Learner is active and focused on the consequence.
- Law of Effect: consequences received can predict which responses will be made in the future.
43. Applications of Classical & Operant Conditioning
- Token Economy
- A secondary reinforcer (a token) is given for good behavior. The learner turns the token in for something more primary.
- Based on operant conditioning.
44. Applications of Classical & Operant Conditioning
- Time Out (from positive reinforcement)
- When a learner misbehaves, his/her positive reinforcement is removed.
- You must have been giving positive reinforcement, so that you can remove it when the learner misbehaves.
- Based on operant conditioning.
45. Applications of Classical & Operant Conditioning
- Flooding (Exposure)
- Used to remove fears
- The learner is flooded with whatever they fear
- Based on classical conditioning, because the responses being dealt with (fear or calm) are involuntary
46. Applications of Classical & Operant Conditioning
- Systematic Desensitization
- Used to remove fears
- A stimulus that causes fear is paired with relaxation
- The learner makes a hierarchy of steps involved in the fear
- Each step of the hierarchy is paired with relaxation, first cognitively (covert desensitization), and then in real life (in vivo)
- Based on classical conditioning, because the responses being dealt with (fear or calm) are involuntary
47. Cognitive Learning
- Latent Learning (Tolman)
- Hidden learning
- We can know how to do something, yet not show that we know it (not make the response)
- Observational Learning (Bandura)
- Learning from others' consequences
- Also called modeling or vicarious learning