Title: Introduction to Reinforcement Learning
1. Introduction to Reinforcement Learning
2. Overview
- General principles of RL
- Markov Decision Process as model
- Values of states V(s)
- Values of state-actions Q(a,s)
- Exploration vs. Exploitation
- Issues in RL
- Conclusion
3. General principles of RL
- Neural networks are supervised learning algorithms: for each input, we know the desired output.
- What if we don't know the output for each input?
- Flight control system example
- Let the agent learn how to achieve certain goals
itself, through interaction with the environment.
4. General principles of RL
- Let the agent learn how to achieve certain goals
itself, through interaction with the environment.
- This does not solve the problem!
5. Popular model: MDPs
- Markov Decision Process (S, A, R, T)
- Set of states S
- Set of actions A
- Reward function R
- Transition function T
- Markov property
  - $T_{ss'}$ depends only on s and s', not on the history of earlier states
- Policy π: S → A
- Problem: find the policy π that maximizes the reward
- Discounted reward: $r_0 + \gamma r_1 + \gamma^2 r_2 + \cdots + \gamma^n r_n$ (a code sketch of such an MDP follows below)
[Diagram: agent-environment loop; from state s0 the agent takes action a0, receives reward r0, and transitions to state s1, where it takes action a1, and so on.]
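A minimal sketch of what such an MDP might look like as a data structure, using a tiny invented three-state example; the states, rewards, transition probabilities, and policy below are made up for illustration, not taken from the slides:

```python
# Hypothetical 3-state MDP, written out explicitly for illustration.
states = ["s0", "s1", "s2"]                   # S
actions = ["left", "right"]                   # A
reward = {"s0": 0.0, "s1": 0.0, "s2": 1.0}    # R(s)

# T[(s, a)] maps to a dict {s': P(s' | s, a)} -- the Markov property:
# the distribution over s' depends only on s and a, not on earlier history.
T = {
    ("s0", "right"): {"s1": 0.9, "s0": 0.1},
    ("s0", "left"):  {"s0": 1.0},
    ("s1", "right"): {"s2": 0.9, "s1": 0.1},
    ("s1", "left"):  {"s0": 1.0},
    ("s2", "right"): {"s2": 1.0},
    ("s2", "left"):  {"s1": 1.0},
}

# A deterministic policy pi: S -> A
policy = {"s0": "right", "s1": "right", "s2": "right"}

# Discounted return of a reward sequence r0, r1, ..., rn
def discounted_return(rewards, gamma=0.9):
    return sum(gamma**t * r for t, r in enumerate(rewards))

print(discounted_return([0.0, 0.0, 1.0]))  # 0.81
```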
6. Values of states V^π(s)
- Definition of value V^π(s) (written out below)
  - The cumulative (discounted) reward when starting in state s and executing policy π until a terminal state is reached.
- The optimal policy yields V*(s)
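One standard way to write this definition, using the discount factor γ and the reward sequence from slide 5:

```latex
V^{\pi}(s) \;=\; \mathbb{E}\!\left[\, \sum_{t \ge 0} \gamma^{t}\, r_{t} \;\middle|\; s_0 = s,\; a_t = \pi(s_t) \right],
\qquad
V^{*}(s) \;=\; \max_{\pi} V^{\pi}(s)
```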
7. Determining V^π(s)
- Dynamic programming: $V(s) \leftarrow R(s) + \gamma \sum_{s'} T^{\pi(s)}_{ss'} V(s')$
  - Necessary to consider all states.
- TD-learning: $V(s) \leftarrow V(s) + \alpha\,(R(s) + \gamma V(s') - V(s))$
  - Only visited states are used (both updates are sketched in code below).
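A rough sketch of both update rules, reusing the toy MDP structures from the slide-5 sketch; the discount factor, learning rate, number of sweeps/episodes, and episode length are arbitrary illustrative choices:

```python
import random

GAMMA, ALPHA = 0.9, 0.1

# Dynamic programming: repeatedly sweep over ALL states, using the model T.
def dp_policy_evaluation(states, reward, T, policy, sweeps=100):
    V = {s: 0.0 for s in states}
    for _ in range(sweeps):
        for s in states:
            V[s] = reward[s] + GAMMA * sum(
                p * V[s2] for s2, p in T[(s, policy[s])].items())
    return V

# TD-learning: only states actually visited get updated; T is used here
# purely to simulate the environment, never inside the update itself.
def td_policy_evaluation(states, reward, T, policy, start, episodes=1000):
    V = {s: 0.0 for s in states}
    for _ in range(episodes):
        s = start
        for _ in range(20):                       # fixed-length episodes
            nxt = T[(s, policy[s])]
            s2 = random.choices(list(nxt), weights=list(nxt.values()))[0]
            V[s] += ALPHA * (reward[s] + GAMMA * V[s2] - V[s])
            s = s2
    return V
```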
8. Values of state-actions Q(a,s)
- Q-values Q(a,s): the value of doing action a in a certain state s.
- Dynamic programming: $Q(a,s) = R(s) + \gamma \sum_{s'} T_{ss'} \max_{a'} Q(a',s')$
- TD-learning:
  - $Q(a,s) \leftarrow Q(a,s) + \alpha\,(R(s) + \gamma \max_{a'} Q(a',s') - Q(a,s))$
  - T does not appear in this formula: model-free learning! (A code sketch follows below.)
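A tabular Q-learning sketch along the same lines; the learner only calls an opaque env_step function, so no transition model T appears in the update. The env_step name, α, γ, ε, episode counts, and fixed episode length are assumptions of this sketch, not from the slides:

```python
import random

def q_learning(states, actions, reward, env_step, start,
               episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1):
    """env_step(s, a) -> next state s'; the learner never looks inside it."""
    Q = {(a, s): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = start
        for _ in range(20):
            # epsilon-greedy action choice (see the next slide)
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda act: Q[(act, s)])
            s2 = env_step(s, a)
            best_next = max(Q[(a2, s2)] for a2 in actions)
            # Model-free update: only R(s), the sampled s2, and Q are used.
            Q[(a, s)] += alpha * (reward[s] + gamma * best_next - Q[(a, s)])
            s = s2
    return Q
```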
9. Exploration vs. Exploitation
- Only exploitation
- New (maybe better) paths never discovered
- Only exploration
- What is learned is never exploited
- Good trade-off
  - Explore first to learn, exploit later to benefit (one concrete scheme, ε-greedy, is sketched below)
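One common way to implement this trade-off is ε-greedy action selection with a decaying ε. The sketch below is a generic illustration; the function names, decay schedule, and constants are arbitrary choices, not from the slides:

```python
import random

def epsilon_greedy(Q, s, actions, epsilon):
    """With probability epsilon explore (random action), otherwise exploit."""
    if random.random() < epsilon:
        return random.choice(actions)              # exploration
    return max(actions, key=lambda a: Q[(a, s)])   # exploitation

def epsilon_schedule(episode, eps_start=1.0, eps_end=0.05, decay=0.995):
    """Explore a lot early on, exploit more and more as learning progresses."""
    return max(eps_end, eps_start * decay ** episode)
```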
10. Some issues
- Hidden state
  - If you don't know where you are, you can't know what to do.
- Curse of dimensionality
  - Very large state spaces.
- Continuous state/action spaces
  - The algorithms above use discrete tables. What about continuous values?
- Many of your articles discuss solutions to these problems.
11. Conclusion
- RL: learning through interaction and rewards.
- Markov Decision Process: a popular model
- Values of states V(s)
- Values of state-actions Q(a,s) (model-free!)
- Still some problems... not quite ready for complex real-world problems yet, but research is underway!
12. Literature
- Artificial Intelligence: A Modern Approach
  - Stuart Russell and Peter Norvig
- Machine Learning
  - Tom M. Mitchell
- Reinforcement Learning: A Tutorial
  - Mance E. Harmon and Stephanie S. Harmon