imported>rabierre
Revision as of 06:51, 5 August 2017
Reinforcement Learning
Lecture 5: Model Free Control
Video URL: https://www.youtube.com/watch?v=0g4j2k_Ggc4&t=2466s
- on-policy vs off-policy
- ε-Greedy
- Sarsa
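  - on-policy
  - Sarsa converges to the optimal action-value function under the following conditions:
    1. GLIE (Greedy in the Limit with Infinite Exploration) sequence of policies
    2. Robbins-Monro sequence of step sizes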
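
The two convergence conditions can be made concrete with a small tabular sketch. The code below is an illustration only, not taken from the lecture: the corridor environment (`step`, `N_STATES`) and every function name are hypothetical. It assumes ε_k = 1/k per episode as one possible GLIE schedule, and α = 1/N(s, a) as one possible Robbins-Monro step size (the sum of 1/n diverges, while the sum of 1/n² is finite).

```python
import random
from collections import defaultdict

N_STATES = 5        # hypothetical corridor: states 0..4, state 4 is terminal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: reward -1 per move, ends at the right end."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, -1.0, done


def epsilon_greedy(Q, state, epsilon, rng):
    """With probability epsilon act uniformly at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    # Break ties between equally good actions at random.
    return rng.choice([a for a in ACTIONS if Q[(state, a)] == best])


def sarsa(num_episodes=500, gamma=1.0, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)     # action-value table, initialised to 0
    visits = defaultdict(int)  # N(s, a), used for the step size
    for k in range(1, num_episodes + 1):
        epsilon = 1.0 / k      # assumed GLIE schedule: epsilon decays to 0
        state = 0
        action = epsilon_greedy(Q, state, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action)
            if done:
                next_action = None
                target = reward
            else:
                # On-policy: the next action comes from the same
                # epsilon-greedy policy that is being improved.
                next_action = epsilon_greedy(Q, next_state, epsilon, rng)
                target = reward + gamma * Q[(next_state, next_action)]
            visits[(state, action)] += 1
            alpha = 1.0 / visits[(state, action)]  # assumed Robbins-Monro step size
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state, action = next_state, next_action
    return Q


if __name__ == "__main__":
    Q = sarsa()
    for s in range(N_STATES - 1):
        print(s, [round(Q[(s, a)], 2) for a in ACTIONS])
```

Because the update target uses the action actually selected in the next state, the sketch is on-policy; replacing `Q[(next_state, next_action)]` with a maximum over actions would turn it into an off-policy method instead.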