Machine Learning Study/2017/Reinforcement Learning/: Difference between revisions

From ZeroWiki
imported>rabierre
No edit summary

Revision as of 06:51, 5 August 2017

Reinforcement Learning

Lecture 5: Model Free Control

Video URL: https://www.youtube.com/watch?v=0g4j2k_Ggc4&t=2466s

  • on policy vs off policy
  • ε-Greedy
  • Sarsa
    • on policy
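    • Sarsa converges to the optimal action-value function under the following conditions:
    1. GLIE (Greedy in the Limit with Infinite Exploration) sequence of policies
    2. Robbins-Monro sequence of step sizes α_t: Σ α_t = ∞ and Σ α_t² < ∞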
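
The two convergence conditions can be made concrete with a small tabular Sarsa sketch. Nothing below comes from the notes themselves: the corridor environment (`step`), the exploration schedule ε_k = 1/k (one common way to get a GLIE sequence of ε-greedy policies), and the step size α = 1/N(s, a) (one choice that satisfies the Robbins-Monro conditions) are all assumptions picked for illustration.

```python
import random

LEFT, RIGHT = 0, 1


def step(state, action, n_states):
    """Hypothetical corridor: every move costs -1, the rightmost state is terminal."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == n_states - 1
    return next_state, -1.0, done


def epsilon_greedy(q, state, epsilon, rng):
    """Random action with probability epsilon, otherwise greedy with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice((LEFT, RIGHT))
    best = max(q[state])
    return rng.choice([a for a in (LEFT, RIGHT) if q[state][a] == best])


def sarsa(n_states=5, episodes=300, gamma=1.0, max_steps=1000, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]     # Q(s, a) table
    visits = [[0, 0] for _ in range(n_states)]    # N(s, a) counters
    for k in range(1, episodes + 1):
        epsilon = 1.0 / k                         # assumed GLIE schedule: epsilon -> 0
        state = 0
        action = epsilon_greedy(q, state, epsilon, rng)
        for _ in range(max_steps):                # cap is only a runtime safety guard
            next_state, reward, done = step(state, action, n_states)
            # On-policy: the bootstrap action is drawn from the same policy being followed.
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            target = reward if done else reward + gamma * q[next_state][next_action]
            visits[state][action] += 1
            alpha = 1.0 / visits[state][action]   # assumed Robbins-Monro step size
            q[state][action] += alpha * (target - q[state][action])
            if done:
                break
            state, action = next_state, next_action
    return q
```

Calling `sarsa()` returns the learned table; taking the argmax of each row gives the greedy policy. Swapping `epsilon` for a constant would break the GLIE condition, and swapping `alpha` for a constant would break the Robbins-Monro condition, so this sketch is a convenient place to try both and compare.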