머신러닝스터디/2017 (Machine Learning Study/2017)

From ZeroWiki
imported>rabierre
Latest revision as of 14:01, 26 March 2026

[[pagelist(^(머신러닝스터디/2017/))]]

  • CNN, Artistic style
  • Reinforcement learning, game play

Reinforcement Learning

https://en.wikipedia.org/wiki/Bellman_equation

Planning vs Learning

Planning

  • The model of the environment is known
  • Dynamic Programming
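
As a minimal sketch of planning with a known model, the following runs value iteration, one dynamic programming method based on the Bellman optimality equation. The two-state MDP `P`, its action names, and the discount `GAMMA` are hypothetical values invented for illustration; they do not come from the study notes.

```python
# Hypothetical 2-state MDP, invented for illustration.
# P[state][action] is a list of (probability, next_state, reward) tuples.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 0, 0.0)]},
}
GAMMA = 0.9  # assumed discount factor


def value_iteration(P, gamma, tol=1e-8):
    """Apply the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Expected return of each action under the known model,
            # then take the best action.
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


V = value_iteration(P, GAMMA)
print(V)
```

The update needs the full transition table `P`, which is what "the model is known" means here: the agent plans by computation alone, without interacting with the environment.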

Learning

  • Model-free
  • Monte Carlo method, Temporal Difference learning

Monte-Carlo Reinforcement Learning

  • Learns directly from experience
  • Model-free: the MDP transitions and rewards do not need to be known
  • Learns from complete (terminated) episodes
  • Can only be applied to episodic MDPs
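
The points above can be sketched as first-visit Monte Carlo policy evaluation, one common variant (the notes do not say which variant the study used). The function name `first_visit_mc`, the episode format, and the sample episodes are assumptions made for illustration.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate V(s) as the mean return following the first visit to s.

    Each episode is a complete list of (state, reward) pairs, where
    reward is the reward received on leaving that state.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # The return is only defined once the episode has ended,
        # so compute it by scanning backwards from the final step.
        G = 0.0
        returns = [0.0] * len(episode)
        for t in range(len(episode) - 1, -1, -1):
            _, reward = episode[t]
            G = reward + gamma * G
            returns[t] = G
        # Record the return only at the first visit to each state.
        seen = set()
        for t, (state, _) in enumerate(episode):
            if state not in seen:
                seen.add(state)
                returns_sum[state] += returns[t]
                returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Hypothetical terminated episodes, invented for illustration.
episodes = [
    [("A", 0.0), ("B", 1.0)],
    [("A", 0.0), ("A", 0.0), ("B", 0.0)],
    [("B", 1.0)],
]
V = first_visit_mc(episodes)
print(V)
```

No transition probabilities appear anywhere in the code: the estimate is built purely from sampled episodes, and each episode must be finished before its returns can be computed.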


Temporal-Difference Learning

  • Learns from experience
  • Model-free
  • Can also learn from incomplete (unfinished) episodes (bootstrapping)
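
A minimal sketch of the simplest temporal-difference method, TD(0), shows what bootstrapping means in practice. The helper `td0_update`, the step size `alpha`, and the sample transitions are assumptions made for illustration.

```python
from collections import defaultdict


def td0_update(V, state, reward, next_state, done, alpha=0.1, gamma=1.0):
    """Update V[state] in place from a single observed transition."""
    # Bootstrapping: the target uses the current estimate V[next_state]
    # instead of waiting for the actual return at the end of the episode.
    target = reward + (0.0 if done else gamma * V[next_state])
    V[state] += alpha * (target - V[state])
    return V


V = defaultdict(float)  # all value estimates start at 0

# Hypothetical transitions, invented for illustration.
td0_update(V, "A", 0.0, "B", done=False)  # V["B"] is still 0, so V["A"] stays 0
td0_update(V, "B", 1.0, None, done=True)  # terminal step: V["B"] moves toward 1
td0_update(V, "A", 0.0, "B", done=False)  # V["A"] now moves toward V["B"]
print(dict(V))
```

Each update needs only one transition, so learning can proceed step by step inside an episode that has not finished, or in a task that never terminates.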