<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://mediawiki.zeropage.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=39.7.51.92</id>
	<title>ZeroWiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://mediawiki.zeropage.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=39.7.51.92"/>
	<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php/Special:Contributions/39.7.51.92"/>
	<updated>2026-05-15T20:55:45Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.39.8</generator>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49120</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49120"/>
		<updated>2017-07-01T04:28:32Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
# Supervised learning&lt;br /&gt;
# Unsupervised learning&lt;br /&gt;
# Reinforcement learning&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* + Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
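The Bellman equation and value/policy iteration above can be made concrete with a minimal sketch; the 3-state MDP below is a hypothetical example for illustration, not material from the session:

```python
# Value iteration on a tiny hypothetical MDP (illustration only).
# States 0..2; action 1 moves toward state 2, which is terminal
# and pays reward 1. gamma is the discount factor.
gamma = 0.9
n_states = 3
# transitions[s][a] = (next_state, reward); actions: 0=left, 1=right
transitions = [
    [(0, 0.0), (1, 0.0)],
    [(0, 0.0), (2, 1.0)],
    [(2, 0.0), (2, 0.0)],  # terminal: self-loop, no reward
]

V = [0.0] * n_states
for _ in range(50):  # enough synchronous sweeps to converge here
    # Bellman optimality backup: V(s) = max_a [ r + gamma * V(s') ]
    V = [max(r + gamma * V[s2] for (s2, r) in transitions[s])
         for s in range(n_states)]

# Greedy policy: the action maximizing the Bellman backup at each state
policy = [max(range(2), key=lambda a: transitions[s][a][1]
              + gamma * V[transitions[s][a][0]])
          for s in range(n_states)]
print(V, policy)
```

On this toy chain the values settle quickly (state 1 is worth 1, state 0 is worth gamma times that), and the greedy policy heads right toward the reward.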
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
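As a bridge to the files above, here is a minimal sketch of plain tabular Q-learning, the method that cartpole.py then re-implements with a neural network; the 5-state corridor environment is a hypothetical stand-in, not CartPole itself:

```python
import random

# Tabular Q-learning on a toy 5-state corridor (a hypothetical stand-in
# for CartPole; the NN versions replace this Q table with a network).
# Action 1 moves right, action 0 moves left; reaching state 4 ends the
# episode with reward 1.
random.seed(0)
n_states, n_actions = 5, 2
alpha, gamma = 0.5, 0.9   # learning rate, discount factor

Q = [[0.0, 0.0] for _ in range(n_states)]

def step(s, a):
    s2 = min(n_states - 1, s + 1) if a == 1 else max(0, s - 1)
    done = s2 == n_states - 1
    return s2, (1.0 if done else 0.0), done

for _ in range(200):                      # episodes
    s, done = 0, False
    while not done:
        if random.randrange(10) == 0:     # explore ~10% of the time
            a = random.randrange(n_actions)
        else:                             # exploit: greedy, random tie-break
            best = max(Q[s])
            a = random.choice([i for i in range(n_actions) if Q[s][i] == best])
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max Q(s', .)
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print([round(max(q), 3) for q in Q])      # state values grow toward the goal
```

The learned table prefers "right" everywhere, with values discounted by distance from the goal; swapping the table lookup for a network prediction gives the q-network variant.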
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49119</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49119"/>
		<updated>2017-07-01T04:28:13Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
# Supervised learning&lt;br /&gt;
# Unsupervised learning&lt;br /&gt;
# Reinforcement learning&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49118</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49118"/>
		<updated>2017-07-01T04:28:05Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
Types&lt;br /&gt;
# Supervised learning&lt;br /&gt;
# Unsupervised learning&lt;br /&gt;
# Reinforcement learning&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49117</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49117"/>
		<updated>2017-07-01T04:27:50Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
* Types&lt;br /&gt;
# Supervised learning&lt;br /&gt;
# Unsupervised learning&lt;br /&gt;
# Reinforcement learning&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49116</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49116"/>
		<updated>2017-07-01T04:26:20Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49115</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49115"/>
		<updated>2017-07-01T04:19:26Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49114</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49114"/>
		<updated>2017-07-01T04:15:47Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49113</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49113"/>
		<updated>2017-07-01T04:15:29Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49111</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49111"/>
		<updated>2017-07-01T04:09:37Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
 Presentation slides: [http://slides.com/rabierre/deck slide]&lt;br /&gt;
 Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49110</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49110"/>
		<updated>2017-07-01T04:09:28Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
 Presentation slides: [http://slides.com/rabierre/deck slide]&lt;br /&gt;
 Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49109</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49109"/>
		<updated>2017-07-01T04:09:09Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
 [http://slides.com/rabierre/deck slide]&lt;br /&gt;
 [https://github.com/Rabierre/cartpole cartpole code in github]&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
</feed>