<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://mediawiki.zeropage.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=39.7.51.92</id>
	<title>ZeroWiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://mediawiki.zeropage.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=39.7.51.92"/>
	<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php/Special:Contributions/39.7.51.92"/>
	<updated>2026-05-15T20:55:45Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.39.8</generator>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49120</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49120"/>
		<updated>2017-07-01T04:28:32Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
# Supervised learning&lt;br /&gt;
# Unsupervised learning&lt;br /&gt;
# Reinforcement learning&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* + Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
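The Bellman equation and value/policy iteration above can be made concrete with a minimal sketch; the 3-state MDP below is a hypothetical example for illustration, not material from the session:

```python
# Value iteration on a tiny hypothetical MDP (illustration only).
# States 0..2; action 1 moves toward state 2, which is terminal
# and pays reward 1. gamma is the discount factor.
gamma = 0.9
n_states = 3
# transitions[s][a] = (next_state, reward); actions: 0=left, 1=right
transitions = [
    [(0, 0.0), (1, 0.0)],
    [(0, 0.0), (2, 1.0)],
    [(2, 0.0), (2, 0.0)],  # terminal: self-loop, no reward
]

V = [0.0] * n_states
for _ in range(50):  # enough synchronous sweeps to converge here
    # Bellman optimality backup: V(s) = max_a [ r + gamma * V(s') ]
    V = [max(r + gamma * V[s2] for (s2, r) in transitions[s])
         for s in range(n_states)]

# Greedy policy: the action maximizing the Bellman backup at each state
policy = [max(range(2), key=lambda a: transitions[s][a][1]
              + gamma * V[transitions[s][a][0]])
          for s in range(n_states)]
print(V, policy)
```

On this toy chain the values settle quickly (state 1 is worth 1, state 0 is worth gamma times that), and the greedy policy heads right toward the reward.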
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
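As a bridge to the files above, here is a minimal sketch of plain tabular Q-learning, the method that cartpole.py then re-implements with a neural network; the 5-state corridor environment is a hypothetical stand-in, not CartPole itself:

```python
import random

# Tabular Q-learning on a toy 5-state corridor (a hypothetical stand-in
# for CartPole; the NN versions replace this Q table with a network).
# Action 1 moves right, action 0 moves left; reaching state 4 ends the
# episode with reward 1.
random.seed(0)
n_states, n_actions = 5, 2
alpha, gamma = 0.5, 0.9   # learning rate, discount factor

Q = [[0.0, 0.0] for _ in range(n_states)]

def step(s, a):
    s2 = min(n_states - 1, s + 1) if a == 1 else max(0, s - 1)
    done = s2 == n_states - 1
    return s2, (1.0 if done else 0.0), done

for _ in range(200):                      # episodes
    s, done = 0, False
    while not done:
        if random.randrange(10) == 0:     # explore ~10% of the time
            a = random.randrange(n_actions)
        else:                             # exploit: greedy, random tie-break
            best = max(Q[s])
            a = random.choice([i for i in range(n_actions) if Q[s][i] == best])
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max Q(s', .)
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print([round(max(q), 3) for q in Q])      # state values grow toward the goal
```

The learned table prefers "right" everywhere, with values discounted by distance from the goal; swapping the table lookup for a network prediction gives the q-network variant.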
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49119</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49119"/>
		<updated>2017-07-01T04:28:13Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
# Supervised learning&lt;br /&gt;
# Unsupervised learning&lt;br /&gt;
# Reinforcement learning&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49118</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49118"/>
		<updated>2017-07-01T04:28:05Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
Types&lt;br /&gt;
# Supervised learning&lt;br /&gt;
# Unsupervised learning&lt;br /&gt;
# Reinforcement learning&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49117</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49117"/>
		<updated>2017-07-01T04:27:50Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
* Types&lt;br /&gt;
# Supervised learning&lt;br /&gt;
# Unsupervised learning&lt;br /&gt;
# Reinforcement learning&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49116</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49116"/>
		<updated>2017-07-01T04:26:20Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
* Paper: [https://arxiv.org/abs/1312.5602 Playing Atari with Deep Reinforcement Learning]&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49115</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49115"/>
		<updated>2017-07-01T04:19:26Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
* Required libraries: numpy, gym, tensorflow&lt;br /&gt;
   $ pip install gym&lt;br /&gt;
   $ pip install tensorflow&lt;br /&gt;
&lt;br /&gt;
# Let&#039;s run cartpole! - cartpole_init.py&lt;br /&gt;
# cartpole with random actions (left, right) - cartpole_random.py&lt;br /&gt;
# q-network (the NN version of q-learning) - cartpole.py&lt;br /&gt;
# DQN - cartpole_dqn.py&lt;br /&gt;
# DQN published by DeepMind in 2015 - cartpole_dqn2015.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49114</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49114"/>
		<updated>2017-07-01T04:15:47Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;br /&gt;
== Closing remarks ==&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49113</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49113"/>
		<updated>2017-07-01T04:15:29Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
* Presentation slides: [https://slides.com/rabierre/playing_a_game_with_rl slide]&lt;br /&gt;
* Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49111</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49111"/>
		<updated>2017-07-01T04:09:37Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
 Presentation slides: [http://slides.com/rabierre/deck slide]&lt;br /&gt;
 Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49110</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49110"/>
		<updated>2017-07-01T04:09:28Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
 Presentation slides: [http://slides.com/rabierre/deck slide]&lt;br /&gt;
 Code: [https://github.com/Rabierre/cartpole github]&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
	<entry>
		<id>https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49109</id>
		<title>데블스캠프2017/강화학습으로컴퓨터에게고전게임플레이시키기</title>
		<link rel="alternate" type="text/html" href="https://mediawiki.zeropage.org/index.php?title=%EB%8D%B0%EB%B8%94%EC%8A%A4%EC%BA%A0%ED%94%842017/%EA%B0%95%ED%99%94%ED%95%99%EC%8A%B5%EC%9C%BC%EB%A1%9C%EC%BB%B4%ED%93%A8%ED%84%B0%EC%97%90%EA%B2%8C%EA%B3%A0%EC%A0%84%EA%B2%8C%EC%9E%84%ED%94%8C%EB%A0%88%EC%9D%B4%EC%8B%9C%ED%82%A4%EA%B8%B0&amp;diff=49109"/>
		<updated>2017-07-01T04:09:09Z</updated>

		<summary type="html">&lt;p&gt;39.7.51.92: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= machine learning =&lt;br /&gt;
&lt;br /&gt;
== supervised learning ==&lt;br /&gt;
* Training data comes with correct answers (labels)&lt;br /&gt;
* Need input, target&lt;br /&gt;
* Learning from difference between prediction and target&lt;br /&gt;
* e.g. mnist, classification&lt;br /&gt;
&lt;br /&gt;
== unsupervised learning ==&lt;br /&gt;
* Labels are not given in advance&lt;br /&gt;
* Need input&lt;br /&gt;
* Cluster by distance between inputs&lt;br /&gt;
* Can&#039;t predict outcome&lt;br /&gt;
* e.g. clustering&lt;br /&gt;
== reinforcement learning ==&lt;br /&gt;
* A kind of unsupervised learning (no labeled answers)&lt;br /&gt;
* input: environment, reward; output: action&lt;br /&gt;
* Learns by trial and error&lt;br /&gt;
* Model-free&lt;br /&gt;
* e.g. game play, stock trading&lt;br /&gt;
== deep reinforcement learning ==&lt;br /&gt;
* Q learning&lt;br /&gt;
* Neural Network&lt;br /&gt;
* DQN : Deep Q Learning&lt;br /&gt;
== Basic knowledge ==&lt;br /&gt;
* MDP : Markov Decision Process&lt;br /&gt;
* Bellman equation&lt;br /&gt;
* Dynamic programming&lt;br /&gt;
* Value, Policy&lt;br /&gt;
* Value function, Policy function&lt;br /&gt;
* Value iteration, Policy iteration&lt;br /&gt;
&lt;br /&gt;
== Hands-on ==&lt;br /&gt;
Requires numpy, gym, tensorflow&lt;br /&gt;
&lt;br /&gt;
* cartpole_init.py&lt;br /&gt;
* cartpole_random.py&lt;br /&gt;
* cartpole.py&lt;br /&gt;
* cartpole_dqn.py&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== reference ==&lt;br /&gt;
 [http://slides.com/rabierre/deck slide]&lt;br /&gt;
 [https://github.com/Rabierre/cartpole cartpole code in github]&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>39.7.51.92</name></author>
	</entry>
</feed>