Machine Learning Study (머신러닝스터디)/2016/2016 09 03

From ZeroWiki

Latest revision as of 00:44, 27 March 2026

Contents

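  • Project
    • Kaggle Digit Recognizer
    • Column 0 of the train data holds the y value (label), so train has 785 columns; test comes without y values and has 784 columns
    • The data needs preprocessing
    • Since we were not yet familiar with the pandas library, each of us looked into ways of handling the input data
  • We separated column 0 from the rest, but training accuracy stayed around 10%
    • In other words, the model learned nothing (guessing at random among the digits 0-9 is right with probability 1/10)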
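
The column layout described above can be checked on a small stand-in table before touching the real files. The sketch below is only an illustration: the 5-row DataFrame is made up, and the `pixelN` column names are an assumption (only `label` appears in the session's code).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train.csv: 5 rows, 'label' followed by 784 pixel columns.
# The pixelN column names are an assumption made for illustration.
rng = np.random.default_rng(0)
pixel_columns = [f"pixel{i}" for i in range(28 * 28)]
train = pd.DataFrame(rng.integers(0, 256, size=(5, 28 * 28)), columns=pixel_columns)
train.insert(0, "label", rng.integers(0, 10, size=5))

print(train.shape)  # 1 label column + 784 pixel columns

# Separate the label column (y) from the pixel columns (X).
y = train["label"].to_numpy()
X = train.drop(columns="label").to_numpy()

print(X.shape, y.shape)
```

Checking the shapes right after the split is a cheap way to confirm that the label column really was removed from the features.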

Code

import pandas as pd
import keras
from sklearn.model_selection import train_test_split  # replaces the removed sklearn.cross_validation module
from keras.utils import to_categorical

# train.csv has a 'label' column plus 784 pixel columns; test.csv has no labels
train = pd.read_csv("../input/train.csv")
test  = pd.read_csv("../input/test.csv")  # loaded but not used below

# Separate the label column (y) from the pixel columns (X)
y_train = train['label'].to_numpy()  # to_numpy() replaces the removed as_matrix()
X_train = train.drop('label', axis=1).to_numpy()

# Hold out 30% of the labelled rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(X_train, y_train, test_size=0.30)

model = keras.models.Sequential()

model.add(keras.layers.Dense(64, input_dim=28*28, activation='relu'))
model.add(keras.layers.Dropout(0.5))

model.add(keras.layers.Dense(32, activation='relu'))
model.add(keras.layers.Dropout(0.5))

model.add(keras.layers.Dense(16, activation='relu'))
model.add(keras.layers.Dropout(0.5))

# One output per digit class (0-9)
model.add(keras.layers.Dense(10, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adagrad', metrics=['accuracy'])
model.fit(X_train, to_categorical(y_train, 10), epochs=5, batch_size=600)  # epochs replaces the old nb_epoch argument

# Returns [loss, accuracy] on the held-out rows
score = model.evaluate(X_test, to_categorical(y_test, 10), batch_size=700)

print(score)

# Class probabilities predicted for the first held-out sample, then its true label
print(model.predict(X_test)[0])
print(y_test[0])
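
The notes do not record why accuracy stayed at chance level. One commonly cited cause with raw image data is feeding unscaled 0-255 pixel values into the network; that is an assumption here, not something the session confirmed. The sketch below shows what scaling to [0, 1] could look like, plus a way to turn a softmax output row into a single digit for comparison with `y_test[0]`. The helper names `scale_pixels` and `predicted_digit` and the example arrays are hypothetical.

```python
import numpy as np

def scale_pixels(X):
    """Convert raw 0-255 pixel intensities to float32 values in [0, 1]."""
    return np.asarray(X, dtype="float32") / 255.0

def predicted_digit(probabilities):
    """Return the class index with the highest predicted probability."""
    return int(np.argmax(probabilities))

# Made-up pixel values covering the darkest, a middle, and the brightest intensity.
raw_pixels = np.array([[0, 128, 255]])
scaled = scale_pixels(raw_pixels)
print(scaled)

# Hypothetical softmax output for one sample (10 classes, sums to 1).
example_probs = np.array([0.01, 0.02, 0.80, 0.02, 0.03, 0.02, 0.03, 0.02, 0.03, 0.02])
print(predicted_digit(example_probs))  # index of the largest entry
```

If this were tried in the code above, the same scaling would have to be applied to both `X_train` and `X_test` before calling `fit` and `evaluate`.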

Next time

  • Continue with the Kaggle digits project next session as well...