2019/1/11

26 Convolutional Neural Network (CNN) Implementation

🔶 CNN with TensorFlow
Importing the MNIST dataset through the tensorflow.examples.tutorials.mnist module produces function-deprecation warnings, so here we borrow the keras module to load the dataset instead; this approach was already used in the earlier unit introducing Keras.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from keras.datasets import mnist

(X_train, y_train_label), (X_test, y_test_label) = mnist.load_data()
print(X_train.shape)
print(X_test.shape)
print(y_train_label.shape)
print(y_test_label.shape)
(60000, 28, 28)
(10000, 28, 28)
(60000,)
(10000,)
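For comparison, the now-deprecated route mentioned above went through the tensorflow.examples.tutorials.mnist module; a typical call (a sketch only, with an illustrative data directory, not used in this post) looks like this:
# old TF 1.x tutorial loader; emits deprecation warnings in newer releases
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)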
Next, we do some preprocessing on the dataset:
from keras.utils.np_utils import to_categorical
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0
X_train = np.reshape(X_train, [-1,28,28,1])
X_test = np.reshape(X_test, [-1,28,28,1])
y_train = to_categorical(y_train_label)
y_test= to_categorical(y_test_label)
print(X_train.shape)
print(X_test.shape)
print(y_train[10])
(60000, 28, 28, 1)
(10000, 28, 28, 1)
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
The dataset is now loaded. Because the input fed into conv2d must be 4-dimensional, a final dimension of size 1 is appended to the arrays to represent a single channel, and the labels must be converted to one-hot encoding. For example, y_train_label[10] is 3, so y_train[10] becomes the vector printed above with a 1 in position 3.
tf.reset_default_graph()
learning_rate = 0.001
n_epochs = 10
batch_size = 100
train_batches = int(X_train.shape[0]/batch_size)
test_batches = int(X_test.shape[0]/batch_size)
x = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
y = tf.placeholder(tf.float32, shape=[None, 10])
dataset = tf.data.Dataset.from_tensor_slices((x,y)).batch(
    batch_size).repeat().shuffle(200)
iter = dataset.make_initializable_iterator()
feature, label = iter.get_next()
net1 = tf.layers.conv2d(feature, filters = 32, kernel_size = 4, 
                        padding = "same", activation = tf.nn.relu) 
net1_pool = tf.layers.max_pooling2d(net1, pool_size = 2, 
                                    strides = 2, padding = "same")
net2 = tf.layers.conv2d(net1_pool, filters = 64, kernel_size = 4, 
                        padding = "same", activation = tf.nn.relu) 
net2_pool = tf.layers.max_pooling2d(net2, pool_size = 2, 
                                    strides = 2, padding = "same") 
net3 = tf.layers.dense(tf.layers.flatten(net2_pool), 1024, activation=tf.nn.relu)
model = tf.layers.dense(net3, 10, activation = tf.nn.softmax)
entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=model,
                                            labels = tf.stop_gradient(label))
loss = tf.reduce_mean(entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
The placeholders x and y are defined and then fed into the Dataset. With a batch size of 100, each epoch needs 600 batches. An initializable iterator is declared and get_next() is used to pull out feature and label, so each time feature and label are evaluated, 100 samples are drawn from the dataset for training. The input and output tensor sizes of the layers are as follows:
  • net1: $(100,28,28,1)\rightarrow (100,28,28,32)$
  • net1_pool: $(100,28,28,32)\rightarrow (100,14,14,32)$
  • net2: $(100,14,14,32)\rightarrow (100,14,14,64)$
  • net2_pool: $(100,14,14,64)\rightarrow (100,7,7,64)$
  • net3: $(100,28,28,1)$ — no, input is $(100,7,7,64) \xrightarrow{\text{flatten}} (100, 7\times 7\times 64) \rightarrow (100,1024)$
  • model: $(100,1024)\rightarrow (100,10)$
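These sizes follow directly from the layer settings: a convolution with padding = "same" and stride 1 preserves the spatial size, while a 2×2 max pool with stride 2 halves it (rounding up), i.e. $H_{\text{out}} = \lceil H_{\text{in}}/\text{stride} \rceil$, so $28 \rightarrow 14 \rightarrow 7$; flattening then gives $7 \times 7 \times 64 = 3136$ features going into the 1024-unit dense layer.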
The feature drawn from the dataset is fed into net1, while the label is fed into softmax_cross_entropy_with_logits_v2; passing it as labels = tf.stop_gradient(label) means no gradient is back-propagated through the labels. AdamOptimizer is used as the optimizer.
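Note that model above already ends in a softmax activation, while softmax_cross_entropy_with_logits_v2 expects raw logits. A common variant (a minimal sketch, not the code used in this example) keeps the last dense layer linear and applies softmax only where probabilities are needed:
logits = tf.layers.dense(net3, 10)            # no activation: raw logits
entropy = tf.nn.softmax_cross_entropy_with_logits_v2(
    logits=logits, labels=tf.stop_gradient(label))
probabilities = tf.nn.softmax(logits)         # only needed when predicting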
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(iter.initializer, feed_dict={ x: X_train, y: y_train})
    for epoch in range(n_epochs):
        total_loss = 0.0
        for batch in range(train_batches):
            batch_loss,_ = sess.run([loss, optimizer])
            total_loss += batch_loss 
        average_loss = total_loss / train_batches
        print("Epoch: {0:04d}   loss = {1:0.6f}".format(epoch,average_loss))
    print("Model Trained.")
    total_accu = 0.0
    sess.run(iter.initializer, feed_dict={ x: X_test, y: y_test})    
    for batch in range(test_batches):                                                                                      
        predictions_check = tf.equal(tf.argmax(model,1),tf.argmax(label,1))
        accuracy = tf.reduce_mean(tf.cast(predictions_check, tf.float32))
        batch_accu = sess.run(accuracy)
        total_accu += batch_accu
    accu = total_accu/test_batches
    print("Accuracy:", accu)
Epoch: 0000   loss = 1.590750
Epoch: 0001   loss = 1.481860
Epoch: 0002   loss = 1.476062
Epoch: 0003   loss = 1.474154
Epoch: 0004   loss = 1.472385
Epoch: 0005   loss = 1.471157
Epoch: 0006   loss = 1.470972
Epoch: 0007   loss = 1.469852
Epoch: 0008   loss = 1.470761
Epoch: 0009   loss = 1.469699
Model Trained.
Accuracy: 0.9873000103235244
Note that although tf.Variable is not used explicitly in this example, the layers all contain built-in Variables, so the variables still have to be initialized before running. After iter.initializer is run, every time feature and label are evaluated a batch of samples is drawn from the dataset, which avoids having to feed data with feed_dict during model training. When handling the test data, however, the data fed in must be switched to the test set, so after training we re-initialize the dataset, binding the test-set data to the x and y placeholders; the test batch size is likewise set to 100 in this example. Of course all 10,000 test samples could be fed in at once, but that takes much more memory; on a machine with little memory (mine has only 4 GB) this raises an out-of-memory error, whereas feeding 100 samples at a time through the dataset avoids it.
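As a side note, the predictions_check and accuracy ops above are rebuilt inside the test loop on every batch; a sketch of the same evaluation (assuming the same graph and session as above, not the original code) that defines them once beforehand would be:
# build the accuracy ops once, before iterating over test batches
predictions_check = tf.equal(tf.argmax(model, 1), tf.argmax(label, 1))
accuracy = tf.reduce_mean(tf.cast(predictions_check, tf.float32))
sess.run(iter.initializer, feed_dict={x: X_test, y: y_test})
total_accu = 0.0
for batch in range(test_batches):
    total_accu += sess.run(accuracy)   # each run consumes the next test batch
print("Accuracy:", total_accu / test_batches)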

Because this example uses the layers in the tf.layers module, there is no need to create the weight variables ourselves, which makes it feel a bit like Keras and keeps things concise. If the functions in the tf.nn module were used instead, the weight and bias variables would have to be defined explicitly, which is more involved.
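For reference, here is a minimal sketch (not part of the example above) of what the first convolution and pooling block would look like with the tf.nn functions, where the kernel and bias variables must be created by hand:
# kernel: 4x4, 1 input channel, 32 output channels
w1 = tf.Variable(tf.truncated_normal([4, 4, 1, 32], stddev=0.1))
b1 = tf.Variable(tf.zeros([32]))
conv1 = tf.nn.relu(tf.nn.bias_add(
    tf.nn.conv2d(feature, w1, strides=[1, 1, 1, 1], padding="SAME"), b1))
pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1],
                       strides=[1, 2, 2, 1], padding="SAME")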

🔶 CNN with Keras
We now use a Keras CNN model to train on the MNIST dataset. Loading the dataset and the related reshaping are the same as in the previous example; the code is as follows:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from keras.datasets import mnist
tf.reset_default_graph()
(X_train, y_train_label), (X_test, y_test_label) = mnist.load_data()
print(X_train.shape)
print(X_test.shape)
print(y_train_label.shape)
print(y_test_label.shape)
from keras.utils.np_utils import to_categorical
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0
X_train = np.reshape(X_train, [-1,28,28,1])
X_test = np.reshape(X_test, [-1,28,28,1])
y_train = to_categorical(y_train_label)
y_test= to_categorical(y_test_label)
print(X_train.shape)
print(X_test.shape)
print(y_train[10])
With that, the data to be fed into the training model is ready.
import keras
from keras.models import Sequential
from keras.layers import Conv2D,MaxPooling2D, Dense, Flatten, Reshape
from keras.optimizers import SGD
tf.reset_default_graph()
keras.backend.clear_session()
n_filters=[32,64]
n_classes = 10  # 0-9 digits
n_width = 28
n_height = 28
n_depth = 1
n_inputs = n_height * n_width * n_depth  # total pixels
learning_rate = 0.01
n_epochs = 10
batch_size = 100
model = Sequential()
model.add(Dense(1, input_shape=(n_width,n_height,n_depth)))
model.add(Conv2D(filters=n_filters[0], 
                 kernel_size=4, 
                 padding='SAME', 
                 activation='relu' 
                ) 
         )
model.add(MaxPooling2D(pool_size=(2,2), 
                       strides=(2,2) 
                      ) 
         )
model.add(Conv2D(filters=n_filters[1], 
                 kernel_size=4, 
                 padding='SAME', 
                 activation='relu', 
                ) 
         )
model.add(MaxPooling2D(pool_size=(2,2), 
                       strides=(2,2) 
                      ) 
         )
model.add(Flatten())
model.add(Dense(units=1024, activation='relu'))
model.add(Dense(units=n_classes, activation='softmax'))
model.summary()
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 28, 28, 1)         2         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 28, 28, 32)        544       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 14, 14, 64)        32832     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 7, 7, 64)          0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 3136)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1024)              3212288   
_________________________________________________________________
dense_3 (Dense)              (None, 10)                10250     
=================================================================
Total params: 3,255,916
Trainable params: 3,255,916
Non-trainable params: 0
The Dense(1) layer at the top of the model only serves to tell Keras the input shape, which is what allows model.summary() to print the model's parameters. The parameters of the other layers are the same as in the TensorFlow model of the previous example.
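A more conventional way to declare the input shape (a sketch, not the code used above) is to pass input_shape directly to the first Conv2D layer and drop the Dense(1) stub, which also removes its two extra parameters:
model = Sequential()
model.add(Conv2D(filters=n_filters[0], kernel_size=4, padding='SAME',
                 activation='relu',
                 input_shape=(n_width, n_height, n_depth)))
# ... remaining layers as before ...
The compile and fit steps below are unchanged either way.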
model.compile(loss='categorical_crossentropy',
              optimizer=SGD(lr=learning_rate),
              metrics=['accuracy'])
model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=n_epochs)
score = model.evaluate(X_test, y_test)
print('\nTest loss:', score[0])
print('Test accuracy:', score[1])
Epoch 1/10
60000/60000 [===================] - 111s 2ms/step - loss: 1.4023 - acc: 0.6004
Epoch 2/10
60000/60000 [===================] - 109s 2ms/step - loss: 0.2317 - acc: 0.9296
Epoch 3/10
60000/60000 [===================] - 109s 2ms/step - loss: 0.1434 - acc: 0.9570
Epoch 4/10
60000/60000 [===================] - 113s 2ms/step - loss: 0.1056 - acc: 0.9681
Epoch 5/10
60000/60000 [===================] - 110s 2ms/step - loss: 0.0841 - acc: 0.9738
Epoch 6/10
60000/60000 [===================] - 113s 2ms/step - loss: 0.0707 - acc: 0.9786
Epoch 7/10
60000/60000 [===================] - 111s 2ms/step - loss: 0.0611 - acc: 0.9817
Epoch 8/10
60000/60000 [===================] - 112s 2ms/step - loss: 0.0555 - acc: 0.9828
Epoch 9/10
60000/60000 [===================] - 111s 2ms/step - loss: 0.0485 - acc: 0.9853 
Epoch 10/10
60000/60000 [===================] - 111s 2ms/step - loss: 0.0449 - acc: 0.9865
10000/10000 [===================] - 6s 563us/step

Test loss: 0.0475634125239565
Test accuracy: 0.9846
Finally, the output above shows the results. From this experiment we can see that Keras is more convenient to use than TensorFlow, mainly in how the data samples are fed in: with tf we have to set up a Dataset and draw a batch of samples from it to feed the model at each training step, whereas in Keras the model handles this itself; we do not need to set anything up, since Keras automatically takes a batch of the given size from the input tensors for each training step.

References
Armando Fandango, Mastering TensorFlow 1.x, Packt Publishing, 2018
