🔶 CNN with TensorFlow

Importing the MNIST dataset through the tensorflow.examples.tutorials.mnist module raises function-deprecation warnings, so here we borrow the keras module to load the dataset instead. This approach was already used in the earlier Keras introduction unit.
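For reference only, the older loader that produces those deprecation warnings looks roughly like this (a sketch, not part of the original post; the data directory name is just an illustrative choice):

```python
# Deprecated TF1 loader, shown only for comparison; it still runs but warns.
from tensorflow.examples.tutorials.mnist import input_data

mnist_data = input_data.read_data_sets("MNIST_data/", one_hot=True)
print(mnist_data.train.images.shape)  # typically (55000, 784); 5000 images are held out for validation
```

The Keras-based loading used here instead is: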
```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from keras.datasets import mnist

(X_train, y_train_label), (X_test, y_test_label) = mnist.load_data()
print(X_train.shape)
print(X_test.shape)
print(y_train_label.shape)
print(y_test_label.shape)
```
```
(60000, 28, 28)
(10000, 28, 28)
(60000,)
(10000,)
```

Next, we preprocess the dataset:
```python
from keras.utils.np_utils import to_categorical

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0
X_train = np.reshape(X_train, [-1, 28, 28, 1])
X_test = np.reshape(X_test, [-1, 28, 28, 1])
y_train = to_categorical(y_train_label)
y_test = to_categorical(y_test_label)
print(X_train.shape)
print(X_test.shape)
print(y_train[10])
```
```
(60000, 28, 28, 1)
(10000, 28, 28, 1)
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```

The dataset is now loaded. Because the input to conv2d must be 4-dimensional, we add a final dimension of size 1 to the array to represent a single channel, and the labels must be converted to one-hot encoding.
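As a side note, here is a minimal NumPy sketch of what to_categorical produces; the np.eye indexing is my own illustration, not part of the original code:

```python
# Equivalent one-hot encoding with plain NumPy (illustration only)
y_train_onehot = np.eye(10)[y_train_label]   # shape (60000, 10)
print(y_train_onehot[10])                    # same one-hot row as y_train[10] above
```

With the data ready, the graph can now be built: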
```python
tf.reset_default_graph()

learning_rate = 0.001
n_epochs = 10
batch_size = 100
train_batches = int(X_train.shape[0] / batch_size)
test_batches = int(X_test.shape[0] / batch_size)

x = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
y = tf.placeholder(tf.float32, shape=[None, 10])

dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size).repeat().shuffle(200)
iter = dataset.make_initializable_iterator()
feature, label = iter.get_next()

net1 = tf.layers.conv2d(feature, filters=32, kernel_size=4,
                        padding="same", activation=tf.nn.relu)
net1_pool = tf.layers.max_pooling2d(net1, pool_size=2, strides=2, padding="same")
net2 = tf.layers.conv2d(net1_pool, filters=64, kernel_size=4,
                        padding="same", activation=tf.nn.relu)
net2_pool = tf.layers.max_pooling2d(net2, pool_size=2, strides=2, padding="same")
net3 = tf.layers.dense(tf.layers.flatten(net2_pool), 1024, activation=tf.nn.relu)
model = tf.layers.dense(net3, 10, activation=tf.nn.softmax)

entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=model,
                                                     labels=tf.stop_gradient(label))
loss = tf.reduce_mean(entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
```

The placeholders x and y are defined and then fed into the dataset. The batch size is set to 100, so each epoch draws 600 batches (60000 / 100). An initializable iterator is declared, and get_next() returns feature and label; each time these tensors are evaluated, 100 samples are pulled from the dataset for training. The input and output tensor sizes of the layers are as follows (a short shape check appears after the list):
- net1: $(100,28,28,1)\rightarrow (100,28,28,32)$
- net1_pool: $(100,28,28,32)\rightarrow (100,14,14,32)$
- net2: $(100,14,14,32)\rightarrow (100,14,14,64)$
- net2_pool: $(100,14,14,64)\rightarrow (100,7,7,64)$
- net3: $(100,7,7,64) \mbox{ flatten } (100, 64\times 7\times 7) \rightarrow (100,1024)$
- model: $(100,1024)\rightarrow (100,10)$
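If desired, these shapes can be verified directly from the graph; the following check is my own addition, not part of the original code:

```python
# Print the static output shapes listed above; the batch dimension shows as ?
# because the placeholders use None for the batch size.
for t in [net1, net1_pool, net2, net2_pool, net3, model]:
    print(t.name, t.shape)
```

With the graph defined, training proceeds inside a Session: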
```python
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(iter.initializer, feed_dict={x: X_train, y: y_train})
    for epoch in range(n_epochs):
        total_loss = 0.0
        for batch in range(train_batches):
            batch_loss, _ = sess.run([loss, optimizer])
            total_loss += batch_loss
        average_loss = total_loss / train_batches
        print("Epoch: {0:04d} loss = {1:0.6f}".format(epoch, average_loss))
    print("Model Trained.")

    total_accu = 0.0
    sess.run(iter.initializer, feed_dict={x: X_test, y: y_test})
    for batch in range(test_batches):
        predictions_check = tf.equal(tf.argmax(model, 1), tf.argmax(label, 1))
        accuracy = tf.reduce_mean(tf.cast(predictions_check, tf.float32))
        batch_accu = sess.run(accuracy)
        total_accu += batch_accu
    accu = total_accu / test_batches
    print("Accuracy:", accu)
```
```
Epoch: 0000 loss = 1.590750
Epoch: 0001 loss = 1.481860
Epoch: 0002 loss = 1.476062
Epoch: 0003 loss = 1.474154
Epoch: 0004 loss = 1.472385
Epoch: 0005 loss = 1.471157
Epoch: 0006 loss = 1.470972
Epoch: 0007 loss = 1.469852
Epoch: 0008 loss = 1.470761
Epoch: 0009 loss = 1.469699
Model Trained.
Accuracy: 0.9873000103235244
```

Note the call to tf.global_variables_initializer(): although this example never creates a tf.Variable explicitly, the layers contain built-in Variables, so they still have to be initialized before running. The next sess.run initializes the iterator; from then on, every time feature and label are evaluated, one batch of samples is pulled from the dataset, so we avoid feeding data with feed_dict during training. When handling the test data, however, the samples being drawn must come from the test set, so after training we re-initialize the iterator with the test-set data bound to the x and y placeholders; the test batch size is likewise set to 100. We could of course feed all 10,000 test samples at once, but that takes a lot of memory; on a machine with only 4 GB of RAM, like mine, it raises an out-of-memory error, whereas drawing 100 samples at a time through the dataset avoids the problem.
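One small refinement worth noting (my suggestion, not from the original post): the tf.equal / tf.reduce_mean ops above are rebuilt inside the test loop on every batch, which keeps adding nodes to the graph. They can be defined once and reused; a sketch, assumed to run inside the same Session block:

```python
# Build the accuracy ops a single time, then evaluate them per test batch.
predictions_check = tf.equal(tf.argmax(model, 1), tf.argmax(label, 1))
accuracy = tf.reduce_mean(tf.cast(predictions_check, tf.float32))

total_accu = 0.0
sess.run(iter.initializer, feed_dict={x: X_test, y: y_test})
for batch in range(test_batches):
    total_accu += sess.run(accuracy)
print("Accuracy:", total_accu / test_batches)
```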
This example uses layers from the tf.layers module, so we never have to define weight variables ourselves; the code stays concise and feels a bit like using Keras. If we used the functions in the tf.nn module instead, we would have to define the weight and bias variables by hand, which is more involved.
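For comparison, here is a minimal sketch (my own, reusing the same feature tensor as above) of how the first convolution block would look with tf.nn, where the weight and bias Variables must be created explicitly:

```python
# First conv + pool block written with tf.nn instead of tf.layers (illustration only)
W1 = tf.Variable(tf.truncated_normal([4, 4, 1, 32], stddev=0.1))  # 4x4 kernel, 1 -> 32 channels
b1 = tf.Variable(tf.zeros([32]))
conv1 = tf.nn.relu(tf.nn.conv2d(feature, W1, strides=[1, 1, 1, 1], padding="SAME") + b1)
pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
```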
🔶 CNN with Keras
Next we use a Keras CNN model to train on the MNIST dataset. Loading the dataset and the related reshaping are the same as in the previous example; the code is as follows:
```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from keras.datasets import mnist

tf.reset_default_graph()

(X_train, y_train_label), (X_test, y_test_label) = mnist.load_data()
print(X_train.shape)
print(X_test.shape)
print(y_train_label.shape)
print(y_test_label.shape)

from keras.utils.np_utils import to_categorical

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0
X_train = np.reshape(X_train, [-1, 28, 28, 1])
X_test = np.reshape(X_test, [-1, 28, 28, 1])
y_train = to_categorical(y_train_label)
y_test = to_categorical(y_test_label)
print(X_train.shape)
print(X_test.shape)
print(y_train[10])
```

This prepares the data that will be fed into the training model.
```python
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Reshape
from keras.optimizers import SGD

tf.reset_default_graph()
keras.backend.clear_session()

n_filters = [32, 64]
n_classes = 10          # 0-9 digits
n_width = 28
n_height = 28
n_depth = 1
n_inputs = n_height * n_width * n_depth  # total pixels
learning_rate = 0.01
n_epochs = 10
batch_size = 100

model = Sequential()
model.add(Dense(1, input_shape=(n_width, n_height, n_depth)))
model.add(Conv2D(filters=n_filters[0], kernel_size=4, padding='SAME', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(filters=n_filters[1], kernel_size=4, padding='SAME', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Flatten())
model.add(Dense(units=1024, activation='relu'))
model.add(Dense(units=n_classes, activation='softmax'))
model.summary()
```
```
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 28, 28, 1)         2
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 28, 28, 32)        544
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 14, 14, 64)        32832
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 7, 7, 64)          0
_________________________________________________________________
flatten_1 (Flatten)          (None, 3136)              0
_________________________________________________________________
dense_2 (Dense)              (None, 1024)              3212288
_________________________________________________________________
dense_3 (Dense)              (None, 10)                10250
=================================================================
Total params: 3,255,916
Trainable params: 3,255,916
Non-trainable params: 0
```

The first Dense layer's only job is to tell Keras the input shape, which is what allows model.summary() to print the model's parameters. The other layers use the same parameters as the TensorFlow model above; for example, conv2d_1 has 4 × 4 × 1 × 32 + 32 = 544 parameters.
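As an aside, a variant of my own (not from the original post): the Dense(1) stub can be dropped by declaring the input shape on the first Conv2D layer directly, which also removes those two extra parameters:

```python
# Alternative first layer: pass input_shape to Conv2D instead of a Dense(1) stub
model = Sequential()
model.add(Conv2D(filters=n_filters[0], kernel_size=4, padding='SAME',
                 activation='relu', input_shape=(n_width, n_height, n_depth)))
```

Returning to the post's model, we compile and train it: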
```python
model.compile(loss='categorical_crossentropy',
              optimizer=SGD(lr=learning_rate),
              metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=batch_size, epochs=n_epochs)
score = model.evaluate(X_test, y_test)
print('\nTest loss:', score[0])
print('Test accuracy:', score[1])
```
```
Epoch 1/10
60000/60000 [===================] - 111s 2ms/step - loss: 1.4023 - acc: 0.6004
Epoch 2/10
60000/60000 [===================] - 109s 2ms/step - loss: 0.2317 - acc: 0.9296
Epoch 3/10
60000/60000 [===================] - 109s 2ms/step - loss: 0.1434 - acc: 0.9570
Epoch 4/10
60000/60000 [===================] - 113s 2ms/step - loss: 0.1056 - acc: 0.9681
Epoch 5/10
60000/60000 [===================] - 110s 2ms/step - loss: 0.0841 - acc: 0.9738
Epoch 6/10
60000/60000 [===================] - 113s 2ms/step - loss: 0.0707 - acc: 0.9786
Epoch 7/10
60000/60000 [===================] - 111s 2ms/step - loss: 0.0611 - acc: 0.9817
Epoch 8/10
60000/60000 [===================] - 112s 2ms/step - loss: 0.0555 - acc: 0.9828
Epoch 9/10
60000/60000 [===================] - 111s 2ms/step - loss: 0.0485 - acc: 0.9853
Epoch 10/10
60000/60000 [===================] - 111s 2ms/step - loss: 0.0449 - acc: 0.9865
10000/10000 [===================] - 6s 563us/step
Test loss: 0.0475634125239565
Test accuracy: 0.9846
```

This is the final output. From this experiment we can see that Keras is more convenient to use than TensorFlow, mainly in how the training samples are fed in: with tf we have to set up a dataset and draw batch-size chunks of samples from it for each training step, whereas in Keras the model does this work itself, i.e., Keras automatically draws batch-size samples from the tensors we pass to fit() for training.
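As a quick usage check (my addition, not in the original post), predictions for a few test images can be compared against their true labels:

```python
# Predict the first five test digits and compare with the true labels (illustration only)
pred = model.predict(X_test[:5])
print(np.argmax(pred, axis=1))   # predicted digits
print(y_test_label[:5])          # true digits
```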