Project 04: A First Look at Convolutional Neural Networks - Fashion MNIST

1. Introduction to Convolutional Neural Networks

Convolutional neural networks (CNNs) grew out of work by Yann LeCun, Wei Zhang, Alexander Waibel, and others. As one of the most representative deep learning algorithms for image processing, CNNs appear in a great many image recognition applications, such as face recognition, human pose estimation, license plate recognition, and ID document recognition.

1.2. Multilayer Perceptrons and Convolutional Neural Networks

In the previous chapter we built a handwritten digit recognition model with a multilayer perceptron. Its input layer took 28 × 28 = 784 neurons, one per pixel, which means we squashed a multi-dimensional array into a flat vector and threw away the image's spatial locality. A convolutional neural network, in contrast, can extract features while preserving that spatial information.
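
As a toy illustration (not from the original project code), flattening a 28 × 28 image pushes vertically adjacent pixels 28 positions apart, so a plain MLP has no way to know they were neighbours:

import numpy as np

# Fill a 28x28 "image" with its own flat indices so positions are easy to track
img = np.arange(28 * 28).reshape(28, 28)
flat = img.reshape(784)
r, c = 10, 10
print(np.where(flat == img[r, c])[0][0])      # 290: position of pixel (10, 10)
print(np.where(flat == img[r + 1, c])[0][0])  # 318: its vertical neighbour is 28 slots away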

1.3. The Structure of a Convolutional Neural Network

1.3.1 Feature Extraction and Computation in the Convolution Layer

A convolution operation uses a square matrix of size F × F, called a filter (or convolution kernel); its size is also known as the receptive field. The filter's depth d matches the depth of the input layer, giving a filter of size F × F × d; from a mathematical point of view, this is d matrices of size F × F. In practice, different models choose different numbers of filters, denoted K. Each of the K filters contains d matrices of size F × F and produces one output feature map.

(figure: the convolution operation)
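
As a minimal sketch of the sliding-window computation (a toy single-channel example with a hypothetical helper name, not the project's actual code):

import numpy as np

def conv2d_single(x, w):
    # Cross-correlate a (H, W) input with an (F, F) filter: stride 1, no padding
    H, W = x.shape
    F = w.shape[0]
    out = np.zeros((H - F + 1, W - F + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise product of the filter and the current window, then sum
            out[i, j] = np.sum(x[i:i+F, j:j+F] * w)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)  # a toy 4x4 "image"
w = np.array([[1., 0.], [0., -1.]])           # a toy 2x2 filter
print(conv2d_single(x, w).shape)              # (3, 3): (4 - 2 + 1) on each side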

1.3.2 Nonlinear Activation Functions

After each convolution, a nonlinear activation function is applied element-wise; without one, stacked convolutions would collapse into a single linear transformation. This project uses ReLU, relu(x) = max(0, x), throughout.
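
A minimal sketch of ReLU, assuming only NumPy:

import numpy as np

def relu(x):
    # ReLU keeps positive values and clips negatives to zero
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]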

2. LeNet-5 in Detail

To preserve the full information in the data, we abandon the flattened 784-element input used with the multilayer perceptron and feed the network a multi-dimensional array instead. The input format is 28 × 28 × 1: a single-channel image of 28 × 28 pixels.

(figure: lenet-5.png, the LeNet-5 architecture)
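
A handy way to sanity-check the layer shapes that model.summary() prints later in section 4.2: for input width W, filter size F, padding P, and stride S, the output width is (W - F + 2P)/S + 1 (padding='valid' means P = 0, and the same formula covers pooling). A small illustrative helper, not part of the original code:

def conv_output_size(w, f, p=0, s=1):
    # One spatial dimension: (W - F + 2P) / S + 1
    return (w - f + 2 * p) // s + 1

print(conv_output_size(28, 5))       # 24: first 5x5 'valid' convolution
print(conv_output_size(24, 2, s=2))  # 12: 2x2 max pooling halves each side
print(conv_output_size(12, 5))       # 8: second 5x5 'valid' convolution
print(conv_output_size(8, 2, s=2))   # 4: second pooling, giving 4x4x16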

3. Fashion MNIST

3.1 A Clothing Classification Dataset

The Fashion-MNIST training set contains 60,000 grayscale images of size 28x28 across 10 fashion categories; the test set contains another 10,000 images. It is designed as a drop-in, more challenging successor to MNIST. The 10 class labels are:

Label  Class
0      T-shirt/top
1      Trouser
2      Pullover
3      Dress
4      Coat
5      Sandal
6      Shirt
7      Sneaker
8      Bag
9      Ankle boot

Since Fashion-MNIST and MNIST share the same data format, many of the routines we wrote for MNIST can be reused here.

3.2 Downloading and Using the Dataset

The first time fashion_mnist.load_data() is called, Keras checks whether the dataset is already cached in the user directory and downloads it if not, so the first run may take a while.

# Import the required packages
import numpy as np
import pandas as pd
from keras.utils import np_utils
from keras.datasets import fashion_mnist
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
import keras

# Download (or load) the dataset
(X_train_image,y_train_label),(X_test_image,y_test_label) = fashion_mnist.load_data()
# Map numeric labels to class names to make samples easier to inspect
CLASSES_NAME = ['短袖圆领T恤', '裤子', '套头衫', '连衣裙', '外套',
              '凉鞋', '衬衫', '帆布鞋','包', '短靴']
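
A quick way to confirm the download worked is to check the array shapes (expected values per section 3.1 shown in the comments):

# Sanity-check the loaded arrays
print(X_train_image.shape, y_train_label.shape)  # (60000, 28, 28) (60000,)
print(X_test_image.shape, y_test_label.shape)    # (10000, 28, 28) (10000,)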

3.3 Exploring the Fashion-MNIST Dataset

font_zh = FontProperties(fname='./fz.ttf')
# Display a single image together with its label
def show_image(images, labels, idx, alias=[]):
    plt.imshow(images[idx], cmap='binary')
    if alias:
        # Show the human-readable class name from alias
        plt.xlabel(str(alias[labels[idx]]), fontproperties=font_zh, fontsize=15)
    else:
        plt.xlabel('label:'+str(labels[idx]), fontsize=15)
    plt.show()

# Display a grid of images with labels (and optionally predictions)
def show_images_set(images,labels,prediction,idx,num=15, alias=[]):
    fig = plt.gcf()
    fig.set_size_inches(14, 14)
    for i in range(0,num):
        color = 'black'
        tag = ''
        ax = plt.subplot(5,5,1+i)
        ax.imshow(images[idx],cmap='binary')
        if len(alias)>0:
            title = str(alias[labels[idx]])
        else:
            title = "label:"+str(labels[idx])
        if len(prediction)>0:
            # Mark wrong predictions in red with an x
            if prediction[idx] != labels[idx]:
                color = 'red'
                tag = '×'
            if alias:
                title += "("+str(alias[prediction[idx]])+")" + tag
            else:
                title += ",predict="+str(prediction[idx])
        ax.set_title(title, fontproperties=font_zh, fontsize=13, color=color)
        ax.set_xticks([])
        ax.set_yticks([])
        idx += 1
    plt.show()

We use show_images_set to display training samples. The prediction argument takes a set of predicted labels (left empty for now), idx is the index to start from, and num (default 15) is how many samples to show.

show_images_set(images=X_train_image, labels=y_train_label, prediction=[], idx=10, alias=CLASSES_NAME)

(figure: 15 training samples shown with their class names)

4. Recognizing the Fashion MNIST Dataset

4.1 Preprocessing the Data

import numpy as np
from keras.utils import np_utils
from keras.datasets import fashion_mnist
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D,Activation

# Load the dataset
(X_train_image,y_train_label),(X_test_image,y_test_label) = fashion_mnist.load_data()
# Reshape the images into 4D tensors: (samples, height, width, channels)
x_Train4D = X_train_image.reshape(X_train_image.shape[0],28,28,1).astype('float32')
x_Test4D = X_test_image.reshape(X_test_image.shape[0],28,28,1).astype('float32')
# Normalize pixel values to [0, 1]
x_Train4D_normalize = x_Train4D / 255
x_Test4D_normalize = x_Test4D / 255
# One-hot encode the labels
y_TrainOneHot = np_utils.to_categorical(y_train_label)
y_TestOneHot = np_utils.to_categorical(y_test_label)
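
To see what the one-hot encoding does, here is a tiny illustrative example:

# Label 3 (out of 10 classes) becomes a length-10 vector with a 1 at index 3
print(np_utils.to_categorical([3, 0], num_classes=10))
# [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
#  [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]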

# Model and training hyperparameters
# Number of classes
CLASSES_NB = 10
# Input shape of the model
INPUT_SHAPE = (28,28,1)
# Fraction of the training data held out for validation
VALIDATION_SPLIT = 0.2
# Number of training epochs
EPOCH = 20
# Samples per batch
BATCH_SIZE = 300
# Verbosity of the training log
VERBOSE = 1
# Map numeric labels to class names to make samples easier to inspect
CLASSES_NAME = ['T恤', '裤子', '套头衫', '连衣裙', '外套',
              '凉鞋', '衬衫', '帆布鞋','包', '短靴']

4.2 Building and Training LeNet-5

model = Sequential()
model.add(Conv2D(filters=6,
                 kernel_size=(5,5),
                 strides=(1,1),
                 input_shape=(28,28,1),
                 padding='valid',
                 kernel_initializer='uniform'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(16,
                 kernel_size=(5,5),
                 strides=(1,1),
                 padding='valid',
                 kernel_initializer='uniform'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten())
model.add(Dense(120))
model.add(Activation('relu'))
model.add(Dense(84))
model.add(Activation('relu'))
model.add(Dense(CLASSES_NB))
model.add(Activation('softmax'))
model.compile(optimizer='sgd',loss='categorical_crossentropy',metrics=['accuracy'])
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 24, 24, 6)         156       
_________________________________________________________________
activation_1 (Activation)    (None, 24, 24, 6)         0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 6)         0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 8, 8, 16)          2416      
_________________________________________________________________
activation_2 (Activation)    (None, 8, 8, 16)          0         
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 16)          0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 256)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 120)               30840     
_________________________________________________________________
activation_3 (Activation)    (None, 120)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 84)                10164     
_________________________________________________________________
activation_4 (Activation)    (None, 84)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                850       
_________________________________________________________________
activation_5 (Activation)    (None, 10)                0         
=================================================================
Total params: 44,426
Trainable params: 44,426
Non-trainable params: 0
_________________________________________________________________
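
The parameter counts above can be verified by hand: a Conv2D layer has (kernel_h × kernel_w × input_channels) × filters weights plus one bias per filter, and a Dense layer has inputs × units weights plus one bias per unit. A quick arithmetic check:

print(5 * 5 * 1 * 6 + 6)       # 156    -> conv2d_1
print(5 * 5 * 6 * 16 + 16)     # 2416   -> conv2d_2
print(4 * 4 * 16 * 120 + 120)  # 30840  -> dense_1
print(120 * 84 + 84)           # 10164  -> dense_2
print(84 * 10 + 10)            # 850    -> dense_3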

The model architecture diagram is shown below.

(figure: model architecture diagram)

train_history = model.fit(x=x_Train4D_normalize,
                         y=y_TrainOneHot,validation_split=VALIDATION_SPLIT,
                         epochs=EPOCH,batch_size=BATCH_SIZE,verbose=VERBOSE)
Train on 48000 samples, validate on 12000 samples
Epoch 1/20
48000/48000 [==============================] - 24s 493us/step - loss: 2.2980 - acc: 0.2162 - val_loss: 2.2907 - val_acc: 0.2464
Epoch 2/20
48000/48000 [==============================] - 18s 372us/step - loss: 2.2520 - acc: 0.2860 - val_loss: 2.1341 - val_acc: 0.3353
Epoch 3/20
48000/48000 [==============================] - 18s 384us/step - loss: 1.4939 - acc: 0.5165 - val_loss: 1.1224 - val_acc: 0.5860
Epoch 4/20
48000/48000 [==============================] - 18s 370us/step - loss: 0.9720 - acc: 0.6404 - val_loss: 0.8758 - val_acc: 0.6878
Epoch 5/20
48000/48000 [==============================] - 19s 406us/step - loss: 0.8543 - acc: 0.6768 - val_loss: 0.8099 - val_acc: 0.6692
Epoch 6/20
48000/48000 [==============================] - 19s 402us/step - loss: 0.7894 - acc: 0.6999 - val_loss: 0.7352 - val_acc: 0.7049
Epoch 7/20
48000/48000 [==============================] - 19s 401us/step - loss: 0.7420 - acc: 0.7169 - val_loss: 0.7282 - val_acc: 0.7208
Epoch 8/20
48000/48000 [==============================] - 21s 441us/step - loss: 0.7191 - acc: 0.7273 - val_loss: 0.7015 - val_acc: 0.7499
Epoch 9/20
48000/48000 [==============================] - 19s 399us/step - loss: 0.6791 - acc: 0.7438 - val_loss: 0.6914 - val_acc: 0.7366
Epoch 10/20
48000/48000 [==============================] - 20s 416us/step - loss: 0.6633 - acc: 0.7480 - val_loss: 0.6355 - val_acc: 0.7583
Epoch 11/20
48000/48000 [==============================] - 21s 434us/step - loss: 0.6444 - acc: 0.7558 - val_loss: 0.6164 - val_acc: 0.7598
Epoch 12/20
48000/48000 [==============================] - 23s 470us/step - loss: 0.6275 - acc: 0.7613 - val_loss: 0.6047 - val_acc: 0.7627
Epoch 13/20
48000/48000 [==============================] - 19s 403us/step - loss: 0.6096 - acc: 0.7700 - val_loss: 0.5954 - val_acc: 0.7729
Epoch 14/20
48000/48000 [==============================] - 20s 422us/step - loss: 0.5949 - acc: 0.7754 - val_loss: 0.5819 - val_acc: 0.7839
Epoch 15/20
48000/48000 [==============================] - 21s 428us/step - loss: 0.5877 - acc: 0.7797 - val_loss: 0.5731 - val_acc: 0.7841
Epoch 16/20
48000/48000 [==============================] - 20s 418us/step - loss: 0.5787 - acc: 0.7823 - val_loss: 0.5626 - val_acc: 0.7956
Epoch 17/20
48000/48000 [==============================] - 21s 434us/step - loss: 0.5674 - acc: 0.7869 - val_loss: 0.5659 - val_acc: 0.7989
Epoch 18/20
48000/48000 [==============================] - 21s 441us/step - loss: 0.5669 - acc: 0.7883 - val_loss: 0.5529 - val_acc: 0.7982
Epoch 19/20
48000/48000 [==============================] - 20s 427us/step - loss: 0.5497 - acc: 0.7938 - val_loss: 0.5640 - val_acc: 0.7896
Epoch 20/20
48000/48000 [==============================] - 20s 417us/step - loss: 0.5529 - acc: 0.7919 - val_loss: 0.5366 - val_acc: 0.8019

4.3 Training Curves and Model Evaluation

Once training has finished, we examine the results.

# Plot a metric from the training history for the train and validation sets
def show_train_history(train_history,train,validation):
    plt.plot(train_history.history[train])
    plt.plot(train_history.history[validation])
    plt.title('Train history')
    plt.ylabel(train)
    plt.xlabel('Epoch')
    plt.legend(['train','validation'],loc='upper left')
    plt.show()
show_train_history(train_history,'acc','val_acc')

(figure: training vs. validation accuracy)

show_train_history(train_history,'loss','val_loss')

(figure: training vs. validation loss)

scores = model.evaluate(x_Test4D_normalize, y_TestOneHot)
print(scores[1])
10000/10000 [==============================] - 2s 239us/step
0.8073
# Save the trained model (architecture and weights)
model.save('mnist_model_v2.h5')

4.4 Visualizing the Convolution Outputs

To see more clearly how the convolutions transform an image, we render the intermediate layer outputs as pictures.

# Load the model saved above
from keras.models import load_model
model_v2 = load_model('mnist_model_v2.h5')
from keras.models import Model
# Return the output of the named layer for the given input data
def get_layer_output(model, layer_name, data_set):
    try:
        out = model.get_layer(layer_name).output
    except Exception:
        raise Exception('Error layer named {}!'.format(layer_name))

    # Build a sub-model that ends at the requested layer and run it
    layer_model = Model(inputs=model.inputs, outputs=out)
    return layer_model.predict(data_set)
# Display every channel of a layer's output as a grid of images
def show_layer_output(imgs, r=1, c=7):
    fig = plt.gcf()
    fig.set_size_inches(12, 14)
    channels = imgs.shape[2]
    for i in range(channels):
        plt.subplot(r, c, i + 1)
        plt.imshow(imgs[:, :, i])
    plt.show()

Here we take one sample from the test set (index 1, the same sample whose layer outputs are extracted below) and trace how each layer of the network transforms it.

show_image(X_test_image, y_test_label, 1, CLASSES_NAME)

(figure: test sample 1 shown with its class name)

# Intermediate outputs of the first convolution block for test sample 1
# (use the normalized images, matching what the model was trained on)
conv2d_1 = get_layer_output(model_v2, "conv2d_1", x_Test4D_normalize)[1]
activation_1 = get_layer_output(model_v2, "activation_1", x_Test4D_normalize)[1]
max_pooling2d_1 = get_layer_output(model_v2, "max_pooling2d_1", x_Test4D_normalize)[1]

# Intermediate outputs of the second convolution block
conv2d_2 = get_layer_output(model_v2, "conv2d_2", x_Test4D_normalize)[1]
activation_2 = get_layer_output(model_v2, "activation_2", x_Test4D_normalize)[1]
max_pooling2d_2 = get_layer_output(model_v2, "max_pooling2d_2", x_Test4D_normalize)[1]
# Output of convolution layer 1
show_layer_output(conv2d_1)

(figure: the 6 feature maps produced by conv2d_1)

# Output of activation function 1
show_layer_output(activation_1)

(figure: the 6 feature maps after ReLU)

# Output of pooling layer 1
show_layer_output(max_pooling2d_1)

(figure: the 6 downsampled 12x12 feature maps)

# Output of convolution layer 2
show_layer_output(conv2d_2, r=8, c=8)

(figure: the 16 feature maps produced by conv2d_2)

# Output of activation function 2
show_layer_output(activation_2, r=8, c=8)

(figure: the 16 feature maps after ReLU)

# Output of pooling layer 2
show_layer_output(max_pooling2d_2, r=8, c=8)

(figure: the 16 downsampled 4x4 feature maps)

5. Improving LeNet-5 for Better Fashion MNIST Recognition

5.1 Preprocessing the Data

import numpy as np
from keras.utils import np_utils
from keras.datasets import fashion_mnist
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D,Activation

# Load the dataset
(X_train_image,y_train_label),(X_test_image,y_test_label) = fashion_mnist.load_data()
# Reshape the images into 4D tensors: (samples, height, width, channels)
x_Train4D = X_train_image.reshape(X_train_image.shape[0],28,28,1).astype('float32')
x_Test4D = X_test_image.reshape(X_test_image.shape[0],28,28,1).astype('float32')
# Normalize pixel values to [0, 1]
x_Train4D_normalize = x_Train4D / 255
x_Test4D_normalize = x_Test4D / 255
# One-hot encode the labels
y_TrainOneHot = np_utils.to_categorical(y_train_label)
y_TestOneHot = np_utils.to_categorical(y_test_label)

# Model and training hyperparameters
# Number of classes
CLASSES_NB = 10
# Input shape of the model
INPUT_SHAPE = (28,28,1)
# Fraction of the training data held out for validation
VALIDATION_SPLIT = 0.2
# Number of training epochs
EPOCH = 20
# Samples per batch
BATCH_SIZE = 300
# Verbosity of the training log (2: one line per epoch)
VERBOSE = 2
# Map numeric labels to class names to make samples easier to inspect
CLASSES_NAME = ['短袖圆领T恤', '裤子', '套头衫', '连衣裙', '外套',
              '凉鞋', '衬衫', '帆布鞋','包', '短靴']

5.2 Building and Training the Model

Starting from the previous architecture, we modify the network structure and some hyperparameters to try to improve prediction accuracy.

model = Sequential()
model.add(Conv2D(filters=16,
                 kernel_size=(5,5),
                 padding='same',
                 input_shape=(28,28,1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(filters=50,
                 kernel_size=(5,5),
                 padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(500))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(CLASSES_NB))
model.add(Activation('softmax'))
print(model.summary())
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_3 (Conv2D)            (None, 28, 28, 16)        416       
_________________________________________________________________
activation_6 (Activation)    (None, 28, 28, 16)        0         
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 14, 16)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 14, 14, 50)        20050     
_________________________________________________________________
activation_7 (Activation)    (None, 14, 14, 50)        0         
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 7, 7, 50)          0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 7, 7, 50)          0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 2450)              0         
_________________________________________________________________
dense_4 (Dense)              (None, 500)               1225500   
_________________________________________________________________
activation_8 (Activation)    (None, 500)               0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 500)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 10)                5010      
_________________________________________________________________
activation_9 (Activation)    (None, 10)                0         
=================================================================
Total params: 1,250,976
Trainable params: 1,250,976
Non-trainable params: 0
_________________________________________________________________
None

The network architecture is shown below.

(figure: model2, the improved network architecture)

With the improved structure and parameters in place, we compile and retrain the model.

model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
train_history = model.fit(x=x_Train4D_normalize,
                         y=y_TrainOneHot,validation_split=VALIDATION_SPLIT,
                         epochs=EPOCH,batch_size=BATCH_SIZE,verbose=VERBOSE)
Train on 48000 samples, validate on 12000 samples
Epoch 1/20
 - 55s - loss: 0.6406 - acc: 0.7660 - val_loss: 0.4064 - val_acc: 0.8569
Epoch 2/20
 - 55s - loss: 0.3978 - acc: 0.8563 - val_loss: 0.3492 - val_acc: 0.8751
Epoch 3/20
 - 55s - loss: 0.3430 - acc: 0.8762 - val_loss: 0.3090 - val_acc: 0.8901
Epoch 4/20
 - 58s - loss: 0.3077 - acc: 0.8886 - val_loss: 0.2827 - val_acc: 0.8995
Epoch 5/20
 - 55s - loss: 0.2853 - acc: 0.8957 - val_loss: 0.2704 - val_acc: 0.9040
Epoch 6/20
 - 56s - loss: 0.2634 - acc: 0.9035 - val_loss: 0.2523 - val_acc: 0.9076
Epoch 7/20
 - 59s - loss: 0.2503 - acc: 0.9081 - val_loss: 0.2465 - val_acc: 0.9101
Epoch 8/20
 - 61s - loss: 0.2347 - acc: 0.9134 - val_loss: 0.2314 - val_acc: 0.9153
Epoch 9/20
 - 65s - loss: 0.2230 - acc: 0.9180 - val_loss: 0.2376 - val_acc: 0.9105
Epoch 10/20
 - 61s - loss: 0.2104 - acc: 0.9214 - val_loss: 0.2312 - val_acc: 0.9155
Epoch 11/20
 - 72s - loss: 0.2021 - acc: 0.9254 - val_loss: 0.2255 - val_acc: 0.9193
Epoch 12/20
 - 62s - loss: 0.1925 - acc: 0.9276 - val_loss: 0.2166 - val_acc: 0.9193
Epoch 13/20
 - 65s - loss: 0.1856 - acc: 0.9314 - val_loss: 0.2143 - val_acc: 0.9218
Epoch 14/20
 - 69s - loss: 0.1753 - acc: 0.9346 - val_loss: 0.2109 - val_acc: 0.9215
Epoch 15/20
 - 57s - loss: 0.1698 - acc: 0.9359 - val_loss: 0.2161 - val_acc: 0.9207
Epoch 16/20
 - 54s - loss: 0.1620 - acc: 0.9391 - val_loss: 0.2169 - val_acc: 0.9212
Epoch 17/20
 - 54s - loss: 0.1559 - acc: 0.9421 - val_loss: 0.2062 - val_acc: 0.9249
Epoch 18/20
 - 54s - loss: 0.1460 - acc: 0.9451 - val_loss: 0.2115 - val_acc: 0.9244
Epoch 19/20
 - 54s - loss: 0.1388 - acc: 0.9473 - val_loss: 0.2057 - val_acc: 0.9268
Epoch 20/20
 - 55s - loss: 0.1345 - acc: 0.9496 - val_loss: 0.2098 - val_acc: 0.9250

5.3 Training Curves and Model Evaluation

The larger model takes noticeably longer to train per epoch. Let's see whether the extra cost pays off.

# Plot a metric from the training history for the train and validation sets
def show_train_history(train_history,train,validation):
    plt.plot(train_history.history[train])
    plt.plot(train_history.history[validation])
    plt.title('Train history')
    plt.ylabel(train)
    plt.xlabel('Epoch')
    plt.legend(['train','validation'],loc='upper left')
    plt.show()
show_train_history(train_history,'acc','val_acc')

(figure: training vs. validation accuracy)

show_train_history(train_history,'loss','val_loss')

(figure: training vs. validation loss)

The model's accuracy improves over the multilayer perceptron, and there is no sign of severe overfitting.

Let's evaluate the model's accuracy on the test set.

scores = model.evaluate(x_Test4D_normalize, y_TestOneHot)
print(scores[1])
10000/10000 [==============================] - 5s 458us/step
0.9916

The accuracy printed here exceeds 0.99, a marked improvement over the multilayer perceptron from the previous experiment.

5.4 Predictions on the Test Set

We predict labels for every test sample, then pick a few at random to inspect.

result_class = model.predict_classes(x_Test4D_normalize)
show_images_set(X_test_image,y_test_label,result_class,idx=40, alias=CLASSES_NAME)

(figure: 15 test samples with predictions; misclassified ones are marked in red)

Of the 15 samples shown (starting at index 40), four are predicted incorrectly.

Building a confusion matrix makes it much clearer which classes get confused with one another.

# Build the confusion matrix with pandas
import pandas as pd
pd.crosstab(y_test_label, result_class, rownames=['label'], colnames=['predict'])
predict     0    1    2    3    4    5    6    7    8    9
label
0         813    2   13   16    5    1  141    0    9    0
1           1  982    0    9    1    0    5    0    2    0
2          12    1  787    7  121    0   71    0    1    0
3           6    5    5  906   43    0   32    0    3    0
4           0    1   17   14  934    0   34    0    0    0
5           0    0    0    0    0  988    0    7    0    5
6          70    0   28   23  108    0  764    0    7    0
7           0    0    0    0    0   13    0  959    3   25
8           1    1    1    2    2    2    2    1  988    0
9           0    0    0    0    0    5    1   21    0  973

The confusion matrix makes the trouble spots obvious: class 2 (pullover) is misclassified as 4 (coat) 121 times, and class 6 (shirt) is misclassified as 4 (coat) 108 times.
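
The overall test accuracy can also be recovered from this matrix by summing the diagonal. A quick cross-check (assuming the crosstab result is first stored in a variable):

import numpy as np
ct = pd.crosstab(y_test_label, result_class, rownames=['label'], colnames=['predict'])
# Correct predictions sit on the diagonal
print(np.trace(ct.values) / ct.values.sum())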

We create a DataFrame to dig into the confusions.

# Build a DataFrame of true and predicted labels
dic = {'label':y_test_label, 'predict':result_class}
df = pd.DataFrame(dic)
# Transpose so the 10,000 samples run across the columns
df.T
         0  1  2  3  4  5  6  7  8  9  ...  9990  9991  9992  9993  9994  9995  9996  9997  9998  9999
label    9  2  1  1  6  1  4  6  5  7  ...     5     6     8     9     1     9     1     8     1     5
predict  9  2  1  1  6  1  4  6  5  7  ...     5     2     8     9     1     9     1     8     1     5

2 rows × 10000 columns

Now inspect the samples where 2 (pullover) was predicted as 4 (coat):

df[(df.label==2)&(df.predict==4)].T
         74   227  255  457  511  546  715  799  851  893  ...  9337  9387  9441  9449  9537  9545  9648  9743  9784  9946
label     2    2    2    2    2    2    2    2    2    2   ...     2     2     2     2     2     2     2     2     2     2
predict   4    4    4    4    4    4    4    4    4    4   ...     4     4     4     4     4     4     4     4     4     4

2 rows × 121 columns

We pick the first misclassified sample in this list, index 74, for a closer look:

show_image(X_test_image, y_test_label, 74, CLASSES_NAME)

(figure: test sample 74, a pullover misclassified as a coat)

Now inspect the samples where 4 (coat) was predicted as 6 (shirt):

df[(df.label==4)&(df.predict==6)].T
         396  476  558  905  1055  1101  1223  1356  1408  1462  ...  6899  6908  7134  7233  7278  7596  7986  8296  8933  8958
label      4    4    4    4     4     4     4     4     4     4  ...     4     4     4     4     4     4     4     4     4     4
predict    6    6    6    6     6     6     6     6     6     6  ...     6     6     6     6     6     6     6     6     6     6

2 rows × 34 columns

We pick index 558 from this list for a closer look:

show_image(X_test_image, y_test_label, 558, CLASSES_NAME)

(figure: test sample 558, a coat misclassified as a shirt)

5.5 Saving the Model and Network Structure

Save the model's network structure in JSON format:

from keras.models import model_from_json
import json
# Serialize the model structure to JSON
model_json = model.to_json()
# Pretty-print the JSON for readability
model_dict = json.loads(model_json)
model_json = json.dumps(model_dict, indent=4, ensure_ascii=False)
# Write the JSON to the current directory
with open("./fashion_mnist_model_json.json",'w') as json_file:
    json_file.write(model_json)
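
The saved JSON can later rebuild the architecture via model_from_json (imported above); note it restores structure only, so weights must be loaded separately. A minimal sketch:

# Rebuild the architecture from the saved JSON (weights are not included)
with open("./fashion_mnist_model_json.json") as json_file:
    restored = model_from_json(json_file.read())
restored.summary()  # same structure, freshly initialized weights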

Save the complete model (architecture plus weights) in h5 format:

# Save the trained model
model.save('fashion_mnist_mode_v1.h5')

6. Predictions on an In-the-Wild Test Set

So far we have evaluated the model only on the dataset's own test split. Now we try some images we collected ourselves as an in-the-wild test set, to see how the model copes with natural pictures. The images live in the img_sets folder; see the appendix for download instructions.

6.1 Image Preprocessing

To predict on our own pictures, we must convert them to NumPy arrays and match the properties of the training images. We use the open-source computer vision library OpenCV for these image operations.

import cv2
import numpy as np
import os
import matplotlib.pyplot as plt
# Directory holding the custom images (all in jpg format)
path = "img_sets"
imgs = []
labs = []
for filename in os.listdir(path):
    if filename.endswith(".jpg"):
        _path = os.path.join(path, filename)
        # Read the image with OpenCV (BGR channel order)
        img = cv2.imread(_path)
        imgs.append(img)
        # The label is encoded in the file name
        lab = filename[4:5]
        labs.append(int(lab))
show_images_set(imgs, labs, [], idx=0, num=8, alias=CLASSES_NAME)

(figure: the 8 custom images with their labels)

# Inspect the raw image data
imgs
[array([[[255, 255, 255],
         [255, 255, 255],
         ...,
         [255, 255, 255]]], dtype=uint8), ...]

(output truncated: a list of 3-channel uint8 arrays; the values shown are all 255 because the image backgrounds are white)
X_img = []
for img in imgs:
    # Convert to grayscale
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Invert: Fashion-MNIST items are bright on a black background, so the
    # white background must become 0. Note this is 255 - img, not img - 255,
    # which would wrap around for uint8 values.
    img = 255 - img
    # Downscale to the 28x28 input size
    img = cv2.resize(img, (28, 28))
    X_img.append(img)

X_img = np.array(X_img)
# Reshape into 4D tensors and normalize to [0, 1], matching the training data
X_img_4d = X_img.reshape(X_img.shape[0],28,28,1).astype('float32') / 255
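
Before predicting, it is worth eyeballing one preprocessed image to confirm it now resembles a Fashion-MNIST sample: 28x28, single channel, dark background. A quick illustrative check:

# Display a preprocessed custom image with the same colormap used earlier
plt.imshow(X_img[0], cmap='binary')
plt.show()
print(X_img_4d.shape)  # (number_of_images, 28, 28, 1)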

6.2 Prediction Results

from keras.models import load_model

# Load the model saved earlier and predict classes for the custom images
model_fashion_v1 = load_model('fashion_mnist_mode_v1.h5')
res = model_fashion_v1.predict_classes(X_img_4d)
show_images_set(imgs, labs, res, idx=0, num=8, alias=CLASSES_NAME)

(figure: the custom images with their predicted classes)

Conclusion

This chapter walked through building a convolutional neural network to recognize the Fashion-MNIST dataset. The same experiment can be run on the MNIST handwritten digit dataset without changing a single line of code; readers are encouraged to try it for themselves.

Copyright notice: unless otherwise stated, all articles are original to this site; please credit the source when reposting.

Link to this article: http://tunm.top/article/learning_04/