Implementing Multilayer Perceptron (MLP) and Convolutional Neural Network (CNN) Models in Keras to Classify Handwritten Digit Images


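With some spare time on my hands, I implemented MLP and CNN models in Keras and used them to classify handwritten digit images. The test data is a large set of images of the handwritten digits 0-9, each 28x28 (784) pixels, taken from the well-known Modified National Institute of Standards and Technology (MNIST) dataset. Getting the data is easy: as long as the internet connection works, the Keras built-in API mnist.load_data() downloads it automatically on the first call into the current user's ~/.keras/datasets folder (for example, for the user davidhopper the download path is C:\Users\davidhopper\.keras\datasets on Windows and /home/davidhopper/.keras/datasets on Linux).
All source code in this post comes from the book "Develop Deep Learning Models on Theano and TensorFlow Using Keras" by Jason Brownlee; I modified it slightly so that it fits the Keras 2 API better.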
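
To check whether the dataset has already been cached, the folder described above can be inspected with the standard library alone. This is only a sketch: the helper name default_keras_dataset_dir is made up for illustration, and the file name mnist.npz is what mnist.load_data() uses by default as far as I know, so treat it as an assumption.

```python
import os

def default_keras_dataset_dir():
    """Return ~/.keras/datasets, the folder described in the text above."""
    # Hypothetical helper: it only builds the path, it does not call Keras.
    return os.path.join(os.path.expanduser("~"), ".keras", "datasets")

cache_dir = default_keras_dataset_dir()
# Assumption: the MNIST archive is stored under the name "mnist.npz".
mnist_file = os.path.join(cache_dir, "mnist.npz")

print("cache folder:", cache_dir)
print("MNIST already downloaded:", os.path.exists(mnist_file))
```

If the second line prints False, the first call to mnist.load_data() will need network access.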

1. MLP model

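The MLP model uses the classic "input layer - middle (hidden) layer - output layer" structure. The input layer has 28x28 = 784 units and the output layer has 10 units (one per digit label 0-9). The flow is as follows: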

Start -> Input Layer (784 inputs) -> Hidden Layer (784 neurons) -> Output Layer (10 outputs) -> End
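
As a rough sanity check on the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes ordinary fully connected layers with one bias per unit; dense_params is a helper name I made up, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units

num_pixels = 28 * 28   # 784 inputs
num_classes = 10       # digits 0-9

hidden = dense_params(num_pixels, num_pixels)    # 784 -> 784 hidden layer
output = dense_params(num_pixels, num_classes)   # 784 -> 10 output layer
total = hidden + output

print(hidden, output, total)  # 615440 7850 623290
```

Almost all of the parameters sit in the hidden layer, which is why it dominates the training time of this model.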

The implementation is as follows:

# baseline MLP for mnist dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.utils import np_utils

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# flatten 28*28 images to a 784 vector for each image
num_pixels = X_train.shape[1] * X_train.shape[2] 
X_train = X_train.reshape(X_train.shape[0], num_pixels).astype("float32")
X_test = X_test.reshape(X_test.shape[0], num_pixels).astype("float32")

# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define baseline model
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(num_pixels, input_dim = num_pixels, kernel_initializer = "normal", activation = "relu"))
    model.add(Dense(num_classes, kernel_initializer = "normal", activation = "softmax"))

    # compile model
    model.compile(loss = "categorical_crossentropy", optimizer = "adam", metrics = ["accuracy"])
    return model

# build the model
model = baseline_model()

# fit the model
model.fit(X_train, y_train, validation_data = (X_test, y_test), epochs = 10, batch_size = 200, verbose = 2)

# final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose = 0)
print("Baseline Error: %.2f%%" % (100 - scores[1] * 100))

I used an ordinary GeForce GT 740 graphics card for acceleration. Each training epoch took about 3-5 s, and the error rate was 1.79%.

(C:\Users\Administrator\Anaconda3) d:\Python\code\mlp>python mlp_for_mnist.py
Using TensorFlow backend.
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
2017-12-16 10:15:04.752675: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GT 740 major: 3 minor: 0 memoryClockRate(GHz): 1.0585
pciBusID: 0000:01:00.0
totalMemory: 1.00GiB freeMemory: 834.86MiB
2017-12-16 10:15:04.752806: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GT 740, pci bus id: 0000:01:00.0, compute capability: 3.0)
 - 5s - loss: 0.2811 - acc: 0.9206 - val_loss: 0.1412 - val_acc: 0.9575
Epoch 2/10
 - 3s - loss: 0.1116 - acc: 0.9680 - val_loss: 0.0919 - val_acc: 0.9709
Epoch 3/10
 - 3s - loss: 0.0714 - acc: 0.9798 - val_loss: 0.0786 - val_acc: 0.9776
Epoch 4/10
 - 3s - loss: 0.0503 - acc: 0.9857 - val_loss: 0.0743 - val_acc: 0.9770
Epoch 5/10
 - 3s - loss: 0.0371 - acc: 0.9892 - val_loss: 0.0685 - val_acc: 0.9789
Epoch 6/10
 - 4s - loss: 0.0268 - acc: 0.9927 - val_loss: 0.0631 - val_acc: 0.9798
Epoch 7/10
 - 3s - loss: 0.0205 - acc: 0.9947 - val_loss: 0.0624 - val_acc: 0.9808
Epoch 8/10
 - 3s - loss: 0.0141 - acc: 0.9969 - val_loss: 0.0618 - val_acc: 0.9797
Epoch 9/10
 - 3s - loss: 0.0107 - acc: 0.9978 - val_loss: 0.0583 - val_acc: 0.9818
Epoch 10/10
 - 3s - loss: 0.0082 - acc: 0.9984 - val_loss: 0.0581 - val_acc: 0.9821
Baseline Error: 1.79%
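
The three preprocessing steps in the script (flatten, scale to 0-1, one-hot encode) can also be tried without Keras or a download. The following is a minimal NumPy sketch on made-up data: the random images and the label list are stand-ins I invented, and np.eye indexing is used in place of np_utils.to_categorical.

```python
import numpy as np

rng = np.random.RandomState(7)
# Stand-in for mnist.load_data(): 5 fake 28x28 grayscale images and labels.
images = rng.randint(0, 256, size=(5, 28, 28)).astype("uint8")
labels = np.array([3, 0, 9, 3, 1])

# Flatten each 28x28 image into a 784-vector and scale 0-255 to 0-1.
flat = images.reshape(images.shape[0], -1).astype("float32") / 255

# One-hot encode: row i holds a single 1 in column labels[i].
# The class count is fixed here; the script derives it from the encoded labels.
num_classes = 10
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(flat.shape)      # (5, 784)
print(one_hot.shape)   # (5, 10)
print(one_hot[0])      # the 1 sits at index 3
```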

2. A simpler CNN model

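First comes a simpler CNN model with the structure "input layer - convolutional layer - max pooling layer - dropout layer - flatten layer - hidden layer - output layer". The input is a three-dimensional tensor of shape 1x28x28 and the output layer has 10 outputs (the digit labels 0-9). The flow is as follows: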

Start -> Input Layer (1x28x28) -> Convolutional Layer (32 feature maps, 5x5 kernel) -> Max Pooling Layer (2x2) -> Dropout Layer (drops 20%) -> Flatten Layer (converts the matrix to a vector) -> Hidden Layer (128 neurons) -> Output Layer (10 outputs) -> End
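
It helps to trace how the tensor shape changes through these layers. The sketch below assumes stride-1 convolution with "valid" padding and non-overlapping 2x2 pooling, which is how I read the configuration of this model; conv_valid and pool are helper names of my own.

```python
def conv_valid(size, kernel):
    """Output width/height of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool(size, window):
    """Output width/height of non-overlapping pooling (stride == window)."""
    return size // window

feature_maps = 32
side = 28
side = conv_valid(side, 5)                 # 28 -> 24
conv_shape = (feature_maps, side, side)
side = pool(side, 2)                       # 24 -> 12
pool_shape = (feature_maps, side, side)
flat_len = feature_maps * side * side      # length after the flatten layer

print(conv_shape, pool_shape, flat_len)  # (32, 24, 24) (32, 12, 12) 4608
```

The dropout layer does not change the shape, so the 128-neuron hidden layer receives a vector of 4608 values.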

The implementation is as follows:

# Simple CNN for the MNIST Dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
# channels-first layout (Keras 2 name for the old "th" dim ordering)
K.set_image_data_format("channels_first")

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# reshape to be [samples][channels][height][width]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype("float32")
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype("float32")

# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define a simple CNN model
def simple_cnn_model():
    # create the model
    model = Sequential()
    model.add(Convolution2D(32, (5, 5), input_shape = (1, 28, 28), activation = "relu", padding = "valid"))
    model.add(MaxPooling2D(pool_size = (2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation = "relu"))
    model.add(Dense(num_classes, activation = "softmax"))

    # compile the model
    model.compile(loss = "categorical_crossentropy", optimizer = "adam", metrics = ["accuracy"])

    return model

# build the model
model = simple_cnn_model()

# fit the model 
model.fit(X_train, y_train, validation_data = (X_test, y_test), epochs = 10, batch_size = 200, verbose = 2)

# evaluate the model
scores = model.evaluate(X_test, y_test, verbose = 0)
print("CNN error: %.2f%%" % (100 - scores[1] * 100))

I used an ordinary GeForce GT 740 graphics card for acceleration. Each training epoch took about 11-13 s, and the error rate was 1.04%.

(C:\Users\Administrator\Anaconda3) d:\Python\code\mlp>python simple_cnn_for_mnist.py
Using TensorFlow backend.
2017-12-16 08:43:05.851596: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GT 740 major: 3 minor: 0 memoryClockRate(GHz): 1.0585
pciBusID: 0000:01:00.0
totalMemory: 1.00GiB freeMemory: 834.86MiB
2017-12-16 08:43:05.852300: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GT 740, pci bus id: 0000:01:00.0, compute capability: 3.0)
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
 - 13s - loss: 0.2340 - acc: 0.9342 - val_loss: 0.0818 - val_acc: 0.9742
Epoch 2/10
 - 13s - loss: 0.0734 - acc: 0.9782 - val_loss: 0.0468 - val_acc: 0.9843
Epoch 3/10
 - 12s - loss: 0.0533 - acc: 0.9837 - val_loss: 0.0434 - val_acc: 0.9859
Epoch 4/10
 - 12s - loss: 0.0405 - acc: 0.9876 - val_loss: 0.0406 - val_acc: 0.9866
Epoch 5/10
 - 13s - loss: 0.0338 - acc: 0.9892 - val_loss: 0.0341 - val_acc: 0.9881
Epoch 6/10
 - 11s - loss: 0.0278 - acc: 0.9912 - val_loss: 0.0325 - val_acc: 0.9892
Epoch 7/10
 - 13s - loss: 0.0236 - acc: 0.9926 - val_loss: 0.0359 - val_acc: 0.9884
Epoch 8/10
 - 12s - loss: 0.0207 - acc: 0.9938 - val_loss: 0.0336 - val_acc: 0.9885
Epoch 9/10
 - 12s - loss: 0.0170 - acc: 0.9946 - val_loss: 0.0308 - val_acc: 0.9896
Epoch 10/10
 - 12s - loss: 0.0144 - acc: 0.9957 - val_loss: 0.0333 - val_acc: 0.9896
CNN error: 1.04%

3. A more complex CNN model

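Next comes a more complex CNN model with the structure "input layer - convolutional layer - max pooling layer - convolutional layer - max pooling layer - dropout layer - flatten layer - hidden layer - hidden layer - output layer". The input is a three-dimensional tensor of shape 1x28x28 and the output layer has 10 outputs (the digit labels 0-9). The flow is as follows: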

Start -> Input Layer (1x28x28) -> Convolutional Layer (30 feature maps, 5x5 kernel) -> Max Pooling Layer (2x2) -> Convolutional Layer (15 feature maps, 3x3 kernel) -> Max Pooling Layer (2x2) -> Dropout Layer (drops 20%) -> Flatten Layer (converts the matrix to a vector) -> Hidden Layer (128 neurons) -> Hidden Layer (50 neurons) -> Output Layer (10 outputs) -> End
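
Both pooling layers in this model halve the width and height of their input. What a 2x2 max pooling layer computes can be sketched in a few lines of NumPy. This is an illustration only, not how Keras implements it; max_pool_2x2 is a name I made up, and it assumes a channels-first array with even height and width.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a (channels, height, width) array."""
    c, h, w = x.shape
    # Split height and width into (block index, position inside the block).
    blocks = x.reshape(c, h // 2, 2, w // 2, 2)
    # Keep the largest value of every 2x2 block.
    return blocks.max(axis=(2, 4))

x = np.arange(16, dtype="float32").reshape(1, 4, 4)
pooled = max_pool_2x2(x)
print(pooled)  # [[[ 5.  7.] [13. 15.]]]

# Second pooling layer of this model: 15 feature maps of 10x10 become 5x5.
second = max_pool_2x2(np.zeros((15, 10, 10), dtype="float32"))
print(second.shape)  # (15, 5, 5)
```

After flattening, 15 x 5 x 5 = 375 values reach the first hidden layer, far fewer than in the simpler CNN.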

The implementation is as follows:

# Large CNN for the MNIST Dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
# channels-first layout (Keras 2 name for the old "th" dim ordering)
K.set_image_data_format("channels_first")

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# reshape to be [samples][channels][height][width]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype("float32")
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype("float32")

# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define the large CNN model
def large_cnn_model():
    # create the model
    model = Sequential()
    model.add(Convolution2D(30, (5, 5), input_shape = (1, 28, 28), activation = "relu", padding = "valid"))
    model.add(MaxPooling2D(pool_size = (2, 2)))
    model.add(Convolution2D(15, (3, 3), activation = "relu"))
    model.add(MaxPooling2D(pool_size = (2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation = "relu"))
    model.add(Dense(50, activation = "relu"))
    model.add(Dense(num_classes, activation = "softmax"))

    # Compile the model
    model.compile(loss = "categorical_crossentropy", optimizer = "adam", metrics = ["accuracy"])

    return model

# build the model
model = large_cnn_model()
# fit the model
model.fit(X_train, y_train, validation_data = (X_test, y_test), epochs = 10, batch_size = 200, verbose = 2)

# evaluate the model
scores = model.evaluate(X_test, y_test, verbose = 0)
print("Large CNN Error: %.2f%%" % (100 - scores[1] * 100))

I used an ordinary GeForce GT 740 graphics card for acceleration. Each training epoch took about 10-13 s, and the error rate was 0.80%.

(C:\Users\Administrator\Anaconda3) d:\Python\code\mlp>python large_cnn_for_mnist.py
Using TensorFlow backend.
2017-12-16 09:09:36.676635: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GT 740 major: 3 minor: 0 memoryClockRate(GHz): 1.0585
pciBusID: 0000:01:00.0
totalMemory: 1.00GiB freeMemory: 834.86MiB
2017-12-16 09:09:36.676757: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GT 740, pci bus id: 0000:01:00.0, compute capability: 3.0)
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
 - 13s - loss: 0.3966 - acc: 0.8776 - val_loss: 0.1004 - val_acc: 0.9681
Epoch 2/10
 - 11s - loss: 0.0941 - acc: 0.9706 - val_loss: 0.0593 - val_acc: 0.9811
Epoch 3/10
 - 12s - loss: 0.0688 - acc: 0.9786 - val_loss: 0.0381 - val_acc: 0.9884
Epoch 4/10
 - 10s - loss: 0.0564 - acc: 0.9821 - val_loss: 0.0333 - val_acc: 0.9885
Epoch 5/10
 - 10s - loss: 0.0477 - acc: 0.9851 - val_loss: 0.0294 - val_acc: 0.9906
Epoch 6/10
 - 12s - loss: 0.0426 - acc: 0.9860 - val_loss: 0.0278 - val_acc: 0.9907
Epoch 7/10
 - 12s - loss: 0.0375 - acc: 0.9883 - val_loss: 0.0253 - val_acc: 0.9918
Epoch 8/10
 - 12s - loss: 0.0336 - acc: 0.9896 - val_loss: 0.0247 - val_acc: 0.9918
Epoch 9/10
 - 12s - loss: 0.0314 - acc: 0.9902 - val_loss: 0.0227 - val_acc: 0.9928
Epoch 10/10
 - 12s - loss: 0.0271 - acc: 0.9913 - val_loss: 0.0243 - val_acc: 0.9920
Large CNN Error: 0.80%
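
To put the three error rates side by side, the accuracy values from the last epoch of each log can be converted into a count of misclassified test images. This is plain arithmetic on the numbers printed above; error_percent is a helper name of my own that mirrors the formula used in the print statements of the scripts.

```python
def error_percent(accuracy):
    """Error rate in percent, as in the scripts: 100 - accuracy * 100."""
    return 100 - accuracy * 100

# Final val_acc values copied from the three training logs.
final_val_acc = {"MLP": 0.9821, "simple CNN": 0.9896, "large CNN": 0.9920}
test_size = 10000  # "validate on 10000 samples"

wrong = {}
for name, acc in final_val_acc.items():
    err = error_percent(acc)
    wrong[name] = int(round(err / 100 * test_size))
    print("%s: %.2f%% error, about %d of %d test images wrong"
          % (name, err, wrong[name], test_size))
```

So going from the MLP to the larger CNN removes roughly a hundred mistakes on the 10,000 test images.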

4. Implementation details

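For a longer piece of Python code, typing it directly into the interactive window is clearly impractical; we need a text editor to create a Python source file (for the MLP model, I named it mlp_for_mnist.py). Note: on Windows, do not use Notepad or WordPad to write source files, because neither of these two clowns outputs proper UTF-8! I personally strongly recommend Sublime Text: http://www.sublimetext.com/ (paid software, but unregistered it only pops up a purchase dialog now and then, which does not get in the way); its syntax highlighting and code completion are both good. My second choice is Notepad++: https://notepad-plus-plus.org/, whose completion is a bit weaker, but it is free and quite usable.
Once the source file is written, open the Start menu, launch "Anaconda3 (64-bit) -> Anaconda Prompt", change to the folder that contains the source file, and enter the command "python mlp_for_mnist.py" to run the code. If an error is reported, fix and save the source file, then enter "python mlp_for_mnist.py" again: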

(C:\Users\Administrator\Anaconda3) C:\Users\Administrator>cd /D d:\Python\code\mlp
(C:\Users\Administrator\Anaconda3) d:\Python\code\mlp>python mlp_for_mnist.py
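
If you are not sure whether an editor saved a source file as clean UTF-8, a small script can check it. This is a sketch under my own assumptions: check_utf8 is a made-up helper, and the byte order mark (BOM) test is included because, as far as I know, older versions of Notepad prepend one when saving as UTF-8. The demo writes two throwaway files, one in UTF-8 and one in GBK.

```python
import codecs
import os
import tempfile

def check_utf8(path):
    """Return (is_valid_utf8, has_bom) for the file at `path`."""
    with open(path, "rb") as f:
        data = f.read()
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        data.decode("utf-8")
        return True, has_bom
    except UnicodeDecodeError:
        return False, has_bom

with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.py")
    with open(good, "w", encoding="utf-8") as f:
        f.write("# 手写数字\nprint('ok')\n")
    bad = os.path.join(tmp, "bad.py")
    with open(bad, "w", encoding="gbk") as f:
        f.write("# 手写数字\n")
    good_result = check_utf8(good)
    bad_result = check_utf8(bad)

print(good_result, bad_result)  # (True, False) (False, False)
```

To check a real file, call check_utf8("mlp_for_mnist.py") from the folder that contains it.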