Text Classification of Movie Reviews with TensorFlow

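This article uses TensorFlow to classify text from the IMDB dataset. It covers the data source, processing, and download; model construction, compilation and training, and evaluation; and then gives the complete code and the final results.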


Data Source

The data comes from the IMDB dataset of the Internet Movie Database, which contains the text of 50,000 movie reviews.

Data Processing

Of these, 25,000 reviews are used for training and the other 25,000 for testing. Both sets are balanced: each contains an equal number of positive and negative reviews.
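One quick way to check the balance claim is to count labels. The sketch below is only an illustration on a tiny hand-made sample: `label_counts` is a hypothetical helper, and it assumes the usual IMDB convention that label 1 means positive and label 0 means negative.

```python
from collections import Counter

def label_counts(examples):
    """Count how many examples carry each label.

    `examples` is any iterable of (text, label) pairs, the same shape
    that a dataset loaded with as_supervised=True yields.
    """
    return Counter(label for _, label in examples)

# Tiny made-up sample, not real IMDB data.
sample = [
    ("great film", 1),
    ("dull plot", 0),
    ("loved it", 1),
    ("waste of time", 0),
]

counts = label_counts(sample)
is_balanced = counts[0] == counts[1]
print(counts, is_balanced)
```

With the real data, the pairs would come from iterating over the loaded dataset, for example through `tfds.as_numpy`.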

Data Download

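A minimal sketch of the download step, assuming a recent `tensorflow_datasets` release in which splits are written as slice strings such as `train[:60%]`. The helper names `split_spec` and `load_imdb` are hypothetical; the 60/40 ratio mirrors the 6:4 split used in the complete code.

```python
def split_spec(train_percent=60):
    """Build split strings: the first part of 'train' for training,
    the remainder for validation, and the full 'test' split."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 0 and 100")
    return (
        "train[:%d%%]" % train_percent,
        "train[%d%%:]" % train_percent,
        "test",
    )

def load_imdb(train_percent=60):
    """Download (on first call) and load IMDB reviews as (text, label) pairs."""
    # Imported lazily so split_spec can be used without the package installed.
    import tensorflow_datasets as tfds
    return tfds.load(
        name="imdb_reviews",
        split=split_spec(train_percent),
        as_supervised=True,
    )

print(split_spec())
```

Calling `train_data, validation_data, test_data = load_imdb()` would then return three datasets in the order of the split strings.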

Model Construction

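To make the layer stack concrete, here is a plain-Python sketch of what the two dense layers compute on top of a 20-dimensional sentence embedding (the dimension is read off the `gnews-swivel-20dim` module name). The weights are random placeholders rather than trained values, and `dense` is a hypothetical helper, not a Keras API.

```python
import math
import random

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(inputs . weights + biases)."""
    outputs = []
    for j in range(len(biases)):
        z = sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        outputs.append(activation(z))
    return outputs

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)
EMBED_DIM, HIDDEN = 20, 16

# Random placeholder parameters standing in for trained weights.
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(EMBED_DIM)]
b1 = [0.0] * HIDDEN
w2 = [[rng.uniform(-0.5, 0.5)] for _ in range(HIDDEN)]
b2 = [0.0]

# Stand-in for the vector the hub layer would produce for one review.
embedding = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
hidden = dense(embedding, w1, b1, relu)
probability = dense(hidden, w2, b2, sigmoid)[0]

# Parameters of the two dense layers only: (20*16 + 16) + (16*1 + 1) = 353.
dense_params = (EMBED_DIM * HIDDEN + HIDDEN) + (HIDDEN * 1 + 1)
print(probability, dense_params)
```

The final sigmoid keeps the output between 0 and 1, so it can be read as the predicted probability that a review is positive.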

Model Compilation and Training

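The `binary_crossentropy` loss chosen at compile time averages -[y log p + (1 - y) log(1 - p)] over a batch. A minimal plain-Python sketch follows; the clipping constant `eps` is an assumption added here to avoid log(0) and is not taken from the original code.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]
confident = binary_crossentropy(labels, [0.9, 0.1, 0.8, 0.2])
undecided = binary_crossentropy(labels, [0.5, 0.5, 0.5, 0.5])
print(confident, undecided)
```

A model that always outputs 0.5 scores ln 2, about 0.693, which is a handy reference point when reading the loss values in a training log.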

Model Evaluation

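The accuracy reported by evaluation can be pictured as thresholding each sigmoid output and comparing it with the label. The sketch below assumes the usual 0.5 threshold; `binary_accuracy` is written here for illustration and the probabilities are made up.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    correct = sum(
        1 for y, p in zip(y_true, y_prob)
        if (1 if p > threshold else 0) == y
    )
    return correct / len(y_true)

labels = [1, 0, 1, 0, 1]
probabilities = [0.91, 0.20, 0.45, 0.60, 0.75]  # made-up model outputs
accuracy = binary_accuracy(labels, probabilities)
print("accuracy: %.3f" % accuracy)  # accuracy: 0.600
```

Here the third and fourth examples fall on the wrong side of the threshold, so three of five predictions are correct.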

Complete Code

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time:    2020/1/2 7:42
# @Author:  Martin
# @File:    Film_Reviews_Classification.py
# @Software:PyCharm

from __future__ import absolute_import, division, print_function, unicode_literals
import os

# Silence TensorFlow's C++ logging; this must be set before tensorflow is imported
os.environ["TF_CPP_MIN_LOG_LEVEL"] = '3'

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds

# Download the IMDB dataset: 60% of the training split for training, 40% for validation.
# Slice strings replace Split.subsplit, which newer tensorflow_datasets releases no longer provide.
train_data, validation_data, test_data = tfds.load(
    name="imdb_reviews",
    split=("train[:60%]", "train[60%:]", "test"),
    as_supervised=True
)

# Build the model: pretrained text embedding followed by two dense layers
embedding = "https://hub.tensorflow.google.cn/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(embedding, input_shape=[], dtype=tf.string, trainable=True)
model = tf.keras.Sequential()
model.add(hub_layer)
model.add(tf.keras.layers.Dense(16, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

# Compile the model
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(train_data.shuffle(10000).batch(512),
                    epochs=20,
                    validation_data=validation_data.batch(512),
                    verbose=1)

# Evaluate the model on the test set
results = model.evaluate(test_data.batch(512), verbose=2)
for name, value in zip(model.metrics_names, results):
    print("%s: %.3f" % (name, value))



Final Results
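The step counts in the log follow from the split ratio and the batch size. As a rough check, assuming the 6:4 split of the 25,000 training reviews and the batch size of 512 from the code, the arithmetic can be sketched as below; `steps_per_epoch` is a hypothetical helper.

```python
import math

TRAIN_EXAMPLES = 25000  # size of the IMDB training split
BATCH_SIZE = 512        # batch size passed to .batch() in the code

def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to cover every example once."""
    return math.ceil(num_examples / batch_size)

# 6:4 split of the training data, done in integers to avoid rounding issues.
train_size = TRAIN_EXAMPLES * 6 // 10
val_size = TRAIN_EXAMPLES - train_size

train_steps = steps_per_epoch(train_size, BATCH_SIZE)
val_steps = steps_per_epoch(val_size, BATCH_SIZE)
print(train_steps, val_steps)  # 30 20
```

The 30 training batches agree with the `30/30` shown on each epoch summary line, and the validation split yields 20 batches.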

Epoch 1/20

      1/Unknown - 1s 1s/step - loss: 0.8219 - accuracy: 0.4609
      2/Unknown - 2s 782ms/step - loss: 0.8012 - accuracy: 0.4756
      3/Unknown - 2s 563ms/step - loss: 0.7966 - accuracy: 0.4818
      4/Unknown - 2s 450ms/step - loss: 0.7910 - accuracy: 0.4800
      5/Unknown - 2s 381ms/step - loss: 0.7840 - accuracy: 0.4828
      6/Unknown - 2s 334ms/step - loss: 0.7851 - accuracy: 0.4801
      7/Unknown - 2s 299ms/step - loss: 0.7776 - accuracy: 0.4886
      8/Unknown - 2s 274ms/step - loss: 0.7712 - accuracy: 0.4944
      9/Unknown - 2s 255ms/step - loss: 0.7656 - accuracy: 0.4998
     10/Unknown - 2s 240ms/step - loss: 0.7622 - accuracy: 0.5033
     11/Unknown - 2s 227ms/step - loss: 0.7624 - accuracy: 0.5025
     12/Unknown - 3s 215ms/step - loss: 0.7596 - accuracy: 0.5065
     13/Unknown - 3s 206ms/step - loss: 0.7586 - accuracy: 0.5066
     14/Unknown - 3s 197ms/step - loss: 0.7569 - accuracy: 0.5070
     15/Unknown - 3s 190ms/step - loss: 0.7537 - accuracy: 0.5096
     16/Unknown - 3s 184ms/step - loss: 0.7504 - accuracy: 0.5125
     17/Unknown - 3s 178ms/step - loss: 0.7484 - accuracy: 0.5149
     18/Unknown - 3s 174ms/step - loss: 0.7449 - accuracy: 0.5193
     19/Unknown - 3s 170ms/step - loss: 0.7421 - accuracy: 0.5225
     20/Unknown - 3s 166ms/step - loss: 0.7401 - accuracy: 0.5238
     21/Unknown - 3s 162ms/step - loss: 0.7382 - accuracy: 0.5252
     22/Unknown - 3s 159ms/step - loss: 0.7366 - accuracy: 0.5264
     23/Unknown - 4s 156ms/step - loss: 0.7347 - accuracy: 0.5284
     24/Unknown - 4s 153ms/step - loss: 0.7327 - accuracy: 0.5303
     25/Unknown - 4s 151ms/step - loss: 0.7312 - accuracy: 0.5304
     26/Unknown - 4s 149ms/step - loss: 0.7290 - accuracy: 0.5322
     27/Unknown - 4s 147ms/step - loss: 0.7269 - accuracy: 0.5348
     28/Unknown - 4s 144ms/step - loss: 0.7255 - accuracy: 0.5363
     29/Unknown - 4s 142ms/step - loss: 0.7242 - accuracy: 0.5369
     30/Unknown - 4s 139ms/step - loss: 0.7231 - accuracy: 0.5374
30/30 [==============================] - 6s 207ms/step - loss: 0.7231 - accuracy: 0.5374 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 2/20

 1/20 [>.............................] - ETA: 12s - loss: 0.6648 - accuracy: 0.6133
 2/20 [==>...........................] - ETA: 6s - loss: 0.6550 - accuracy: 0.6230 
 3/20 [===>..........................] - ETA: 4s - loss: 0.6492 - accuracy: 0.6243
 4/20 [=====>........................] - ETA: 4s - loss: 0.6570 - accuracy: 0.6211
 5/20 [======>.......................] - ETA: 3s - loss: 0.6621 - accuracy: 0.6133
 6/20 [========>.....................] - ETA: 2s - loss: 0.6596 - accuracy: 0.6159
 7/20 [=========>....................] - ETA: 2s - loss: 0.6600 - accuracy: 0.6158
 8/20 [===========>..................] - ETA: 2s - loss: 0.6589 - accuracy: 0.6187
 9/20 [============>.................] - ETA: 1s - loss: 0.6568 - accuracy: 0.6237
10/20 [==============>...............] - ETA: 1s - loss: 0.6567 - accuracy: 0.6244
11/20 [===============>..............] - ETA: 1s - loss: 0.6574 - accuracy: 0.6223
12/20 [=================>............] - ETA: 1s - loss: 0.6539 - accuracy: 0.6265
13/20 [==================>...........] - ETA: 1s - loss: 0.6535 - accuracy: 0.6259
14/20 [====================>.........] - ETA: 0s - loss: 0.6540 - accuracy: 0.6253
15/20 [=====================>........] - ETA: 0s - loss: 0.6531 - accuracy: 0.6254
16/20 [=======================>......] - ETA: 0s - loss: 0.6520 - accuracy: 0.6263
17/20 [========================>.....] - ETA: 0s - loss: 0.6516 - accuracy: 0.6252
18/20 [==========================>...] - ETA: 0s - loss: 0.6519 - accuracy: 0.6253
19/20 [===========================>..] - ETA: 0s - loss: 0.6504 - accuracy: 0.6278
30/30 [==============================] - 5s 173ms/step - loss: 0.6466 - accuracy: 0.6398 - val_loss: 0.6144 - val_accuracy: 0.6696
Epoch 3/20

 1/20 [>.............................] - ETA: 12s - loss: 0.6175 - accuracy: 0.6602
 2/20 [==>...........................] - ETA: 6s - loss: 0.6040 - accuracy: 0.6826 
 3/20 [===>..........................] - ETA: 5s - loss: 0.6080 - accuracy: 0.6823
 4/20 [=====>........................] - ETA: 3s - loss: 0.5987 - accuracy: 0.6914
 5/20 [======>.......................] - ETA: 3s - loss: 0.5974 - accuracy: 0.6922
 6/20 [========>.....................] - ETA: 2s - loss: 0.5993 - accuracy: 0.6872
 7/20 [=========>....................] - ETA: 2s - loss: 0.5989 - accuracy: 0.6881
 8/20 [===========>..................] - ETA: 2s - loss: 0.6000 - accuracy: 0.6890
 9/20 [============>.................] - ETA: 1s - loss: 0.5987 - accuracy: 0.6866
10/20 [==============>...............] - ETA: 1s - loss: 0.5985 - accuracy: 0.6867
11/20 [===============>..............] - ETA: 1s - loss: 0.5995 - accuracy: 0.6848
12/20 [=================>............] - ETA: 1s - loss: 0.5986 - accuracy: 0.6857
13/20 [==================>...........] - ETA: 0s - loss: 0.5981 - accuracy: 0.6870
14/20 [====================>.........] - ETA: 0s - loss: 0.5998 - accuracy: 0.6871
15/20 [=====================>........] - ETA: 0s - loss: 0.5991 - accuracy: 0.6880
16/20 [=======================>......] - ETA: 0s - loss: 0.5985 - accuracy: 0.6890
17/20 [========================>.....] - ETA: 0s - loss: 0.5982 - accuracy: 0.6898
18/20 [==========================>...] - ETA: 0s - loss: 0.5972 - accuracy: 0.6900
19/20 [===========================>..] - ETA: 0s - loss: 0.5948 - accuracy: 0.6926
30/30 [==============================] - 5s 171ms/step - loss: 0.5923 - accuracy: 0.6983 - val_loss: 0.5705 - val_accuracy: 0.7160
Epoch 4/20

 1/20 [>.............................] - ETA: 13s - loss: 0.5646 - accuracy: 0.7031
 2/20 [==>...........................] - ETA: 7s - loss: 0.5533 - accuracy: 0.7158 
 3/20 [===>..........................] - ETA: 5s - loss: 0.5497 - accuracy: 0.7240
 4/20 [=====>........................] - ETA: 4s - loss: 0.5535 - accuracy: 0.7168
 5/20 [======>.......................] - ETA: 3s - loss: 0.5614 - accuracy: 0.7094
 6/20 [========>.....................] - ETA: 3s - loss: 0.5617 - accuracy: 0.7113
 7/20 [=========>....................] - ETA: 2s - loss: 0.5603 - accuracy: 0.7132
 8/20 [===========>..................] - ETA: 2s - loss: 0.5576 - accuracy: 0.7161
 9/20 [============>.................] - ETA: 2s - loss: 0.5545 - accuracy: 0.7216
10/20 [==============>...............] - ETA: 1s - loss: 0.5570 - accuracy: 0.7209
11/20 [===============>..............] - ETA: 1s - loss: 0.5560 - accuracy: 0.7227
12/20 [=================>............] - ETA: 1s - loss: 0.5540 - accuracy: 0.7251
13/20 [==================>...........] - ETA: 1s - loss: 0.5549 - accuracy: 0.7258
14/20 [====================>.........] - ETA: 0s - loss: 0.5525 - accuracy: 0.7288
15/20 [=====================>........] - ETA: 0s - loss: 0.5504 - accuracy: 0.7314
16/20 [=======================>......] - ETA: 0s - loss: 0.5510 - accuracy: 0.7311
17/20 [========================>.....] - ETA: 0s - loss: 0.5492 - accuracy: 0.7333
18/20 [==========================>...] - ETA: 0s - loss: 0.5483 - accuracy: 0.7340
19/20 [===========================>..] - ETA: 0s - loss: 0.5470 - accuracy: 0.7366
30/30 [==============================] - 6s 200ms/step - loss: 0.5450 - accuracy: 0.7421 - val_loss: 0.5294 - val_accuracy: 0.7485
Epoch 5/20

 1/20 [>.............................] - ETA: 15s - loss: 0.5316 - accuracy: 0.7480
 2/20 [==>...........................] - ETA: 8s - loss: 0.5228 - accuracy: 0.7588 
 3/20 [===>..........................] - ETA: 6s - loss: 0.5195 - accuracy: 0.7572
 4/20 [=====>........................] - ETA: 4s - loss: 0.5217 - accuracy: 0.7515
 5/20 [======>