Building a Keras CNN Model in Python to Crack a Website CAPTCHA


In this project, Keras is used to build a somewhat more complex CNN model to crack the website's CAPTCHA, whose individual characters are digits and uppercase letters.

Keras makes it quick and convenient to build CNN models. The model used in this project stacks three blocks of paired convolutional layers (32, 64 and 128 filters), each block followed by max pooling and Dropout, and ends with fully connected layers; its full definition appears in the training code below.
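
The training code expects a prepared data.csv in which each row is one segmented CAPTCHA character: columns v1 through v320 hold the flattened 20x16 grayscale pixels and the label column holds the character. The post does not show how that file is produced; the following is only a rough sketch of one possible way to build it, assuming the characters have already been segmented into individual image files named <label>_<index>.png (the directory layout, file naming and Pillow usage here are illustrative assumptions, not part of the original project):

# -*- coding: utf-8 -*-
# Hypothetical preprocessing sketch: flatten segmented 20x16 character images
# into the v1..v320 + label columns that data.csv is assumed to contain.
import os
import numpy as np
import pandas as pd
from PIL import Image

char_dir = './chars'   # assumed directory of pre-segmented character images
rows = []
for fname in os.listdir(char_dir):
    if not fname.endswith('.png'):
        continue
    label = fname.split('_')[0]                    # assumed naming: <label>_<index>.png
    img = Image.open(os.path.join(char_dir, fname)).convert('L')   # grayscale
    img = img.resize((16, 20))                     # width 16, height 20 -> 20x16 array
    pixels = np.array(img).reshape(-1) / 255.0     # flatten to 320 values in [0, 1]
    row = {'v' + str(i + 1): pixels[i] for i in range(320)}
    row['label'] = label
    rows.append(row)

pd.DataFrame(rows).to_csv('./data.csv', index=False)

Whether the pixel values are binarized or left as normalized grayscale depends on how the original data was produced; the training script only requires 320 numeric columns plus a label per row.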

The dataset is split into training and test sets in a 7:3 ratio (test_size=0.3 in the code), and the model is trained with the following code:

# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
 
from keras.utils import np_utils, plot_model
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.callbacks import EarlyStopping
from keras.layers import Conv2D, MaxPooling2D
 
# Read the data
df = pd.read_csv('./data.csv')
 
# Label values: 31 possible characters (digits 1-9 plus uppercase letters, excluding I, M, O and W)
vals = range(31)
keys = ['1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','J','K','L','N','P','Q','R','S','T','U','V','X','Y','Z']
label_dict = dict(zip(keys, vals))
 
# Features: the 320 pixel columns v1..v320 (each sample is a 20x16 image)
x_data = df[['v'+str(i+1) for i in range(320)]]
y_data = pd.DataFrame({'label':df['label']})
y_data['class'] = y_data['label'].apply(lambda x: label_dict[x])
 
# Split the data into training and test sets (70% / 30%)
X_train, X_test, Y_train, Y_test = train_test_split(x_data, y_data['class'], test_size=0.3, random_state=42)
# Reshape to image tensors: 1167 training and 501 test samples of shape 20x16x1
x_train = np.array(X_train).reshape((len(X_train), 20, 16, 1))
x_test = np.array(X_test).reshape((len(X_test), 20, 16, 1))
 
# One-hot encode the labels
n_classes = 31
y_train = np_utils.to_categorical(Y_train, n_classes)
y_val = np_utils.to_categorical(Y_test, n_classes)
 
input_shape = x_train[0].shape
 
# CNN model
model = Sequential()
 
# Convolution and pooling layers
model.add(Conv2D(32, kernel_size=(3, 3), input_shape=input_shape, padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
 
# Dropout layer
model.add(Dropout(0.25))
 
model.add(Conv2D(64, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
 
model.add(Dropout(0.25))
 
model.add(Conv2D(128, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(128, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
 
model.add(Dropout(0.25))
 
model.add(Flatten())
 
# Fully connected layers
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(Dense(n_classes, activation='softmax'))
 
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
 
# plot model
##plot_model(model, to_file=r'./model.png', show_shapes=True)
 
# Model training with early stopping on the validation accuracy
# (in newer tf.keras versions this metric key is 'val_accuracy' rather than 'val_acc')
callbacks = [EarlyStopping(monitor='val_acc', patience=5, verbose=1)]
batch_size = 64
n_epochs = 100
history = model.fit(x_train, y_train, batch_size=batch_size, epochs=n_epochs, \
          verbose=1, validation_data=(x_test, y_val), callbacks=callbacks)
 
mp = './verifycode_Keras.h5'
model.save(mp)
 
# Plot the accuracy curve on the validation set
val_acc = history.history['val_acc']
plt.plot(range(len(val_acc)), val_acc, label='CNN model')
plt.title('Validation accuracy on verifycode dataset')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend()
plt.show()

The code above uses the early-stopping technique during training. Early stopping is a Keras callback that halts training ahead of schedule: here it monitors the validation accuracy (val_acc) and stops once that metric has not improved for 5 consecutive epochs (patience=5).
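
For comparison, the same callback written against the newer tf.keras API would look roughly like the sketch below; there the monitored metric is named val_accuracy rather than val_acc, and restore_best_weights rolls the model back to its best epoch. This is only an illustrative equivalent, not part of the original training script.

# Sketch only: an equivalent early-stopping setup under tf.keras (TensorFlow 2.x)
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(
    monitor='val_accuracy',      # validation accuracy under tf.keras naming
    patience=5,                  # stop after 5 epochs without improvement
    restore_best_weights=True,   # keep the best weights seen during training
    verbose=1,
)
# model.fit(..., callbacks=[early_stop])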

Running the training code above produces output like the following:

......(earlier output omitted)
Epoch 22/100
 
 64/1167 [>.............................] - ETA: 3s - loss: 0.0399 - acc: 1.0000
 128/1167 [==>...........................] - ETA: 3s - loss: 0.1195 - acc: 0.9844
 192/1167 [===>..........................] - ETA: 2s - loss: 0.1085 - acc: 0.9792
 256/1167 [=====>........................] - ETA: 2s - loss: 0.1132 - acc: 0.9727
 320/1167 [=======>......................] - ETA: 2s - loss: 0.1045 - acc: 0.9750
 384/1167 [========>.....................] - ETA: 2s - loss: 0.1006 - acc: 0.9740
 448/1167 [==========>...................] - ETA: 2s - loss: 0.1522 - acc: 0.9643
 512/1167 [============>.................] - ETA: 1s - loss: 0.1450 - acc: 0.9648
 576/1167 [=============>................] - ETA: 1s - loss: 0.1368 - acc: 0.9653
 640/1167 [===============>..............] - ETA: 1s - loss: 0.1353 - acc: 0.9641
 704/1167 [=================>............] - ETA: 1s - loss: 0.1280 - acc: 0.9659
 768/1167 [==================>...........] - ETA: 1s - loss: 0.1243 - acc: 0.9674
 832/1167 [====================>.........] - ETA: 0s - loss: 0.1577 - acc: 0.9639
 896/1167 [======================>.......] - ETA: 0s - loss: 0.1488 - acc: 0.9665
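
Once verifycode_Keras.h5 has been saved, it can be loaded to classify new character images. The sketch below shows one way to decode a prediction back to a character by inverting label_dict; the pixels array is only a placeholder standing in for a preprocessed 20x16 character and is not taken from the original post.

# Sketch: load the saved model and map the predicted class index back to a character
import numpy as np
from keras.models import load_model

model = load_model('./verifycode_Keras.h5')

# Invert the label mapping used during training: class index -> character
keys = ['1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','J','K','L','N','P','Q','R','S','T','U','V','X','Y','Z']
idx_to_char = dict(zip(range(31), keys))

# Placeholder input: one preprocessed character as 320 pixel values (20x16 image)
pixels = np.zeros(320)
x = pixels.reshape((1, 20, 16, 1))           # batch of a single sample
pred_class = int(np.argmax(model.predict(x), axis=1)[0])
print(idx_to_char[pred_class])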