Assignment 4: Cats vs. Dogs Challenge

Part 1

First, I downloaded the dataset provided by the instructor. Once the download is complete, the data needs some preprocessing:

The images are resized to 224 × 224 × 3 and normalized. Other, more elaborate preprocessing/augmentation transforms (normalization, cropping, flipping, jittering, etc.) are described in the official torchvision.transforms documentation. The data is also split into a training set and a test set, and a few images are plotted for visualization to make sanity checks easier.
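As an illustration, here is one possible pair of transform pipelines built from torchvision.transforms. The augmentation choices (RandomCrop, RandomHorizontalFlip, ColorJitter) are assumptions for illustration only; the pipeline actually used later in this report is just CenterCrop + ToTensor + Normalize.

from torchvision import transforms

# ImageNet statistics used for normalization
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# Training-time pipeline with light augmentation (illustrative)
train_tf = transforms.Compose([
    transforms.Resize(256),                 # scale the shorter side to 256
    transforms.RandomCrop(224),             # random 224x224 crop
    transforms.RandomHorizontalFlip(),      # random left-right flip
    transforms.ColorJitter(0.2, 0.2, 0.2),  # jitter brightness/contrast/saturation
    transforms.ToTensor(),                  # HWC uint8 image -> CHW float tensor in [0, 1]
    normalize,
])

# Evaluation-time pipeline: deterministic crop only
val_tf = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
])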

Here a pretrained VGG model is used for transfer learning, so we only need to modify the last layers: the pretrained parameters are frozen (requires_grad set to False, so gradient descent does not update them), and only the newly added fully connected layer is trained and evaluated.
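A minimal sketch of this idea (the complete code appears in Part 2; that version also appends a LogSoftmax layer so that NLLLoss can be used):

import torch.nn as nn
from torchvision import models

# Load VGG16 with ImageNet weights, freeze everything, and replace the
# final classifier layer with a 2-class (cat vs. dog) head.
model = models.vgg16(pretrained=True)
for p in model.parameters():
    p.requires_grad = False                   # freeze the pretrained weights
model.classifier[6] = nn.Linear(4096, 2)      # only this new layer will be trained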

(screenshot 2021-10-20 190935)

(screenshot 2021-10-20 190954)

The test results are shown in the screenshots above.

Part 2

The Cats vs. Dogs competition:

Starting from the code provided by the instructor, I made a few changes on top of it.

First, download the dataset and unpack it with unrar. Note that after extraction, the images under val and train are not yet separated into cat and dog classes by their file-name prefixes, so we need to run a script to split them.
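The script itself is not reproduced in this report; the following is a minimal sketch of what such a script might look like, under the assumption that the extracted files carry a class prefix (e.g. cat_0.jpg, dog_0.jpg) and sit directly under train/ and val/. The paths and naming here are assumptions.

import os
import shutil

# Hypothetical sorting script: move images whose names start with "cat" or
# "dog" into per-class subfolders so that ImageFolder can read them.
root = "/content/cat_dog"
for split in ("train", "val"):
    split_dir = os.path.join(root, split)
    for cls in ("cat", "dog"):
        os.makedirs(os.path.join(split_dir, cls), exist_ok=True)
    for name in os.listdir(split_dir):
        src = os.path.join(split_dir, name)
        if os.path.isfile(src) and name.startswith(("cat", "dog")):
            cls = "cat" if name.startswith("cat") else "dog"
            shutil.move(src, os.path.join(split_dir, cls, name))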

After running it, the directory looks like this:

(screenshot QQ图片20211021131113)

Below is the code with my changes:

import numpy as np
import matplotlib.pyplot as plt
import os
import torch
import torch.nn as nn
import torchvision
from torchvision import models,transforms,datasets
import time
import json
# Check whether a GPU device is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print('Using gpu: %s ' % torch.cuda.is_available())
# Data preprocessing
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
vgg_format = transforms.Compose([
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
])

data_dir = "/content/cat_dog"

dsets = {x: datasets.ImageFolder(os.path.join(data_dir, x), vgg_format)
         for x in ['train', 'val']}

dset_sizes = {x: len(dsets[x]) for x in ['train', 'val']}
dset_classes = dsets['train'].classes

# Changed batch_size to 128 for training
loader_train = torch.utils.data.DataLoader(dsets['train'], batch_size=128, shuffle=True, num_workers=6)
loader_valid = torch.utils.data.DataLoader(dsets['val'], batch_size=5, shuffle=False, num_workers=6)
# Load the VGG16 model: first fetch the ImageNet class index file
!wget https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json
# Needed when using VGG16
model_vgg = models.vgg16(pretrained=True)

print(model_vgg)

model_vgg_new = model_vgg

# Freeze the VGG16 parameters so gradient descent does not update them
for param in model_vgg_new.parameters():
    param.requires_grad = False
# Modify the last two layers of the model
model_vgg_new.classifier._modules['6'] = nn.Linear(4096, 2)
model_vgg_new.classifier._modules['7'] = torch.nn.LogSoftmax(dim = 1)
model_vgg_new = model_vgg_new.to(device)

print(model_vgg_new.classifier)

'''
Step 1: create the loss function and the optimizer

The input to the NLLLoss() loss function is a vector of log-probabilities and a target label.
It does not compute the log-probabilities for us, so it suits a network whose last layer is log_softmax().
'''
criterion = nn.NLLLoss()

# Learning rate
lr = 0.001

# Changed to the Adam optimizer
optimizer_vgg = torch.optim.Adam(model_vgg_new.classifier[6].parameters(), lr=lr)

'''
Step 2: train the model
'''

def train_model(model, dataloader, size, epochs=1, optimizer=None):
    model.train()
    max_acc = 0
    for epoch in range(epochs):
        running_loss = 0.0
        running_corrects = 0
        count = 0
        for inputs, classes in dataloader:
            inputs = inputs.to(device)
            classes = classes.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, classes)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, preds = torch.max(outputs.data, 1)
            # statistics
            running_loss += loss.data.item()
            running_corrects += torch.sum(preds == classes.data)
            count += len(inputs)
            # print('Training: No. ', count, ' process ... total: ', size)
        epoch_loss = running_loss / size
        epoch_acc = running_corrects.data.item() / size

        print('epoch: {:} Loss: {:.4f} Acc: {:.4f}\n'.format(epoch, epoch_loss, epoch_acc))
        # Save a checkpoint whenever the epoch accuracy improves
        if epoch_acc > max_acc:
            max_acc = epoch_acc
            path = './sample_data' + str(epoch+1) + '' + str(epoch_acc) + '' + '.pth'
            torch.save(model, path)
            print("save: ", path, "\n")

# Train the model; the number of epochs was changed to 10
train_model(model_vgg_new, loader_train, size=dset_sizes['train'], epochs=10,
            optimizer=optimizer_vgg)

# Step 3: test the model
def test_model(model, dataloader, size):
    model.eval()
    predictions = np.zeros(size)
    all_classes = np.zeros(size)
    all_proba = np.zeros((size, 2))
    i = 0
    running_loss = 0.0
    running_corrects = 0
    for inputs, classes in dataloader:
        inputs = inputs.to(device)
        classes = classes.to(device)
        outputs = model(inputs)
        loss = criterion(outputs, classes)
        _, preds = torch.max(outputs.data, 1)
        # statistics
        running_loss += loss.data.item()
        running_corrects += torch.sum(preds == classes.data)
        predictions[i:i+len(classes)] = preds.to('cpu').numpy()
        all_classes[i:i+len(classes)] = classes.to('cpu').numpy()
        all_proba[i:i+len(classes), :] = outputs.data.to('cpu').numpy()
        i += len(classes)
        print('Testing: No. ', i, ' process ... total: ', size)
    epoch_loss = running_loss / size
    epoch_acc = running_corrects.data.item() / size
    print('Loss: {:.4f} Acc: {:.4f}'.format(epoch_loss, epoch_acc))
    return predictions, all_proba, all_classes

# Run the test on the validation set
predictions, all_proba, all_classes = test_model(model_vgg_new,loader_valid,size=dset_sizes['val'])

To summarize, the main changes are:

1. Changed batch_size to 128.

2. Switched to the Adam optimizer.

3. Increased the number of training epochs to 10 and saved a checkpoint whenever accuracy improved, so the best model can be picked afterwards.

(screenshot QQ图片20211021135318)

(screenshot QQ图片20211021135246)

Then I picked the best model, the one from epoch 10 (on the guess that accuracy keeps rising with more training epochs, i.e. is positively correlated with the epoch count).

import torch
import numpy as np
from torchvision import transforms,datasets
from tqdm import tqdm
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
vgg_format = transforms.Compose([
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
])

dsets_mine = datasets.ImageFolder(r"/content/cat_dog", vgg_format)
loader_test = torch.utils.data.DataLoader(dsets_mine, batch_size=1, shuffle=False, num_workers=0)
model_vgg_new = torch.load(r'/content/sample_data100.98115.pth')
model_vgg_new = model_vgg_new.to(device)
dic = {}
def test(model, dataloader, size):
    model.eval()
    predictions = np.zeros(size)
    cnt = 0
    for inputs, _ in tqdm(dataloader):
        inputs = inputs.to(device)
        outputs = model(inputs)
        _, preds = torch.max(outputs.data, 1)
        # Split the path to recover the image id, because the images in the dataset
        # are not ordered 0-1999. On Colab (Linux) the backslash split is a no-op,
        # so key keeps the full path minus the extension, matching the lookup below.
        key = dsets_mine.imgs[cnt][0].split("\\")[-1].split('.')[0]
        dic[key] = preds[0]
        cnt = cnt + 1

test(model_vgg_new, loader_test, size=2000)

with open("result.csv", 'a+') as f:
    for key in range(2000):
        f.write("{},{}\n".format(key, dic["/content/cat_dog/test/" + str(key)]))

A few problems came up at first:

1. Colab/ImageFolder sorts file names by their leading characters rather than numerically, so the order comes out as 1, 10, 100, 1000, 2, ..., which easily misaligns the image ids (see the sketch after this list).

2. I had to look up how to write the results out as a csv file.
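A small sketch of how both points can be handled, sorting the test file names numerically instead of lexicographically and writing the rows with the csv module. The directory path and the preds_by_id placeholder are assumptions standing in for the real id-to-label mapping produced by the test loop above.

import csv
import os

test_dir = "/content/cat_dog/test"             # assumed location of the test images
preds_by_id = {i: 0 for i in range(2000)}      # placeholder for the real id -> 0/1 predictions

# Sort numerically (0, 1, 2, ...) instead of lexicographically (0, 1, 10, 100, ...)
names = sorted(os.listdir(test_dir),
               key=lambda n: int(os.path.splitext(n)[0]))

with open("result.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for name in names:
        idx = int(os.path.splitext(name)[0])
        writer.writerow([idx, preds_by_id[idx]])  # one "id,label" row per image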

After generating the csv file, I submitted it:

(screenshot QQ图片20211021153118)

The score is 98, a nice round number, close to model 10's validation accuracy of 98.11, which is acceptable.

Takeaways:

1. Expand the training set appropriately.

2. Switch to a more suitable optimizer.

3. Tune the batch_size.

4. Transfer learning is remarkably appealing and effective.