size mismatch: checkpoint weights do not match the model


python_Ezreal · published 2022-04-23 20:03:10

While testing my stage-two and stage-three models, the program kept failing with this error:

RuntimeError: Error(s) in loading state_dict for Eff:
size mismatch for fc.weight: copying a param with shape torch.Size([18, 1000]) from checkpoint, the shape in current model is torch.Size([14, 1000]).
size mismatch for fc.bias: copying a param with shape torch.Size([18]) from checkpoint, the shape in current model is torch.Size([14]).

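The cause is that the output dimension of the fc layer differs between the checkpoint and the current model: the checkpoint's fc layer has 18 outputs, while the model being built has 14. Most of what I found online says to simply ignore the fc layer when loading, but I could not follow many of those explanations. In the end I found the blog post below.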

https://blog.csdn.net/weixin_44966641/article/details/120083303

After following that post, I still got the same error.

It turned out I had added the code in the wrong place:

eff_cls2(inner_model=Eff(num_classes=2),
         ckpt_path="D:\\a\\nzb_test-master\\nzb_test-master\\eff\\cls2.pth",
         data_pool=data_pool)

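At first I added the code here, where the model is constructed, and the error persisted. With the problem still unsolved, I went back and read the traceback again: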

Traceback (most recent call last):
  File "D:/a/nzb_test-master/nzb_test-master/test1.py", line 44, in <module>
    data_pool=data_pool)
  File "D:\a\nzb_test-master\nzb_test-master\process.py", line 318, in __init__
    super().__init__(inner_model, ckpt_path, data_pool)
  File "D:\a\nzb_test-master\nzb_test-master\process.py", line 39, in __init__
    self.prepare()
  File "D:\a\nzb_test-master\nzb_test-master\process.py", line 45, in prepare
    self.inner_model.load_state_dict(ckpt)
  File "D:\ananconda\envs\yolo\lib\site-packages\torch\nn\modules\module.py", line 1498, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))

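The traceback shows that the failure happens inside process.py, so I opened that file and found the prepare function, which is where the checkpoint is actually loaded. Following the blogger's method, I added these two lines there: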

ckpt.pop("fc.bias")
ckpt.pop("fc.weight")

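Use whatever key names your own error message reports. These statements remove the two entries from the checkpoint dict, so load_state_dict never tries to copy them into the model and the size mismatch is no longer raised.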
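If the error lists many layers, the key names can be pulled out of the message text instead of being copied by hand. This is a minimal sketch, assuming the message follows the `size mismatch for <key>:` pattern shown in the error above; `mismatched_keys` is a hypothetical helper name, not part of PyTorch:

```python
import re


def mismatched_keys(error_text):
    """Return the parameter names reported as 'size mismatch' in the error text."""
    # Each offending entry appears as "size mismatch for <key>: ..."
    return re.findall(r"size mismatch for ([\w.]+):", error_text)


error_text = (
    "Error(s) in loading state_dict for Eff:\n"
    "size mismatch for fc.weight: copying a param with shape torch.Size([18, 1000]) "
    "from checkpoint, the shape in current model is torch.Size([14, 1000]).\n"
    "size mismatch for fc.bias: copying a param with shape torch.Size([18]) "
    "from checkpoint, the shape in current model is torch.Size([14])."
)

keys = mismatched_keys(error_text)
print(keys)  # ['fc.weight', 'fc.bias']
```

Each returned name can then be passed to `ckpt.pop(...)`.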

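My code still raised an error, though, so I went back and looked again. I found that I was calling load_state_dict without strict=False. This argument tells load_state_dict to tolerate keys that exist in the model but are missing from the checkpoint, or the other way around. It does not skip shape mismatches by itself, which is why the two pop calls are still needed; once fc.weight and fc.bias are popped, they become missing keys, and strict=False lets the load proceed. My code did not have this argument, so I added it as well, and the code ran.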
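Why both steps are needed can be seen by comparing shapes directly. The sketch below is an illustration under stated assumptions, not the blogger's code: `split_by_shape` and `fake` are hypothetical helpers, "conv.weight" is an invented extra entry, and the demo uses small stand-in objects with a `.shape` attribute so that it runs without PyTorch. Real tensors from `torch.load` and `model.state_dict()` expose `.shape` in the same way.

```python
from types import SimpleNamespace


def split_by_shape(ckpt, model_state):
    """Split checkpoint entries into loadable ones and ones to drop.

    Both arguments map parameter names to objects with a `.shape` attribute.
    An entry is dropped if the model has no such name or the shapes differ.
    """
    keep, dropped = {}, []
    for name, value in ckpt.items():
        target = model_state.get(name)
        if target is not None and tuple(value.shape) == tuple(target.shape):
            keep[name] = value
        else:
            dropped.append(name)
    # Model parameters that the filtered checkpoint no longer provides;
    # these are the "missing keys" that strict=False tolerates.
    missing = [name for name in model_state if name not in keep]
    return keep, dropped, missing


def fake(*shape):
    # Stand-in for a tensor: only the shape matters here.
    return SimpleNamespace(shape=shape)


# fc shapes are taken from the error message; "conv.weight" is invented.
ckpt = {"conv.weight": fake(8, 3, 3, 3), "fc.weight": fake(18, 1000), "fc.bias": fake(18)}
model_state = {"conv.weight": fake(8, 3, 3, 3), "fc.weight": fake(14, 1000), "fc.bias": fake(14)}

keep, dropped, missing = split_by_shape(ckpt, model_state)
print(sorted(keep))  # ['conv.weight']
print(dropped)       # ['fc.weight', 'fc.bias']
print(missing)       # ['fc.weight', 'fc.bias']
```

With PyTorch, `keep` would be what gets passed to `load_state_dict(keep, strict=False)`. The parameters listed in `missing` are not loaded and keep whatever values the model was initialized with.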

The modified code is as follows:

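def prepare(self):
    self.inner_model.cuda()
    ckpt = torch.load(self.ckpt_path)
    # Drop the fc entries whose shapes differ from the current model.
    ckpt.pop("fc.bias")
    ckpt.pop("fc.weight")
    # strict=False tolerates the fc keys that are now missing from ckpt.
    self.inner_model.load_state_dict(ckpt, strict=False)
    self.inner_model.eval()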

With that, the error was completely resolved!