Choosing Hyperparameters That Fit Your Own Task: Hyperparameter Evolution in YOLOv5



ayiya_Oese · Published 2021-04-01 14:52:49


Preface

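YOLOv5 provides a hyperparameter optimization method called Hyperparameter Evolution. It uses a genetic algorithm (GA) to optimize hyperparameters, and it lets you find hyperparameters that are better suited to your own dataset and task.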

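The default hyperparameters were themselves obtained by running hyperparameter evolution on the COCO dataset. Because hyperparameter evolution consumes a lot of compute and time, sticking with the defaults is a perfectly good choice if the results they produce already meet your needs.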

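Hyperparameters in ML control various aspects of training, and finding an optimal set of values can be a challenge. Traditional methods such as grid search can quickly become intractable because of:

the high dimensionality of the search space;
unknown correlations among the dimensions;
the high cost of evaluating the fitness at each point.
These reasons make genetic algorithms a suitable candidate for hyperparameter search.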
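To put the first point in perspective, here is a rough back-of-the-envelope sketch. The choice of 5 candidate values per hyperparameter is an illustrative assumption, not something YOLOv5 prescribes, and `grid_size` is a hypothetical helper.

```python
# Rough size of an exhaustive grid over a hyperparameter space.
# Assumption: every hyperparameter gets the same number of candidate values.
def grid_size(num_hyperparameters: int, values_per_hyperparameter: int) -> int:
    """Number of training runs a full grid search would need."""
    return values_per_hyperparameter ** num_hyperparameters


if __name__ == "__main__":
    # 5 candidate values per hyperparameter is an illustrative choice only.
    for n in (2, 5, 25):
        print(f"{n} hyperparameters -> {grid_size(n, 5):,} training runs")
```

Since each grid point is a full training run, the count becomes impractical long before it reaches the roughly 25 hyperparameters YOLOv5 exposes.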

1. Initialize hyperparameters

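YOLOv5 has about 25 hyperparameters for various training settings, defined in yaml files in the /data directory. Better initial values lead to better final results, so it is important to initialize them properly before evolving. If you are unsure how to initialize them, simply use the defaults, which were optimized for COCO training.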

yolov5/data/hyp.scratch.yaml

# Hyperparameters for COCO training from scratch
# python train.py --batch 40 --cfg yolov5m.yaml --weights '' --data coco.yaml --img 640 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials


lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.2  # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 0.05  # box loss gain
cls: 0.5  # cls loss gain
cls_pw: 1.0  # cls BCELoss positive_weight
obj: 1.0  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold
anchor_t: 4.0  # anchor-multiple threshold
# anchors: 3  # anchors per output layer (0 to ignore)
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.1  # image translation (+/- fraction)
scale: 0.5  # image scale (+/- gain)
shear: 0.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.0  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 1.0  # image mosaic (probability)
mixup: 0.0  # image mixup (probability)

2. Define fitness

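Fitness is the value we seek to maximize. YOLOv5 defines a fitness function that computes a weighted combination of the metrics: mAP@0.5 contributes 10% of the weight, mAP@0.5:0.95 contributes the remaining 90%, and precision (P) and recall (R) are given zero weight.
yolov5/utils/metrics.py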

def fitness(x):
    # Model fitness as a weighted combination of metrics
    w = [0.0, 0.0, 0.1, 0.9]  # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
    return (x[:, :4] * w).sum(1)
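
As a quick worked example of this weighting, the sketch below recomputes the same weighted sum for a single result row in plain Python. The metric values are made up for illustration, and `fitness_row` is a hypothetical helper, not part of YOLOv5.

```python
# Weights copied from fitness() above: [P, R, mAP@0.5, mAP@0.5:0.95]
WEIGHTS = [0.0, 0.0, 0.1, 0.9]


def fitness_row(metrics):
    """Weighted sum of the first four metrics of one result row (hypothetical helper)."""
    return sum(m * w for m, w in zip(metrics[:4], WEIGHTS))


if __name__ == "__main__":
    # Made-up metrics: P=0.70, R=0.60, mAP@0.5=0.50, mAP@0.5:0.95=0.30
    row = [0.70, 0.60, 0.50, 0.30]
    # 0.1 * 0.50 + 0.9 * 0.30 = 0.05 + 0.27 = 0.32
    print(round(fitness_row(row), 4))
```

Because precision and recall carry zero weight, two runs with the same mAP values get the same fitness regardless of their P and R.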

3. Evolve

The base scenario fine-tunes the pretrained yolov5s on COCO128 for 10 epochs:

python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache

To evolve hyperparameters for this scenario, add the --evolve flag:

# Single-GPU
python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve

# Multi-GPU
for i in 0 1 2 3; do
  nohup python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve --device $i > evolve_gpu_$i.log &
done

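# In the multi-GPU case, `nohup` ("no hang up") keeps the command running in the background, so closing the terminal does not stop it.
# The trailing `&` runs the command in the background.
# The two are usually used together: `nohup command &`.

# List the running processes:
ps -aux | grep train.py

# Terminate a process:
kill -9 <PID>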

By default, the evolution settings in the code run the base scenario 300 times, i.e. for 300 generations:
yolov5/train.py

for _ in range(300):  # generations to evolve

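The main genetic operators are crossover and mutation. Here, mutation is applied with 90% probability and a variance of 0.04 to create new offspring based on a combination of the best parents from all previous generations. Results are logged to yolov5/evolve.txt, and the offspring with the highest fitness is saved to yolov5/runs/evolve/hyp_evolved.yaml.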
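
The mutation step can be sketched roughly as follows. This is a simplified standard-library illustration, not the actual code in train.py: `mutate`, the two-entry `meta` dictionary and its bounds are hypothetical, and the sketch assumes each meta entry has the form (gain, lower_bound, upper_bound), with a standard deviation of 0.2 corresponding to the 0.04 variance.

```python
import random


def mutate(parent, meta, mutation_prob=0.9, sigma=0.2, rng=None):
    """Return a mutated copy of `parent` (hypothetical helper, simplified sketch).

    parent: dict of hyperparameter name -> value
    meta:   dict of name -> (gain, lower_bound, upper_bound); gain 0 fixes a value
    """
    rng = rng or random.Random()
    child = dict(parent)
    for name, (gain, lower, upper) in meta.items():
        factor = 1.0
        if rng.random() < mutation_prob:
            # Multiplicative Gaussian noise, scaled by the per-parameter gain
            factor += gain * rng.gauss(0.0, sigma)
        value = parent[name] * factor
        # Clamp the mutated value to the bounds given in meta
        child[name] = min(max(value, lower), upper)
    return child


if __name__ == "__main__":
    parent = {"lr0": 0.01, "momentum": 0.937}
    # Assumed example bounds; momentum has gain 0, so it never changes.
    meta = {"lr0": (1, 1e-5, 1e-1), "momentum": (0, 0.6, 0.98)}
    print(mutate(parent, meta, rng=random.Random(0)))
```

In this sketch a gain of 0 leaves the parent value untouched, which is one way to picture how a parameter can be held fixed during evolution.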

4. Visualize

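The results are saved to yolov5/evolve.png, with one chart per hyperparameter. Hyperparameter values are on the x axis and fitness is on the y axis. Yellow indicates a higher concentration of points. A vertical line means the parameter has been fixed and does not mutate. This is user-selectable through the meta dictionary in train.py, which is useful for fixing parameters and preventing them from evolving.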

Errors

Error 1: KeyError: 'anchors':
issues/2485
issues/1411
pull/1135

I think commenting the same field in the meta dictionary can work… yes that should work, it will act as if the field does not exist at all. Anchor count will be fixed at 3, and autoanchor will be run if the Best Possible Recall (BPR) dips below threshold, which is set at 0.98 at the moment. Varying the hyps can cause your BPR to vary, so its possible some generations may use it and other not. - - glenn-jocher

EDIT: BTW the reason there are two dictionaries is that the meta dictionary contains gains and bounds applied to each hyperparameter during evolution as key: [gain, lower_bound, upper_bound]. meta is only ever used during evolution, I kept it separated to avoid complicating the hyp dictionary, again not sure if that’s the best design choice, we could merge them, but then each hyp.yaml would be busier and more complicated to read. - - glenn-jocher

The cause is that anchors is commented out in data/hyp.scratch.yaml. Uncommenting it and running again produces the following error.

Error 2: IndexError: index 34 is out of bounds for axis 0 with size 34:
pull/1135

Comment out anchors in data/hyp.scratch.yaml, and also comment out anchors in the meta dictionary in train.py. With both commented out, the run succeeds.
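
This fix keeps the hyp file and the meta dictionary consistent with each other. A small check along the following lines could flag a mismatch before launching a long evolution run. `check_hyp_meta` is a hypothetical helper, not part of YOLOv5, and the dictionaries are trimmed-down stand-ins with assumed values.

```python
def check_hyp_meta(hyp, meta):
    """Return (keys only in meta, keys only in hyp). Hypothetical helper."""
    missing_in_hyp = sorted(set(meta) - set(hyp))
    missing_in_meta = sorted(set(hyp) - set(meta))
    return missing_in_hyp, missing_in_meta


if __name__ == "__main__":
    # Trimmed-down stand-ins with assumed values: 'anchors' is commented out
    # of the hyp file but still present in meta, the situation behind
    # KeyError: 'anchors'.
    hyp = {"lr0": 0.01, "momentum": 0.937}
    meta = {
        "lr0": (1, 1e-5, 1e-1),
        "momentum": (0.3, 0.6, 0.98),
        "anchors": (2, 2.0, 10.0),
    }
    only_meta, only_hyp = check_hyp_meta(hyp, meta)
    print("in meta but not in hyp:", only_meta)
    print("in hyp but not in meta:", only_hyp)
```

If either list is non-empty, the two sources disagree and it is worth reconciling them first.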

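If you set a value for hyp['anchors'], autoanchor creates new anchors that override any anchor information specified in model.yaml. For example, you can set anchors: 5 to force autoanchor to create 5 new anchors per output layer, replacing the existing ones. Hyperparameter evolution will then use this parameter to evolve the optimal number of anchors for you. (source: issue)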