Face and Landmark Detection: YOLO5Face in Practice


烧技湾 · Published 2022-07-25 14:53:02

GitHub: https://github.com/deepcam-cn/yolov5-face

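Intro: Reproducing this project took me a full day, most of it spent wrestling with the dataset format. The author's train2yolo script re-saves every image of the WIDER FACE training set (12,000+ images) one by one, and skipping that step triggers a bug because the data cache cannot be read. In the evaluation stage, the output of val2yolo is not even used. In a word: a grind!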

1. Setup

Step by step

Clone the repository and activate the conda environment:

git clone https://github.com/deepcam-cn/yolov5-face.git

conda activate pytorch-cifar

Testing the model
Download the author's pretrained yolov5m-face model and test it on a single image. detect_face is run with the configuration below:


      "args": [
          // to run train2yolo:
          // "data/widerface/train",
          // "data/widerface/train/labels"

          // to run val2yolo:
          // "data/widerface",
          // "data/widerface/val/labels"

          // to run detect_face:
          // "--img-size", "640",
          // "--weights", "weights/yolov5m-face.pt",

          // to train on widerface:
          // "CUDA_VISIBLE_DEVICES"," 0, ",
          "--data", "data/widerface.yaml",
          "--cfg", "models/yolov5s.yaml",
          "--weights", "weights/yolov5s.pt"
      ]

These are the launch.json arguments, one group per task; uncomment the group you need. Running train2yolo.py is extremely slow, and val2yolo can be skipped.
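For anyone not using the VS Code debugger, the same argument groups can be turned into plain command lines. The sketch below only builds and prints the strings. The arguments come from the launch.json above, while the script paths (for example `data/train2yolo.py`) are my assumption about where each entry point sits relative to the repository root, so adjust them to your checkout.

```python
import shlex

# Arguments copied from the launch.json above.
# The script paths are assumptions, not taken from the post.
TASKS = {
    "train2yolo": ("data/train2yolo.py",
                   ["data/widerface/train", "data/widerface/train/labels"]),
    "val2yolo": ("data/val2yolo.py",
                 ["data/widerface", "data/widerface/val/labels"]),
    "detect_face": ("detect_face.py",
                    ["--img-size", "640", "--weights", "weights/yolov5m-face.pt"]),
    "train": ("train.py",
              ["--data", "data/widerface.yaml",
               "--cfg", "models/yolov5s.yaml",
               "--weights", "weights/yolov5s.pt"]),
}


def build_command(task: str) -> str:
    """Return a shell-quoted command line for one task."""
    script, args = TASKS[task]
    return shlex.join(["python", script, *args])


for name in TASKS:
    print(f"{name:12s} {build_command(name)}")
```

Printing rather than executing keeps the sketch harmless; paste the line you need into a terminal at the repository root.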

2. Training

2.1 Preparing the data:

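I recommend downloading the WIDER FACE dataset from the official site, http://shuoyang1213.me/WIDERFACE/, fetching the train, val, and test packages separately. The copy I had downloaded earlier by following RetinaFace turned out to be incomplete.

I could not find facial landmark annotations on the official site. Annotation files that include keypoints can be downloaded through the YOLO5Face GitHub link at the top of this post. The overall file layout is: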

(base) wqt@ser2024:NewProjects$ ln -s ~/Datasets/widerface  yolov5-face/data/

yolov5-face
│   README.md
│   ...
└───data
    └───widerface
        ├───test
        │   ├───images
        │   └───labels
        ├───train
        │   ├───images
        │   └───labels
        │       ├───0..._1_5.jpg
        │       └───0..._1_5.txt
        └───val

The tree is not pretty, but I went to some lengths to spell out the details.
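Since the labels folder is expected to hold each image next to its .txt file, a quick pairing check can save a failed training run. This is a minimal sketch of such a check, not part of the repository; `find_unpaired` is a hypothetical helper name, and the demo runs on a throwaway directory rather than the real dataset.

```python
import tempfile
from pathlib import Path


def find_unpaired(folder: Path) -> list[str]:
    """Return the stems of .jpg files in `folder` that lack a matching .txt file."""
    images = {p.stem for p in folder.glob("*.jpg")}
    labels = {p.stem for p in folder.glob("*.txt")}
    return sorted(images - labels)


# Demo on a temporary directory: a.jpg has a label, b.jpg does not.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("a.jpg", "a.txt", "b.jpg"):
        (folder / name).touch()
    missing = find_unpaired(folder)

print(missing)  # ['b']
```

On the real data you would call it as `find_unpaired(Path("data/widerface/train/labels"))` and expect an empty list.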

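Download the pretrained YOLOv5 detection weights. This project is built on a fairly old YOLOv5 release (the latest is 6.0), so use the following download link: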
https://github.com/ultralytics/yolov5/releases/tag/v4.0

So that the data paths are read correctly, edit widerface.yaml as follows:

train: /home/wqt/NewProjects/yolov5-face/data/widerface/train/labels  
val: /home/wqt/NewProjects/yolov5-face/data/widerface/val/labels   

# number of classes
nc: 1

# class names
names: [ 'face']

Run train. No errors, thank goodness!

autoanchor: Analyzing anchors... anchors/target = 4.02, Best Possible Recall (BPR) = 0.9997
Image sizes 800 train, 800 test
Using 4 dataloader workers
Logging results to runs/train/exp12
Starting training for 250 epochs...

     Epoch   gpu_mem       box       obj       cls  landmark     total   targets  img_size
     0/249     4.65G    0.1032   0.03953         0    0.1396    0.2823        25       800: 100%|

     Epoch   gpu_mem       box       obj       cls  landmark     total   targets  img_size
     1/249     5.01G   0.08159   0.03674         0   0.02959    0.1479        15       800: 100%|

Training is fairly quick: one epoch takes only a few minutes, and there are 250 epochs in total.

About the YOLO data format

The original WIDER FACE annotation looks like this:

0--Parade/0_Parade_marchingband_1_849.jpg
1
449 330 122 149 0 0 0 0 0 0

The landmark-annotated version used for the YOLO conversion looks like this (coordinates are still in pixels):

0--Parade/0_Parade_marchingband_1_849.jpg
449 330 122 149 488.906 373.643 0.0 542.089 376.442 0.0 515.031 412.83 0.0 485.174 425.893 0.0 538.357 431.491 0.0 0.82

That is: bbox (x, y, w, h), followed by five landmarks as (xi, yi, flag) triplets and one trailing value (0.82 here).
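To make the layout concrete, here is a rough sketch that turns the pixel-based line above into a normalized, YOLO-style label. It is not the repository's train2yolo.py. The output layout (class, center x/y, width, height, then five normalized landmark pairs) and the 1024x768 image size are assumptions chosen for illustration, and `to_yolo_line` is a hypothetical name.

```python
ANNOTATION = (
    "449 330 122 149 488.906 373.643 0.0 542.089 376.442 0.0 "
    "515.031 412.83 0.0 485.174 425.893 0.0 538.357 431.491 0.0 0.82"
)


def to_yolo_line(ann: str, img_w: int, img_h: int) -> str:
    """Normalize one pixel-based annotation line (assumed output layout)."""
    values = [float(v) for v in ann.split()]
    x, y, w, h = values[:4]      # WIDER FACE boxes are top-left x, y plus width, height
    landmarks = values[4:19]     # five (x, y, flag) triplets; the trailing value is dropped
    out = [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]
    for i in range(0, 15, 3):
        out += [landmarks[i] / img_w, landmarks[i + 1] / img_h]
    return "0 " + " ".join(f"{v:.6f}" for v in out)


# The image size is hypothetical; read the real size from the image file.
line = to_yolo_line(ANNOTATION, img_w=1024, img_h=768)
print(line)
```

The box center is computed from the top-left corner, which is the step most often gotten wrong when converting by hand.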

For comparison, here is a COCO keypoints annotation converted to YOLO format (17 human-body keypoints) for one example image:

0 0.686445 0.531960 0.082891 0.323967 0.667188 0.399061 1.000000 0.670312 0.396714 2.000000 0.000000 0.000000 0.000000 0.678125 0.394366 2.000000 0.000000 0.000000 0.000000 0.689063 0.415493 2.000000 0.696875 0.415493 2.000000 0.682813 0.469484 2.000000 0.671875 0.483568 2.000000 0.671875 0.516432 2.000000 0.656250 0.504695 2.000000 0.695312 0.530516 2.000000 0.706250 0.523474 2.000000 0.698438 0.610329 2.000000 0.709375 0.603286 2.000000 0.710938 0.680751 2.000000 0.717187 0.671362 2.000000
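The line is easier to read once it is split into its parts. The sketch below parses it into a class id, a normalized box, and 17 (x, y, v) keypoints. In COCO, v = 0 means not labeled, v = 1 labeled but not visible, and v = 2 labeled and visible. `parse_pose_label` is a hypothetical helper written for this post.

```python
from collections import Counter

LABEL = (
    "0 0.686445 0.531960 0.082891 0.323967 "
    "0.667188 0.399061 1.000000 0.670312 0.396714 2.000000 "
    "0.000000 0.000000 0.000000 0.678125 0.394366 2.000000 "
    "0.000000 0.000000 0.000000 0.689063 0.415493 2.000000 "
    "0.696875 0.415493 2.000000 0.682813 0.469484 2.000000 "
    "0.671875 0.483568 2.000000 0.671875 0.516432 2.000000 "
    "0.656250 0.504695 2.000000 0.695312 0.530516 2.000000 "
    "0.706250 0.523474 2.000000 0.698438 0.610329 2.000000 "
    "0.709375 0.603286 2.000000 0.710938 0.680751 2.000000 "
    "0.717187 0.671362 2.000000"
)


def parse_pose_label(line: str):
    """Split a label line into class id, bbox (cx, cy, w, h) and (x, y, v) keypoints."""
    values = line.split()
    cls = int(values[0])
    bbox = tuple(float(v) for v in values[1:5])
    rest = [float(v) for v in values[5:]]
    if len(rest) % 3 != 0:
        raise ValueError("keypoint values must come in (x, y, v) triplets")
    keypoints = [(rest[i], rest[i + 1], int(rest[i + 2]))
                 for i in range(0, len(rest), 3)]
    return cls, bbox, keypoints


cls, bbox, keypoints = parse_pose_label(LABEL)
counts = Counter(v for _, _, v in keypoints)
print(f"{len(keypoints)} keypoints, {counts[2]} visible, {counts[0]} not labeled")
```

Unlike the face annotation earlier, every value here is already normalized to the 0 to 1 range.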


After 12 hours of training, the measured results are:

  Epoch   gpu_mem       box       obj       cls  landmark     total   targets  img_size
   249/249     5.01G   0.03923   0.02193         0   0.00727   0.06844        18       800: 
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95: 
                 all    3.22e+03    3.96e+04       0.576       0.723       0.715        0.34

That is, on WIDER FACE the model reaches mAP@.5 = 71.5% and mAP@.5:.95 = 34%.

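Test 1: [image]

Test 2: an image edited to shrink the facial features. The result shows noticeable errors; for example, the two mouth-corner points are clearly off.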

3. Evaluation

Once the model above has finished training, evaluate its performance on WIDER FACE by running test_widerface with the following configuration.

    // to test on widerface
    "--weights", "runs/train/exp12/weights/best.pt",
    "--save_folder", "./widerface_evaluate/widerface_txt/",
    "--dataset_folder", "data/widerface/val/images/",
    "--folder_pict", "data/widerface/val/wider_val.txt"

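val2yolo.py took all that time to run, yet the evaluation still reads the original widerface/val. The image-listing loop also has to be changed, because the images sit one folder level below the dataset folder: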

  # for image_path in tqdm(glob.glob(os.path.join(testset_folder, '*'))):
  for image_path in tqdm(glob.glob(os.path.join(testset_folder, '*', "*.jpg"))):
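The effect of that one-line change is easy to reproduce. The sketch below builds a tiny directory that mimics the nesting (event folder, then images) and compares the two glob patterns. The folder and file names are only examples.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as testset_folder:
    # Mimic <dataset_folder>/<event>/<image>.jpg
    samples = [("0--Parade", "a.jpg"), ("0--Parade", "b.jpg"), ("1--Handshaking", "c.jpg")]
    for event, name in samples:
        os.makedirs(os.path.join(testset_folder, event), exist_ok=True)
        open(os.path.join(testset_folder, event, name), "w").close()

    # Original pattern: matches the event folders, not the images.
    flat = glob.glob(os.path.join(testset_folder, "*"))
    # Patched pattern: descends one level and keeps only .jpg files.
    nested = glob.glob(os.path.join(testset_folder, "*", "*.jpg"))

    flat_names = sorted(os.path.basename(p) for p in flat)
    nested_names = sorted(os.path.basename(p) for p in nested)

print(flat_names)    # ['0--Parade', '1--Handshaking']
print(nested_names)  # ['a.jpg', 'b.jpg', 'c.jpg']
```

With the original pattern the loop would try to open directories as images, which is why the patch is needed.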

Then switch to the evaluation folder and run the evaluation script:

(pytorch-cifar) wqt@ser2024:yolov5-face$ cd widerface_evaluate/
(pytorch-cifar) wqt@ser2024:widerface_evaluate$ python evaluation.py

The evaluation program prints:

Reading Predictions : 100%|
Processing easy: 100%|
Processing medium: 100%|
Processing hard: 100%|
==================== Results ====================
Easy   Val AP: 0.9430954074066416
Medium Val AP: 0.9235452692977046
Hard   Val AP: 0.8268421552890463
=================================================

Compared with the author's reported results for yolov5s (in parentheses), I get:
Easy 94.3% < (94.67%)
Medium 92.4% < (92.75%)
Hard 82.7% < (83.03%)

All three are slightly lower than the author's figures.
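To put a number on "slightly lower", this small sketch computes the gap in percentage points from the AP values printed by evaluation.py and the author's figures quoted above. Nothing here is new data.

```python
# AP values printed by evaluation.py in this run
mine = {
    "Easy": 0.9430954074066416,
    "Medium": 0.9235452692977046,
    "Hard": 0.8268421552890463,
}
# Author's reported yolov5s results, in percent
reported = {"Easy": 94.67, "Medium": 92.75, "Hard": 83.03}

gaps = {k: round(reported[k] - 100 * mine[k], 2) for k in mine}

for k in mine:
    print(f"{k:6s} {100 * mine[k]:.2f}% vs {reported[k]:.2f}%  (gap {gaps[k]:.2f} points)")
```

The gap stays at roughly 0.35 to 0.40 points on every subset.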
