💎 1MB lightweight face detection model
This model is a lightweight face detection model designed for edge computing devices.
The training set is in VOC format, generated from the WIDER FACE dataset together with the cleaned WIDER FACE labels provided by RetinaFace (note: the following test results were measured by the author and may differ slightly in places from official figures).
Results at the lower test resolution:

Model | Easy Set | Medium Set | Hard Set |
---|---|---|---|
libfacedetection v1(caffe) | 0.65 | 0.5 | 0.233 |
libfacedetection v2(caffe) | 0.714 | 0.585 | 0.306 |
Retinaface-Mobilenet-0.25 (Mxnet) | 0.745 | 0.553 | 0.232 |
version-slim | 0.77 | 0.671 | 0.395 |
version-RFB | 0.787 | 0.698 | 0.438 |
Results at the higher test resolution:

Model | Easy Set | Medium Set | Hard Set |
---|---|---|---|
libfacedetection v1(caffe) | 0.741 | 0.683 | 0.421 |
libfacedetection v2(caffe) | 0.773 | 0.718 | 0.485 |
Retinaface-Mobilenet-0.25 (Mxnet) | 0.879 | 0.807 | 0.481 |
version-slim | 0.853 | 0.819 | 0.539 |
version-RFB | 0.855 | 0.822 | 0.579 |
- This part mainly evaluates the test set at small and medium input resolutions.
- RetinaFace-mnet (Retinaface-Mobilenet-0.25) comes from the excellent insightface project. When testing that network, the original image is scaled so that its longer side is 320 or 640, so faces are not deformed; the other networks use a fixed-size resize. For reference, RetinaFace-mnet's best single-scale result on the val set at 1600 resolution is 0.887 (Easy) / 0.870 (Medium) / 0.791 (Hard).
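The long-side scaling used when testing RetinaFace-mnet can be sketched as follows. This is a minimal illustration; the helper name `scale_to_max_side` is my own and not part of the repo:

```python
def scale_to_max_side(height, width, max_side):
    """Return (new_height, new_width) with the longer side scaled to
    max_side and the aspect ratio preserved, so faces are not deformed."""
    scale = max_side / max(height, width)
    return round(height * scale), round(width * scale)

# A 1280x720 image tested with a maximum side length of 320:
print(scale_to_max_side(720, 1280, 320))  # -> (180, 320)
```

By contrast, a fixed-size resize (as used for the other networks) stretches the image to e.g. 320x240 regardless of its original aspect ratio.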
Model | 1 core | 2 cores | 3 cores | 4 cores |
---|---|---|---|---|
libfacedetection v1 | 28 | 16 | 12 | 9.7 |
Official Retinaface-Mobilenet-0.25 (Mxnet) | 46 | 25 | 18.5 | 15 |
version-slim | 29 | 16 | 12 | 9.5 |
version-RFB | 35 | 19.6 | 14.8 | 11 |
Model | Inference Latency(ms) |
---|---|
slim-320 | 6.33 |
RFB-320 | 7.8 |
Model | Inference Latency(ms) |
---|---|
slim-320 | 65.6 |
RFB-320 | 164.8 |
Model | Model file size (MB) |
---|---|
libfacedetection v1(caffe) | 2.58 |
libfacedetection v2(caffe) | 3.34 |
Official Retinaface-Mobilenet-0.25 (Mxnet) | 1.68 |
version-slim | 1.04 |
version-RFB | 1.11 |
(1) The cleaned WIDER FACE data package with faces smaller than 10px × 10px filtered out: Baidu cloud disk (extraction code: cbiu), Google Drive
(2) The complete WIDER FACE data package without small faces filtered out: Baidu cloud disk (extraction code: ievk), Google Drive
```
python3 ./data/wider_face_2_voc_add_landmark.py
```
After the script finishes, a wider_face_add_lm_10_10 folder is generated in the ./data directory. Its contents are identical to those of data package (1) after decompression. The complete directory structure is as follows:
```
data/
  retinaface_labels/
    test/
    train/
    val/
  wider_face/
    WIDER_test/
    WIDER_train/
    WIDER_val/
  wider_face_add_lm_10_10/
    Annotations/
    ImageSets/
    JPEGImages/
  wider_face_2_voc_add_landmark.py
```
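Before starting training, it can help to verify that the conversion produced the expected VOC layout. A minimal sketch (the `check_voc_layout` helper is my own, not part of the repo):

```python
import os

# Subdirectories the conversion script is expected to produce.
EXPECTED = ["Annotations", "ImageSets", "JPEGImages"]

def check_voc_layout(root):
    """Return the list of expected VOC subdirectories missing under root."""
    return [d for d in EXPECTED if not os.path.isdir(os.path.join(root, d))]

missing = check_voc_layout("./data/wider_face_add_lm_10_10")
if missing:
    print("missing:", missing)
else:
    print("VOC layout looks complete")
```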
At this point, the VOC training set is ready. There are two scripts in the project root: train-version-slim.sh and train-version-RFB.sh. The former trains the slim version of the model and the latter trains the RFB version. The default parameters are already set; if you need to change them, refer to the description of each training parameter in ./train.py.
Run train-version-slim.sh or train-version-RFB.sh:

```
sh train-version-slim.sh
# or
sh train-version-RFB.sh
```
(1) Optimal: train at input_size 640 (640x480) and use the same or a larger input size for inference, for example with the provided pre-trained models version-slim-640.pth or version-RFB-640.pth; this lowers false positives.
(2) Sub-optimal: train at input_size 320 (320x240) and use a 480x360 or 640x480 input for inference; this is more sensitive to small faces, but false positives will increase.
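The two recommendations above can be summarized in a small helper that maps a training resolution to the suggested inference resolutions. The function and its return values simply encode the rules stated here; it is not part of the repo:

```python
def recommended_inference_sizes(train_input_size):
    """Map a training input_size to the inference resolutions suggested
    above: 640-trained models keep 640x480 (or larger) for fewer false
    positives; 320-trained models may upscale to 480x360 or 640x480 to
    catch smaller faces, at the cost of more false positives."""
    if train_input_size == 640:
        return [(640, 480)]
    if train_input_size == 320:
        return [(480, 360), (640, 480)]
    raise ValueError("expected a training input_size of 320 or 640")

print(recommended_inference_sizes(320))  # -> [(480, 360), (640, 480)]
```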