YOLOv6 Versions

YOLOv6: a single-stage object detection framework dedicated to industrial applications.

0.4.1

7 months ago

Features

  • Release YOLOv6-Segmentation models at all scales.
  • Achieve state-of-the-art accuracy in real-time instance segmentation.

Performance of YOLOv6-seg models

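| Model | Size | mAP box 50-95 | mAP mask 50-95 | Speed T4 trt fp16 b1 (fps) | Params (M) | FLOPs (G) |
| --- | --- | --- | --- | --- | --- | --- |
| YOLOv6-N | 640 | 35.3 | 31.2 | 645 | 4.9 | 7.0 |
| YOLOv6-S | 640 | 44.0 | 38.0 | 292 | 19.6 | 27.7 |
| YOLOv6-M | 640 | 48.2 | 41.3 | 148 | 37.1 | 54.3 |
| YOLOv6-L | 640 | 51.1 | 43.7 | 93 | 63.6 | 95.5 |
| YOLOv6-X | 640 | 52.2 | 44.8 | 47 | 119.1 | 175.5 |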

Table Notes

  • All checkpoints are trained from scratch on COCO for 300 epochs without distillation.
  • mAP and speed are evaluated on the COCO val2017 dataset at an input resolution of 640×640.
  • Speed is tested with TensorRT 8.5 on T4 without post-processing.

0.4.0

1 year ago

v4.0 release

Features

Performance of YOLOv6Lite models

| Model | Size | mAP val 0.5:0.95 | sm8350 (ms) | mt6853 (ms) | sdm660 (ms) | Params (M) | FLOPs (G) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOv6Lite-S | 320*320 | 22.4 | 7.99 | 11.99 | 41.86 | 0.55 | 0.56 |
| YOLOv6Lite-M | 320*320 | 25.1 | 9.08 | 13.27 | 47.95 | 0.79 | 0.67 |
| YOLOv6Lite-L | 320*320 | 28.0 | 11.37 | 16.20 | 61.40 | 1.09 | 0.87 |
| YOLOv6Lite-L | 320*192 | 25.0 | 7.02 | 9.66 | 36.13 | 1.09 | 0.52 |
| YOLOv6Lite-L | 224*128 | 18.9 | 3.63 | 4.99 | 17.76 | 1.09 | 0.24 |

Table Notes

  • We built a series of mobile models that vary in model size and input aspect ratio, so they can be applied flexibly in different scenarios.
  • All checkpoints are trained for 400 epochs without distillation.
  • mAP and speed are evaluated on the COCO val2017 dataset; the input resolution is the Size listed in the table.
  • Speed is tested on MNN 2.3.0 (AArch64) with 2 threads and arm82 acceleration, using 10 warm-up runs followed by 100 timed runs.
  • Qualcomm 888 (sm8350), Dimensity 720 (mt6853), and Qualcomm 660 (sdm660) represent high-end, mid-range, and low-end chips respectively, and serve as a reference for model performance on different hardware.
  • Refer to the Test NCNN Speed tutorial to reproduce the NCNN speed results of YOLOv6Lite.
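
The warm-up-then-measure protocol described in the notes above can be sketched in a few lines. This is only a generic illustration: `benchmark` and `dummy_infer` are hypothetical names, and the real measurements were taken with MNN on device, not with this script.

```python
import time
from statistics import mean

def benchmark(infer, warmup=10, cycles=100):
    """Return the mean latency of `infer()` in milliseconds."""
    # Warm-up runs are executed but not timed (caches, lazy initialization).
    for _ in range(warmup):
        infer()
    timings_ms = []
    for _ in range(cycles):
        start = time.perf_counter()
        infer()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return mean(timings_ms)

# Hypothetical stand-in for a real model call; replace with actual inference.
def dummy_infer():
    sum(i * i for i in range(1000))

print(f"mean latency: {benchmark(dummy_infer):.3f} ms")
```

Keeping the warm-up runs out of the timed loop avoids counting one-off costs such as memory allocation in the reported latency.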

Performance of YOLOv6_MBLA models

| Model | Size | mAP val 0.5:0.95 | Speed T4 trt fp16 b1 (fps) | Speed T4 trt fp16 b32 (fps) | Params (M) | FLOPs (G) |
| --- | --- | --- | --- | --- | --- | --- |
| YOLOv6-S-mbla | 640 | 47.0 (distill) | 300 | 424 | 11.6 | 29.8 |
| YOLOv6-M-mbla | 640 | 50.3 (distill) | 168 | 216 | 26.1 | 66.7 |
| YOLOv6-L-mbla | 640 | 52.0 (distill) | 129 | 154 | 46.3 | 118.2 |
| YOLOv6-X-mbla | 640 | 53.5 (distill) | 78 | 94 | 78.8 | 199.0 |

Table Notes

  • Speed is tested with TensorRT 8.4.2.4 on T4.
  • Model training, evaluation, and inference follow the same procedures as for the original models. For details, please refer to this README.

0.3.1

1 year ago

Features

  • Face detection and landmark localization
  • Repulsion loss
  • Same-channel Dehead

Performance on WIDERFACE

| Model | Size | Easy | Medium | Hard | Speed T4 trt fp16 b1 (fps) | Speed T4 trt fp16 b32 (fps) | Params (M) | FLOPs (G) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOv6-N | 640 | 95.0 | 92.4 | 80.4 | 797 | 1313 | 4.63 | 11.35 |
| YOLOv6-S | 640 | 96.2 | 94.7 | 85.1 | 339 | 484 | 12.41 | 32.45 |
| YOLOv6-M | 640 | 97.0 | 95.3 | 86.3 | 188 | 240 | 24.85 | 70.59 |
| YOLOv6-L | 640 | 97.2 | 95.9 | 87.5 | 102 | 121 | 56.77 | 159.24 |

  • All checkpoints are fine-tuned from the COCO-pretrained model for 300 epochs without distillation.
  • mAP and speed are evaluated on the WIDER FACE dataset at an input resolution of 640×640.
  • Speed is tested with TensorRT 8.2 on T4.
  • Refer to Test speed tutorial to reproduce the speed results of YOLOv6.
  • Params and FLOPs of YOLOv6 are estimated on deployed models.

0.3.0

1 year ago

v3.0 release

Features

Release P6 models and update P5 models

  • Renew the neck of the detector with a BiC module and SimCSPSPPF Block.
  • Propose an anchor-aided training (AAT) strategy.
  • Introduce a new self-distillation strategy for small YOLOv6 models.
  • Expand YOLOv6 and reach new state-of-the-art performance on the COCO dataset.
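
These notes do not spell out the self-distillation strategy. As a rough, generic illustration of what a distillation term looks like, the sketch below computes a temperature-softened KL divergence between teacher and student class scores and adds it to a detection loss. It is a textbook knowledge-distillation formulation, not YOLOv6's actual loss; `temperature`, `alpha`, and all function names are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Assumed generic form: the teacher's soft labels act as the target.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return kl_divergence(p_teacher, p_student)

def total_loss(det_loss, kd_loss, alpha):
    # alpha weights the distillation term; its value and schedule are assumptions.
    return det_loss + alpha * kd_loss

kd = distillation_loss([0.5, 2.0, 1.0], [0.2, 2.5, 0.8])
print(total_loss(det_loss=1.0, kd_loss=kd, alpha=0.5))
```

In self-distillation the teacher is a pretrained copy of the same architecture, so the student and teacher outputs have the same shape and can be compared directly.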

Performance

| Model | Size | mAP val 0.5:0.95 | Speed T4 trt fp16 b1 (fps) | Speed T4 trt fp16 b32 (fps) | Params (M) | FLOPs (G) |
| --- | --- | --- | --- | --- | --- | --- |
| YOLOv6-N | 640 | 37.5 | 779 | 1187 | 4.7 | 11.4 |
| YOLOv6-S | 640 | 45.0 | 339 | 484 | 18.5 | 45.3 |
| YOLOv6-M | 640 | 50.0 | 175 | 226 | 34.9 | 85.8 |
| YOLOv6-L | 640 | 52.8 | 98 | 116 | 59.6 | 150.7 |
| YOLOv6-N6 | 1280 | 44.9 | 228 | 281 | 10.4 | 49.8 |
| YOLOv6-S6 | 1280 | 50.3 | 98 | 108 | 41.4 | 198.0 |
| YOLOv6-M6 | 1280 | 55.2 | 47 | 55 | 79.6 | 379.5 |
| YOLOv6-L6 | 1280 | 57.2 | 26 | 29 | 140.4 | 673.4 |

Performance of base models

| Model | Size | mAP val 0.5:0.95 | Speed T4 TRT FP16 b1 (FPS) | Speed T4 TRT FP16 b32 (FPS) | Speed T4 TRT INT8 b1 (FPS) | Speed T4 TRT INT8 b32 (FPS) | Params (M) | FLOPs (G) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOv6-N-base | 640 | 36.6 | 727 | 1302 | 814 | 1805 | 4.65 | 11.46 |
| YOLOv6-S-base | 640 | 45.3 | 346 | 525 | 487 | 908 | 13.14 | 30.6 |
| YOLOv6-M-base | 640 | 49.4 | 179 | 245 | 284 | 439 | 28.33 | 72.30 |
| YOLOv6-L-base | 640 | 51.1 | 116 | 157 | 196 | 288 | 59.61 | 150.89 |
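
To make the FP16 and INT8 columns easier to compare, the following sketch computes the INT8-over-FP16 throughput ratio for each base model. The FPS values are copied from the table above; the dictionary layout and the helper name `int8_speedup` are illustrative choices only.

```python
# (FP16 b1, FP16 b32, INT8 b1, INT8 b32) throughput in FPS, copied from the table above.
base_fps = {
    "YOLOv6-N-base": (727, 1302, 814, 1805),
    "YOLOv6-S-base": (346, 525, 487, 908),
    "YOLOv6-M-base": (179, 245, 284, 439),
    "YOLOv6-L-base": (116, 157, 196, 288),
}

def int8_speedup(fps):
    """Return the INT8/FP16 throughput ratio at batch size 1 and batch size 32."""
    fp16_b1, fp16_b32, int8_b1, int8_b32 = fps
    return int8_b1 / fp16_b1, int8_b32 / fp16_b32

speedups = {model: int8_speedup(fps) for model, fps in base_fps.items()}
for model, (b1, b32) in speedups.items():
    print(f"{model}: INT8/FP16 speedup x{b1:.2f} (b1), x{b32:.2f} (b32)")
```

On these numbers the ratio ranges from about 1.1x to 1.8x, and it is larger at batch size 32 than at batch size 1 for every model.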

0.2.1

1 year ago

v2.1 release

Features

Release base models

  • Use only regular convolution and ReLU activation functions.

  • Apply CSP (1/2 channel dim) blocks in the network structure, except for Nano base model.

Advantages:

  • Adopt a unified network structure and configuration; the accuracy loss of the 8-bit PTQ model is negligible, about 0.4%.
  • Suitable for users who are just getting started or who need to apply, optimize and deploy an 8-bit quantization model quickly and frequently.
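
For readers new to 8-bit quantization, the sketch below shows the basic idea behind it: floats are mapped to small integers with a single scale factor, then mapped back with a small rounding error. It is a minimal symmetric per-tensor scheme written for illustration, not YOLOv6's actual PTQ pipeline; the function names and the example weights are made up.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [x * scale for x in q]

weights = [0.82, -1.27, 0.003, 0.45, -0.09]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

The round-trip error of each value is bounded by half the scale, which is why tensors with a narrow value range tend to quantize with little accuracy loss.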

Shortcoming:

  • The accuracy on COCO is slightly lower than the v2.0 released models.

Performance

| Model | Size | mAP val 0.5:0.95 | Speed T4 trt fp16 b1 (fps) | Speed T4 trt fp16 b32 (fps) | Params (M) | FLOPs (G) |
| --- | --- | --- | --- | --- | --- | --- |
| YOLOv6-N-base | 640 | 35.6 (400e) | 832 | 1249 | 4.3 | 11.1 |
| YOLOv6-S-base | 640 | 43.8 (400e) | 373 | 531 | 11.5 | 27.6 |
| YOLOv6-M-base | 640 | 48.8 (distill) | 179 | 246 | 27.7 | 68.4 |
| YOLOv6-L-base | 640 | 51.0 (distill) | 115 | 153 | 58.5 | 144.0 |

0.2.0

1 year ago

v2.0 release

YOLOv6 provides a series of models for various industrial scenarios, including nano/tiny/s/m/l, whose architectures vary with model size for a better accuracy-speed trade-off. Bag-of-freebies methods, such as self-distillation and more training epochs, are introduced to further improve performance. For industrial deployment, we adopt QAT with channel-wise distillation and graph optimization to pursue extreme performance.

New Features

  • Release M/L models and update N/T/S models with enhanced performance. ⭐️ Benchmark
  • 2x faster training.
  • Fix the performance degradation when evaluating on 640x640 inputs.
  • Customized quantization methods. 🚀 Quantization Tutorial

0.1.0

1 year ago

v1.0 release

Features

YOLOv6 is a single-stage object detection framework dedicated to industrial applications, with an efficient, hardware-friendly design and high performance, outperforming YOLOv5, YOLOX and PP-YOLOE.

YOLOv6-nano achieves 35.0 mAP on the COCO val2017 dataset at 1242 FPS on T4 using TensorRT FP16 with batch size 32, and YOLOv6-s achieves 43.1 mAP at 520 FPS under the same settings.
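
The throughput figures quoted above can be turned into latencies with simple arithmetic. The sketch below assumes that FPS counts images per second (not batches per second), which the notes do not state explicitly; `latency_from_fps` is a hypothetical helper name.

```python
def latency_from_fps(fps, batch_size):
    """Convert throughput (images/s) to per-image and per-batch latency in ms."""
    per_image_ms = 1000.0 / fps
    per_batch_ms = per_image_ms * batch_size
    return per_image_ms, per_batch_ms

# FPS values quoted in the release notes for batch size 32 on T4.
for name, fps in [("YOLOv6-nano", 1242), ("YOLOv6-s", 520)]:
    per_image, per_batch = latency_from_fps(fps, batch_size=32)
    print(f"{name}: {per_image:.3f} ms per image, {per_batch:.2f} ms per batch of 32")
```

Under that assumption, batched throughput says little about the delay of a single request: each batch of 32 still takes tens of milliseconds to complete.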

  • Hardware-friendly Design for Backbone and Neck
  • Efficient Decoupled Head with SIoU Loss