YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Our new YOLOv5 v7.0 instance segmentation models are the fastest and most accurate in the world, beating all current SOTA benchmarks. We've made them super simple to train, validate and deploy. See full details in our Release Notes and visit our YOLOv5 Segmentation Colab Notebook for quickstart tutorials.
Our primary goal with this release is to introduce super simple YOLOv5 segmentation workflows just like our existing object detection models. The new v7.0 YOLOv5-seg models below are just a start, we will continue to improve these going forward together with our existing detection and classification models. We'd love your feedback and contributions on this effort!
This release incorporates 280 PRs from 41 contributors since our last release in August 2022.
python train.py --cache ram
will now scan available memory and compare it against predicted dataset RAM usage. This reduces risk when caching and should help improve adoption of the dataset caching feature, which can significantly speed up training. (https://github.com/ultralytics/yolov5/pull/10027 by @glenn-jocher)

We trained YOLOv5 segmentation models on COCO for 300 epochs at image size 640 using A100 GPUs. We exported all models to ONNX FP32 for CPU speed tests and to TensorRT FP16 for GPU speed tests. We ran all speed tests on Google Colab Pro notebooks for easy reproducibility.
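The check described above amounts to simple arithmetic; a minimal sketch, assuming images are cached as uint8 at a fixed size and using an illustrative 10% safety margin (the function names and margin are ours, not the repository's implementation):

```python
def predicted_cache_gb(num_images, height=640, width=640, channels=3, safety=1.1):
    """Estimate RAM (GB) needed to cache a dataset of uint8 images, with a safety margin."""
    return num_images * height * width * channels * safety / 1e9

def cache_is_safe(num_images, available_gb, **kwargs):
    """Only enable --cache ram when predicted usage fits in available memory."""
    return predicted_cache_gb(num_images, **kwargs) < available_gb

# COCO train2017 has ~118k images; at 640x640x3 uint8 this predicts ~160 GB
print(round(predicted_cache_gb(118_000), 1))  # 159.5
```

In practice the real check would query free system memory (e.g. via psutil) rather than take it as an argument; the point is only that caching is gated on predicted fit.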
| Model | size (pixels) | mAP<sup>box</sup> 50-95 | mAP<sup>mask</sup> 50-95 | Train time 300 epochs A100 (hours) | Speed ONNX CPU (ms) | Speed TRT A100 (ms) | params (M) | FLOPs @640 (B) |
|---|---|---|---|---|---|---|---|---|
| YOLOv5n-seg | 640 | 27.6 | 23.4 | 80:17 | 62.7 | 1.2 | 2.0 | 7.1 |
| YOLOv5s-seg | 640 | 37.6 | 31.7 | 88:16 | 173.3 | 1.4 | 7.6 | 26.4 |
| YOLOv5m-seg | 640 | 45.0 | 37.1 | 108:36 | 427.0 | 2.2 | 22.0 | 70.8 |
| YOLOv5l-seg | 640 | 49.0 | 39.9 | 66:43 (2x) | 857.4 | 2.9 | 47.9 | 147.7 |
| YOLOv5x-seg | 640 | 50.7 | 41.4 | 62:56 (3x) | 1579.2 | 4.5 | 88.8 | 265.7 |
All checkpoints were trained to 300 epochs with the SGD optimizer at lr0=0.01 and weight_decay=5e-5, image size 640, and all default settings.

Reproduce mAP: python segment/val.py --data coco.yaml --weights yolov5s-seg.pt
Reproduce speed: python segment/val.py --data coco.yaml --weights yolov5s-seg.pt --batch 1
Exports were done with export.py: python export.py --weights yolov5s-seg.pt --include engine --device 0 --half
YOLOv5 segmentation training supports auto-download of the COCO128-seg dataset with the --data coco128-seg.yaml argument, and manual download of the COCO-segments dataset with bash data/scripts/get_coco.sh --train --val --segments followed by python train.py --data coco.yaml.
# Single-GPU
python segment/train.py --model yolov5s-seg.pt --data coco128-seg.yaml --epochs 5 --img 640
# Multi-GPU DDP
python -m torch.distributed.run --nproc_per_node 4 --master_port 1 segment/train.py --model yolov5s-seg.pt --data coco128-seg.yaml --epochs 5 --img 640 --device 0,1,2,3
Validate YOLOv5s-seg mask mAP on the COCO dataset:
bash data/scripts/get_coco.sh --val --segments # download COCO val segments split (780MB, 5000 images)
python segment/val.py --weights yolov5s-seg.pt --data coco.yaml --img 640 # validate
Use pretrained YOLOv5m-seg to predict bus.jpg:
python segment/predict.py --weights yolov5m-seg.pt --source data/images/bus.jpg
model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5m-seg.pt') # load from PyTorch Hub (WARNING: inference not yet supported)
Export YOLOv5s-seg model to ONNX and TensorRT:
python export.py --weights yolov5s-seg.pt --include onnx engine --img 640 --device 0
This release incorporates many new features and bug fixes (271 PRs from 48 contributors) since our last release in October 2021. It adds TensorRT, Edge TPU and OpenVINO support, and provides retrained models at --batch-size 128
with new default one-cycle linear LR scheduler. YOLOv5 now officially supports 11 different formats, not just for export but for inference (both detect.py and PyTorch Hub), and validation to profile mAP and speed results after export.
| Format | export.py --include | Model |
|---|---|---|
| PyTorch | - | yolov5s.pt |
| TorchScript | torchscript | yolov5s.torchscript |
| ONNX | onnx | yolov5s.onnx |
| OpenVINO | openvino | yolov5s_openvino_model/ |
| TensorRT | engine | yolov5s.engine |
| CoreML | coreml | yolov5s.mlmodel |
| TensorFlow SavedModel | saved_model | yolov5s_saved_model/ |
| TensorFlow GraphDef | pb | yolov5s.pb |
| TensorFlow Lite | tflite | yolov5s.tflite |
| TensorFlow Edge TPU | edgetpu | yolov5s_edgetpu.tflite |
| TensorFlow.js | tfjs | yolov5s_web_model/ |
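Each format above has a distinctive file or directory suffix, which is how an inference script can select a backend automatically; a minimal sketch of that idea (this dictionary and `detect_format` are illustrative, not the repository's DetectMultiBackend logic):

```python
from pathlib import Path

# Map file or directory suffix to the export format it represents.
# Mirrors the formats table above; names are illustrative.
SUFFIX_TO_FORMAT = {
    ".pt": "PyTorch",
    ".torchscript": "TorchScript",
    ".onnx": "ONNX",
    "_openvino_model": "OpenVINO",
    ".engine": "TensorRT",
    ".mlmodel": "CoreML",
    "_saved_model": "TensorFlow SavedModel",
    ".pb": "TensorFlow GraphDef",
    ".tflite": "TensorFlow Lite",
    "_edgetpu.tflite": "TensorFlow Edge TPU",
    "_web_model": "TensorFlow.js",
}

def detect_format(weights: str) -> str:
    """Return the export format implied by a weights path."""
    name = Path(weights).name
    # Check longer, more specific suffixes first (_edgetpu.tflite before .tflite).
    for suffix in sorted(SUFFIX_TO_FORMAT, key=len, reverse=True):
        if name.endswith(suffix):
            return SUFFIX_TO_FORMAT[suffix]
    raise ValueError(f"unrecognized weights format: {weights}")

print(detect_format("yolov5s.onnx"))            # ONNX
print(detect_format("yolov5s_edgetpu.tflite"))  # TensorFlow Edge TPU
```

Ordering matters here: the Edge TPU suffix must be matched before the generic .tflite suffix, hence the longest-first scan.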
Usage examples (ONNX shown):
Export: python export.py --weights yolov5s.pt --include onnx
Detect: python detect.py --weights yolov5s.onnx
PyTorch Hub: model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.onnx')
Validate: python val.py --weights yolov5s.onnx
Visualize: https://netron.app
python export.py --include saved_model pb tflite tfjs
(https://github.com/ultralytics/yolov5/pull/5699 by @imyhxy)

New benchmarking utility: python utils/benchmarks.py --weights yolov5s.pt. Currently operates on CPU; future updates will implement GPU support. (https://github.com/ultralytics/yolov5/pull/6613 by @glenn-jocher)

Default lrf reduced from 0.2 to 0.1 (https://github.com/ultralytics/yolov5/pull/6525 by @glenn-jocher).

All model trainings logged to https://wandb.ai/glenn-jocher/YOLOv5_v61_official
python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n6.pt yolov5s6.pt yolov5m6.pt yolov5l6.pt yolov5x6.pt
Example YOLOv5l before and after metrics:
| YOLOv5l Large | size (pixels) | mAP<sup>val</sup> 0.5:0.95 | mAP<sup>val</sup> 0.5 | Speed CPU b1 (ms) | Speed V100 b1 (ms) | Speed V100 b32 (ms) | params (M) | FLOPs @640 (B) |
|---|---|---|---|---|---|---|---|---|
| v5.0 | 640 | 48.2 | 66.9 | 457.9 | 11.6 | 2.8 | 47.0 | 115.4 |
| v6.0 (previous) | 640 | 48.8 | 67.2 | 424.5 | 10.9 | 2.7 | 46.5 | 109.1 |
| v6.1 (this release) | 640 | 49.0 | 67.3 | 430.0 | 10.1 | 2.7 | 46.5 | 109.1 |
| Model | size (pixels) | mAP<sup>val</sup> 0.5:0.95 | mAP<sup>val</sup> 0.5 | Speed CPU b1 (ms) | Speed V100 b1 (ms) | Speed V100 b32 (ms) | params (M) | FLOPs @640 (B) |
|---|---|---|---|---|---|---|---|---|
| YOLOv5n | 640 | 28.0 | 45.7 | 45 | 6.3 | 0.6 | 1.9 | 4.5 |
| YOLOv5s | 640 | 37.4 | 56.8 | 98 | 6.4 | 0.9 | 7.2 | 16.5 |
| YOLOv5m | 640 | 45.4 | 64.1 | 224 | 8.2 | 1.7 | 21.2 | 49.0 |
| YOLOv5l | 640 | 49.0 | 67.3 | 430 | 10.1 | 2.7 | 46.5 | 109.1 |
| YOLOv5x | 640 | 50.7 | 68.9 | 766 | 12.1 | 4.8 | 86.7 | 205.7 |
| YOLOv5n6 | 1280 | 36.0 | 54.4 | 153 | 8.1 | 2.1 | 3.2 | 4.6 |
| YOLOv5s6 | 1280 | 44.8 | 63.7 | 385 | 8.2 | 3.6 | 12.6 | 16.8 |
| YOLOv5m6 | 1280 | 51.3 | 69.3 | 887 | 11.1 | 6.8 | 35.7 | 50.0 |
| YOLOv5l6 | 1280 | 53.7 | 71.3 | 1784 | 15.8 | 10.5 | 76.8 | 111.4 |
| YOLOv5x6 + TTA | 1280<br>1536 | 55.0<br>55.8 | 72.7<br>72.7 | 3136<br>- | 26.2<br>- | 19.4<br>- | 140.7<br>- | 209.8<br>- |
python val.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65
python val.py --data coco.yaml --img 640 --task speed --batch 1
python val.py --data coco.yaml --img 1536 --iou 0.7 --augment
Changes between previous release and this release: https://github.com/ultralytics/yolov5/compare/v6.0...v6.1 Changes since this release: https://github.com/ultralytics/yolov5/compare/v6.1...HEAD
- `tf` conversion in new v6 models by @YoniChechik in https://github.com/ultralytics/yolov5/pull/5153
- `'onnxruntime-gpu' if torch.has_cuda` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5087
- `LoadImagesAndLabels()` dataloader by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5172
- `''` and `""` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5192
- `on_fit_epoch_end` callback by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5232
- `EarlyStopping()` message by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5303
- `autobatch` feature for best `batch-size` estimation by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5092
- `AutoShape.forward()` `model.classes` example by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5324
- `nl` fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5332
- `MixConv2d()` remove shortcut + apply depthwise by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5410
- `indexing='ij'` for PyTorch 1.10 by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5309
- `get_loggers()` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/4854
- `check_git_status()` to run under `ROOT` working directory by @MrinalJain17 in https://github.com/ultralytics/yolov5/pull/5441
- `LoadImages()` dataloader return values by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5455
- `check_requirements(('tensorflow>=2.4.1',))` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5476
- `LOGGER` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5483
- `increment_path()` with multiple-suffix filenames by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5518
- `is_coco` logic between train.py and val.py by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5521
- `increment_path()` explicit file vs dir handling by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5523
- `check_file()` avoid repeat URL downloads by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5526
- `models/hub/*.yaml` files for v6.0n release by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5540
- `intersect_dicts()` in hubconf.py fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5542
- `save_one_box()` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5545
- `--conf-thres` >> 0.001 warning by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5567
- `LOGGER` consolidation by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5569
- `DetectMultiBackend()` class by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5549
- `notebook_init()` to `utils/__init__.py` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5488
- `check_requirements()` resource warning allocation open file by @ayman-saleh in https://github.com/ultralytics/yolov5/pull/5602
- `tqdm` to fixed width by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5367
- `speed` and `study` tasks by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5608
- `np.unique()` sort fix for segments by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5609
- `WORLD_SIZE`-safe dataloader workers by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5631
- `shuffle=True` for training by @werner-duvaud in https://github.com/ultralytics/yolov5/pull/5623
- `LOGGER` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5635
- `transpose()` with 1 `permute` in `TransformerBlock()` by @dingyiwei in https://github.com/ultralytics/yolov5/pull/5645
- `NUM_THREADS` leave at least 1 CPU free by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5706
- `.autoshape()` method by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5694
- `--visualize` by @Zengyf-CVer in https://github.com/ultralytics/yolov5/pull/5701
- `torch==1.7.0` Path support by @miknyko in https://github.com/ultralytics/yolov5/pull/5781
- `DetectMultiBackend()` by @phodgers in https://github.com/ultralytics/yolov5/pull/5792
- `model.warmup()` method by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5810
- `dataset_stats()` to `cv2.INTER_AREA` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5821
- `wandb.errors.UsageError` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5839
- `imgs` in `LoadStreams` by @passerbythesun in https://github.com/ultralytics/yolov5/pull/5850
- `LoadImages` `ret_val=False` handling by @gmt710 in https://github.com/ultralytics/yolov5/pull/5852
- `*.torchscript` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5856
- `--workers 8` argument to val.py by @iumyx2612 in https://github.com/ultralytics/yolov5/pull/5857
- `plot_lr_scheduler()` by @daikankan in https://github.com/ultralytics/yolov5/pull/5864
- `nl` after `cutout()` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5873
- `AutoShape()` models as `DetectMultiBackend()` instances by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5845
- `Detections().tolist()` explicit argument fix by @lizeng614 in https://github.com/ultralytics/yolov5/pull/5907
- `notebook_init()` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5919
- `plot_lr_scheduler()` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5920
- `autocast(False)` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5926
- `select_device()` robust to `batch_size=-1` by @youyuxiansen in https://github.com/ultralytics/yolov5/pull/5940
- `strip_optimizer()` by @iumyx2612 in https://github.com/ultralytics/yolov5/pull/5949
- `NUM_THREADS` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5954
- `tolist()` method by @yonomitt in https://github.com/ultralytics/yolov5/pull/5945
- `imgsz` bug by @d57montes in https://github.com/ultralytics/yolov5/pull/5948
- `pretrained=False` fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5966
- `__init__()` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5979
- `ar_thr` from 20 to 100 for better detection on slender (high aspect ratio) objects by @MrinalJain17 in https://github.com/ultralytics/yolov5/pull/5556
- `--weights URL` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5991
- `jar xf file.zip` for zips by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/5993
- `self.jit` fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6007
- `--freeze` argument by @youyuxiansen in https://github.com/ultralytics/yolov5/pull/6019
- `LOGGER` fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6041
- `set_logging()` indexing by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6042
- `--freeze` fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6044
- `if: else` statements by @cmoseses in https://github.com/ultralytics/yolov5/pull/6087
- `max_wh=7680` for 8k images by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6178
- `*_openvino_model/` dir by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6180
- `anchor_grid` compatibility fix by @imyhxy in https://github.com/ultralytics/yolov5/pull/6185
- `tensorrt>=7.0.0` checks by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6193
- `nan`-robust stream FPS by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6198
- `--int8` 'flatbuffers==1.12' fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6216
- `--int8` 'flatbuffers==1.12' fix 2 by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6217
- `edgetpu_compiler` checks by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6218
- `edgetpu-compiler` autoinstall by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6223
- `models/hub` variants by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6230
- `cmd` string on `tfjs` export by @dart-bird in https://github.com/ultralytics/yolov5/pull/6243
- `--half` FP16 inference by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6268
- `is_kaggle()` function by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6285
- `device` count check by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6290
- `select_device()` cleanup by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6302
- `train.py` parameter groups desc error by @Otfot in https://github.com/ultralytics/yolov5/pull/6318
- `dataset_stats()` autodownload capability by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6303
- `assert im.device.type != 'cpu'` on export by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6340
- `export.py` return exported files/dirs by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6343
- `export.py` automatic `forward_export` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6352
- `VERBOSE` environment variable by @johnk2hawaii in https://github.com/ultralytics/yolov5/pull/6353
- `de_parallel()` rather than `is_parallel()` by @imyhxy in https://github.com/ultralytics/yolov5/pull/6354
- `DEVICE_COUNT` instead of `WORLD_SIZE` to calculate `nw` by @sitecao in https://github.com/ultralytics/yolov5/pull/6324
- `--evolve` by @AyushExel in https://github.com/ultralytics/yolov5/pull/6374
- `albumentations` to Dockerfile by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6392
- `stop_training=False` flag to callbacks by @haimat in https://github.com/ultralytics/yolov5/pull/6365
- `detect.py` GIF video inference by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6410
- `greetings.yaml` email address by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6412
- `tflite_runtime` for TFLite inference if installed by @motokimura in https://github.com/ultralytics/yolov5/pull/6406
- `VERBOSE` env variable to `YOLOv5_VERBOSE` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6428
- `*.asf` video support by @toschi23 in https://github.com/ultralytics/yolov5/pull/6436
- `dataset_stats()` autodownload capability by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6442
- `select_device()` for Multi-GPU by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6434
- `select_device()` for Multi-GPU by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6461
- `export.py` usage examples by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6495
- `list()` -> `sorted()` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6496
- `torch.jit.TracerWarning` on export by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6498
- `export.run()` `TracerWarning` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6499
- `batch_size` on resuming by @AyushExel in https://github.com/ultralytics/yolov5/pull/6512
- `lrf: 0.1` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6525
- `sudo` fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6531
- `tf.lite.experimental.load_delegate` fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6536
- `if any(f):` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6569
- `plot_labels()` colored histogram bug by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6574
- `--evolve` project names by @MattVAD in https://github.com/ultralytics/yolov5/pull/6567
- `DATASETS_DIR` global in general.py by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6578
- `opt` from `train.run()` by @chf4850 in https://github.com/ultralytics/yolov5/pull/6581
- `pafy` package by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6603
- `hyp_evolve.yaml` indexing bug by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6604
- `ROOT / data` when running W&B `log_dataset()` by @or-toledano in https://github.com/ultralytics/yolov5/pull/6606
- `youtube_dl==2020.12.2` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6612
- `vmin=0.0` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6638
- `KeyError` by @imyhxy in https://github.com/ultralytics/yolov5/pull/6637
- `--workers` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6658
- `--workers` single-GPU/CPU fix by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6659
- `--cache val` option by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6663
- `scipy.cluster.vq.kmeans` too few points by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6668
- `torch==1.10.2+cu113` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6669
- `--evolve --bucket gs://...` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6698
- `export_formats()` in export.py by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6705
- `torch` AMP-CPU warnings by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6706
- `nw` to `max(nd, 1)` by @glenn-jocher in https://github.com/ultralytics/yolov5/pull/6714
Full Changelog: https://github.com/ultralytics/yolov5/compare/v6.0...v6.1
This release implements YOLOv5-P6 models and retrained YOLOv5-P5 models. All model sizes YOLOv5s/m/l/x are now available in both P5 and P6 architectures:
- P5 models (trained at --img 640): yolov5s.pt, yolov5m.pt, yolov5l.pt, yolov5x.pt (e.g. python detect.py --weights yolov5s.pt)
- P6 models (trained at --img 1280): yolov5s6.pt, yolov5m6.pt, yolov5l6.pt, yolov5x6.pt (e.g. python detect.py --weights yolov5s6.pt)
Example usage:
# Command Line
python detect.py --weights yolov5m.pt --img 640 # P5 model at 640
python detect.py --weights yolov5m6.pt --img 640 # P6 model at 640
python detect.py --weights yolov5m6.pt --img 1280 # P6 model at 1280
# PyTorch Hub
model = torch.hub.load('ultralytics/yolov5', 'yolov5m6') # P6 model
results = model(imgs, size=1280) # inference at 1280
YouTube sources are now supported: python detect.py --source 'https://youtu.be/NUsoVlDFqZg'. Live streaming videos and normal videos supported. (https://github.com/ultralytics/yolov5/pull/2752)

P6 models include an extra P6/64 output layer for detection of larger objects, and benefit the most from training at higher resolution. For this reason we trained all P5 models at 640, and all P6 models at 1280.
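The stride/grid relationship behind the P3-P6 heads makes the resolution argument concrete; strides of 8/16/32/64 are standard for these heads, and the helper below is an illustrative sketch:

```python
def grid_sizes(img_size, strides=(8, 16, 32, 64)):
    """Detection grid size (cells per side) for each head at a given image size."""
    return {f"P{3 + i}/{s}": img_size // s for i, s in enumerate(strides)}

# P5 models use the P3/8-P5/32 heads; P6 models add the coarser P6/64 head,
# which only gets a useful number of cells at larger training resolutions.
print(grid_sizes(640))   # the P6/64 head would see just a 10x10 grid at 640
print(grid_sizes(1280))  # at 1280 the P6/64 head gets a 20x20 grid
```

This is why the P6/64 layer pays off mainly at 1280: at 640 it has too few cells to localize anything but the very largest objects.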
python test.py --task study --data coco.yaml --iou 0.7 --weights yolov5s6.pt yolov5m6.pt yolov5l6.pt yolov5x6.pt
| Model | size (pixels) | mAP<sup>val</sup> 0.5:0.95 | mAP<sup>test</sup> 0.5:0.95 | mAP<sup>val</sup> 0.5 | Speed V100 (ms) | params (M) | FLOPS 640 (B) |
|---|---|---|---|---|---|---|---|
| YOLOv5s | 640 | 36.7 | 36.7 | 55.4 | 2.0 | 7.3 | 17.0 |
| YOLOv5m | 640 | 44.5 | 44.5 | 63.1 | 2.7 | 21.4 | 51.3 |
| YOLOv5l | 640 | 48.2 | 48.2 | 66.9 | 3.8 | 47.0 | 115.4 |
| YOLOv5x | 640 | 50.4 | 50.4 | 68.8 | 6.1 | 87.7 | 218.8 |
| YOLOv5s6 | 1280 | 43.3 | 43.3 | 61.9 | 4.3 | 12.7 | 17.4 |
| YOLOv5m6 | 1280 | 50.5 | 50.5 | 68.7 | 8.4 | 35.9 | 52.4 |
| YOLOv5l6 | 1280 | 53.4 | 53.4 | 71.1 | 12.3 | 77.2 | 117.7 |
| YOLOv5x6 | 1280 | 54.4 | 54.4 | 72.0 | 22.4 | 141.8 | 222.9 |
| YOLOv5x6 TTA | 1280 | 55.0 | 55.0 | 72.0 | 70.8 | - | - |
python test.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65
python test.py --data coco.yaml --img 640 --conf 0.25 --iou 0.45
python test.py --data coco.yaml --img 1536 --iou 0.7 --augment
Changes between previous release and this release: https://github.com/ultralytics/yolov5/compare/v4.0...v5.0 Changes since this release: https://github.com/ultralytics/yolov5/compare/v5.0...HEAD
Click a section below to expand details:
- `row major` to be compatible with tensorflow `SpaceToDepth` #413
- `cublasCreate(handle)` #2417
- `torch.nn.modules.module.ModuleAttributeError: 'Hardswish' object has no attribute 'inplace'` #1327
This release implements two architecture changes to YOLOv5, as well as various bug fixes and performance improvements.
Latest models are all slightly smaller due to the removal of one convolution within each bottleneck. The bottlenecks have been renamed C3() modules, in light of the 3 I/O convolutions each one does vs the 4 in the standard CSP bottleneck. The previous manual concatenation and LeakyReLU(0.1) activations have both been removed, simplifying the architecture, reducing parameter count, and better exploiting the .fuse() operation at inference time.
nn.SiLU() activations replace nn.LeakyReLU(0.1) and nn.Hardswish() activations throughout the model, simplifying the architecture as we now only have one single activation function used everywhere rather than the two types before.
Overall, the changes result in smaller models (89.0M params -> 87.7M for YOLOv5x), faster inference times (6.9ms -> 6.0ms), and improved mAP (49.2 -> 50.1) for all models except YOLOv5s, whose mAP reduced slightly (37.0 -> 36.8). In general the largest models benefit the most from this update. YOLOv5x in particular is now above 50.0 mAP at --img-size 640, which may be the first time this is possible at 640 resolution for any architecture I'm aware of (correct me if I'm wrong though).
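The parameter savings from removing one convolution per bottleneck can be sanity-checked with standard conv-layer arithmetic; the 256-channel width and the helper below are illustrative, not taken from a specific YOLOv5 layer:

```python
def conv_params(c_in, c_out, k=1, bias=False, bn=True):
    """Parameter count of a Conv2d(+BatchNorm) block: k*k weights per in/out channel pair,
    optional conv bias, plus BatchNorm scale and shift (2 per output channel)."""
    p = c_in * c_out * k * k
    if bias:
        p += c_out
    if bn:
        p += 2 * c_out
    return p

# A standard CSP bottleneck with 4 I/O 1x1 convolutions vs a C3 module with 3,
# at an illustrative 256-channel width:
c = 256
csp = 4 * conv_params(c, c, k=1)
c3 = 3 * conv_params(c, c, k=1)
print(csp - c3)  # parameters saved per module by dropping one 1x1 conv
```

Multiplied across the many bottlenecks in a large model, savings of this order are consistent with the ~1.3M-parameter reduction quoted for YOLOv5x.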
** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from google/automl at batch size 8.
| Model | size | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed V100 | FPS V100 | params | GFLOPS |
|---|---|---|---|---|---|---|---|---|
| YOLOv5s | 640 | 36.8 | 36.8 | 55.6 | 2.2ms | 455 | 7.3M | 17.0 |
| YOLOv5m | 640 | 44.5 | 44.5 | 63.1 | 2.9ms | 345 | 21.4M | 51.3 |
| YOLOv5l | 640 | 48.1 | 48.1 | 66.4 | 3.8ms | 264 | 47.0M | 115.4 |
| YOLOv5x | 640 | 50.1 | 50.1 | 68.7 | 6.0ms | 167 | 87.7M | 218.8 |
| YOLOv5x + TTA | 832 | 51.9 | 51.9 | 69.6 | 24.9ms | 40 | 87.7M | 1005.3 |
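The Speed and FPS columns in the table above are reciprocals of each other; a quick conversion sketch (the helper name is ours):

```python
def fps_from_latency(ms_per_image):
    """Convert per-image latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image

# e.g. YOLOv5s at 2.2 ms/img and YOLOv5x at 6.0 ms/img in the table above:
print(round(fps_from_latency(2.2)))  # 455
print(round(fps_from_latency(6.0)))  # 167
```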
This release aggregates various minor bug fixes and performance improvements since the main v3.0 release and incorporates PyTorch 1.7.0 compatibility updates. v3.1 models share weights with v3.0 models but contain minor module updates (inplace
fields for nn.Hardswish() activations) for native PyTorch 1.7.0 compatibility. For PyTorch 1.7.0 release updates see https://github.com/pytorch/pytorch/releases/tag/v1.7.0.
torch>=1.6.0 required, torch>=1.7.0 recommended (https://github.com/ultralytics/yolov5/pull/1233)

This release includes an nn.Hardswish() activation implementation on Conv() modules, which increases mAP for all models at the expense of about 10% in inference speed. Training speeds are not significantly affected, though CUDA memory requirements increase about 10%. Training from scratch as well as finetuning both benefit from this change. The smallest models benefit the most from the Hardswish() activations, with increases of +0.9/+0.8/+0.7/… mAP@0.5:0.95 for YOLOv5s/m/l/x.
All mAP values in our README are now reported at --img-size 640 (v2.0 reported at 672, and v1.0 reported at 736), so we've succeeded in increasing mAP while reducing the required --img-size :)
We've also listed YOLOv5x Test Time Augmentation (TTA) mAP and speeds for v3.0 in our README table for the first time (and for v2.0 below). Best results are YOLOv5x with TTA at 50.8 mAP@0.5:0.95. We've also updated EfficientDet results in our comparison plot to reflect recent improvements in the google/automl repo.
- PyTorch 1.6 compatible. torch>=1.6 required (43a616a9551cd53f031c05688884927ba0c13513)
- PyTorch 1.6 native Automatic Mixed Precision (AMP) replaces NVIDIA Apex AMP (https://github.com/ultralytics/yolov5/pull/573)
- nn.Hardswish() activations replace nn.LeakyReLU(0.1) in base convolution module models.Conv()
- Dataset Autodownload feature added (https://github.com/ultralytics/yolov5/pull/685)
- Model Autodownload improved (https://github.com/ultralytics/yolov5/pull/711)
- Layer freezing code added (https://github.com/ultralytics/yolov5/issues/679)
- TensorRT export tutorial added (https://github.com/ultralytics/yolov5/pull/623)
** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from google/automl at batch size 8.
- August 13, 2020: v3.0 release: nn.Hardswish() activations, data autodownload, native AMP.
- July 23, 2020: v2.0 release: improved model definition, training and mAP.
- June 22, 2020: PANet updates: new heads, reduced parameters, improved speed and mAP 364fcfd.
- June 19, 2020: FP16 as new default for smaller checkpoints and faster inference d4c6674.
- June 9, 2020: CSP updates: improved speed, size, and accuracy (credit to @WongKinYiu for CSP).
- May 27, 2020: Public release. YOLOv5 models are SOTA among all known YOLO implementations.
- April 1, 2020: Start development of future compound-scaled YOLOv3/YOLOv4-based PyTorch models.
| Model | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed GPU | FPS GPU | params | FLOPS |
|---|---|---|---|---|---|---|---|
| YOLOv5s | 37.0 | 37.0 | 56.2 | 2.4ms | 476 | 7.5M | 13.2B |
| YOLOv5m | 44.3 | 44.3 | 63.2 | 3.4ms | 333 | 21.8M | 39.4B |
| YOLOv5l | 47.7 | 47.7 | 66.5 | 4.4ms | 256 | 47.8M | 88.1B |
| YOLOv5x | 49.2 | 49.2 | 67.7 | 6.9ms | 164 | 89.0M | 166.4B |
| YOLOv5x + TTA | 50.8 | 50.8 | 68.9 | 25.5ms | 39 | 89.0M | 354.3B |
| YOLOv3-SPP | 45.6 | 45.5 | 65.2 | 4.5ms | 222 | 63.0M | 118.0B |
** APtest denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation except for TTA. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.001. Test Time Augmentation (TTA) runs at 3 image sizes. Reproduce TTA results by python test.py --data coco.yaml --img 832 --augment
** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
| Model | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed GPU | FPS GPU | params | FLOPS |
|---|---|---|---|---|---|---|---|
| YOLOv5s | 36.1 | 36.1 | 55.3 | 2.2ms | 476 | 7.5M | 13.2B |
| YOLOv5m | 43.5 | 43.5 | 62.5 | 3.2ms | 333 | 21.8M | 39.4B |
| YOLOv5l | 47.0 | 47.1 | 65.6 | 4.1ms | 256 | 47.8M | 88.1B |
| YOLOv5x | 49.0 | 49.0 | 67.4 | 6.4ms | 164 | 89.0M | 166.4B |
| YOLOv5x + TTA | 50.4 | 50.4 | 68.5 | 23.4ms | 43 | 89.0M | 354.3B |
| YOLOv3-SPP | 45.6 | 45.5 | 65.2 | 4.5ms | 222 | 63.0M | 118.0B |
** APtest denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. Reproduce by python test.py --data coco.yaml --img 672 --conf 0.001
** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
IMPORTANT: v2.0 release contains breaking changes. Models trained with earlier versions will not operate correctly with v2.0. The last commit before v2.0 that operates correctly with all earlier pretrained models is: https://github.com/ultralytics/yolov5/tree/5e970d45c44fff11d1eb29bfc21bed9553abf986
To clone last commit prior to v2.0:
git clone https://github.com/ultralytics/yolov5 # clone repo
cd yolov5
git reset --hard 5e970d4 # last commit before v2.0
| Model | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed GPU | FPS GPU | params | FLOPS |
|---|---|---|---|---|---|---|---|
| YOLOv5s | 36.1 | 36.1 | 55.3 | 2.1ms | 476 | 7.5M | 13.2B |
| YOLOv5m | 43.5 | 43.5 | 62.5 | 3.0ms | 333 | 21.8M | 39.4B |
| YOLOv5l | 47.0 | 47.1 | 65.6 | 3.9ms | 256 | 47.8M | 88.1B |
| YOLOv5x | 49.0 | 49.0 | 67.4 | 6.1ms | 164 | 89.0M | 166.4B |
| YOLOv3-SPP | 45.6 | 45.5 | 65.2 | 4.5ms | 222 | 63.0M | 118.0B |
** APtest denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. Reproduce by python test.py --data coco.yaml --img 672 --conf 0.001
** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
| Model | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed GPU | FPS GPU | params | FLOPS |
|---|---|---|---|---|---|---|---|
| YOLOv5s | 36.6 | 36.6 | 55.8 | 2.1ms | 476 | 7.5M | 13.2B |
| YOLOv5m | 43.4 | 43.4 | 62.4 | 3.0ms | 333 | 21.8M | 39.4B |
| YOLOv5l | 46.6 | 46.7 | 65.4 | 3.9ms | 256 | 47.8M | 88.1B |
| YOLOv5x | 48.4 | 48.4 | 66.9 | 6.1ms | 164 | 89.0M | 166.4B |
| YOLOv3-SPP | 45.6 | 45.5 | 65.2 | 4.5ms | 222 | 63.0M | 118.0B |
** APtest denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. Reproduce by python test.py --img 736 --conf 0.001
** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).