MobileNetV3

An implementation of MobileNetV3 with PyTorch

Theory

You can find the MobileNetV3 paper at Searching for MobileNetV3.

Prepare data

  • CIFAR-10
  • CIFAR-100
  • SVHN
  • Tiny-ImageNet
  • ImageNet: please move the validation images into labeled subfolders; you can use the script here (see the layout sketch after this list).
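
Once the validation images sit in labeled subfolders, a standard torchvision ImageFolder can read them. A minimal sketch (the path and transform values here are illustrative, not this repo's exact pipeline):

import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Expects a layout like val/n01440764/xxx.JPEG, val/n01443537/yyy.JPEG, ...
val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
val_set = datasets.ImageFolder('path/to/imagenet/val', transform=val_transform)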

Train

  • Train from scratch:
CUDA_VISIBLE_DEVICES=3 python train.py --batch-size=128 --mode=small \
--print-freq=100 --dataset=CIFAR100 --ema-decay=0 --label-smoothing=0.1 \
--lr=0.3 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
--warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200 --width-multiplier=1 \
-nbd -zero-gamma -mixup

where the parameters have the following meanings:

  • batch-size: training batch size.
  • mode: use MobileNetV3-Small (if set to small) or MobileNetV3-Large (if set to large).
  • dataset: which dataset to use (CIFAR10, CIFAR100, SVHN, TinyImageNet or ImageNet).
  • ema-decay: decay rate of the weight EMA; if set to 0, EMA is not used (sketch below).
  • label-smoothing: the $\epsilon$ used in label smoothing; if set to 0, label smoothing is not used (sketch below).
  • lr-decay: learning rate decay schedule, step or cos.
  • lr-min: minimum lr in cos lr decay.
  • warmup-epochs: warmup epochs used in cos lr decay (sketch below).
  • num-epochs: total training epochs.
  • nbd: no bias decay (sketch below).
  • zero-gamma: zero the $\gamma$ of the last BN in each block (sketch below).
  • mixup: use Mixup (sketch below).
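
For illustration, a minimal sketch of the cosine decay with linear warmup that --lr-decay=cos, --lr-min and --warmup-epochs describe (an approximation, not this repo's exact code):

import math

def get_lr(epoch, num_epochs, base_lr, lr_min=0.0, warmup_epochs=5):
    # Linear warmup from 0 up to base_lr over the first warmup_epochs epochs.
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr down to lr_min over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, num_epochs - warmup_epochs)
    return lr_min + 0.5 * (base_lr - lr_min) * (1 + math.cos(math.pi * progress))

# e.g. --lr=0.3 --num-epochs=200 --warmup-epochs=5 --lr-min=0
# for group in optimizer.param_groups: group['lr'] = get_lr(epoch, 200, 0.3)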
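
Label smoothing with $\epsilon$ can be sketched as a soft-target cross-entropy (again, an illustration of the technique rather than this repo's code):

import torch.nn.functional as F

def label_smoothing_loss(logits, target, epsilon=0.1):
    # Weight (1 - epsilon) on the true class, epsilon spread uniformly over all classes.
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(dim=-1, index=target.unsqueeze(1)).squeeze(1)
    uniform = -log_probs.mean(dim=-1)
    return ((1.0 - epsilon) * nll + epsilon * uniform).mean()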
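
The -nbd and -zero-gamma tricks come from the "bag of tricks" training literature; a sketch of both (the block attribute name is hypothetical, since the real block structure is defined in this repo's model code):

import torch.nn as nn

def make_param_groups(model, weight_decay=6e-5):
    # No bias decay: apply weight decay only to conv/linear weights,
    # not to biases or BatchNorm parameters (which are 1-D).
    decay, no_decay = [], []
    for name, p in model.named_parameters():
        (no_decay if p.dim() == 1 or name.endswith('.bias') else decay).append(p)
    return [{'params': decay, 'weight_decay': weight_decay},
            {'params': no_decay, 'weight_decay': 0.0}]

# Zero-gamma: initialize the scale of the last BN in each block to zero so the
# block starts out close to identity. 'last_bn' is a hypothetical attribute name:
# nn.init.zeros_(block.last_bn.weight)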
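
And a sketch of Mixup, which trains on convex combinations of example pairs:

import numpy as np
import torch

def mixup(x, y, alpha=0.2):
    # lam ~ Beta(alpha, alpha); mix each example with a randomly permuted partner.
    lam = float(np.random.beta(alpha, alpha))
    index = torch.randperm(x.size(0), device=x.device)
    return lam * x + (1 - lam) * x[index], y, y[index], lam

# The loss becomes lam * criterion(out, y_a) + (1 - lam) * criterion(out, y_b).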
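
Finally, a sketch of the weight EMA controlled by --ema-decay (the EMA copy is typically the one evaluated):

import torch

def ema_update(ema_model, model, decay=0.999):
    # ema = decay * ema + (1 - decay) * current, applied after each optimizer step.
    with torch.no_grad():
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)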

Pretrained models

We have provided the pretrained MobileNetV3-Small model in the pretrained folder.
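
A minimal loading sketch (the checkpoint file name and model constructor below are assumptions; check this repo for the exact names):

import torch

state = torch.load('pretrained/mobilenetv3-small.pth', map_location='cpu')  # hypothetical file name
# model = MobileNetV3(mode='small')   # hypothetical constructor from this repo's model code
# model.load_state_dict(state)
# model.eval()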

Experiments

Training setting

on ImageNet

CUDA_VISIBLE_DEVICES=5 python train.py --batch-size=128 --mode=small --print-freq=2000 --dataset=imagenet \
--ema-decay=0.99 --label-smoothing=0.1 --lr=0.1 --save-epoch-freq=50 --lr-decay=cos --lr-min=0 --warmup-epochs=5 \
--weight-decay=1e-5 --num-epochs=250 --num-workers=2 --width-multiplier=1 -dali -nbd -mixup -zero-gamma -save

on CIFAR-10

CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR10 \
  --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1

on CIFAR-100

CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR100 \
  --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1

 Using more tricks:

CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR100 \
  --ema-decay=0.999 --label-smoothing=0.1 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1 \
  -zero-gamma -nbd -mixup

on SVHN

CUDA_VISIBLE_DEVICES=3 python train.py --batch-size=128 --mode=small --print-freq=1000 --dataset=SVHN \
  --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=20 --num-workers=2 --width-multiplier=1

on Tiny-ImageNet

CUDA_VISIBLE_DEVICES=7 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=tinyimagenet \
  --data-dir=/media/data2/chenjiarong/ImageData/tiny-imagenet --ema-decay=0 --label-smoothing=0 --lr=0.15 \
  --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200 \
  --num-workers=2 --width-multiplier=1 -dali

 Using more tricks:

CUDA_VISIBLE_DEVICES=7 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=tinyimagenet \
  --data-dir=/media/data2/chenjiarong/ImageData/tiny-imagenet --ema-decay=0.999 --label-smoothing=0.1 --lr=0.15 \
  --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200 \
  --num-workers=2 --width-multiplier=1 -dali -nbd -mixup

MobileNetV3-Large

on ImageNet

Model          Madds     Parameters   Top1-acc   Top5-acc
Official 1.0   219 M     5.4 M        75.2%      -
Ours 1.0       216.6 M   5.47 M       -          -

on CIFAR-10

Model      Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0   66.47 M   4.21 M       -          -

on CIFAR-100

Model      Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0   66.58 M   4.32 M       -          -

MobileNetV3-Small

on ImageNet

Model          Madds     Parameters   Top1-acc   Top5-acc
Official 1.0   56.5 M    2.53 M       67.4%      -
Ours 1.0       56.51 M   2.53 M       67.52%     87.58%

The pretrained model with top-1 accuracy 67.52% is provided in the pretrained folder.

on CIFAR-10 (Average accuracy of 5 runs)

Model      Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0   17.51 M   1.52 M       92.97%     -

on CIFAR-100 (Average accuracy of 5 runs)

Model         Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0      17.60 M   1.61 M       73.69%     92.31%
More Tricks   same      same         76.24%     92.58%

on SVHN (Average accuracy of 5 runs)

Model      Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0   17.51 M   1.52 M       97.92%     -

on Tiny-ImageNet (Average accuracy of 5 runs)

Model         Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0      51.63 M   1.71 M       59.32%     81.38%
More Tricks   same      same         62.62%     84.04%

Dependency

This project uses Python 3.7 and PyTorch 1.1.0. FLOPs and parameter counts are measured using torchsummaryX.
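
For reference, torchsummaryX usage looks like this (the small Sequential below stands in for this repo's MobileNetV3):

import torch
import torch.nn as nn
from torchsummaryX import summary

# Pass the model and a dummy input; torchsummaryX prints per-layer params and Mult-Adds.
model = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1),
                      nn.BatchNorm2d(16),
                      nn.ReLU(inplace=True))
summary(model, torch.zeros((1, 3, 224, 224)))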
