MobileNetV3

An implementation of MobileNetV3 with PyTorch

Theory

You can find the MobileNetV3 paper at Searching for MobileNetV3.

Prepare data

  • CIFAR-10
  • CIFAR-100
  • SVHN
  • Tiny-ImageNet
  • ImageNet: please move the validation images into labeled subfolders; you can use the script here (see the layout sketch after this list).
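
Once the validation images sit in labeled subfolders, a standard torchvision ImageFolder can read them. A minimal sketch (the path and transform values here are illustrative, not this repo's exact pipeline):

import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Expects a layout like val/n01440764/xxx.JPEG, val/n01443537/yyy.JPEG, ...
val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
val_set = datasets.ImageFolder('path/to/imagenet/val', transform=val_transform)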

Train

  • Train from scratch:
CUDA_VISIBLE_DEVICES=3 python train.py --batch-size=128 --mode=small \
--print-freq=100 --dataset=CIFAR100 --ema-decay=0 --label-smoothing=0.1 \
--lr=0.3 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
--warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200 --width-multiplier=1 \
-nbd -zero-gamma -mixup

where the parameters have the following meanings:

  • batch-size: training batch size.
  • mode: use MobileNetV3-Small (if set to small) or MobileNetV3-Large (if set to large).
  • dataset: which dataset to use (CIFAR10, CIFAR100, SVHN, TinyImageNet or ImageNet).
  • ema-decay: decay rate of the weight EMA; if set to 0, EMA is not used (sketch below).
  • label-smoothing: the $\epsilon$ used in label smoothing; if set to 0, label smoothing is not used (sketch below).
  • lr-decay: learning rate decay schedule, step or cos.
  • lr-min: minimum lr in cos lr decay.
  • warmup-epochs: warmup epochs used in cos lr decay (sketch below).
  • num-epochs: total training epochs.
  • nbd: no bias decay (sketch below).
  • zero-gamma: zero the $\gamma$ of the last BN in each block (sketch below).
  • mixup: use Mixup (sketch below).
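
For illustration, a minimal sketch of the cosine decay with linear warmup that --lr-decay=cos, --lr-min and --warmup-epochs describe (an approximation, not this repo's exact code):

import math

def get_lr(epoch, num_epochs, base_lr, lr_min=0.0, warmup_epochs=5):
    # Linear warmup from 0 up to base_lr over the first warmup_epochs epochs.
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr down to lr_min over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, num_epochs - warmup_epochs)
    return lr_min + 0.5 * (base_lr - lr_min) * (1 + math.cos(math.pi * progress))

# e.g. --lr=0.3 --num-epochs=200 --warmup-epochs=5 --lr-min=0
# for group in optimizer.param_groups: group['lr'] = get_lr(epoch, 200, 0.3)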
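
Label smoothing with $\epsilon$ can be sketched as a soft-target cross-entropy (again, an illustration of the technique rather than this repo's code):

import torch.nn.functional as F

def label_smoothing_loss(logits, target, epsilon=0.1):
    # Weight (1 - epsilon) on the true class, epsilon spread uniformly over all classes.
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(dim=-1, index=target.unsqueeze(1)).squeeze(1)
    uniform = -log_probs.mean(dim=-1)
    return ((1.0 - epsilon) * nll + epsilon * uniform).mean()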
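
The -nbd and -zero-gamma tricks come from the "bag of tricks" training literature; a sketch of both (the block attribute name is hypothetical, since the real block structure is defined in this repo's model code):

import torch.nn as nn

def make_param_groups(model, weight_decay=6e-5):
    # No bias decay: apply weight decay only to conv/linear weights,
    # not to biases or BatchNorm parameters (which are 1-D).
    decay, no_decay = [], []
    for name, p in model.named_parameters():
        (no_decay if p.dim() == 1 or name.endswith('.bias') else decay).append(p)
    return [{'params': decay, 'weight_decay': weight_decay},
            {'params': no_decay, 'weight_decay': 0.0}]

# Zero-gamma: initialize the scale of the last BN in each block to zero so the
# block starts out close to identity. 'last_bn' is a hypothetical attribute name:
# nn.init.zeros_(block.last_bn.weight)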
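
And a sketch of Mixup, which trains on convex combinations of example pairs:

import numpy as np
import torch

def mixup(x, y, alpha=0.2):
    # lam ~ Beta(alpha, alpha); mix each example with a randomly permuted partner.
    lam = float(np.random.beta(alpha, alpha))
    index = torch.randperm(x.size(0), device=x.device)
    return lam * x + (1 - lam) * x[index], y, y[index], lam

# The loss becomes lam * criterion(out, y_a) + (1 - lam) * criterion(out, y_b).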
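
Finally, a sketch of the weight EMA controlled by --ema-decay (the EMA copy is typically the one evaluated):

import torch

def ema_update(ema_model, model, decay=0.999):
    # ema = decay * ema + (1 - decay) * current, applied after each optimizer step.
    with torch.no_grad():
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)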

Pretrained models

We have provided the pretrained MobileNetV3-Small model in the pretrained folder.
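
A minimal loading sketch (the checkpoint file name and model constructor below are assumptions; check this repo for the exact names):

import torch

state = torch.load('pretrained/mobilenetv3-small.pth', map_location='cpu')  # hypothetical file name
# model = MobileNetV3(mode='small')   # hypothetical constructor from this repo's model code
# model.load_state_dict(state)
# model.eval()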

Experiments

Training setting

on ImageNet

CUDA_VISIBLE_DEVICES=5 python train.py --batch-size=128 --mode=small --print-freq=2000 --dataset=imagenet \
--ema-decay=0.99 --label-smoothing=0.1 --lr=0.1 --save-epoch-freq=50 --lr-decay=cos --lr-min=0 --warmup-epochs=5 \
--weight-decay=1e-5 --num-epochs=250 --num-workers=2 --width-multiplier=1 -dali -nbd -mixup -zero-gamma -save

on CIFAR-10

CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR10 \
  --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1

on CIFAR-100

CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR100 \
  --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1

 Using more tricks:

CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR100 \
  --ema-decay=0.999 --label-smoothing=0.1 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1 \
  -zero-gamma -nbd -mixup

on SVHN

CUDA_VISIBLE_DEVICES=3 python train.py --batch-size=128 --mode=small --print-freq=1000 --dataset=SVHN \
  --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \
  --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=20 --num-workers=2 --width-multiplier=1

on Tiny-ImageNet

CUDA_VISIBLE_DEVICES=7 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=tinyimagenet \
  --data-dir=/media/data2/chenjiarong/ImageData/tiny-imagenet --ema-decay=0 --label-smoothing=0 --lr=0.15 \
  --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200 \
  --num-workers=2 --width-multiplier=1 -dali

 Using more tricks:

CUDA_VISIBLE_DEVICES=7 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=tinyimagenet \
  --data-dir=/media/data2/chenjiarong/ImageData/tiny-imagenet --ema-decay=0.999 --label-smoothing=0.1 --lr=0.15 \
  --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200 \
  --num-workers=2 --width-multiplier=1 -dali -nbd -mixup

MobileNetV3-Large

on ImageNet

Model          Madds     Parameters   Top1-acc   Top5-acc
Official 1.0   219 M     5.4 M        75.2%      -
Ours 1.0       216.6 M   5.47 M       -          -

on CIFAR-10

Model      Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0   66.47 M   4.21 M       -          -

on CIFAR-100

Model      Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0   66.58 M   4.32 M       -          -

MobileNetV3-Small

on ImageNet

Model          Madds     Parameters   Top1-acc   Top5-acc
Official 1.0   56.5 M    2.53 M       67.4%      -
Ours 1.0       56.51 M   2.53 M       67.52%     87.58%

The pretrained model with top-1 accuracy 67.52% is provided in the pretrained folder.

on CIFAR-10 (Average accuracy of 5 runs)

Model      Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0   17.51 M   1.52 M       92.97%     -

on CIFAR-100 (Average accuracy of 5 runs)

Model         Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0      17.60 M   1.61 M       73.69%     92.31%
More Tricks   same      same         76.24%     92.58%

on SVHN (Average accuracy of 5 runs)

Model      Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0   17.51 M   1.52 M       97.92%     -

on Tiny-ImageNet (Average accuracy of 5 runs)

Model         Madds     Parameters   Top1-acc   Top5-acc
Ours 1.0      51.63 M   1.71 M       59.32%     81.38%
More Tricks   same      same         62.62%     84.04%

Dependency

This project uses Python 3.7 and PyTorch 1.1.0. FLOPs and parameter counts are measured using torchsummaryX.
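
For reference, torchsummaryX usage looks like this (the small Sequential below stands in for this repo's MobileNetV3):

import torch
import torch.nn as nn
from torchsummaryX import summary

# Pass the model and a dummy input; torchsummaryX prints per-layer params and Mult-Adds.
model = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1),
                      nn.BatchNorm2d(16),
                      nn.ReLU(inplace=True))
summary(model, torch.zeros((1, 3, 224, 224)))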
