Human segmentation models, training/inference code, and trained weights, implemented in PyTorch
To inspect a network's architecture, memory usage, forward time (on either CPU or GPU), number of parameters, and number of FLOPs, use this command:
python measure_model.py
Portrait Segmentation (Human/Background)
git clone --recursive https://github.com/AntiAegis/Human-Segmentation-PyTorch.git
cd Human-Segmentation-PyTorch
git submodule sync
git submodule update --init --recursive
workon humanseg
pip install -r requirements.txt
pip install -e models/pytorch-image-models
python train.py --config config/config_DeepLab.json --device 0
Here, config/config_DeepLab.json is the configuration file that holds the network, dataloader, optimizer, loss, metric, and visualization settings.
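Before starting a long training run, it can help to check that a configuration file contains every section you expect. The following is a minimal sketch, not the repository's own loader: the section names in `EXPECTED_SECTIONS` are hypothetical, chosen to mirror the description above, and the actual keys in config/config_DeepLab.json may differ.

```python
import json

# Hypothetical top-level section names that mirror the prose description;
# the real keys used by config/config_DeepLab.json may be named differently.
EXPECTED_SECTIONS = (
    "network",
    "dataloader",
    "optimizer",
    "losses",
    "metrics",
    "visualization",
)


def load_config(path):
    """Read a JSON configuration file into a dictionary."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def missing_sections(config, expected=EXPECTED_SECTIONS):
    """Return the expected top-level sections that are absent from config."""
    return [name for name in expected if name not in config]


# Illustrative values only, not taken from the repository.
example = json.loads('{"network": {"type": "DeepLab"}, "optimizer": {"type": "SGD"}}')
print(missing_sections(example))
```

For a real file, `missing_sections(load_config("config/config_DeepLab.json"))` would list any of the assumed sections that are not present, once the names are adjusted to match the file.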
To resume training from a checkpoint:
python train.py --config config/config_DeepLab.json --device 0 --resume path_to_checkpoint/model_best.pth
There are two modes of inference: video and webcam.
python inference_video.py --watch --use_cuda --checkpoint path_to_checkpoint/model_best.pth
python inference_webcam.py --use_cuda --checkpoint path_to_checkpoint/model_best.pth
CPU: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
GPU: GeForce GTX 1050 Mobile, CUDA 9.0
Model | Parameters | FLOPs | CPU time | GPU time | mIoU |
---|---|---|---|---|---|
UNet_MobileNetV2 (alpha=1.0, expansion=6) | 4.7M | 1.3G | 167ms | 17ms | 91.37% |
UNet_ResNet18 | 16.6M | 9.1G | 165ms | 21ms | 90.09% |
DeepLab3+_ResNet18 | 16.6M | 9.1G | 133ms | 28ms | 91.21% |
BiSeNet_ResNet18 | 11.9M | 4.7G | 88ms | 10ms | 87.02% |
PSPNet_ResNet18 | 12.6M | 20.7G | 235ms | 666ms | --- |
ICNet_ResNet18 | 11.6M | 2.0G | 48ms | 55ms | 86.27% |
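The mIoU column reports mean intersection over union. The README does not state exactly how the metric is computed, so the sketch below assumes the common definition for a two-class (human/background) mask: compute IoU per class and average the results. The function names and the toy masks are illustrative, and plain nested lists are used to keep the example dependency-free.

```python
import math


def iou(pred, target, cls):
    """Intersection over union for one class on two same-shaped 2D label masks."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            in_pred = p == cls
            in_target = t == cls
            inter += in_pred and in_target
            union += in_pred or in_target
    # A class absent from both masks has no defined IoU.
    return inter / union if union else math.nan


def mean_iou(pred, target, classes=(0, 1)):
    """Average the per-class IoU, skipping classes absent from both masks."""
    scores = [iou(pred, target, c) for c in classes]
    valid = [s for s in scores if not math.isnan(s)]
    return sum(valid) / len(valid) if valid else math.nan


# Toy 2x3 masks: 1 = human, 0 = background.
pred = [[1, 1, 0],
        [0, 0, 0]]
target = [[1, 0, 0],
          [0, 0, 0]]
print(mean_iou(pred, target))
```

In this toy case the human class has IoU 1/2 and the background class has IoU 4/5, so the mean is about 0.65.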