
EfficientViT

paper | poster

About EfficientViT Models

EfficientViT is a new family of vision models for efficient high-resolution vision, especially segmentation. Its core building block is a new lightweight multi-scale attention module that achieves a global receptive field and multi-scale learning using only hardware-efficient operations.
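
The full block pairs this attention with small depthwise convolutions to aggregate multi-scale tokens. As a rough illustration of the core idea only, below is a minimal PyTorch sketch of ReLU-based linear attention, where softmax is replaced by ReLU feature maps so that global token mixing costs linear rather than quadratic time; every name, shape, and layer choice in it is an assumption for illustration, not the repository's implementation.

import torch
import torch.nn as nn

class ReLULinearAttention(nn.Module):
    # Illustrative sketch: global attention via ReLU feature maps in linear time.
    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1, bias=False)
        self.proj = nn.Conv2d(dim, dim, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).flatten(2).chunk(3, dim=1)  # each (B, C, N)
        q, k = torch.relu(q), torch.relu(k)               # ReLU kernel feature maps
        # Associativity trick: compute K V^T first, so the cost is O(N * C^2) instead of O(N^2 * C).
        kv = torch.einsum("bcn,bdn->bcd", k, v)           # (B, C, C)
        num = torch.einsum("bcn,bcd->bdn", q, kv)         # (B, C, N)
        den = torch.einsum("bcn,bc->bn", q, k.sum(dim=-1)).unsqueeze(1) + 1e-6
        out = (num / den).reshape(b, c, h, w)
        return self.proj(out)

attn = ReLULinearAttention(64)            # toy usage on a random feature map
y = attn(torch.randn(1, 64, 32, 32))      # y.shape == (1, 64, 32, 32)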

Comparisons with prior state-of-the-art semantic segmentation models, as well as EfficientViT's image classification results, are summarized in the tables under Download Pretrained Models below.

Getting Started

Installation

conda create -n efficientvit python=3.8.5
conda activate efficientvit
conda install pytorch=1.13.1 torchvision=0.14.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install tqdm opencv-python

Dataset

Download Pretrained Models

Mobile latency is measured on a Qualcomm Snapdragon 8 Gen 1 with TensorFlow Lite, fp32, batch size 1.

ImageNet

| Model | Resolution | ImageNet Top-1 Acc | ImageNet Top-5 Acc | Params | MACs | Mobile Latency | Checkpoint |
|---|---|---|---|---|---|---|---|
| EfficientViT-B1 | 224 | 79.4 | 94.3 | 9.1M | 0.52G | 19ms | link |
| EfficientViT-B1 | 256 | 79.9 | 94.7 | 9.1M | 0.68G | 24ms | link |
| EfficientViT-B1 | 288 | 80.4 | 95.0 | 9.1M | 0.86G | 31ms | link |
| EfficientViT-B2 | 224 | 82.1 | 95.8 | 24M | 1.6G | 55ms | link |
| EfficientViT-B2 | 256 | 82.7 | 96.1 | 24M | 2.1G | 72ms | link |
| EfficientViT-B2 | 288 | 83.1 | 96.3 | 24M | 2.6G | 92ms | link |
| EfficientViT-B3 | 224 | 83.5 | 96.4 | 49M | 4.0G | 140ms | link |
| EfficientViT-B3 | 256 | 83.8 | 96.5 | 49M | 5.2G | 180ms | link |
| EfficientViT-B3 | 288 | 84.2 | 96.7 | 49M | 6.5G | 228ms | link |

Cityscapes

| Model | Resolution | Cityscapes mIoU | Params | MACs | Mobile Latency | Checkpoint |
|---|---|---|---|---|---|---|
| EfficientViT-B0 | 960x1920 | 75.5 | 0.7M | 3.9G | 0.20s | link |
| EfficientViT-B1 | 896x1792 | 80.1 | 4.8M | 19G | 0.82s | link |
| EfficientViT-B2 | 1024x2048 | 82.1 | 15M | 74G | 3.1s | link |
| EfficientViT-B3 | 1184x2368 | 83.2 | 40M | 240G | 10s | link |

ADE20K

| Model | Resolution | ADE20K mIoU | Params | MACs | Mobile Latency | Checkpoint |
|---|---|---|---|---|---|---|
| EfficientViT-B1 | 480 | 42.7 | 4.8M | 2.7G | 0.10s | link |
| EfficientViT-B2 | 416 | 45.1 | 15M | 6.0G | 0.21s | link |
| EfficientViT-B3 | 512 | 49.0 | 39M | 22G | 0.8s | link |

Usage

# Image classification (ImageNet)
from models.cls_model_zoo import create_cls_model

model = create_cls_model(
  name="b3",
  pretrained=True,
  weight_url="assets/checkpoints/cls/b3-r288.pt"
)

# Semantic segmentation (Cityscapes)
from models.seg_model_zoo import create_seg_model

model = create_seg_model(
  name="b3",
  dataset="cityscapes",
  pretrained=True,
  weight_url="assets/checkpoints/seg/cityscapes/b3-r1184.pt"
)

# Semantic segmentation (ADE20K)
from models.seg_model_zoo import create_seg_model

model = create_seg_model(
  name="b3",
  dataset="ade20k",
  pretrained=True,
  weight_url="assets/checkpoints/seg/ade20k/b3-r512.pt"
)
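
As a usage illustration, the sketch below runs single-image classification with the model created above; the preprocessing (ImageNet normalization, 288-pixel resize/crop matching the b3-r288 checkpoint) and the input path are assumptions, not part of the repository's documented API.

import torch
from PIL import Image
from torchvision import transforms

from models.cls_model_zoo import create_cls_model

model = create_cls_model(
  name="b3",
  pretrained=True,
  weight_url="assets/checkpoints/cls/b3-r288.pt"
)
model.eval()

# Standard ImageNet preprocessing at the assumed 288 training resolution.
transform = transforms.Compose([
  transforms.Resize(288),
  transforms.CenterCrop(288),
  transforms.ToTensor(),
  transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = Image.open("example.jpg").convert("RGB")  # hypothetical input image
with torch.no_grad():
  logits = model(transform(image).unsqueeze(0))
print(logits.argmax(dim=1))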

Evaluation

Please run eval_cls_model.py or eval_seg_model.py to evaluate our models.

Examples: classification, segmentation
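
For reference, a minimal sketch of what a top-1 accuracy evaluation amounts to; eval_cls_model.py automates this, and the dataset path, resolution, batch size, and preprocessing below are assumptions.

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

from models.cls_model_zoo import create_cls_model

# Assumed ImageNet validation set in torchvision ImageFolder layout.
val_dir = "datasets/imagenet/val"
transform = transforms.Compose([
  transforms.Resize(288),
  transforms.CenterCrop(288),
  transforms.ToTensor(),
  transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
loader = DataLoader(datasets.ImageFolder(val_dir, transform), batch_size=64, num_workers=8)

model = create_cls_model(
  name="b3",
  pretrained=True,
  weight_url="assets/checkpoints/cls/b3-r288.pt"
).eval()

correct = total = 0
with torch.no_grad():
  for images, labels in loader:
    preds = model(images).argmax(dim=1)
    correct += (preds == labels).sum().item()
    total += labels.numel()
print(f"top-1 accuracy: {correct / total:.4f}")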

Visualization

Please run eval_seg_model.py to visualize the outputs of our semantic segmentation models.

Example:

python eval_seg_model.py --dataset cityscapes --crop_size 1184 --model b3-r1184 --save_path demo/cityscapes/b3-r1184/

Benchmarking with TFLite

To generate TFLite files, please refer to export_tflite.py. It requires the TinyNN package.

pip install git+https://github.com/alibaba/TinyNeuralNetwork.git

Example:

python export_tflite.py --export_path model.tflite --task seg --dataset ade20k --model b3 --resolution 512 512
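
To sanity-check an exported file on the host, the standard TensorFlow Lite Python interpreter can be used as sketched below; the model path follows the example above, the input is random, and latency measured this way will not match the on-device Snapdragon numbers reported in the tables.

import time
import numpy as np
import tensorflow as tf

# Load the file produced by export_tflite.py (path assumed from the example above).
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Random fp32 input with the exported input shape.
dummy = np.random.rand(*inp["shape"]).astype(np.float32)

start = time.time()
for _ in range(10):
  interpreter.set_tensor(inp["index"], dummy)
  interpreter.invoke()
print("mean host latency:", (time.time() - start) / 10, "s")
print("output shape:", interpreter.get_tensor(out["index"]).shape)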

Contact

Han Cai: [email protected]

TODO

  • Add super resolution models
  • Add object detection models
  • Add training code

Citation

If EfficientViT is useful or relevant to your research, please cite our paper:

@article{cai2022efficientvit,
  title={Efficientvit: Enhanced linear attention for high-resolution low-computation visual recognition},
  author={Cai, Han and Gan, Chuang and Han, Song},
  journal={arXiv preprint arXiv:2205.14756},
  year={2022}
}