InsightFace-REST Versions

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.

v0.7.0.0

2 years ago

2021-11-06 v0.7.0.0

A lot of updates have happened since the last release, so the version jumps straight to v0.7.0.0.

Compared to the previous release (v0.6.2.0), this release brings improved performance for SCRFD-based detectors.

Here is a performance comparison on an Nvidia RTX 2080 Super GPU for the scrfd_10g_gnkps detector paired with the glintr100 recognition model (all tests use src/api_trt/test_images/Stallone.jpg, 1 face per image; a rough benchmark sketch follows the table):

Num workers | Client threads | FPS v0.6.2.0 | FPS v0.7.0.0 | Speed-up
1           | 1              | 56           | 103          | 83.9%
1           | 30             | 72           | 128          | 77.7%
6           | 30             | 145          | 179          | 23.4%
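
A minimal sketch of how such a throughput test can be reproduced, sending the test image from several client threads. The port number, payload shape, and field names below are assumptions for illustration; only the /extract endpoint and the test image path come from these notes:

```python
# Hypothetical benchmark sketch: measure end-to-end FPS of the /extract endpoint
# by posting the same test image from N client threads.
# Port, payload shape and field names are assumptions, not the project's documented API.
import base64
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://localhost:18081/extract"    # assumed host/port
IMAGE = "src/api_trt/test_images/Stallone.jpg"
THREADS = 30                              # "Client threads" column in the table
REQUESTS_PER_THREAD = 100

with open(IMAGE, "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

payload = {"images": {"data": [b64]}}     # assumed request body shape

def worker(_):
    for _ in range(REQUESTS_PER_THREAD):
        requests.post(URL, json=payload, timeout=30).raise_for_status()

start = time.time()
with ThreadPoolExecutor(max_workers=THREADS) as pool:
    list(pool.map(worker, range(THREADS)))
elapsed = time.time() - start
print(f"FPS: {THREADS * REQUESTS_PER_THREAD / elapsed:.1f}")
```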

Additions:

  • Added experimental support for the msgpack serializer: reduces network traffic for embeddings by roughly 2x.
  • Output names are no longer required for detection models when building a TRT engine - the correct output order is now extracted from the ONNX models (see the sketch after this list).
  • Detection models can now be exported to a TRT engine with batch size > 1 - the inference code doesn't support this yet, but such engines can now be used in Triton Inference Server without issues.
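
A minimal sketch of reading output names (in graph order) straight from an ONNX file with the onnx package; the model path is a placeholder, and this illustrates the general technique rather than the project's exact code:

```python
# Sketch: extract detector output names and their order from the ONNX graph,
# so they don't have to be specified manually when building a TRT engine.
import onnx

model = onnx.load("models/onnx/scrfd_10g_gnkps/scrfd_10g_gnkps.onnx")  # placeholder path
output_names = [out.name for out in model.graph.output]
print(output_names)  # score/bbox/kps outputs, in the order defined by the graph
```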

Model Zoo:

  • Added support for WebFace600k-based recognition models from the InsightFace repo: w600k_r50 and w600k_mbf.
  • Added an MD5 check for models to allow automatic re-download when models have changed (see the sketch after this list).
  • All SCRFD-based models now support a batch dimension.
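
A minimal sketch of the kind of checksum logic this implies, using hashlib; the file path and expected hash are placeholders, and the project's actual download code may differ:

```python
# Sketch: re-download a model only if its MD5 no longer matches the expected hash.
# Path and expected hash below are placeholders.
import hashlib
from pathlib import Path

def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

model_path = Path("models/onnx/w600k_r50/w600k_r50.onnx")
expected_md5 = "<expected-hash>"

if not model_path.exists() or md5_of(model_path) != expected_md5:
    print("Model missing or changed upstream - re-download it.")
```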

Improvements:

  • 1.5x-2x faster SCRFD re-implementation with Numba: 4.5 ms vs 10 ms for the lumia.jpg example with scrfd_10g_gnkps and threshold = 0.3 (432 faces detected).
  • Moved the image normalization step to the GPU with the help of CuPy: 4x less data transferred from CPU to GPU, about 6% inference speedup, and some computation offloaded from the CPU (see the sketch after this list).
  • 4.5x faster face_align.norm_crop implementation with the help of Numba and removal of unused computations (cropping 432 faces from the lumia.jpg example takes 45 ms vs 205 ms).
  • Face crops are now extracted only when needed, i.e. when face data or embeddings are requested, improving detection-only performance.
  • Added Numba njit caching to reduce subsequent startup times.
  • Timings in logs are rounded to milliseconds for better readability.
  • Minor refactoring
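
A minimal sketch of GPU-side normalization with CuPy. The mean/scale constants follow the common InsightFace-style preprocessing and are an assumption here; the project's exact constants and tensor layout may differ:

```python
# Sketch: move normalization to the GPU so only uint8 pixels cross the CPU->GPU bus.
# The (x - 127.5) / 128 scheme is an assumption borrowed from typical InsightFace preprocessing.
import cupy as cp
import numpy as np

def normalize_on_gpu(batch_u8: np.ndarray) -> cp.ndarray:
    """batch_u8: (N, H, W, 3) uint8 images in host memory."""
    gpu = cp.asarray(batch_u8)                       # single uint8 host->device copy
    gpu = (gpu.astype(cp.float32) - 127.5) / 128.0   # normalize on the GPU
    return cp.transpose(gpu, (0, 3, 1, 2))           # NHWC -> NCHW for the network

batch = np.random.randint(0, 255, (1, 640, 640, 3), dtype=np.uint8)
print(normalize_on_gpu(batch).shape)  # (1, 3, 640, 640)
```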

Fixes:

  • Since the gender/age estimation model is currently not supported, it is excluded from the model preparation step.

v0.6.2.0

2 years ago

2021-09-09 v0.6.2.0

REST-API

  • Use the async httpx library for retrieving images by URL instead of urllib3, which caused a performance drop in multi-GPU environments under load due to excessive use of open sockets (see the sketch after this list).
  • Update draft Triton Inference Server support to use CUDA shared memory.
  • Minor refactoring in preparation for a future change of project structure.
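
A minimal sketch of fetching images by URL with an async httpx client sharing one bounded connection pool; the URL and pool/timeout settings are placeholders, not the project's actual configuration:

```python
# Sketch: download several images concurrently through one shared AsyncClient,
# reusing a bounded connection pool instead of opening a new socket per request.
import asyncio

import httpx

async def fetch_images(urls: list[str]) -> list[bytes]:
    limits = httpx.Limits(max_connections=100)            # placeholder pool size
    async with httpx.AsyncClient(limits=limits, timeout=30.0) as client:
        responses = await asyncio.gather(*(client.get(u) for u in urls))
        return [r.content for r in responses]

images = asyncio.run(fetch_images(["https://example.com/face.jpg"]))
print(len(images[0]), "bytes")
```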

This release also includes the previously missed release v0.6.1.0:

2021-08-07 v0.6.1.0

REST-API

  • Dropped support for the MXNet inference backend and automatic MXNet->ONNX model conversion, since all models are now distributed as ONNX by default.

v0.6.0.0

2 years ago

v0.6.0.0

REST-API

  • Added support for newer InsightFace face detection SCRFD models: scrfd_500m_bnkps, scrfd_2.5g_bnkps, scrfd_10g_bnkps
  • Released custom trained SCRFD models: scrfd_500m_gnkps, scrfd_2.5g_gnkps, scrfd_10g_gnkps
  • Added support for the newer InsightFace face recognition model glintr100
  • Automatic model download switched to Google Drive.
  • Default models switched to glintr100 and scrfd_10g_gnkps

v0.5.9.9

3 years ago

v0.5.9.9

REST-API

  • Added JPEG decoding using PyTurboJPEG - increases decoding speed for large JPEGs by about 2x.
  • Support for batch inference of the genderage model.
  • Support for limiting the number of faces passed to recognition via the limit_faces parameter of the extract endpoint (see the sketch after this list).
  • New /multipart/draw_detections endpoint, supporting image upload via multipart form data.
  • Support for printing face sizes and detection scores on the image in the draw_detections endpoints.
  • More verbose timings for the extract endpoint for debugging and logging purposes.
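
A minimal sketch of a client request using limit_faces. Only the parameter name and the extract endpoint come from these notes; the port and request body shape are assumptions for illustration:

```python
# Sketch: ask the extract endpoint to run recognition on at most one face.
# Port and request body shape are assumptions; limit_faces is the parameter
# named in this release.
import requests

payload = {
    "images": {"urls": ["https://example.com/group_photo.jpg"]},  # assumed body shape
    "limit_faces": 1,   # recognize at most one face
}
resp = requests.post("http://localhost:18081/extract", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())
```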

v0.5.9.6

3 years ago

v0.1.2

3 years ago

Old version based on TF MTCNN