Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch). Includes converters from PyTorch -> ONNX -> TensorRT and inference pipelines (TensorRT, Triton server - multi-format). Supported model formats for Triton inference: TensorRT engine, TorchScript, ONNX.
This repo implements a new inference pipeline for the CRAFT text detector (PyTorch) using NVIDIA Triton Inference Server.
k9ele7en. Give a star if you find some value in this repo.
Thank you.
[BSD-3-Clause License] The BSD 3-clause license allows you almost unlimited freedom with the software as long as you include the BSD copyright and license notice in it (found in the full license text).
13 Jul, 2021: Initial update; preparation scripts run well.
14 Jul, 2021: Inference on Triton server runs well (single request); the TensorRT format gives the best performance.
$ pip install -r requirements.txt
Check ./README_ENV.md for details. Install tools/packages included:
The code for training is not included in this repository, as in ClovaAI's original implementation.
Model name | Used datasets | Languages | Purpose | Model Link |
---|---|---|---|---|
General | SynthText, IC13, IC17 | Eng + MLT | For general purpose | Click |
IC15 | SynthText, IC15 | Eng | For IC15 only | Click |
LinkRefiner | CTW1500 | - | Used with the General Model | Click |
a. Triton Inference Server inference: see details at ./README_ENV.md
Initially, you need to run a (.sh) script to prepare the Model Repository; after that, you just need to run the Docker image when inferencing. The script gets things ready for the Triton server, covering these steps:
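For reference, a Triton model repository prepared this way typically looks like the sketch below. The model names match the server log later in this README; the exact file names and any `config.pbtxt` contents are illustrative, not copied from this repo:

```
model_repository/
├── detec_pt/
│   ├── 1/
│   │   └── model.pt        # TorchScript
│   └── config.pbtxt
├── detec_onnx/
│   ├── 1/
│   │   └── model.onnx      # ONNX
│   └── config.pbtxt
└── detec_trt/
    ├── 1/
    │   └── model.plan      # TensorRT engine
    └── config.pbtxt
```

Each model gets its own directory, with numbered version subdirectories and an optional `config.pbtxt` describing inputs/outputs and the backend.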
Check if Server running correctly:
$ curl -v localhost:8000/v2/health/ready
...
< HTTP/1.1 200 OK
< Content-Length: 0
< Content-Type: text/plain
Now everything's ready; start inference by:
$ sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/maverick911/repo/Triton-TensorRT-Inference-CRAFT-pytorch/model_repository:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-repository=/models
...
+------------+---------+--------+
| Model | Version | Status |
+------------+---------+--------+
| detec_onnx | 1 | READY |
| detec_pt | 1 | READY |
| detec_trt | 1 | READY |
+------------+---------+--------+
I0714 00:37:55.265177 1 grpc_server.cc:4062] Started GRPCInferenceService at 0.0.0.0:8001
I0714 00:37:55.269588 1 http_server.cc:2887] Started HTTPService at 0.0.0.0:8000
I0714 00:37:55.312507 1 http_server.cc:2906] Started Metrics Service at 0.0.0.0:8002
Run inference with:
$ python infer_triton.py -m='detec_trt' -x=1 --test_folder='./images' -i='grpc' -u='localhost:8001'
Request 1, batch size 1s/sample.jpg
elapsed time : 0.9521937370300293s
Output from Triton:
Triton server (gRPC vs HTTP):
Model format | gRPC (s) | HTTP (s) |
---|---|---|
TensorRT | 0.946 | 0.952 |
TorchScript | 1.244 | 1.098 |
ONNX | 1.052 | 1.060 |
Classic PyTorch (.pth): 1.319s
Arguments:
-m: name of the model (with format suffix)
-x: version of the model
--test_folder: input image/folder
-i: protocol (HTTP/gRPC)
-u: URL of the corresponding protocol (HTTP-8000, gRPC-8001)

Note: the TensorRT engine restricts input shapes; an image outside the built range fails with an error like:
inference failed: [StatusCode.INTERNAL] request specifies invalid shape for input 'input' for detec_trt_0_gpu0. Error details: model expected the shape of dimension 2 to be between 256 and 1200 but received 1216
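One way to avoid that shape error is to resize images into the engine's allowed range before sending the request. The helper below is a hypothetical sketch (not part of this repo): it assumes the 256-1200 range from the error message above and, as CRAFT-style models commonly expect, keeps dimensions as multiples of 32.

```python
import numpy as np

# Hypothetical helper: resize an HxW(xC) image so height/width fit the
# TensorRT optimization profile seen in the error above (256..1200),
# floored to a multiple of 32 (assumption: CRAFT-style stride alignment).
def fit_to_trt_profile(img: np.ndarray, lo: int = 256, hi: int = 1200,
                       stride: int = 32) -> np.ndarray:
    h, w = img.shape[:2]

    def clamp(x: int) -> int:
        x = min(hi, max(lo, x))        # clamp into the profile range
        return (x // stride) * stride  # floor to a multiple of stride

    new_h, new_w = clamp(h), clamp(w)
    # Nearest-neighbor resize via index mapping (avoids a cv2 dependency)
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]
```

For example, a 1216-pixel-tall image (the failing case above) would be resized to 1184 pixels, the largest multiple of 32 within the profile.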
b. Classic PyTorch (.pth) inference:
$ python test.py --trained_model=[weightfile] --test_folder=[folder path to test images]
The result images and score maps will be saved to ./result by default.