Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗...
Generative AI reference workflows optimized for accelerated infrastructure...
Add bisenetv2. My implementation of BiSeNet
An alternative to Triton Inference Server. Boosting DL Service Throughput...
The Triton backend for the ONNX Runtime.
Hardware-accelerated DNN model inference ROS 2 packages using NVIDIA Triton...
Deploy DL/ML inference pipelines with minimal extra code.
Set up CI for DL / CUDA / cuDNN / TensorRT / onnx2trt / onnxruntime / onnxsim / ...
Advanced inference pipeline using NVIDIA Triton Inference Server for CRA...