Making large AI models cheaper, faster and more accessible
DeepSpeed is a deep learning optimization library that makes distributed...
Port of OpenAI's Whisper model in C/C++
Cross-platform, customizable ML solutions for live and streaming media.
A high-throughput and memory-efficient inference and serving engine for ...
ncnn is a high-performance neural network inference framework optimized ...
🎨 The exhaustive Pattern Matching library for TypeScript, with smart ty...
YOLOv3 in PyTorch > ONNX > CoreML > TFLite
Example 📓 Jupyter notebooks that demonstrate how to build, train, and d...
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference...
Faster Whisper transcription with CTranslate2
The user analytics platform for LLMs
Large Language Model Text Generation Inference
The Triton Inference Server provides an optimized cloud and edge inferen...
Hello AI World guide to deploying deep-learning inference networks and d...