Making large AI models cheaper, faster and more accessible
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Port of OpenAI's Whisper model in C/C++
Cross-platform, customizable ML solutions for live and streaming media.
ncnn is a high-performance neural network inference framework optimized for the mobile platform.
A high-throughput and memory-efficient inference and serving engine for LLMs.
🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.
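To make "exhaustive pattern matching" concrete: ts-pattern's own API is `match(value).with(pattern, handler).exhaustive()`, which fails to compile if a case is missing. The sketch below shows the same compile-time guarantee in plain TypeScript (no ts-pattern dependency) using a discriminated union and a `never` check; the `Shape` type and `area` function are invented for illustration, not part of the library.

```typescript
// Discriminated union of shapes -- stands in for any tagged data type.
type Shape =
  | { kind: "circle"; radius: number }
  | { kind: "square"; side: number };

// Exhaustive match via switch: assigning the leftover value to `never`
// makes the compiler reject this function if a Shape variant is ever
// added but not handled -- the guarantee ts-pattern's .exhaustive()
// provides without the switch boilerplate.
function area(s: Shape): number {
  switch (s.kind) {
    case "circle":
      return Math.PI * s.radius * s.radius;
    case "square":
      return s.side * s.side;
    default: {
      const unreachable: never = s; // compile error if a case is missing
      throw new Error(`unhandled shape: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

With ts-pattern the same function would read `match(s).with({ kind: "circle" }, ...).with({ kind: "square" }, ...).exhaustive()`, and the "smart type inference" in the tagline refers to each handler receiving the narrowed variant type automatically.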
YOLOv3 in PyTorch > ONNX > CoreML > TFLite
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs.
Faster Whisper transcription with CTranslate2
The user analytics platform for LLMs
Large Language Model Text Generation Inference
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
The Triton Inference Server provides an optimized cloud and edge inferencing solution.