A high-throughput and memory-efficient inference and serving engine for ...
The most flexible way to serve AI/ML models in production - Build Model ...
In this repository, I will share some useful notes and references about ...
FEDML - The unified and scalable ML library for large-scale distributed ...
Standardized Serverless ML Inference Platform on Kubernetes
🏕️ Reproducible development environment
LightLLM is a Python-based LLM (Large Language Model) inference and serv...
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
MLRun is an open source MLOps platform for quickly building and managing...
Hopsworks - Data-Intensive AI platform with a Feature Store
The simplest way to serve AI/ML models in production
Model Deployment at Scale on Kubernetes 🦄️
A high-performance ML model serving framework that offers dynamic batching a...
A scalable inference server for models optimized with OpenVINO™
Python + Inference - A model deployment library in Python. Simplest model ...