Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI libraries for accelerating ML workloads.
A high-throughput and memory-efficient inference and serving engine for LLMs
Run any open-source LLMs, such as Llama 2 and Mistral, as OpenAI-compatible API endpoints in the cloud
This project aims to share the technical principles behind large language models along with hands-on practical experience.
SkyPilot: Run LLMs, AI, and batch jobs on any cloud. Get maximum cost savings, the highest GPU availability, and managed execution with a simple interface.
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
RayLLM - LLMs on Ray
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute resources
Efficient AI Inference & Serving
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications
Fine-tune LLMs on K8s using Runbooks
A collection of hands-on notebooks for LLM practitioners