Call all LLM APIs using the OpenAI format: Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, SageMaker, HuggingFace, Replicate, and more (100+ LLMs).
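A minimal sketch of this unified calling convention, assuming `litellm` is installed and provider credentials are set in the environment: the same OpenAI-format request targets different providers by prefixing the model name. The prompt text here is illustrative only.

```python
# Sketch of LiteLLM's unified calling convention. The same
# OpenAI-format request shape targets any provider by prefixing
# the model name (e.g. "bedrock/...", "ollama/...").

def build_request(provider_model: str, prompt: str) -> dict:
    """Build an OpenAI-format chat request for any provider."""
    return {
        "model": provider_model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Identical shape, different backends:
openai_req = build_request("gpt-3.5-turbo", "Hello!")
bedrock_req = build_request("bedrock/anthropic.claude-instant-v1", "Hello!")

# With litellm installed and credentials configured, each would be sent as:
#   import litellm
#   resp = litellm.completion(**openai_req)
```

The provider prefix is the only part of the request that changes; everything else stays in the OpenAI chat format.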
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.9...v1.37.9-stable
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.9-stable
```
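Once the container above is running, the proxy exposes an OpenAI-compatible API on port 4000. The sketch below builds (but does not send) such a request with the standard library; the model name and `sk-1234` API key are placeholders, not values from this release.

```python
# Build an OpenAI-format request against the local LiteLLM proxy.
# Model name and API key are placeholders; nothing is sent here.
import json
import urllib.request

payload = {
    "model": "gpt-3.5-turbo",  # any model configured on the proxy
    "messages": [{"role": "user", "content": "Hello from the proxy"}],
}
req = urllib.request.Request(
    "http://localhost:4000/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer sk-1234",  # placeholder proxy key
    },
    method="POST",
)
# To actually send it: urllib.request.urlopen(req)
```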
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
`/global/spend/report` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/3619
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.7...v1.37.9
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.9
```
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 40 | 45.15 | 1.51 | 1.51 | 451 | 451 | 37.29 | 203.69 |
/health/liveliness | Failed ❌ | 38 | 43.77 | 15.66 | 15.66 | 4687 | 4687 | 36.20 | 219.30 |
/health/readiness | Failed ❌ | 38 | 42.99 | 15.31 | 15.31 | 4584 | 4584 | 36.15 | 234.45 |
Aggregated | Failed ❌ | 38 | 43.47 | 32.48 | 32.48 | 9722 | 9722 | 36.15 | 234.45 |
`/global/spend/report` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/3619
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.7...v1.37.7-stable
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.7-stable
```
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.6...v1.37.7
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.7
```
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.5-stable...v1.37.6
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.5...v1.37.5-stable
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.5-stable
```
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
`existing_trace_id` exists by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/3581
`client_no_auth` fixture by @msabramo in https://github.com/BerriAI/litellm/pull/3588
`test_load_router_config` pass by @msabramo in https://github.com/BerriAI/litellm/pull/3589
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.3-stable...v1.37.5
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.5
```
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.3...v1.37.3-stable
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.3-stable
```
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
BETA support for Triton Inference Server embeddings on LiteLLM 👉 Start here: https://docs.litellm.ai/docs/providers/triton-inference-server
⚡️ [Feat] Use team-based callbacks for `failure_callbacks` https://docs.litellm.ai/docs/proxy/team_based_routing#logging--caching
🛠️ [Test] Added testing to ensure the Proxy reuses the same OpenAI client after 1 min
🛠️ [Fix] Upsert deployment bug on LiteLLM Proxy
🔥 Improved LiteLLM-stable load tests: added testing for Azure OpenAI and for 50+ deployments on a proxy server
🚀 [Feat] Support `stream_options` on `litellm.text_completion`
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.2...v1.37.3
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.3
```
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
`litellm.completion_cost(model="bedrock/anthropic.claude-instant-v1"..)` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/3534
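A sketch of what a cost calculation like `completion_cost` amounts to: token counts multiplied by per-token prices looked up from a model pricing map. The per-token prices below are hypothetical placeholders, not the real Bedrock rates.

```python
# Sketch of a completion cost calculation: tokens times per-token
# prices. The prices here are hypothetical placeholders, not the
# actual rates for bedrock/anthropic.claude-instant-v1.

def completion_cost_sketch(prompt_tokens: int, completion_tokens: int,
                           input_cost_per_token: float,
                           output_cost_per_token: float) -> float:
    """Dollar cost = input tokens * input rate + output tokens * output rate."""
    return (prompt_tokens * input_cost_per_token
            + completion_tokens * output_cost_per_token)

cost = completion_cost_sketch(
    prompt_tokens=1000,
    completion_tokens=500,
    input_cost_per_token=0.8e-6,   # hypothetical $/token
    output_cost_per_token=2.4e-6,  # hypothetical $/token
)
# 1000 * 0.8e-6 + 500 * 2.4e-6 = 0.002 dollars
```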
`End-User` Usage on Usage Tab by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/3530
`stream_options` param for OpenAI by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/3537
`stream_options` on `litellm.text_completion` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/3547
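For context on `stream_options`: in the OpenAI streaming format, passing `stream_options={"include_usage": True}` makes the final chunk carry a usage object while earlier chunks carry content deltas. The sketch below shows client-side handling of that shape; the chunks are hand-written stubs, not real API output.

```python
# Sketch of consuming a stream sent with
# stream_options={"include_usage": True}: content arrives as deltas,
# and the final chunk carries a usage object. Chunks below are stubs
# shaped like OpenAI streaming responses, not real API output.

def collect_stream(chunks):
    """Join streamed text deltas and pick up usage from the final chunk."""
    text, usage = [], None
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                text.append(delta["content"])
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text), usage

stub_chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
    {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 2}},
]
text, usage = collect_stream(stub_chunks)
```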
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.37.0.dev2_completion_cost...v1.37.2
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.2
```
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 24 | 28.60 | 1.52 | 1.52 | 455 | 455 | 22.67 | 184.81 |
/health/liveliness | Failed ❌ | 23 | 27.67 | 15.57 | 15.57 | 4661 | 4661 | 21.45 | 1771.88 |
/health/readiness | Failed ❌ | 23 | 28.36 | 15.65 | 15.65 | 4686 | 4686 | 21.43 | 1998.66 |
Aggregated | Failed ❌ | 23 | 28.04 | 32.74 | 32.74 | 9802 | 9802 | 21.43 | 1998.66 |