Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
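The "single line of code" is the API base URL: Xinference serves an OpenAI-compatible API, so an app pointed at OpenAI can be redirected to a locally hosted model. A minimal sketch, assuming a local Xinference server on its default port 9997 and a model launched under the hypothetical name "my-llama" (adjust both for your setup); only the request construction is shown, no network call is made.

```python
import json

OPENAI_BASE_URL = "https://api.openai.com/v1"
XINFERENCE_BASE_URL = "http://localhost:9997/v1"  # the single line to change

def chat_request(base_url: str, model: str, user_message: str) -> tuple[str, bytes]:
    """Build the endpoint URL and JSON body for an OpenAI-style chat completion."""
    url = f"{base_url}/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }).encode()
    return url, body

url, body = chat_request(XINFERENCE_BASE_URL, "my-llama", "Hello!")
print(url)  # http://localhost:9997/v1/chat/completions
```

Because the request shape is unchanged, any OpenAI SDK or HTTP client works the same way once its base URL points at the Xinference endpoint.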
These are the changes in inference v0.11.0.
v0.11.0 introduced a breaking change: `model_engine` must now be specified when launching a model. Refer to Model Engine for more information.
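The breaking change above can be illustrated with a small sketch of how a launch request might now be assembled. The endpoint and field names here are assumptions for illustration, not the confirmed Xinference REST schema; only the `model_engine` requirement itself comes from the release notes.

```python
import json

def build_launch_payload(model_name: str, model_engine: str, **kwargs) -> str:
    """Serialize model-launch parameters, enforcing the v0.11.0 requirement
    that an inference engine (e.g. a hypothetical "vllm") is named explicitly."""
    if not model_engine:
        raise ValueError("model_engine is required as of v0.11.0")
    return json.dumps(
        {"model_name": model_name, "model_engine": model_engine, **kwargs}
    )

# Before v0.11.0 the engine was inferred; now it must be passed explicitly.
payload = build_launch_payload("qwen-chat", "vllm", model_size_in_billions=7)
```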
- `model_engine` for a clearer inference backend by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1466
- `model_engine` parameter for the launch process by @hainaweiben in https://github.com/xorbitsai/inference/pull/1367
- `__init__` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1400
- `auto-gptq` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1457
- `huggingface-hub` to pass CI, since it has some breaking changes by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1427
- `xinference-worker` by @amumu96 in https://github.com/xorbitsai/inference/pull/1397
- `model_engine` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1468
- `/v1/chat/completions` by @amumu96 in https://github.com/xorbitsai/inference/pull/1406
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.10.3...v0.11.0
These are the changes in inference v0.10.3.
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.10.2.post1...v0.10.3
These are the changes in inference v0.10.2.post1.
- `xinference-client` package depends on internal code by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1330
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.10.2...v0.10.2.post1
These are the changes in inference v0.10.2.
- `embedding` and `rerank` models by @yiboyasss in https://github.com/xorbitsai/inference/pull/1306
- `FlagEmbedding` in the CPU docker image by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1318
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.10.1...v0.10.2
These are the changes in inference v0.10.1.
- `cv2` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1217
- `opencv` issue in docker container by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1227
- `llama-cpp-python` `v0.2.58` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1242
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.10.0...v0.10.1
These are the changes in inference v0.10.0.
- `OmniLMM` chat model by @hainaweiben in https://github.com/xorbitsai/inference/pull/1171
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.9.4...v0.10.0
These are the changes in inference v0.9.4.
- `sglang` backend by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1161
- `best_of` from benchmark by @qinxuye in https://github.com/xorbitsai/inference/pull/1150
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.9.3...v0.9.4
These are the changes in inference v0.9.3.
- `llama-cpp-python` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1134
- `xinference registrations` and `xinference list` commands by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1140
- `ctrl+c` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1144
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.9.2...v0.9.3
These are the changes in inference v0.9.2.
- `n_gpu_layers` parameter for `llama-cpp-python` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1070
- `replica` on the running model page by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1093
- `CPU` when selecting `n_gpu` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1096
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.9.1...v0.9.2
These are the changes in inference v0.9.1.
- `quantization` when registering an LLM by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1040
- `xinference launch` command line by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1048
- `modelscope` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1066
- `max_token` defaulting to `16` instead of `1024` by @ZhangTianrong in https://github.com/xorbitsai/inference/pull/1061
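The `max_token` fix in the list above concerns the OpenAI-style default of 16 tokens silently truncating generations when the caller leaves the field unset. A minimal sketch of applying an explicit default instead, assuming the OpenAI field name `max_tokens`; the value 1024 matches the intended default named in the changelog entry.

```python
def with_max_tokens(request: dict, default: int = 1024) -> dict:
    """Return a copy of a completion request with max_tokens defaulted."""
    out = dict(request)
    out.setdefault("max_tokens", default)  # only applied if the caller omitted it
    return out

req = with_max_tokens({"model": "my-llama", "messages": []})
print(req["max_tokens"])  # 1024, instead of the 16-token OpenAI default
```

An explicitly supplied value is left untouched, so callers who really want short completions are unaffected.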
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.9.0...v0.9.1