Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint, locally and in the cloud.
pip install openllm==0.4.44
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.44
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.44 start HuggingFaceH4/zephyr-7b-beta
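Once a model is started, the server exposes an OpenAI-compatible REST API. As a minimal sketch (assuming the default port 3000 and the standard `/v1/chat/completions` route; adjust the endpoint if your deployment differs), a request can be built with only the Python standard library:

```python
import json
from urllib import request

# Assumption: the server was started with `openllm start ...` and listens
# on the default port 3000. Change ENDPOINT to match your deployment.
ENDPOINT = "http://localhost:3000/v1/chat/completions"

def build_request(model: str, prompt: str) -> request.Request:
    """Build an OpenAI-style chat completion request for a local OpenLLM server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_request("HuggingFaceH4/zephyr-7b-beta", "Hello!")
# With the server running, urllib.request.urlopen(req) returns the completion.
print(req.full_url)
```

Because the route follows the OpenAI schema, any OpenAI client (for example the official `openai` Python package pointed at this base URL) can be used instead of raw HTTP.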
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.43...v0.4.44
pip install openllm==0.4.43
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.43
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.43 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.42...v0.4.43
pip install openllm==0.4.42
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.42
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.42 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.41...v0.4.42
The vLLM backend now supports GPTQ quantization via upstream vLLM:
openllm start TheBloke/Mistral-7B-Instruct-v0.2-GPTQ --backend vllm --quantise gptq
pip install openllm==0.4.41
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.41
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.41 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
`__origin__` check for older Python by @aarnphm in https://github.com/bentoml/OpenLLM/pull/798
actions/artifacts workflow by @aarnphm in https://github.com/bentoml/OpenLLM/pull/799
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.40...v0.4.41
pip install openllm==0.4.40
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.40
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.40 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.39...v0.4.40
pip install openllm==0.4.39
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.39
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.39 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.38...v0.4.39
pip install openllm==0.4.38
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.38
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.38 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
`openllm import` and `openllm build` by @aarnphm in https://github.com/bentoml/OpenLLM/pull/775
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.37...v0.4.38
pip install openllm==0.4.37
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.37
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.37 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.36...v0.4.37