Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint, locally and in the cloud.
pip install openllm==0.4.44
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.44
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.44 start HuggingFaceH4/zephyr-7b-beta
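Once a model is started, the server exposes an OpenAI-compatible REST API. As a minimal sketch (assuming the default port 3000 and the standard `/v1/chat/completions` route; adjust the endpoint if your deployment differs), a request can be built with only the Python standard library:

```python
import json
from urllib import request

# Assumption: the server was started with `openllm start ...` and listens
# on the default port 3000. Change ENDPOINT to match your deployment.
ENDPOINT = "http://localhost:3000/v1/chat/completions"

def build_request(model: str, prompt: str) -> request.Request:
    """Build an OpenAI-style chat completion request for a local OpenLLM server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_request("HuggingFaceH4/zephyr-7b-beta", "Hello!")
# With the server running, urllib.request.urlopen(req) returns the completion.
print(req.full_url)
```

Because the route follows the OpenAI schema, any OpenAI client (for example the official `openai` Python package pointed at this base URL) can be used instead of raw HTTP.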
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.43...v0.4.44
pip install openllm==0.4.43
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.43
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.43 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.42...v0.4.43
pip install openllm==0.4.42
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.42
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.42 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.41...v0.4.42
The vLLM backend now supports GPTQ quantization via upstream vLLM:
openllm start TheBloke/Mistral-7B-Instruct-v0.2-GPTQ --backend vllm --quantise gptq
pip install openllm==0.4.41
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.41
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.41 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
`__origin__` check for older Python by @aarnphm in https://github.com/bentoml/OpenLLM/pull/798
actions/artifacts workflow by @aarnphm in https://github.com/bentoml/OpenLLM/pull/799
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.40...v0.4.41
pip install openllm==0.4.40
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.40
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.40 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.39...v0.4.40
pip install openllm==0.4.39
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.39
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.39 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.38...v0.4.39
pip install openllm==0.4.38
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.38
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.38 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
`openllm import` and `openllm build` by @aarnphm in https://github.com/bentoml/OpenLLM/pull/775
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.37...v0.4.38
pip install openllm==0.4.37
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.37
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.37 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.36...v0.4.37