BentoML Versions Save

The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!

v1.2.0a5

4 months ago

What's Changed

fix: image encoding issue when the format is not specified by @frostming in https://github.com/bentoml/BentoML/pull/4435

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.2.0a4...v1.2.0a5

v1.2.0a4

4 months ago

What's Changed

fix: make client timeout consistent with service config by @frostming in https://github.com/bentoml/BentoML/pull/4425
fix: fix env in deploy and create error msg by @FogDong in https://github.com/bentoml/BentoML/pull/4426
fix: port reservation causes bind error on Windows by @frostming in https://github.com/bentoml/BentoML/pull/4427
fix: add more status check in wait until ready by @FogDong in https://github.com/bentoml/BentoML/pull/4428
fix: use spinner log instead of console log by @FogDong in https://github.com/bentoml/BentoML/pull/4429
fix: add config dict to cli by @FogDong in https://github.com/bentoml/BentoML/pull/4424
docs: Add client and use case docs by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4430
fix(sdk): snake case bento names by @bojiang in https://github.com/bentoml/BentoML/pull/4431
docs: Add SDXL turbo use case by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4433
docs: Update readme doc by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4409

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.2.0a3...v1.2.0a4

v1.2.0a3

4 months ago

What's Changed

fix(server): start-http-server for v1.0 bentos by @bojiang in https://github.com/bentoml/BentoML/pull/4421
fix: refactor cli & sdk for deployment by @FogDong in https://github.com/bentoml/BentoML/pull/4419
fix: fix get bento from cloud by @FogDong in https://github.com/bentoml/BentoML/pull/4423
fix: change entry_service to optional by @xianml in https://github.com/bentoml/BentoML/pull/4422

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.2.0a2...v1.2.0a3

v1.2.0a2

4 months ago

What's Changed

feat: allow url as file input by @frostming in https://github.com/bentoml/BentoML/pull/4411
fix(sdk): validate service name; force lowercase by @bojiang in https://github.com/bentoml/BentoML/pull/4414
refactor(bento): add entry_service to bento info by @bojiang in https://github.com/bentoml/BentoML/pull/4417
fix: set content type from annotation by @frostming in https://github.com/bentoml/BentoML/pull/4416
fix(server): include system headers by @bojiang in https://github.com/bentoml/BentoML/pull/4418
fix: loading service from an absolute path by @frostming in https://github.com/bentoml/BentoML/pull/4415
docs: Add BLIP, CLIP, and sentence embedding use cases by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4420

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.2.0a1...v1.2.0a2

v1.2.0a1

4 months ago

What's Changed

fix(docs): remove eager type evaluation for python 3.8 on openllm examples by @aarnphm in https://github.com/bentoml/BentoML/pull/4393
docs: fix typo in tritonserver_type parameter and code example by @ilyahabr in https://github.com/bentoml/BentoML/pull/4394

New Contributors

@ilyahabr made their first contribution in https://github.com/bentoml/BentoML/pull/4394

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.2.0a0...v1.2.0a1

v1.2.0a0

4 months ago

What's Changed

feat: add preview feature for output by @jinyang1994 in https://github.com/bentoml/BentoML/pull/4319
feat: add feature for form validation by @jinyang1994 in https://github.com/bentoml/BentoML/pull/4322
fix: example and bento config by @FogDong in https://github.com/bentoml/BentoML/pull/4324
fix: set wrong default value when type is array by @jinyang1994 in https://github.com/bentoml/BentoML/pull/4332
feat: add v2 config and json override env by @FogDong in https://github.com/bentoml/BentoML/pull/4331
feat: config override by @frostming in https://github.com/bentoml/BentoML/pull/4334
feat: models.save api and tests by @MingLiangDai in https://github.com/bentoml/BentoML/pull/4307
fix: fix config migration by @FogDong in https://github.com/bentoml/BentoML/pull/4341
fix: async api call by @xianml in https://github.com/bentoml/BentoML/pull/4349
fix(config): typo on override by @Haivilo in https://github.com/bentoml/BentoML/pull/4351
feat: reorganize the new SDK package by @frostming in https://github.com/bentoml/BentoML/pull/4337
feat: support .python-version symlink by @aarnphm in https://github.com/bentoml/BentoML/pull/4354
feat: add loading status when form is submitting by @jinyang1994 in https://github.com/bentoml/BentoML/pull/4361
feat: add e2e tests for new SDK by @frostming in https://github.com/bentoml/BentoML/pull/4352
fix(with_config): annotate return type by @aarnphm in https://github.com/bentoml/BentoML/pull/4355
Chore: add supported gpu type by @xianml in https://github.com/bentoml/BentoML/pull/4363
fix(config): make sure to escape quotation for migration of services config by @aarnphm in https://github.com/bentoml/BentoML/pull/4369
fix(sdk): identify async by original func by @bojiang in https://github.com/bentoml/BentoML/pull/4370
chore(bento.yaml): move fields into services by @bojiang in https://github.com/bentoml/BentoML/pull/4372
fix: add services in manifest by @FogDong in https://github.com/bentoml/BentoML/pull/4373
chore(sdk): envs in bentofile by @bojiang in https://github.com/bentoml/BentoML/pull/4378
chore(cloud): include envs in manifest by @bojiang in https://github.com/bentoml/BentoML/pull/4379
chore: cherry-pick SSE utils into 1.2 branch by @aarnphm in https://github.com/bentoml/BentoML/pull/4375
fix: correct 1.2 model list and tag format when pushing bento by @Haivilo in https://github.com/bentoml/BentoML/pull/4381
feat: Update the bento yaml schema by @frostming in https://github.com/bentoml/BentoML/pull/4371
ci: pre-commit autoupdate [skip ci] by @pre-commit-ci in https://github.com/bentoml/BentoML/pull/4382
chore(sdk): able to specify service name by @bojiang in https://github.com/bentoml/BentoML/pull/4377
feat(bentocloud): deployment v2 api client + cli by @Haivilo in https://github.com/bentoml/BentoML/pull/4335
chore(deps): bump github/codeql-action from 2 to 3 by @dependabot in https://github.com/bentoml/BentoML/pull/4343
feat(sdk): use attribute chain as dependency import string by @frostming in https://github.com/bentoml/BentoML/pull/4385
chore(example): change name to avoid conflict by @bojiang in https://github.com/bentoml/BentoML/pull/4387
refactor(impl): 1.2 loader by @bojiang in https://github.com/bentoml/BentoML/pull/4388
fix: refactor deployment v2 client and cli by @FogDong in https://github.com/bentoml/BentoML/pull/4383

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.1.11...v1.2.0a0

v1.1.11

5 months ago

Bug fixes

Fix streaming for long payloads on remote runners. It will now always yield text and follow SSE protocol. We also provide SSE utils:

import bentoml
from bentoml.io import SSE

class MyRunnable(bentoml.Runnable):
	@bentoml.Runnable.method()
	def streaming(self, text):
		yield "data: 1\n\n"
		yield "data: 12222222222222222222222222222\n\n"

runner = bentoml.Runner(MyRunnable)

svc = bentoml.Service("service", runners=[runner])

@svc.api()
def infer(text):
	result = 0
	async for it in runner.streaming.async_stream(text):
		payload = SSE.from_iterator(it)
		result += int(payload.data)
	return result

What's Changed

docs: Add BentoCloud payment doc by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4286
docs: update quickstart with OpenLLM by @aarnphm in https://github.com/bentoml/BentoML/pull/4295
fix(docs): correct server implementation by @aarnphm in https://github.com/bentoml/BentoML/pull/4297
docs: Remove bill void status by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4299
docs: Update LLM quickstart format and wording by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4300
fix: citation link at README.md by @shenxiangzhuang in https://github.com/bentoml/BentoML/pull/4301
fix(transformers): support trust_remote_code and added unit tests by @MingLiangDai in https://github.com/bentoml/BentoML/pull/4271
docs: Add Bento Deployment details docs by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4304
test: remove outdated tests with pretrained_class parameter by @MingLiangDai in https://github.com/bentoml/BentoML/pull/4308
fix: Updated starlette to >= 0.24.0 by @jakthra in https://github.com/bentoml/BentoML/pull/4306
ci: pre-commit autoupdate [skip ci] by @pre-commit-ci in https://github.com/bentoml/BentoML/pull/4316
fix: syntax error on code snippet in bentoml.onnx.save_model docs by @lucasew in https://github.com/bentoml/BentoML/pull/4323
chore(deps): bump actions/setup-python from 4 to 5 by @dependabot in https://github.com/bentoml/BentoML/pull/4329
docs: fixed typo in file name benchmark README.md by @gazon1 in https://github.com/bentoml/BentoML/pull/4342
fix(stream): streaming enable to work with proxy by @jianshen92 in https://github.com/bentoml/BentoML/pull/4330
docs: fix typo in frameworks transformers guide by @IbrahimAmin1 in https://github.com/bentoml/BentoML/pull/4360
chore(sse): refactor sse utils with efficient buffering by @bojiang in https://github.com/bentoml/BentoML/pull/4362
fix(runner): fix DataFrame container header too long by @larme in https://github.com/bentoml/BentoML/pull/4364
chore(generated): new stubs for proto 4 by @aarnphm in https://github.com/bentoml/BentoML/pull/4374

New Contributors

@shenxiangzhuang made their first contribution in https://github.com/bentoml/BentoML/pull/4301
@jakthra made their first contribution in https://github.com/bentoml/BentoML/pull/4306
@lucasew made their first contribution in https://github.com/bentoml/BentoML/pull/4323
@gazon1 made their first contribution in https://github.com/bentoml/BentoML/pull/4342
@IbrahimAmin1 made their first contribution in https://github.com/bentoml/BentoML/pull/4360

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.1.10...v1.1.11

v1.1.10

6 months ago

Released a patch that set the upper bound for cattrs<23.2, which breaks our whole serialisation process both upstream and downstream.

What's Changed

fix: StreamingResponse compatibility issue by @xianml in https://github.com/bentoml/BentoML/pull/4248
bentocloud doc update --data field by @MingLiangDai in https://github.com/bentoml/BentoML/pull/4272
docs: Add repository and bento selection notes by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4280
fix(dispatcher): unbounded overload batch_size by @aarnphm in https://github.com/bentoml/BentoML/pull/4273
docs: update transformers example to include gpu options by @ssheng in https://github.com/bentoml/BentoML/pull/4281
fix monitoring docs configuration.yaml typo by @KimSoungRyoul in https://github.com/bentoml/BentoML/pull/4287
fix: runnable framework logic in transformers.py by @benfu-verses in https://github.com/bentoml/BentoML/pull/4291
docs: Update docs on supported CUDA versions by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4288
docs: Add docs for new transformers model import API by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4282
fix: Disable exception in serve-grpc on Windows in development mode by @zimka in https://github.com/bentoml/BentoML/pull/4294
fix(dependencies): lock cattrs<23.2 for now by @aarnphm in https://github.com/bentoml/BentoML/pull/4292

New Contributors

@benfu-verses made their first contribution in https://github.com/bentoml/BentoML/pull/4291
@zimka made their first contribution in https://github.com/bentoml/BentoML/pull/4294

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.1.9...v1.1.10

v1.1.9

6 months ago

Import Hugging Face Transformers Model: the bentoml.transformers.import_model API imports pretrained transformers models directly from HuggingFace. Using this API allows importing Transformers models into the BentoML model store without loading the model into memory. The bentoml.transformers.import_model API takes the first argument to be the model name in BentoML store, and the second argument to be the model_id on HuggingFace Hub.

import bentoml

bentomodel = bentoml.transformers.import_model("zephyr-7b-beta", "HuggingFaceH4/zephyr-7b-beta")

Standardize with nvidia-ml-py: BentoML now uses the official nvidia-ml-py package instead of pynvml to avoid conflict with other packages.
Define Environment Variable in Configuration: Within bentoml_configuration.yaml, values in the form of ${ENV_VAR} will be expanded at runtime to the value of the corresponding environment variable, but please note that this only supports string types.

What's Changed

docs: Update the deployment docs by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4260
ci: pre-commit autoupdate [skip ci] by @pre-commit-ci in https://github.com/bentoml/BentoML/pull/4264
feat: import model for transformers framework by @MingLiangDai in https://github.com/bentoml/BentoML/pull/4247
build: Use official nvidia-ml-py package instead of fork by @ecederstrand in https://github.com/bentoml/BentoML/pull/4208

New Contributors

@MingLiangDai made their first contribution in https://github.com/bentoml/BentoML/pull/4247
@ecederstrand made their first contribution in https://github.com/bentoml/BentoML/pull/4208

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.1.7...v1.1.9

v1.1.8

6 months ago

What's Changed

docs: Add the OpenLLM Llama 2 Colab link by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4235
docs: Add best practices doc for building and deploying Bentos by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4237
docs: Update the bento building and deployment best practices doc by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4242
docs: Add global token and token expiration in docs by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4243
fix: client API by @alexparker443 in https://github.com/bentoml/BentoML/pull/4245
docs: Update the Bentos doc by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4246
docs: Update OneDiffusion Colab link by @Sherlock113 in https://github.com/bentoml/BentoML/pull/4249
fix: suppress abstractmethod TypeError of TritonRunnerHandle by @netoou in https://github.com/bentoml/BentoML/pull/4251
fix: send request ID in response headers in all cases by @frostming in https://github.com/bentoml/BentoML/pull/4253
fix(client): prepend http if necessary in wait by @sauyon in https://github.com/bentoml/BentoML/pull/4255
fix: Bug 4252 Restore functioning benoml.ray.deployment by @jerryharrow in https://github.com/bentoml/BentoML/pull/4257
fix(client): add http if required to sync client wait by @sauyon in https://github.com/bentoml/BentoML/pull/4258
fix(ci): fix tests by @sauyon in https://github.com/bentoml/BentoML/pull/4259
feat: support storing secrets with env vars in config by @frostming in https://github.com/bentoml/BentoML/pull/4254

New Contributors

@alexparker443 made their first contribution in https://github.com/bentoml/BentoML/pull/4245
@netoou made their first contribution in https://github.com/bentoml/BentoML/pull/4251
@jerryharrow made their first contribution in https://github.com/bentoml/BentoML/pull/4257

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.1.7...v1.1.8