Michaelfeil Infinity Versions

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.

0.0.35

1 week ago

What's Changed

update docs: v2 cli and async request handling by @michaelfeil in https://github.com/michaelfeil/infinity/pull/229

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.34...0.0.35

0.0.34

1 week ago

What's Changed

Add option to enable permissive CORS headers to allow api access from… by @kir-gadjello in https://github.com/michaelfeil/infinity/pull/214
add v2 to CLI by @michaelfeil in https://github.com/michaelfeil/infinity/pull/227
Add revision and trust_remote_code to from_pretrained calls by @chiragjn in https://github.com/michaelfeil/infinity/pull/224

New Contributors

@kir-gadjello made their first contribution in https://github.com/michaelfeil/infinity/pull/214
@chiragjn made their first contribution in https://github.com/michaelfeil/infinity/pull/224

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.33...0.0.34

0.0.33

2 weeks ago

What's Changed

fix-orjson by @michaelfeil in https://github.com/michaelfeil/infinity/pull/201
Add EngineArray Multi-Model [1/3] by @michaelfeil in https://github.com/michaelfeil/infinity/pull/200
Openapi tests by @michaelfeil in https://github.com/michaelfeil/infinity/pull/199
refactor BatchHandler into ModelWorker by @michaelfeil in https://github.com/michaelfeil/infinity/pull/202
Add fp32 as runtime dtype by @michaelfeil in https://github.com/michaelfeil/infinity/pull/211

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.32...0.0.33

0.0.32

1 month ago

What's Changed

You can now serve a model under an alias. The alias is the name you use when communicating with the API.

infinity_emb --served-model-name "your_nickname"
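As a rough illustration of how a client might refer to the alias, the sketch below builds an OpenAI-style embeddings request whose `model` field carries the served name. The host, port, and `/embeddings` route are assumptions, not taken from these release notes, and `build_embedding_request` is a hypothetical helper; check the `/docs` page of your running server for the actual routes.

```python
import json
import urllib.request

# Assumptions (not from the release notes): adjust host, port, and route
# to match your deployment.
BASE_URL = "http://localhost:7997"
EMBEDDINGS_ROUTE = "/embeddings"


def build_embedding_request(alias: str, texts: list[str]) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings request.

    `alias` is the value passed to --served-model-name.
    """
    payload = {"model": alias, "input": texts}
    return urllib.request.Request(
        url=BASE_URL + EMBEDDINGS_ROUTE,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_embedding_request("your_nickname", ["hello world"])

# To send it against a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the example runnable without a server.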

You can now preload models. This acts as a "download and load into RAM" test. After it runs, all files are cached, which speeds up subsequent loads. For additional speedups, use --no-model-warmup to skip model warmup after loading.

infinity_emb --preload-only --model-name-or-path BAAI/bge-large-en-v1.5

PRs

feat: add served_model_name argument for the infinity_server by @bufferoverflow in https://github.com/michaelfeil/infinity/pull/180
FIX: import crossencoder without torch installed and git push of creds by @michaelfeil in https://github.com/michaelfeil/infinity/pull/181
update default model_name to be unified name across routes by @michaelfeil in https://github.com/michaelfeil/infinity/pull/179
python39 type hints by @michaelfeil in https://github.com/michaelfeil/infinity/pull/182
pydantic cli / args validation by @michaelfeil in https://github.com/michaelfeil/infinity/pull/183
update defered moving to cpu & type hints improvement by @michaelfeil in https://github.com/michaelfeil/infinity/pull/187
Update README.md - add Contributors by @michaelfeil in https://github.com/michaelfeil/infinity/pull/189
update infinity offline solution by @michaelfeil in https://github.com/michaelfeil/infinity/pull/195
update offline-mode: deployment docs v2 by @michaelfeil in https://github.com/michaelfeil/infinity/pull/196

New Contributors

@bufferoverflow made their first contribution in https://github.com/michaelfeil/infinity/pull/180 Thanks!

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.31...0.0.32

0.0.31

2 months ago

What's Changed

Create ISSUE_TEMPLATE by @michaelfeil in https://github.com/michaelfeil/infinity/pull/168
bump sentence transformers to v.2.6.0 by @michaelfeil in https://github.com/michaelfeil/infinity/pull/169
Embedding quant by @michaelfeil in https://github.com/michaelfeil/infinity/pull/170
refactor ENUM..TypeHint into a function by @michaelfeil in https://github.com/michaelfeil/infinity/pull/172
refactored more imports by @michaelfeil in https://github.com/michaelfeil/infinity/pull/171
redirect to /docs and optional imports by @michaelfeil in https://github.com/michaelfeil/infinity/pull/175
update typing by @michaelfeil in https://github.com/michaelfeil/infinity/pull/176
update lock by @michaelfeil in https://github.com/michaelfeil/infinity/pull/177
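
For readers unfamiliar with the idea behind the "Embedding quant" entry above, here is a generic sketch of what quantizing an embedding vector can look like, assuming the entry refers to quantizing the returned embeddings. It is not the code from that PR. The function names are hypothetical, and the value range used for int8 scaling is an illustrative assumption.

```python
def quantize_binary(embedding):
    """Keep only the sign of each dimension and pack 8 dimensions per byte.

    Positive values become bit 1, everything else bit 0. The vector is
    zero-padded to a multiple of 8 dimensions before packing.
    """
    bits = [1 if value > 0 else 0 for value in embedding]
    bits += [0] * (-len(bits) % 8)
    packed = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


def quantize_int8(embedding, lo, hi):
    """Linearly map values from [lo, hi] onto integers in [-128, 127].

    Values outside [lo, hi] are clipped first. In practice lo and hi would
    come from a calibration set; here they are passed in explicitly.
    """
    scale = (hi - lo) / 255
    quantized = []
    for value in embedding:
        clipped = min(max(value, lo), hi)
        quantized.append(int(round((clipped - lo) / scale)) - 128)
    return quantized


vector = [0.5, -0.1, 0.2, -0.3, 0.0, 0.9, -0.7, 0.4]
packed = quantize_binary(vector)             # 8 floats stored in 1 byte
small = quantize_int8(vector, -1.0, 1.0)     # 1 byte per dimension
```

Binary packing trades accuracy for a 32x size reduction relative to float32, while int8 gives a 4x reduction with less loss.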

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.30...0.0.31

0.0.30

2 months ago

What's Changed

remove fastembed by @michaelfeil in https://github.com/michaelfeil/infinity/pull/141
Sentence transformers bump to 2.5.0 by @michaelfeil in https://github.com/michaelfeil/infinity/pull/142
Revert "Sentence transformers bump to 2.5.0" by @michaelfeil in https://github.com/michaelfeil/infinity/pull/143
Update README.md by @michaelfeil in https://github.com/michaelfeil/infinity/pull/145
update poetry lock - sentence-transformers 2.5.0 by @michaelfeil in https://github.com/michaelfeil/infinity/pull/144
Support for Inferentia2 (draft) by @michaelfeil in https://github.com/michaelfeil/infinity/pull/118
Add bettertransformer to cli by @michaelfeil in https://github.com/michaelfeil/infinity/pull/152
Fp8 support by @michaelfeil in https://github.com/michaelfeil/infinity/pull/153
Some docstring and typing fixes by @lckr in https://github.com/michaelfeil/infinity/pull/156
add async tokenization to reranker in torch by @michaelfeil in https://github.com/michaelfeil/infinity/pull/154
Update README.md by @sherwin684 in https://github.com/michaelfeil/infinity/pull/167

New Contributors

@lckr made their first contribution in https://github.com/michaelfeil/infinity/pull/156
@sherwin684 made their first contribution in https://github.com/michaelfeil/infinity/pull/167

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.29...0.0.30

0.0.29

2 months ago

What's Changed

OpenAI models compatability and update docs and by @michaelfeil in https://github.com/michaelfeil/infinity/pull/140

This will be the last release with fastembed, since fastembed and optimum provide similar capabilities. Please use optimum going forward.

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.28...0.0.29

0.0.28

2 months ago

What's Changed

add macos ci by @michaelfeil in https://github.com/michaelfeil/infinity/pull/133
Quantization: int8 by @michaelfeil in https://github.com/michaelfeil/infinity/pull/134
add docs via mkdocs by @michaelfeil in https://github.com/michaelfeil/infinity/pull/137

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.27...0.0.28

0.0.27

2 months ago

What's Changed

BREAKING: EngineArgs by @michaelfeil in https://github.com/michaelfeil/infinity/pull/124
- New stable interface. The default batch size is now 32, and michaelfeil/bge-small is the default model. You can now override the pooling method.
multiple os ci and python 3.12 support by @michaelfeil in https://github.com/michaelfeil/infinity/pull/131
add michaelfeil/bge-small as default model by @michaelfeil in https://github.com/michaelfeil/infinity/pull/135

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.26...0.0.27

0.0.26

2 months ago

What's Changed

hf_transfer is automatically used by @michaelfeil in https://github.com/michaelfeil/infinity/pull/112
add revision to onnx by @michaelfeil in https://github.com/michaelfeil/infinity/pull/109
bump sentence-transformers to 2.4.0 by @michaelfeil in https://github.com/michaelfeil/infinity/pull/113
Adds benchmarking by @michaelfeil in https://github.com/michaelfeil/infinity/pull/110
ONNX/Optimum now works on windows by @michaelfeil in https://github.com/michaelfeil/infinity/pull/117

Full Changelog: https://github.com/michaelfeil/infinity/compare/0.0.25...0.0.26