An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
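To make the one-line description concrete, here is a pure-Python sketch of group-wise low-bit weight quantization, the kind of compression GPTQ-style methods apply. This is simplified round-to-nearest for illustration only, not the actual GPTQ algorithm (which additionally corrects quantization error using second-order information), and it is not AutoGPTQ's API.

```python
# Illustrative sketch: quantize a group of float weights to signed 4-bit
# integers plus one shared scale, then reconstruct them. Simplified
# round-to-nearest, NOT the actual GPTQ algorithm.

def quantize_group(weights, bits=4):
    """Quantize a group of float weights to `bits`-bit integers plus a scale."""
    qmax = 2 ** (bits - 1) - 1                       # e.g. 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize_group(q, scale):
    """Map the integers back to floats."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.07]
q, scale = quantize_group(weights)
restored = dequantize_group(q, scale)
# Each restored value is within half a quantization step of the original.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(weights, restored))
```

Grouping weights (e.g. 128 at a time) and storing one scale per group is what keeps the accuracy loss small at 4-bit precision.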
- Fix the problem that the `autogptq_cuda` dir was missing from the distribution files.
- Fix the problem that installation from PyPI failed when the environment variable `CUDA_VERSION` is set.
Happy International Children's Day! 🎈 In the age of LLMs and at the dawn of AGI, may we always be curious like children, with vigorous energy and the courage to explore a bright future.
A bunch of new features have been added in this version:
- Fused attention for `llama` and `gptj`, and fused MLP for `llama`, for faster inference speed.
- New model types supported: `codegen`, `gpt_bigcode` and `falcon`.
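The idea behind a fused MLP can be illustrated in plain Python: a llama-style MLP computes `silu(x @ W_gate) * (x @ W_up)`, and stacking the gate and up projection matrices lets one matrix product produce both intermediates. The real speedup comes from a fused Triton/CUDA kernel; this sketch only demonstrates the mathematical equivalence, and the matrices are made-up toy values.

```python
# Toy demonstration that fusing llama-style gate/up projections into one
# stacked matmul yields the same result as two separate matmuls.
import math

def matvec(W, x):
    """Matrix-vector product with W given as a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]

def silu(v):
    return v / (1.0 + math.exp(-v))

x = [1.0, -2.0, 0.5]
W_gate = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]   # two output units
W_up   = [[0.5, 0.0, -0.2], [0.3, 0.3, 0.1]]

# Unfused: two separate matrix-vector products.
unfused = [silu(g) * u for g, u in zip(matvec(W_gate, x), matvec(W_up, x))]

# Fused: one product against the stacked weight, then split the halves.
h = matvec(W_gate + W_up, x)
fused = [silu(g) * u for g, u in zip(h[:2], h[2:])]

assert all(abs(a - b) < 1e-12 for a, b in zip(unfused, fused))
```

One larger matmul generally makes better use of the GPU than two smaller ones, which is where the practical gain comes from.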
Below is the detailed change log:
- `device_map` by @PanQiWei in https://github.com/PanQiWei/AutoGPTQ/pull/80
- `model(tokens)` syntax by @TheBloke in https://github.com/PanQiWei/AutoGPTQ/pull/84
- `push_to_hub` by @TheBloke in https://github.com/PanQiWei/AutoGPTQ/pull/91
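Supporting the `model(tokens)` syntax boils down to a delegation pattern: the quantized-model wrapper forwards calls to the underlying model so the familiar Hugging Face calling convention keeps working. The sketch below illustrates that pattern only; the class and attribute names are hypothetical, not AutoGPTQ's actual internals.

```python
# Hypothetical sketch of __call__ delegation, so that wrapper(tokens)
# behaves like the wrapped model(tokens). Names are illustrative.

class InnerModel:
    def __call__(self, tokens):
        # Stand-in for a real forward pass.
        return {"logits": [float(t) for t in tokens]}

class QuantizedWrapper:
    def __init__(self, model):
        self.model = model

    def __call__(self, *args, **kwargs):
        # Delegate, preserving positional and keyword arguments.
        return self.model(*args, **kwargs)

wrapped = QuantizedWrapper(InnerModel())
out = wrapped([1, 2, 3])
assert out["logits"] == [1.0, 2.0, 3.0]
```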
The following are new contributors and their first PRs. Thank you very much for your love of `auto_gptq` and your contributions! ❤️
Full Changelog: https://github.com/PanQiWei/AutoGPTQ/compare/v0.1.0...v0.2.0
Full Changelog: https://github.com/PanQiWei/AutoGPTQ/compare/v0.0.5...v0.1.0
Full Changelog: https://github.com/PanQiWei/AutoGPTQ/compare/v0.0.4...v0.0.5
`triton` is officially supported starting from this version!
`pip install auto-gptq` is supported starting from this version!
Full Changelog: https://github.com/PanQiWei/AutoGPTQ/compare/v0.0.3...v0.0.4
- `eval_tasks` module to support evaluating a model's performance on predefined downstream tasks before and after quantization
- `LLaMa` model
- `position_ids`
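The before/after evaluation workflow that an eval-tasks module enables can be sketched with toy stand-ins: score a model on a fixed downstream task, quantize it, then score again to measure any degradation. Everything below (task, models, metric) is illustrative and is not the `eval_tasks` API.

```python
# Toy sketch of measuring quantization impact: same metric, same dataset,
# run once on the "full precision" model and once on the quantized one.

def accuracy(predict, dataset):
    """Fraction of (input, label) pairs the predictor gets right."""
    correct = sum(1 for x, y in dataset if predict(x) == y)
    return correct / len(dataset)

dataset = [(0.2, 0), (0.8, 1), (0.6, 1), (0.1, 0)]

def fp_model(x):
    # "Full precision" stand-in: threshold at 0.5.
    return int(x > 0.5)

def quantized_model(x):
    # Crude 1-bit stand-in for the same decision rule.
    return int(round(x))

before = accuracy(fp_model, dataset)
after = accuracy(quantized_model, dataset)
assert before == 1.0 and after == 1.0   # no degradation on this toy task
```

Comparing the two scores on the same predefined task is what makes the quantization cost quantifiable rather than anecdotal.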