An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.3...vv0.3.4
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.2...v0.3.3
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.1...v0.3.2
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.0...v0.3.1
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.9...v0.3.0
vllm decoder for model inference by @44670 in https://github.com/tatsu-lab/alpaca_eval/pull/124
completions_all and allow sequence of max_tokens by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/125
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.8...v0.2.9
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.7...v0.2.8
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.6...v0.2.7
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.5...v0.2.6
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.2.4...v0.2.5