Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.
Python assertions now expect a get_assert function that returns a native value, rather than parsing stdout (#594). This means instead of:
print(json.dumps(result))
You should just return the assertion result:
return result
Here's a full example of a custom_assert.py:
from typing import Any, Dict, Union

def get_assert(output, context) -> Union[bool, float, Dict[str, Any]]:
    # print is no longer parsed, so it can be used freely for debugging
    print('Prompt:', context['prompt'])
    print('Vars', context['vars']['topic'])
    # Determine the result...
    # (test_output stands in for your own grading logic)
    result = test_output(output)
    # Here's an example GradingResult dict
    result = {
        'pass': True,
        'score': 0.6,
        'reason': 'Looks good to me',
    }
    return result
See documentation
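To make the three return shapes in the signature above concrete, here is a minimal sketch of a get_assert that can be called locally before wiring it into an eval. The grading rule (checking that the output mentions the topic variable) is hypothetical and only stands in for real logic; the context keys mirror the example above.

```python
from typing import Any, Dict, Union


def get_assert(output: str, context: Dict[str, Any]) -> Union[bool, float, Dict[str, Any]]:
    # Hypothetical grading rule: pass when the output mentions the topic variable.
    topic = context["vars"]["topic"]
    mentioned = topic.lower() in output.lower()

    # A bare bool or float would also match the return type, e.g.:
    #   return mentioned
    #   return 1.0 if mentioned else 0.0

    # Returning a dict lets you attach a score and a reason.
    return {
        "pass": mentioned,
        "score": 1.0 if mentioned else 0.0,
        "reason": f"Output {'mentions' if mentioned else 'does not mention'} {topic!r}",
    }


# Local smoke test with a hand-built context.
sample = get_assert("Bananas are yellow", {"prompt": "Tell me about bananas", "vars": {"topic": "bananas"}})
```

Because the function returns a plain Python value, it can be exercised like this without capturing stdout.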
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.50.1...0.51.0
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.50.0...0.50.1
transform by @typpo in https://github.com/promptfoo/promptfoo/pull/605
NEXT_PUBLIC_PROMPTFOO_REMOTE_BASE_URL by @typpo in https://github.com/promptfoo/promptfoo/pull/609
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.49.3...0.50.0
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.49.2...0.49.3
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.49.1...0.49.2
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.49.0...0.49.1
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.48.0...0.49.0
When you run eval or view with 0.48.0, your saved evals will be migrated from .json files to a SQLite database. Please open an issue if you run into problems.
The existing .json files are in ~/.promptfoo/output. This directory is backed up at ~/.promptfoo/output-backup-*, and you can restore it and use a previous version by renaming that directory back to output.
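The restore step described above amounts to a single directory rename. The following is a minimal sketch of it in Python, not a promptfoo utility: the function name restore_output_backup is hypothetical, and picking the last backup in sorted name order is an assumption, since the notes do not say how the backup suffix is formed.

```python
from pathlib import Path
from typing import Optional


def restore_output_backup(promptfoo_dir: Path) -> Optional[Path]:
    """Rename an output-backup-* directory back to output.

    Returns the backup path that was restored, or None if no backup exists.
    """
    backups = sorted(p for p in promptfoo_dir.glob("output-backup-*") if p.is_dir())
    if not backups:
        return None

    # Assumption: the last name in sorted order is the backup you want.
    backup = backups[-1]
    output_dir = promptfoo_dir / "output"

    # Refuse to overwrite an existing output directory.
    if output_dir.exists():
        raise FileExistsError(f"{output_dir} already exists; move it aside first")

    backup.rename(output_dir)
    return backup


# Usage (not executed here):
#   restore_output_backup(Path.home() / ".promptfoo")
```

The function raises rather than overwriting, so an existing output directory has to be moved aside by hand first.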
__description field by @typpo in https://github.com/promptfoo/promptfoo/pull/556
max_tokens and seed by @typpo in https://github.com/promptfoo/promptfoo/pull/561
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.47.0...0.48.0
Multiline inline Python assertions don't rely on print statements anymore. You can simply return the value instead. print can be used for debugging.
If this breaks your multiline inline Python assertion, the fix is simple: replace statements like print(json.dumps(result)) with return result
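To illustrate the difference, here is a small sketch that contrasts the two styles. The two functions are hypothetical stand-ins for the body of an inline assertion, and the stdout capture only simulates how a print-parsing runner would have recovered the result; it is not promptfoo's actual implementation.

```python
import contextlib
import io
import json


def old_style_assert(output):
    # Before: the result was printed as JSON and parsed from stdout,
    # so any extra print() call would corrupt the parsed result.
    result = {"pass": "hello" in output, "score": 1.0, "reason": "checked for 'hello'"}
    print(json.dumps(result))


def new_style_assert(output):
    # After: print() is free for debugging, and the result is returned directly.
    print("Debug, output was:", output)
    result = {"pass": "hello" in output, "score": 1.0, "reason": "checked for 'hello'"}
    return result


# Simulate how a stdout-parsing runner would have recovered the old result.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    old_style_assert("hello world")
parsed_from_stdout = json.loads(buffer.getvalue())

returned = new_style_assert("hello world")
```

Both paths yield the same dict here, but only the returned version tolerates debug prints.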
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.46.0...0.47.0
Self-hosted instances no longer require the trailing /api in NEXT_PUBLIC_PROMPTFOO_REMOTE_API_BASE_URL. Here's an example build command:
docker build --build-arg NEXT_PUBLIC_PROMPTFOO_REMOTE_API_BASE_URL=http://localhost:3000 -t promptfoo-ui .
See further documentation here
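If you generate the build argument from an existing setting that still ends in /api, a small normalization step can help. This is a minimal sketch; strip_trailing_api is a hypothetical helper, not part of promptfoo.

```python
def strip_trailing_api(base_url: str) -> str:
    """Drop any trailing slashes, then a trailing '/api' segment if present."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/api"):
        trimmed = trimmed[: -len("/api")]
    return trimmed


# Example: an old-style value converted for the new build argument.
new_value = strip_trailing_api("http://localhost:3000/api")
```

The resulting value can then be passed as the NEXT_PUBLIC_PROMPTFOO_REMOTE_API_BASE_URL build argument shown above.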
Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.45.2...0.46.0