Promptfoo Versions

Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.

0.51.0

1 month ago

Breaking update to python custom assertions

Python assertions now expect a get_assert function which returns a native value, rather than parsing stdout (#594). This means instead of:

print(json.dumps(result))

You should just return the assertion result:

return result

Here's a full example of a custom_assert.py:

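from typing import Any, Dict, Union

def get_assert(output, context) -> Union[bool, float, Dict[str, Any]]:
    # stdout is no longer parsed, so print is safe to use for debugging
    print('Prompt:', context['prompt'])
    print('Vars', context['vars']['topic'])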

    # Determine the result...
    result = test_output(output)

    # Here's an example GradingResult dict
    result = {
      'pass': True,
      'score': 0.6,
      'reason': 'Looks good to me',
    }
    return result

See documentation
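Because the function now returns a native value, it can be exercised directly from Python before wiring it into an eval. The following is a minimal sketch, not promptfoo's own test tooling: the topic check inside get_assert and the sample_context dict are hypothetical, and the only context keys assumed are the 'prompt' and 'vars' keys used in the example above.

```python
from typing import Any, Dict, Union


def get_assert(output: str, context: Dict[str, Any]) -> Union[bool, float, Dict[str, Any]]:
    # Hypothetical check: pass when the output mentions the 'topic' variable.
    topic = context['vars']['topic']
    passed = topic.lower() in output.lower()
    return {
        'pass': passed,
        'score': 1.0 if passed else 0.0,
        'reason': f"Output {'mentions' if passed else 'does not mention'} {topic!r}",
    }


# Hand-built stand-in for the context passed to the assertion (assumption:
# only the 'prompt' and 'vars' keys are needed by this check).
sample_context = {
    'prompt': 'Write a haiku about {{topic}}',
    'vars': {'topic': 'bananas'},
}

good = get_assert('Yellow bananas ripen in the sun', sample_context)
bad = get_assert('Apples fall in autumn', sample_context)
print(good)
print(bad)
```

Calling the function this way only checks the return shape and logic; it does not replace running the assertion through promptfoo itself.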

What's Changed

chore: improve json parsing errors by @typpo in https://github.com/promptfoo/promptfoo/pull/620
feat: ability to override path to python binary by @typpo in https://github.com/promptfoo/promptfoo/pull/619
feat(webui): store settings in localstorage by @typpo in https://github.com/promptfoo/promptfoo/pull/617
feat(azureopenai): apiKeyEnvar support by @typpo in https://github.com/promptfoo/promptfoo/pull/628
Add documentation for openai vision by @CamdenClark in https://github.com/promptfoo/promptfoo/pull/637
Support claude vision and images by @CamdenClark in https://github.com/promptfoo/promptfoo/pull/639
fix(webui): ability to save defaultTest and evaluateOptions in yaml editor by @typpo in https://github.com/promptfoo/promptfoo/pull/629
fix: assertion files use relative path by @typpo in https://github.com/promptfoo/promptfoo/pull/624
feat: add provider reference to prompt function by @guilhermetk in https://github.com/promptfoo/promptfoo/pull/633
feat(webui): "progress" page that shows provider/prompt pairs by @typpo in https://github.com/promptfoo/promptfoo/pull/631
feat: ability to import vars using glob by @typpo in https://github.com/promptfoo/promptfoo/pull/641
feat!: return values directly in python assertions by @typpo in https://github.com/promptfoo/promptfoo/pull/638

New Contributors

@CamdenClark made their first contribution in https://github.com/promptfoo/promptfoo/pull/637
@guilhermetk made their first contribution in https://github.com/promptfoo/promptfoo/pull/633

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.50.1...0.51.0

0.50.1

1 month ago

What's Changed

fix: compiled esmodule interop by @typpo in https://github.com/promptfoo/promptfoo/pull/613
fix: downgrade var resolution failure to warning by @typpo in https://github.com/promptfoo/promptfoo/pull/614
fix: glob behavior on windows by @typpo in https://github.com/promptfoo/promptfoo/pull/612

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.50.0...0.50.1

0.50.0

1 month ago

What's Changed

feat(webui): download button by @typpo in https://github.com/promptfoo/promptfoo/pull/482
fix(selfhost): add support for prompts and datasets api endpoints by @typpo in https://github.com/promptfoo/promptfoo/pull/600
feat: support .mjs external imports by @typpo in https://github.com/promptfoo/promptfoo/pull/601
feat: load .env from cli by @typpo in https://github.com/promptfoo/promptfoo/pull/602
feat(webui): toggle for showing full prompt in output cell by @typpo in https://github.com/promptfoo/promptfoo/pull/603
feat: ability to use js files as transform by @typpo in https://github.com/promptfoo/promptfoo/pull/605
feat: ability to reference vars from other vars by @typpo in https://github.com/promptfoo/promptfoo/pull/607
fix: handling for nonscript assertion files by @typpo in https://github.com/promptfoo/promptfoo/pull/608
fix(selfhost): Consolidate to NEXT_PUBLIC_PROMPTFOO_REMOTE_BASE_URL by @typpo in https://github.com/promptfoo/promptfoo/pull/609

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.49.3...0.50.0

0.49.3

1 month ago

What's Changed

fix: bedrock model parsing by @typpo in https://github.com/promptfoo/promptfoo/pull/593
fix: make llm-rubric more resilient to bad json responses. https://github.com/promptfoo/promptfoo/issues/596
feat: display progress bar for each parallel execution by @typpo in https://github.com/promptfoo/promptfoo/pull/597

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.49.2...0.49.3

0.49.2

1 month ago

What's Changed

fix: support relative paths for custom providers by @typpo in https://github.com/promptfoo/promptfoo/pull/589
fix: gemini generationConfig and safetySettings by @typpo in https://github.com/promptfoo/promptfoo/pull/590
feat: cli watch for vars and providers by @typpo in https://github.com/promptfoo/promptfoo/pull/591

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.49.1...0.49.2

0.49.1

1 month ago

What's Changed

fix: lazy import of azure peer dependency by @typpo in https://github.com/promptfoo/promptfoo/pull/586

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.49.0...0.49.1

0.49.0

1 month ago

What's Changed

feat: Add support for huggingface token classification by @typpo in https://github.com/promptfoo/promptfoo/pull/574
feat: Mistral provider support for URL and API key envar by @jvert in https://github.com/promptfoo/promptfoo/pull/570
feat: run assertions in parallel by @typpo in https://github.com/promptfoo/promptfoo/pull/575
feat: support for azure openai assistants by @typpo in https://github.com/promptfoo/promptfoo/pull/577
feat(vertexai): use gcloud application default credentials by @typpo in https://github.com/promptfoo/promptfoo/pull/580
feat: ability to set tags on standalone assertion llm outputs by @typpo in https://github.com/promptfoo/promptfoo/pull/581
feat: add support for claude3 on bedrock by @typpo in https://github.com/promptfoo/promptfoo/pull/582
fix: load file before running prompt function by @typpo in https://github.com/promptfoo/promptfoo/pull/583
fix(selfhost): handle sqlite db in docker image and build by @typpo in https://github.com/promptfoo/promptfoo/pull/568
fix: broken ansi colors on cli table
fix: remove duplicate instruction output
chore: better error messages when expecting json but getting text by @typpo in https://github.com/promptfoo/promptfoo/pull/576
chore(deps): bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /site by @dependabot in https://github.com/promptfoo/promptfoo/pull/579

New Contributors

@jvert made their first contribution in https://github.com/promptfoo/promptfoo/pull/570

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.48.0...0.49.0

0.48.0

2 months ago

What's Changed

feat: migrate filesystem storage to sqlite db by @typpo in https://github.com/promptfoo/promptfoo/pull/558
- When you first run eval or view with 0.48.0, your saved evals will be migrated from .json files to a sqlite db. Please open an issue if you run into problems.
- Restoration: By default, the migration runs on the promptfoo output directory, ~/.promptfoo/output. That directory is backed up to ~/.promptfoo/output-backup-*. To return to a previous version, rename the backup directory back to output.
feat: Add anthropic:messages and replicate:mistral as default providers to web ui by @matt-hendrick in https://github.com/promptfoo/promptfoo/pull/562
feat(csv): add support for __description field by @typpo in https://github.com/promptfoo/promptfoo/pull/556
feat: add label field to provider options by @typpo in https://github.com/promptfoo/promptfoo/pull/563
fix(azureopenai): add support for max_tokens and seed by @typpo in https://github.com/promptfoo/promptfoo/pull/561
docs: adjust configuration for python provider by @romaintoub in https://github.com/promptfoo/promptfoo/pull/565
chore: db migration and cleanup by @typpo in https://github.com/promptfoo/promptfoo/pull/564
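The restoration step described in the sqlite migration note above amounts to a directory rename. Here is a rough sketch of that rename in Python, demonstrated on a temporary directory rather than a real ~/.promptfoo. The function name restore_output_backup and the output-migrated directory name are made up for illustration, and picking the "newest" backup by sorting names assumes the backup suffix sorts chronologically, which the release note does not state.

```python
import tempfile
from pathlib import Path
from typing import Optional


def restore_output_backup(promptfoo_dir: Path) -> Optional[Path]:
    """Rename the last output-backup-* directory (by name) back to output.

    Returns the backup path that was restored, or None if none was found.
    """
    backups = sorted(p for p in promptfoo_dir.glob('output-backup-*') if p.is_dir())
    if not backups:
        return None
    newest = backups[-1]  # assumption: suffix sorts chronologically
    output = promptfoo_dir / 'output'
    if output.exists():
        # Set the current directory aside instead of deleting it.
        # 'output-migrated' is a hypothetical name chosen for this sketch.
        output.rename(promptfoo_dir / 'output-migrated')
    newest.rename(output)
    return newest


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / 'output').mkdir()
    (base / 'output-backup-1').mkdir()
    (base / 'output-backup-1' / 'eval.json').write_text('{}')

    restored = restore_output_backup(base)
    print('Restored from:', restored.name)
    print('Eval file present:', (base / 'output' / 'eval.json').exists())
```

To apply it for real you would pass Path.home() / '.promptfoo' and then install a version older than 0.48.0; inspect the directory by hand first, since the sketch does not handle an existing output-migrated directory.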

New Contributors

@matt-hendrick made their first contribution in https://github.com/promptfoo/promptfoo/pull/562

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.47.0...0.48.0

0.47.0

2 months ago

Breaking

Multiline inline python assertions no longer rely on print statements. Return the value instead; print can still be used for debugging.

If this breaks your multiline inline python assertion, the fix is simple: replace statements like print(json.dumps(result)) with return result
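To make the difference concrete, the sketch below contrasts the two styles as plain Python functions. It is an illustration only: old_style, new_style, and the stdout-capturing "runner" are hypothetical stand-ins, and it assumes the multiline body behaves like a function body (which is what makes return valid).

```python
import contextlib
import io
import json


def old_style(output):
    # Before 0.47.0: the result had to be the JSON printed to stdout.
    result = {'pass': 'banana' in output, 'score': 0.5, 'reason': 'old style'}
    print(json.dumps(result))


def new_style(output):
    # 0.47.0 and later: print is only for debugging; the result is returned.
    print('debug: checking', output)
    result = {'pass': 'banana' in output, 'score': 0.5, 'reason': 'new style'}
    return result


# Emulate a runner that parses stdout, as the old style required.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    old_style('a banana split')
parsed = json.loads(buffer.getvalue())

# The new style needs no parsing, and debug prints cannot corrupt the result.
returned = new_style('a banana split')
print(parsed['pass'], returned['pass'])
```

Note that a stray debug print in old_style would have broken the JSON parsing step, which is the practical benefit of returning the value.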

What's Changed

feat: improve python inline asserts to not require printing by @typpo in https://github.com/promptfoo/promptfoo/pull/542
feat: add tools and tool_choice config parameters to azure openai provider by @heartyguy in https://github.com/promptfoo/promptfoo/pull/550
feat: Add support for Claude 3 Haiku by @streichsbaer in https://github.com/promptfoo/promptfoo/pull/552
fix(replicate): support non-array outputs by @typpo in https://github.com/promptfoo/promptfoo/pull/547
fix: validate custom js function return values by @typpo in https://github.com/promptfoo/promptfoo/pull/548
fix: dedupe prompts from combined configs by @typpo in https://github.com/promptfoo/promptfoo/pull/554

New Contributors

@heartyguy made their first contribution in https://github.com/promptfoo/promptfoo/pull/550

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.46.0...0.47.0

0.46.0

2 months ago

Breaking

Self-hosted instances no longer require the trailing /api in NEXT_PUBLIC_PROMPTFOO_REMOTE_API_BASE_URL. Here's an example build command:

docker build --build-arg NEXT_PUBLIC_PROMPTFOO_REMOTE_API_BASE_URL=http://localhost:3000 -t promptfoo-ui .

See further documentation here
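If existing build scripts still carry a value ending in /api, the suffix can be trimmed before it is passed as the build argument. A small sketch follows; the helper name strip_trailing_api is hypothetical and not part of promptfoo.

```python
def strip_trailing_api(base_url: str) -> str:
    """Drop a trailing '/api' path segment (and any trailing slashes)."""
    trimmed = base_url.rstrip('/')
    if trimmed.endswith('/api'):
        trimmed = trimmed[: -len('/api')]
    return trimmed


print(strip_trailing_api('http://localhost:3000/api'))  # http://localhost:3000
print(strip_trailing_api('http://localhost:3000'))      # http://localhost:3000
```

The result is what would be supplied as NEXT_PUBLIC_PROMPTFOO_REMOTE_API_BASE_URL in the docker build command above.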

What's Changed

feat(self-host): run evals via web ui by @typpo in https://github.com/promptfoo/promptfoo/pull/540
feat(self-host): Persist changes on self-deployed UI without sharing a new link by @typpo in https://github.com/promptfoo/promptfoo/pull/538
feat: add support for calling specific functions for python prompt by @typpo in https://github.com/promptfoo/promptfoo/pull/533
feat(webui): ability to change eval name by @typpo in https://github.com/promptfoo/promptfoo/pull/537
fix(anthropic): wrap text if prompt supplied as json by @typpo in https://github.com/promptfoo/promptfoo/pull/536
fix: openai tools and function checks handle plaintext responses by @typpo in https://github.com/promptfoo/promptfoo/pull/541

Full Changelog: https://github.com/promptfoo/promptfoo/compare/0.45.2...0.46.0