:mag: LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
Today, we’ve released the stable version of Haystack 2.0. This is ultimately a rewrite of the Haystack framework, so these release notes are not what you’d usually expect to see in regular release notes where we highlight specific changes to the codebase. Instead, we will highlight features of Haystack 2.0 and how it’s meant to be used.
To read more about our motivation for Haystack 2.0 and the design choices behind it, you can read our release announcement article.
To get started with Haystack, follow our quick start guide.
Haystack 2.0 is distributed with `haystack-ai`, while Haystack 1.x will continue to be supported with `farm-haystack`, receiving security updates and bug fixes.

NOTE: Installing `haystack-ai` and `farm-haystack` into the same Python environment will lead to conflicts. Please use separate virtual environments for each package.

Check out the installation guide for more information.
In Haystack 2.0, pipelines are dynamic computation graphs that support branching and loops.
Pipelines can be built with a few easy steps:

1. Create the `Pipeline` object.
2. Add components to the pipeline with the `add_component()` method.
3. Connect the components with the `connect()` method. Trying to connect components that are not compatible in type will raise an error.
4. Run the pipeline with the `run()` method.

The following pipeline does question answering on a given URL:
import os
from haystack import Pipeline
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret
os.environ["OPENAI_API_KEY"] = "Your OpenAI API Key"
fetcher = LinkContentFetcher()
converter = HTMLToDocument()
prompt_template = """
According to the contents of this website:
{% for document in documents %}
{{document.content}}
{% endfor %}
Answer the given question: {{query}}
Answer:
"""
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"))
pipeline = Pipeline()
pipeline.add_component("fetcher", fetcher)
pipeline.add_component("converter", converter)
pipeline.add_component("prompt", prompt_builder)
pipeline.add_component("llm", llm)
# pass the fetcher's `streams` output to the converter using the `sources` parameter
pipeline.connect("fetcher.streams", "converter.sources")
# pass the converted `documents` to the `prompt_builder` using the `documents` parameter
pipeline.connect("converter.documents", "prompt.documents")
# pass the interpolated `prompt` to the `llm` using the `prompt` parameter
pipeline.connect("prompt.prompt", "llm.prompt")
pipeline.run({"fetcher": {"urls": ["https://haystack.deepset.ai/overview/quick-start"]},
"prompt": {"query": "How should I install Haystack?"}})
print(result["llm"]["replies"][0])
Previously known as Nodes, components have been formalized with well-defined inputs and outputs that allow for easy extensibility and composability.
Haystack 2.0 provides a diverse selection of built-in components. Here’s a non-exhaustive overview:
| Category | Description | External Providers & Integrations |
| --- | --- | --- |
| Audio Transcriber | Transcribe audio to text | OpenAI |
| Builders | Build prompts and answers from templates | |
| Classifiers | Classify documents based on specific criteria | |
| Connectors | Interface with external services | OpenAPI |
| Converters | Convert data between different formats | Azure, Tika, Unstructured, PyPDF, OpenAPI, Jinja |
| Embedders | Transform texts and documents to vector representations | Amazon Bedrock, Azure, Cohere, FastEmbed, Gradient, Hugging Face (Optimum, Sentence Transformers, Text Embedding Inference), Instructor, Jina, Mistral, Nvidia, Ollama, OpenAI |
| Extractors | Extract information from documents | Hugging Face, spaCy |
| Evaluators | Evaluate components using metrics | Ragas, DeepEval, UpTrain |
| Fetcher | Fetch data from remote URLs | |
| Generators | Prompt and generate text using generative models | Amazon Bedrock, Amazon Sagemaker, Azure, Cohere, Google AI, Google Vertex, Gradient, Hugging Face, Llama.cpp, Mistral, Nvidia, Ollama, OpenAI |
| Joiners | Combine documents from different components | |
| Preprocessors | Preprocess text and documents | |
| Rankers | Sort documents based on specific criteria | Hugging Face |
| Readers | Find answers in documents | |
| Retrievers | Fetch documents from a document store based on a query | Astra, Chroma, Elasticsearch, MongoDB Atlas, OpenSearch, Pgvector, Pinecone, Qdrant, Weaviate |
| Routers | Manipulate pipeline control flow | |
| Validators | Validate data based on schemas | |
| Web Search | Perform search queries | SearchApi, SerperDev |
| Writers | Write data into data sources | |
If Haystack lacks a functionality that you need, you can easily create your own component and slot it into a pipeline. Broadly speaking, writing a custom component requires:

1. Creating a class decorated with the `@component` decorator.
2. Implementing a `run()` method. The parameters passed to this method double as the component's inputs.
3. Declaring the outputs by decorating the `run()` method with a `@component.output_types()` decorator.

Below is an example of a toy Embedder component that receives a `text` input and returns a random vector representation as `embedding`.
import random
from typing import List
from haystack import component, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
@component
class MyEmbedder:
    def __init__(self, dim: int = 128):
        self.dim = dim

    @component.output_types(embedding=List[float])
    def run(self, text: str):
        print(f"Random embedding for text: {text}")
        embedding = [random.uniform(-1.0, 1.0) for _ in range(self.dim)]
        return {"embedding": embedding}

# Using the component directly
my_embedder = MyEmbedder()
my_embedder.run(text="Hi, my name is Tuana")

# Using the component in a pipeline
document_store = InMemoryDocumentStore()
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", MyEmbedder())
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
query_pipeline.run({"text_embedder": {"text": "Who lives in Berlin?"}})
Haystack 2.0 offers ready-made pipeline templates for common use cases, which can be created with just a single line of code.
from haystack import Pipeline, PredefinedPipeline
pipeline = Pipeline.from_template(PredefinedPipeline.CHAT_WITH_WEBSITE)
# and then you can run this pipeline 👇
# pipeline.run({
# "fetcher": {"urls": ["https://haystack.deepset.ai/overview/quick-start"]},
# "prompt": {"query": "How should I install Haystack?"}}
# )
In Haystack 2.0, Document Stores provide a common interface through which pipeline components can read and manipulate data without any knowledge of the backend technology. Furthermore, Document Stores are paired with specialized retriever components that can be used to fetch documents from a particular data source based on specific queries.
This separation of interface and implementation lets us provide support for several third-party providers of vector databases such as Weaviate, Chroma, Pinecone, Astra DB, MongoDB, Qdrant, Pgvector, Elasticsearch, OpenSearch, Neo4j and Marqo.
# pip install chroma-haystack
from haystack_integrations.document_stores.chroma import ChromaDocumentStore
from haystack_integrations.components.retrievers.chroma import ChromaEmbeddingRetriever
document_store = ChromaDocumentStore()
retriever = ChromaEmbeddingRetriever(document_store)
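As a minimal sketch of how these pieces fit together: documents written to the store are embedded by Chroma's default embedding function, and the retriever fetches the most similar ones given a query embedding. The placeholder embedding below assumes the default function's 384 dimensions; in a real pipeline it would come from an embedder component.

from haystack import Document

# Write a document, then fetch the most similar ones back.
document_store.write_documents([Document(content="Haystack is an LLM orchestration framework.")])
result = retriever.run(query_embedding=[0.1] * 384)  # stand-in for a real query embedding
print(result["documents"])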
Thanks to Haystack 2.0’s flexible infrastructure, pipelines can be easily extended with external technologies and libraries in the form of new components, document stores, and so on, all while keeping dependencies cleanly separated.
Starting with 2.0, integrations are divided into two categories: core integrations, maintained in the `haystack-core-integrations` GitHub repository, and external integrations, maintained by their authors. Please refer to the official integrations website for more information.
Monitoring of Haystack 2.0 pipelines in production is aided by a customizable logging system that supports structured logging and tracing correlation out of the box, as well as code instrumentation that collects spans and traces at strategic points of the execution path, with support for OpenTelemetry and Datadog already in place.
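For example, here is a minimal sketch of enabling OpenTelemetry-backed tracing, assuming an OpenTelemetry SDK is installed and configured in your application:

from opentelemetry import trace
from haystack import tracing
from haystack.tracing import OpenTelemetryTracer

# Register an OpenTelemetry tracer with Haystack; pipeline and component
# runs are then recorded as spans.
tracing.enable_tracing(OpenTelemetryTracer(trace.get_tracer("haystack")))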
Haystack 2.0 provides a framework-agnostic system of addressing and using devices such as GPUs and accelerators across different platforms and providers.
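As a minimal sketch, addressing the first GPU looks the same regardless of the underlying framework:

from haystack.utils import ComponentDevice, Device
from haystack.components.embedders import SentenceTransformersDocumentEmbedder

# Pin the component to the first GPU;
# ComponentDevice.from_str("cuda:0") is an equivalent shorthand.
device = ComponentDevice.from_single(Device.gpu(id=0))
embedder = SentenceTransformersDocumentEmbedder(device=device)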
To securely manage credentials for services that require authentication, Haystack 2.0 provides a type-safe approach to handle authentication and API secrets that prevents accidental leaks.
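For example, a key can be resolved from an environment variable at runtime instead of being hard-coded:

from haystack.utils import Secret
from haystack.components.generators import OpenAIGenerator

# The secret is read from the environment when needed and is never
# serialized to disk together with the pipeline.
llm = OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"))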
Haystack 2.0 prompt templating uses Jinja, and prompts are included in pipelines with the use of a `PromptBuilder` (or `DynamicPromptBuilder` for advanced use cases). Everything in `{{ }}` in a prompt becomes an input to the `PromptBuilder`.
The following `prompt_builder` will expect `documents` and `query` as input:
from haystack.components.builders import PromptBuilder
template = """Given these documents, answer the question.
Documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{query}}
Answer:"""
prompt_builder = PromptBuilder(template=template)
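Running the builder directly shows how the template variables map to inputs; a minimal sketch with an illustrative document:

from haystack import Document

result = prompt_builder.run(
    documents=[Document(content="Berlin is the capital of Germany.")],
    query="What is the capital of Germany?",
)
print(result["prompt"])  # the rendered prompt string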
Alongside Haystack 2.0, today we are also releasing a whole set of new tutorials, documentation, resources and more to help you get started:
Stay up-to-date with Haystack:
Follow the progress we made during beta in each beta release:
Add raise_on_failure flag to BaseConverter class so that big processes can optionally continue without breaking from exceptions.
Upgrade Transformers to the latest version 4.37.2. This version adds support for the Phi-2 and Qwen2 models and improves support for quantization.
Add support for latest OpenAI embedding models text-embedding-3-large and text-embedding-3-small.
API_BASE can now be passed as an optional parameter in the getting_started sample. Only the openai provider is supported in this set of changes. PromptNode and PromptModel were enhanced to allow passing this parameter. This allows RAG against a local endpoint (e.g., http://localhost:1234/v1), as long as it is OpenAI-compatible (such as LM Studio).
Logging in the getting started sample was made more verbose, to make it easier for people to see what was happening under the covers.
Added new option split_by="page" to the preprocessor so we can chunk documents by page break.
Introducing a flexible and dynamic approach to creating NLP pipelines with Haystack's new `PipelineTemplate` class!
This innovative feature utilizes Jinja-templated YAML files, allowing users to effortlessly construct and customize complex data processing pipelines for various NLP tasks. From question answering and document indexing to custom pipeline requirements, the `PipelineTemplate` simplifies configuration and enhances adaptability. Users can now easily override default components or integrate custom settings with simple, straightforward code.
For example, the following pipeline template can be used to create an indexing pipeline:
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.templates import PipelineTemplate, PipelineType
pt = PipelineTemplate(PipelineType.INDEXING, template_params={"use_pdf_file_converter": True})
pt.override("embedder", SentenceTransformersDocumentEmbedder(progress_bar=True))
pipe = pt.build()
result = pipe.run(data={"sources": ["some_local_dir/and_text_file.txt", "some_other_local_dir/and_pdf_file.pdf"]})
print(result)
In the above example, the `PipelineType.INDEXING` enum is used to create a pipeline with a custom instance of `SentenceTransformersDocumentEmbedder` and the PDF file converter enabled.
The pipeline is then run on a list of local files, and the result (the number of indexed documents) is printed. We could, of course, have used the same `PipelineTemplate` class to create any other predefined pipeline, or even a custom pipeline with custom components and settings. Likewise, the following pipeline template can be used to create a predefined RAG pipeline:
from haystack.templates import PipelineTemplate, PipelineType
pipe = PipelineTemplate(PipelineType.RAG).build()
result = pipe.run(query="What's the meaning of life?")
print(result)
`_templateSource` loads template content from various inputs, including strings, files, predefined templates, and URLs. The class provides mechanisms to load templates dynamically and ensure they contain valid Jinja2 syntax.
Adopt the new framework-agnostic device management in Sentence Transformers Embedders.
Before this change:
from haystack.components.embedders import SentenceTransformersTextEmbedder
embedder = SentenceTransformersTextEmbedder(device="cuda:0")
After this change:
from haystack.utils.device import ComponentDevice, Device
from haystack.components.embedders import SentenceTransformersTextEmbedder
device = ComponentDevice.from_single(Device.gpu(id=0))
# or: device = ComponentDevice.from_str("cuda:0")
embedder = SentenceTransformersTextEmbedder(device=device)
Adopt the new framework-agnostic device management in Local Whisper Transcriber.
Before this change:
from haystack.components.audio import LocalWhisperTranscriber
transcriber = LocalWhisperTranscriber(device="cuda:0")
After this change:
from haystack.utils.device import ComponentDevice, Device
from haystack.components.audio import LocalWhisperTranscriber

device = ComponentDevice.from_single(Device.gpu(id=0))
# or: device = ComponentDevice.from_str("cuda:0")
transcriber = LocalWhisperTranscriber(device=device)
Add `FilterRetriever`. It retrieves documents that match the provided filters, which can be supplied either at init or at runtime.
Add `LostInTheMiddleRanker`. It reorders documents based on the "Lost in the Middle" order, a strategy that places the most relevant paragraphs at the beginning or end of the context, while less relevant paragraphs are positioned in the middle.
Add support for the Mean Reciprocal Rank (MRR) metric to `StatisticalEvaluator`. MRR measures the mean reciprocal rank of the first prediction that matches a label.
Introducing the OutputAdapter component which enables seamless data flow between pipeline components by adapting the output of one component to match the expected input of another using Jinja2 template expressions. This addition opens the door to greater flexibility in pipeline configurations, facilitating custom adaptation rules and exemplifying a structured approach to inter-component communication.
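A minimal sketch of the idea, with an illustrative template and values:

from haystack import Document
from haystack.components.converters import OutputAdapter

# Adapt a list-of-documents output into the plain string a downstream
# component expects.
adapter = OutputAdapter(template="{{ documents[0].content }}", output_type=str)
result = adapter.run(documents=[Document(content="hello")])
print(result["output"])  # "hello"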
Add `is_greedy` argument to the `@component` decorator. This flag changes the behaviour of `Component`s with inputs of a `Variadic` type when running inside a `Pipeline`.
Variadic `Component`s that are marked as greedy will run as soon as they receive their first input. If not marked as greedy, they'll instead wait as long as possible before running, to make sure they receive as many inputs as possible from their senders.
The flag is ignored for all other `Component`s, even if set explicitly.
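A minimal sketch of a greedy variadic component (the class below is illustrative, not part of the release):

from haystack import component
from haystack.core.component.types import Variadic

@component(is_greedy=True)
class FirstInputWins:
    """Runs as soon as the first of its variadic inputs arrives."""

    @component.output_types(value=int)
    def run(self, values: Variadic[int]):
        # `values` is an iterable of all inputs received so far
        return {"value": list(values)[0]}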
Remove the old evaluation API in favor of a Component-based API. `SASEvaluator` and `StatisticalEvaluator` now replace the old API.
Introduced `JsonSchemaValidator` to validate the JSON content of a `ChatMessage` against a provided JSON schema. Valid messages are emitted through the `validated` output, while messages failing validation are sent via the `validation_error` output, along with useful error details for troubleshooting.
Add a new parameter called `meta_value_type` to the `MetaFieldRanker` that parses the meta value into the specified data type, as long as the meta value is a string. The supported values for `meta_value_type` are `"float"`, `"int"`, `"date"`, or `None`. If `None` is passed, no parsing is done. For example, with `meta_value_type="date"`, the meta value `"date": "2015-02-01"` is parsed into a `datetime` object.
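A minimal sketch (the field name is illustrative):

from haystack.components.rankers import MetaFieldRanker

# Rank documents by their "date" meta field, parsing the string values
# into datetime objects before comparison.
ranker = MetaFieldRanker(meta_field="date", meta_value_type="date")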
Add `TextCleaner` component to clean lists of strings. It can remove substrings matching a list of regular expressions, convert text to lowercase, remove punctuation, and remove numbers. This is mostly useful for cleaning generator predictions before evaluation.
pipeline.connect("fetcher", "converter").connect("converter", "splitter").connect("splitter", "ranker")\
.connect("ranker", "prompt_builder").connect("prompt_builder", "llm")
Upgraded the default converter in `PyPDFToDocument` to insert page breaks (`"\f"`) between each extracted page. This allows downstream components and applications to better keep track of the original PDF page a portion of text comes from.
⚠️ Breaking change: Update secret handling for components using the `Secret` type. The following components are affected: `RemoteWhisperTranscriber`, `AzureOCRDocumentConverter`, `AzureOpenAIDocumentEmbedder`, `AzureOpenAITextEmbedder`, `HuggingFaceTEIDocumentEmbedder`, `HuggingFaceTEITextEmbedder`, `OpenAIDocumentEmbedder`, `SentenceTransformersDocumentEmbedder`, `SentenceTransformersTextEmbedder`, `AzureOpenAIGenerator`, `AzureOpenAIChatGenerator`, `HuggingFaceLocalChatGenerator`, `HuggingFaceTGIChatGenerator`, `OpenAIChatGenerator`, `HuggingFaceLocalGenerator`, `HuggingFaceTGIGenerator`, `OpenAIGenerator`, `TransformersSimilarityRanker`, `SearchApiWebSearch`, `SerperDevWebSearch`.

The default init parameters `api_key`, `token`, and `azure_ad_token` have been adjusted to use environment variables wherever possible. The `azure_ad_token_provider` parameter has been removed from Azure-based components. Components based on Hugging Face are now required to use either a token or an environment variable if authentication is required; the on-disk local token file is no longer supported.

Required actions to take: to accommodate this breaking change, check the expected environment variable name for the `api_key` of the affected component you are using, and make sure to provide your API keys via that environment variable. Alternatively, if that's not an option, use the `Secret.from_token` function to wrap any bare string API tokens. Mind that pipelines using token secrets cannot be serialized/deserialized.
Expose a `Secret` type to provide a consistent API for any component that requires secrets for authentication. It currently supports string tokens and environment variables. Token-based secrets are automatically prevented from being serialized to disk, to prevent accidental leakage.
from typing import Optional

from haystack import component, default_from_dict, default_to_dict
from haystack.utils import Secret

@component
class MyComponent:
    def __init__(self, api_key: Optional[Secret] = None, **kwargs):
        self.api_key = api_key
        self.backend = None

    def warm_up(self):
        # Call resolve_value to yield a single result. The semantics of the result is policy-dependent.
        # Currently, all supported policies will return a single string token.
        self.backend = SomeBackend(api_key=self.api_key.resolve_value() if self.api_key else None, ...)

    def to_dict(self):
        # Serialize the policy like any other (custom) data. If the policy is token-based, it will
        # raise an error.
        return default_to_dict(self, api_key=self.api_key.to_dict() if self.api_key else None, ...)

    @classmethod
    def from_dict(cls, data):
        # Deserialize the policy data before passing it to the generic from_dict function.
        api_key_data = data["init_parameters"]["api_key"]
        api_key = Secret.from_dict(api_key_data) if api_key_data is not None else None
        data["init_parameters"]["api_key"] = api_key
        return default_from_dict(cls, data)

# No authentication
component = MyComponent(api_key=None)

# Token-based authentication
component = MyComponent(api_key=Secret.from_token("sk-randomAPIkeyasdsa32ekasd32e"))
component.to_dict()  # Error! Can't serialize authentication tokens

# Environment variable-based authentication
component = MyComponent(api_key=Secret.from_env_var("OPENAI_API_KEY"))
component.to_dict()  # This is fine
Adds support for the Exact Match metric to `EvaluationResult.calculate_metrics(...)`:

from haystack.evaluation.metrics import Metric

exact_match_metric = eval_result.calculate_metrics(Metric.EM, output_key="answers")

Adds support for the F1 metric to `EvaluationResult.calculate_metrics(...)`:

from haystack.evaluation.metrics import Metric

f1_metric = eval_result.calculate_metrics(Metric.F1, output_key="answers")

Adds support for the Semantic Answer Similarity (SAS) metric to `EvaluationResult.calculate_metrics(...)`:

from haystack.evaluation.metrics import Metric

sas_metric = eval_result.calculate_metrics(
    Metric.SAS, output_key="answers", model="sentence-transformers/paraphrase-multilingual-mpnet-base-v2"
)
Introducing the `HuggingFaceLocalChatGenerator`, a new chat-based generator designed for leveraging chat models from Hugging Face's (HF) model hub. Users can now perform inference with chat-based models in a local runtime, utilizing familiar HF generation parameters, stop words, and even employing custom chat templates for custom message formatting. This component also supports streaming responses and is optimized for compatibility with a variety of devices.
Here is an example of how to use the `HuggingFaceLocalChatGenerator`:
from haystack.components.generators.chat import HuggingFaceLocalChatGenerator
from haystack.dataclasses import ChatMessage
generator = HuggingFaceLocalChatGenerator(model="HuggingFaceH4/zephyr-7b-beta")
generator.warm_up()
messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))
`Pipeline.add_component()` now fails if the `Component` instance has already been added in another `Pipeline`.
Add support for `device_map` when loading a `TransformersSimilarityRanker` and `ExtractiveReader`. This allows for multi-device inference and for loading quantized models (e.g. `load_in_8bit=True`).
Updates to `ByteStream.from_file_path()` and `ByteStream.from_string()`.
Updates to `TransformersSimilarityRanker`.
The name `default_streaming_callback` was confusing: this function was the go-to helper one would use to quickly print generated tokens as they arrive, but it was not used by default. The function has therefore been renamed to `print_streaming_chunk`.
Haystack previously imported the whole `pandas` and `numpy` packages. This has now been changed to import only the necessary classes and functions.
Updated `DocumentJoiner`'s reciprocal rank fusion, enhancing the relevance of document sorting by allowing customizable influence on the final scores.
Custom `Component`s now require only `haystack-ai` to be installed.
`__canals_input__` and `__canals_output__` have been renamed respectively to `__haystack_input__` and `__haystack_output__`. `CANALS_VARIADIC_ANNOTATION` has been renamed to `HAYSTACK_VARIADIC_ANNOTATION` and its value changed from `__canals__variadic_t` to `__haystack__variadic_t`. The default `Pipeline` `debug_path` has been changed from `.canals_debug` to `.haystack_debug`.
.You can now use Titan and Cohere embedding models in your pipelines via the Amazon Bedrock integration.
from haystack.nodes import EmbeddingRetriever

retriever = EmbeddingRetriever(
    embedding_model="amazon.titan-embed-text-v1",
    document_store=document_store,
    aws_config={"aws_access_key_id": "ACCESS_KEY",
                "aws_secret_access_key": "SECRET_KEY",
                "aws_session_token": "SESSION_TOKEN"})
The `WebDriver` that powers Haystack's crawler is no longer limited to Chrome. Now you can configure it to use whatever `WebDriver` you'd like. See our Crawler docs for more info.