Haystack Versions

:mag: LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

v1.25.5

6 days ago

Release Notes

v1.25.5

🐛 Bug Fixes

  • Fixed a pipeline run error when using the FileTypeClassifier with the raise_on_error: True option. Instead of returning an unexpected NoneType, we route the file to a dead-end edge.

v1.25.4

1 week ago

Release Notes

v1.25.4

🐛 Bug Fixes

v1.25.3

1 week ago

Release Notes

v1.25.3

⚡️ Enhancement Notes

  • Support for Llama3 models on AWS Bedrock.
  • Support for MistralAI and new Claude 3 models on AWS Bedrock.
  • Upgrade transformers to version 4.39.3 so that Haystack can support the new Cohere Command R models.

🐛 Bug Fixes

  • Fixes SearchEngineDocumentStore.get_metadata_values_by_key method to make use of self.index if no index is provided.

  • When using a Pipeline with a JoinNode (e.g. JoinDocuments) all information from the previous nodes was lost other than a few select fields (e.g. documents). This was due to the JoinNode not properly passing on the information from the previous nodes. This has been fixed and now all information from the previous nodes is passed on to the next node in the pipeline.

    For example, consider a pipeline that rewrites the query during execution and uses a hybrid retrieval setup that requires a JoinDocuments node. The first prompt node rewrites the query to fix all spelling errors, and this rewritten query is used for retrieval. The JoinDocuments node now passes the rewritten query on so that the QAPromptNode can use it, whereas before it passed on the original query.
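The join behaviour described above can be sketched in plain Python. This is only an illustration, not the Haystack 1.x implementation: the `join_outputs` helper and the dictionary keys are hypothetical. It concatenates the `documents` of every incoming branch and carries all other fields (such as a rewritten `query`) forward instead of dropping them.

```python
from typing import Any, Dict, List


def join_outputs(inputs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge branch outputs: concatenate documents, keep every other field."""
    merged: Dict[str, Any] = {"documents": []}
    for branch in inputs:
        for key, value in branch.items():
            if key == "documents":
                merged["documents"].extend(value)
            else:
                # Later branches win on conflicting keys; nothing is dropped.
                merged[key] = value
    return merged


# Two retrieval branches that both received the rewritten query (hypothetical data).
bm25_branch = {"query": "what is haystack", "documents": ["doc-a"]}
dense_branch = {"query": "what is haystack", "documents": ["doc-b"], "debug": {"top_k": 1}}

joined = join_outputs([bm25_branch, dense_branch])
print(joined["query"])      # the rewritten query survives the join
print(joined["documents"])  # documents from both branches
```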

v2.0.1

3 weeks ago

Release Notes

v2.0.1

⬆️ Upgrade Notes

  • The HuggingFaceTGIGenerator and HuggingFaceTGIChatGenerator components have been modified to be compatible with huggingface_hub>=0.22.0.

    If you use these components, you may need to upgrade the huggingface_hub library. To do this, run the following command in your environment: pip install "huggingface_hub>=0.22.0"

🚀 New Features

  • Adds streaming_callback parameter to HuggingFaceLocalGenerator, allowing users to handle streaming responses.
  • Introduce a new SparseEmbedding class which can be used to store a sparse vector representation of a Document. It will be instrumental to support Sparse Embedding Retrieval with the subsequent introduction of Sparse Embedders and Sparse Embedding Retrievers.
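To make the idea of a sparse vector representation concrete, here is a minimal sketch in plain Python. It assumes a sparse vector is stored as parallel lists of non-zero positions and their values; the `SparseVector` class and its `dot` method are illustrative names, not the Haystack class.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SparseVector:
    """Hypothetical stand-in: store only the non-zero positions and their values."""
    indices: List[int]
    values: List[float]

    def __post_init__(self) -> None:
        if len(self.indices) != len(self.values):
            raise ValueError("indices and values must have the same length")

    def dot(self, other: "SparseVector") -> float:
        """Dot product over the positions both vectors share."""
        lookup = dict(zip(other.indices, other.values))
        return sum(v * lookup.get(i, 0.0) for i, v in zip(self.indices, self.values))


doc_vec = SparseVector(indices=[0, 7, 42], values=[0.5, 1.0, 2.0])
query_vec = SparseVector(indices=[7, 42, 99], values=[2.0, 0.5, 3.0])
score = doc_vec.dot(query_vec)  # only positions 7 and 42 overlap
print(score)
```

Storing only the non-zero entries keeps very high-dimensional vectors (for example, vocabulary-sized ones) small, and a retriever can score documents by the overlap of positions alone.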

⚡️ Enhancement Notes

  • Set max_new_tokens default to 512 in Hugging Face generators.

  • In Jupyter notebooks, the image of the Pipeline will no longer be displayed automatically. The textual representation of the Pipeline will be displayed.

    To display the Pipeline image, use the show method of the Pipeline object.

🐛 Bug Fixes

  • The test_comparison_in test case in the base document store tests used to always pass, no matter how the in filtering logic was implemented in document stores. With the fix, the in logic is actually tested. Some tests might start to fail for document stores that don't implement the in filter correctly.
  • Put HFTokenStreamingHandler in a lazy import block in HuggingFaceLocalGenerator. This fixed some breaking core-integrations.
  • Fixes Pipeline.run() logic so Components that have all their inputs with a default are run in the correct order. The problem occurred because, when running the Pipeline, we internally gathered the list of Components to run in the order they were added during creation of the Pipeline. This caused some Components to run before they received all their inputs.
  • Fixes HuggingFaceTEITextEmbedder returning an embedding of incorrect shape when used with a Text-Embedding-Inference endpoint deployed using Docker.
  • Add the @component decorator to HuggingFaceTGIChatGenerator. The lack of this decorator made it impossible to use the HuggingFaceTGIChatGenerator in a pipeline.
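The `in` filter from the first fix above can be illustrated with a small plain-Python sketch. The condition dictionary shape and the `matches_in` helper are assumptions made for this example, not a document store implementation. The point is that a meaningful test needs both matching and non-matching documents, including one that lacks the field.

```python
from typing import Any, Dict


def matches_in(document: Dict[str, Any], condition: Dict[str, Any]) -> bool:
    """Return True when the document's field value is one of the listed values."""
    if not isinstance(condition["value"], list):
        raise ValueError("'in' expects a list of candidate values")
    # A document that lacks the field cannot match.
    if condition["field"] not in document:
        return False
    return document[condition["field"]] in condition["value"]


docs = [
    {"id": "1", "page": 10},
    {"id": "2", "page": 100},
    {"id": "3"},  # no page field at all
]
condition = {"field": "page", "operator": "in", "value": [10, 11]}
selected = [d["id"] for d in docs if matches_in(d, condition)]
print(selected)
```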

v1.25.2

4 weeks ago

Release Notes

v1.25.2

⚡️ Enhancement Notes

  • Add support for response_format and seed in OpenAI and Azure OpenAI invocation layers #7422
  • Add boolean to toggle prompt truncation #7431

Full Changelog: https://github.com/deepset-ai/haystack/compare/v1.25.1...v1.25.2

v1.25.1

1 month ago

Release Notes

v1.25.1

⚡️ Enhancement Notes

  • Review and update context windows for OpenAI GPT models.

v2.0.0

1 month ago

Today, we’ve released the stable version of Haystack 2.0. This is ultimately a rewrite of the Haystack framework, so these release notes are not what you’d usually expect to see in regular release notes where we highlight specific changes to the codebase. Instead, we will highlight features of Haystack 2.0 and how it’s meant to be used.

To read more about our motivation for Haystack 2.0 and what makes up our design choices, you can read our release announcement article.

To get started with Haystack, follow our quick starting guide.

🕺 Highlights

  • 📦 A New Package
  • 💪 Powerful Pipelines
  • 🔌 Customizable Components
  • 🍱 Ready-Made Pipeline Templates
  • 🗃️ Document Stores
  • 🧩 Integrations
  • 🕵️ Logging & Tracing
  • 🏎️ Device Management
  • 🔐 Secret Management
  • 📜 Prompt Templating

📦 A New Package

Haystack 2.0 is distributed with haystack-ai, while Haystack 1.x will continue to be supported with farm-haystack with security updates and bug fixes.

NOTE: Installing haystack-ai and farm-haystack into the same Python environment will lead to conflicts. Please use separate virtual environments for each package.

Check out the installation guide for more information.

💪 Powerful Pipelines

In Haystack 2.0, pipelines are dynamic computation graphs that support:

  • 🚦 Control flow: Need to run different components based on the output of another? Not a problem with 2.0.
  • Loops: Implement complex behavior such as self-correcting flows by executing parts of the graph repeatedly.
  • 🎛️ Data flow: Consume it only where you need it. Haystack 2.0 only exposes data to components which need it - benefiting speed and transparency.
  • Validation and type-checking: Ensures all components in your pipeline are compatible even before running it.
  • 💾 Serialization: Save and restore your pipelines from different formats.

Pipelines can be built with a few easy steps:

  1. Create the Pipeline object.
  2. Add components to the pipeline with the add_component() method.
  3. Connect the components with the connect() method. Trying to connect components that are not compatible in type will raise an error.
  4. Execute the pipeline with the run() method.

Example

The following pipeline does question-answering on a given URL:

import os

from haystack import Pipeline
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret

os.environ["OPENAI_API_KEY"] = "Your OpenAI API Key"

fetcher = LinkContentFetcher()
converter = HTMLToDocument()
prompt_template = """
According to the contents of this website:
{% for document in documents %}
  {{document.content}}
{% endfor %}
Answer the given question: {{query}}
Answer:
"""
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"))

pipeline = Pipeline()
pipeline.add_component("fetcher", fetcher)
pipeline.add_component("converter", converter)
pipeline.add_component("prompt", prompt_builder)
pipeline.add_component("llm", llm)

# pass the fetcher's `streams` output to the converter using the `sources` parameter
pipeline.connect("fetcher.streams", "converter.sources")
# pass the converted `documents` to the prompt_builder using the `documents` parameter
pipeline.connect("converter.documents", "prompt.documents")
# pass the interpolated `prompt` to the llm using the `prompt` parameter
pipeline.connect("prompt.prompt", "llm.prompt")

# keep the return value so the reply can be printed below
result = pipeline.run({"fetcher": {"urls": ["https://haystack.deepset.ai/overview/quick-start"]},
                       "prompt": {"query": "How should I install Haystack?"}})

print(result["llm"]["replies"][0])

🔌 Customizable Components

Previously known as Nodes, components have been formalized with well-defined inputs and outputs that allow for easy extensibility and composability.

Haystack 2.0 provides a diverse selection of built-in components. Here’s a non-exhaustive overview:

Category | Description | External Providers & Integrations
--- | --- | ---
Audio Transcriber | Transcribe audio to text | OpenAI
Builders | Build prompts and answers from templates |
Classifiers | Classify documents based on specific criteria |
Connectors | Interface with external services | OpenAPI
Converters | Convert data between different formats | Azure, Tika, Unstructured, PyPDF, OpenAPI, Jinja
Embedders | Transform texts and documents to vector representations | Amazon Bedrock, Azure, Cohere, FastEmbed, Gradient, Hugging Face (Optimum, Sentence Transformers, Text Embedding Inference), Instructor, Jina, Mistral, Nvidia, Ollama, OpenAI
Extractors | Extract information from documents | Hugging Face, spaCy
Evaluators | Evaluate components using metrics | Ragas, DeepEval, UpTrain
Fetcher | Fetch data from remote URLs |
Generators | Prompt and generate text using generative models | Amazon Bedrock, Amazon Sagemaker, Azure, Cohere, Google AI, Google Vertex, Gradient, Hugging Face, Llama.cpp, Mistral, Nvidia, Ollama, OpenAI
Joiners | Combine documents from different components |
Preprocessors | Preprocess text and documents |
Rankers | Sort documents based on specific criteria | Hugging Face
Readers | Find answers in documents |
Retrievers | Fetch documents from a document store based on a query | Astra, Chroma, Elasticsearch, MongoDB Atlas, OpenSearch, Pgvector, Pinecone, Qdrant, Weaviate
Routers | Manipulate pipeline control flow |
Validators | Validate data based on schemas |
Web Search | Perform search queries | Search, SerperDev
Writers | Write data into data sources |

Custom Components

If Haystack lacks a functionality that you need, you can easily create your own component and slot that into a pipeline. Broadly speaking, writing a custom component requires:

  • Creating a class with the @component decorator.
  • Providing a run() method. The parameters passed to this method double as the component’s inputs.
  • Defining the outputs and the output types of the run() method with a @component.output_types() decorator.
  • Returning a dictionary that includes the outputs of the component.

Below is an example of a toy Embedder component that receives a text input and returns a random vector representation as embedding.

import random
from typing import List
from haystack import component, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever

@component
class MyEmbedder:
    def __init__(self, dim: int = 128):
        self.dim = dim

    # the keyword names declared here become the component's outputs
    @component.output_types(embedding=List[float])
    def run(self, text: str):
        print(f"Random embedding for text: {text}")
        embedding = [random.uniform(-1.0, 1.0) for _ in range(self.dim)]
        return {"embedding": embedding}

# Using the component directly
my_embedder = MyEmbedder()
my_embedder.run(text="Hi, my name is Tuana") 

# Using the component in a pipeline
document_store = InMemoryDocumentStore()
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", MyEmbedder())
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query_pipeline.run({"text_embedder":{"text": "Who lives in Berlin?"}})

🍱 Ready-made Pipeline Templates

Haystack 2.0 offers ready-made pipeline templates for common use cases, which can be created with just a single line of code.

Example

from haystack import Pipeline, PredefinedPipeline

pipeline = Pipeline.from_template(PredefinedPipeline.CHAT_WITH_WEBSITE)

# and then you can run this pipeline 👇
# pipeline.run({
#    "fetcher": {"urls": ["https://haystack.deepset.ai/overview/quick-start"]},
#    "prompt": {"query": "How should I install Haystack?"}}
# )

🗃️ Document Stores

In Haystack 2.0, Document Stores provide a common interface through which pipeline components can read and manipulate data without any knowledge of the backend technology. Furthermore, Document Stores are paired with specialized retriever components that can be used to fetch documents from a particular data source based on specific queries.

This separation of interface and implementation lets us provide support for several third-party providers of vector databases such as Weaviate, Chroma, Pinecone, Astra DB, MongoDB, Qdrant, Pgvector, Elasticsearch, OpenSearch, Neo4j and Marqo.
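The separation of interface and implementation can be sketched with a structural type in plain Python. The method names below (`write_documents`, `count_documents`) and the `DictStore` backend are simplified assumptions for illustration; the real Document Store interface has more methods and works with Document objects rather than dictionaries.

```python
from typing import Dict, List, Protocol


class DocumentStoreLike(Protocol):
    """The only thing calling code is allowed to know about a store."""
    def write_documents(self, documents: List[Dict[str, str]]) -> int: ...
    def count_documents(self) -> int: ...


class DictStore:
    """Toy backend that keeps documents in a dictionary keyed by id."""
    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, str]] = {}

    def write_documents(self, documents: List[Dict[str, str]]) -> int:
        for doc in documents:
            self._docs[doc["id"]] = doc
        return len(documents)

    def count_documents(self) -> int:
        return len(self._docs)


def index(store: DocumentStoreLike, texts: List[str]) -> int:
    """Component-like code that relies only on the shared interface."""
    docs = [{"id": str(i), "content": text} for i, text in enumerate(texts)]
    store.write_documents(docs)
    return store.count_documents()


total = index(DictStore(), ["Berlin is a city.", "Haystack is a framework."])
print(total)
```

Any backend that offers the same two methods could be passed to `index` without changing it, which is the property the paragraph above describes.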

Example

#pip install chroma-haystack

from haystack_integrations.document_stores.chroma import ChromaDocumentStore
from haystack_integrations.components.retrievers.chroma import ChromaEmbeddingRetriever

document_store = ChromaDocumentStore()
retriever = ChromaEmbeddingRetriever(document_store)

🧩 Integrations

Thanks to Haystack 2.0’s flexible infrastructure, pipelines can be easily extended with external technologies and libraries in the form of new components, document stores, etc, all the while keeping dependencies cleanly separated.

Starting with 2.0, integrations are divided into two categories:

Please refer to the official integrations website for more information.

🕵️ Logging & Tracing

The monitoring of Haystack 2.0 pipelines in production is aided by both a customizable logging system that supports structured logging and tracing correlation out of the box, and code instrumentation collecting spans and traces in strategic points of the execution path, with support for Open Telemetry and Datadog already in place.

🏎️ Device Management

Haystack 2.0 provides a framework-agnostic system of addressing and using devices such as GPUs and accelerators across different platforms and providers.

🔐 Secret Management

To securely manage credentials for services that require authentication, Haystack 2.0 provides a type-safe approach to handle authentication and API secrets that prevents accidental leaks.
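One way to picture how such an approach can prevent leaks is a wrapper that remembers where a credential lives rather than the credential itself. The sketch below is plain Python with a hypothetical `EnvSecret` class and a placeholder `DEMO_API_KEY` variable; it is not the Haystack `Secret` API.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSecret:
    """Hypothetical wrapper: remembers the variable name, never the value."""
    env_var: str

    def resolve(self) -> str:
        """Read the value only at the moment it is needed."""
        value = os.environ.get(self.env_var)
        if value is None:
            raise KeyError(f"environment variable {self.env_var} is not set")
        return value

    def to_dict(self) -> dict:
        # Only the variable name is serialized, never the secret itself.
        return {"type": "env_var", "env_var": self.env_var}


os.environ["DEMO_API_KEY"] = "sk-demo"  # placeholder value for this example
secret = EnvSecret("DEMO_API_KEY")
print(secret)            # the repr shows the variable name only
print(secret.to_dict())  # safe to write into a saved pipeline definition
```

Because neither the repr nor the serialized form contains the value, printing the object or saving a pipeline definition does not expose the key.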

📜 Prompt Templating

Haystack 2.0 prompt templating uses Jinja, and prompts are included in pipelines with the use of a PromptBuilder (or DynamicPromptBuilder for advanced use cases). Everything in {{ }} in a prompt becomes an input to the PromptBuilder.

Example

The following prompt_builder will expect documents and query as input.

from haystack.components.builders import PromptBuilder

template = """Given these documents, answer the question.
              Documents:
              {% for doc in documents %}
                  {{ doc.content }}
              {% endfor %}
              Question: {{query}}
              Answer:"""
prompt_builder = PromptBuilder(template=template)

🚀 Getting Started

Alongside Haystack 2.0, today we are also releasing a whole set of new tutorials, documentation, resources and more to help you get started:

🧡 Join the Community

Stay up-to-date with Haystack:

⏳ Haystack 2.0-Beta History

Follow the progress we made during beta in each beta release:

v1.25.0

1 month ago

Release Notes

v1.25.0

⚡️ Enhancement Notes

  • Add raise_on_failure flag to BaseConverter class so that big processes can optionally continue without breaking from exceptions.

  • Upgrade Transformers to the latest version 4.37.2. This version adds support for the Phi-2 and Qwen2 models and improves support for quantization.

  • Add support for latest OpenAI embedding models text-embedding-3-large and text-embedding-3-small.

  • API_BASE can now be passed as an optional parameter in the getting_started sample. Only the openai provider is supported in this set of changes. PromptNode and PromptModel were enhanced to allow passing of this parameter. This allows RAG against a local endpoint (e.g., http://localhost:1234/v1), as long as it is OpenAI compatible (such as LM Studio).

    Logging in the getting started sample was made more verbose, to make it easier for people to see what was happening under the covers.

  • Added new option split_by="page" to the preprocessor so we can chunk documents by page break.
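Chunking by page break, as in the last item above, can be sketched in a few lines of plain Python. The sketch assumes that page breaks are marked by form feed characters (`\f`); the `split_by_page` helper is hypothetical and is not the preprocessor's code. It keeps the original page number with each chunk so that empty pages do not shift the numbering.

```python
from typing import List, Tuple


def split_by_page(text: str) -> List[Tuple[int, str]]:
    """Split on form feeds and keep the 1-based page number of each chunk."""
    chunks = []
    for number, page in enumerate(text.split("\f"), start=1):
        if page.strip():  # skip pages without any text content
            chunks.append((number, page))
    return chunks


raw = "Page one text.\fPage two text.\f\fPage four text."
pages = split_by_page(raw)
print(pages)
```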

🐛 Bug Fixes

  • Change the dummy vector used internally in the Pinecone Document Store. A recent change to the Pinecone API no longer allows vectors filled with zeros, which was the previous dummy vector.
  • The types of metadata values accepted by RouteDocuments were unnecessarily restricted to strings. This caused validation errors (for example, when loading from a YAML file) if a user tried to use a boolean. Boolean and int types are now valid types for metadata_values.
  • Fixed a bug that made it impossible to write Documents to Weaviate when some of the fields were empty lists (e.g. split_overlap for preprocessed documents).
  • Correct the page meta field for PDFs that contain pages without any text content.

v1.25.0-rc1

2 months ago

Release Notes

v1.25.0-rc1

⚡️ Enhancement Notes

  • Add raise_on_failure flag to BaseConverter class so that big processes can optionally continue without breaking from exceptions.

  • Upgrade Transformers to the latest version 4.37.2. This version adds support for the Phi-2 and Qwen2 models and improves support for quantization.

  • Add support for latest OpenAI embedding models text-embedding-3-large and text-embedding-3-small.

  • API_BASE can now be passed as an optional parameter in the getting_started sample. Only the openai provider is supported in this set of changes. PromptNode and PromptModel were enhanced to allow passing of this parameter. This allows RAG against a local endpoint (e.g., http://localhost:1234/v1), as long as it is OpenAI compatible (such as LM Studio).

    Logging in the getting started sample was made more verbose, to make it easier for people to see what was happening under the covers.

  • Added new option split_by="page" to the preprocessor so we can chunk documents by page break.

🐛 Bug Fixes

  • Change the dummy vector used internally in the Pinecone Document Store. A recent change to the Pinecone API no longer allows vectors filled with zeros, which was the previous dummy vector.
  • The types of metadata values accepted by RouteDocuments were unnecessarily restricted to strings. This caused validation errors (for example, when loading from a YAML file) if a user tried to use a boolean. Boolean and int types are now valid types for metadata_values.
  • Fixed a bug that made it impossible to write Documents to Weaviate when some of the fields were empty lists (e.g. split_overlap for preprocessed documents).

v2.0.0-beta.8

2 months ago

Release Notes

v2.0.0-beta.8

Highlights

Introducing a flexible and dynamic approach to creating NLP pipelines with Haystack's new PipelineTemplate class!

This innovative feature utilizes Jinja templated YAML files, allowing users to effortlessly construct and customize complex data processing pipelines for various NLP tasks. From question answering and document indexing to custom pipeline requirements, the PipelineTemplate simplifies configuration and enhances adaptability. Users can now easily override default components or integrate custom settings with simple, straightforward code.

For example, the following pipeline template can be used to create an indexing pipeline:

from haystack.components.embedders import SentenceTransformersDocumentEmbedder 
from haystack.templates import PipelineTemplate, PipelineType 
pt = PipelineTemplate(PipelineType.INDEXING, template_params={"use_pdf_file_converter": True}) 
pt.override("embedder", SentenceTransformersDocumentEmbedder(progress_bar=True)) 
pipe = pt.build()
result = pipe.run(data={"sources": ["some_local_dir/and_text_file.txt", "some_other_local_dir/and_pdf_file.pdf"]}) 
print(result) 

In the above example, a PipelineType.INDEXING enum is used to create a pipeline with a custom instance of SentenceTransformersDocumentEmbedder and the PDF file converter enabled.

The pipeline is then run on a list of local files and the result is printed (number of indexed documents). We could have of course used the same PipelineTemplate class to create any other pre-defined pipeline or even a custom pipeline with custom components and settings. On the other hand, the following pipeline template can be used to create a pre-defined RAG pipeline:

from haystack.templates import PipelineTemplate, PipelineType 
pipe = PipelineTemplate(PipelineType.RAG).build() 
result = pipe.run(query="What's the meaning of life?") 
print(result)

_templateSource loads template content from various inputs, including strings, files, predefined templates, and URLs. The class provides mechanisms to load templates dynamically and ensure they contain valid Jinja2 syntax.

⬆️ Upgrade Notes

  • Adopt the new framework-agnostic device management in Sentence Transformers Embedders.

    Before this change:

    from haystack.components.embedders import SentenceTransformersTextEmbedder 
    embedder = SentenceTransformersTextEmbedder(device="cuda:0") 
    

    After this change:

    from haystack.utils.device import ComponentDevice, Device
    from haystack.components.embedders import SentenceTransformersTextEmbedder
    device = ComponentDevice.from_single(Device.gpu(id=0))
    # or
    # device = ComponentDevice.from_str("cuda:0")
    embedder = SentenceTransformersTextEmbedder(device=device)
    
  • Adopt the new framework-agnostic device management in Local Whisper Transcriber.

    Before this change:

    from haystack.components.audio import LocalWhisperTranscriber  
    transcriber = LocalWhisperTranscriber(device="cuda:0") 
    

    After this change:

    from haystack.utils.device import ComponentDevice, Device
    from haystack.components.audio import LocalWhisperTranscriber
    device = ComponentDevice.from_single(Device.gpu(id=0))
    # or
    # device = ComponentDevice.from_str("cuda:0")
    transcriber = LocalWhisperTranscriber(device=device)
    

🚀 New Features

  • Add FilterRetriever. It retrieves documents that match the provided (either at init or runtime) filters.

  • Add LostInTheMiddleRanker. It reorders documents based on the "Lost in the Middle" order, a strategy that places the most relevant paragraphs at the beginning or end of the context, while less relevant paragraphs are positioned in the middle.

  • Add support for Mean Reciprocal Rank (MRR) Metric to StatisticalEvaluator. MRR measures the mean reciprocal rank of times a label is present in at least one or more predictions.

  • Introducing the OutputAdapter component which enables seamless data flow between pipeline components by adapting the output of one component to match the expected input of another using Jinja2 template expressions. This addition opens the door to greater flexibility in pipeline configurations, facilitating custom adaptation rules and exemplifying a structured approach to inter-component communication.

  • Add is_greedy argument to @component decorator. This flag will change the behaviour of `Component`s with inputs that have a `Variadic` type when running inside a Pipeline.

    Variadic `Component`s that are marked as greedy will run as soon as they receive their first input. If not marked as greedy instead they'll wait as long as possible before running to make sure they receive as many inputs as possible from their senders.

    It will be ignored for all other `Component`s even if set explicitly.

  • Remove the old evaluation API in favor of a Component based API. We now have SASEvaluator and StatisticalEvaluator replacing the old API.

  • Introduced JsonSchemaValidator to validate the JSON content of ChatMessage against a provided JSON schema. Valid messages are emitted through the 'validated' output, while messages failing validation are sent via the 'validation_error' output, along with useful error details for troubleshooting.

  • Add a new variable called meta_value_type to the MetaFieldRanker that allows a user to parse the meta value into the specified data type, as long as the meta value is a string. The supported values for meta_value_type are "float", "int", "date", or None. If None is passed, no parsing is done. For example, if we specify meta_value_type="date", then for the meta value "date": "2015-02-01" the string is parsed into a datetime object.

  • Add TextCleaner Component to clean list of strings. It can remove substrings matching a list of regular expressions, convert text to lowercase, remove punctuation, and remove numbers. This is mostly useful to clean generator predictions before evaluation.
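The "Lost in the Middle" ordering used by the LostInTheMiddleRanker in the list above can be sketched in plain Python. This shows one possible layout under the assumption that the input is already sorted from most to least relevant; the `lost_in_the_middle` function is illustrative, and the actual ranker may differ in details.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def lost_in_the_middle(ranked: Sequence[T]) -> List[T]:
    """Reorder items ranked best-first so the best sit at both ends."""
    front = list(ranked[0::2])       # ranks 1, 3, 5, ... fill the beginning
    back = list(ranked[1::2])[::-1]  # ranks 2, 4, ... fill the end, best last
    return front + back


# Numbers stand for relevance ranks: 1 is the most relevant document.
print(lost_in_the_middle([1, 2, 3, 4, 5]))  # [1, 3, 5, 4, 2]
```

The two most relevant items end up first and last, and the least relevant ones land in the middle of the context.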

⚡️ Enhancement Notes

  • Add __repr__ to all Components to print their I/O. This can also be useful in Jupyter notebooks as this will be shown as a cell output if it's the last expression in a cell.
  • Add new Pipeline.show() method to generate an image inline when run in a Jupyter notebook. If called outside a notebook it will raise a PipelineDrawingError. Pipeline.draw() has also been simplified and the engine argument has been removed. Now all images will be generated using Mermaid.
  • Customize Pipeline.__repr__() to return a nice text representation of it. If run on a Jupyter notebook it will instead have the same behaviour as Pipeline.show().
  • Change Pipeline.run() to check if max_loops_allowed has been reached. If we attempt to run a Component that already ran the number of max_loops_allowed a PipelineMaxLoops will be raised.
  • Merge the `Pipeline` definitions into a single `Pipeline` class. The class in the haystack.pipeline package has been deleted and only haystack.core.pipeline exists now.
  • Enhanced the OpenAPIServiceConnector to support dynamic authentication handling. With this update, service credentials are now dynamically provided at each run invocation, eliminating the need for pre-configuring a known set of service authentications. This flexibility allows for the introduction of new services on-the-fly, each with its unique authentication, streamlining the integration process. This modification not only simplifies the initial setup of the OpenAPIServiceConnector but also ensures a more transparent and straightforward authentication process for each interaction with different OpenAPI services.

🐛 Bug Fixes

  • Adds api_base_url attribute to OpenAITextEmbedder. Previously, it was used only for initialization and was not serialized.
  • Previously, when using the same input reference in different components, the Pipeline run logic had an unexpected behavior. This has been fixed by deepcopying the inputs before passing them to the components.
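Why deep-copying inputs helps when several components share one input reference can be shown with a small plain-Python sketch. The `append_suffix` function stands in for a component that mutates its input; it is hypothetical and not part of Haystack.

```python
from copy import deepcopy
from typing import Dict, List


def append_suffix(data: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """A component-like function that (carelessly) mutates its input in place."""
    data["texts"].append("extra")
    return data


shared = {"texts": ["hello"]}

# Each call gets its own deep copy, so mutations stay local to that call.
out_a = append_suffix(deepcopy(shared))
out_b = append_suffix(deepcopy(shared))

print(shared["texts"])  # ['hello']
print(out_b["texts"])   # ['hello', 'extra']
```

Without the copies, the second call would receive a list that the first call had already changed, which is the kind of unexpected behaviour the fix above addresses.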