# RAG with ZenML

Retrieval-Augmented Generation (RAG) is a powerful technique that combines the
strengths of retrieval-based and generation-based models. In this guide, we'll
explore how to set up RAG pipelines with ZenML, including data ingestion, index
store management, and tracking RAG-associated artifacts.

LLMs are powerful tools: they can generate human-like responses to a wide
variety of prompts. However, they are also prone to generating incorrect or
inappropriate responses, especially when the input prompt is ambiguous or
misleading. They are also (currently) limited in the amount of text they can
process and generate at once. While some LLMs, [like Google's Gemini 1.5 Pro](https://developers.googleblog.com/2024/02/gemini-15-available-for-private-preview-in-google-ai-studio.html),
can consistently handle 1 million tokens (small units of text), the vast
majority (particularly the open-source ones currently available) handle far
fewer.

The first part of this guide to RAG pipelines with ZenML is about understanding
the basic components and how they work together. We'll cover the following
topics:

* why RAG exists and what problem it solves
* how to ingest and preprocess data that we'll use in our RAG pipeline
* how to leverage embeddings to represent our data; this will be the basis for
  our retrieval mechanism
* how to store these embeddings in a vector database
* how to track RAG-associated artifacts with ZenML

At the end, we'll bring all the components together to perform basic RAG
inference.
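To make the components listed above concrete before the detailed sections, here is a minimal, self-contained sketch of the retrieval half of RAG in plain Python. It is an illustration under simplifying assumptions, not ZenML's API: every function name is hypothetical, the word-count "embedding" stands in for a real embedding model, and the in-memory list stands in for a vector database. The generation step is left out; the sketch stops at building the prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 20) -> list[str]:
    """Split a document into chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption: a real model goes here)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents: list[str]) -> list[tuple[str, Counter]]:
    """Chunk and embed every document; the list stands in for a vector database."""
    return [(chunk, embed(chunk)) for doc in documents for chunk in chunk_text(doc)]


def retrieve(query: str, index: list[tuple[str, Counter]], top_k: int = 1) -> list[str]:
    """Return the `top_k` chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine retrieved chunks and the question into a prompt for an LLM."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


documents = [
    "Pipelines consist of steps that produce versioned artifacts.",
    "A vector database stores embeddings for similarity search.",
]
index = build_index(documents)
question = "Which database stores embeddings?"
context = retrieve(question, index)
prompt = build_prompt(question, context)
print(prompt)
```

In a ZenML setup, each of these functions would typically become its own pipeline step, so that the chunks, embeddings, and index are tracked as artifacts rather than living only in memory.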


