# LLMOps guide

Welcome to the ZenML LLMOps Guide. It shows how to integrate Large Language Models (LLMs) into your MLOps pipelines using ZenML, and it is written for ML practitioners and MLOps engineers who want to use LLMs while keeping their workflows robust and scalable.

<figure><img src="/files/biluxRRqUpijAYfgrI5c" alt=""><figcaption><p>ZenML simplifies the development and deployment of LLM-powered MLOps pipelines.</p></figcaption></figure>

In this guide, we'll explore various aspects of working with LLMs in ZenML, including:

* [RAG with ZenML](/user-guides/llmops-guide/rag-with-zenml.md)
  * [RAG in 85 lines of code](/user-guides/llmops-guide/rag-with-zenml/rag-85-loc.md)
  * [Understanding Retrieval-Augmented Generation (RAG)](/user-guides/llmops-guide/rag-with-zenml/understanding-rag.md)
  * [Data ingestion and preprocessing](/user-guides/llmops-guide/rag-with-zenml/data-ingestion.md)
  * [Embeddings generation](/user-guides/llmops-guide/rag-with-zenml/embeddings-generation.md)
  * [Storing embeddings in a vector database](/user-guides/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database.md)
  * [Basic RAG inference pipeline](/user-guides/llmops-guide/rag-with-zenml/basic-rag-inference-pipeline.md)
* [Evaluation and metrics](/user-guides/llmops-guide/evaluation.md)
  * [Evaluation in 65 lines of code](/user-guides/llmops-guide/evaluation/evaluation-in-65-loc.md)
  * [Retrieval evaluation](/user-guides/llmops-guide/evaluation/retrieval.md)
  * [Generation evaluation](/user-guides/llmops-guide/evaluation/generation.md)
  * [Evaluation in practice](/user-guides/llmops-guide/evaluation/evaluation-in-practice.md)
* [Reranking for better retrieval](/user-guides/llmops-guide/reranking.md)
  * [Understanding reranking](/user-guides/llmops-guide/reranking/understanding-reranking.md)
  * [Implementing reranking in ZenML](/user-guides/llmops-guide/reranking/implementing-reranking.md)
  * [Evaluating reranking performance](/user-guides/llmops-guide/reranking/evaluating-reranking-performance.md)
* [Improve retrieval by finetuning embeddings](/user-guides/llmops-guide/finetuning-embeddings.md)
  * [Synthetic data generation](/user-guides/llmops-guide/finetuning-embeddings/synthetic-data-generation.md)
  * [Finetuning embeddings with Sentence Transformers](/user-guides/llmops-guide/finetuning-embeddings/finetuning-embeddings-with-sentence-transformers.md)
  * [Evaluating finetuned embeddings](/user-guides/llmops-guide/finetuning-embeddings/evaluating-finetuned-embeddings.md)
* [Finetuning LLMs with ZenML](/user-guides/llmops-guide/finetuning-llms.md)
  * [Finetuning in 100 lines of code](/user-guides/llmops-guide/finetuning-llms/finetuning-100-loc.md)
  * [Why and when to finetune LLMs](/user-guides/llmops-guide/finetuning-llms/why-and-when-to-finetune-llms.md)
  * [Starter choices with finetuning](/user-guides/llmops-guide/finetuning-llms/starter-choices-for-finetuning-llms.md)
  * [Finetuning with 🤗 Accelerate](/user-guides/llmops-guide/finetuning-llms/finetuning-with-accelerate.md)
  * [Evaluation for finetuning](/user-guides/llmops-guide/finetuning-llms/evaluation-for-finetuning.md)
  * [Deploying finetuned models](/user-guides/llmops-guide/finetuning-llms/deploying-finetuned-models.md)
  * [Next steps](/user-guides/llmops-guide/finetuning-llms/next-steps.md)

To follow along with the examples and tutorials in this guide, ensure you have a Python environment set up with ZenML installed. Familiarity with the concepts covered in the [Starter Guide](/user-guides/starter-guide.md) and [Production Guide](/user-guides/production-guide.md) is recommended.

Throughout this guide we'll build up a single application, starting from a simple RAG pipeline and working toward a more complex setup that involves finetuning embeddings, reranking retrieved documents, and even finetuning the LLM itself. The use case is one relevant to ZenML: a question-answering system that answers common questions about ZenML. Following one application from start to finish should help you apply the concepts covered in this guide to your own projects.
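To make the retrieve-then-generate idea concrete before the detailed chapters, here is a minimal sketch in plain Python. It is not the implementation used later in this guide: the `DOCS` corpus, the bag-of-words similarity, and the `retrieve` and `build_prompt` helpers are all hypothetical stand-ins chosen for illustration, and the final LLM call is left out so the example runs without any external service.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for real documentation chunks.
DOCS = [
    "ZenML pipelines are composed of steps that pass artifacts to each other.",
    "A ZenML stack combines an orchestrator and an artifact store.",
    "Reranking reorders retrieved documents by relevance to the query.",
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the prompt that would be sent to an LLM (call omitted here)."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


if __name__ == "__main__":
    question = "What does reranking do?"
    print(build_prompt(question, retrieve(question, DOCS, k=1)))
```

In a real system the word-count vectors would typically be replaced by learned embeddings stored in a vector database, and the prompt would be passed to an LLM; those are the pieces the later chapters on embeddings, reranking, and finetuning aim to improve.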

By the end of this guide, you'll understand how to use LLMs in your MLOps workflows with ZenML and how to build scalable, maintainable LLM-powered applications. First up, let's look at a minimal implementation of the RAG paradigm to get started.
