# Deploy

Production agents need to survive restarts, share one source of truth across your team, and record every run so you can replay and improve it later. Deploying moves your flow off your laptop onto shared cloud compute without changing your code. Four steps.

## 1. Deploy a Kitaru server

Locally, the server runs embedded in your Python process. In production, you deploy it as a standalone service so your team shares a single view of all executions and agents run independently of your machine.

The server stores execution metadata, checkpoint state, and logs. It does not access your cloud storage directly; it brokers temporary credentials so clients and the UI can read artifacts when needed.


- [Deploy with Helm](/pages/59D56yXD1Bnb1pInUUEm): Install the Kitaru server on any Kubernetes cluster
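
The brokering model described in this section can be pictured with a toy sketch. Nothing here is Kitaru's API: `CredentialBroker`, `TemporaryCredential`, the example bucket path, and the 15-minute lifetime are all hypothetical, chosen only to illustrate the idea that the server hands out expiring credentials rather than reading storage itself.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TemporaryCredential:
    """Hypothetical short-lived credential scoped to one storage location."""

    token: str
    scope: str
    expires_at: datetime

    def is_valid(self, now: datetime) -> bool:
        # A credential is usable only strictly before its expiry time.
        return now < self.expires_at


class CredentialBroker:
    """Hypothetical stand-in for the server's brokering role.

    It never reads artifacts; it only issues expiring credentials that a
    client or UI could then use against storage directly.
    """

    def __init__(self, ttl: timedelta = timedelta(minutes=15)) -> None:
        self.ttl = ttl  # assumed lifetime, not a documented Kitaru value

    def issue(self, scope: str, now: datetime) -> TemporaryCredential:
        return TemporaryCredential(
            token=secrets.token_urlsafe(16),
            scope=scope,
            expires_at=now + self.ttl,
        )


issued_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
broker = CredentialBroker()
cred = broker.issue("s3://example-bucket/artifacts", now=issued_at)
```

Because each credential expires on its own, a leaked token in this model is only useful for a short window, which is the usual motivation for brokering over long-lived keys.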

## 2. Connect to the server

Point your local client at the deployed server:

```bash
kitaru login https://kitaru.your-company.com
```

From here, the CLI, `KitaruClient`, and the UI all talk to the same server. Any executions you start will be visible to your whole team.

## 3. Set up a cloud stack

A [stack](/kitaru/agent-runtime-stacks/stacks.md) is a named runtime that tells Kitaru where to run your agent code and where to store its outputs. Pick the compute backend that matches your cloud:


- [Kubernetes](/pages/UEBAQ0eNbl2HWNqjGHNd): Run agents on any Kubernetes cluster with S3 or GCS storage
- [AWS (SageMaker)](/pages/CwCoxwwFTvKf7ouLKwom): Run agents as SageMaker jobs with S3 storage
- [GCP (Vertex AI)](/pages/MHYzqWJhVXogczqJhSKO): Run agents as Vertex AI jobs with GCS storage
- [Azure (AzureML)](/pages/WbBiMlyiMSZfY3DkO7f7): Run agents as AzureML jobs with Azure Blob storage
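
To make the "named runtime" idea concrete, here is a minimal sketch that models a stack as a name plus a compute backend and a storage location, using the pairings from the options listed above. `StackSpec`, `SUPPORTED_STORAGE`, and the short backend labels are hypothetical and are not part of Kitaru; real stacks are created through the linked guides.

```python
from dataclasses import dataclass

# Compute/storage pairings taken from the options listed above.
# The short string labels are made up for this sketch.
SUPPORTED_STORAGE = {
    "kubernetes": {"s3", "gcs"},
    "sagemaker": {"s3"},
    "vertex": {"gcs"},
    "azureml": {"azure-blob"},
}


@dataclass(frozen=True)
class StackSpec:
    """Hypothetical stack description: a name, where code runs, where outputs go."""

    name: str
    compute: str
    storage: str

    def __post_init__(self) -> None:
        # Reject combinations that the list above does not mention.
        allowed = SUPPORTED_STORAGE.get(self.compute)
        if allowed is None:
            raise ValueError(f"unknown compute backend: {self.compute!r}")
        if self.storage not in allowed:
            raise ValueError(
                f"{self.compute!r} pairs with {sorted(allowed)}, not {self.storage!r}"
            )


prod = StackSpec(name="prod-k8s", compute="kubernetes", storage="s3")
```

Validating the pairing at construction time means a mismatched stack fails immediately instead of when the first run tries to write its outputs.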

Once your stack is created, switch to it:

```bash
kitaru stack use prod-k8s
```

## 4. Run your agent in the cloud

Your code doesn't change. The same flow, the same checkpoints, the same replay, now running on cloud compute with durable storage.

```python
if __name__ == "__main__":
    research_agent.run(topic="durable execution for AI agents")
```

When you call `.run()`, the client fetches short-lived credentials from the server and dispatches the execution directly to your stack's compute backend. Checkpoint outputs are written to cloud storage. You can observe the execution from the UI, the CLI, or any `KitaruClient` connected to the same server.

Every cloud run records the same durable checkpoints as your local runs, so you can `flow.replay(exec_id, at="", flow_overrides={...})` a production execution with one input changed and diff it against the original baseline.
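
As a rough illustration of diffing a replayed execution against its baseline, the sketch below compares two sets of checkpoint outputs held as plain dictionaries. `diff_checkpoints` and the sample data are hypothetical; this page does not describe how Kitaru exposes checkpoint outputs, so treat it as one possible shape for the comparison rather than the library's behavior.

```python
from typing import Any


def diff_checkpoints(
    baseline: dict[str, Any], replay: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {checkpoint_name: (baseline_value, replay_value)} for each difference.

    A checkpoint missing from one side is treated as None on that side.
    """
    changed: dict[str, tuple[Any, Any]] = {}
    for name in sorted(baseline.keys() | replay.keys()):
        before, after = baseline.get(name), replay.get(name)
        if before != after:
            changed[name] = (before, after)
    return changed


# Made-up checkpoint outputs for illustration only.
baseline_run = {"search": ["a", "b"], "summarize": "short summary", "score": 0.71}
replayed_run = {"search": ["a", "b"], "summarize": "longer summary", "score": 0.78}

differences = diff_checkpoints(baseline_run, replayed_run)
```

Here `differences` contains only `summarize` and `score`, since `search` produced the same output in both runs.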