# Deploy

You have a working flow on your laptop. Now you want it running in the cloud, surviving restarts, and accessible to your team. That takes four steps.

## 1. Deploy a Kitaru server

Locally, the server runs embedded in your Python process. In production, you deploy it as a standalone service so your team can share a single view of all executions — and so agents can run independently of your machine.

The server stores execution metadata, checkpoint state, and logs. It does not access your cloud storage directly — it brokers temporary credentials so clients and the UI can read artifacts when needed.
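To make the brokering idea concrete, here is a minimal conceptual sketch of how short-lived, scoped credentials behave. It is not Kitaru's implementation: `CredentialBroker`, `TemporaryCredential`, `can_read`, the 15-minute default lifetime, and the example URIs are all hypothetical names and values chosen for illustration.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TemporaryCredential:
    """A short-lived grant to read artifacts under one storage prefix."""
    token: str
    scope: str         # storage prefix the token may read (hypothetical format)
    expires_at: float  # Unix timestamp after which the token is rejected


class CredentialBroker:
    """Hypothetical stand-in for the server's brokering role.

    The broker never reads the artifacts itself; it only hands out
    time-limited credentials that a client or UI can use directly.
    """

    def __init__(self, ttl_seconds: float = 900.0) -> None:
        self.ttl_seconds = ttl_seconds

    def issue(self, scope: str, now: Optional[float] = None) -> TemporaryCredential:
        issued_at = time.time() if now is None else now
        return TemporaryCredential(
            token=secrets.token_urlsafe(16),
            scope=scope,
            expires_at=issued_at + self.ttl_seconds,
        )


def can_read(cred: TemporaryCredential, uri: str, now: Optional[float] = None) -> bool:
    """Return True only while the credential is unexpired and the URI is in scope."""
    current = time.time() if now is None else now
    return current < cred.expires_at and uri.startswith(cred.scope)


if __name__ == "__main__":
    broker = CredentialBroker(ttl_seconds=60)
    cred = broker.issue("s3://example-artifacts/run-1/")
    print(can_read(cred, "s3://example-artifacts/run-1/step-a.json"))
```

The point of the pattern is that long-lived storage keys stay out of clients and browsers, and a leaked token stops working once it expires.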

<table data-view="cards"><thead><tr><th></th><th></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td><strong>Deploy with Helm</strong></td><td>Install the Kitaru server on any Kubernetes cluster</td><td><a href="/pages/59D56yXD1Bnb1pInUUEm">/pages/59D56yXD1Bnb1pInUUEm</a></td></tr></tbody></table>

## 2. Connect to the server

Point your local client at the deployed server:

```bash
kitaru login https://kitaru.your-company.com
```

From here, the CLI, `KitaruClient`, and the UI all talk to the same server. Any executions you start will be visible to your whole team.

## 3. Set up a cloud stack

A [stack](/kitaru/agent-runtime-stacks/stacks.md) is a named runtime that tells Kitaru where to run your agent code and where to store its outputs. Pick the compute backend that matches your cloud:

<table data-view="cards"><thead><tr><th></th><th></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td><strong>Kubernetes</strong></td><td>Run agents on any Kubernetes cluster with S3 or GCS storage</td><td><a href="/pages/UEBAQ0eNbl2HWNqjGHNd">/pages/UEBAQ0eNbl2HWNqjGHNd</a></td></tr><tr><td><strong>AWS (SageMaker)</strong></td><td>Run agents as SageMaker jobs with S3 storage</td><td><a href="/pages/CwCoxwwFTvKf7ouLKwom">/pages/CwCoxwwFTvKf7ouLKwom</a></td></tr><tr><td><strong>GCP (Vertex AI)</strong></td><td>Run agents as Vertex AI jobs with GCS storage</td><td><a href="/pages/MHYzqWJhVXogczqJhSKO">/pages/MHYzqWJhVXogczqJhSKO</a></td></tr><tr><td><strong>Azure (AzureML)</strong></td><td>Run agents as AzureML jobs with Azure Blob storage</td><td><a href="/pages/WbBiMlyiMSZfY3DkO7f7">/pages/WbBiMlyiMSZfY3DkO7f7</a></td></tr></tbody></table>

Once your stack is created, switch to it by name (`prod-k8s` here is an example name):

```bash
kitaru stack use prod-k8s
```
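If it helps to picture what a stack is, the sketch below models it as a named pair of "where code runs" and "where outputs go", with one stack active at a time. This is a mental model only, not Kitaru's data model: `Stack`, `StackRegistry`, the field names, and the bucket URI are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Stack:
    """A named runtime: compute backend plus output storage (hypothetical fields)."""
    name: str
    compute: str         # where agent code runs
    artifact_store: str  # where checkpoint outputs are written


class StackRegistry:
    """Keeps known stacks and tracks which one is active."""

    def __init__(self) -> None:
        self._stacks: Dict[str, Stack] = {}
        self._active: Optional[str] = None

    def register(self, stack: Stack) -> None:
        self._stacks[stack.name] = stack

    def use(self, name: str) -> Stack:
        # Conceptually mirrors switching stacks by name: unknown names are rejected.
        if name not in self._stacks:
            raise KeyError(f"unknown stack: {name}")
        self._active = name
        return self._stacks[name]

    @property
    def active(self) -> Optional[Stack]:
        return self._stacks.get(self._active) if self._active else None


if __name__ == "__main__":
    registry = StackRegistry()
    registry.register(Stack("local", "local-process", "file:///tmp/example-artifacts"))
    registry.register(Stack("prod-k8s", "kubernetes", "s3://example-bucket/artifacts"))
    registry.use("prod-k8s")
    print(registry.active)
```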

## 4. Run your agent in the cloud

Your code doesn't change. The same flow, the same checkpoints, the same replay — now running on cloud compute with durable storage.

```python
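# `research_agent` is the flow you already built and ran locally.
if __name__ == "__main__":
    # With a cloud stack active, the same call runs on that stack's compute.
    research_agent.run(topic="durable execution for AI agents")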
```

When you call `.run()`, the client fetches short-lived credentials from the server and dispatches the execution directly to your stack's compute backend. Checkpoint outputs are written to cloud storage. You can observe the execution from the UI, the CLI, or any `KitaruClient` connected to the same server.
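The reason restarts are survivable is that each checkpoint's output lives in storage rather than in process memory. The sketch below illustrates that idea with an in-memory stand-in. It is a conceptual model, not Kitaru's API: `CheckpointStore`, `run_step`, `demo_flow`, and the step functions are hypothetical.

```python
from typing import Any, Callable, Dict, List


class CheckpointStore:
    """In-memory stand-in for durable cloud storage (hypothetical)."""

    def __init__(self) -> None:
        self._outputs: Dict[str, Any] = {}

    def has(self, key: str) -> bool:
        return key in self._outputs

    def load(self, key: str) -> Any:
        return self._outputs[key]

    def save(self, key: str, value: Any) -> None:
        self._outputs[key] = value


def run_step(store: CheckpointStore, execution_id: str, name: str,
             fn: Callable[..., Any], *args: Any) -> Any:
    """Run a step once per execution; on replay, reuse the stored output."""
    key = f"{execution_id}/{name}"
    if store.has(key):
        return store.load(key)
    result = fn(*args)
    store.save(key, result)
    return result


calls: List[str] = []  # records which steps actually executed


def search(topic: str) -> List[str]:
    calls.append("search")
    return [f"note about {topic}"]


def summarize(notes: List[str]) -> str:
    calls.append("summarize")
    return "; ".join(notes)


def demo_flow(store: CheckpointStore, execution_id: str, topic: str) -> str:
    notes = run_step(store, execution_id, "search", search, topic)
    return run_step(store, execution_id, "summarize", summarize, notes)


if __name__ == "__main__":
    store = CheckpointStore()
    print(demo_flow(store, "exec-1", "durable execution"))
    # Replaying the same execution reads stored outputs instead of re-running steps.
    print(demo_flow(store, "exec-1", "durable execution"))
    print(calls)
```

Because the store is keyed by execution and step name, a replay of the same execution can pick up from stored outputs no matter which machine runs it.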

