Kubernetes Orchestrator

How to orchestrate pipelines with Kubernetes


The Kubernetes orchestrator is an orchestrator flavor provided with the ZenML kubernetes integration that runs your pipelines on a Kubernetes cluster.

This component is only meant to be used within the context of a remote ZenML deployment scenario. Usage with a local ZenML deployment may lead to unexpected behavior!

When to use it

You should use the Kubernetes orchestrator if:

  • you're looking for a lightweight way of running your pipelines on Kubernetes.

  • you don't need a UI to list all your pipeline runs.

  • you're not willing to maintain Kubeflow Pipelines on your Kubernetes cluster.

  • you're not interested in paying for managed solutions like Vertex.

How to deploy it

The Kubernetes orchestrator requires a Kubernetes cluster in order to run. There are many ways to deploy a Kubernetes cluster using different cloud providers or on your own custom infrastructure, and we can't possibly cover all of them here, but you can check out our cloud guide.

If the above Kubernetes cluster is deployed remotely on the cloud, another prerequisite for using this orchestrator is to deploy and connect to a remote ZenML server.
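
In the ZenML versions this page covers, connecting your client to that server typically looks like the following sketch; the URL is a placeholder you would replace with your own deployment:

# Connect your local ZenML client to the remote ZenML server
zenml connect --url=https://<YOUR_ZENML_SERVER_URL>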

How to use it

To use the Kubernetes orchestrator, we need:

  • The ZenML kubernetes integration installed. If you haven't done so, run

    zenml integration install kubernetes
  • Docker installed and running.

  • kubectl installed.

  • A remote artifact store as part of your stack.

  • A remote container registry as part of your stack.

  • A Kubernetes cluster deployed and the name of the Kubernetes context that points to this cluster. Run kubectl config get-contexts to see a list of available contexts (see the example right after this list).
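
For example, you can sanity-check the cluster prerequisite from your terminal like this (the context name is a placeholder):

# List the Kubernetes contexts known to kubectl
kubectl config get-contexts

# Verify that the chosen context can actually reach your cluster
kubectl --context=<KUBERNETES_CONTEXT> get nodes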

We can then register the orchestrator and use it in our active stack:

zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=kubernetes \
    --kubernetes_context=<KUBERNETES_CONTEXT>

# Register and activate a stack with the new orchestrator
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> ... --set

ZenML will build a Docker image called <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME> which includes your code and uses it to run your pipeline steps in Kubernetes. Check out this page if you want to learn more about how ZenML builds these images and how you can customize them.
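
As a minimal sketch of such a customization, you can pass DockerSettings to your pipeline to bake extra Python packages into that image; the requirement listed below is just a placeholder, and the imports assume a ZenML release where pipeline is importable from the top-level zenml package:

from zenml import pipeline
from zenml.config import DockerSettings

# Placeholder requirement; list whatever your steps actually need
docker_settings = DockerSettings(requirements=["scikit-learn"])

@pipeline(settings={"docker": docker_settings})
def my_pipeline():
    ...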

You can now run any ZenML pipeline using the Kubernetes orchestrator:

python file_that_runs_a_zenml_pipeline.py
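
For reference, file_that_runs_a_zenml_pipeline.py could look something like this minimal sketch; the step and pipeline names are placeholders, and the imports assume a recent ZenML release:

from zenml import pipeline, step

@step
def say_hello() -> str:
    # A trivial step, just enough to produce a run on the cluster
    return "Hello from Kubernetes!"

@pipeline
def hello_pipeline():
    say_hello()

if __name__ == "__main__":
    # Runs on whatever orchestrator the active stack is configured with
    hello_pipeline()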

Additional configuration

For additional configuration of the Kubernetes orchestrator, you can pass KubernetesOrchestratorSettings, which allows you to configure (among other things) the following attributes:

  • pod_settings: Node selectors, affinity, and tolerations to apply to the Kubernetes Pods running your pipeline. These can be specified either using the Kubernetes model objects or as dictionaries.

from zenml.integrations.kubernetes.flavors.kubernetes_orchestrator_flavor import KubernetesOrchestratorSettings
from kubernetes.client.models import V1Toleration

# Needed for the @pipeline decorator below (older releases: from zenml.pipelines import pipeline)
from zenml import pipeline


kubernetes_settings = KubernetesOrchestratorSettings(
    pod_settings={
        "affinity": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {
                                    "key": "node.kubernetes.io/name",
                                    "operator": "In",
                                    "values": ["my_powerful_node_group"],
                                }
                            ]
                        }
                    ]
                }
            }
        },
        "tolerations": [
            V1Toleration(
                key="node.kubernetes.io/name",
                operator="Equal",
                value="",
                effect="NoSchedule"
            )
        ]
    }
)

@pipeline(
    settings={
        "orchestrator.kubernetes": kubernetes_settings
    }
)
def my_pipeline():
    ...

Check out the API docs for a full list of available attributes and this docs page for more information on how to specify settings.

A concrete example of using the Kubernetes orchestrator can be found here.

For more information and a full list of configurable attributes of the Kubernetes orchestrator, check out the API Docs.

Enabling CUDA for GPU-backed hardware

Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow the instructions on this page to ensure that it works. This requires some extra settings customization and is essential for enabling CUDA so the GPU can deliver its full acceleration.
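
As a rough sketch of what that customization can look like, you typically point DockerSettings at a CUDA-enabled parent image and request GPU resources via ResourceSettings; the parent image and GPU count below are assumptions you would adapt to your framework, drivers, and cluster:

from zenml import pipeline
from zenml.config import DockerSettings, ResourceSettings

# Assumed CUDA-enabled parent image; pick one matching your framework and drivers
docker_settings = DockerSettings(parent_image="pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime")

@pipeline(
    settings={
        "docker": docker_settings,
        "resources": ResourceSettings(gpu_count=1),
    }
)
def gpu_pipeline():
    ...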
