Kubeflow Orchestrator

How to orchestrate pipelines with Kubeflow

This is an older version of the ZenML documentation. To read the latest version, please see the current ZenML documentation.

The Kubeflow orchestrator is an orchestrator flavor provided with the ZenML kubeflow integration that uses Kubeflow Pipelines to run your pipelines.

When to use it

You should use the Kubeflow orchestrator if:

you're looking for a proven production-grade orchestrator.
you're looking for a UI in which you can track your pipeline runs.
you're already using Kubernetes or are not afraid of setting up and maintaining a Kubernetes cluster.
you're willing to deploy and maintain Kubeflow Pipelines on your cluster.

How to deploy it

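The Kubeflow orchestrator supports two different modes: local and remote. If you want to run the orchestrator on a local Kubernetes cluster running on your machine, no additional infrastructure setup is necessary.

If you want to run your pipelines on a remote cluster instead, you'll need to set up a Kubernetes cluster and deploy Kubeflow Pipelines: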

Have an existing AWS EKS cluster set up.
Make sure you have the AWS CLI set up.
Download and install kubectl and configure it to talk to your EKS cluster using the following command:
```
aws eks --region REGION update-kubeconfig --name CLUSTER_NAME
```
Install Kubeflow Pipelines onto your cluster.

Have an existing GCP GKE cluster set up.
Make sure you have the Google Cloud CLI set up first.
Download and install kubectl and configure it to talk to your GKE cluster using the following command:
```
gcloud container clusters get-credentials CLUSTER_NAME
```
Install Kubeflow Pipelines onto your cluster.

Have an existing AKS cluster set up.
Make sure you have the az CLI set up first.
Download and install kubectl and configure it to talk to your AKS cluster using the following command:
```
az aks get-credentials --resource-group RESOURCE_GROUP --name CLUSTER_NAME
```
Install Kubeflow Pipelines onto your cluster.
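
The provider-specific kubeconfig commands above differ only in the CLI and its arguments. As a rough sketch, the hypothetical helper below (`kubeconfig_command` is not part of any CLI; `REGION`, `CLUSTER_NAME`, and `RESOURCE_GROUP` are the same placeholders used above, read here from environment variables) prints the matching command so it can be reviewed before it is run:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not run) the kubeconfig command
# for the given provider. Unset variables fall back to the placeholder text.
kubeconfig_command() {
  local provider="$1"
  local region="${REGION:-REGION}"
  local cluster="${CLUSTER_NAME:-CLUSTER_NAME}"
  local group="${RESOURCE_GROUP:-RESOURCE_GROUP}"

  case "$provider" in
    aws)   echo "aws eks --region ${region} update-kubeconfig --name ${cluster}" ;;
    gcp)   echo "gcloud container clusters get-credentials ${cluster}" ;;
    azure) echo "az aks get-credentials --resource-group ${group} --name ${cluster}" ;;
    *)     echo "unknown provider: ${provider}" >&2; return 1 ;;
  esac
}
```

For example, `CLUSTER_NAME=demo kubeconfig_command gcp` prints the GKE command with `demo` filled in.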

Since Kubernetes v1.19, AKS has shifted to containerd as its container runtime. However, the workflow controller installed with the Kubeflow installation has Docker set as the default runtime. To make your pipelines work, you have to change the value to one of the other available options, preferably k8sapi.
This change has to be made by editing the containerRuntimeExecutor property of the ConfigMap corresponding to the workflow controller. Run the following commands to first find out which ConfigMap to change and then to edit it to reflect your new value.
```
kubectl get configmap -n kubeflow
kubectl edit configmap CONFIGMAP_NAME -n kubeflow
# This opens up an editor that can be used to make the change.
```
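
Picking the right ConfigMap out of the full list can be tedious. The sketch below narrows the list down, assuming the workflow controller's ConfigMap has `workflow-controller` in its name (an assumption about your installation, not something stated above). `filter_workflow_configmaps` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads ConfigMap names on stdin (as printed by
# `kubectl get configmap -n kubeflow -o name`) and prints the ones that
# look like the workflow controller's ConfigMap, without the type prefix.
filter_workflow_configmaps() {
  # Assumption: the ConfigMap name contains "workflow-controller".
  sed 's|^configmap/||' | grep -- 'workflow-controller' || true
}
```

Typical usage would be `kubectl get configmap -n kubeflow -o name | filter_workflow_configmaps`, with the printed name then passed to `kubectl edit configmap`.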

If one or more of the deployments are not in the Running state, try increasing the number of nodes in your cluster.
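
To see which workloads are stuck, one option is to filter the pod list by status. This is a minimal sketch that assumes Kubeflow Pipelines runs in the `kubeflow` namespace and that the status is the third column of `kubectl get pods` output; `not_running_pods` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints the names of pods that are neither Running nor Completed.
not_running_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

Typical usage would be `kubectl get pods -n kubeflow --no-headers | not_running_pods`. An empty result means every pod has started.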

If you're installing Kubeflow Pipelines manually, make sure the Kubernetes service is called exactly ml-pipeline. This is a requirement for ZenML to connect to your Kubeflow Pipelines deployment.
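
Because the name has to match exactly, a whole-line comparison is safer than a substring search (a service such as `ml-pipeline-ui` would otherwise match too). The following sketch assumes the service lives in the `kubeflow` namespace; `has_ml_pipeline_service` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads service names on stdin (as printed by
# `kubectl get services -n kubeflow -o name`) and succeeds only if a
# service named exactly "ml-pipeline" is present.
has_ml_pipeline_service() {
  grep -qx 'service/ml-pipeline'
}
```

Typical usage would be `kubectl get services -n kubeflow -o name | has_ml_pipeline_service && echo "ml-pipeline service found"`.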

How to use it

To use the Kubeflow orchestrator, we need:

The ZenML kubeflow integration installed. If you haven't done so, run
```
zenml integration install kubeflow
```
Docker installed and running.
kubectl installed.
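
Before going further, it can help to confirm that the required command-line tools are on your `PATH`. The helper below is a small sketch; `missing_tools` is a hypothetical name, and the tool list passed to it (for example `zenml`, `docker`, `kubectl`) is an assumption you should adapt to your setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the name of every given tool that cannot be
# found on PATH. Prints nothing if all tools are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Typical usage would be `missing_tools zenml docker kubectl`. Note that this only checks that the binaries exist; to confirm that the Docker daemon is actually running, `docker info` exits with a non-zero status when the daemon cannot be reached.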

When using the Kubeflow orchestrator locally, you'll additionally need:

K3D installed to spin up a local Kubernetes cluster.
A local container registry as part of your stack.

The local Kubeflow Pipelines deployment requires more than 2 GB of RAM, so if you're using Docker Desktop make sure to update the resource limits in the preferences.
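
The memory available to Docker can be checked from the command line instead of the preferences dialog. This sketch compares a byte count against the 2 GB threshold mentioned above (interpreted here as 2 GiB, which is an assumption); `has_enough_memory` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given number of bytes is strictly
# greater than 2 GiB.
has_enough_memory() {
  local mem_bytes="$1"
  local required=$((2 * 1024 * 1024 * 1024))
  [ "$mem_bytes" -gt "$required" ]
}
```

Typical usage would be `has_enough_memory "$(docker info --format '{{.MemTotal}}')" || echo "Increase the Docker memory limit"`, where `MemTotal` is the total memory, in bytes, that Docker reports as available.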

We can then register the orchestrator and use it in our active stack:

```
zenml orchestrator register <NAME> \
    --flavor=kubeflow

# Add the orchestrator to the active stack
zenml stack update -o <NAME>
```

When using the Kubeflow orchestrator with a remote cluster, you'll additionally need:

Kubeflow Pipelines deployed on a remote cluster. See the deployment section above for more information.
The name of your Kubernetes context which points to your remote cluster. Run kubectl config get-contexts to see a list of available contexts.
A remote artifact store as part of your stack.
A remote metadata store as part of your stack. Kubeflow Pipelines already comes with its own MySQL database that is deployed in your Kubernetes cluster. If you want to use this database as your metadata store to get started quickly, check out the corresponding documentation page. For a more production-ready setup we suggest using a MySQL metadata store instead.
A remote container registry as part of your stack.
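
A mistyped context name is an easy mistake to make, so it may be worth validating the name before registering the orchestrator. The sketch below does an exact, whole-line match against a list of context names; `context_exists` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads context names on stdin (as printed by
# `kubectl config get-contexts -o name`) and succeeds only if the given
# name matches one of them exactly.
context_exists() {
  local wanted="$1"
  grep -Fxq -- "$wanted"
}
```

Typical usage would be `kubectl config get-contexts -o name | context_exists <KUBERNETES_CONTEXT> || echo "No such context"`.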

We can then register the orchestrator and use it in our active stack:

```
zenml orchestrator register <NAME> \
    --flavor=kubeflow \
    --kubernetes_context=<KUBERNETES_CONTEXT>

# Add the orchestrator to the active stack
zenml stack update -o <NAME>
```

ZenML will build a Docker image called zenml-kubeflow which includes your code and use it to run your pipeline steps in Kubeflow. Check out the ZenML documentation on Docker images if you want to learn more about how ZenML builds these images and how you can customize them.

If you decide you need the full flexibility of having a custom base image, you can update your existing orchestrator:

```
zenml orchestrator update <NAME> \
    --custom_docker_base_image_name=<IMAGE_NAME>
```

or set it when registering a new Kubeflow orchestrator:

```
zenml orchestrator register <NAME> \
    --flavor=kubeflow \
    --custom_docker_base_image_name=<IMAGE_NAME>
```

Once the orchestrator is part of the active stack, we need to run zenml stack up before running any pipelines. This command:

forwards a port so you can view the Kubeflow UI in your browser.
(in the local case) uses K3D to provision a Kubernetes cluster on your machine and deploys Kubeflow Pipelines on it.

You can now run any ZenML pipeline using the Kubeflow orchestrator:

```
python file_that_runs_a_zenml_pipeline.py
```
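
The last two steps (starting the stack, then running the pipeline) can be wrapped in a small script. This is only a sketch: `run` and `start_and_run` are hypothetical helper names, and the `DRY_RUN` variable is an assumption of this example (it defaults to printing the commands rather than executing them).

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the command when DRY_RUN is 1 (the default),
# otherwise executes it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: starts the active stack, then runs the pipeline file.
# Stops at the first failing step.
start_and_run() {
  local pipeline_file="$1"
  run zenml stack up || return 1
  run python "$pipeline_file"
}
```

Calling `start_and_run file_that_runs_a_zenml_pipeline.py` prints both commands for review; setting `DRY_RUN=0` executes them.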

A concrete example of using the Kubeflow orchestrator can be found in the ZenML examples.

For more information and a full list of configurable attributes of the Kubeflow orchestrator, check out the API docs.


Last updated 7 months ago