Kubeflow Orchestrator

How to orchestrate pipelines with Kubeflow

This is an older version of the ZenML documentation. To read and view the latest version please visit this up-to-date URL.

The Kubeflow orchestrator is an orchestrator flavor provided with the ZenML kubeflow integration that uses Kubeflow Pipelines to run your pipelines.

When to use it

You should use the Kubeflow orchestrator if:

  • you're looking for a proven production-grade orchestrator.

  • you're looking for a UI in which you can track your pipeline runs.

  • you're already using Kubernetes or are not afraid of setting up and maintaining a Kubernetes cluster.

  • you're willing to deploy and maintain Kubeflow Pipelines on your cluster.

How to deploy it

The Kubeflow orchestrator supports two modes: local and remote. If you want to run the orchestrator on a local Kubernetes cluster on your own machine, no additional infrastructure setup is necessary.

If you want to run your pipelines on a remote cluster instead, you'll need to set up a Kubernetes cluster and deploy Kubeflow Pipelines:

  • Have an existing AWS EKS cluster set up.

  • Make sure you have the AWS CLI set up.

  • Download and install kubectl and configure it to talk to your EKS cluster using the following command:

    aws eks --region REGION update-kubeconfig --name CLUSTER_NAME

  • Install Kubeflow Pipelines onto your cluster.

If one or more of the deployments are not in the Running state, try increasing the number of nodes in your cluster.
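To see which pods are holding a deployment back, one option is to inspect the JSON that kubectl prints. The following is a minimal sketch, not part of ZenML: the helper names are hypothetical, and the `kubeflow` namespace is an assumption about where Kubeflow Pipelines was installed.

```python
import json
import subprocess
from typing import List


def pods_not_running(pod_list: dict) -> List[str]:
    """Return names of pods whose phase is neither Running nor Succeeded."""
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") not in ("Running", "Succeeded")
    ]


def fetch_pods(namespace: str = "kubeflow") -> dict:
    """Ask kubectl for all pods in a namespace (assumed: 'kubeflow') as parsed JSON."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Usage against a live cluster (requires kubectl to be configured):
#     for name in pods_not_running(fetch_pods()):
#         print("not running yet:", name)
```

If the list stays non-empty for a long time, the cluster is probably short on capacity, which is when adding nodes helps.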

If you're installing Kubeflow Pipelines manually, make sure the Kubernetes service is called exactly ml-pipeline. This is a requirement for ZenML to connect to your Kubeflow Pipelines deployment.
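The name requirement can be checked against the output of `kubectl get services -n <namespace> -o json`. The sketch below is only an illustration with a hypothetical helper name; it takes the already parsed JSON so that it does not depend on a particular namespace.

```python
def has_ml_pipeline_service(service_list: dict) -> bool:
    """Check that a service named exactly 'ml-pipeline' is in the kubectl JSON output."""
    names = {
        item["metadata"]["name"] for item in service_list.get("items", [])
    }
    # An exact match is required; 'ml-pipeline-ui' alone does not count.
    return "ml-pipeline" in names
```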

How to use it

To use the Kubeflow orchestrator, we need:

  • The ZenML kubeflow integration installed. If you haven't done so, run

    zenml integration install kubeflow

  • Docker installed and running.

  • kubectl installed.

When using the Kubeflow orchestrator locally, you'll additionally need K3D installed, which is used to spin up the local Kubernetes cluster.

The local Kubeflow Pipelines deployment requires more than 2 GB of RAM, so if you're using Docker Desktop make sure to update the resource limits in the preferences.

We can then register the orchestrator and use it in our active stack:

zenml orchestrator register <NAME> \
    --flavor=kubeflow

# Add the orchestrator to the active stack
zenml stack update -o <NAME>

ZenML will build a Docker image called <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME> that includes your code, and will use it to run your pipeline steps in Kubeflow. Check out this page if you want to learn more about how ZenML builds these images and how you can customize them.
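To make the naming pattern concrete, here is a small illustration of how the two placeholders combine. The function is hypothetical and only mirrors the pattern stated above; it is not the code ZenML uses internally.

```python
def zenml_image_name(container_registry_uri: str, pipeline_name: str) -> str:
    """Compose <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME> from its two parts."""
    # Drop a trailing slash so the result never contains '//'.
    return f"{container_registry_uri.rstrip('/')}/zenml:{pipeline_name}"
```

Knowing the expected name makes it easier to find the image in your container registry when debugging a failed step.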

Once the orchestrator is part of the active stack, we need to run zenml stack up before running any pipelines. This command:

  • forwards a port so you can view the Kubeflow UI in your browser.

  • (in the local case) uses K3D to provision a Kubernetes cluster on your machine and deploys Kubeflow Pipelines on it.

You can now run any ZenML pipeline using the Kubeflow orchestrator:

python file_that_runs_a_zenml_pipeline.py

A concrete example of using the Kubeflow orchestrator can be found here.

For more information and a full list of configurable attributes of the Kubeflow orchestrator, check out the API Docs.

Important Note for Multi-Tenancy Deployments

Kubeflow has a notion of multi-tenancy built into its deployment. Kubeflow’s multi-user isolation simplifies user operations because each user only views and edits the Kubeflow components and model artifacts defined in their configuration.

Currently, the default ZenML Kubeflow orchestrator yields the following error when running a pipeline against a multi-tenant Kubeflow deployment:

HTTP response body: {"error":"Invalid input error: Invalid resource references for experiment. ListExperiment requires filtering by namespace.","code":3,"message":"Invalid input error: Invalid resource references for experiment. ListExperiment requires filtering by namespace.","details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid resource references for experiment. ListExperiment requires filtering by namespace.","error_details":"Invalid input error: Invalid resource references for experiment. ListExperiment requires filtering by namespace."}]}
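If you want to tell this failure apart from other API errors in your own tooling, the response body can be parsed as JSON. This is a minimal sketch with a hypothetical helper name; it relies only on the `message` field visible in the error above.

```python
import json


def is_namespace_filter_error(response_body: str) -> bool:
    """Return True if the body is the multi-tenancy 'filtering by namespace' error."""
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        return False
    message = payload.get("message", "") if isinstance(payload, dict) else ""
    return "ListExperiment requires filtering by namespace" in message
```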

The current workaround is as follows. Please place the following code at the top of your runner script (commonly called run.py):

import json
import os

import kfp
import requests
from kubernetes import client as k8s_client

# Assumed import paths for the two ZenML classes patched below;
# adjust them if they differ in your ZenML version.
from zenml.integrations.kubeflow.orchestrators.kubeflow_entrypoint_configuration import (
    KubeflowEntrypointConfiguration,
)
from zenml.integrations.kubeflow.orchestrators.kubeflow_orchestrator import (
    KubeflowOrchestrator,
)

NAMESPACE = "namespace_name"  # set this
USERNAME = "foo"  # set this
PASSWORD = "bar"  # set this
HOST = "https://qux.com"  # set this
# open() does not expand "~", so expand it here. Set this manually if you'd like.
KFP_CONFIG = os.path.expanduser("~/.config/kfp/context.json")


def get_kfp_token(username: str, password: str) -> str:
    """Get token for kubeflow authentication."""
    session = requests.Session()
    # response.url is the login page that the initial request ends up on
    response = session.get(HOST)
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
    }
    data = {"login": username, "password": password}
    session.post(response.url, headers=headers, data=data)
    session_cookie = session.cookies.get_dict()["authservice_session"]
    return session_cookie


token = get_kfp_token(USERNAME, PASSWORD)
cookies = "authservice_session=" + token

# 1: Set user namespace globally
kfp.Client(host=HOST, cookies=cookies).set_user_namespace(NAMESPACE)

# 2: Set cookie globally in the kfp config file
with open(KFP_CONFIG, "r") as f:
    data = json.load(f)
data["client_authentication_cookie"] = cookies

# Opening in "w" mode truncates the file, so no separate removal is needed
with open(KFP_CONFIG, "w") as f:
    json.dump(data, f)

# 3: Expose the Kubeflow run name to every step as an environment variable
original = KubeflowOrchestrator._configure_container_op


def patch_container_op(container_op):
    original(container_op)
    container_op.container.add_env_variable(
        k8s_client.V1EnvVar(
            name="ZENML_RUN_NAME",
            value="{{workflow.annotations.pipelines.kubeflow.org/run_name}}",
        )
    )


KubeflowOrchestrator._configure_container_op = staticmethod(patch_container_op)


# 4: Make the entrypoint read the run name from that environment variable
def patch_get_run_name(self, pipeline_name):
    return os.getenv("ZENML_RUN_NAME")


KubeflowEntrypointConfiguration.get_run_name = patch_get_run_name

# Continue with your normal pipeline runner code..
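Step 2 of the script assumes that the kfp config file already exists. A more forgiving variant could look like the sketch below, which expands `~`, tolerates a missing file, and creates the parent directory. The helper name is hypothetical; the `client_authentication_cookie` key is the one used in the script.

```python
import json
import os


def set_auth_cookie(config_path: str, cookie: str) -> dict:
    """Write the cookie into the kfp config file, keeping any other settings."""
    path = os.path.expanduser(config_path)
    data = {}
    if os.path.exists(path):
        with open(path, "r") as f:
            data = json.load(f)
    data["client_authentication_cookie"] = cookie
    # Create the parent directory if this is the first time the file is written.
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w") as f:
        json.dump(data, f)
    return data
```

It would be called as `set_auth_cookie("~/.config/kfp/context.json", cookies)` in place of the two `with` blocks.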

Note that the HOST used in the workaround script should also be supplied when registering the orchestrator, via the kubeflow_hostname parameter:

export HOST=https://qux.com
# The /pipeline suffix is important!
zenml orchestrator register multi_tenant_kf --flavor=kubeflow \
   --kubeflow_hostname=${HOST}/pipeline \
   --other_params..
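Because a missing or doubled `/pipeline` suffix is easy to overlook, the value can also be derived programmatically. This is a small sketch with a hypothetical helper name; it only normalizes the string and does not contact the host.

```python
def kubeflow_hostname(host: str) -> str:
    """Return the host with exactly one trailing '/pipeline' path segment."""
    base = host.rstrip("/")
    return base if base.endswith("/pipeline") else base + "/pipeline"
```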

Note also that this workaround has not been tested on all Kubeflow versions, so older Kubeflow versions may surface further bugs. If you run into any, please reach out to us on Slack.

In future ZenML versions, multi-tenancy will be natively supported. See this Slack thread for more details on how the above workaround came about.

Note that the sole purpose of the workaround is to initialize the kfp.Client() class used in the standard orchestrator logic. This code can be seen here.

You can simply override this logic and add your custom authentication scheme if needed. Read here for more details on how to create a custom orchestrator.
