Kubeflow

Run your ML pipelines on Kubeflow Pipelines.

The ZenML Kubeflow Orchestrator allows you to run your ML pipelines on Kubeflow Pipelines without writing Kubeflow code.

Prerequisites

To use the Kubeflow Orchestrator, you'll need:

  • ZenML kubeflow integration installed (zenml integration install kubeflow)

  • Docker installed and running

  • kubectl installed (optional, see below)

  • A Kubernetes cluster with Kubeflow Pipelines installed (see deployment guide for your cloud provider)

  • A remote artifact store and container registry in your ZenML stack

  • A remote ZenML server deployed to the cloud

  • The name of your Kubernetes context pointing to the remote cluster (optional, see below)

Configuring the Orchestrator

There are two ways to configure the orchestrator:

  1. Using a Service Connector to connect to the remote cluster (recommended for cloud-managed clusters). No local kubectl context needed.

zenml orchestrator register <ORCHESTRATOR_NAME> --flavor kubeflow
zenml service-connector list-resources --resource-type kubernetes-cluster -e  
zenml orchestrator connect <ORCHESTRATOR_NAME> --connector <CONNECTOR_NAME>
zenml stack update -o <ORCHESTRATOR_NAME>
  1. Configuring kubectl with a context pointing to the remote cluster and setting kubernetes_context in the orchestrator config:

zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=kubeflow \
    --kubernetes_context=<KUBERNETES_CONTEXT>
    
zenml stack update -o <ORCHESTRATOR_NAME>

Running a Pipeline

Once configured, you can run any ZenML pipeline using the Kubeflow Orchestrator:

python your_pipeline.py

This will create a Kubernetes pod for each step in your pipeline. You can view pipeline runs in the Kubeflow UI.

Additional Configuration

You can further configure the orchestrator using KubeflowOrchestratorSettings:

from zenml.integrations.kubeflow.flavors.kubeflow_orchestrator_flavor import KubeflowOrchestratorSettings

kubeflow_settings = KubeflowOrchestratorSettings(
   client_args={},  
   user_namespace="my_namespace",
   pod_settings={
       "affinity": {...},
       "tolerations": [...]
   }
)

@pipeline(
   settings={
       "orchestrator.kubeflow": kubeflow_settings
   }
)

This allows specifying client arguments, user namespace, pod affinity/tolerations, and more.

Multi-Tenancy Deployments

For multi-tenant Kubeflow deployments, specify the kubeflow_hostname ending in /pipeline when registering the orchestrator:

zenml orchestrator register <NAME> \
   --flavor=kubeflow \
   --kubeflow_hostname=<KUBEFLOW_HOSTNAME> # e.g. https://mykubeflow.example.com/pipeline

And provide the namespace, username and password in the orchestrator settings:

kubeflow_settings = KubeflowOrchestratorSettings(
   client_username="admin",
   client_password="abc123", 
   user_namespace="namespace_name"
)

@pipeline(
   settings={
       "orchestrator.kubeflow": kubeflow_settings
   }
)

For more advanced options and details, refer to the full Kubeflow Orchestrator documentation.

Last updated