Orchestrators

Orchestrating the execution of ML pipelines.

The orchestrator is an essential component in any MLOps stack as it is responsible for running your machine learning pipelines. To do so, the orchestrator provides an environment that is set up to execute the steps of your pipeline. It also makes sure that the steps of your pipeline only get executed once all their inputs (which are outputs of previous steps of your pipeline) are available.
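
The ordering guarantee described above can be illustrated with a small, ZenML-independent sketch. It is not how any ZenML orchestrator is implemented; it only shows the idea of running a step after all of its upstream steps have finished. The step names and the `execution_order` helper are hypothetical, and the example assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of upstream steps
# whose outputs it consumes. These names are made up for illustration.
step_inputs = {
    "load_data": set(),
    "preprocess": {"load_data"},
    "train": {"preprocess"},
    "evaluate": {"train", "preprocess"},
}


def execution_order(dependencies):
    """Return an order in which every step comes after its upstream steps."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(step_inputs)
print(order)
```

A real orchestrator additionally provisions the execution environment for each step and may run independent steps in parallel; the sketch covers only the dependency ordering.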

Many of ZenML's remote orchestrators build Docker images in order to transport and execute your pipeline code. If you want to learn more about how Docker images are built by ZenML, check out this guide.

When to use it

The orchestrator is a mandatory component in the ZenML stack. It is used to run your pipelines, and you are required to configure it in all of your stacks.

Orchestrator Flavors

Out of the box, ZenML comes with a local orchestrator that is already part of the default stack and runs pipelines locally. Additional orchestrators are provided by integrations:

Orchestrator	Flavor	Integration	Notes
LocalOrchestrator	`local`	built-in	Runs your pipelines locally.
LocalDockerOrchestrator	`local_docker`	built-in	Runs your pipelines locally using Docker.
KubernetesOrchestrator	`kubernetes`	`kubernetes`	Runs your pipelines in Kubernetes clusters.
KubeflowOrchestrator	`kubeflow`	`kubeflow`	Runs your pipelines using Kubeflow.
VertexOrchestrator	`vertex`	`gcp`	Runs your pipelines in Vertex AI.
SagemakerOrchestrator	`sagemaker`	`aws`	Runs your pipelines in SageMaker.
TektonOrchestrator	`tekton`	`tekton`	Runs your pipelines using Tekton.
AirflowOrchestrator	`airflow`	`airflow`	Runs your pipelines using Airflow.
SkypilotAWSOrchestrator	`vm_aws`	`skypilot[aws]`	Runs your pipelines in AWS VMs using SkyPilot.
SkypilotGCPOrchestrator	`vm_gcp`	`skypilot[gcp]`	Runs your pipelines in GCP VMs using SkyPilot.
SkypilotAzureOrchestrator	`vm_azure`	`skypilot[azure]`	Runs your pipelines in Azure VMs using SkyPilot.
HyperAIOrchestrator	`hyperai`	`hyperai`	Runs your pipelines in HyperAI.ai instances.
Custom Implementation	custom		Extend the orchestrator abstraction and provide your own implementation.

To list the available orchestrator flavors, run:

zenml orchestrator flavor list

How to use it

You don't need to directly interact with any ZenML orchestrator in your code. As long as the orchestrator that you want to use is part of your active ZenML stack, using the orchestrator is as simple as executing a Python file that runs a ZenML pipeline:

python file_that_runs_a_zenml_pipeline.py

Inspecting Runs in the Orchestrator UI

If your orchestrator comes with a separate user interface (for example Kubeflow, Airflow, Vertex), you can get the URL to the orchestrator UI of a specific pipeline run using the following code snippet:

from zenml.client import Client

pipeline_run = Client().get_pipeline_run("<PIPELINE_RUN_NAME>")
orchestrator_url = pipeline_run.run_metadata["orchestrator_url"].value
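
Since the URL is only relevant for orchestrators that have their own UI, a defensive lookup can be useful. The sketch below is a hypothetical helper, not part of the ZenML API. It assumes that `run_metadata` behaves like a mapping, that the `"orchestrator_url"` key may be missing for orchestrators without a UI, and that the entry may either wrap the URL in a `.value` attribute (as in the snippet above) or be the plain string. The demonstration uses a stand-in object and an example URL instead of a real pipeline run.

```python
from types import SimpleNamespace
from typing import Any, Mapping, Optional


def get_orchestrator_url(run_metadata: Mapping[str, Any]) -> Optional[str]:
    """Return the orchestrator UI URL, or None if the run has none recorded.

    Hypothetical helper: assumes the key is simply absent when the
    orchestrator does not expose a UI.
    """
    entry = run_metadata.get("orchestrator_url")
    if entry is None:
        return None
    # Accept both a wrapped value (entry.value) and a plain string.
    return getattr(entry, "value", entry)


# Stand-in metadata for illustration only; in practice you would pass
# pipeline_run.run_metadata from the snippet above.
with_ui = {"orchestrator_url": SimpleNamespace(value="https://example.com/runs/123")}
without_ui = {}

print(get_orchestrator_url(with_ui))
print(get_orchestrator_url(without_ui))
```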

Specifying per-step resources

If your steps need to run on specific hardware, you can specify the required resources on each step as described here.

If your orchestrator of choice or the underlying hardware doesn't support this, you can also take a look at step operators.
