Google Cloud VertexAI Orchestrator
How to orchestrate pipelines with Vertex AI
The Vertex orchestrator is an orchestrator flavor provided with the ZenML gcp integration that uses Vertex AI to run your pipelines.
This component is only meant to be used within the context of a remote ZenML deployment scenario. Usage with a local ZenML deployment may lead to unexpected behavior!
You should use the Vertex orchestrator if:
- you're already using GCP.
- you're looking for a proven production-grade orchestrator.
- you're looking for a UI in which you can track your pipeline runs.
- you're looking for a managed solution for running your pipelines.
- you're looking for a serverless solution for running your pipelines.
To use the Vertex AI orchestrator, you first need to deploy ZenML to the cloud. We recommend deploying ZenML in the same Google Cloud project as the Vertex infrastructure, but this is not required. You must ensure that you are connected to the remote ZenML server before using this stack component.
The only other requirement for using the ZenML Vertex orchestrator is enabling the Vertex-relevant APIs on the Google Cloud project.
To quickly enable the APIs and create the other resources needed to use this integration, you can also consider using the Vertex AI stack recipe, which helps you set up the infrastructure with one click.
To use the Vertex orchestrator, we need:
- The ZenML gcp integration installed. If you haven't done so, run: zenml integration install gcp
- The GCP project ID and location in which you want to run your Vertex AI pipelines.
- The pipeline client environment needs permissions to create a job in Vertex Pipelines, e.g. the Vertex AI User role: https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.user
- To run on a schedule, the client environment also needs permissions to create a Google Cloud Function (e.g. with the cloudfunctions.serviceAgent Role) and to create a Google Cloud Scheduler (e.g. with the Cloud Scheduler Job Runner Role). Additionally, it needs the Storage Object Creator Role to be able to write the pipeline JSON file to the artifact store directly.
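One way to grant the roles listed above is gcloud projects add-iam-policy-binding. The following is a minimal sketch that only assembles the commands and does not execute them. The role IDs are an assumed mapping from the role names mentioned above, and the project ID and member are placeholders, so check them against your own IAM setup before running anything.

```python
import shlex

# Role IDs assumed to correspond to the role names listed above.
BASE_ROLES = ["roles/aiplatform.user"]
SCHEDULING_ROLES = [
    "roles/cloudfunctions.serviceAgent",
    "roles/cloudscheduler.jobRunner",
    "roles/storage.objectCreator",
]


def iam_binding_commands(project_id, member, with_scheduling=False):
    """Build one gcloud command per role; nothing is executed here."""
    roles = BASE_ROLES + (SCHEDULING_ROLES if with_scheduling else [])
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}", f"--role={role}",
        ]
        for role in roles
    ]


# Placeholder values; replace them with your own project and principal.
for cmd in iam_binding_commands(
    "<PROJECT_ID>", "serviceAccount:<SERVICE_ACCOUNT_EMAIL>", with_scheduling=True
):
    print(shlex.join(cmd))
```

The scheduling roles are kept separate so that they are only granted when you actually plan to run pipelines on a schedule.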
We can then register the orchestrator and use it in our active stack:
zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=vertex \
    --project=<PROJECT_ID> \
    --location=<GCP_LOCATION>
# Register and activate a stack with the new orchestrator
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> ... --set
ZenML will build a Docker image called <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME> which includes your code and use it to run your pipeline steps in Vertex AI. Check out this page if you want to learn more about how ZenML builds these images and how you can customize them.
You can now run any ZenML pipeline using the Vertex orchestrator by executing the Python script that runs your pipeline.
Vertex comes with its own UI that you can use to find further details about your pipeline runs, such as the logs of your steps. For any runs executed on Vertex, you can get the URL to the Vertex UI in Python using the following code snippet:
from zenml.post_execution import get_run
# Fetch the run by name, then read the Vertex UI URL from the run's metadata
pipeline_run = get_run("<PIPELINE_RUN_NAME>")
orchestrator_url = pipeline_run.metadata["orchestrator_url"].value
The Vertex Pipelines orchestrator supports running pipelines on a schedule, using logic resembling the official approach recommended by GCP.
ZenML utilizes the Cloud Scheduler and Cloud Functions services to enable scheduling on Vertex Pipelines. The following is the sequence of events that happen when running a pipeline on Vertex with a schedule:
- A Cloud Function is created that creates the Vertex Pipeline job when triggered.
- A Cloud Scheduler job is created that triggers the Cloud Function on the defined schedule.
Therefore, to run on a schedule, the client environment needs permissions to create a Google Cloud Function (e.g. with the cloudfunctions.serviceAgent Role) and to create a Google Cloud Scheduler (e.g. with the Cloud Scheduler Job Runner Role). Additionally, it needs the Storage Object Creator Role to be able to write the pipeline JSON file to the artifact store directly.
Once you have these permissions set in your local GCP CLI, here is how to create a scheduled Vertex pipeline in ZenML:
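from zenml.config.schedule import Schedule
# Run a pipeline every 5th minute
schedule = Schedule(cron_expression="*/5 * * * *")
# Pass the schedule when you run your pipeline, e.g.
# pipeline_instance.run(schedule=schedule), where pipeline_instance
# is a placeholder for your own pipeline.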
The Vertex orchestrator only supports the cron_expression parameter in the Schedule object, and will ignore all other parameters supplied to define the schedule.
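To make the cron expression above concrete, here is a small standard-library sketch that expands the minute field of a five-field cron expression. It handles only *, */N, and a single plain number, which is a simplification made for illustration. It is not how ZenML or Cloud Scheduler parse schedules.

```python
def expand_minute_field(cron_expression):
    """Return the minutes (0-59) on which the minute field would fire."""
    fields = cron_expression.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated cron fields")
    minute_field = fields[0]
    if minute_field == "*":
        # Every minute of the hour.
        return list(range(60))
    if minute_field.startswith("*/"):
        # Step syntax: every N-th minute, starting at minute 0.
        step = int(minute_field[2:])
        if step <= 0:
            raise ValueError("step must be a positive integer")
        return list(range(0, 60, step))
    # A single literal minute.
    return [int(minute_field)]


print(expand_minute_field("*/5 * * * *"))
# → [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
```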
How to delete a scheduled pipeline
Note that ZenML only gets involved to schedule a run, but maintaining the lifecycle of the schedule is the responsibility of the user.
In order to cancel a scheduled Vertex pipeline, you need to manually delete the generated Google Cloud Function, along with the Cloud Scheduler job that schedules it (via the UI or the CLI).
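A minimal sketch of the CLI cleanup described above, assuming you have already looked up the names of the generated Cloud Function and Cloud Scheduler job (for example in the Google Cloud console). The function name, job name, and region are hypothetical placeholders, and the snippet only builds the gcloud commands without running them.

```python
import shlex


def cleanup_commands(function_name, scheduler_job_name, region):
    """Build the gcloud commands that remove a schedule; nothing is executed."""
    return [
        # Delete the scheduler job first so it stops triggering the function.
        ["gcloud", "scheduler", "jobs", "delete", scheduler_job_name,
         f"--location={region}"],
        ["gcloud", "functions", "delete", function_name,
         f"--region={region}"],
    ]


# Placeholder names; replace them with the resources created for your schedule.
for cmd in cleanup_commands("<FUNCTION_NAME>", "<SCHEDULER_JOB_NAME>", "<REGION>"):
    print(shlex.join(cmd))
```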
For additional configuration of the Vertex orchestrator, you can pass VertexOrchestratorSettings, which allows you to configure (among others) the following attributes:
pod_settings: Node selectors, affinity and tolerations to apply to the Kubernetes Pods running your pipeline. These can be either specified using the Kubernetes model objects or as dictionaries.
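from zenml.integrations.gcp.flavors.vertex_orchestrator_flavor import VertexOrchestratorSettings
from kubernetes.client.models import V1Toleration
vertex_settings = VertexOrchestratorSettings(
    # Illustrative toleration; replace the placeholder key with your own.
    pod_settings={"tolerations": [V1Toleration(key="<TOLERATION_KEY>", operator="Exists", effect="NoSchedule")]}
)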
Check out the API docs for a full list of available attributes and this docs page for more information on how to specify settings.
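Since pod_settings can also be given as plain dictionaries, a configuration can be written without the Kubernetes client models. The following is a sketch under assumptions: the node selector label and toleration key are made-up placeholders, and the node_selectors and tolerations keys are assumed to match the attributes described above.

```python
# Dictionary form of pod settings; all labels and keys are placeholders.
pod_settings = {
    "node_selectors": {"<NODE_LABEL_KEY>": "<NODE_LABEL_VALUE>"},
    "tolerations": [
        {"key": "<TOLERATION_KEY>", "operator": "Exists", "effect": "NoSchedule"}
    ],
}

# The dictionary would then be passed to the settings object, e.g.
# VertexOrchestratorSettings(pod_settings=pod_settings).
print(sorted(pod_settings))
```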
For more information and a full list of configurable attributes of the Vertex orchestrator, check out the API Docs.
Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow the instructions on this page to ensure that it works. This requires some extra settings customization, which is essential for enabling CUDA so that the GPU can deliver its full acceleration.