AzureML Orchestrator
Orchestrating your pipelines to run on AzureML.
AzureML is a cloud-based orchestration service provided by Microsoft that enables data scientists, machine learning engineers, and developers to build, train, deploy, and manage machine learning models. It offers a comprehensive, integrated environment that supports the entire machine learning lifecycle, from data preparation and model development to deployment and monitoring.
When to use it
You should use the AzureML orchestrator if:
you're already using Azure.
you're looking for a proven production-grade orchestrator.
you're looking for a UI in which you can track your pipeline runs.
you're looking for a managed solution for running your pipelines.
How it works
The ZenML AzureML orchestrator implementation uses the AzureML Python SDK v2 to build and run your machine learning pipelines. For each ZenML step, it creates an AzureML CommandComponent, and these components are then assembled into an AzureML pipeline.
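For intuition, here is a rough, simplified sketch of the underlying AzureML SDK v2 pattern the orchestrator builds on. This is not the orchestrator's actual code: the step names, commands, image reference, and workspace details are placeholders.

```python
# Simplified sketch of the azure-ai-ml (SDK v2) pattern the orchestrator builds on.
# Not the orchestrator's actual code: step names, commands, the image reference,
# and workspace details below are placeholders.
from azure.ai.ml import MLClient, command, dsl
from azure.ai.ml.entities import Environment
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<AZUREML_WORKSPACE>",
)

# The Docker image that ZenML builds for the pipeline run.
environment = Environment(image="<CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME>")

# One command component per ZenML step, all backed by the same image.
load_data = command(name="load_data", command="python <step_entrypoint>", environment=environment)
train_model = command(name="train_model", command="python <step_entrypoint>", environment=environment)

@dsl.pipeline(name="my_zenml_pipeline")
def my_zenml_pipeline():
    # In the real implementation, step ordering follows the ZenML DAG.
    load_data()
    train_model()

# Submit the assembled pipeline job to the AzureML workspace.
ml_client.jobs.create_or_update(my_zenml_pipeline())
```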
How to deploy it
Would you like to skip ahead and deploy a full ZenML cloud stack already, including an AzureML orchestrator? Check out the in-browser stack deployment wizard, the stack registration wizard, or the ZenML Azure Terraform module for a shortcut on how to deploy & register this stack component.
In order to use an AzureML orchestrator, you need to first deploy ZenML to the cloud. It is recommended to deploy ZenML in the same region you plan to use for AzureML, but it is not required. You must ensure that you are connected to the remote ZenML server before using this stack component.
How to use it
In order to use the AzureML orchestrator, you need:
The ZenML azure integration installed. If you haven't done so, run `zenml integration install azure`.
Docker installed and running or a remote image builder in your stack.
A remote artifact store as part of your stack.
A remote container registry as part of your stack.
An Azure resource group equipped with an AzureML workspace to run your pipeline on.
There are two ways of authenticating your orchestrator with AzureML:
Default Authentication simplifies authentication while developing workflows that deploy to Azure by combining credentials used in Azure hosting environments with credentials used in local development (see the sketch after this list).
Service Principal Authentication (recommended) uses Azure service principals to connect your cloud components with proper authentication. For this method, you will need to create a service principal on Azure, assign it the correct permissions, and use it to register a ZenML Azure Service Connector.
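For context on what Default Authentication means on the Azure side, here is a minimal sketch assuming the azure-identity and azure-ai-ml packages; the workspace values are placeholders, and ZenML normally handles this for you through the stack component and service connector.

```python
# Minimal sketch of Default Authentication on the Azure side (placeholder values).
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# DefaultAzureCredential tries a chain of credential sources: environment
# variables (e.g. AZURE_CLIENT_ID / AZURE_TENANT_ID / AZURE_CLIENT_SECRET for a
# service principal), managed identities, and local developer credentials such
# as an Azure CLI login.
credential = DefaultAzureCredential()

ml_client = MLClient(
    credential=credential,
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<AZUREML_WORKSPACE>",
)
```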
Docker
For each pipeline run, ZenML builds a Docker image called <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME> which includes your code and uses it to run your pipeline steps in AzureML. Check out this page if you want to learn more about how ZenML builds these images and how you can customize them.
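If you need to customize the build, one common approach is to pass DockerSettings to the pipeline. A minimal sketch, where the extra requirement is purely illustrative:

```python
from zenml import pipeline
from zenml.config import DockerSettings

# Illustrative customization of the image ZenML builds for this pipeline.
docker_settings = DockerSettings(requirements=["azure-ai-ml"])

@pipeline(settings={"docker": docker_settings})
def my_pipeline():
    ...
```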
AzureML UI
Each AzureML workspace comes equipped with an Azure Machine Learning studio. Here you can inspect, manage, and debug your pipelines and steps.
Double-clicking any of the steps on this view will open up the overview page for that specific step. Here you can check the configuration of the component and its execution logs.
Settings
The ZenML AzureML orchestrator comes with a dedicated settings class, AzureMLOrchestratorSettings, which controls the compute resources used for pipeline execution in AzureML.
Currently, it supports three different modes of operation.
1. Serverless Compute (Default)
Set mode to serverless. Other parameters are ignored.
Example:
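The following is a minimal sketch; the pipeline shown is a placeholder:

```python
from zenml import pipeline
from zenml.integrations.azure.flavors import AzureMLOrchestratorSettings

azureml_settings = AzureMLOrchestratorSettings(
    mode="serverless"  # This is also the default if no mode is set.
)

@pipeline(settings={"orchestrator": azureml_settings})
def my_pipeline():
    ...
```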
2. Compute Instance
Set mode to compute-instance. Requires a compute_name.
If a compute instance with the same name exists, it uses the existing compute instance and ignores other parameters. (It will throw a warning if the provided configuration does not match the existing instance.)
If a compute instance with the same name doesn't exist, it creates a new compute instance with the compute_name. For this process, you can specify size and idle_time_before_shutdown_minutes.
Example:
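The following is a sketch; the compute name, size, and idle timeout are placeholder values:

```python
from zenml.integrations.azure.flavors import AzureMLOrchestratorSettings

azureml_settings = AzureMLOrchestratorSettings(
    mode="compute-instance",
    compute_name="my-gpu-instance",  # Existing or new compute instance name
    size="Standard_NC6s_v3",         # Only used if a new instance is created
    idle_time_before_shutdown_minutes=20,
)
```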
3. Compute Cluster
Set mode to compute-cluster. Requires a compute_name.
If a compute cluster with the same name exists, it uses the existing cluster and ignores other parameters. (It will throw a warning if the provided configuration does not match the existing cluster.)
If a compute cluster with the same name doesn't exist, it creates a new compute cluster. Additional parameters can be used to configure this process.
Example:
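The following is a sketch with placeholder values; check the AzureMLOrchestratorSettings API reference for the full list of cluster parameters:

```python
from zenml.integrations.azure.flavors import AzureMLOrchestratorSettings

azureml_settings = AzureMLOrchestratorSettings(
    mode="compute-cluster",
    compute_name="my-cpu-cluster",  # Existing or new compute cluster name
    # The parameters below only apply when a new cluster gets created:
    size="Standard_DS3_v2",
    tier="Dedicated",               # "Dedicated" or "LowPriority"
    min_instances=0,
    max_instances=3,
)
```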
In order to learn more about the supported sizes for compute instances and clusters, you can check the AzureML documentation.
Run pipelines on a schedule
The AzureML orchestrator supports running pipelines on a schedule using its JobSchedules. Both cron expressions and intervals are supported.
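For reference, a minimal sketch of attaching a cron-based schedule to a pipeline; the cron expression and pipeline are placeholders:

```python
from zenml import pipeline
from zenml.config.schedule import Schedule

@pipeline
def my_pipeline():
    ...

# Placeholder cron expression: run every day at 9 AM.
schedule = Schedule(cron_expression="0 9 * * *")
scheduled_pipeline = my_pipeline.with_options(schedule=schedule)
scheduled_pipeline()
```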
Once you run the pipeline with a schedule, you can find the schedule and the corresponding run under the All Schedules tab on the Jobs page in AzureML.
Note that ZenML only gets involved to schedule a run; maintaining the lifecycle of the schedule is the responsibility of the user. That means that if you want to cancel a schedule you created on AzureML, you will have to do it through the Azure UI.