AzureML

Executing individual steps in AzureML.



AzureML offers specialized compute instances to run your training jobs and a comprehensive UI to track and manage your models and logs. ZenML's AzureML step operator allows you to submit individual steps to be run on AzureML compute instances.

When to use it

You should use the AzureML step operator if:

  • one or more steps of your pipeline require computing resources (CPU, GPU, memory) that are not provided by your orchestrator.

  • you have access to AzureML. If you're using a different cloud provider, take a look at the SageMaker or Vertex step operators.

How to deploy it

Would you like to skip ahead and deploy a full ZenML cloud stack, including an AzureML step operator? Check out the in-browser stack deployment wizard, the stack registration wizard, or the ZenML Azure Terraform module for a shortcut on how to deploy & register this stack component.

  • Create a Machine Learning workspace on Azure. This should include an Azure container registry and an Azure storage account that will be used as part of your stack.

  • (Optional) Once your resource is created, you can head over to the Azure Machine Learning Studio and create a compute instance or cluster to run your pipelines. If omitted, the AzureML step operator will use the serverless compute target or will provision a new compute target on the fly, depending on the settings used to configure the step operator.

  • (Optional) Create a Service Principal for authentication. This is required if you intend to use a service connector to authenticate your step operator.

How to use it

To use the AzureML step operator, we need:

  • The ZenML azure integration installed. If you haven't done so, run

    zenml integration install azure
  • Docker installed and running.

  • An Azure container registry as part of your stack. Take a look here for a guide on how to set that up.

  • An Azure artifact store as part of your stack. This is needed so that both your orchestration environment and AzureML can read and write step artifacts. Take a look here for a guide on how to set that up.

  • An AzureML workspace and an optional compute cluster. Note that the AzureML workspace can share the Azure container registry and Azure storage account required above. See the deployment section for detailed instructions.

There are two ways you can authenticate your step operator to run steps on Azure: using an Azure service connector (recommended), or relying on implicit authentication (the local Azure CLI for a local orchestrator, or workload authentication for a remote one):

zenml service-connector register <CONNECTOR_NAME> --type azure -i
zenml step-operator register <STEP_OPERATOR_NAME> \
    --flavor=azureml \
    --subscription_id=<AZURE_SUBSCRIPTION_ID> \
    --resource_group=<AZURE_RESOURCE_GROUP> \
    --workspace_name=<AZURE_WORKSPACE_NAME> \
#   --compute_target_name=<AZURE_COMPUTE_TARGET_NAME> # optionally specify an existing compute target

zenml step-operator connect <STEP_OPERATOR_NAME> --connector <CONNECTOR_NAME>
zenml stack register <STACK_NAME> -s <STEP_OPERATOR_NAME> ... --set

If you don't connect your step operator to a service connector:

  • If using a remote orchestrator: the remote environment in which the orchestrator runs needs to be able to implicitly authenticate to Azure and have permissions to create and manage AzureML jobs. This is only possible if the orchestrator is also running in Azure and uses a form of implicit workload authentication such as a managed identity. If this is not the case, you will need to use a service connector.

zenml step-operator register <NAME> \
    --flavor=azureml \
    --subscription_id=<AZURE_SUBSCRIPTION_ID> \
    --resource_group=<AZURE_RESOURCE_GROUP> \
    --workspace_name=<AZURE_WORKSPACE_NAME> \
#   --compute_target_name=<AZURE_COMPUTE_TARGET_NAME> # optionally specify an existing compute target

zenml stack register <STACK_NAME> -s <STEP_OPERATOR_NAME> ... --set

Once you've added the step operator to your active stack, you can use it to execute individual steps of your pipeline by specifying it in the @step decorator as follows:

from zenml import step


@step(step_operator="<NAME>")
def trainer(...) -> ...:
    """Train a model."""
    # This step will be executed in AzureML.

Additional configuration

The ZenML AzureML step operator comes with a dedicated settings class called AzureMLStepOperatorSettings, which controls the compute resources used for step execution in AzureML.

Currently, it supports three different modes of operation.

  1. Serverless Compute (Default)

  • Set mode to serverless.

  • All other parameters are ignored.

  2. Compute Instance

  • Set mode to compute-instance.

  • Requires a compute_name.

    • If a compute instance with that name exists, it is reused and the other parameters are ignored.

    • If a compute instance with that name doesn't exist, a new one is created with the given compute_name. For this process, you can specify a compute_size and idle_time_before_shutdown_minutes.

  3. Compute Cluster

  • Set mode to compute-cluster.

  • Requires a compute_name.

    • If a compute cluster with that name exists, it is reused and the other parameters are ignored.

    • If a compute cluster with that name doesn't exist, a new one is created. Additional parameters can be used to configure this process.
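The selection logic above can be sketched in plain Python. This is only an illustrative model of the described behavior, not ZenML's actual implementation; the class and function names are made up for the sketch:

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical model of the three operation modes described above.
@dataclass
class OperatorSettings:
    mode: str = "serverless"            # "serverless", "compute-instance", or "compute-cluster"
    compute_name: Optional[str] = None  # required for the two compute-* modes
    compute_size: Optional[str] = None  # only used when a new compute target is created


def validate(settings: OperatorSettings) -> str:
    """Return a description of what the step operator would do."""
    if settings.mode == "serverless":
        return "use AzureML serverless compute; other parameters are ignored"
    if settings.mode in ("compute-instance", "compute-cluster"):
        if not settings.compute_name:
            raise ValueError(f"mode {settings.mode!r} requires a compute_name")
        return f"reuse or create {settings.mode} {settings.compute_name!r}"
    raise ValueError(f"unknown mode: {settings.mode!r}")
```

In both compute-* modes the name acts as a lookup key first and a creation parameter second, which is why the sizing parameters only matter when no matching compute target already exists.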

Here is an example of how you can use AzureMLStepOperatorSettings to define a compute instance:

from zenml.integrations.azure.flavors import AzureMLStepOperatorSettings

azureml_settings = AzureMLStepOperatorSettings(
    mode="compute-instance",
    compute_name="MyComputeInstance",
    compute_size="Standard_NC6s_v3",
)

@step(
    settings={
        "step_operator": azureml_settings
    }
)
def my_azureml_step():
    # YOUR STEP CODE
    ...
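For completeness, a similar configuration sketch for the compute-cluster mode. It reuses only the parameters shown above; the cluster name and VM size are placeholders, not recommendations (this is a config fragment that additionally requires a registered AzureML step operator to take effect):

```python
from zenml import step
from zenml.integrations.azure.flavors import AzureMLStepOperatorSettings

# Placeholder values: substitute your own cluster name and VM size.
azureml_settings = AzureMLStepOperatorSettings(
    mode="compute-cluster",
    compute_name="MyComputeCluster",
    compute_size="Standard_NC6s_v3",
)


@step(settings={"step_operator": azureml_settings})
def my_clustered_step():
    # YOUR STEP CODE
    ...
```

If a cluster named MyComputeCluster already exists in the workspace, it is reused and the size parameter is ignored, per the mode rules above.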

The recommended way to authenticate your AzureML step operator is by registering or using an existing Azure Service Connector and connecting it to your AzureML step operator. The credentials configured for the connector must have permissions to create and manage AzureML jobs (e.g. the AzureML Data Scientist and AzureML Compute Operator managed roles). The AzureML step operator uses the azure-generic resource type, so make sure to configure the connector accordingly.

If using a local orchestrator: ZenML will try to implicitly authenticate to Azure via the local Azure CLI configuration. Make sure the Azure CLI has permissions to create and manage AzureML jobs (e.g. the AzureML Data Scientist and AzureML Compute Operator managed roles).

ZenML will build a Docker image called <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME> which includes your code and use it to run your steps in AzureML. Check out this page if you want to learn more about how ZenML builds these images and how you can customize them.

You can check out the AzureMLStepOperatorSettings SDK docs for a full list of available attributes and this docs page for more information on how to specify settings.

Enabling CUDA for GPU-backed hardware

Note that if you wish to use this step operator to run steps on a GPU, you will need to follow the instructions on this page to ensure that it works. This requires some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration.
