Develop a custom orchestrator
Learning how to develop a custom orchestrator.
ZenML aims to enable orchestration with any orchestration tool. This is where the BaseOrchestrator comes into play. It abstracts away many of the ZenML-specific details from the actual implementation and exposes a simplified interface.
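The shape of that interface can be sketched roughly as follows. This is a self-contained illustration built on Python's abc module, not ZenML's real classes: SketchBaseOrchestrator and EchoOrchestrator are hypothetical names, and the parameter names (deployment, stack, environment) are assumptions. Take the actual signatures from the ZenML source.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple


class SketchBaseOrchestrator(ABC):
    """Stand-in for BaseOrchestrator (hypothetical, not the real ZenML class)."""

    @abstractmethod
    def prepare_or_run_pipeline(
        self, deployment: Any, stack: Any, environment: Dict[str, str]
    ) -> Any:
        """Run or schedule the pipeline described by `deployment`."""

    @abstractmethod
    def get_orchestrator_run_id(self) -> str:
        """Return an ID that is unique to the current pipeline run."""


class EchoOrchestrator(SketchBaseOrchestrator):
    """Hypothetical subclass that only records what it was asked to run."""

    def __init__(self, run_id: str) -> None:
        self._run_id = run_id
        self.submitted: List[Tuple[Any, Dict[str, str]]] = []

    def prepare_or_run_pipeline(
        self, deployment: Any, stack: Any, environment: Dict[str, str]
    ) -> Any:
        # A real orchestrator would translate the deployment for its tool here.
        self.submitted.append((deployment, dict(environment)))
        return None

    def get_orchestrator_run_id(self) -> str:
        return self._run_id
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated, which is what forces every custom orchestrator to provide them.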
To create your own custom flavor for an orchestrator, follow these steps:
Create a class that inherits from the BaseOrchestrator class and implement the abstract prepare_or_run_pipeline(...) and get_orchestrator_run_id() methods.
If you need to provide any configuration, create a class that inherits from the BaseOrchestratorConfig class and add your configuration parameters.
Bring both the implementation and the configuration together by inheriting from the BaseOrchestratorFlavor class. Make sure that you give a name to the flavor through its abstract property.
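The three pieces described in the steps above fit together roughly as in the sketch below. It uses plain stand-in classes so that it runs without ZenML installed. The configuration fields (namespace, timeout_seconds) are hypothetical, and the config_class and implementation_class property names are assumptions about how a flavor links its two parts; in real code each class would inherit from the corresponding ZenML base class.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type


@dataclass
class MyOrchestratorConfig:
    """Stand-in for a BaseOrchestratorConfig subclass (hypothetical fields)."""

    namespace: str = "default"
    timeout_seconds: int = 600


class MyOrchestrator:
    """Stand-in for a BaseOrchestrator subclass."""

    def __init__(self, config: MyOrchestratorConfig) -> None:
        self.config = config

    def prepare_or_run_pipeline(
        self, deployment: Any, stack: Any, environment: Dict[str, str]
    ) -> Any:
        raise NotImplementedError("submit the pipeline to your tool here")

    def get_orchestrator_run_id(self) -> str:
        raise NotImplementedError("return your tool's run ID here")


class MyOrchestratorFlavor:
    """Stand-in for a BaseOrchestratorFlavor subclass."""

    @property
    def name(self) -> str:
        # The flavor name users refer to when registering a component.
        return "my_orchestrator"

    @property
    def config_class(self) -> Type[MyOrchestratorConfig]:
        return MyOrchestratorConfig

    @property
    def implementation_class(self) -> Type[MyOrchestrator]:
        return MyOrchestrator
```

The flavor itself holds no logic. It only names the flavor and points at the configuration and implementation classes.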
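Once you are done with the implementation, you can register it through the CLI. Make sure you point to the flavor class via dot notation:
zenml orchestrator flavor register <path.to.MyOrchestratorFlavor>
For example, if your flavor class MyOrchestratorFlavor is defined in flavors/my_flavor.py, you'd register it by running:
zenml orchestrator flavor register flavors.my_flavor.MyOrchestratorFlavor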
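To make the dot notation concrete, the small helper below derives the dotted path from a file path and a class name. The function to_dotted_path is a hypothetical illustration and not part of ZenML. It assumes the file path is given relative to the directory where ZenML was initialized.

```python
from pathlib import PurePosixPath


def to_dotted_path(file_path: str, class_name: str) -> str:
    """Convert a repo-relative .py path and a class name to dot notation."""
    path = PurePosixPath(file_path)
    if path.is_absolute():
        raise ValueError("path must be relative to the repository root")
    if path.suffix != ".py":
        raise ValueError(f"expected a .py file, got {file_path!r}")
    # Drop the extension and join the directory and module names with dots.
    module = ".".join(path.with_suffix("").parts)
    return f"{module}.{class_name}"


print(to_dotted_path("flavors/my_flavor.py", "MyOrchestratorFlavor"))
# prints: flavors.my_flavor.MyOrchestratorFlavor
```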
If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory. It is usually better not to rely on this fallback and to initialize ZenML at the root of your repository instead.
Afterward, you should see the new flavor in the list of available flavors:
zenml orchestrator flavor list
It is important to understand when and how these base abstractions come into play in a ZenML workflow.
The CustomOrchestratorFlavor class is imported and utilized upon the creation of the custom flavor through the CLI.
The CustomOrchestratorConfig class is imported when someone tries to register/update a stack component with this custom flavor. In particular, during the registration of the stack component, the config is used to validate the values given by the user. As Config objects are inherently pydantic objects, you can also add your own custom validators here.
The CustomOrchestrator only comes into play when the component is ultimately in use.
The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the CustomOrchestratorFlavor and the CustomOrchestratorConfig are implemented in a different module/path than the actual CustomOrchestrator).
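One way to get this separation is to defer the import of the implementation until it is actually requested. The sketch below uses a hypothetical LazyFlavor class, with the standard-library fractions.Fraction standing in for an orchestrator class whose module pulls in heavy dependencies. The implementation_class property name is an assumption.

```python
class LazyFlavor:
    """Hypothetical flavor that does not import its implementation up front."""

    @property
    def name(self) -> str:
        return "my_orchestrator"

    @property
    def implementation_class(self) -> type:
        # The import runs only when this property is accessed, so creating
        # the flavor never requires the implementation's dependencies.
        from fractions import Fraction  # stand-in for the orchestrator class

        return Fraction


flavor = LazyFlavor()                       # no implementation import yet
implementation = flavor.implementation_class  # import happens here
```

With this layout, registering the flavor only needs the flavor and config modules to be importable. The implementation module is loaded when the component is actually used.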
Create your orchestrator class: This class should either inherit from BaseOrchestrator, or more commonly from ContainerizedOrchestrator. If your orchestrator uses container images to run code, you should inherit from ContainerizedOrchestrator, which handles building all Docker images for the pipeline to be executed. If your orchestrator does not use container images, you are responsible for ensuring that the execution environment contains all the requirements and code files needed to run the pipeline.
Implement the prepare_or_run_pipeline(...) method: This method is responsible for running or scheduling the pipeline. In most cases, this means converting the pipeline into a format that your orchestration tool understands and running it. To do so, you should:
Loop over all steps of the pipeline and configure your orchestration tool to run the correct command and arguments in the correct Docker image
Make sure the passed environment variables are set when the container is run
Make sure the containers are running in the correct order
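The three points above can be sketched as a small planning function. This is an illustration under stated assumptions and not ZenML's API: the pipeline is modelled as a plain dict mapping each step name to its upstream steps, the command is a hypothetical placeholder, and plan_containers is an invented helper. In a real orchestrator the image, command and arguments come from ZenML, and the resulting specs are handed to your orchestration tool.

```python
from graphlib import TopologicalSorter
from typing import Any, Dict, List


def plan_containers(
    upstream: Dict[str, List[str]],
    image: str,
    environment: Dict[str, str],
) -> List[Dict[str, Any]]:
    """Return one container spec per step, with upstream steps listed first."""
    # TopologicalSorter expects a mapping of node -> predecessors.
    order = TopologicalSorter(upstream).static_order()
    return [
        {
            "name": step,
            "image": image,
            # Hypothetical placeholder command, not ZenML's real entrypoint.
            "command": ["python", "-m", "run_step", "--step", step],
            # Every container receives the environment variables passed in.
            "env": dict(environment),
            "depends_on": list(upstream.get(step, [])),
        }
        for step in order
    ]


specs = plan_containers(
    {"load": [], "train": ["load"], "evaluate": ["train"]},
    image="registry.example/my-pipeline:latest",  # hypothetical image name
    environment={"EXAMPLE_VAR": "1"},
)
print([spec["name"] for spec in specs])
# prints: ['load', 'train', 'evaluate']
```

A topological sort raises an error on cyclic dependencies, which is a useful early check before anything is submitted to the orchestration tool.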
Implement the get_orchestrator_run_id() method: This must return an ID that is different for each pipeline run, but identical if called from within Docker containers running different steps of the same pipeline run. If your orchestrator is based on an external tool like Kubeflow or Airflow, it is usually best to use a unique ID provided by this tool.
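A common way to satisfy both requirements is to have the orchestration tool inject its own run ID into every step container as an environment variable and to read it back here. The sketch below assumes that setup. The variable name MY_ORCHESTRATOR_RUN_ID is hypothetical, and the function is shown standalone rather than as a method.

```python
import os

# Hypothetical variable name; use whatever your orchestration tool injects.
RUN_ID_ENV_VAR = "MY_ORCHESTRATOR_RUN_ID"


def get_orchestrator_run_id() -> str:
    """Return the run ID shared by all step containers of one pipeline run."""
    try:
        return os.environ[RUN_ID_ENV_VAR]
    except KeyError:
        # Failing loudly is safer than inventing an ID that would differ
        # between the containers of the same run.
        raise RuntimeError(
            f"Unable to read run ID from environment variable {RUN_ID_ENV_VAR}."
        ) from None
```

Generating a fresh random ID inside this method would not work, because each step container would then report a different value for the same run.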
There are some additional optional features that your orchestrator can implement:
Running pipelines on a schedule: If your orchestrator supports running pipelines on a schedule, make sure to handle deployment.schedule if it exists. If your orchestrator does not support schedules, you should either log a warning or raise an exception if the user tries to schedule a pipeline.
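The schedule handling described above can be sketched as a small guard. The helper resolve_schedule and its supports_schedules and strict parameters are hypothetical, and the deployment is modelled as any object with an optional schedule attribute.

```python
import logging
from types import SimpleNamespace
from typing import Any, Optional

logger = logging.getLogger(__name__)


def resolve_schedule(
    deployment: Any, supports_schedules: bool, strict: bool = False
) -> Optional[Any]:
    """Return the schedule to apply, or None if the pipeline runs once."""
    schedule = getattr(deployment, "schedule", None)
    if schedule is None:
        return None
    if supports_schedules:
        return schedule
    if strict:
        raise RuntimeError("This orchestrator does not support schedules.")
    logger.warning(
        "This orchestrator does not support schedules. "
        "The schedule is ignored and the pipeline runs once."
    )
    return None


# Usage with a stand-in deployment object:
deployment = SimpleNamespace(schedule="0 3 * * *")
print(resolve_schedule(deployment, supports_schedules=True))
# prints: 0 3 * * *
```

Whether to warn or to raise is a design choice: a warning keeps existing pipelines running, while an exception prevents users from assuming a schedule was registered when it was not.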
Specifying hardware resources: If your orchestrator supports setting resources like CPUs, GPUs or memory for the pipeline or specific steps, make sure to handle the values defined in step.config.resource_settings. See the code sample below for additional helper methods to check whether any resources are required from your orchestrator.
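As a rough illustration of handling such values, the sketch below models the resource settings with a stand-in dataclass and translates them into a request for the orchestration tool. SketchResourceSettings, its field names (cpu_count, gpu_count, memory), the empty property and the output keys are all assumptions made for this example. Check the real settings class in ZenML for the actual attributes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SketchResourceSettings:
    """Stand-in for step.config.resource_settings (hypothetical fields)."""

    cpu_count: Optional[float] = None
    gpu_count: Optional[int] = None
    memory: Optional[str] = None

    @property
    def empty(self) -> bool:
        """True if the step did not request any specific resources."""
        return (
            self.cpu_count is None
            and self.gpu_count is None
            and self.memory is None
        )


def to_resource_request(settings: SketchResourceSettings) -> Dict[str, str]:
    """Translate the settings into a tool-specific request (hypothetical keys)."""
    request: Dict[str, str] = {}
    if settings.cpu_count is not None:
        request["cpu"] = str(settings.cpu_count)
    if settings.gpu_count is not None:
        request["gpu"] = str(settings.gpu_count)
    if settings.memory is not None:
        request["memory"] = settings.memory
    return request


print(to_resource_request(SketchResourceSettings(cpu_count=2, memory="4GB")))
# prints: {'cpu': '2', 'memory': '4GB'}
```

Checking for empty settings first lets the orchestrator skip resource configuration entirely for steps that did not ask for anything.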
This is a slimmed-down version of the base implementation which aims to highlight the abstraction layer. To see the full implementation and the complete docstrings, please check the ZenML source code.
ZenML resolves the flavor class by taking the path where you initialized ZenML (via zenml init) as the starting point of resolution. Therefore, please ensure you follow the best practice of initializing ZenML at the root of your repository.
Check out the code sample below for more details on how to fetch the Docker image, command, arguments and step order.
To see a full end-to-end worked example of a custom orchestrator, .
Note that if you wish to use your custom orchestrator to run steps on a GPU, you will need to follow the GPU setup instructions to ensure that it works. This requires some extra settings customization and is essential for enabling CUDA, so that the GPU can deliver its full acceleration.