`BaseOrchestrator` comes into play. It abstracts away many of the ZenML-specific details from the actual implementation and exposes a simplified interface:
* As a `StackComponent`, it inherits from the `StackComponent` class. This sets the `TYPE` variable to the specific stack component type.
* The `FLAVOR` class variable needs to be set in the particular sub-class, as it is meant to identify the implementation flavor of the particular orchestrator.
* The most important method is `prepare_or_run_pipeline`. In the implementation of every `Orchestrator` flavor, it is required to define this method with respect to the flavor at hand.
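As a hedged sketch of this interface (the `BaseOrchestrator` stand-in below is simplified, and `MyOrchestrator` with its `"my_flavor"` name is entirely hypothetical), a custom flavor might look like:

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, ClassVar


class BaseOrchestrator(ABC):
    """Simplified stand-in for ZenML's base class, for illustration only."""

    FLAVOR: ClassVar[str]

    @abstractmethod
    def prepare_or_run_pipeline(self, pipeline: Callable[[], Any]) -> Any:
        """Run or schedule the given pipeline, depending on the flavor."""


class MyOrchestrator(BaseOrchestrator):
    # FLAVOR identifies this particular implementation flavor.
    FLAVOR: ClassVar[str] = "my_flavor"

    def prepare_or_run_pipeline(self, pipeline: Callable[[], Any]) -> Any:
        # A trivial flavor: just execute the pipeline in-process.
        return pipeline()
```

The key point is that the sub-class only has to declare its flavor and say *how* a pipeline is run; everything else stays in the base class.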
`LocalOrchestrator` implementation, a simple orchestrator that runs your pipelines locally.
`integrations` modules, such as the `airflow` integration and the `kubeflow` integration. These implementations can serve as references for how to implement `prepare_or_run_pipeline()` based on your desired orchestrator.
`local` orchestrator can be summarized in two lines of code:
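In spirit, that body is just a loop over the steps in execution order, delegating each one to the orchestrator's `run_step` method. A toy stand-in (stub class and simplified signatures, not ZenML's actual code) makes the idea concrete:

```python
class LocalishOrchestrator:
    """Toy model of the local flavor: run every step in-process, in order."""

    def run_step(self, step):
        # In ZenML this would be the inherited run_step(...) machinery;
        # here we simply call the step.
        return step()

    def prepare_or_run_pipeline(self, sorted_steps):
        # The essence of the local flavor in two lines:
        for step in sorted_steps:
            self.run_step(step)


results = []
LocalishOrchestrator().prepare_or_run_pipeline(
    [lambda: results.append("load_data"), lambda: results.append("train_model")]
)
```

Because the steps are already topologically sorted, running them one after another in the same process is all the "orchestration" the local flavor needs.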
`airflow` orchestrator has a slightly more complex implementation of the `prepare_or_run_pipeline()` method. Instead of immediately executing a step, a `PythonOperator` is created which contains a `_step_callable`. This `_step_callable` will ultimately execute the `self.run_step(...)` method of the orchestrator. The `PythonOperator`s are assembled into an Airflow DAG, which is returned. Through some Airflow magic, this DAG is loaded by the connected instance of Airflow, and orchestration of this DAG is performed either directly or on a set schedule.
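To illustrate the pattern without an Airflow installation, here is a standard-library-only analogy (plain callables and a dependency map stand in for `PythonOperator`s and the DAG; Airflow's scheduler, not your code, would do the ordered execution in the real integration):

```python
from graphlib import TopologicalSorter


def run_like_a_dag(step_funcs, upstream):
    """Toy analogue of Airflow's behaviour: one callable per step,
    executed in dependency order derived from the DAG structure."""
    order = TopologicalSorter(upstream).static_order()
    return [step_funcs[name]() for name in order]


# One callable per step, like one PythonOperator per step.
steps = {
    "load": lambda: "ran load",
    "train": lambda: "ran train",
    "evaluate": lambda: "ran evaluate",
}
# evaluate depends on train, train depends on load.
log = run_like_a_dag(steps, {"train": {"load"}, "evaluate": {"train"}})
```

The takeaway is the shape of the implementation: you don't run steps yourself, you hand a graph of callables to an engine that runs them for you.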
`kubeflow` orchestrator is a great example of container-based orchestration. In an implementation-specific method called `prepare_pipeline_deployment()`, a Docker image containing the complete project context is built. In `prepare_or_run_pipeline()`, a YAML file is created as an intermediate representation of the pipeline and uploaded to the Kubeflow instance. To create this YAML file, a callable is defined within which a `dsl.ContainerOp` is created for each step. Each `ContainerOp` contains the container entrypoint command and arguments that will make the image run just that one step. The `ContainerOp`s are assembled according to their interdependencies inside a `dsl.Pipeline`, which can then be compiled into the YAML file.
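The core trick is that every step uses the *same* image but a different entrypoint invocation. A hedged sketch of such a per-step container spec (a plain dict standing in for a `dsl.ContainerOp`; the image tag and step source are hypothetical, and the module path is assumed from the default entrypoint mentioned below):

```python
def container_op_spec(step_name: str, step_source: str) -> dict:
    """Toy stand-in for a dsl.ContainerOp spec: shared image,
    but a command/args pair that runs exactly one step."""
    return {
        "name": step_name,
        "image": "my-project:latest",  # hypothetical project image
        # Assumed module form of the default step entrypoint.
        "command": ["python", "-m", "zenml.entrypoints.step_entrypoint"],
        "args": ["--step_source", step_source],
    }


spec = container_op_spec("trainer", "my_project.steps.trainer")
```

Assembling one such spec per step, wired by their interdependencies, is conceptually what the compiled YAML contains.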
`src.zenml.entrypoints.step_entrypoint.py` is the default entrypoint to run a specific step. It does so by loading an orchestrator-specific `StepEntrypointConfiguration` object. This object is then used to parse all entrypoint arguments (e.g. `--step_source`). Finally, the `StepEntrypointConfiguration.run()` method is used to execute the step. Under the hood, this will eventually also call the orchestrator's `run_step(...)` method. `StepEntrypointConfiguration` is the base class that already implements most of the required functionality. Let's dive right into it:
* `DEFAULT_SINGLE_STEP_CONTAINER_ENTRYPOINT_COMMAND` is the default entrypoint command for the Docker container.
* The `run()` method uses the parsed arguments to set up all required prerequisites before ultimately executing the step.
* `get_run_name(...)` is the method that you need to implement in order to get a functioning entrypoint. Inside this method, you need to return a string which has to be the same for all steps that are executed as part of the same pipeline run.
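Since every step may run in its own container, the run name has to be derived from something all of them share. A hedged sketch (the class below is a toy, not ZenML's actual API, and the `MY_ORCHESTRATOR_RUN_ID` environment variable is a hypothetical value injected by the orchestrator):

```python
import os


class MyEntrypointConfiguration:
    """Toy sketch of a get_run_name implementation."""

    def get_run_name(self, pipeline_name: str) -> str:
        # The returned string must be identical for every step of one
        # pipeline run, so derive it from shared state -- here, a run id
        # the (hypothetical) orchestrator sets in each step container.
        run_id = os.environ["MY_ORCHESTRATOR_RUN_ID"]
        return f"{pipeline_name}-{run_id}"


os.environ["MY_ORCHESTRATOR_RUN_ID"] = "1042"
run_name = MyEntrypointConfiguration().get_run_name("training_pipeline")
```

Anything that is stable across the containers of one run (an environment variable, a passed argument, an id in the pipeline representation) works; anything generated per container does not.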
* `get_custom_entrypoint_options()`: This method should return all the additional options that you require in the entrypoint.
* `get_custom_entrypoint_arguments(...)`: This method should return a list of arguments that should be passed to the entrypoint. The arguments need to provide values for all options defined in the `get_custom_entrypoint_options()` method mentioned above.
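The two hooks mirror each other: whatever option names the first one declares, the second one must supply values for. A minimal sketch, assuming a made-up `scheduler_url` option and simplified signatures:

```python
class MyEntrypointConfiguration:
    """Toy sketch of the custom-options hooks (signatures simplified)."""

    @classmethod
    def get_custom_entrypoint_options(cls) -> set:
        # Additional options this entrypoint requires beyond the defaults.
        return {"scheduler_url"}

    @classmethod
    def get_custom_entrypoint_arguments(cls, **kwargs) -> list:
        # Must provide a value for every option declared above.
        return ["--scheduler_url", kwargs["scheduler_url"]]


args = MyEntrypointConfiguration.get_custom_entrypoint_arguments(
    scheduler_url="http://scheduler:8080"
)
```

Keeping the option names and the produced `--flag value` pairs in sync is the whole contract between the two methods.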