This is an older version of the ZenML documentation. To check the latest version please visit https://docs.zenml.io
Steps & Pipelines
Step
Steps are the atomic components of a ZenML pipeline. Each step is defined by its inputs, the logic it applies and its outputs. Here is a very simple example of such a step:
from zenml.steps import step, Output


@step
def my_first_step() -> Output(output_int=int, output_float=float):
    """Step that returns a pre-defined integer and float"""
    return 7, 0.1

Because this step has multiple outputs, we use the zenml.steps.step_output.Output class to name each output. These names can then be used to access an output directly in the post-execution workflow.
Let's come up with a second step that consumes the output of our first step and performs some sort of transformation on it. In this case, let's double the input.
from zenml.steps import step, Output


@step
def my_second_step(
    input_int: int, input_float: float
) -> Output(output_int=int, output_float=float):
    """Step that doubles the inputs"""
    return 2 * input_int, 2 * input_float
Now we can go ahead and create a pipeline with our two steps to make sure they work.
If you want to run the step function outside the context of a ZenML pipeline, call its .entrypoint() method with the same input signature, for example my_second_step.entrypoint(input_int=7, input_float=0.1).
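The idea behind calling a step outside a pipeline can be sketched without ZenML at all. The snippet below is a toy model, not ZenML's implementation: ToyStep and toy_step are hypothetical names invented for illustration, and the only assumption carried over from the text is that the decorated object keeps the original function reachable as .entrypoint.

```python
from typing import Callable, Tuple


class ToyStep:
    """Hypothetical stand-in for a decorated step (not ZenML's class)."""

    def __init__(self, func: Callable):
        # Keep the undecorated function so it can be called directly.
        self.entrypoint = func
        self.name = func.__name__


def toy_step(func: Callable) -> ToyStep:
    """Hypothetical decorator that wraps a function in a ToyStep."""
    return ToyStep(func)


@toy_step
def my_second_step(input_int: int, input_float: float) -> Tuple[int, float]:
    """Doubles both inputs, mirroring the step defined above."""
    return 2 * input_int, 2 * input_float


# Call the underlying logic with the same input signature as the step.
doubled = my_second_step.entrypoint(input_int=7, input_float=0.1)
print(doubled)
```

Calling the logic this way is handy for quick checks and unit tests, since no pipeline run is involved.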
Here we define the pipeline. This is done agnostic of implementation by simply routing outputs through the steps within the pipeline. You can think of this as a recipe for how we want data to flow through our steps.
from zenml.pipelines import pipeline


@pipeline
def first_pipeline(step_1, step_2):
    output_1, output_2 = step_1()
    step_2(output_1, output_2)
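To make the "recipe" idea concrete, here is a minimal plain-Python sketch of the same routing, written without any ZenML decorators. It is an illustration only: constant_source, double and negate are hypothetical stand-ins for step implementations, and the function returns its result purely so the data flow can be inspected.

```python
from typing import Callable, Tuple

Pair = Tuple[int, float]


def first_pipeline(step_1: Callable[[], Pair], step_2: Callable[[int, float], Pair]) -> Pair:
    """Route the outputs of step_1 into step_2, whatever they implement."""
    output_1, output_2 = step_1()
    # Returned here only for inspection in this sketch.
    return step_2(output_1, output_2)


def constant_source() -> Pair:
    """Hypothetical first step: returns a fixed integer and float."""
    return 7, 0.1


def double(input_int: int, input_float: float) -> Pair:
    """Hypothetical second step: doubles both inputs."""
    return 2 * input_int, 2 * input_float


def negate(input_int: int, input_float: float) -> Pair:
    """Alternative second step: flips the sign of both inputs."""
    return -input_int, -input_float


# The same recipe runs with different concrete implementations plugged in.
doubled = first_pipeline(constant_source, double)
negated = first_pipeline(constant_source, negate)
print(doubled, negated)
```

The recipe never names a concrete implementation, which is why the second step can be swapped without touching the routing.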
Instantiate and run your Pipeline
With your pipeline recipe in hand, you can now specify which concrete step implementations are used. With that, you are ready to run:

first_pipeline(step_1=my_first_step(), step_2=my_second_step()).run()
You'll learn how to inspect the finished run within the chapter on our Post Execution Workflow.
Summary in Code
Code Example for this Section
from zenml.steps import step, Output
from zenml.pipelines import pipeline


@step
def my_first_step() -> Output(output_int=int, output_float=float):
    """Step that returns a pre-defined integer and float"""
    return 7, 0.1


@step
def my_second_step(
    input_int: int, input_float: float
) -> Output(output_int=int, output_float=float):
    """Step that doubles the inputs"""
    return 2 * input_int, 2 * input_float


@pipeline
def first_pipeline(step_1, step_2):
    output_1, output_2 = step_1()
    step_2(output_1, output_2)


first_pipeline(step_1=my_first_step(), step_2=my_second_step()).run()
Give each pipeline run a name
When you run a pipeline by calling my_pipeline.run(), ZenML uses the current date and time as the name of the pipeline run. To change the name of a run, pass it as the run_name parameter to the run() function:

first_pipeline(step_1=my_first_step(), step_2=my_second_step()).run(run_name="custom_pipeline_run_name")

Pipeline run names must be unique, so make sure to compute the name dynamically if you plan to run your pipeline multiple times.
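One way to compute a unique name is to combine a timestamp with a short random suffix. The following is a minimal sketch using only the Python standard library. The helper make_run_name and its naming scheme are assumptions made for illustration and are not part of ZenML.

```python
from datetime import datetime, timezone
from uuid import uuid4


def make_run_name(prefix: str = "first_pipeline") -> str:
    """Build a run name that is very unlikely to collide across runs."""
    # Timestamp down to microseconds, in UTC so names sort consistently.
    timestamp = datetime.now(timezone.utc).strftime("%Y_%m_%d-%H_%M_%S_%f")
    # Short random suffix guards against two runs starting at the same instant.
    suffix = uuid4().hex[:8]
    return f"{prefix}-{timestamp}-{suffix}"


run_name = make_run_name()
print(run_name)
```

The resulting string can then be passed as run_name when calling run().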
Once the pipeline run is finished we can easily access this specific run during our post-execution workflow:
from zenml.repository import Repository

repo = Repository()
pipeline = repo.get_pipeline(pipeline_name="first_pipeline")
run = pipeline.get_run("custom_pipeline_run_name")
Summary in Code
Code Example for this Section
from zenml.steps import step, Output, BaseStepConfig
from zenml.pipelines import pipeline


@step
def my_first_step() -> Output(output_int=int, output_float=float):
    """Step that returns a pre-defined integer and float"""
    return 7, 0.1


class SecondStepConfig(BaseStepConfig):
    """Trainer params"""
    multiplier: int = 4


@step
def my_second_step(
    config: SecondStepConfig, input_int: int, input_float: float
) -> Output(output_int=int, output_float=float):
    """Step that multiplies the inputs"""
    return config.multiplier * input_int, config.multiplier * input_float


@pipeline
def first_pipeline(step_1, step_2):
    output_1, output_2 = step_1()
    step_2(output_1, output_2)


# Set configuration when executing
first_pipeline(
    step_1=my_first_step(),
    step_2=my_second_step(SecondStepConfig(multiplier=3)),
).run(run_name="custom_pipeline_run_name")

# Set configuration based on yml
first_pipeline(
    step_1=my_first_step(),
    step_2=my_second_step(),
).with_config("config.yml").run()