Use pipeline/step parameters

Steps and pipelines can be parameterized just like any other Python function.

Parameters for your steps

When calling a step in a pipeline, the inputs provided to the step function can either be an artifact or a parameter. An artifact represents the output of another step that was executed as part of the same pipeline and serves as a means to share data between steps. Parameters, on the other hand, are values provided explicitly when invoking a step. They are not dependent on the output of other steps and allow you to parameterize the behavior of your steps.

from zenml import step, pipeline

@step
def my_step(input_1: int, input_2: int) -> None:
    pass


@pipeline
def my_pipeline():
    int_artifact = some_other_step()
    # We supply the value of `input_1` as an artifact and
    # `input_2` as a parameter
    my_step(input_1=int_artifact, input_2=42)
    # We could also call the step with two artifacts or two
    # parameters instead:
    # my_step(input_1=int_artifact, input_2=int_artifact)
    # my_step(input_1=1, input_2=2)

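The example above assumes `some_other_step` is defined elsewhere; as a minimal sketch (the name and return value are purely illustrative), it could be any step that returns an integer artifact:

from zenml import step

@step
def some_other_step() -> int:
    # The returned value is stored as an artifact that downstream
    # steps such as `my_step` can consume as an input
    return 3

With both steps defined, calling `my_pipeline()` runs the pipeline on your active stack.
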
Parameters of steps and pipelines can also be passed in using YAML configuration files. The following configuration file and Python code work together, giving you the flexibility to update the configuration in the YAML file alone whenever needed:

# config.yaml

# these are parameters of the pipeline
parameters:
  environment: production

steps:
  my_step:
    # these are parameters of the step `my_step`
    parameters:
      input_2: 42

from zenml import step, pipeline

@step
def my_step(input_1: int, input_2: int) -> None:
    ...

# The `environment` input comes from the configuration file,
# where it is set to `production`
@pipeline
def my_pipeline(environment: str):
    ...

if __name__ == "__main__":
    my_pipeline.with_options(config_path="config.yaml")()

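To make the wiring explicit, here is a sketch of how the pipeline body could look when `input_2` is supplied from `config.yaml` rather than in code (this assumes that a parameter configured in the YAML file does not also need to be passed when calling the step):

from zenml import step, pipeline

@step
def my_step(input_1: int, input_2: int) -> None:
    ...

@pipeline
def my_pipeline(environment: str):
    # `environment` is filled in from the pipeline-level `parameters`
    # section of config.yaml; `input_2` comes from the step-level
    # `parameters` section, so only `input_1` is passed here.
    my_step(input_1=1)

if __name__ == "__main__":
    my_pipeline.with_options(config_path="config.yaml")()
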
Parameters and caching

When an input is passed as a parameter, the step will only be cached if all parameter values are exactly the same as for previous executions of the step.

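As a small sketch of this behavior (step and pipeline names are illustrative, and caching must be enabled on your stack):

from zenml import step, pipeline

@step
def train(learning_rate: float) -> None:
    ...

@pipeline
def training_pipeline(learning_rate: float = 0.01):
    train(learning_rate=learning_rate)

if __name__ == "__main__":
    training_pipeline()                     # `train` is executed
    training_pipeline()                     # same parameter value, so `train` can be cached
    training_pipeline(learning_rate=0.001)  # different value, so `train` runs again
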
Artifacts and caching

When an artifact is used as a step function input, the step will only be cached if all the artifacts are exactly the same as for previous executions of the step. This means that if any of the upstream steps that produce the input artifacts for a step were not cached, the step itself will always be executed.

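For example, a producer step that is never cached forces its consumers to re-run as well (a minimal sketch using the `enable_cache` step option; the step names are illustrative):

from zenml import step, pipeline

@step(enable_cache=False)
def load_data() -> int:
    # Never cached, so it runs on every pipeline execution
    return 42

@step
def process(data: int) -> None:
    ...

@pipeline
def data_pipeline():
    # Since `load_data` always re-executes, `process` receives a
    # freshly produced input artifact and is re-executed as well
    data = load_data()
    process(data=data)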