Fetching Historic Runs
Interact with Past Runs inside a Step

The need to fetch historic runs

Sometimes, it is necessary to fetch information from previous runs in order to make a decision within a currently executing step. Examples of this:
  • Fetch the best model evaluation results from all past pipeline runs to decide whether to deploy a newly-trained model.
  • Fetching a model out of a list of trained models.
  • Fetching the latest model produced by a different pipeline to run an inference on.

Utilizing StepContext

ZenML allows users to fetch historical parameters and artifacts using the StepContext fixture.
As an example, see this step that uses the StepContext to query the metadata store while running a step. We use this to evaluate all models of past training pipeline runs and store the current best model. In our inference pipeline, we could then easily query the metadata store to fetch the best performing model.
1
from zenml.steps import step, StepContext
2
3
@step
4
def my_third_step(context: StepContext, input_int: int) -> bool:
5
"""Step that decides if this pipeline run produced the highest value
6
for `input_int`"""
7
highest_int = 0
8
9
# Inspect all past runs of `first_pipeline`
10
try:
11
pipeline_runs = (context.metadata_store
12
.get_pipeline("first_pipeline")
13
.runs)
14
except KeyError:
15
# If this is the first time running this pipeline you don't want
16
# it to fail
17
print('No previous runs found, this run produced the highest '
18
'number by default.')
19
return True
20
else:
21
for run in pipeline_runs:
22
# get the output of the second step
23
try:
24
multiplied_output_int = (run.get_step("step_2")
25
.outputs['multiplied_output_int']
26
.read())
27
except KeyError:
28
# If you never ran the pipeline or ran it with steps that
29
# don't produce a step with the name
30
# `multiplied_output_int` then you don't want this to fail
31
pass
32
else:
33
if multiplied_output_int > highest_int:
34
highest_int = multiplied_output_int
35
36
if highest_int > input_int:
37
print('Previous runs produced a higher number.')
38
return False # There was a past run that produced a higher number
39
else:
40
print('This run produced the highest number.')
41
return True # The current run produced the highest number
Copied!
Just like that you are able to compare runs with each other from within the run itself.

Summary in Code

Code Example of this Section