Metadata
Enrich your ML workflow with contextual information using ZenML metadata.
Metadata in ZenML provides critical context to your ML workflows, allowing you to track additional information about your steps, runs, artifacts, and models. This enhanced traceability helps you better understand, compare, and reproduce your experiments.

Metadata is any additional contextual information you want to associate with your ML workflow components. In ZenML, you can attach metadata to:
Steps: Log evaluation metrics, execution details, or configuration information
Pipeline Runs: Track overall run characteristics like environment variables or git information
Artifacts: Document data characteristics, source information, or processing details
Models: Capture evaluation results, hyperparameters, or deployment information
ZenML makes it easy to log and retrieve this information through a simple interface, and visualizes it in the dashboard for quick analysis.
Logging Metadata
The primary way to log metadata in ZenML is through the log_metadata function, which allows you to attach JSON-serializable key-value pairs to various entities.
The log_metadata function is versatile and can target different entities depending on the parameters provided.
Attaching Metadata to Steps
To log metadata for a step, you can either call log_metadata within the step (which automatically associates with the current step), or specify a step explicitly:
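A minimal sketch (metric names, values, and the run name are illustrative):

```python
from zenml import log_metadata, step


@step
def train_model() -> None:
    # Called inside a step, this attaches to the currently running step
    log_metadata(metadata={"accuracy": 0.91, "learning_rate": 1e-3})


# Called outside a step, target the step of a specific run explicitly
log_metadata(
    metadata={"accuracy": 0.91},
    step_name="train_model",
    run_id_name_or_prefix="training_pipeline-2024_11_12",
)
```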
Attaching Metadata to Pipeline Runs
You can log metadata for an entire pipeline run, either from within a step during execution or manually after the run:
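A sketch of both variants (the run name is a placeholder):

```python
from zenml import log_metadata, step


@step
def compute_stats() -> None:
    # Inside a step: the run receives this entry under the
    # key "compute_stats::row_count" (see the note below)
    log_metadata(metadata={"row_count": 10_000})


# After execution: target the run by ID, name, or name prefix
log_metadata(
    metadata={"environment": "staging"},
    run_id_name_or_prefix="my_pipeline-2024_11_12-10_00_00",
)
```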
When logging from within a step to the pipeline run, the metadata key will have the pattern step_name::metadata_key, allowing multiple steps to use the same metadata key.
Attaching Metadata to Artifacts
Artifacts are the data objects produced by pipeline steps. You can log metadata for these artifacts to provide more context about the data:
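For example (artifact names and values are illustrative):

```python
from typing import Annotated

from zenml import log_metadata, step


@step
def process_data() -> Annotated[str, "processed_dataset"]:
    # Attach metadata to this step's output artifact directly
    log_metadata(
        metadata={"row_count": 10_000, "source": "s3://my-bucket/raw"},
        infer_artifact=True,
    )
    return "..."


# Or target an existing artifact version by name and version
log_metadata(
    metadata={"row_count": 10_000},
    artifact_name="processed_dataset",
    artifact_version="20",
)
```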
Attaching Metadata to Models
Models in ZenML represent a higher-level concept that can encapsulate multiple artifacts and steps. Logging metadata for models helps track performance and other important information:
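For example (model name, version, and metrics are illustrative):

```python
from zenml import log_metadata, step


@step
def evaluate_model() -> None:
    # If the step (or its pipeline) has a Model configured,
    # attach metadata to that model version
    log_metadata(
        metadata={"evaluation_metrics": {"accuracy": 0.95, "f1": 0.93}},
        infer_model=True,
    )


# Or target a model version explicitly by name and version
log_metadata(
    metadata={"owner": "ml-team"},
    model_name="my_classifier",
    model_version="1",
)
```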
Bulk Metadata Logging
The log_metadata function targets a single entity per call and does not support logging the same metadata for multiple entities simultaneously. To do that, use the bulk_log_metadata function:
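A sketch of what this can look like. Note that the import locations and the list-parameter names below (artifact_versions, runs) are assumptions based on the identifier classes described next, and may differ in your ZenML version:

```python
from zenml import bulk_log_metadata
# Assumed import location for the identifier classes described below
from zenml import ArtifactVersionIdentifier, PipelineRunIdentifier

bulk_log_metadata(
    metadata={"git_sha": "a1b2c3d"},
    # Assumed parameter names: one list of identifiers per entity type
    artifact_versions=[
        ArtifactVersionIdentifier(name="processed_dataset", version="20"),
    ],
    runs=[PipelineRunIdentifier(name="training_pipeline-2024_11_12")],
)
```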
Note that the bulk_log_metadata function has a slightly different signature compared to log_metadata. You can use the Identifier class objects to specify any parameter combination that uniquely identifies an object:
ArtifactVersionIdentifier & ModelVersionIdentifier: specify either an id or a combination of name and version.
PipelineRunIdentifier: specify an id, a name, or a prefix.
StepRunIdentifier: specify an id or a combination of name and a pipeline run identifier.
Similar to the log_metadata function, if you are calling bulk_log_metadata from within a step, you can use the infer options to automatically log metadata for the step’s model version or artifacts:
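For instance (the infer_artifacts flag is discussed below; the model-version flag name here is an assumption):

```python
from zenml import bulk_log_metadata, step


@step
def evaluate() -> float:
    bulk_log_metadata(
        metadata={"accuracy": 0.95},
        infer_artifacts=True,  # all output artifacts of this step
        infer_models=True,     # assumed flag name for the step's model version
    )
    return 0.95
```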
Keep in mind that when using the infer_artifacts option, the bulk_log_metadata function logs metadata to all output artifacts of the step. You can also combine the infer options with explicit identifier references: for instance, you may want to log metadata to a step's outputs and, at the same time, to one of its inputs. The bulk_log_metadata function lets you do both in one call:
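A sketch of the combined usage (again, the import location and the artifact_versions parameter name are assumptions):

```python
from zenml import bulk_log_metadata, step
# Assumed import location for the identifier class
from zenml import ArtifactVersionIdentifier


@step
def transform(dataset: str) -> str:
    bulk_log_metadata(
        metadata={"schema_version": 2},
        # This step's output artifacts ...
        infer_artifacts=True,
        # ... plus an explicitly identified artifact version, e.g. an input
        artifact_versions=[
            ArtifactVersionIdentifier(name="raw_dataset", version="7"),
        ],
    )
    return dataset.upper()
```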
Performance improvement hints
Both log_metadata and bulk_log_metadata internally use parameters such as name and version to resolve the actual IDs of entities. For example, when you provide an artifact's name and version, the function performs an additional lookup to resolve the artifact version ID.
To improve performance, prefer using the entity's ID directly instead of its name, version, or other identifiers whenever possible.
Using the client directly
If the log_metadata or bulk_log_metadata functions are too restrictive for your use case, you can use the ZenML Client directly to create run metadata for resources:
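A sketch using the resource types available in recent ZenML versions (model name and version are placeholders):

```python
from zenml.client import Client
from zenml.enums import MetadataResourceTypes
from zenml.models import RunMetadataResource

client = Client()
model_version = client.get_model_version("my_classifier", "1")

# Attach metadata to one or more resources in a single call
client.create_run_metadata(
    metadata={"accuracy": 0.95},
    resources=[
        RunMetadataResource(
            id=model_version.id,
            type=MetadataResourceTypes.MODEL_VERSION,
        )
    ],
)
```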
Special Metadata Types
ZenML includes several special metadata types that provide standardized ways to represent common metadata:
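For instance, the Uri, Path, DType, and StorageSize types from zenml.metadata.metadata_types (values below are illustrative):

```python
from zenml import log_metadata, step
from zenml.metadata.metadata_types import DType, Path, StorageSize, Uri


@step
def ingest_data() -> None:
    log_metadata(
        metadata={
            "dataset_source": Uri("s3://my-bucket/datasets/train.csv"),
            "preprocessing_script": Path("/app/steps/preprocess.py"),
            "column_types": {"age": DType("int"), "income": DType("float")},
            "dataset_size": StorageSize(2_500_000),  # size in bytes
        }
    )
```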
These special types ensure metadata is logged in a consistent and interpretable manner, and they receive special treatment in the ZenML dashboard.
Organizing Metadata in the Dashboard
To improve visualization in the ZenML dashboard, you can group metadata into logical sections by passing a dictionary of dictionaries:
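For example:

```python
from zenml import log_metadata, step


@step
def evaluate() -> None:
    log_metadata(
        metadata={
            # Each top-level key becomes its own card in the dashboard
            "model_metrics": {"accuracy": 0.95, "precision": 0.92},
            "data_details": {"dataset_size": 10_000, "feature_columns": 20},
        }
    )
```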
In the ZenML dashboard, "model_metrics" and "data_details" will appear as separate cards, each containing their respective key-value pairs, making it easier to navigate and interpret the metadata.
Visualizing and Comparing Metadata (Pro)
Once you've logged metadata in your runs, you can use ZenML's Experiment Comparison tool to analyze and compare metrics across different runs.
The metadata comparison tool is a ZenML Pro-only feature.
Comparison Views
The Experiment Comparison tool offers two complementary views for analyzing your pipeline metadata:
Table View: Compare metadata across runs with automatic change tracking

Parallel Coordinates Plot: Visualize relationships between different metrics

The tool lets you compare up to 20 pipeline runs simultaneously and supports any numerical metadata (float or int) that you've logged in your pipelines.
Fetching Metadata
Retrieving Metadata Programmatically
Once metadata has been logged, you can retrieve it using the ZenML Client:
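For example (run, step, and artifact names are placeholders; depending on your ZenML version, values may be returned directly or wrapped in response objects):

```python
from zenml.client import Client

client = Client()

# Run-level and step-level metadata of a finished run
run = client.get_pipeline_run("training_pipeline-2024_11_12")
print(run.run_metadata)
print(run.steps["train_model"].run_metadata["accuracy"])

# Metadata of the latest version of an artifact
artifact_version = client.get_artifact_version("processed_dataset")
print(artifact_version.run_metadata)

# Metadata of a model version
model_version = client.get_model_version("my_classifier", "1")
print(model_version.run_metadata)
```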
Accessing Context Within Steps
The StepContext object is your handle to the current pipeline/step run while a step executes. Use it to read run/step information, inspect upstream input metadata, and work with step outputs: URIs, materializers, run metadata, and tags.
It is available:
Inside functions decorated with @step (during execution, not at composition time)
Inside step hooks like on_failure / on_success
Inside materializers triggered by a step's save / load
Calling get_step_context() anywhere else raises a RuntimeError.
Getting the context is done via get_step_context():
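For example:

```python
from zenml import get_step_context, step


@step
def report() -> None:
    ctx = get_step_context()
    print(ctx.pipeline_run.name)           # current pipeline run
    print(ctx.step_name)                   # name of this step
    print(ctx.step_run.config.parameters)  # parameters of this step
```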
This exposes the following properties:
ctx.pipeline → the PipelineResponse for this run (convenience; may raise if the run has no pipeline object)
ctx.pipeline_run → PipelineRunResponse (id, name, status, timestamps, etc.)
ctx.step_run → StepRunResponse (name, status, parameters via ctx.step_run.config.parameters)
ctx.model → the configured Model (resolved from the step or the pipeline); raises if none is configured
ctx.inputs → {input_name: StepRunInputResponse}; use ctx.inputs["x"].run_metadata to read upstream metadata
ctx.step_name → the name of the current step, as a convenience string
Working with outputs
For a single-output step you can omit output_name. For multi-output steps you must pass it (unnamed outputs are called output_1, output_2, …).
get_output_artifact_uri(output_name=None) -> str: where the output artifact lives (write side files there, etc.)
get_output_materializer(output_name=None, *, custom_materializer_class=None, data_type=None) -> BaseMaterializer: get an initialized materializer; pass data_type to select from Union[...] materializers, or custom_materializer_class to override
add_output_metadata(metadata, output_name=None) / get_output_metadata(output_name=None): set/read run metadata for the output. Values provided via ArtifactConfig(..., run_metadata=...) on the return annotation are merged with runtime values
add_output_tags(tags, output_name=None) / get_output_tags(output_name=None) / remove_output_tags(tags, output_name=None): manage tags for the produced artifact version. Tags configured via ArtifactConfig(..., tags=...) are unioned with runtime tags; duplicates are de-duplicated in the final artifact
Minimal example:
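```python
from typing import Annotated

from zenml import get_step_context, step


@step
def train() -> Annotated[float, "accuracy"]:
    ctx = get_step_context()

    # Where the output artifact will be stored
    uri = ctx.get_output_artifact_uri("accuracy")
    print(f"Output will be stored at {uri}")

    # Attach run metadata and tags to the output artifact version
    ctx.add_output_metadata({"threshold": 0.5}, output_name="accuracy")
    ctx.add_output_tags(["baseline"], output_name="accuracy")
    return 0.95
```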
Reading upstream metadata via inputs
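A sketch, keyed by the input parameter name of the consuming step:

```python
from zenml import get_step_context, step


@step
def consume(dataset: str) -> None:
    ctx = get_step_context()
    # "dataset" is the name of this step's input parameter
    upstream_metadata = ctx.inputs["dataset"].run_metadata
    print(upstream_metadata)
```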
Hooks and materializers (advanced)
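The same context is available inside hooks; a sketch of a failure hook (hook and step names are illustrative):

```python
from zenml import get_step_context, step


def notify_on_failure(exception: BaseException) -> None:
    # get_step_context() also works inside step hooks
    ctx = get_step_context()
    print(f"Step {ctx.step_name} in run {ctx.pipeline_run.name} failed: {exception}")


@step(on_failure=notify_on_failure)
def flaky_step() -> None:
    raise RuntimeError("Something went wrong")
```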
Common errors to expect:
RuntimeError if get_step_context() is called outside a running step
StepContextError from the output helpers when: the step has no outputs, you omit output_name on a multi-output step, or you reference an unknown output_name
See the full SDK docs for StepContext for a concise reference to this object.
Accessing Context During Pipeline Composition
During pipeline composition, you can access the pipeline configuration using the PipelineContext:
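For example, reading the pipeline's extra configuration while composing it (the extra key is illustrative):

```python
from zenml import get_pipeline_context, pipeline


@pipeline(extra={"run_mode": "smoke_test"})
def my_pipeline():
    ctx = get_pipeline_context()
    # Read pipeline configuration while the pipeline is being composed
    if ctx.extra["run_mode"] == "smoke_test":
        ...
```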
Best Practices
To make the most of ZenML's metadata capabilities:
Use consistent keys: Define standard metadata keys for your organization to ensure consistency
Group related metadata: Use nested dictionaries to create logical groupings in the dashboard
Leverage special types: Use ZenML's special metadata types for standardized representation
Log relevant information: Focus on metadata that aids reproducibility, understanding, and decision-making
Consider automation: Set up automatic metadata logging for standard metrics and information
Combine with tags: Use metadata alongside tags for a comprehensive organization system
Conclusion
Metadata in ZenML provides a powerful way to enhance your ML workflows with contextual information. By tracking additional details about your steps, runs, artifacts, and models, you can gain deeper insights into your experiments, make more informed decisions, and ensure reproducibility of your ML pipelines.