Organize data with tags

Use tags to organize tags in ZenML.

Organizing and categorizing your machine learning artifacts and models can streamline your workflow and enhance discoverability. ZenML enables the use of tags as a flexible tool to classify and filter your ML assets.

Tags are visible in the ZenML Dashboard

Tagging different entities

Assigning tags to artifacts

You can tag artifact versions by using the add_tags utility function:

from zenml import add_tags

add_tags(tags=["my_tag"], artifact="my_artifact_name_or_id")

Alternatively, you can tag an artifact by using CLI as well:

zenml artifacts update my_artifact -t my_tag

Assigning tags to artifact versions

In order to tag an artifact through the Python SDK, you can use either use the ArtifactConfig object:

from zenml import step, ArtifactConfig

@step
def data_loader() -> (
    Annotated[pd.DataFrame, ArtifactConfig(name="my_output", tags=["my_tag"])]
):
    ...

or the add_tags utility function:

from zenml import add_tags

# Automatic tagging to an artifact version within a step execution
## A step with a single output
add_tags(tags=["my_tag"], infer_artifact=True)
## A step with multiple outputs (need to specify the output name)
add_tags(tags=["my_tag"], artifact_name="my_output", infer_artifact=True)

# Manual tagging to an artifact version (can happen in a step or outside of it)
## By specifying the artifact name and version
add_tags(tags=["my_tag"], artifact_name="my_output", artifact_version="v1")
## By specifying the artifact version ID
add_tags(tags=["my_tag"], artifact_version_id="artifact_version_uuid")

Moreover, you can tag an artifact version by using the CLI:

# Tag the artifact version
zenml artifacts versions update iris_dataset raw_2023 -t sklearn

In the upcoming chapters, you will also learn how to use an cascade tag to tag an artifact version as well.

Assigning tags to pipelines

Assigning tags to pipelines is only possible through the Python SDK and you can use the add_tags utility function:

from zenml import add_tags

add_tags(tags=["my_tag"], pipeline="pipeline_name_or_id")

Assigning tags to runs

To assign tags to a pipeline run in ZenML, you can use the add_tags utility function:

from zenml import add_tags

# Manual tagging to a run
add_tags(tags=["my_tag"], run="run_name_or_id")

Alternatively, you can use the same function within a step without specifying any arguments, which will automatically tag the run:

from zenml import step

@step
def my_step():
    add_tags(tags=["my_tag"])

You can also use the pipeline decorator to tag the run:

from zenml import pipeline

@pipeline(tags=["my_tag"])
def my_pipeline():
    ...

Assigning tags to models and model versions

When creating a model version using the Model object, you can specify tags as key-value pairs that will be attached to the model version upon creation.

from zenml.models import Model

# Create a model version with tags
model = Model(
    name="iris_classifier",
    version="1.0.0",
    tags=["experiment", "v1", "classification-task"],
)

# Use this tagged model in your steps and pipelines as needed
@pipeline(model=model)
def my_pipeline(...):
    ...

You can also assign tags when creating or updating models with the Python SDK:

from zenml.models import Model
from zenml.client import Client

# Create or register a new model with tags
Client().create_model(
    name="iris_logistic_regression",
    tags=["classification", "iris-dataset"],
)

# Create or register a new model version also with tags
Client().create_model_version(
    model_name_or_id="iris_logistic_regression",
    name="2",
    tags=["version-1", "experiment-42"],
)

To add tags to existing models and their versions using the ZenML CLI, you can use the following commands:

# Tag an existing model
zenml model update iris_logistic_regression --tag "classification"

# Tag a specific model version
zenml model version update iris_logistic_regression 2 --tag "experiment3"

Assigning tags to run templates

Assigning tags to run templates is only possible through the Python SDK and you can use the add_tags utility function:

from zenml import add_tags

add_tags(tags=["my_tag"], run_template="run_template_name_or_id")

Advanced Usage

ZenML provides several advanced tagging features to help you better organize and manage your ML assets.

Exclusive Tags

Exclusive tags are special tags that can be associated with only one instance of a specific entity type within a certain scope at a time. When you apply an exclusive tag to a new entity, it's automatically removed from any previous entity of the same type that had this tag. Exclusive tags can be used with:

  • One pipeline run per pipeline

  • One run template per pipeline

  • One artifact version per artifact

The recommended way to create exclusive tags is using the Tag object:

from zenml import pipeline, Tag

@pipeline(tags=["not_an_exclusive_tag", Tag("an_exclusive_tag", exclusive=True)])
def my_pipeline():
    ...

Alternatively, you can also create an exclusive tag separately and use it later:

from zenml.client import Client

Client().create_tag(name="an_exclusive_tag", exclusive=True)

@pipeline(tags=["an_exclusive_tag"])
def my_pipeline():
    ...

Cascade Tags

Cascade tags allow you to associate a tag from a pipeline with all artifact versions created during its execution.

from zenml import pipeline, Tag

@pipeline(tags=["normal_tag", Tag("cascade_tag", cascade=True)])
def my_pipeline():
    ...

When this pipeline runs, the cascade_tag will be automatically applied to all artifact versions created during the pipeline execution.

Filtering

ZenML allows you to filter taggable objects using multiple tag conditions:

from zenml import add_tags

from zenml.client import Client

# Add tags to a pipeline
add_tags(tags=["one", "two", "three"], pipeline="my_pipeline")

# Will return `my_pipeline`
Client().list_pipelines(tags=["contains:wo", "startswith:t", "equals:three"])

# Will not return `my_pipeline`
Client().list_pipelines(tags=["contains:wo", "startswith:t", "equals:four"])

The example above shows how you can use multiple tag conditions to filter an entity. In ZenML, the default logical operator is AND, which means that the entity will be returned only if there is at least one tag that matches all the conditions.

Removing Tags

Similar to the add_tags utility function, you can use the remove_tags utility function to remove tags from an entity.

from zenml.utils.tag_utils import remove_tags

# Remove tags from a pipeline
remove_tags(tags=["one", "two"], pipeline="my_pipeline")

# Remove tags from an artifact
remove_tags(tags=["three"], artifact="my_artifact")
ZenML Scarf

Last updated

Was this helpful?