Organizing Stacks, Pipelines, and Models

A step-by-step tutorial on effectively organizing your ML assets in ZenML using tags and projects

This tutorial shows how to organize your machine learning assets in ZenML using tags and projects. We'll implement a fraud detection system, applying increasingly sophisticated organization techniques along the way.

Introduction: The Organization Challenge

As ML projects grow, effective organization becomes critical. ZenML provides two powerful organization mechanisms:

  1. Tags: Flexible labels that can be applied to various entities (pipelines, runs, artifacts, models)

  2. Projects (ZenML Pro): Namespace-based isolation for logical separation between initiatives or teams

For full reference documentation on the topics covered in this tutorial, see the Tagging page, the Projects page, and the Model Control Plane page.

Prerequisites

Before starting this tutorial, make sure you have:

  1. ZenML installed and configured

  2. Basic understanding of ZenML pipelines and steps

  3. ZenML Pro account (for the Projects section only)

Part 1: Basic Pipeline Organization with Tags

Creating and Tagging a Simple Pipeline

Let's create a basic fraud detection pipeline with tags:

Adding Tags at Runtime

You can add tags when running a pipeline:
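
A minimal sketch of run-time tagging, assuming the pipeline was created with ZenML's `@pipeline` decorator and therefore exposes `with_options(...)`. The helper name `run_with_tags`, the pipeline name in the usage comment, and the tag values are illustrative, not part of ZenML.

```python
from typing import Any, List


def run_with_tags(pipeline_obj: Any, extra_tags: List[str]) -> Any:
    """Start one run of `pipeline_obj` with tags chosen at call time.

    Assumes `pipeline_obj` was created with ZenML's `@pipeline` decorator,
    so it exposes `with_options(...)`.
    """
    # with_options returns a configured copy of the pipeline, so the
    # run-time tags do not have to be hard-coded in the decorator.
    configured = pipeline_obj.with_options(tags=extra_tags)
    # Calling the configured pipeline triggers the run.
    return configured()


# Hypothetical usage, given a pipeline named fraud_detection_pipeline:
# run_with_tags(fraud_detection_pipeline, ["experiment-001", "nightly"])
```

Keeping the tags as an argument makes it easy to label one-off experiments without editing the pipeline definition.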

Finding Pipelines by Tags
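
One way to look pipelines up by tag is through the ZenML `Client`. The sketch below assumes a ZenML version in which `Client.list_pipelines` accepts a `tags` argument; the helper name `find_pipelines_by_tags` is hypothetical, and the optional `client` parameter exists only so the function can be exercised with a stub.

```python
from typing import Any, List, Optional


def find_pipelines_by_tags(tags: List[str], client: Optional[Any] = None) -> List[str]:
    """Return the names of pipelines matching the given tag filter."""
    if client is None:
        # Lazy import keeps the helper testable with a stub client.
        from zenml.client import Client

        client = Client()
    # list_pipelines returns a single page of results; only that first
    # page is read here, which is enough for a small number of pipelines.
    page = client.list_pipelines(tags=tags)
    return [p.name for p in page.items]


# Hypothetical usage against a connected ZenML server:
# find_pipelines_by_tags(["fraud-detection"])
```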

Part 2: Organizing Artifacts with Tags

Tagging Artifacts During Creation

Use ArtifactConfig to tag artifacts as they're created:

Tagging Artifacts Dynamically
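
Dynamic tagging is useful when the right tags depend on the data itself. The sketch below assumes `zenml.add_tags` with `infer_artifact=True`, called from inside a step that has a single output. The row threshold, tag names, and both helper functions are hypothetical.

```python
from typing import List

LARGE_DATASET_ROWS = 1_000  # hypothetical threshold for this example


def tags_for_dataset(n_rows: int, has_missing_values: bool) -> List[str]:
    """Derive tags from properties that are only known at run time."""
    tags = ["engineered"]
    tags.append("large-dataset" if n_rows >= LARGE_DATASET_ROWS else "small-dataset")
    if has_missing_values:
        tags.append("needs-cleaning")
    return tags


def tag_step_output(n_rows: int, has_missing_values: bool) -> List[str]:
    """Call from inside a ZenML step body to tag that step's output."""
    # Lazy import: add_tags is only needed while a step is executing.
    from zenml import add_tags

    tags = tags_for_dataset(n_rows, has_missing_values)
    # infer_artifact=True asks ZenML to resolve the current step's output
    # artifact; this assumes the step produces exactly one output.
    add_tags(tags=tags, infer_artifact=True)
    return tags
```

Separating the pure tag computation from the ZenML call keeps the tagging rules easy to test outside a pipeline run.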

Finding Tagged Artifacts
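
Tagged artifact versions can be queried in a similar way. This sketch assumes `Client.list_artifact_versions` accepts a `tags` argument and that each returned item exposes `artifact.name` and `version`; the grouping helper is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


def artifact_versions_by_name(
    tags: List[str], client: Optional[Any] = None
) -> Dict[str, List[str]]:
    """Group the versions of tagged artifacts by artifact name."""
    if client is None:
        # Lazy import keeps the helper testable with a stub client.
        from zenml.client import Client

        client = Client()
    grouped: Dict[str, List[str]] = defaultdict(list)
    # Only the first page of results is read in this sketch.
    for artifact_version in client.list_artifact_versions(tags=tags).items:
        grouped[artifact_version.artifact.name].append(str(artifact_version.version))
    return dict(grouped)


# Hypothetical usage against a connected ZenML server:
# artifact_versions_by_name(["raw", "financial"])
```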

Part 3: Model Organization with Tags

Creating and Tagging Models
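
A sketch of a tagged model definition, assuming ZenML's `Model` class with its `name`, `description`, and `tags` fields. The model name, description, and tag values are placeholders chosen for the fraud detection scenario.

```python
from typing import Any, List

MODEL_NAME = "fraud_detector"  # hypothetical model name


def model_tags(algorithm: str, stage: str) -> List[str]:
    """Build a consistent tag list for a fraud model."""
    return ["fraud-detection", f"algorithm-{algorithm}", f"stage-{stage}"]


def build_fraud_model(algorithm: str = "random-forest", stage: str = "experimental") -> Any:
    """Create a ZenML Model object that carries the tags above."""
    # Lazy import so the tag helper can be reused without ZenML installed.
    from zenml import Model

    return Model(
        name=MODEL_NAME,
        description="Detects fraudulent transactions",
        tags=model_tags(algorithm, stage),
    )
```

The returned object can be attached to a pipeline through the decorator's `model` argument, for example `@pipeline(model=build_fraud_model())`.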

Part 4: Advanced Tagging Techniques

Exclusive Tags for Production Tracking
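
An exclusive tag is meant to sit on only one entity of a kind at a time, so applying it to a new pipeline run moves it away from the run that held it before. The sketch below assumes ZenML's `Tag` class with an `exclusive` flag; the helper name and tag values are illustrative.

```python
from typing import Any, List


def production_run_tags(domain: str = "fraud-detection") -> List[Any]:
    """Return a tag list that mixes a plain tag with an exclusive one."""
    # Lazy import so this module can be imported without ZenML installed.
    from zenml import Tag

    return [
        domain,  # ordinary tag: any number of runs may carry it
        Tag(name="production", exclusive=True),  # only one holder at a time
    ]
```

The list can be passed wherever pipeline tags are accepted, for example `@pipeline(tags=production_run_tags())`.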

Read more about exclusive tags here.

Cascade Tags for Automatic Artifact Tagging
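
A cascade tag set on a pipeline is intended to propagate to the artifact versions produced by its runs, which avoids tagging each artifact by hand. This sketch assumes ZenML's `Tag` class with a `cascade` flag and that `with_options` accepts `Tag` objects in its `tags` list; the helper name is hypothetical.

```python
from typing import Any


def run_with_cascading_tag(pipeline_obj: Any, tag_name: str) -> Any:
    """Run a pipeline so that `tag_name` also lands on its artifacts."""
    # Lazy import so this module can be imported without ZenML installed.
    from zenml import Tag

    # cascade=True marks the tag for propagation to artifact versions
    # created during the run.
    cascading = Tag(name=tag_name, cascade=True)
    return pipeline_obj.with_options(tags=[cascading])()


# Hypothetical usage, given a pipeline named fraud_detection_pipeline:
# run_with_cascading_tag(fraud_detection_pipeline, "fraud-detection")
```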

Read more about cascade tags here.

Advanced Tag Filtering
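
Beyond exact matches, tag filters can be written as `operator:value` strings. The sketch below assumes that `Client.list_pipeline_runs` accepts such strings in its `tags` argument and that the operators listed are supported; both helper functions are hypothetical.

```python
from typing import Any, List, Optional

# Operators this sketch assumes ZenML's string filters understand.
SUPPORTED_OPERATORS = ("equals", "contains", "startswith", "endswith")


def tag_filter(operator: str, value: str) -> str:
    """Build an `operator:value` filter string, rejecting unknown operators."""
    if operator not in SUPPORTED_OPERATORS:
        raise ValueError(f"Unsupported tag filter operator: {operator!r}")
    return f"{operator}:{value}"


def find_runs(filters: List[str], client: Optional[Any] = None) -> List[str]:
    """Return the names of pipeline runs matching the tag filters."""
    if client is None:
        # Lazy import keeps the helper testable with a stub client.
        from zenml.client import Client

        client = Client()
    # Only the first page of results is read in this sketch.
    return [run.name for run in client.list_pipeline_runs(tags=filters).items]


# Hypothetical usage against a connected ZenML server:
# find_runs([tag_filter("startswith", "fraud"), tag_filter("contains", "prod")])
```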

Part 5: Organizing with Projects (ZenML Pro)

Projects provide logical separation between different initiatives or teams.

Creating and Setting a Project
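
A sketch of creating a project and making it active from Python, assuming a ZenML Pro deployment and the `Client.create_project` and `Client.set_active_project` methods. The helper name, project name, and description are placeholders.

```python
from typing import Any, Optional


def create_and_activate_project(
    name: str, description: str, client: Optional[Any] = None
) -> str:
    """Create a project and switch the client to it."""
    if client is None:
        # Lazy import keeps the helper testable with a stub client.
        from zenml.client import Client

        client = Client()
    client.create_project(name=name, description=description)
    # Subsequent pipelines, artifacts, and models are scoped to this project.
    client.set_active_project(name)
    return name


# Hypothetical usage against a ZenML Pro server:
# create_and_activate_project("fraud-detection", "Fraud detection models")
```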

You can also use the CLI:

Implementing Cross-Project Organization

For consistency across projects, use a standardized tagging strategy:
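
One possible shape for such a strategy is a shared vocabulary of categories and allowed values, turned into `category-value` tags by a small helper. Everything in this sketch (the categories, the values, and the naming scheme) is an illustrative convention rather than something ZenML prescribes.

```python
from typing import Dict, List

# Hypothetical vocabulary shared by every project.
TAG_VOCABULARY: Dict[str, List[str]] = {
    "environment": ["development", "staging", "production"],
    "domain": ["fraud-detection", "credit-scoring"],
    "status": ["experimental", "validated", "deprecated"],
}


def standard_tags(**choices: str) -> List[str]:
    """Turn category=value choices into validated `category-value` tags."""
    unknown = set(choices) - set(TAG_VOCABULARY)
    if unknown:
        raise ValueError(f"Unknown tag categories: {sorted(unknown)}")
    tags = []
    # Iterate over the vocabulary so the output order is always the same.
    for category, allowed in TAG_VOCABULARY.items():
        if category not in choices:
            continue
        value = choices[category]
        if value not in allowed:
            raise ValueError(f"{value!r} is not an allowed {category} tag")
        tags.append(f"{category}-{value}")
    return tags


print(standard_tags(environment="production", domain="fraud-detection"))
# ['environment-production', 'domain-fraud-detection']
```

Because the helper has no project-specific state, the same module can be imported by pipelines in every project.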

Part 6: Practical Organization Patterns

Create a Tag Registry for Consistency
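
A tag registry can be as simple as an `Enum` that lists every approved tag, plus a check that rejects anything else. The member names and tag values below are hypothetical examples.

```python
from enum import Enum
from typing import Iterable, List


class TagRegistry(Enum):
    """Single source of truth for approved tag strings."""

    ENV_DEV = "environment-development"
    ENV_PROD = "environment-production"
    DOMAIN_FRAUD = "domain-fraud-detection"
    STATUS_VALIDATED = "status-validated"


_REGISTERED = {member.value for member in TagRegistry}


def registered(tags: Iterable[str]) -> List[str]:
    """Return the tags unchanged, raising if any is not in the registry."""
    tags = list(tags)
    unknown = [tag for tag in tags if tag not in _REGISTERED]
    if unknown:
        raise ValueError(f"Unregistered tags: {unknown}")
    return tags


# Referencing enum members avoids typos in tag strings.
approved = registered([TagRegistry.ENV_PROD.value, TagRegistry.DOMAIN_FRAUD.value])
```

Passing tags through `registered(...)` before handing them to a pipeline, artifact, or model catches misspelled tags early.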

Find and Fix Orphaned Resources
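
Here "orphaned" is taken to mean resources that carry no tags at all. The sketch below lists untagged pipelines and, when asked, labels them for follow-up. It assumes that `Client.list_pipelines` returns items exposing `name`, `id`, and `tags`, and that `zenml.add_tags` accepts a `pipeline` argument; the fallback tag and helper names are hypothetical.

```python
from typing import Any, Iterable, List, Optional


def find_untagged(resources: Iterable[Any]) -> List[Any]:
    """Keep only the resources whose `tags` collection is empty."""
    return [resource for resource in resources if not resource.tags]


def tag_untagged_pipelines(
    client: Optional[Any] = None,
    fallback_tag: str = "needs-triage",
    dry_run: bool = True,
) -> List[str]:
    """Report untagged pipelines and optionally label them."""
    if client is None:
        # Lazy import keeps the helper testable with a stub client.
        from zenml.client import Client

        client = Client()
    # Only the first page of pipelines is inspected in this sketch.
    orphans = find_untagged(client.list_pipelines().items)
    if not dry_run:
        from zenml import add_tags

        for pipeline in orphans:
            add_tags(tags=[fallback_tag], pipeline=pipeline.id)
    return [pipeline.name for pipeline in orphans]
```

Running with `dry_run=True` first gives a report of what would change before any tags are written.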

Conclusion and Best Practices

A well-designed tagging strategy helps maintain organization as your ML project grows:

  1. Use consistent tag naming conventions - Create a tag registry to ensure consistency

  2. Apply tags at all levels - Tag pipelines, runs, artifacts, and models

  3. Create meaningful tag categories - Environment, domain, status, algorithm type, etc.

  4. Use exclusive tags for state management - Perfect for tracking current production models

  5. Combine tags with projects for complete organization - Use projects for major boundaries, tags for cross-cutting concerns

  6. Document your tagging strategy - Ensure everyone on the team follows the same conventions

Next Steps

Now that you understand how to organize your ML assets, consider exploring:

  1. Managing scheduled pipelines to automate your ML workflows

  2. Integrating your tagging strategy with CI/CD pipelines
