YAML Configuration

Learn how to configure ZenML pipelines using YAML configuration files.

ZenML lets you customize pipeline and step behavior through YAML files, without changing your code. This is particularly useful for separating configuration from code, experimenting with different parameters, and ensuring reproducibility.

Basic Usage

You can apply a YAML configuration file when running a pipeline:

my_pipeline.with_options(config_path="config.yaml")()

This allows you to change pipeline behavior without modifying your code.

Sample Configuration File

Here's a simple example of a YAML configuration file:

# Enable/disable features
enable_cache: False
enable_step_logs: True

# Pipeline parameters
parameters: 
  dataset_name: "my_dataset"
  learning_rate: 0.01

# Step-specific configuration
steps:
  train_model:
    parameters:
      learning_rate: 0.001  # Override the pipeline parameter for this step
    enable_cache: True      # Override the pipeline cache setting

Configuration Hierarchy

ZenML follows a specific hierarchy when resolving configuration:

  1. Runtime Python code (highest precedence)
  2. Step-level YAML configuration
  3. Pipeline-level YAML configuration
  4. Default values in code (lowest precedence)

This hierarchy allows you to define base configurations at the pipeline level and override them for specific steps as needed.

Configuring Steps and Pipelines

Pipeline and Step Parameters

You can specify parameters for pipelines and steps, similar to how you'd define them in Python code:
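As a minimal sketch, the fragment below sets pipeline-level parameters and then step-level ones. The step name `preprocessing` and the parameters `batch_size` and `normalize` are hypothetical placeholders; use the argument names of your own pipeline and step functions.

```yaml
# Pipeline parameters: keys should match the pipeline function's arguments
parameters:
  dataset_name: "my_dataset"
  learning_rate: 0.01
  batch_size: 32            # hypothetical parameter

# Step parameters: keys should match each step function's arguments
steps:
  preprocessing:            # hypothetical step name
    parameters:
      normalize: True       # hypothetical parameter
  train_model:
    parameters:
      learning_rate: 0.001  # step-level value, as in the sample configuration
```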

These settings correspond directly to the parameters you'd normally pass to your pipeline and step functions.

Enable Flags

These boolean flags control aspects of pipeline execution that were covered in the Advanced Features section:
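The sketch below shows how such flags might be set for the whole pipeline and then overridden for single steps. `enable_cache` and `enable_step_logs` come from the sample configuration; `enable_artifact_metadata` and `enable_artifact_visualization` are included on the assumption that your ZenML version supports them, and the step names are hypothetical.

```yaml
# Pipeline-level flags apply to every step unless overridden
enable_cache: True
enable_step_logs: True
enable_artifact_metadata: True        # assumed to be available in your version
enable_artifact_visualization: True   # assumed to be available in your version

# Step-level flags take precedence over the pipeline-level values
steps:
  preprocessing:                      # hypothetical step name
    enable_cache: False
  train_model:
    enable_artifact_visualization: False
```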

Run Name

Set a custom name for the pipeline run:
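A minimal sketch, assuming the top-level key is `run_name`; the value is a made-up example.

```yaml
# Fixed name for the run created from this configuration (hypothetical value)
run_name: "training_run_lr_0.001"

# Possible dynamic alternative, assuming placeholder support in your version:
# run_name: "training_run_{date}_{time}"
```

Run names generally need to be unique, so a fixed name like this is likely to work only once. The `{date}` and `{time}` placeholders in the commented line are an assumption worth checking against the documentation for your ZenML version.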

Resource and Component Configuration

Docker Settings

Configure Docker container settings for pipeline execution:
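The fragment below is one way this could look, assuming Docker options live under `settings.docker` and that the keys mirror fields of ZenML's `DockerSettings` class. The image name, packages, and environment values are placeholders, not recommendations.

```yaml
settings:
  docker:
    # Base image to build on (hypothetical image name)
    parent_image: "my-registry/my-base-image:latest"

    # System packages to install with apt
    apt_packages: ["git", "curl"]

    # Additional Python packages to install
    requirements: ["scikit-learn>=1.0.0"]

    # Environment variables set inside the container
    environment:
      PYTHONUNBUFFERED: "1"
```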

Resource Settings

Configure compute resources for pipeline or step execution:
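A sketch of pipeline-wide resources with a step-level override, assuming the keys `cpu_count`, `gpu_count`, and `memory` under `settings.resources`. The numbers are arbitrary, and whether they are honored depends on the orchestrator or step operator in your stack.

```yaml
# Defaults for every step in the pipeline
settings:
  resources:
    cpu_count: 2
    memory: "4GB"

# Larger allocation for one step only
steps:
  train_model:
    settings:
      resources:
        cpu_count: 4
        gpu_count: 1
        memory: "16GB"
```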

Stack Component Settings

Configure specific stack components for steps:
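The following sketch assumes a stack that contains an MLflow experiment tracker and a step operator. The component names `mlflow_tracker` and `gpu_operator` are hypothetical, and the settings key `experiment_tracker.mlflow` follows the component-type-dot-flavor pattern as an assumption to verify for your version.

```yaml
steps:
  train_model:
    # Stack components this step should use (hypothetical component names)
    experiment_tracker: "mlflow_tracker"
    step_operator: "gpu_operator"

    # Settings passed to a specific component flavor
    settings:
      experiment_tracker.mlflow:
        experiment_name: "image_classification"  # hypothetical experiment name
        nested: True
```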

Working with Configuration Files

Autogenerating Template YAML Files

ZenML provides a command to generate a template configuration file:

This generates a YAML file with all pipeline parameters, step parameters, and configuration options with their default values.

Environment Variables in Configuration

You can reference environment variables in your YAML configuration:
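A minimal sketch, assuming the `${VAR_NAME}` substitution syntax; the variable names are hypothetical and must be set in the environment where the pipeline is launched.

```yaml
settings:
  docker:
    environment:
      # Values are taken from the launching environment (hypothetical variable names)
      API_KEY: "${MY_API_KEY}"
      DATABASE_URL: "${DB_CONNECTION_STRING}"
```

Referencing variables this way keeps secrets out of the configuration file itself, which matters if the file is committed to version control.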

Using Configuration Files for Different Environments

A common pattern is to maintain different configuration files for different environments:

Example development configuration:
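As an illustration only, a development file might disable caching and run on a small dataset. The parameter `dataset_size` and its value are hypothetical.

```yaml
# dev.yaml (hypothetical file name)
enable_cache: False      # always re-run steps while iterating
enable_step_logs: True

parameters:
  dataset_size: "small"  # hypothetical pipeline parameter
```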

Example production configuration:
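A matching production sketch might enable caching, use the full dataset, and request more resources. As before, `dataset_size` is a hypothetical parameter and the resource values are arbitrary.

```yaml
# prod.yaml (hypothetical file name)
enable_cache: True
enable_step_logs: True

parameters:
  dataset_size: "full"   # hypothetical pipeline parameter

settings:
  resources:
    cpu_count: 8
    memory: "16GB"
```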

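You can then choose which configuration to use by passing the matching file as config_path, for example my_pipeline.with_options(config_path="configs/dev.yaml")() for development. The configs/ path here is illustrative.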

Advanced Configuration Options

Model Configuration

Link a pipeline to a ZenML Model:
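A sketch assuming a top-level `model` key whose fields mirror ZenML's `Model` class; every value shown is a placeholder.

```yaml
model:
  name: "classification_model"           # hypothetical model name
  version: "1.0.0"
  description: "Example image classifier"
  tags: ["computer-vision", "classification"]
  license: "Apache 2.0"
```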

Scheduling

Configure pipeline scheduling when using an orchestrator that supports it:
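The fragment below sketches a daily schedule, assuming a top-level `schedule` key with fields named after ZenML's `Schedule` class. Which fields are honored depends on the orchestrator, and the schedule name is hypothetical.

```yaml
schedule:
  name: "daily_training_schedule"  # hypothetical schedule name
  cron_expression: "0 0 * * *"     # every day at midnight
  catchup: false                   # do not backfill missed runs
```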

Conclusion

YAML configuration in ZenML lets you customize pipeline behavior without changing your code. By separating configuration from implementation, you can make your ML workflows more flexible, maintainable, and reproducible.
