🐣Starter guide

Everything you need to know to start using ZenML.

This is an older version of the ZenML documentation. To read and view the latest version please visit this up-to-date URL.


ZenML helps you standardize your ML workflows as Pipelines consisting of decoupled, modular Steps. This enables you to write portable code that can be moved from experimentation to production in seconds.

Pipelines & steps

The simplest ZenML pipeline could look like this:

from zenml import pipeline, step


@step
def step_1() -> str:
    """Returns the `world` string."""
    return "world"


@step(enable_cache=False)
def step_2(input_one: str, input_two: str) -> None:
    """Combines the two strings at its input and prints them."""
    combined_str = f"{input_one} {input_two}"
    print(combined_str)


@pipeline
def my_pipeline():
    output_step_one = step_1()
    step_2(input_one="hello", input_two=output_step_one)


if __name__ == "__main__":
    my_pipeline()

@step is a decorator that converts the function it wraps into a step that can be used within a pipeline.
@pipeline defines a function as a pipeline. Within this function, the steps are called and their outputs are routed to other steps.

Copy this code into a file named run.py and run it:

$ python run.py

Registered pipeline my_pipeline (version 1).
Running pipeline my_pipeline on stack default (caching enabled)
Step step_1 has started.
Step step_1 has finished in 0.121s.
Step step_2 has started.
hello world
Step step_2 has finished in 0.046s.
Pipeline run my_pipeline-... has finished in 0.676s.
Pipeline visualization can be seen in the ZenML Dashboard. Run zenml up to see your pipeline!

The output contains a line like this:

Pipeline visualization can be seen in the ZenML Dashboard. Run zenml up to see your pipeline!

ZenML offers a comprehensive Dashboard for interacting with your Pipelines, Artifacts, and Infrastructure. To see it, deploy the ZenML server locally as described in the next section.

Explore the dashboard

Run zenml up in the environment where you have ZenML installed.

After a few seconds, your browser should open the ZenML Dashboard for you at http://127.0.0.1:8237/

The default user account has the username default and no password.

The dashboard shows that there is 1 pipeline and 1 pipeline run. Feel free to ignore the stacks and components for the time being and continue to the run you just executed.

If you navigate to the run that you just executed, you will see a diagram view of the pipeline run, including a visualization of the data that is passed between the steps. Feel free to explore the Run, its steps, and its artifacts.

If you have closed the browser tab with the ZenML dashboard, you can always reopen it by running zenml show in your terminal.

Recap

Step

Steps are functions with inputs and outputs. For ZenML to work properly, both the inputs and the outputs need type annotations.

@step(enable_cache=False)
def step_2(input_one: str, input_two: str) -> None:
    """Combines the two strings at its input and prints them."""
    combined_str = f"{input_one} {input_two}"
    print(combined_str)
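
The typing requirement can be illustrated without ZenML. The sketch below uses only the Python standard library to show the kind of information that type annotations expose to a framework. It is not ZenML's actual implementation, and the plain function step_2 here is an undecorated stand-in for the step above.

```python
from typing import get_type_hints


def step_2(input_one: str, input_two: str) -> None:
    """Undecorated stand-in with the same signature as the step above."""
    print(f"{input_one} {input_two}")


# get_type_hints maps each parameter name (and "return") to its annotation.
hints = get_type_hints(step_2)

# Split the annotations into declared inputs and the declared output.
input_types = {name: tp for name, tp in hints.items() if name != "return"}
output_type = hints["return"]  # a `-> None` annotation is reported as NoneType
```

Without annotations, get_type_hints returns an empty dictionary, so a framework relying on this information would have nothing to work with.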

Artifacts

The inputs and outputs of a step are called artifacts. ZenML automatically tracks them and stores them in the artifact store. An artifact is produced whenever a step returns an object or a value, and it can then be consumed by other steps. This means that data is not passed between steps in memory. Instead, each output is written to storage when a step finishes and loaded from storage when the next step needs it as input.
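
The write-then-load handoff can be sketched in plain Python. This is a toy illustration only: the helpers save_artifact and load_artifact, the JSON file format, and the temporary directory are all assumptions made for this example and do not reflect how ZenML's artifact store is implemented.

```python
import json
import tempfile
from pathlib import Path


def save_artifact(store: Path, name: str, value) -> Path:
    """Hypothetical helper: write a step output to the store as JSON."""
    path = store / f"{name}.json"
    path.write_text(json.dumps(value))
    return path


def load_artifact(path: Path):
    """Hypothetical helper: read a stored value back for the next step."""
    return json.loads(path.read_text())


def produce_word() -> str:
    return "world"


def combine(input_one: str, input_two: str) -> str:
    return f"{input_one} {input_two}"


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    # The first function's output is persisted instead of being handed over in memory.
    artifact_path = save_artifact(store, "produce_word_output", produce_word())
    # The second function receives a value that was loaded from storage.
    loaded = load_artifact(artifact_path)
    combined = combine("hello", loaded)
```

Because every value crosses a storage boundary, each intermediate result remains available for inspection after the run, at the cost of serialization overhead.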

Pipeline

Pipelines are also functions. However, you are only allowed to call steps within a pipeline function. The inputs for steps called within a pipeline can either be the outputs of previous steps or values that you pass in directly (as long as they're JSON serializable).

@pipeline
def my_pipeline():
    output_step_one = step_1()
    step_2(input_one="hello", input_two=output_step_one)
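
To get a feel for which values count as JSON serializable, the following sketch checks a value with the standard library's json module. The helper is_json_serializable is hypothetical and written for this example; ZenML's own validation may differ.

```python
import json


def is_json_serializable(value) -> bool:
    """Hypothetical helper: return True if json.dumps can encode the value."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        # TypeError: unsupported type; ValueError: e.g. circular references.
        return False
    return True


string_ok = is_json_serializable("hello")
config_ok = is_json_serializable({"lr": 0.01, "layers": [64, 32]})
set_ok = is_json_serializable({1, 2, 3})
object_ok = is_json_serializable(object())
```

Strings, numbers, booleans, None, and lists or dictionaries built from them pass this check. Sets and arbitrary objects do not, so under this assumption they would need to be produced by a step rather than passed in directly.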

Executing the Code

To execute the pipeline, call the function that you decorated with the @pipeline decorator.

if __name__ == "__main__":
    my_pipeline()

The following sections cover these topics in more detail:

Create an ML Pipeline

Learning how to set up and configure your pipeline.

Caching previous executions

Iterating quickly with ZenML through fast caching.

Version pipelines

Understanding how and when the version of a pipeline is incremented.

Fetch runs after execution

Inspecting a finished pipeline run and its outputs.

Understand stacks

Learning how to switch the infrastructure backend of your code.

Switch to production

Bringing your pipelines into production using cloud stacks.

Follow best practices

Recommended repository structure and best practices.

ZenML project templates

Rocketstart your ZenML journey!

Code Example of this Section

from zenml import pipeline, step


@step
def step_1() -> str:
    """Returns the `world` string."""
    return "world"


@step(enable_cache=False)
def step_2(input_one: str, input_two: str) -> None:
    """Combines the two strings at its input and prints them."""
    combined_str = f"{input_one} {input_two}"
    print(combined_str)


@pipeline
def my_pipeline():
    output_step_one = step_1()
    step_2(input_one="hello", input_two=output_step_one)


if __name__ == "__main__":
    my_pipeline()
