🐣 Starter guide
Everything you need to know to start using ZenML.
This is an older version of the ZenML documentation. For up-to-date information, see the latest version of the ZenML docs.
ZenML helps you standardize your ML workflows as Pipelines consisting of decoupled, modular Steps. This enables you to write portable code that can be moved from experimentation to production in seconds.
Pipelines & steps
The simplest ZenML pipeline could look like this:
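A minimal sketch of such a pipeline is shown below. It assumes the step and pipeline decorators are importable from the top-level zenml package; the names step_1, step_2, and my_pipeline are illustrative. The try/except fallback is not part of ZenML and exists only so the file still runs where ZenML is not installed.

```python
try:
    from zenml import pipeline, step
except ImportError:
    # Not part of ZenML: pass-through stand-ins so the sketch runs without it.
    def step(func):
        return func

    def pipeline(func):
        return func


@step
def step_1() -> str:
    """Return the second word of the greeting."""
    return "world"


@step
def step_2(input_one: str, input_two: str) -> str:
    """Join the two input strings and print the result."""
    combined = f"{input_one} {input_two}"
    print(combined)
    return combined


@pipeline
def my_pipeline():
    """Call the steps and route the output of step_1 into step_2."""
    output_step_one = step_1()
    step_2(input_one="hello", input_two=output_step_one)


if __name__ == "__main__":
    # Calling the decorated function executes the pipeline.
    my_pipeline()
```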
@step is a decorator that converts its function into a step that can be used within a pipeline.
@pipeline defines a function as a pipeline; within this function, the steps are called and their outputs are routed.
Copy this code into a file named run.py and run it.
In the output, there's a line with something like this.
ZenML offers you a comprehensive Dashboard to interact with your Pipelines, Artifacts, and Infrastructure. To see it, simply deploy the ZenML server locally in the next section.
Explore the dashboard
Run zenml up in the environment where you have ZenML installed.
After a few seconds, your browser should open the ZenML Dashboard for you at http://127.0.0.1:8237/
The default user account has the username default and no password.
As you can see, the dashboard shows that there is 1 pipeline and 1 pipeline run. Feel free to ignore the stack and components for the time being, and continue to the run you just executed.
If you navigate to the run that you just executed, you will see a diagram view of the pipeline run, including a visualization of the data that is passed between the steps. Feel free to explore the Run, its steps, and its artifacts.
If you have closed the browser tab with the ZenML dashboard, you can always reopen it by running zenml show in your terminal.
Recap
Step
Steps are functions with inputs and outputs. For ZenML to work properly, these inputs and outputs need type annotations.
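Why the annotations matter can be sketched in plain Python: type hints remain available at runtime, so a framework can inspect a function's declared inputs and outputs before calling it. The function train_step below is hypothetical, and the snippet only illustrates the general mechanism, not ZenML's internals.

```python
from typing import get_type_hints


def train_step(learning_rate: float, epochs: int) -> str:
    """A hypothetical step-like function with typed inputs and a typed output."""
    return f"trained for {epochs} epochs at lr={learning_rate}"


# Annotations can be read without calling the function.
hints = get_type_hints(train_step)
input_types = {name: tp for name, tp in hints.items() if name != "return"}
output_type = hints["return"]

print(input_types)
print(output_type)
```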
Artifacts
The inputs and outputs of a step are called artifacts. They are automatically tracked and stored by ZenML in the artifact store. Artifacts are produced whenever a step returns an object or a value, and they are passed on to the steps that consume them. This means the data is not passed between steps in memory. Instead, the output of a step is written to storage, and the next step loads it from storage as its input.
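The write-then-load handoff described above can be sketched in plain Python. This is a simplified illustration, not ZenML's implementation: the temporary directory stands in for an artifact store, pickle stands in for whatever serialization is actually used, and save_artifact, load_artifact, produce_numbers, and sum_numbers are hypothetical names.

```python
import pickle
import tempfile
from pathlib import Path


def save_artifact(store: Path, name: str, value) -> Path:
    """Serialize a step output into the store and return its path."""
    path = store / f"{name}.pkl"
    path.write_bytes(pickle.dumps(value))
    return path


def load_artifact(path: Path):
    """Deserialize a stored output so the next step can consume it."""
    return pickle.loads(path.read_bytes())


def produce_numbers() -> list:
    """Hypothetical first step."""
    return [1, 2, 3]


def sum_numbers(numbers: list) -> int:
    """Hypothetical second step."""
    return sum(numbers)


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    # The output of the first step is written to storage, not handed over in memory.
    artifact_path = save_artifact(store, "numbers", produce_numbers())
    # The input of the second step is loaded back from storage.
    total = sum_numbers(load_artifact(artifact_path))

print(total)
```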
Pipeline
Pipelines are also functions. However, you are only allowed to call steps within this function. The inputs for steps called within a pipeline can either be the outputs of previous steps or alternatively, you can pass in values directly (as long as they're JSON serializable).
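Whether a directly passed value qualifies can be checked with the standard json module. This is a sketch of the idea only; is_json_serializable is a hypothetical helper, and ZenML's own validation may differ.

```python
import json


def is_json_serializable(value) -> bool:
    """Return True if json.dumps accepts the value."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


dict_ok = is_json_serializable({"epochs": 10, "layers": [64, 32]})
set_ok = is_json_serializable({1, 2, 3})

print(dict_ok)
print(set_ok)
```

Plain dictionaries, lists, strings, and numbers pass this check; a Python set does not.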
Executing the Code
Executing the pipeline is as easy as calling the function that you decorated with the @pipeline decorator.
The next sections cover these topics in more detail:
Create an ML Pipeline
Learning how to set up and configure your pipeline.
Caching previous executions
Iterating quickly with ZenML through fast caching.
Version pipelines
Understanding how and when the version of a pipeline is incremented.
Fetch runs after execution
Inspecting a finished pipeline run and its outputs.
Understand stacks
Learning how to switch the infrastructure backend of your code.
Switch to production
Bringing your pipelines into production using cloud stacks.
Follow best practices
Recommended repository structure and best practices.
ZenML project templates
Rocketstart your ZenML journey!