Core concepts
Discovering the core concepts behind ZenML.
ZenML is an extensible, open-source MLOps framework for creating portable, production-ready MLOps pipelines. It's built for data scientists, ML Engineers, and MLOps Developers to collaborate as they move from development to production. To achieve this goal, ZenML introduces various concepts for different aspects of an ML workflow, and we can categorize these concepts under three different threads:
1. Development
As a developer, how do I design my machine learning workflows?
2. Execution
While executing, how do my workflows utilize the large landscape of MLOps tooling/infrastructure?
3. Management
How do I establish and maintain a production-grade and efficient solution?
First, let's look at the main concepts which play a role during the development stage of an ML workflow with ZenML.
Steps are functions annotated with the @step decorator. The simplest step is a plain Python function with the decorator applied.
These functions can also have inputs and outputs. For ZenML to work properly, these should preferably be typed.
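As a minimal sketch of the two cases above, the snippet below defines one step without inputs or outputs and one with typed inputs and a typed output. It assumes ZenML is installed and that `step` can be imported from the top-level `zenml` package; the function names are illustrative, and the `except` branch is only a stand-in so the sketch also runs where ZenML is not installed.

```python
try:
    from zenml import step
except ImportError:
    # Stand-in (not part of ZenML) so the sketch still runs without ZenML.
    def step(func):
        return func


@step
def say_hello() -> None:
    """Simplest case: no inputs and no outputs."""
    print("Hello from a step!")


@step
def combine(first: str, second: str) -> str:
    """A step with typed inputs and a typed output."""
    return f"{first} {second}"
```

The type annotations on `combine` are the part that matters here: they state what kind of data goes into and comes out of the step.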
At its core, ZenML follows a pipeline-based workflow for your projects. A pipeline consists of a series of steps, organized in any order that makes sense for your use case.
A step might use the outputs from a previous step, in which case it must wait until that step has completed before starting. Keep this in mind when organizing your steps.
Pipelines and steps are defined in code using Python decorators or classes. This is where the core business logic and value of your work lives, and you will spend most of your time defining these two things.
Even though pipelines are simple Python functions, you are only allowed to call steps within this function. The inputs for steps called within a pipeline can either be the outputs of previous steps or alternatively, you can pass in values directly (as long as they're JSON-serializable).
Executing the pipeline is as easy as calling the function that you decorated with the @pipeline decorator.
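The following sketch puts these pieces together: two steps, a pipeline that wires the output of the first into the second alongside a directly passed value, and a call that triggers the run. It assumes ZenML is installed and that `step` and `pipeline` can be imported from the top-level `zenml` package; all function names are illustrative, and the `except` branch is only a stand-in so the sketch also runs where ZenML is not installed.

```python
try:
    from zenml import pipeline, step
except ImportError:
    # Stand-ins (not part of ZenML) so the sketch still runs without ZenML.
    def step(func):
        return func

    pipeline = step


@step
def load_greeting() -> str:
    """First step: produces a string output."""
    return "world"


@step
def combine(first: str, second: str) -> str:
    """Second step: takes a direct value and the previous step's output."""
    return f"{first} {second}"


@pipeline
def greeting_pipeline():
    """Only steps are called here; outputs are wired into later steps."""
    greeting = load_greeting()
    combine(first="hello", second=greeting)


if __name__ == "__main__":
    # Calling the decorated function triggers a pipeline run.
    greeting_pipeline()
```

Because `combine` consumes the output of `load_greeting`, it can only start once `load_greeting` has finished.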
Artifacts represent the data that goes through your steps as inputs and outputs, and they are automatically tracked and stored by ZenML in the artifact store. They are produced by and circulated among steps whenever your step returns an object or a value. This means the data is not passed between steps in memory. Rather, when a step finishes executing, its outputs are written to storage, and when a new step is executed, its inputs are loaded from storage.
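To make the storage handoff concrete, here is a toy sketch using only the Python standard library. It is not how ZenML is implemented; the function names, the file name, and the use of `pickle` are assumptions chosen for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def produce_numbers() -> list:
    return [1, 2, 3]


def sum_numbers(numbers: list) -> int:
    return sum(numbers)


def run_with_storage(storage_dir: Path) -> int:
    """Run two steps, handing data over through a file instead of memory."""
    artifact_path = storage_dir / "produce_numbers_output.pkl"

    # Step 1 finishes: its output is written to storage.
    artifact_path.write_bytes(pickle.dumps(produce_numbers()))

    # Step 2 starts: its input is loaded back from storage.
    numbers = pickle.loads(artifact_path.read_bytes())
    return sum_numbers(numbers)


with tempfile.TemporaryDirectory() as tmp:
    result = run_with_storage(Path(tmp))
```

The second step never sees the in-memory object returned by the first one; it only sees what was persisted.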
The serialization and deserialization logic of artifacts is defined by Materializers.
Models are used to represent the outputs of a training process along with all metadata associated with that output. In other words, models in ZenML are more broadly defined as the weights together with any associated information. Models are first-class citizens in ZenML, and as such, viewing and using them is unified and centralized in the ZenML API, the client, and the ZenML Pro dashboard.
Materializers define how artifacts live in between steps. More precisely, they define how data of a particular type can be serialized/deserialized, so that the steps are able to load the input data and store the output data.
All materializers use the base abstraction called the BaseMaterializer class. While ZenML comes built-in with various implementations of materializers for different datatypes, if you are using a library or a tool that doesn't work with our built-in options, you can write your own custom materializer to ensure that your data can be passed from step to step.
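The idea of one serializer per data type can be sketched with the standard library alone. The classes below are hypothetical and do not reproduce the interface of ZenML's `BaseMaterializer`; they only illustrate how a type-specific save/load pair lets data travel between steps.

```python
import json
import tempfile
from pathlib import Path


class ToyMaterializer:
    """Hypothetical base class: one materializer per data type."""

    def __init__(self, uri: Path) -> None:
        self.uri = uri  # location of the artifact in storage

    def save(self, data) -> None:
        raise NotImplementedError

    def load(self):
        raise NotImplementedError


class DictMaterializer(ToyMaterializer):
    """Serializes dictionaries as JSON files."""

    def save(self, data: dict) -> None:
        (self.uri / "data.json").write_text(json.dumps(data))

    def load(self) -> dict:
        return json.loads((self.uri / "data.json").read_text())


# Registry mapping a data type to the materializer that handles it.
MATERIALIZERS = {dict: DictMaterializer}

with tempfile.TemporaryDirectory() as tmp:
    metrics = {"accuracy": 0.9}
    materializer = MATERIALIZERS[type(metrics)](Path(tmp))
    materializer.save(metrics)      # after the producing step
    restored = materializer.load()  # before the consuming step
```

Supporting a new data type in this toy model means writing another subclass and registering it, which is the same motivation behind writing a custom materializer.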
When we think about steps as functions, we know they receive input in the form of artifacts. We also know that they produce output (in the form of artifacts, stored in the artifact store). But steps also take parameters. The parameters that you pass into the steps are also (helpfully!) stored by ZenML. This helps freeze the iterations of your experimentation workflow in time, so you can return to them exactly as you ran them. On top of the parameters that you provide for your steps, you can also use different Settings to configure the runtime behavior of your infrastructure and pipelines.
ZenML exposes the concept of a Model, which consists of multiple different model versions. A model version represents a unified view of the ML models that are created, tracked, and managed as part of a ZenML project. Model versions link all other entities to a centralized view.
Once you have implemented your workflow by using the concepts described above, you can focus your attention on the execution of the pipeline run.
When you want to execute a pipeline run with ZenML, Stacks come into play. A Stack is a collection of stack components, where each component represents the respective configuration regarding a particular function in your MLOps pipeline such as orchestration systems, artifact repositories, and model deployment platforms.
For instance, if you take a close look at the default local stack of ZenML, you will see two components that are required in every stack in ZenML, namely an orchestrator and an artifact store.
Keep in mind that each one of these components is built on top of base abstractions and is completely extensible.
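One way to picture a stack is as a small record with two mandatory fields and any number of optional ones. The dataclass below is a hypothetical model, not ZenML's own stack class, and the component values are placeholder strings chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStack:
    """Hypothetical stack: a named collection of component configurations."""

    name: str
    orchestrator: str    # required in every stack
    artifact_store: str  # required in every stack
    extras: dict = field(default_factory=dict)  # optional components

    def components(self) -> dict:
        return {
            "orchestrator": self.orchestrator,
            "artifact_store": self.artifact_store,
            **self.extras,
        }


local_stack = ToyStack(name="default", orchestrator="local", artifact_store="local")
extended_stack = ToyStack(
    name="extended",
    orchestrator="local",
    artifact_store="local",
    extras={"model_deployer": "example-deployer"},
)
```

Omitting `orchestrator` or `artifact_store` raises a `TypeError` at construction time, which mirrors the rule that both are required in every stack.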
An Orchestrator is a workhorse that coordinates all the steps to run in a pipeline. Since pipelines can be set up with complex combinations of steps with various asynchronous dependencies between them, the orchestrator acts as the component that decides what steps to run and when to run them.
ZenML comes with a default local orchestrator designed to run on your local machine. This is useful, especially during the exploration phase of your project. You don't have to rent a cloud instance just to try out basic things.
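Deciding what to run and when can be illustrated with a dependency graph and a topological sort. The sketch below uses `graphlib` from the Python standard library (Python 3.9+); the step names and their dependencies are made up, and this is not ZenML's orchestrator code.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps whose outputs it needs.
dependencies = {
    "load_data": set(),
    "validate_data": {"load_data"},
    "preprocess": {"load_data"},
    "train": {"preprocess", "validate_data"},
    "evaluate": {"train"},
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()

execution_batches = []
while sorter.is_active():
    # Steps whose dependencies have all finished can start now.
    ready = sorted(sorter.get_ready())
    execution_batches.append(ready)
    # Mark the batch as finished so dependent steps become ready.
    sorter.done(*ready)
```

Here `validate_data` and `preprocess` land in the same batch because both depend only on `load_data`, so they could run side by side, while `train` has to wait for both.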
An Artifact Store is a component that houses all data that pass through the pipeline as inputs and outputs. Each artifact that gets stored in the artifact store is tracked and versioned and this allows for extremely useful features like data caching which speeds up your workflows.
Similar to the orchestrator, ZenML comes with a default local artifact store designed to run on your local machine. This is useful, especially during the exploration phase of your project. You don't have to set up a cloud storage system to try out basic things.
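The link between stored artifacts and caching can be sketched as follows: if a step has already run with the same inputs, its stored output is reused instead of recomputed. The class below is a hypothetical in-memory model; ZenML's actual cache invalidation rules are not described in this section and are not reproduced here.

```python
import hashlib
import json


class ToyArtifactStore:
    """Hypothetical in-memory store that caches step outputs by input."""

    def __init__(self) -> None:
        self._artifacts = {}  # cache key -> stored output
        self.versions = {}    # step name -> number of stored versions

    def run_step(self, func, **inputs):
        # Identical step name and inputs produce the same cache key.
        payload = json.dumps([func.__name__, inputs], sort_keys=True)
        key = hashlib.sha256(payload.encode()).hexdigest()
        if key in self._artifacts:
            return self._artifacts[key], True  # cache hit: step is skipped
        output = func(**inputs)
        self._artifacts[key] = output
        self.versions[func.__name__] = self.versions.get(func.__name__, 0) + 1
        return output, False


calls = []


def square(x: int) -> int:
    calls.append(x)  # records every real execution
    return x * x


store = ToyArtifactStore()
first, first_cached = store.run_step(square, x=4)
second, second_cached = store.run_step(square, x=4)
```

The second call returns the stored value without executing `square` again, which is where the speed-up comes from.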
ZenML provides a dedicated base abstraction for each stack component type. These abstractions are used to develop solutions, called Flavors, tailored to specific use cases/tools. With ZenML installed, you get access to a variety of built-in and integrated Flavors for each component type, but users can also leverage the base abstractions to create their own custom flavors.
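The relationship between a base abstraction and its flavors follows the usual abstract-base-class pattern. The sketch below is a generic Python illustration with hypothetical class and method names; it does not reproduce ZenML's real base classes or their method signatures.

```python
from abc import ABC, abstractmethod


class ToyArtifactStoreBase(ABC):
    """Hypothetical base abstraction for one stack component type."""

    @abstractmethod
    def write(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...


class InMemoryFlavor(ToyArtifactStoreBase):
    """A custom flavor: same interface, different backend."""

    def __init__(self) -> None:
        self._data = {}

    def write(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def read(self, key: str) -> bytes:
        return self._data[key]


store: ToyArtifactStoreBase = InMemoryFlavor()
store.write("model", b"weights")
loaded = store.read("model")
```

Code that only relies on the base interface keeps working when one flavor is swapped for another, which is what makes the component types extensible.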
When it comes to production-grade solutions, it is rarely enough to just run your workflow locally without including any cloud infrastructure.
Thanks to the separation between the pipeline code and the stack in ZenML, you can easily switch your stack independently from your code. For instance, all it would take you to switch from an experimental local stack running on your machine to a remote stack that employs a full-fledged cloud infrastructure is a single CLI command.
In order to benefit from the aforementioned core concepts to their fullest extent, it is essential to deploy and manage a production-grade environment that interacts with your ZenML installation.
To use stack components that are running remotely on a cloud infrastructure, you need to deploy a ZenML Server so it can communicate with these stack components and run your pipelines. The server is also responsible for managing ZenML business entities like pipelines, steps, models, etc.
In order to benefit from the advantages of using a deployed ZenML server, you can either choose to use the ZenML Pro SaaS offering which provides a control plane for you to create managed instances of ZenML servers, or deploy it in your self-hosted environment.
On top of the communication with the stack components, the ZenML Server also keeps track of all the bits of metadata around a pipeline run. With a ZenML server, you are able to access all of your previous experiments with the associated details. This is extremely helpful in troubleshooting.
The ZenML Server also acts as a centralized secrets store that safely and securely stores sensitive data such as credentials used to access the services that are part of your stack. It can be configured to use a variety of different backends for this purpose, such as the AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, and Hashicorp Vault.
Secrets are sensitive data that you don't want to store in your code or configure alongside your stacks and pipelines. ZenML includes a centralized secrets store that you can use to store and access your secrets securely.
Collaboration is a crucial aspect of any MLOps team as they often need to bring together individuals with diverse skills and expertise to create a cohesive and effective workflow for machine learning projects. A successful MLOps team requires seamless collaboration between data scientists, engineers, and DevOps professionals to develop, train, deploy, and maintain machine learning models.
With a deployed ZenML Server, users have the ability to create their own teams and project structures. They can easily share pipelines, runs, stacks, and other resources, streamlining the workflow and promoting teamwork.
The ZenML Dashboard also communicates with the ZenML Server to visualize your pipelines, stacks, and stack components. The dashboard serves as a visual interface to showcase collaboration with ZenML. You can invite users, and share your stacks with them.
When you start working with ZenML, you'll begin with a local ZenML setup, and when you want to move beyond it, you will need to deploy ZenML. Don't worry though, there is a one-click way to do it, which we'll learn about later.
ZenML also provides a VS Code extension that allows you to interact with your ZenML stacks, runs, and server directly from your VS Code editor. If you're working on code in your editor, you can easily inspect and switch the stacks you're using, as well as inspect and delete pipelines.