Track ML models
Creating a full picture of an ML model using the Model Control Plane
This is an older version of the ZenML documentation. To read and view the latest version please visit this up-to-date URL.
Keeping track of ML models in ZenML
As discussed in the Core Concepts, ZenML also contains the notion of a Model, which consists of many ModelVersions (the iterations of the model). These concepts are exposed in the Model Control Plane (MCP for short).
What is a ZenML Model?
Before diving in, let's take some time to build an understanding of what we mean when we say Model in ZenML terms. A Model is simply an entity that groups pipelines, artifacts, metadata, and other crucial business data together. Please note that one of the most common artifacts associated with a Model in ZenML is the so-called technical model, which is the actual model file or files that hold the weights and parameters of a machine learning training result. However, this is not the only relevant artifact; artifacts such as the training data and the predictions this model produces in production are also linked inside a ZenML Model. In this sense, a ZenML Model is a concept that more broadly encapsulates your ML product's business logic.
Models are first-class citizens in ZenML and, as such, viewing and using them is unified and centralized in the ZenML API, the ZenML client, and the ZenML Cloud dashboard.
These models can be viewed within ZenML. The following command lists all models:
zenml model list
Configuring a model in a pipeline
The easiest way to use a ZenML model is to pass a model version object as part of a pipeline run. This can be done easily at a pipeline or a step level, or via a YAML config.
Once you configure a pipeline this way, all artifacts generated during pipeline runs are automatically linked to the specified model version. This connecting of artifacts provides lineage tracking and transparency into what data and models are used during training, evaluation, and inference.
Configuring a pipeline this way establishes a link between all artifacts that pass through the ZenML pipeline and the model. This includes the technical model, which is the output of the svc_trainer step. You will be able to see all associated artifacts and pipeline runs within one view.
Further, this pipeline run and all other pipeline runs that are configured with this model version will be linked to this model as well.
You can see all versions of a model, along with their associated artifacts and runs, from the CLI. The following command lists all versions of a particular model:
zenml model version list <MODEL_NAME>
The following command lists the pipeline runs associated with a model version:
zenml model version runs <MODEL_NAME> <MODEL_VERSIONNAME>
The following commands list the artifacts associated with a model version:
zenml model version data_artifacts <MODEL_NAME> <MODEL_VERSIONNAME>
zenml model version model_artifacts <MODEL_NAME> <MODEL_VERSIONNAME>
zenml model version deployment_artifacts <MODEL_NAME> <MODEL_VERSIONNAME>
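To make the grouping concrete, here is a minimal sketch of how a model, its versions, and the linked runs and artifacts relate to one another. The ModelSketch and ModelVersionSketch classes are hypothetical and are not part of ZenML; only the three artifact categories mirror the CLI commands above, and the run and artifact names are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelVersionSketch:
    """Hypothetical stand-in for one iteration of a model (not a ZenML class)."""

    name: str
    # Names of the pipeline runs that were configured with this version.
    runs: List[str] = field(default_factory=list)
    # Linked artifacts, grouped the same way the CLI commands group them.
    data_artifacts: List[str] = field(default_factory=list)
    model_artifacts: List[str] = field(default_factory=list)
    deployment_artifacts: List[str] = field(default_factory=list)


@dataclass
class ModelSketch:
    """Hypothetical stand-in for a model that groups many versions."""

    name: str
    versions: Dict[str, ModelVersionSketch] = field(default_factory=dict)

    def add_version(self, version_name: str) -> ModelVersionSketch:
        """Create a new, empty version and register it under this model."""
        version = ModelVersionSketch(name=version_name)
        self.versions[version_name] = version
        return version


model = ModelSketch(name="iris_classifier")
v1 = model.add_version("1")
v1.runs.append("training_run_1")            # illustrative run name
v1.data_artifacts.append("training_data")   # illustrative artifact names
v1.model_artifacts.append("svc_trainer_output")
```

Everything hangs off the version rather than the model itself, which is why the CLI commands take both a model name and a version name.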
Fetching the model in a pipeline
When configured at the pipeline or step level, the model version will be available through the StepContext or PipelineContext.
Logging metadata to the ModelVersion object
Just as one can associate metadata with artifacts, model versions too can take a dictionary of key-value pairs to capture their metadata. This is achieved using the log_model_metadata method.
Choosing whether to log metadata with artifacts or with model versions depends on the scope and purpose of the information you wish to capture. Artifact metadata is best for details specific to individual outputs, while model version metadata is suitable for broader information relevant to the overall model. By utilizing ZenML's metadata logging capabilities and special types, you can enhance the traceability, reproducibility, and analysis of your ML workflows.
For further depth, there is an advanced metadata logging guide that goes into more detail about logging metadata in ZenML.
Using the stages of a model
A model's versions can exist in various stages. These are meant to signify their lifecycle state:
staging: This version is staged for production.
production: This version is running in a production setting.
latest: The latest version of the model.
archived: This is archived and no longer relevant. This stage occurs when a model moves out of any other stage.
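The stage transitions described above can be sketched with a small in-memory registry. This is a toy illustration only: the stages dictionary and the register, set_stage, and latest helpers are hypothetical and are not ZenML's API. It assumes that a version displaced from a stage becomes archived, and that latest is derived from creation order rather than assigned.

```python
from typing import Dict, Optional

# Hypothetical registry: version name -> stage (None means no stage yet).
stages: Dict[str, Optional[str]] = {}


def register(version: str) -> None:
    """Add a new version without any stage."""
    stages[version] = None


def set_stage(version: str, stage: str) -> None:
    """Move `version` into `stage`, archiving whichever version held it before."""
    for other, current in stages.items():
        if current == stage and other != version:
            # Assumption: leaving a stage means becoming archived.
            stages[other] = "archived"
    stages[version] = stage


def latest() -> str:
    """Return the most recently registered version (dicts keep insertion order)."""
    return list(stages)[-1]


register("1")
register("2")
set_stage("1", "production")
set_stage("2", "production")  # version "1" is displaced and archived
```

After these calls, version "2" holds the production stage, version "1" is archived, and latest() returns "2".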
Facilitating artifacts exchange between pipelines using MCP
A ZenML Model spans multiple pipelines and is a key concept that brings disparate pipelines together. Consider a simple example with two pipelines, train_and_promote and do_predictions.
Each time the train_and_promote pipeline runs, it creates a new iris_classifier. However, it only promotes the created model to production if a certain accuracy threshold is met. The do_predictions pipeline simply picks up the latest promoted model and runs batch inference on it. That way, the two pipelines can be run independently while still relying on each other's output.
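The interplay between the two pipelines can be sketched in plain Python. This is a conceptual illustration, not ZenML code: the state dictionary stands in for the Model Control Plane, the two functions stand in for the pipelines, and the 0.9 threshold is an arbitrary assumption.

```python
from typing import Any, Dict, List

# Hypothetical shared state standing in for the Model Control Plane.
state: Dict[str, Any] = {"versions": [], "production": None}


def train_and_promote(accuracy: float, threshold: float = 0.9) -> int:
    """Always create a new version; promote it only if it clears the threshold."""
    state["versions"].append({"accuracy": accuracy})
    new_version = len(state["versions"]) - 1
    if accuracy >= threshold:
        state["production"] = new_version
    return new_version


def do_predictions(batch: List[float]) -> List[str]:
    """Run batch inference with whichever version is currently promoted."""
    if state["production"] is None:
        raise RuntimeError("no version has been promoted to production yet")
    version = state["production"]
    # Placeholder "inference": tag each input with the version that handled it.
    return [f"v{version}:{x}" for x in batch]


train_and_promote(accuracy=0.95)  # version 0 is created and promoted
train_and_promote(accuracy=0.80)  # version 1 is created but not promoted
predictions = do_predictions([1.0, 2.0])
```

Here the second training run creates a version that never reaches production, so do_predictions keeps using version 0. Neither function calls the other; they only share the promoted-version pointer.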
One way of achieving this is to fetch the model directly in your step.
However, this approach has the downside that if the step is cached, it could lead to unexpected results. You could simply disable the cache for that step or for the corresponding pipeline. Another way of achieving this is to resolve the artifact at the pipeline level.
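The caching pitfall can be demonstrated with a toy cache. This sketch is not ZenML's caching implementation: it assumes, as a simplification, that a step's cached output is reused whenever its inputs are unchanged. The cached_step decorator, the production dictionary, and both step functions are hypothetical.

```python
from typing import Callable, Dict, Tuple

production = {"version": 1}  # hypothetical promoted version, changes over time
cache: Dict[Tuple, str] = {}


def cached_step(fn: Callable[..., str]) -> Callable[..., str]:
    """Toy cache: reuse the previous output whenever the inputs are unchanged."""

    def wrapper(*args):
        key = (fn.__name__, args)
        if key not in cache:
            cache[key] = fn(*args)
        return cache[key]

    return wrapper


@cached_step
def predict_fetching_inside(batch: Tuple[float, ...]) -> str:
    # The model is looked up inside the step, so the cache key cannot see it.
    return f"v{production['version']} on {len(batch)} rows"


@cached_step
def predict_with_model_input(version: int, batch: Tuple[float, ...]) -> str:
    # The model version is an explicit input, so a new version changes the key.
    return f"v{version} on {len(batch)} rows"


batch = (1.0, 2.0)
predict_fetching_inside(batch)
predict_with_model_input(production["version"], batch)

production["version"] = 2  # a newer version gets promoted

stale = predict_fetching_inside(batch)                          # cache hit
fresh = predict_with_model_input(production["version"], batch)  # cache miss
```

The step that fetches the model internally keeps returning the result computed with version 1, while the step that receives the version as an input recomputes with version 2. Resolving the model outside the step and passing it in makes the dependency visible to the cache.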
Ultimately, both approaches are fine. Users should decide which one to use based on their own preferences.
ZenML Model and Model Versions are some of the most powerful features in ZenML. To understand them in a deeper way, read the dedicated Model Management guide.