How to track pipeline runs with metadata stores
Keeping a historical record of your pipeline runs is a core MLOps practice. This makes it possible to trace back the lineage or provenance of the data that was used to train a model, for example, or to go back and compare the performance of a particular model at different points in time. Features like these are becoming increasingly important in production ML to help you make informed decisions about the project and to provide better visibility when something goes wrong. They are especially useful in cases where legal compliance and liability are a factor.
The Metadata Store is a central component in the MLOps stack where the pipeline runtime information is versioned and stored. The configuration of each pipeline and step, along with the artifacts they produce, is tracked within the Metadata Store.
ZenML puts a lot of emphasis on guaranteed tracking of inputs across pipeline steps. Information about every pipeline run is collected and automatically recorded in the Metadata Store: the pipeline configuration, the pipeline steps and their configuration, as well as the types of artifacts produced by pipeline step runs and the location in the Artifact Store where they are kept. This is coupled with saving the artifact contents themselves in the Artifact Store to provide extremely useful features such as caching, provenance/lineage tracking and pipeline reproducibility.
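To make the kind of record described above more concrete, here is a minimal sketch of a metadata store built on Python's standard sqlite3 module. It is not ZenML's implementation or API: the table layout and the helper names (open_store, record_run, record_step, cache_key, cached_outputs) are hypothetical and chosen only for illustration. It assumes that a step can be reused from cache when its name, configuration and input artifact locations are unchanged.

```python
import hashlib
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite database and create the (hypothetical) metadata tables."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS runs (
            id INTEGER PRIMARY KEY, pipeline TEXT, config TEXT);
        CREATE TABLE IF NOT EXISTS steps (
            id INTEGER PRIMARY KEY, run_id INTEGER REFERENCES runs(id),
            name TEXT, config TEXT, cache_key TEXT);
        CREATE TABLE IF NOT EXISTS artifacts (
            id INTEGER PRIMARY KEY, step_id INTEGER REFERENCES steps(id),
            name TEXT, type TEXT, uri TEXT);
        """
    )
    return conn


def cache_key(step_name, step_config, input_uris):
    """Hash everything that determines a step's outputs (an assumption of this sketch)."""
    payload = json.dumps([step_name, step_config, sorted(input_uris)], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def record_run(conn, pipeline, config):
    """Store the pipeline name and its configuration; return the new run id."""
    cur = conn.execute(
        "INSERT INTO runs (pipeline, config) VALUES (?, ?)",
        (pipeline, json.dumps(config, sort_keys=True)),
    )
    conn.commit()
    return cur.lastrowid


def record_step(conn, run_id, name, config, input_uris, outputs):
    """Store a step run and its output artifacts as (name, type, uri) tuples.

    Only the artifact type and its location are stored here; the artifact
    contents themselves would live in a separate artifact store.
    """
    key = cache_key(name, config, input_uris)
    cur = conn.execute(
        "INSERT INTO steps (run_id, name, config, cache_key) VALUES (?, ?, ?, ?)",
        (run_id, name, json.dumps(config, sort_keys=True), key),
    )
    step_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO artifacts (step_id, name, type, uri) VALUES (?, ?, ?, ?)",
        [(step_id, n, t, u) for n, t, u in outputs],
    )
    conn.commit()
    return key


def cached_outputs(conn, key):
    """Return the artifacts of the most recent step run with this cache key."""
    return conn.execute(
        "SELECT a.name, a.type, a.uri FROM artifacts a "
        "JOIN steps s ON a.step_id = s.id "
        "WHERE s.id = (SELECT MAX(id) FROM steps WHERE cache_key = ?) "
        "ORDER BY a.id",
        (key,),
    ).fetchall()
```

In this sketch, a later run would compute cache_key for a step before executing it and skip execution if cached_outputs returns any rows. The same tables also support lineage queries, since each artifact row links back to the step and run that produced it.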
The Metadata Store is a mandatory component in the ZenML stack. It is used to keep a log of detailed information about every pipeline run and you are required to configure it in all of your stacks.
Out of the box, ZenML comes with a sqlite Metadata Store, already part of the default stack, that stores metadata in a SQLite database file on your local filesystem, and a mysql Metadata Store flavor that you can connect to a MySQL-compatible database. Additional Metadata Store flavors are provided by integrations. These flavors are meant for different contexts, but in general we suggest using the mysql flavor for most use cases.
If you would like to see the available flavors of Metadata Stores, you can use the command:
zenml metadata-store flavor list
The Metadata Store provides low-level metadata storage services for other ZenML mechanisms. When you develop ZenML pipelines, you don't need to be aware of its existence or interact with it directly. Instead, ZenML provides higher-level APIs for recording and accessing information about pipeline executions:
- information about your pipeline step executions is automatically recorded in the Metadata Store.