Repositories: Link Stacks to Code
How to link stacks to code with repositories
ZenML has two main locations where it stores information on the local machine. These are the Global Configuration and the Repository. The latter is also referred to as the .zen folder.
The ZenML Repository related to a pipeline run is the folder that contains all the files needed to execute the run, such as the respective Python scripts and modules where the pipeline is defined, or other associated files. The repository plays a double role in ZenML:
  • It is used by ZenML to identify which files must be copied into Docker images in order to execute pipeline steps remotely, e.g., when orchestrating pipelines with Kubeflow.
  • It defines the local active Profile and active Stack that will be used when running pipelines from the repository root or one of its sub-folders, as shown below.

Registering a Repository

You can register your current working directory as a ZenML repository by running:
zenml init
This will create a .zen directory, which contains a single config.yaml file that stores the local settings:
active_profile_name: default
active_stack_name: default
It is recommended to use the zenml init command to initialize a ZenML Repository in the same location of your custom Python source tree where you would normally point PYTHONPATH, especially if your Python code relies on a hierarchy of modules spread out across multiple sub-folders.
ZenML CLI commands and ZenML code will display a warning if they are not running in the context of a ZenML repository, e.g.:
[email protected]:/tmp$ zenml stack list
Unable to find ZenML repository in your current working directory (/tmp) or any parent directories. If you want to use an existing repository which is in a different location, set the environment variable 'ZENML_REPOSITORY_PATH'. If you want to create a new repository, run zenml init.
Running without an active repository root.
Running with active profile: 'default' (global)
┏━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┓
┃ ACTIVE │ STACK NAME │ ARTIFACT_STORE │ METADATA_STORE │ ORCHESTRATOR ┃
┠────────┼────────────┼────────────────┼────────────────┼──────────────┨
┃ 👉 │ default │ default │ default │ default ┃
┗━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┛

Setting Local Active Profile and Stack

One of the most useful features of repositories is that you can configure a different active profile and stack for each of your projects. This is great if you want to use ZenML for multiple projects on the same machine. Whenever you create a new ML project, we recommend you run zenml init to create a separate repository, then create and activate a new profile, and then use it to define your stacks:
zenml init
zenml profile create <NEW_PROFILE_NAME>
zenml profile set <NEW_PROFILE_NAME>
zenml stack register ...
zenml stack set ...
If you do this, the correct profile and stack will automatically get activated whenever you change directory from one project to another in your terminal.
Note that the stacks and stack components are still stored globally, even when running from inside a ZenML repository. It is only the active profile and active stack settings that can be configured locally.

Detailed Example

Detailed usage example of local stacks and profiles

Using the Repository in Python

You can access your repository in Python using the zenml.repository.Repository class:
from zenml.repository import Repository
repo = Repository()
This allows you to perform various repository operations directly in Python, such as Inspecting Finished Pipeline Runs or accessing and managing stacks, as shown below.
To explore all possible operations that can be performed via the Repository, please consult the API docs section on Repository.

Accessing the Active Stack

The following code snippet shows how you can retrieve or modify information of your active stack and stack components in Python:
from zenml.repository import Repository
repo = Repository()
active_stack = repo.active_stack
print(active_stack.name)
print(active_stack.orchestrator.name)
print(active_stack.artifact_store.name)
print(active_stack.artifact_store.path)
print(active_stack.metadata_store.name)
print(active_stack.metadata_store.uri)

Registering and Changing Stacks

In the following we use the repository to register a new ZenML stack called local and set it as the active stack of the repository:
from zenml.repository import Repository
from zenml.artifact_stores import LocalArtifactStore
from zenml.metadata_stores import SQLiteMetadataStore
from zenml.orchestrators import LocalOrchestrator
from zenml.stack import Stack
repo = Repository()
# Create a new orchestrator
orchestrator = LocalOrchestrator(name="local")
# Create a new metadata store
metadata_store = SQLiteMetadataStore(
name="local",
uri="/tmp/zenml/zenml.db",
)
# Create a new artifact store
artifact_store = LocalArtifactStore(
name="local",
path="/tmp/zenml/artifacts",
)
# Create a new stack with the new components
stack = Stack(
name="local",
orchestrator=orchestrator,
metadata_store=metadata_store,
artifact_store=artifact_store,
)
# Register the new stack in the currently active profile
repo.register_stack(stack)
# Set the stack as the active stack of the repository
repo.activate_stack(stack.name)

Unregistering a Repository

To unregister a repository, delete the .zen directory in the respective location, e.g., via
rm -rf .zen