Connect your git repository
Tracking your code and avoiding unnecessary Docker builds by connecting your git repo.
A code repository in ZenML refers to a remote storage location for your code. Some commonly known code repository platforms include GitHub and GitLab.
Code repositories enable ZenML to keep track of the code version that you use for your pipeline runs. Additionally, running a pipeline that is tracked in a registered code repository can speed up the Docker image building for containerized stack components by eliminating the need to rebuild Docker images each time you change one of your source code files.
Learn more about how code repositories benefit development here.
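As a rough illustration of the version information involved, the sketch below reads the current commit hash and checks for uncommitted changes using the standard git commands git rev-parse HEAD and git status --porcelain. The helper names are hypothetical, and this is not ZenML's own detection logic, which may work differently.

```python
import subprocess
from typing import List, Optional, Tuple


def _git(args: List[str], cwd: str) -> Optional[str]:
    """Run a git command in cwd; return its stripped stdout, or None on failure."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        # Either git is not installed or cwd does not exist.
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def describe_checkout(path: str = ".") -> Optional[Tuple[str, bool]]:
    """Return (commit_hash, has_uncommitted_changes), or None if no commit is found."""
    commit = _git(["rev-parse", "HEAD"], path)
    if commit is None:
        return None
    # --porcelain prints one line per changed or untracked file, nothing if clean.
    status = _git(["status", "--porcelain"], path)
    return commit, bool(status)
```

A checkout with uncommitted changes is no longer fully described by its commit hash, so a result whose second element is True means the hash alone would not pin down the exact code that ran.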
If you plan to use one of the available code repository implementations, first install the corresponding ZenML integration with the zenml integration install command.
Afterward, register the code repository with the zenml code-repository register CLI command.
For concrete options, check out the sections on the GitHubCodeRepository, the GitLabCodeRepository, or how to develop and register a custom code repository implementation.
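For scripted setups, the registration call can be assembled programmatically. The following is a minimal sketch, not an official ZenML helper: build_register_command is a hypothetical function, and the option names (owner, repository, group, project, token) are assumptions that mirror the placeholders described in the GitHub and GitLab sections. Verify them against zenml code-repository register --help for your installed version.

```python
import shlex
from typing import List


def build_register_command(name: str, repo_type: str, **options: str) -> List[str]:
    """Assemble the argument list for registering a code repository."""
    command = ["zenml", "code-repository", "register", name, f"--type={repo_type}"]
    # Each keyword argument becomes one --key=value option (names are assumptions).
    command += [f"--{key}={value}" for key, value in options.items()]
    return command


github_command = build_register_command(
    "my-github-repo",
    "github",
    owner="<OWNER>",
    repository="<REPOSITORY>",
    token="<GITHUB_TOKEN>",
)
gitlab_command = build_register_command(
    "my-gitlab-repo",
    "gitlab",
    group="<GROUP>",
    project="<PROJECT>",
    token="<GITLAB_TOKEN>",
)

# Print shell-quoted commands for review; nothing is executed here.
print(shlex.join(github_command))
print(shlex.join(gitlab_command))
```

To actually run a command, the list can be passed to subprocess.run(command, check=True) on a machine where the zenml CLI is installed. Load the token from an environment variable or secret store rather than hard-coding it.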
ZenML comes with built-in implementations of the code repository abstraction for the GitHub and GitLab platforms, but it's also possible to use a custom code repository implementation.
ZenML provides built-in support for using GitHub as a code repository for your ZenML pipelines. You can register a GitHub code repository by providing the URL of the GitHub instance, the owner of the repository, the name of the repository, and a GitHub Personal Access Token (PAT) with access to the repository.
Before registering the code repository, install the GitHub integration with zenml integration install github.
Afterward, register the GitHub code repository with the zenml code-repository register CLI command and supply the following values: <NAME> is the name under which the code repository is registered in ZenML, <OWNER> is the owner of the GitHub repository, <REPOSITORY> is the name of the GitHub repository, <GITHUB_TOKEN> is your GitHub Personal Access Token, and <GITHUB_URL> is the URL of the GitHub instance, which defaults to https://github.com.
You only need to set the URL explicitly if you are using GitHub Enterprise.
After registering the GitHub code repository, ZenML will automatically detect if your source files are being tracked by GitHub and store the commit hash for each pipeline run.
ZenML also provides built-in support for using GitLab as a code repository for your ZenML pipelines. You can register a GitLab code repository by providing the URL of the GitLab instance, the group of the project, the name of the project, and a GitLab Personal Access Token (PAT) with access to the project.
Before registering the code repository, install the GitLab integration with zenml integration install gitlab.
Afterward, register the GitLab code repository with the zenml code-repository register CLI command and supply the following values: <NAME> is the name under which the code repository is registered in ZenML, <GROUP> is the group of the project, <PROJECT> is the name of the project, <GITLAB_TOKEN> is your GitLab Personal Access Token, and <GITLAB_URL> is the URL of the GitLab instance, which defaults to https://gitlab.com.
You only need to set the URL explicitly if you have a self-hosted GitLab instance.
After registering the GitLab code repository, ZenML will automatically detect if your source files are being tracked by GitLab and store the commit hash for each pipeline run.
If you're using some other platform to store your code, and you still want to use a code repository in ZenML, you can implement and register a custom code repository.
First, subclass zenml.code_repositories.BaseCodeRepository and implement its abstract methods.
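The shape of such a subclass can be sketched as follows. To keep the example runnable without ZenML installed, it subclasses a local stand-in rather than the real base class. The three method names and their signatures (login, download_files, get_local_context) are assumptions about the BaseCodeRepository interface and should be checked against your ZenML version. The storage layout, one directory per commit under a root folder, is purely hypothetical, and the real get_local_context is expected to return a context object rather than a plain dict.

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, Optional


class BaseCodeRepositoryStandIn(ABC):
    """Local stand-in for zenml.code_repositories.BaseCodeRepository.

    Method names and signatures are assumptions; with ZenML installed,
    subclass the real base class instead.
    """

    def __init__(self, config: Dict[str, Any]) -> None:
        self._config = config
        self.login()

    @abstractmethod
    def login(self) -> None:
        """Validate or establish access to the code storage."""

    @abstractmethod
    def download_files(
        self, commit: str, directory: str, repo_sub_directory: Optional[str]
    ) -> None:
        """Copy the files belonging to `commit` into `directory`."""

    @abstractmethod
    def get_local_context(self, path: str) -> Optional[Dict[str, str]]:
        """Describe the checkout containing `path`, or return None if untracked."""


class SnapshotDirectoryRepository(BaseCodeRepositoryStandIn):
    """Hypothetical platform that stores each commit as <root>/<commit>/."""

    def login(self) -> None:
        # "Logging in" here only means checking that the snapshot root exists.
        self._root = Path(self._config["root"]).resolve()
        if not self._root.is_dir():
            raise RuntimeError(f"Snapshot root {self._root} does not exist")

    def download_files(
        self, commit: str, directory: str, repo_sub_directory: Optional[str]
    ) -> None:
        source = self._root / commit
        if repo_sub_directory:
            source = source / repo_sub_directory
        # Copy the snapshot (or a sub-directory of it) into the target directory.
        shutil.copytree(source, directory, dirs_exist_ok=True)

    def get_local_context(self, path: str) -> Optional[Dict[str, str]]:
        try:
            relative = Path(path).resolve().relative_to(self._root)
        except ValueError:
            return None  # path lies outside the snapshot root
        if not relative.parts:
            return None  # the root itself belongs to no commit
        commit = relative.parts[0]
        return {"root": str(self._root / commit), "commit": commit}
```

The split into three methods reflects three separate concerns: authenticating, fetching the files of a given commit, and recognizing whether a local path belongs to a tracked checkout.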
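After you've finished implementing this, register it with the zenml code-repository register CLI command, setting --type=custom and pointing --source at your class (for example, my_module.MyRepositoryClass).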