Self-hosted deployment
Guide for installing ZenML Pro self-hosted in a Kubernetes cluster.
This page provides instructions for installing ZenML Pro - the ZenML Pro Control Plane and one or more ZenML Pro Tenant servers - on-premise in a Kubernetes cluster.
Overview
ZenML Pro can be installed as a self-hosted deployment. You need to be granted access to the ZenML Pro container images and you'll have to provide your own infrastructure: a Kubernetes cluster, a database server and a few other common prerequisites usually needed to expose Kubernetes services via HTTPS - a load balancer, an Ingress controller, HTTPS certificate(s) and DNS rule(s).
This document will guide you through the process.
Preparation and prerequisites
Software Artifacts
The ZenML Pro on-prem installation relies on a set of container images and Helm charts. The container images are stored in private ZenML AWS ECR container registries located at 715803424590.dkr.ecr.eu-west-1.amazonaws.com.
If you haven't done so already, please book a demo to get access to the private ZenML Pro container images.
To access these repositories, you need to set up an AWS IAM user or IAM role in your AWS account. The steps below outline how to create an AWS account, configure the necessary IAM entities, and pull images from the private repositories. If you're familiar with AWS, or plan on using an AWS EKS cluster to deploy ZenML Pro, you can simply use your existing IAM user or IAM role and skip steps 1 and 2.
Step 1: Create a Free AWS Account
Visit the AWS Free Tier page.
Click Create a Free Account.
Follow the on-screen instructions to provide your email address, create a root user, and set a secure password.
Enter your contact and payment information for verification purposes. While a credit or debit card is required, you won't be charged for free-tier eligible services.
Confirm your email and complete the verification process.
Log in to the AWS Management Console using your root user credentials.
Step 2: Create an IAM User or IAM Role
A. Create an IAM User
Log in to the AWS Management Console.
Navigate to the IAM service.
Click Users in the left-hand menu, then click Add Users.
Provide a user name (e.g., zenml-ecr-access).
Select Access Key - Programmatic access as the AWS credential type.
Click Next: Permissions.
Choose Attach policies directly, then select the following policies:
AmazonEC2ContainerRegistryReadOnly
Click Next: Tags and optionally add tags for organization purposes.
Click Next: Review, then Create User.
Note the Access Key ID and Secret Access Key displayed after creation. Save these securely.
B. Create an IAM Role
Navigate to the IAM service.
Click Roles in the left-hand menu, then click Create Role.
Choose the type of trusted entity:
Select AWS Account.
Enter your AWS account ID and click Next.
Select the AmazonEC2ContainerRegistryReadOnly policy.
Click Next: Tags, optionally add tags, then click Next: Review.
Provide a role name (e.g., zenml-ecr-access-role) and click Create Role.
Step 3: Provide the IAM User/Role ARN
For an IAM user, the ARN can be found in the Users section under the Summary tab.
For an IAM role, the ARN is displayed in the Roles section under the Summary tab.
Send the ARN to ZenML Support so it can be granted permission to access the ZenML Pro container images and Helm charts.
Step 4: Authenticate your Docker Client
Run these steps on the machine that you'll use to pull the ZenML Pro images. It is recommended that you copy the container images into your own container registry, one that is accessible from the Kubernetes cluster where ZenML Pro will be installed. Otherwise, you'll have to configure the Kubernetes cluster to authenticate directly to the ZenML Pro container registry, which will be problematic if your Kubernetes cluster is not running on AWS.
A. Install AWS CLI
Follow the instructions to install the AWS CLI: AWS CLI Installation Guide.
B. Configure AWS CLI Credentials
Open a terminal and run aws configure.
Enter the following when prompted:
Access Key ID: Provided during IAM user creation.
Secret Access Key: Provided during IAM user creation.
Default region name: eu-west-1
Default output format: Leave blank or enter json.
If you chose to use an IAM role, update the AWS CLI configuration file to specify the role you want to assume. Open the configuration file located at ~/.aws/config and add the following:
Replace <IAM-ROLE-ARN> with the ARN of the role you created and ensure source_profile points to a profile with sufficient permissions to assume the role.
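A minimal sketch of such a profile entry, wrapped in a small function that appends it to the AWS CLI configuration file. The profile name zenml-ecr-access and the use of default as source_profile are assumptions made for illustration; <IAM-ROLE-ARN> is left as a placeholder for you to replace.

```shell
# Append a role profile to the AWS CLI config file.
# $1 (optional): path of the config file; defaults to AWS_CONFIG_FILE or ~/.aws/config.
# Assumptions: the profile name "zenml-ecr-access" is illustrative, and the
# "default" profile holds credentials that are allowed to assume the role.
add_role_profile() {
  config_file="${1:-${AWS_CONFIG_FILE:-$HOME/.aws/config}}"
  mkdir -p "$(dirname "$config_file")" || return 1
  cat >> "$config_file" <<'EOF'

[profile zenml-ecr-access]
role_arn = <IAM-ROLE-ARN>
source_profile = default
region = eu-west-1
EOF
}
```

Calling add_role_profile with no arguments appends the entry to your regular configuration file; edit the role_arn line afterwards.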
C. Authenticate Docker with ECR
Run the following command to authenticate your Docker client with the ZenML ECR repository:
If you used an IAM role, use the specified profile to execute commands. For example:
This will allow you to authenticate to the ZenML Pro container registries and pull images with Docker, e.g.:
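The three steps above (log in, log in through a role profile, pull an image) could look roughly like the following sketch. It relies on the standard aws ecr get-login-password and docker login commands; the function names are hypothetical helpers, and the profile name you pass is whatever you configured in ~/.aws/config.

```shell
# Registry host and region as given in this guide.
ZENML_REGISTRY="715803424590.dkr.ecr.eu-west-1.amazonaws.com"
ZENML_REGION="eu-west-1"

# Log the local Docker client in to the registry.
# $1 (optional): AWS CLI profile name, for the IAM role case.
ecr_login() {
  profile="${1:-}"
  if [ -n "$profile" ]; then
    aws ecr get-login-password --region "$ZENML_REGION" --profile "$profile" \
      | docker login --username AWS --password-stdin "$ZENML_REGISTRY"
  else
    aws ecr get-login-password --region "$ZENML_REGION" \
      | docker login --username AWS --password-stdin "$ZENML_REGISTRY"
  fi
}

# Pull one image. $1: image name, $2: tag (a released version).
pull_image() {
  docker pull "$ZENML_REGISTRY/$1:$2"
}
```

For example, run ecr_login (or ecr_login with your role profile name), then pull_image zenml-pro-api <version>. The tenant server image listed later in this guide sits under an eu-central-1 registry host, so the same login would need to be repeated with that region and host.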
ZenML Pro Control Plane Artifacts
The following artifacts are required to install the ZenML Pro control plane in your own Kubernetes cluster:
715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-api - private container images for the ZenML Pro API server
715803424590.dkr.ecr.eu-west-1.amazonaws.com/zenml-pro-dashboard - private container images for the ZenML Pro dashboard
oci://public.ecr.aws/zenml/zenml-pro - the public ZenML Pro helm chart (as an OCI artifact)
The container image tags and the Helm chart versions are both synchronized and linked to the ZenML Pro releases. You can find the ZenML Pro Helm chart along with the available released versions in the ZenML Pro ArtifactHub repository.
If you're planning on copying the container images to your own private registry (recommended if your Kubernetes cluster isn't running on AWS and can't authenticate directly to the ZenML Pro container registry) make sure to include and keep the same tags.
By default, the ZenML Pro Helm chart uses the same container image tags as the helm chart version. Configuring custom container image tags when setting up your Helm distribution is also possible, but not recommended because it doesn't yield reproducible results and may even cause problems if used with the wrong Helm chart version.
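Copying an image into your own registry while keeping its tag can be sketched with plain docker pull, docker tag and docker push. The function name and the destination registry in the usage note are hypothetical; the sketch assumes your Docker client is already logged in to both registries.

```shell
# Copy one image between registries, keeping the image name and tag unchanged.
# $1: source registry, $2: destination registry, $3: image name, $4: tag.
mirror_image() {
  src="$1/$3:$4"
  dst="$2/$3:$4"
  docker pull "$src" &&
    docker tag "$src" "$dst" &&
    docker push "$dst"
}
```

For example: mirror_image 715803424590.dkr.ecr.eu-west-1.amazonaws.com registry.example.com/zenml zenml-pro-api <version>, where registry.example.com/zenml stands in for your own registry.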
ZenML Pro Tenant Server Artifacts
The following artifacts are required to install ZenML Pro tenant servers in your own Kubernetes cluster:
715803424590.dkr.ecr.eu-central-1.amazonaws.com/zenml-pro-server - private container images for the ZenML Pro tenant server
oci://public.ecr.aws/zenml/zenml - the public open-source ZenML Helm chart (as an OCI artifact).
The container image tags and the Helm chart versions are both synchronized and linked to the ZenML open-source releases. To find the latest ZenML OSS release, please check the ZenML release page.
If you're planning on copying the container images to your own private registry (recommended if your Kubernetes cluster isn't running on AWS and can't authenticate directly to the ZenML Pro container registry) make sure to include and keep the same tags.
By default, the ZenML OSS Helm chart uses the same container image tags as the helm chart version. Configuring custom container image tags when setting up your Helm distribution is also possible, but not recommended because it doesn't yield reproducible results and may even cause problems if used with the wrong Helm chart version.
ZenML Pro Client Artifacts
If you're planning on running containerized ZenML pipelines, or using other containerization related ZenML features, you'll also need to access the public ZenML client container image located in Docker Hub at zenmldocker/zenml. This isn't a problem unless you're deploying ZenML Pro in an air-gapped environment, in which case you'll also have to copy the client container image into your own container registry. You'll also have to configure your code to use the correct base container registry via DockerSettings (see the DockerSettings documentation for more information).
Air-Gapped Installation
If you need to install ZenML Pro in an air-gapped environment (a network with no direct internet access), you'll need to transfer all required artifacts to your internal infrastructure. Here's a step-by-step process:
1. Prepare a Machine with Internet Access
First, you'll need a machine with both internet access and sufficient storage space to temporarily store all artifacts. On this machine:
Follow the authentication steps described above to gain access to the private repositories
Install the required tools:
Docker
AWS CLI
Helm
A tool like skopeo for copying container images (optional but recommended)
2. Download All Required Artifacts
A Bash script like the following can be used to download all necessary components, or you can run the listed commands manually:
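One possible shape for such a download script is sketched here; it is not the original script. It uses docker pull, docker save, helm pull and tar, and it assumes that the Docker client is already logged in to both private registries and that the client image on Docker Hub is tagged with the ZenML OSS version. The directory layout and function names are illustrative.

```shell
# Registry hosts as listed in this guide.
PRO_REGISTRY="715803424590.dkr.ecr.eu-west-1.amazonaws.com"
SERVER_REGISTRY="715803424590.dkr.ecr.eu-central-1.amazonaws.com"

# Print every image reference needed for the given versions.
# $1: ZenML Pro version, $2: ZenML OSS version.
list_images() {
  echo "$PRO_REGISTRY/zenml-pro-api:$1"
  echo "$PRO_REGISTRY/zenml-pro-dashboard:$1"
  echo "$SERVER_REGISTRY/zenml-pro-server:$2"
  # Assumption: the client image tag matches the OSS version.
  echo "zenmldocker/zenml:$2"
}

# Save all images and both Helm charts, then bundle them into one archive.
download_artifacts() {
  mkdir -p zenml-artifacts/images zenml-artifacts/charts || return 1
  for image in $(list_images "$1" "$2"); do
    # Turn "host/name:tag" into a file name such as "host_name_tag.tar".
    file="zenml-artifacts/images/$(echo "$image" | tr '/:' '__').tar"
    docker pull "$image" || return 1
    docker save -o "$file" "$image" || return 1
  done
  helm pull oci://public.ecr.aws/zenml/zenml-pro --version "$1" -d zenml-artifacts/charts || return 1
  helm pull oci://public.ecr.aws/zenml/zenml --version "$2" -d zenml-artifacts/charts || return 1
  tar -czf zenml-artifacts.tar.gz zenml-artifacts
}
```

Running download_artifacts <pro-version> <oss-version> leaves a zenml-artifacts.tar.gz archive in the current directory.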
3. Transfer Artifacts to Air-Gapped Environment
Copy the zenml-artifacts.tar.gz file to your preferred transfer medium (e.g., USB drive, approved file transfer system)
Transfer the archive to a machine in your air-gapped environment that has access to your internal container registry
4. Load Artifacts in Air-Gapped Environment
Create a script to load the artifacts in your air-gapped environment or run the listed commands manually:
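A load script could follow the sketch below, which is an illustration rather than the original script. It assumes the images were stored as individual tar files with docker save, that docker load reports each one as "Loaded image: <reference>", and that the Docker client is logged in to your internal registry. The function name and registry prefix are hypothetical.

```shell
# Load every saved image and push it to the internal registry under the same
# image name and tag.
# $1: directory containing the *.tar files, $2: internal registry prefix.
load_and_push() {
  for file in "$1"/*.tar; do
    # Keep only the image reference from the "Loaded image: <ref>" line.
    ref="$(docker load -i "$file" | sed -n 's/^Loaded image: //p')"
    [ -n "$ref" ] || return 1
    # Drop the original registry host and path, keep "name:tag".
    target="$2/${ref##*/}"
    docker tag "$ref" "$target" || return 1
    docker push "$target" || return 1
  done
}
```

After unpacking the archive with tar -xzf zenml-artifacts.tar.gz, a call such as load_and_push zenml-artifacts/images registry.internal/zenml would publish the images, with registry.internal/zenml standing in for your registry.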
5. Update Configuration
When deploying ZenML Pro in your air-gapped environment, make sure to update all references to container images in your Helm values to point to your internal registry. For example:
Remember to maintain the same version tags when copying images to your internal registry to ensure compatibility between components.
The scripts provided above are examples and may need to be adjusted based on your specific security requirements and internal infrastructure setup.
6. Using the Helm Charts
After downloading the Helm charts, you can use their local paths instead of a remote OCI registry to deploy ZenML Pro components. Here's an example of how to use them:
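As a sketch, installing from a chart archive on disk only changes the chart argument passed to Helm. The function name is a hypothetical helper, and the release name and namespace in the usage note are assumptions.

```shell
# Install or upgrade a release from a chart archive on disk.
# $1: release name, $2: namespace, $3: path to the chart .tgz, $4: values file.
install_local_chart() {
  helm upgrade --install "$1" "$3" \
    --namespace "$2" --create-namespace \
    --values "$4"
}
```

For example: install_local_chart zenml-pro zenml-pro ./zenml-pro-<version>.tgz my-values.yaml, where the .tgz file is the one produced by helm pull.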
Infrastructure Requirements
To deploy the ZenML Pro control plane and one or more ZenML Pro tenant servers, ensure the following prerequisites are met:
Kubernetes Cluster
A functional Kubernetes cluster is required as the primary runtime environment.
Database Server(s)
The ZenML Pro Control Plane and ZenML Pro Tenant servers need to connect to an external database server. To minimize the amount of infrastructure resources needed, you can use a single database server in common for the Control Plane and for all tenants, or you can use different database servers to ensure server-level database isolation, as long as you keep in mind the following limitations:
the ZenML Pro Control Plane can be connected to either MySQL or Postgres as the external database
the ZenML Pro Tenant servers can only be connected to a MySQL database (no Postgres support is available)
the ZenML Pro Control Plane as well as every ZenML Pro Tenant server needs to use its own individual database (especially important when connected to the same server)
Ensure you have a valid username and password for the different ZenML Pro services. For improved security, it is recommended to have different users for different services. If the database user does not have permissions to create databases, you must also create a database and give the user full permissions to access and manage it (i.e. create, update and delete tables).
Ingress Controller
Install an Ingress provider in the cluster (e.g., NGINX, Traefik) to handle HTTP(S) traffic routing. Ensure the Ingress provider is properly configured to expose the cluster's services externally.
Domain Name
You'll need an FQDN for the ZenML Pro Control Plane as well as for every ZenML Pro tenant. For this reason, it's highly recommended to use a DNS prefix and associated SSL certificate instead of individual FQDNs and SSL certificates, to make this process easier.
FQDN or DNS Prefix Setup
Obtain a Fully Qualified Domain Name (FQDN) or DNS prefix (e.g., *.zenml-pro.mydomain.com) from your DNS provider.
Identify the external Load Balancer IP address of the Ingress controller using the command kubectl get svc -n <ingress-namespace>. Look for the EXTERNAL-IP field of the Load Balancer service.
Create a DNS A record (or CNAME for subdomains) pointing the FQDN to the Load Balancer IP. Example:
Host: zenml-pro.mydomain.com
Type: A
Value: <Load Balancer IP>
Use a DNS propagation checker to confirm that the DNS record is resolving correctly.
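The lookup and the final check can also be scripted. This sketch reads the IP with a kubectl jsonpath query and compares it against what dig resolves; having dig installed is an assumption, the function names are hypothetical, and some cloud load balancers expose a hostname instead of an IP, in which case the jsonpath query returns nothing.

```shell
# Print the external IP of a LoadBalancer service.
# $1: namespace of the Ingress controller, $2: service name.
lb_ip() {
  kubectl get svc "$2" -n "$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Succeed only if the host name resolves to the given IP.
# $1: host name, $2: expected IP address.
dns_points_to() {
  dig +short "$1" | grep -qxF "$2"
}
```

For example, dns_points_to zenml-pro.mydomain.com "$(lb_ip <ingress-namespace> <service-name>)" returns success once the record has propagated.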
SSL Certificate
The ZenML Pro services do not terminate SSL traffic. It is your responsibility to generate and configure the necessary SSL certificates for the ZenML Pro Control Plane as well as all the ZenML Pro tenants that you will deploy (see the previous point on how to use a DNS prefix to make the process easier).
Obtaining SSL Certificates
Acquire an SSL certificate for the domain. You can use:
A commercial SSL certificate provider (e.g., DigiCert, Sectigo).
Free services like Let's Encrypt for domain validation and issuance.
Configuring SSL Termination
Once the SSL certificate is obtained, configure your load balancer or Ingress controller to terminate HTTPS traffic:
For NGINX Ingress Controller:
You can configure SSL termination globally for the NGINX Ingress Controller by setting up a default SSL certificate or configuring it at the ingress controller level, or you can specify SSL certificates when configuring the ingress in the ZenML server Helm values.
Here's how you can do it globally:
Create a TLS Secret
Store your SSL certificate and private key as a Kubernetes TLS secret in the namespace where the NGINX Ingress Controller is deployed.
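A minimal sketch of this step using kubectl create secret tls. The function name is a hypothetical helper, and the secret name and file paths you pass are up to you.

```shell
# Create a TLS secret from a certificate file and its private key.
# $1: namespace, $2: secret name, $3: certificate path, $4: private key path.
create_tls_secret() {
  kubectl create secret tls "$2" \
    --cert="$3" --key="$4" \
    --namespace "$1"
}
```

For example: create_tls_secret <ingress-namespace> zenml-tls tls.crt tls.key, where zenml-tls is an illustrative secret name.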
Update NGINX Ingress Controller Configurations
Configure the NGINX Ingress Controller to use the default SSL certificate.
If using the NGINX Ingress Controller Helm chart, modify the values.yaml file or use --set during installation:
Or directly pass the argument during Helm installation or upgrade:
If the NGINX Ingress Controller was installed manually, edit its deployment to include the argument in the args section of the container:
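For the Helm-based case, a sketch along these lines sets the controller's default-ssl-certificate argument through the community ingress-nginx chart. It assumes the ingress-nginx Helm repository has already been added under that name; the function name is a hypothetical helper.

```shell
# Point the NGINX Ingress Controller at a default TLS certificate.
# $1: release name, $2: namespace, $3: "<namespace>/<secret-name>" of the TLS secret.
set_default_certificate() {
  helm upgrade "$1" ingress-nginx/ingress-nginx \
    --namespace "$2" --reuse-values \
    --set "controller.extraArgs.default-ssl-certificate=$3"
}
```

For a manually installed controller, the equivalent container argument is --default-ssl-certificate=<namespace>/<secret-name>.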
For Traefik:
Configure Traefik to use TLS by creating a certificate resolver for Let's Encrypt or specifying the certificates manually in the traefik.yml or values.yaml file. Example for Let's Encrypt:
Reference the domain in your IngressRoute or Middleware configuration.
Stage 1/2: Install the ZenML Pro Control Plane
Configure the Helm Chart
There are a variety of options that can be configured for the ZenML Pro helm chart before installation.
You can take a look at the values.yaml file and familiarize yourself with some of the configuration settings that you can customize for your ZenML Pro deployment. Alternatively, you can unpack the values.yaml file included in the helm chart:
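Two ways of getting at the chart's default values are sketched here using standard Helm commands. The function names are hypothetical helpers; the chart reference and version are passed in by you.

```shell
# Download and unpack a chart so its values.yaml can be read locally.
# $1: chart OCI reference, $2: chart version.
unpack_chart() {
  helm pull --untar "$1" --version "$2"
}

# Alternatively, print the default values without unpacking anything.
show_chart_values() {
  helm show values "$1" --version "$2"
}
```

For example, unpack_chart oci://public.ecr.aws/zenml/zenml-pro <version> creates a zenml-pro directory containing values.yaml.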
This is an example Helm values YAML file that covers the most common configuration options:
Minimum required settings:
the database credentials (zenml.database.external)
the URL (zenml.serverURL) and Ingress hostname (zenml.ingress.host) where the ZenML Pro Control Plane API and Dashboard will be reachable
In addition to the above, the following might also be relevant for you:
custom container image repository locations (zenml.image.api and zenml.image.dashboard)
the username and password used for the default admin account (zenml.auth.password)
additional Ingress settings (zenml.ingress)
Kubernetes resources allocated to the pods (resources)
If you set up a common DNS prefix that you plan on using for all the ZenML Pro services, you may configure the domain of the HTTP cookies used by the ZenML Pro dashboard to match it by setting zenml.auth.authCookieDomain to the DNS prefix (e.g. .my.domain instead of zenml-pro.my-domain)
Install the Helm Chart
Ensure that your Kubernetes cluster has access to all the container images. By default, the tags used for the container images are the same as the Helm chart version and it is recommended to keep them in sync, even though it is possible to override the tag values.
To install the helm chart (assuming the customized configuration values are in a my-values.yaml file), run:
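A sketch of the install step with the public OCI chart. The release name and namespace zenml-pro are assumptions, as are the function names; the version argument should be a released chart version.

```shell
# Install or upgrade the ZenML Pro control plane from the public OCI chart.
# $1: chart version, $2: values file (e.g. my-values.yaml).
# Assumption: release name and namespace are both "zenml-pro".
install_control_plane() {
  helm upgrade --install zenml-pro oci://public.ecr.aws/zenml/zenml-pro \
    --namespace zenml-pro --create-namespace \
    --version "$1" --values "$2"
}

# List the workloads that the release created in its namespace.
list_workloads() {
  kubectl get all --namespace zenml-pro
}
```

For example: install_control_plane <version> my-values.yaml, followed by list_workloads to inspect the result.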
If the installation is successful, you should be able to see the following workloads running in your cluster:
The Helm chart will output information explaining how to connect and authenticate to the ZenML Pro dashboard:
The credentials are for the default administrator user account provisioned on installation. With these on-hand, you can proceed to the next step and on-board additional users.
Onboard Additional Users
Creating user accounts is not currently supported in the ZenML Pro dashboard, because this is not a typical ZenML Pro deployment used in production. A production ZenML Pro deployment should be configured to connect to an external OAuth 2.0 / OIDC identity provider.
However, this feature is currently supported with helper Python scripts, as described below.
The deployed ZenML Pro service will come with a pre-installed default administrator account. This admin account serves the purpose of creating and recovering other users. First you will need to get the admin password following the instructions at the previous step.
Create a users.yml file that contains a list of all the users that you want to create for ZenML. Also set a default password. The users will be asked to change this password on their first login.
Run the create_users.py script below. This will create all of the users.
[file: create_users.py]
The script will prompt you for the URL of your deployment, the admin account email and admin account password and finally the location of your users.yml file.
Create an Organization
The ZenML Pro admin user should only be used for administrative operations: creating other users, resetting the password of existing users and enrolling tenants. All other operations should be executed while logged in as a regular user.
Head on over to your deployment in the browser and use one of the users you just created to log in.
After logging in for the first time, you will need to create a new password. (Be aware: For the time being only the admin account will be able to reset this password)
Finally you can create an Organization. This Organization will host all the tenants you enroll at the next stage.
Invite Other Users to the Organization
Now you can invite your whole team to the organization. To do this, open the drop-down in the top right and head over to the settings.
Here in the members tab, add all the users you created in the previous step.
Then head over to the Pending invited screen and copy the invite link for each user.
Finally, send each invitation link, along with the account's email and initial password, to the corresponding team member.
Stage 2/2: Enroll and Deploy ZenML Pro tenants
Installing and updating on-prem ZenML Pro tenant servers is not automated, as it is with the SaaS version. You will be responsible for enrolling tenant servers in the right ZenML Pro organization, installing them and regularly updating them. Some scripts are provided to simplify this task as much as possible.
Enrolling a Tenant
Run the enroll-tenant.py script below. This will collect all the necessary data, then enroll the tenant in the organization and generate a Helm values.yaml file template that you can use to install the tenant server:
[file: enroll-tenant.py]
Running the script does two things:
it creates a tenant entry in the ZenML Pro database. The tenant will remain in a "provisioning" state and won't be accessible until you actually install it using Helm.
it outputs a YAML file with Helm chart configuration values that you can use to deploy the ZenML Pro tenant server in your Kubernetes cluster.
This is an example of a generated Helm YAML file:
Deploy the ZenML Pro tenant server with Helm:
The ZenML Pro tenant server is nothing more than a slightly modified open-source ZenML server. The deployment even uses the official open-source helm chart.
There are a variety of options that can be configured for the ZenML Pro tenant server chart before installation. You can start by taking a look at the values.yaml file and familiarize yourself with some of the configuration settings that you can customize for your ZenML server deployment. Alternatively, you can unpack the values.yaml file included in the helm chart:
To configure the Helm chart, use the YAML file generated at the previous step as a template and fill in the necessary values marked by TODO comments. At a minimum, you'll need to configure the following:
the MySQL database credentials (zenml.database.url)
the container image repository where the ZenML Pro tenant server container images are stored (zenml.image.repository)
the hostname where the ZenML Pro tenant server will be reachable (zenml.ingress.host and zenml.serverURL)
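To illustrate what the minimum settings look like once filled in, this sketch writes them to a separate values file that could be passed to Helm after the generated one. Only the four keys named above are used; the host name, registry and database URL are placeholders, and the mysql:// URL layout is an assumption to check against the OSS chart's values.yaml. The function name is a hypothetical helper.

```shell
# Write the minimum tenant settings to a values file.
# $1: output file path. All values below are placeholders to replace.
write_tenant_overrides() {
  cat > "$1" <<'EOF'
zenml:
  serverURL: https://tenant1.zenml-pro.mydomain.com
  database:
    # Assumption: MySQL URL of the form mysql://user:password@host:port/database
    url: mysql://<user>:<password>@<host>:3306/<database>
  image:
    repository: <your-registry>/zenml-pro-server
  ingress:
    host: tenant1.zenml-pro.mydomain.com
EOF
}
```

Helm merges several --values files in order, so a file written this way overrides matching keys from the generated template when it is listed last.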
You may also choose to configure additional features if you need them, such as secrets stores, database backup and restore, Kubernetes resources and so on. These are documented in the official OSS ZenML Helm deployment documentation pages.
To install the helm chart (assuming the customized configuration values are in the generated zenml-f8e306ef-90e7-4b2f-99db-28298834feed-values.yaml file), run e.g.:
The deployment is ready when the ZenML server pod is running and healthy:
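A sketch of both steps for the example tenant ID used above. The release name zenml and the per-tenant namespace naming are assumptions, as are the function names; the chart version should match the ZenML OSS release you are deploying.

```shell
# Tenant ID taken from the example values file name in this guide.
TENANT_ID="f8e306ef-90e7-4b2f-99db-28298834feed"

# Install or upgrade the tenant server from the public OSS chart.
# $1: chart version.
# Assumption: release "zenml" in a namespace named after the tenant ID.
install_tenant() {
  helm upgrade --install zenml oci://public.ecr.aws/zenml/zenml \
    --namespace "zenml-pro-$TENANT_ID" --create-namespace \
    --version "$1" --values "zenml-$TENANT_ID-values.yaml"
}

# Show the tenant's pods; the server pod should reach the Running state.
tenant_pods() {
  kubectl get pods --namespace "zenml-pro-$TENANT_ID"
}
```

Run install_tenant <version>, then tenant_pods until the ZenML server pod is reported as running and ready.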
After deployment, your tenant should show up as running in the ZenML Pro dashboard and can be accessed at the next step.
If you need to deploy multiple tenants, simply run the enrollment script again with different values.
Accessing the Tenant
The newly enrolled tenant should now be accessible in the ZenML Pro tenant dashboard and the CLI. You need to log in as an organization member and add yourself as a tenant member first.
Then follow the instructions in the checklist to unlock the full dashboard.